@dantohe - welcome to OHDSI!
With Truven MarketScan being one of the most popular data sets, you have multiple choices in OHDSI:
1) Use one of the open source ETL code sets, contributed by Janssen:
It is a great code base with good ETL specs. You will be at the mercy of that fantastic team to share their code with OHDSI (which they do quite regularly). To my knowledge, it is built on the Microsoft SSIS platform, but that might have changed.
2) Use one of the commercial vendors (including Odysseus, where I work) that do this for a living. They can do the Truven MarketScan OMOP CDM conversion for you using the latest and greatest Truven MarketScan ETL scripts, including support for OMOP CDM v5.3, the latest THEMIS business rules, Truven specs, OMOP CDM vocabularies, etc. And btw, it is Hadoop/Spark based and can be easily plugged into any Hadoop-based platform, including Amazon AWS EMR or Cloudera.
The Truven MarketScan data set, while being one of the most popular, is also one of the biggest. So someone needs to know not only the details of OMOP CDM, the OMOP Standardized Vocabularies, and the various business rules (look up THEMIS), but also how to handle a data set that large, including keeping it up to date.
You get the idea. Happy to discuss and provide more hints. Good luck!