Having and actively been implementing ETLs in three context - pre-hospital emergency medicine patient care records to OMOP, veterinary patients, and devices - my suggestion is to do it in steps. As many others have said on other threads to this question - there aren’t a universal ETL solution even when dealing with the same EHR system as everyone documents and customizes their EHR systems differently.
The approach I have settled on is to first populate the OMOP.person table and introduce a master_index table which provides a link between the patient’s true medical record number to the OMOP assigned person.person_id. Do one month or one year worth of data depending on your volume so it is easy to check and validate. Then move on to visits and care_sites. Again check and validate. Doing care sites then gets you into dealing with mapping and resolving variations in care site names to standard concept codes.
Then move on to drugs, then devices, then procedures, then observations and conditions. I found it also helpful to have a use case to work against as it helps focus the validation and checking as well as bring clinicians in to help with the validation and checking.
As for ETL implementation - I design and implement in Java. My data sources come from MS SQL databases, CSV format text flles, ASCII text, XML, and in all sorts of formats. The implementation choice greatly depends on your local software expertise and experience in utilizing various tools available from OHDSI and non-OHDSI.