There are three areas in Epic where lab data lives.
The lab result table contains records at the component level. So, each individual row represents a component of a lab panel or individual lab and its result.
The “orders” table. Most of these orders are coded with custom Epic codes and have to be custom mapped to standard concepts. The data are generally at the lab panel level, but sometimes a provider orders a single component. These do not contain results.
The billing tables. These are the billing records and are usually coded with CPT4 codes at the lab panel level. These do not contain results.
Lab panel data do not contain results because the test is for multiple different measurements per panel. So, results are only found in the Order Results table.
It’s been a while since I have dug into this source data, but when I was working directly with the data, we were unable to directly connect the billing data tables with the orders or results data. Using the patient identifier and the date to de-duplicate these two tables would be very messy, especially for hospitalized patients since they can have multiple labs the same day, one individual lab component can be contained in multiple panels, lab draws can be duplicated, clotted, re-run, etc. Order Results and Order Proc can be connected, but there are a few “what to do when ___” questions for these data.
Mark suggests doing an inner join between orders and results. But what if 13 out of 14 lab components of one label panel have a result, but 1 component doesn’t have a result? Do you remove the panel or keep it since it was performed, but 1 result remains unknown? What if it’s the opposite and 13 out of 14 lab components aren’t resulted? Also, how do you tell the ETL how many components are in a panel? This metadata is not part of the CPT4 code. I’m sure there are other IF/THEN questions to answer once you look into the data. I’ve also heard of use cases (note- this is not an OHDSI approved use case ) where researchers want to know which lab panels are ordered and it’s next to impossible to know if you only have the component data since components can be in multiple panels.
So, here’s a suggestion. Instead of de-duplicating, removing records and making your ETL more complex, alter the N3C view. Have one full PHI OMOP CDM with very minimal filters. Only remove clinical events which did not occur or implausible. Then from this master CDM, create different views for different purposes. De-Id, LDS, project specific, etc. The University of Colorado has been successfully using this method for a while. We bring in all our Flowsheet data which are mostly mapped to concept_id = 0, but then filter these data out when delivering data for different OMOP projects since concept_id = 0 has no meaning in the OHDSI world. We also filter out data based on our privacy agreements and other restrictions. It’s much easier to create views for individual projects, than alter the full ETL for every project.