
Retaining multiple providers per claim line in claims data


I’m relatively new to OMOP and am currently modeling transformations to claims data. I’m stuck on how to address multiple providers on a claim line and would appreciate feedback.

Context on Claims Data

Healthcare claims data is generally structured in a Header/Line format. From OMOP’s documentation, it seems like the standard is to model this using visit_occurrence and visit_detail, with each distinct claim_header existing as a record in visit_occurrence and each claim line existing as a record in visit_detail.

An important feature of claims data is that each line contains multiple providers - for example Medicaid/Medicare claims from CMS have “billing” providers and “rendering” providers. There might also be other types - e.g., billing, rendering, referring, prescribing, attending, operating.

Problem Statement

Problem TLDR: it looks like claims data requires additional Foreign Keys for providers and/or a provider_type modifier.

OMOP data structures and ETL guidance suggest choosing the most important provider and using that to populate provider_id. That seems insufficient for claims research, where many use cases require visibility into the different providers.

I am looking for the best way to model claims data into OMOP without losing information on the multiple providers tied to each claim line.
I would appreciate guidance on the best practice here. I have listed a few ideas below, but am open to guidance on other (better) approaches.

Ideas on Path Forward

A. Leave provider_id null on the visit_occurrence and visit_detail tables. Instead, use the fact_relationship table as a bridge table linking the fact (visit_detail) to the dimension (provider). The fact_relationship table would need to have a custom vocabulary describing the relationship type as {billing_provider, rendering_provider, referring_provider}.
This seems feasible but unconventional, and potentially counterintuitive to researchers. The bridge table would be long - 3x as many records as the visit_detail table if data retains billing, rendering, and referring providers.
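
To make idea A concrete, here is a minimal Python/sqlite3 sketch of the bridge. The fact_relationship column names follow the CDM, but everything else is an assumption for illustration: the tables are cut down to a couple of columns, the two domain concept IDs are placeholders, and the three role concept IDs are made-up local concepts (using the convention that local concept IDs sit above 2,000,000,000).

```python
import sqlite3

# Placeholder IDs for illustration only; real values would come from the
# vocabulary tables or from a locally defined vocabulary.
VISIT_DETAIL_DOMAIN = 1  # placeholder: concept for the visit_detail domain
PROVIDER_DOMAIN = 2      # placeholder: concept for the provider domain
ROLE = {
    "billing": 2_000_000_001,
    "rendering": 2_000_000_002,
    "referring": 2_000_000_003,
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visit_detail (visit_detail_id INTEGER PRIMARY KEY, provider_id INTEGER);
CREATE TABLE provider (provider_id INTEGER PRIMARY KEY, provider_name TEXT);
CREATE TABLE fact_relationship (
    domain_concept_id_1 INTEGER, fact_id_1 INTEGER,
    domain_concept_id_2 INTEGER, fact_id_2 INTEGER,
    relationship_concept_id INTEGER);
""")

conn.executemany(
    "INSERT INTO provider VALUES (?, ?)",
    [(10, "Billing group"), (11, "Rendering clinician"), (12, "Referring clinician")],
)
# provider_id stays NULL on the claim line itself.
conn.execute("INSERT INTO visit_detail VALUES (100, NULL)")
# One bridge row per (claim line, provider, role).
conn.executemany(
    "INSERT INTO fact_relationship VALUES (?, ?, ?, ?, ?)",
    [
        (VISIT_DETAIL_DOMAIN, 100, PROVIDER_DOMAIN, 10, ROLE["billing"]),
        (VISIT_DETAIL_DOMAIN, 100, PROVIDER_DOMAIN, 11, ROLE["rendering"]),
        (VISIT_DETAIL_DOMAIN, 100, PROVIDER_DOMAIN, 12, ROLE["referring"]),
    ],
)


def providers_for_line(conn, visit_detail_id, role):
    """Return provider_ids linked to one claim line in the given role."""
    rows = conn.execute(
        """
        SELECT p.provider_id
        FROM fact_relationship fr
        JOIN provider p ON p.provider_id = fr.fact_id_2
        WHERE fr.domain_concept_id_1 = ?
          AND fr.fact_id_1 = ?
          AND fr.domain_concept_id_2 = ?
          AND fr.relationship_concept_id = ?
        ORDER BY p.provider_id
        """,
        (VISIT_DETAIL_DOMAIN, visit_detail_id, PROVIDER_DOMAIN, ROLE[role]),
    ).fetchall()
    return [r[0] for r in rows]


bridge_rows = conn.execute("SELECT COUNT(*) FROM fact_relationship").fetchone()[0]
```

Every lookup has to filter on both domain concepts and the role concept, which is part of why this feels counterintuitive, and the bridge grows by one row per provider per claim line.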

B. Leave provider_id null on the visit_occurrence and visit_detail tables. For each visit_detail record, insert one additional child visit_detail record per provider type, linking to the appropriate provider. This seems to be what the OMOP docs suggest, but it introduces granularity challenges if researchers need to join visit_detail records in a 1:M parent-child fashion.
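
A rough Python/sqlite3 sketch of idea B and its granularity problem follows. Several things here are assumptions rather than documented practice: the parent column is named parent_visit_detail_id (as in CDM v5.4; earlier versions name it differently), the provider role is carried in visit_detail_type_concept_id using made-up local concept IDs, and the child ID scheme (line_id * 10 + offset) is purely hypothetical.

```python
import sqlite3

# Made-up local concept IDs, for illustration only.
CLAIM_LINE = 2_000_000_010
ROLE = {
    "billing": 2_000_000_001,
    "rendering": 2_000_000_002,
    "referring": 2_000_000_003,
}

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE visit_detail (
    visit_detail_id INTEGER PRIMARY KEY,
    visit_occurrence_id INTEGER,
    provider_id INTEGER,
    visit_detail_type_concept_id INTEGER,
    parent_visit_detail_id INTEGER)
""")


def load_claim_line(conn, line_id, visit_occurrence_id, providers_by_role):
    """Insert one parent row for the claim line plus one child row per provider."""
    conn.execute(
        "INSERT INTO visit_detail VALUES (?, ?, NULL, ?, NULL)",
        (line_id, visit_occurrence_id, CLAIM_LINE),
    )
    for offset, (role, provider_id) in enumerate(sorted(providers_by_role.items()), start=1):
        conn.execute(
            "INSERT INTO visit_detail VALUES (?, ?, ?, ?, ?)",
            (line_id * 10 + offset, visit_occurrence_id, provider_id, ROLE[role], line_id),
        )


load_claim_line(conn, 100, 1, {"billing": 10, "rendering": 11, "referring": 12})

# Counting every visit_detail row overstates the number of claim lines.
naive_count = conn.execute("SELECT COUNT(*) FROM visit_detail").fetchone()[0]
# Analysts must remember to keep only the parent rows.
line_count = conn.execute(
    "SELECT COUNT(*) FROM visit_detail WHERE parent_visit_detail_id IS NULL"
).fetchone()[0]
```

With three providers on a single claim line, the unfiltered count is four rows while the parent-only count is one, so every downstream query needs the parent filter to stay at claim-line grain.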

C. Supplement the standard visit_detail table with a sidecar table listing additional provider_ids. This supplemental table would look like:
{visit_detail_id, provider_1_id, provider_1_concept_id, provider_2_id, provider_2_concept_id, … maybe more FKs}. It would have a 1:1 relationship to the visit_detail table. There are some variations on this approach but the general idea is to make an entirely new non-standard table, which obviously impacts interoperability.
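
For comparison, idea C could be sketched in Python/sqlite3 roughly as below. The sidecar table name visit_detail_provider_ext is hypothetical, only two provider slots are shown, and the role concept IDs are made-up local values; none of this is part of the standard CDM.

```python
import sqlite3

# Made-up local concept IDs describing each provider slot's role.
BILLING = 2_000_000_001
RENDERING = 2_000_000_002

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visit_detail (visit_detail_id INTEGER PRIMARY KEY, provider_id INTEGER);
-- Hypothetical non-standard sidecar table, at most one row per visit_detail row.
CREATE TABLE visit_detail_provider_ext (
    visit_detail_id INTEGER PRIMARY KEY REFERENCES visit_detail (visit_detail_id),
    provider_1_id INTEGER, provider_1_concept_id INTEGER,
    provider_2_id INTEGER, provider_2_concept_id INTEGER);
""")

conn.executemany("INSERT INTO visit_detail VALUES (?, NULL)", [(100,), (101,)])
# Only claim line 100 has extra provider information in this toy data.
conn.execute(
    "INSERT INTO visit_detail_provider_ext VALUES (?, ?, ?, ?, ?)",
    (100, 10, BILLING, 11, RENDERING),
)

# The join keeps visit_detail at its original grain: one output row per line.
rows = conn.execute("""
    SELECT vd.visit_detail_id, ext.provider_1_id, ext.provider_2_id
    FROM visit_detail vd
    LEFT JOIN visit_detail_provider_ext ext
           ON ext.visit_detail_id = vd.visit_detail_id
    ORDER BY vd.visit_detail_id
""").fetchall()
```

The primary key on the sidecar is what keeps the join from multiplying rows; the cost is that standard OHDSI tooling will not know the table exists.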

Hello @Dan_Angelelli and welcome to OHDSI!

The question you are asking of the data will inform how you should model them. What’s your use case for knowing the billing provider is different than the attending/referring/rendering provider? Are you studying Providers? Who treats more patients or brings in more revenue? Or?

Hi Melanie!

I’m building a model for a large research project that needs to support many different use cases for many different researchers. I’m hoping to lose as little information as possible because I don’t know all the use cases that will emerge.

I’m more on the engineering side, so am less familiar with specific research questions. That said, I could see an example research question looking at things like
• Analyzing how provider and patient demographic similarity/differences impact quality of care, which might need rendering and/or referring providers
• Some sort of classification of visit type based on the billing provider

I want to support the maximum number of use cases and generally I’m hoping to find a home for all of the data elements that researchers typically have access to from flattened or header/line style data sets.


You surely can do all sorts of acrobatics to shove the source data into the CDM. But @MPhilofsky’s question is valid: What is the purpose?

Generally, the OMOP CDM is a patient-centric model. So, it matters what happened to the patient. It is irrelevant to the patient who billed the payer, but it is very relevant who treated them and who referred them to other providers.

So, probably the best approximation is to leave the referring provider on the visit, and put the rendering provider on the actual procedure (or whatever is claimed in the claim). Any FACT_RELATIONSHIP records will be nominally correct, but the analyst is hardly going to dig through that thicket.
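
As a rough illustration of that split, a small Python sketch: the source field names (claim_id, line_id, referring_provider_id, rendering_provider_id, billing_provider_id) are hypothetical, and the output rows are cut down to the columns that matter here.

```python
def split_providers(claim_line):
    """Map one source claim line to a visit row and a procedure row.

    The referring provider goes on the visit, the rendering provider on the
    procedure. The billing provider is deliberately not carried over.
    """
    visit_row = {
        "visit_occurrence_id": claim_line["claim_id"],
        "provider_id": claim_line.get("referring_provider_id"),
    }
    procedure_row = {
        "procedure_occurrence_id": claim_line["line_id"],
        "visit_occurrence_id": claim_line["claim_id"],
        "provider_id": claim_line.get("rendering_provider_id"),
    }
    return visit_row, procedure_row


visit_row, procedure_row = split_providers(
    {
        "claim_id": 1,
        "line_id": 100,
        "billing_provider_id": 10,
        "rendering_provider_id": 11,
        "referring_provider_id": 12,
    }
)
```

Here visit_row carries provider 12 and procedure_row carries provider 11; provider 10 (billing) has no home, which is the information loss this approximation accepts.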

Makes sense?

Yes, that makes sense although I still don’t love the idea of needing to “approximate” and lose a little information as a result of transforming the data.

I certainly hear your point though, @Christian_Reich and @MPhilofsky. Framing it as a patient-centric model is helping to conceptualize what’s going on.

I need to take this back for discussion to some folks on my team.

I think the patient centric evaluation criterion helps to limit the effort spent on defining a common data model.

The community also somewhat agreed that if a CDM user needs to extend the model for some extreme use case, it is nice to bring that extension back to the community.

So I am not arguing for extending the limits of CDM. Boundaries help in focusing.

Just re-emphasizing the value of sharing solutions to extreme (non-patient-centric) use cases with the community.

So @Dan_Angelelli, consider how you would locally extend the model so as not to lose any claims data, how complex that would be, and what analytical SQL or other code would elegantly solve the ‘beyond current CDM’ or ‘beyond patient-centric’ problem. We have examples of genomic-centric and oncology-centric adventures.

A health-services-research-centric extension (optimization of care networks, payment models) does not seem far-fetched to me. OMOP started with claims data and later did EHR and registries and … No reason not to return to what else in claims is low-hanging fruit and useful (beyond patient-centric).