
How to Differentiate Closed Claims data sets from Open Claims data sets

I’m working on the modelling/ETL of healthcare claims data to OMOP.
The data set has a mix of open claims (claims submitted by the providers, pre-adjudication) and closed claims (adjudicated claims provided by the payers).
I’m stuck on how to differentiate the open claims vs closed claims data in OMOP, which is important information to provide to researchers because closed claims are generally viewed as higher quality.

1st plan:
I was initially thinking that I could use the cost_type_concept_id to capture this, as shown below.
Concept Code | Concept                        | Interpretation
OMOP4822212  | Provider System                | Open Claims
OMOP4822218  | Payer system (Primary payer)   | Closed Claims
OMOP4822217  | Payer system (Secondary payer) | Closed Claims

I’m not sure that’s best, though, for two reasons:
A) Open (pre-adjudicated) claims should generally have a charged/billed amount, probably a co-pay, and maybe an allowed amount. Open claims don’t have a “paid amount”, though, and might not have any amount at the line level, so I’m not 100% sure that every open claims record would hit the cost table (at least for the claim line level in visit_detail).
B) Open vs closed claims seems more like it’s describing the visit_type, and it feels a bit like hiding that important descriptor off to the side if it lives in the cost table.

2nd plan:
I’m modelling every claim line (grain: header_id + line_number) to be a record in the visit_detail table and then will aggregate that up into visit_occurrence so that a visit_occurrence record is basically an episode of care. I think it makes most sense to capture the open claim vs closed claim differentiation on the visit_detail and visit_occurrence records themselves, probably by the visit_type_concept_id.

Here, though, I’m stuck again. I see concepts for medical/pharma/dental claim that all seem like they would represent a closed claim, but I don’t see any concepts that would represent an open medical/pharma/dental claim. It seems like I can add custom concepts to accomplish my goal, but I’m assuming there is a within-bounds way to do this already and I’m just not seeing it.

Can someone provide guidance on how this is typically approached?

To identify the provenance of a record in the CDM, use the <domain>_type_concept_id field, where the domain is the table (e.g., visit_type_concept_id). The allowable concept_ids for this field are located here. And it looks like you found them:

Can you elaborate on what exactly happens during adjudication? E.g., x items become y items (y < x, y = x - n) and n items are not adjudicated.

Do you care about representing the financial side pre- and post-adjudication (items lost as not adjudicated)?

Or are you focused on diagnoses (or procedures) pre- and post-adjudication (items lost as not adjudicated)?

I am partially trying to learn more about this issue from someone who is close to the data.

I am trying to understand the downstream use case (analytical/research question):

  • Study what procedures or diagnoses end up in the n set? That is a very interesting question (as a healthcare consumer).

  • Which payors reject which items?

The difference in quality is two-fold:

  • There is a substantial number of fraudulent claims that payers are trying to weed out. Not sure how successful they are with that. Probably nobody knows.
  • There are Open Claims that cannot be assigned to a patient, because the identifiable information (name, subscription ID, dob etc.) is wrong. Which results in Open Claims having a substantial amount of semi-duplicates (claims that get reclaimed after they are returned by the payer as erroneous). But these claims tend to be “solitaires”, i.e. the only record for that “patient”. For our type of longitudinal research they never make it into meaningful cohorts, so in reality Open Claims work much better than expected.
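The “solitaire” point can be illustrated with a minimal Python sketch. It assumes claims have already been assigned some patient key; the field names `person_key` and `claim_id` are hypothetical and not part of any claims format or of the CDM. A claim whose identifying information is wrong ends up under a key that has no other records, so it never contributes to a longitudinal cohort.

```python
from collections import Counter


def split_solitaires(claims):
    """Separate claims whose patient key occurs exactly once from the rest."""
    counts = Counter(c["person_key"] for c in claims)
    solitaires = [c for c in claims if counts[c["person_key"]] == 1]
    longitudinal = [c for c in claims if counts[c["person_key"]] > 1]
    return solitaires, longitudinal


# Hypothetical data: the third claim carries a mistyped identifier, so it
# lands under its own "patient" and becomes a solitaire.
claims = [
    {"claim_id": "c1", "person_key": "A"},
    {"claim_id": "c2", "person_key": "A"},
    {"claim_id": "c3", "person_key": "A-typo"},
]
solitaires, longitudinal = split_solitaires(claims)
```

Cohort definitions that require more than one record per person exclude the solitaires automatically, which is why they tend not to distort longitudinal analyses.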

The claims that are nominally correct but get rejected for violating a billing rule are actually a quality issue of the Closed Claims. Billing rules are notoriously opaque. They change all the time, and private payers have different (but similar) rules from the Medicare/Medicaid rule set.

And yes, the Open Claims have charges, and the Closed Claims contain what was paid. The charges in the Open Claims tend to be pretty useless, as they often are fantasy prices ignoring what was negotiated between payers and providers. The payers just pay what’s due.

These shouldn’t be just in the COST table, but in all the clinical event tables that get populated from a claim.

That’s not OMOP CDM. It does NOT model claims. It models what happened to a patient. So, if the patient had a hospital visit, and the hospital (facility claim) and a bunch of physician providers or provider groups claimed either fee for service or aggregated (DRG type) claims, you need to consolidate that into one VISIT_OCCURRENCE record. It should last from admission to discharge. The VISIT_DETAIL would point out what happened within that visit, i.e. move between different departments and functions.
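As a rough illustration of that consolidation, here is a minimal Python sketch that folds facility and professional claims with overlapping dates into one visit per stay. The input field names (`claim_id`, `person_id`, `start`, `end`) and the `max_gap_days` parameter are assumptions for the example, not CDM columns. The sketch also merges any overlapping claims regardless of care setting, which a real ETL would probably need to refine.

```python
from datetime import date, timedelta


def consolidate_visits(claims, max_gap_days=0):
    """Merge each person's overlapping claims into visits.

    A claim joins the running visit when it belongs to the same person and
    starts no later than max_gap_days after that visit's current end date.
    """
    visits = []
    ordered = sorted(claims, key=lambda c: (c["person_id"], c["start"], c["end"]))
    for claim in ordered:
        last = visits[-1] if visits else None
        if (
            last is not None
            and last["person_id"] == claim["person_id"]
            and claim["start"] <= last["end"] + timedelta(days=max_gap_days)
        ):
            # Extend the visit so it lasts from admission to discharge.
            last["end"] = max(last["end"], claim["end"])
            last["claim_ids"].append(claim["claim_id"])
        else:
            visits.append({
                "person_id": claim["person_id"],
                "start": claim["start"],
                "end": claim["end"],
                "claim_ids": [claim["claim_id"]],
            })
    return visits


# Hypothetical data: one hospital stay billed by a facility and two
# physicians, followed by an unrelated visit a month later.
claims = [
    {"claim_id": "facility-1", "person_id": 1, "start": date(2020, 1, 1), "end": date(2020, 1, 5)},
    {"claim_id": "prof-1", "person_id": 1, "start": date(2020, 1, 2), "end": date(2020, 1, 2)},
    {"claim_id": "prof-2", "person_id": 1, "start": date(2020, 1, 4), "end": date(2020, 1, 4)},
    {"claim_id": "office-1", "person_id": 1, "start": date(2020, 2, 1), "end": date(2020, 2, 1)},
]
visits = consolidate_visits(claims)
```

Each resulting visit would become one VISIT_OCCURRENCE row, while the contributing claims stay available for populating VISIT_DETAIL.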

Episodes of care could be placed into the EPISODE and EPISODE_EVENT tables.

But again, OMOP is not about reimbursement of healthcare service. You can create your own representation for that. Would be good to show the community what you did. Just don’t abuse the existing tables.

Alright, thanks for all the feedback. There are multiple things to address, but breaking it down into responses below.

@Vojtech_Huser :
I think Christian_Reich explained the Open/Closed difference and what happens during adjudication. The main thing I’m concerned with is that I will eventually be combining data from Open claims into the same place as data from Closed claims, and I need to have a clear way for researchers to know the provenance. I’m not the best person to speak to the different use cases for Open vs Closed, I just want to make sure that people can clearly tell which data they’re looking at.

@MPhilofsky thanks for confirming that those 3 OMOP codes would be how I call out the provenance in the cost table.


“These shouldn’t be just in the COST table, but in all the clinical event tables that get populated from a claim.”

This might be what I’m missing. Can we build out an example? Let’s say I have a closed claims data source (e.g., Medicaid payer data) showing that a patient had a procedure that is 100% paid for by Medicaid.

Are you saying that the record in the procedure_occurrence table would have a value of ‘OMOP4822218’ (Payer system (Primary payer) Closed Claims) in the procedure_type_concept_id field?

Also, the corresponding row in the cost table would have the exact same value of ‘OMOP4822218’ in the cost_type_concept_id field? And this cost record would have cost_event_id linking to the procedure record (or possibly both the procedure_occurrence record and the cost record link to the same visit_detail id, tbd)?
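A minimal Python sketch of the two rows described in this example is below. It assumes the cost record links directly to the procedure through cost_event_id with cost_domain_id set to "Procedure". The source field names (`row_id`, `service_date`, `paid_amount`) are hypothetical. Note that the *_type_concept_id columns hold an integer concept_id, so the concept code 'OMOP4822218' would have to be resolved against the CONCEPT table first; the value used here is a placeholder, not a real concept_id.

```python
def build_procedure_and_cost(claim_line, type_concept_id):
    """Build a procedure row and its cost row with the same provenance type."""
    procedure = {
        "procedure_occurrence_id": claim_line["row_id"],
        "person_id": claim_line["person_id"],
        "procedure_date": claim_line["service_date"],
        "procedure_type_concept_id": type_concept_id,
    }
    cost = {
        "cost_id": claim_line["row_id"],
        # Link the cost back to the clinical event it pays for.
        "cost_event_id": procedure["procedure_occurrence_id"],
        "cost_domain_id": "Procedure",
        "cost_type_concept_id": type_concept_id,
        "total_paid": claim_line["paid_amount"],
    }
    return procedure, cost


# Placeholder lookup: in a real ETL this would come from the CONCEPT table
# (concept_code -> concept_id). The -1 is NOT a real concept_id.
concept_id_by_code = {"OMOP4822218": -1}

claim_line = {
    "row_id": 1001,
    "person_id": 42,
    "service_date": "2020-03-15",
    "paid_amount": 250.0,
}
procedure, cost = build_procedure_and_cost(claim_line, concept_id_by_code["OMOP4822218"])
```

Both rows carry the same type concept, so a researcher can see the closed-claims provenance from either the clinical table or the cost table.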


You lost me here:

“That’s not OMOP CDM. It does NOT model claims.”

I misspoke when I said “episode of care”. I’m planning on visit_occurrence being aggregated up from visit_detail similar to the discussion in the decision time for visit_detail thread and this separate comment by Gowthan_Rao. The grain for visit_occurrence will be coarser than the claim_header and we’ll use temporal association so that a visit_occurrence is bound by Admission/Discharge/Transfer.
Does that correction address your concern?



You got it.

Good luck.

Okay great. That was super helpful and I think I understand perfectly. I have one last question to make sure I’m tracking provenance correctly for cost and the other tables.

Can you confirm that the below is accurate?

Consider the table below in the context of loading a schema from a single data source that is exclusively closed claims. If that data source flows into all of these tables, the “_type_concept_id” fields would be populated as shown to reflect the provenance.

Table                | Field for Identifying Provenance | FK Class                                           | Appropriate Concept Code
CONDITION_OCCURRENCE | condition_type_concept_id        | Type Concept                                       | OMOP4976940
COST                 | cost_type_concept_id             | (Blank in docs, but implied as “Type Concept”)     | OMOP4976940, or OMOP4976941*
DEATH                | death_type_concept_id            | Type Concept                                       | OMOP4976940
DEVICE_EXPOSURE      | device_type_concept_id           | Type Concept                                       | OMOP4976940
DRUG_EXPOSURE        | drug_type_concept_id             | Type Concept                                       | OMOP4976940
EPISODE              | episode_type_concept_id          | Type Concept                                       | OMOP4976940
MEASUREMENT          | measurement_type_concept_id      | Type Concept                                       | OMOP4976940
NOTE                 | note_type_concept_id             | Type Concept                                       | OMOP4976940
OBSERVATION          | observation_type_concept_id      | Type Concept                                       | OMOP4976940
OBSERVATION_PERIOD   | period_type_concept_id           | Type Concept                                       | OMOP4976940
PROCEDURE_OCCURRENCE | procedure_type_concept_id        | Type Concept                                       | OMOP4976940
VISIT_DETAIL         | visit_detail_type_concept_id     | Type Concept                                       | OMOP4976940
VISIT_OCCURRENCE     | visit_type_concept_id            | Type Concept                                       | OMOP4976940

* in cases where there is a secondary payer, the cost table would get an additional row with the secondary payer code.

OMOP4976940 Payer system record (primary payer)

OMOP4976941 Payer system record (secondary payer)
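The mapping above could be captured in an ETL configuration along the following lines. This is only a sketch: the function name and the `has_secondary_payer` flag are hypothetical, and the concept codes are copied from this post. The codes would still need to be resolved to integer concept_ids before loading.

```python
PRIMARY_PAYER_CODE = "OMOP4976940"    # Payer system record (primary payer)
SECONDARY_PAYER_CODE = "OMOP4976941"  # Payer system record (secondary payer)

# Provenance field for each table, as listed in the post.
TYPE_CONCEPT_FIELDS = {
    "CONDITION_OCCURRENCE": "condition_type_concept_id",
    "COST": "cost_type_concept_id",
    "DEATH": "death_type_concept_id",
    "DEVICE_EXPOSURE": "device_type_concept_id",
    "DRUG_EXPOSURE": "drug_type_concept_id",
    "EPISODE": "episode_type_concept_id",
    "MEASUREMENT": "measurement_type_concept_id",
    "NOTE": "note_type_concept_id",
    "OBSERVATION": "observation_type_concept_id",
    "OBSERVATION_PERIOD": "period_type_concept_id",
    "PROCEDURE_OCCURRENCE": "procedure_type_concept_id",
    "VISIT_DETAIL": "visit_detail_type_concept_id",
    "VISIT_OCCURRENCE": "visit_type_concept_id",
}


def type_concept_codes(table, has_secondary_payer=False):
    """Return the provenance field and the concept code(s) to load for a table.

    Only COST gets a second code, one extra row per secondary payer.
    """
    field = TYPE_CONCEPT_FIELDS[table]  # raises KeyError for unknown tables
    codes = [PRIMARY_PAYER_CODE]
    if table == "COST" and has_secondary_payer:
        codes.append(SECONDARY_PAYER_CODE)
    return field, codes


visit_field, visit_codes = type_concept_codes("VISIT_OCCURRENCE", has_secondary_payer=True)
cost_field, cost_codes = type_concept_codes("COST", has_secondary_payer=True)
```

Keeping the mapping in one place makes it easier to swap in different codes when an open claims source is loaded later.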

Sounds good.