Use cases: To track health care resource utilization, health plans are interested in calculating the following from the OMOP common data model:
Visits per 1,000 per year (Numerator: number of unique visits in a certain period; Denominator: person-time during that period)
Encounters per 1,000 per year (Numerator: number of unique encounters in a certain period; Denominator: person-time during that period)
Claims per 1,000 per year (Numerator: number of unique claims in a certain period; Denominator: person-time during that period)
(Inpatient) days per 1,000 per year (Numerator: number of unique (inpatient) dates in a certain period; Denominator: person-time during that period)
(Inpatient) average length of stay (Numerator: total number of unique (inpatient) dates in a certain period; Denominator: number of persons during that period)
[Dates may be visit days, encounter days or claim days, if using from and to dates]
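To make the numerator and denominator above concrete, here is a minimal Python sketch of a "per 1,000 per year" rate. The function names, the enrollment spans and the visit ids are all made up for illustration, and the sketch assumes the event ids have already been restricted to the reporting period. It is not a reference implementation against the CDM.

```python
from datetime import date


def person_years(enrollment_spans, period_start, period_end):
    """Sum the enrolled time, in years, that falls inside the reporting period."""
    total_days = 0
    for span_start, span_end in enrollment_spans:
        first = max(span_start, period_start)
        last = min(span_end, period_end)
        if first <= last:
            total_days += (last - first).days + 1  # count both endpoints
    return total_days / 365.25


def rate_per_1000_per_year(event_ids, enrollment_spans, period_start, period_end):
    """Unique events per 1,000 person-years.

    Assumes event_ids only contains events that fall inside the period.
    """
    years = person_years(enrollment_spans, period_start, period_end)
    if years == 0:
        raise ValueError("no person-time in the reporting period")
    return 1000 * len(set(event_ids)) / years


# Hypothetical data: two members, three distinct visits (id 102 appears twice).
spans = [
    (date(2016, 1, 1), date(2016, 12, 31)),
    (date(2016, 7, 1), date(2017, 6, 30)),  # only partly inside the period
]
visit_ids = [101, 102, 102, 103]
rate = rate_per_1000_per_year(visit_ids, spans, date(2016, 1, 1), date(2016, 12, 31))
print(round(rate, 1))
```

The same function would serve for encounters, claims or inpatient days by swapping what goes into `event_ids`, which is exactly why the definition of each unit matters so much.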
These are generally then stratified by site of service, condition/diagnosis (primary), primary procedure, patient demographics, etc. The metrics and stratifications are used for reporting and visualization in Business Intelligence platforms.
I think health plans would be very interested in being able to measure this consistently and reproducibly.
That is the problem. An encounter can be a 5 minute stop-by in a walk-in clinic to get a flu shot, or a month-long stay in a hospital with tons of diagnostic procedures and treatments. Within such a jumbo encounter you can have little encounters embedded, because those providers bill separately. And each encounter has a bunch of claims attached.
Bottom line: We have to either infer the true healthcare experience of the patient from how the claims are submitted, and that depends on how the provider is organized. Or ignore it and keep it the way it is (like in your use cases).
Or both. My proposal 2 supports that: put your encounters in a new ENCOUNTER table where we dump everything as it is given to us. That’s where you would do your use cases. In the VISIT_OCCURRENCE table we infer the standard visits.
Can we use Visit - Encounter - Service (wordsmithing?), where Visit is our visit_occurrence table, Encounter is a new table, and Service is represented in visit and/or encounter as a person-level concept for a datetime?
Ok - @Christian_Reich now I see this clearly. Makes sense - there is precedent for representing inferred information in the CDM.
The more I think about it, I don’t think we need another encounter table. We could use the existing visit_occurrence table and concept_ids to represent the level (visit, micro-visit, mini-visit, encounter, etc.).
visit_level_concept_id: A foreign key to the predefined concepts in the Visit Level Vocabulary reflecting the granularity concept of a nest of information.
visit_level_relationship_concept_id: A foreign key to the predefined concepts (HL7’s partOf links) Vocabulary.
related_level_visit_id: A foreign key to the VISIT_OCCURRENCE table record of the visit that is related to visit but not in the same level/nest.
visit_level_definition_id: A foreign key to a new Definition table that records descriptions and syntax that led to the instantiation of the visit_occurrence record.
Alternatively, we could use an encounter table as proposed above, but why add a table when the visit_occurrence table can solve the use cases?
The impetus for my original proposal at the top of this thread is to solve level of care, location of care and transfer of care within an inpatient visit in greater granularity than is currently available in the VISIT_OCCURRENCE table. The proposal @Gowtham_Rao links above is very thorough, but neglects my use case. Is there a way to solve everyone’s use case with one change?
Explanation: The level of care a patient received. Yes, the patient may continue to have the same provider in an ICU or on a standard medical unit, but the level of direct care (staffing ratio, certifications of direct care staff, complexity and number of procedures, measurements and drugs, etc.) given to a patient is vastly different. We’d like to look at outcomes associated with these different levels of care. We would also like to research attributes of patients that went from the ER > medical floor > ICU within a short period of time. What do they have in common? Maybe certain attributes should be considered in the level of care placement of a patient within an inpatient hospital stay.
Per @Gowtham_Rao’s proposal and others, a consequence of combining this new data with the current VISIT_OCCURRENCE table is “Performance impacts for extremely large data”. Those of us with EHR data will be adding a lot of data to our instances of the CDM. Not every query will be asking for this level of granularity, so I propose we keep the new table separate to lighten the load.
@MPhilofsky I think we should distinguish between a service and location - and that will probably show how the proposal addresses most of your use case.
The proposal does not address the use cases that involve the concept of a ‘service’. But it does address two of your three focus areas: it does not address level of care, but it does address location of care and transfer of care.
An example: When I was training as an internist in the hospital, we would have many days when the ICU beds were full. On those days, if the ER was still open and a patient came in who needed ICU-level care, it made things difficult for everyone. Since the ICU had no physical space, the patient would stay in the ER and the ICU nurse/physician would take care of the patient in the ER. This is an example of ICU service in the ER location. In this example, care_site_id would capture the physical location, i.e. ER, and service_concept_id would capture the ICU service. In the real world there will be many such examples, such as dialysis service in a rehabilitation floor location, or surgery service in the ER location.
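A small Python sketch of the distinction, just to make it tangible. The `CareSegment` class and the string labels are hypothetical stand-ins for `care_site_id` and the proposed `service_concept_id`; nothing here is an existing CDM structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CareSegment:
    person_id: int
    care_site: str  # physical location; stand-in for care_site_id
    service: str    # service delivered; stand-in for the proposed service_concept_id


def service_outside_home_unit(segments, service, home_site):
    """Segments where a service was delivered somewhere other than its usual unit."""
    return [s for s in segments if s.service == service and s.care_site != home_site]


# Hypothetical segments, including the "ICU service in the ER location" case.
segments = [
    CareSegment(person_id=1, care_site="ER", service="ER"),
    CareSegment(person_id=2, care_site="ER", service="ICU"),
    CareSegment(person_id=3, care_site="ICU", service="ICU"),
    CareSegment(person_id=4, care_site="Rehab floor", service="Dialysis"),
]
boarded_icu = service_outside_home_unit(segments, service="ICU", home_site="ICU")
print([s.person_id for s in boarded_icu])
```

If location and service were collapsed into a single field, person 2 could only be recorded as either an ER patient or an ICU patient, and the question above could not be asked.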
The visit_occurrence table with the proposal addresses the use cases listed that involve location or transfers between locations: e.g.
These are not addressed:
To address these we have to vet out what is a service and what is a level-of-care or care-intensity. We will have to introduce service_concept_id and have a new domain, concept_id etc for service. This is important, happy to support you on that work.
This is also addressed in the proposal - because the change is physical location (a micro visit).
Yes - I am no data modeler; there are others who would know best. It’s a trade-off.
You make some good points. Were you the debate club captain in high school?
Our unit of analysis is physical location and level of care. Generally, the questions have asked about patients receiving ICU level care which includes the medical ICU, neuro ICU, etc. Sometimes they ask about a specific care_site (who received aaa procedure in the xxxICU vs the yyyICU). For us, the service_concept_id could be a way to group the differing levels of care (ICU care, acute care, rehab) and the care_site would be a specific site within a location. Unfortunately, the current vocabularies available for use aren’t granular enough to represent the care_site at the level our investigators require. The service_concept_id value set that the PEDSnet folks created would help us distinguish the level of care.
I’m going to let this rest while our friends mull it over. Hopefully, this can be finalized at the F2F so we can add this data to our instance of OMOP.
Yes, sounds like a good idea. Let me understand: Essentially, instead of a fixed Visit-Encounter-Procedure hierarchy we use one table and Concepts to clarify what it is, and the hierarchical relationships between these Concepts are predefined and live outside, correct?
If so, I like it. It’s no different than in DRUG_EXPOSURE. There, we have only one Concept for the Drug, but that Concept tells us whether it is a Product, a Component, or an Ingredient. And you have the full hierarchical relationships to ask “Give me all records with treatment with Ingredient XYZ”, and all you have to do is join the CONCEPT_ANCESTOR table. We could do the same thing here: “Give me all the records with a hospitalization”, which would pull up hospital-based services.
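Here is a rough sketch of that CONCEPT_ANCESTOR pattern applied to visits, using Python's built-in sqlite3 with toy tables. The table and column names follow the CDM, and 9201/9202 are the standard Inpatient/Outpatient Visit concepts, but the two "hospital-based service" concept ids and the ancestor rows linking them to Inpatient Visit are invented for illustration. No such hierarchy exists in the vocabularies as far as this thread is concerned.

```python
import sqlite3

INPATIENT_VISIT = 9201   # OMOP Inpatient Visit
OUTPATIENT_VISIT = 9202  # OMOP Outpatient Visit
ICU_STAY = 900001        # hypothetical concept for a finer-grained level
FLOOR_STAY = 900002      # hypothetical concept for a finer-grained level

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visit_occurrence (
    visit_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    visit_concept_id INTEGER
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
""")
# Every concept is its own ancestor; the two hypothetical concepts roll up to Inpatient Visit.
conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?)", [
    (INPATIENT_VISIT, INPATIENT_VISIT),
    (INPATIENT_VISIT, ICU_STAY),
    (INPATIENT_VISIT, FLOOR_STAY),
    (ICU_STAY, ICU_STAY),
    (FLOOR_STAY, FLOOR_STAY),
    (OUTPATIENT_VISIT, OUTPATIENT_VISIT),
])
conn.executemany("INSERT INTO visit_occurrence VALUES (?, ?, ?)", [
    (1, 10, INPATIENT_VISIT),
    (2, 10, ICU_STAY),
    (3, 10, FLOOR_STAY),
    (4, 11, OUTPATIENT_VISIT),
])


def records_under(conn, ancestor_concept_id):
    """Ids of all records whose concept is the ancestor itself or one of its descendants."""
    rows = conn.execute(
        """
        SELECT v.visit_occurrence_id
        FROM visit_occurrence v
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = v.visit_concept_id
        WHERE ca.ancestor_concept_id = ?
        ORDER BY v.visit_occurrence_id
        """,
        (ancestor_concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


hospital_records = records_under(conn, INPATIENT_VISIT)
icu_records = records_under(conn, ICU_STAY)
print(hospital_records, icu_records)
```

The point of the design is that the query text never changes: asking for a broader or narrower level only changes the concept id passed in.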
Except: The derived DRUG_ERAs are in a different table, and they are predefined on the Ingredient level.
We may think of replicating this approach. Here is the reason: There are all these @MPhilofsky and @bailey use cases trying to figure out what goes right and wrong inside the hospital. But 90% of the use cases for the visit table are “Give me all patients with a hospitalization for Condition XYZ”, where “hospitalization” really is a measure of severity. @MPhilofsky has that even inside her hospital world.
So, how about this: We create a new table for any of the different levels Episode, Visit, Encounter, Service (not Procedure), and folks dump in there whatever they find, together with a Concept that defines the level. This is the equivalent of DRUG_EXPOSURE. And VISIT_OCCURRENCE gets inferred and cobbled together, and becomes (stays really) the equivalent of the DRUG_ERA. In all the event tables we add another field: newtable_occurrence_id, which is the equivalent of visit_occurrence_id.
A new table is fine I guess, but do we really have to add a new column to every data table (drug, condition, etc.)? It doesn’t meet the need anyway if the encounters and services are not hierarchical. I.e., you both switch services while on a floor and switch floors while on the service. Therefore one column is not enough to link to both encounters and services. Isn’t this what the fact_relationship table is supposed to do?
Similar issues are surfacing as we figure out storing microbiology results, although that one is more hierarchical. In our clinical data repository, we have “component” tables that nest with child rows pointing to parent rows within the table. It has worked for almost 30 years, but you have to infer the hierarchy. With fact_relationship we can be more explicit about the parents and children and it need not be a strict hierarchy. Nevertheless, everyone is nervous about relying on the fact_relationship table to store micro results, and I wonder if it is the same for visit information.
Your proposal makes me think more: Episodes. We have CPT4 concepts 0525F “Initial visit for episode (BkP)” and 0526F “Subsequent visit for episode (bkp)”, which could be used to build them! But we would have no place to have that inferred in a standardized way. Not even sure we can, really, given the data. But it would be cool.
Another thing: Services. We have all these CPT4 and HCPCS codes for services, which we really trash into the Observation table right now:
| Concept code | Name | Vocabulary | Concept class |
|---|---|---|---|
| 1013774 | Home Services | CPT4 | CPT4 Hierarchy |
| 1013775 | New Patient Home Services | CPT4 | CPT4 Hierarchy |
| ET | Emergency services | HCPCS | HCPCS Modifier |
| G0108 | Diabetes outpatient self-management training services, individual, per 30 minutes | HCPCS | HCPCS |
Yeah, I know, but it’s only one tiny little column.
Why is this a problem? One column connects the event to the VISIT_OCCURRENCE, the other one to NEWTABLE_OCCURRENCE. The data define the latter, and there is usually a one-to-one. In a way, we don’t have that in the DRUG_EXPOSURE - DRUG_ERA world, where we don’t know which Exposure belongs to which Era. Here, we would (indirectly).
The FACT_RELATIONSHIP table doesn’t really work, because you cannot join things without an extensive hierarchy of CASE WHEN ... THEN expressions. Not efficient, terrible to write and read. Hence my hatred.
What happened to that proposal: Should we bring it up on the F-T-F?
Yes - and I think it generalizes our thinking and makes the table both future proof and backward compatible at the same time. By using concepts at the record level of visit_occurrence, and defining the hierarchy between concepts outside the visit_occurrence table – we can meet a lot of use-cases.
This “VISIT_OCCURRENCE gets inferred and cobbled together, and becomes (stays really) the equivalent of the DRUG_ERA” will be a source of confusion for newcomers. Let me explain. “_ERA” is obviously calculated/inferred, i.e. any inferred table is an “_ERA” table. The event tables (condition_occurrence, procedure_occurrence, drug_exposure, visit_occurrence), on the other hand, may or may not be inferred and may have 1:1 lineage to source data. If we use visit_occurrence as the inferred table, it will make our table naming convention confusing.
So we use two tables:
visit_occurrence: intended to be as close to the source data as possible, including the enhancement proposals I have made, but we don’t do inferred calculation. This will meet all the use cases I mentioned above (no change to the proposal I put forward).
visit_era (new table, if we really have to): the derived/inferred table that will be calculated from the visit_occurrence table, like drug_era and condition_era. This will meet the @MPhilofsky and @bailey use cases?
This will avoid the confusion in the table naming conventions - and everyone is happy.
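To illustrate what "calculated from visit_occurrence like a drug_era" could mean, here is a minimal Python sketch that collapses overlapping or touching records per person into eras. The `build_eras` function, the `max_gap_days` parameter and the sample records are assumptions made for illustration; the real era logic would need to be agreed on, and this is not how any existing OHDSI script is written.

```python
from datetime import date, timedelta
from itertools import groupby


def build_eras(records, max_gap_days=0):
    """Collapse (person_id, start_date, end_date) records into eras.

    A record joins the current era when it starts no later than
    max_gap_days after the era's current end date.
    Returns (person_id, era_start, era_end, record_count) tuples.
    """
    eras = []
    ordered = sorted(records, key=lambda r: (r[0], r[1], r[2]))
    for person_id, group in groupby(ordered, key=lambda r: r[0]):
        current = None  # [era_start, era_end, record_count]
        for _, start, end in group:
            if current and start <= current[1] + timedelta(days=max_gap_days):
                current[1] = max(current[1], end)
                current[2] += 1
            else:
                if current:
                    eras.append((person_id, *current))
                current = [start, end, 1]
        eras.append((person_id, *current))
    return eras


# Hypothetical source-level records: a hospital stay with an embedded consult
# and a transfer, a later unrelated visit, and a second person.
records = [
    (1, date(2017, 1, 1), date(2017, 1, 10)),
    (1, date(2017, 1, 3), date(2017, 1, 3)),
    (1, date(2017, 1, 10), date(2017, 1, 12)),
    (1, date(2017, 2, 1), date(2017, 2, 1)),
    (2, date(2017, 3, 5), date(2017, 3, 5)),
]
eras = build_eras(records)
for era in eras:
    print(era)
```

Note that the era rows no longer say which source record produced them, which is the same property DRUG_ERA has today.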
I am saying that this new table has multiple things in it and a given row in say the condition table may need to link to two rows in the new table. So if you want to link to both the service and the encounter, you only have one column. You cannot infer the service from the encounter or the encounter from the service, so you need both links. If you are willing to infer them from the timing, then you don’t need the new column at all.
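The "infer them from the timing" option can be sketched like this in Python. The `LevelRecord` class, the level labels and the timestamps are hypothetical; the sketch only shows that one event time can land in several rows of the new table at once, which a single foreign key column cannot express.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LevelRecord:
    record_id: int
    person_id: int
    level: str  # hypothetical label, e.g. "encounter" or "service"
    start: datetime
    end: datetime


def containing_records(person_id, event_time, records):
    """All records of any level whose time span contains the event."""
    return [
        r for r in records
        if r.person_id == person_id and r.start <= event_time <= r.end
    ]


# Hypothetical rows: one encounter spanning two consecutive services.
rows = [
    LevelRecord(1, 42, "encounter", datetime(2017, 1, 1, 8), datetime(2017, 1, 5, 12)),
    LevelRecord(2, 42, "service", datetime(2017, 1, 1, 8), datetime(2017, 1, 3, 9)),
    LevelRecord(3, 42, "service", datetime(2017, 1, 3, 10), datetime(2017, 1, 5, 12)),
]
matches = containing_records(42, datetime(2017, 1, 2, 10), rows)
print([(m.record_id, m.level) for m in matches])
```

This only works when the source gives trustworthy start and end times for every row, which is the caveat that makes it attractive for inpatient data and doubtful elsewhere.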
Fact_relationship is a concern, yes. So I don’t have an answer.
I don’t think that makes sense, or maybe I am not understanding. Are you proposing creating a new _occurrence table with its own _occurrence_id AND having that in condition_occurrence, drug_exposure, observation, etc.?
I think I agree here. There is not a 1:1 relationship; there could be an m:n relationship.
Yes. The difference between _OCCURRENCE and _ERA may be used here. _ERA may be used for an arbitrary number of builds, including local definitions for builds.
Can we work out a few use-cases?
I have not had an opportunity to use the FACT_RELATIONSHIP table. Its use makes sense, but your hatred towards it makes me think this is not the right approach. I don’t know.
Correct. In addition to the VISIT_OCCURRENCE, we have GOWTHAMS_ANYLEVEL_OCCURRENCE table. And yes, we need two fields in the CONDITION_OCCURRENCE table, one for a link to each.
The alternative is to mix the VISIT and the GOWTHAMS tables together into one, which would be akin to mixing the DRUG_EXPOSURE and DRUG_ERA tables into one. We could have done it that way, but we haven’t for a reason. One contains the information from the source as-is, the other one is a derived summary.
That I don’t know. Really? The way I see it is that the data will place each event into one of the GOWTHAM records. So, a procedure is in a claim, which corresponds to a certain provider-site combination. So, 1:1. No? You guys know the raw data better.
Well, hang on. You can only roll up to one defined level. It is either Visit or Episode. Otherwise it is a mish-mash again, where events can occur in more than one record. Just like DRUG_ERA rolls up to Ingredient, and nothing else. If you wanted, say, Drug Component level you need to build it yourself as a Cohort.
That would be good. Do you want to do that? I could do too, but I am slower since not as fluid in the claims data.
My hatred of course is not a real debate contribution to this. All I am saying is that in contrast to all other fields in the CDM the foreign keys in the FACT_RELATIONSHIP table aren’t really foreign keys, because you need to decipher the table from the domain_concept_id first. Ugly. But works.
The other aversion I have is due to its “stealth” nature. If, say, visit_occurrence_id is empty, I know that the event didn’t happen during a visit. If you wanted to express the same in the FACT_RELATIONSHIP table you’d have to make sure there isn’t a relationship with the right domain_concept_id_1, fact_id_1, domain_concept_id_2 and relationship_concept_id. In SQL speak: “where not exists (select 1 …)”. Awful. The German proverbial idiom for that, for which I don’t think there is an equivalent in English, says “From behind through the chest into the knee”.
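To show the contrast side by side, here is a small sqlite3 sketch in Python with toy tables. The column names follow the CDM, but the three concept id constants are placeholders chosen for illustration (the relationship id in particular is invented), and the data are made up.

```python
import sqlite3

CONDITION_DOMAIN = 19    # placeholder domain concept id for conditions
VISIT_DOMAIN = 8         # placeholder domain concept id for visits
OCCURRED_DURING = 99999  # hypothetical relationship_concept_id

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    visit_occurrence_id INTEGER
);
CREATE TABLE fact_relationship (
    domain_concept_id_1 INTEGER,
    fact_id_1 INTEGER,
    domain_concept_id_2 INTEGER,
    fact_id_2 INTEGER,
    relationship_concept_id INTEGER
);
""")
# Condition 2 did not happen during any visit.
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?)",
    [(1, 100), (2, None), (3, 101)],
)
conn.executemany(
    "INSERT INTO fact_relationship VALUES (?, ?, ?, ?, ?)",
    [
        (CONDITION_DOMAIN, 1, VISIT_DOMAIN, 100, OCCURRED_DURING),
        (CONDITION_DOMAIN, 3, VISIT_DOMAIN, 101, OCCURRED_DURING),
    ],
)

# Foreign key column: "no visit" is simply a NULL.
via_column = [row[0] for row in conn.execute(
    "SELECT condition_occurrence_id FROM condition_occurrence "
    "WHERE visit_occurrence_id IS NULL"
)]

# FACT_RELATIONSHIP: "no visit" means proving that no matching row exists.
via_fact_relationship = [row[0] for row in conn.execute(
    """
    SELECT c.condition_occurrence_id
    FROM condition_occurrence c
    WHERE NOT EXISTS (
        SELECT 1
        FROM fact_relationship fr
        WHERE fr.domain_concept_id_1 = ?
          AND fr.fact_id_1 = c.condition_occurrence_id
          AND fr.domain_concept_id_2 = ?
          AND fr.relationship_concept_id = ?
    )
    """,
    (CONDITION_DOMAIN, VISIT_DOMAIN, OCCURRED_DURING),
)]
print(via_column, via_fact_relationship)
```

Both queries return the same condition, but the second one depends on three concept ids being exactly right, and a wrong id silently returns every row instead of failing.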
I am not saying that. I am saying we would only infer the Visit level. Like we do now. The other levels are randomly dumped into GOWTHAMS_ANYLEVEL_OCCURRENCE. There is no inference in that one, and both the service and the encounter during which it happened can live there nicely in parallel. Just like in DRUG_EXPOSURE, where we can have the e-prescription of a drug and the dispensing of that same drug both in the table. DRUG_ERA cobbles it all up into one.
Nice idea. Except we would not be backwards compatible, because VISIT_OCCURRENCE already exists and really represents VISIT_ERA. So, we can do one of three things:
We bite that bullet (not good, lots of folks will not notice that VISIT_OCCURRENCE changed)
Instead of VISIT we call it GOWTHAM_OCCURRENCE/ERA or something (all the other terms encounter, service etc. seem to be burnt) and drop the VISIT
We leave the VISIT_OCCURRENCE intact, even though it is an ERA, and for the real OCCURRENCE we use another name, also OCCURRENCE.
Haha! Love it! Let’s do it. Joking, let’s not tarnish my name.
Ok - how about:
Visit_occurrence (legacy - to be removed in v6 of OMOP)
Introduce encounter_occurrence and encounter_era; where encounter_era is like our current visit_occurrence but both encounter tables with the _anylevel_occurrence?
The new table includes several types of things not in a hierarchy. A given data point may want to link to two or more of them. That’s why one column is not enough. So if I want to link a procedure to the medical service the patient was on and to the location the patient was in at the time, I would have to pick one or the other to represent.
Another approach is to let the new table encode the exact times of the services and locations (I mean in the sense of floors) and infer which one applies from the time. That works for inpatient, but not for outpatient.
We could. I don’t have a strong preference. Let the crowd raise or lower the thumb during the face-to-face like the ancient Roman arenas.
Got it. Yes, that’s a problem. @Gowtham_Rao? What do we do with our external hierarchy now?
That one I am not sure about, because the raw data usually give us one such link, even if they could give us more. At least in claims (@Gowtham_Rao, true?). In HL7 we could have a many-to-many, I agree.
Outpatient? I am not sure we have any of these problems there, have we?
I think we are confusing it unnecessarily, and the confusion is again - Service vs visit vs encounter/mini-visit vs level of care etc. I tried to clarify that using “whose point of view”. I say - stop. No more posts on this. Clarify and define the different things we are trying to represent.
Then we can ask: does a hierarchy really exist? Can things live outside a hierarchy? If it is hierarchical, is it 1:1, 1:n, or m:n? Do we represent that in the table or outside the table?
It’s really simple once we define those concepts or adopt those definitions from somewhere else. It’s just like how we strongly differentiate between the domains observation vs. procedure vs. drug vs. condition. Or how we refer to vocabularies as clean or dirty depending on whether they belong to one domain or multiple domains.