New table for visits, encounters, care transitions

Gowtham_Rao · March 2, 2017, 12:26pm

I think this is tough to be sourced from claims data

Agree - we need to preserve claims level referencing.

i.e. whose point of view? Claims will help answer some of the points of view as posted above.

@bailey @MPhilofsky Reading your use-cases, I think the unit of analysis is care_site (site of care), where the care-site generally refers to the ‘level of care’, e.g. ICU vs step down.

When representing data in the OMOP common data model, I think it is critical to represent the data in such a way that it semantically represents the source data and is at the same atomic level of granularity as the source data. This will retain lineage/provenance to the source 1:1. Any ‘inferred’ information should be derived at analysis time. If we represent inferred or calculated information in the OMOP CDM, then we are saying that information came from the source - which is not true.

@Mark_Danese - can we use the Contexts and Collections approach to derive visits, encounters, services as described above? @jenniferduryea this is an opportunity to be able to represent claims data accurately in OMOP CDM like @Patrick_Ryan said

The visit vs. micro-visit lingo may confuse some - what do you think about service, encounter, visit framework above using whose point of view? More word-smithing?

You can know the following:

Facility claim vs professional claim, at least in the USA, by using UB04 vs CMS 1500 claim information, or pharmacy/vision/dental claim, etc.

Other important things in claims are HIPAA place of service, Type of bill, revenue code, admit type that help define it further
Yes finally!

I think we should first agree on conventions, then find an efficient way to represent the data

Mark_Danese · March 2, 2017, 4:50am

Just to answer the question you directed at me. Yes, visits can be derived in the way we handle data. For EHR data, a Collection is a visit. For claims data, a Collection is a claim. If we need to count unique visits for a patient for some reason, we group items using specific rules related to the research question (e.g., using unique dates, places of service, and providers).
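A minimal sketch of deriving visits at analysis time, as described above. The claim-line layout, the sample rows, and the function name `derive_visits` are hypothetical, and the grouping rule (same person, date, place of service, and provider) is only one possible rule a researcher might choose.

```python
from collections import defaultdict
from datetime import date

# Hypothetical claim lines:
# (person_id, claim_id, service_date, place_of_service, provider_id)
claim_lines = [
    (1, "C1", date(2017, 1, 5), "21", "P9"),
    (1, "C2", date(2017, 1, 5), "21", "P9"),  # same date, place, provider as C1
    (1, "C3", date(2017, 1, 5), "11", "P4"),
    (2, "C4", date(2017, 2, 1), "11", "P4"),
]


def derive_visits(lines):
    """Group claim lines into visits keyed by person, date, place and provider."""
    visits = defaultdict(set)
    for person_id, claim_id, service_date, pos, provider_id in lines:
        visits[(person_id, service_date, pos, provider_id)].add(claim_id)
    return dict(visits)


visits = derive_visits(claim_lines)

# Count derived visits per person.
visit_counts = defaultdict(int)
for person_id, *_rest in visits:
    visit_counts[person_id] += 1
```

Changing the grouping key changes the visit count without touching the stored data, which is the point of deferring the rule to analysis time.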

In fact, in my experience a uniquely defined visit is almost never needed in research. Usually, we want a specific clinical event – a diagnosis, procedure, drug, or lab – and the date it occurred. The notion of a visit in most data models is generally used as a convention to link information together.

I think the process begins with establishing what we are trying to do with visits.

Gowtham_Rao · March 2, 2017, 4:54am

Thanks @Mark_Danese

I think, if I understand you correctly, your way gives the community the flexibility of ‘deriving’ the visit, encounter, service etc. based on rules defined by the researcher at analysis time rather than at ETL time.

Is the collection/contexts approach something we can work more on?

Mark_Danese · March 2, 2017, 5:04am

If the community wants to go this direction, I am happy to help. We built our own data model so we could more easily convert raw data to OMOP and Sentinel. If you want to see how it all fits together, it is publicly available: GitHub - outcomesinsights/generalized_data_model: Outcomes Insights' Data Model for Clinical Research

Purely selfishly, the more similar the data models are, the easier my job is. But what we are doing is very specific to our needs and I recognize that OMOP has a fundamentally different goal. That is, OMOP is designed to be an “analysis ready” representation of the data. By definition, decisions related to analyses are baked into the data model. Otherwise it isn’t “analysis ready”.

Gowtham_Rao · March 2, 2017, 12:00pm

Our mission is to be able to do reproducible, collaborative, large scale research. We use a common data model to standardize the structure, content, and representation of data. I don’t think that implies “analytics ready”. We have packages and tools that convert the data in the CDM into an analytics-ready form, but the CDM itself is not analytics ready. I don’t think our mission calls for baking analytic use cases into the CDM so that those use cases are easier; what we do is collaboratively understand multiple use cases and enhance the CDM such that multiple use cases may be supported.

At least that’s my understanding. @Patrick_Ryan thoughts?

Christian_Reich · March 2, 2017, 12:35pm

@Gowtham_Rao:

You are abusing @MPhilofsky’s thread, which is meant to define visits and their hierarchy, for a general ideological discussion. I am going to move it to a new Forum posting, if you don’t mind. See you there.

Gowtham_Rao · March 2, 2017, 1:10pm

Use cases: For tracking health care resource utilization, health plans are interested in being able to calculate the following from the OMOP common data model:

Visits per 1,000 per year (Numerator: number of unique visits in a certain period Denominator: Person-time during that period)
Encounters per 1,000 per year (Numerator: number of unique encounters in a certain period Denominator: Person-time during that period)
Claims per 1,000 per year (Numerator: number of unique claims in a certain period Denominator: Person-time during that period)
(Inpatient) Days per 1,000 per year (Numerator: number of unique (inpatient) dates in a certain period Denominator: Person-time during that period)
(Inpatient) average length of stay (Numerator: total number of unique (inpatient) dates in a certain period Denominator: Number of persons during that period)
[Dates may be visit days, encounter days or claim days - if using from and to dates]
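A minimal sketch of the "per 1,000 per year" metrics listed above. It assumes person-time is summed from hypothetical observation periods with inclusive end dates and that a year is 365.25 days; the function names are illustrative and not part of any OHDSI package.

```python
from datetime import date


def person_years(periods):
    """Sum observation time in years over (start, end) date pairs, end inclusive."""
    days = sum((end - start).days + 1 for start, end in periods)
    return days / 365.25


def rate_per_1000(event_ids, periods):
    """Unique events per 1,000 person-years (events may be visits, encounters, claims or days)."""
    return 1000.0 * len(set(event_ids)) / person_years(periods)


# Two persons, each observed for all of 2017 (365 days each).
periods = [
    (date(2017, 1, 1), date(2017, 12, 31)),
    (date(2017, 1, 1), date(2017, 12, 31)),
]
visit_ids = ["V1", "V2", "V2", "V3"]  # V2 appears twice and is counted once

visits_per_1000 = rate_per_1000(visit_ids, periods)
```

The same function serves all four rates; only the definition of a "unique event" changes, which is why the convention for what counts as a visit, encounter, or claim matters.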

These are generally then stratified by site of service, condition/diagnosis (primary), primary procedure, patient demographics, etc. The metric and stratifications are used for reporting and visualization in Business Intelligence platforms.

I think health-plans would be very interested in being able to consistently and reproducibly measure this.

Christian_Reich · March 2, 2017, 7:39pm

@Gowtham_Rao

That is the problem. An encounter can be a 5 minute stop-by in a walk-in clinic to get a flu shot, or a month-long stay in a hospital with tons of diagnostic procedures and treatments. Within such a jumbo encounter you can have little encounters embedded, because those providers bill separately. And each encounter has a bunch of claims attached.

Bottom line: We have to either infer the true healthcare experience of the patient from how the claims are submitted, and that depends on how the provider is organized. Or ignore it and keep it the way it is (like in your use cases).

Or both. My proposal 2. is supporting that: Put your encounters in a new ENCOUNTER table where we dump everything as it is given to us. That’s where you would do your use cases. In the VISIT_OCCURRENCE table we infer the standard visits.

Works?

Gowtham_Rao · March 3, 2017, 12:02am

@Christian_Reich Thank you - trying hard to

Did you see this Conventions

Can we use Visit - Encounter - Service (wordsmithing?), where visit is our visit occurrence table. Encounter is new (table). Service is represented in visit and/or encounter as a person-level concept for a datetime.

Ok - @Christian_Reich now I see this clearly. Makes sense - there is precedent for representing inferred information in the CDM.

Gowtham_Rao · March 5, 2017, 12:39pm

The more I think about it, I don’t think we need another encounter table. We could use the existing visit_occurrence table, and concept_ids to represent the level (visit, micro-visit, mini-visit, encounter etc.).

See Proposal

visit_level_concept_id: A foreign key to the predefined concepts in the Visit Level Vocabulary reflecting the granularity concept of a nest of information.
visit_level_relationship_concept_id: A foreign key to the predefined concepts (HL7’s partOf links) Vocabulary
related_level_visit_id: A foreign key to the VISIT_OCCURRENCE table record of the visit that is related to visit but not in the same level/nest.
visit_level_definition_id: A foreign key to a new Definition table that records descriptions and syntax that led to the instantiation of the visit_occurrence record.
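A minimal sketch of how a record with the four proposed fields might look. The field names come from the proposal; every concept id below is a made-up placeholder, and the helper `parts_of` is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder concept ids (not real vocabulary entries).
LEVEL_VISIT = 900001
LEVEL_MICRO_VISIT = 900002
PART_OF = 900010


@dataclass
class VisitOccurrence:
    visit_occurrence_id: int
    person_id: int
    # Proposed fields
    visit_level_concept_id: int
    visit_level_relationship_concept_id: Optional[int] = None
    related_level_visit_id: Optional[int] = None
    visit_level_definition_id: Optional[int] = None


def parts_of(records, parent_visit_id):
    """Return ids of records declared as part of the given parent visit."""
    return [
        r.visit_occurrence_id
        for r in records
        if r.related_level_visit_id == parent_visit_id
        and r.visit_level_relationship_concept_id == PART_OF
    ]


records = [
    VisitOccurrence(1, 10, LEVEL_VISIT),                    # inpatient stay
    VisitOccurrence(2, 10, LEVEL_MICRO_VISIT, PART_OF, 1),  # ICU segment of stay 1
    VisitOccurrence(3, 10, LEVEL_MICRO_VISIT, PART_OF, 1),  # floor segment of stay 1
    VisitOccurrence(4, 11, LEVEL_VISIT),                    # unrelated stay
]
```

Here `parts_of(records, 1)` would list the two micro-visits nested in stay 1, all held in the same table.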

Alternatively, we could use an encounter table like proposed above - but why an additional table when visit_occurrence table is able to solve the use-cases?

@Christian_Reich - may I get a spot to propose this to the CDM?

This does not solve the need for ‘services’ i.e. service_concept_id used by PEDSnet. That would be a different proposal

MPhilofsky · March 6, 2017, 9:07pm

How does this:

Solve this use case:

More specifically:

@bailey also has a use case:

The impetus for my original proposal at the top of this thread is to solve level of care, location of care and transfer of care within an inpatient visit in greater granularity than is currently available in the VISIT_OCCURRENCE table. The proposal @Gowtham_Rao links above is very thorough, but neglects my use case. Is there a way to solve everyone’s use case with one change?

To answer @Gowtham_Rao’s question:

Patient + severity of medical condition.

Explanation: The level of care a patient received. Yes, the patient may continue to have the same provider in an ICU or on a standard medical unit, but the level of direct care (staffing ratio, certifications of direct care staff, complexity and number of procedures, measurements and drugs, etc.) given to a patient is vastly different. We’d like to look at outcomes associated with these different levels of care. We would also like to research attributes of patients that went from the ER > medical floor > ICU within a short period of time. What do they have in common? Maybe certain attributes should be considered in the level of care placement of a patient within an inpatient hospital stay.
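A minimal sketch of the ER > medical floor > ICU question, assuming transfers within one inpatient stay are available as hypothetical (start time, care level) pairs. The level labels, the 48-hour default, and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def has_escalation_path(transfers, path=("ER", "MEDICAL_FLOOR", "ICU"),
                        within=timedelta(hours=48)):
    """True if the care levels in `path` occur consecutively and the time from
    the first to the last arrival is no longer than `within`."""
    ordered = sorted(transfers)
    levels = [level for _, level in ordered]
    n = len(path)
    for i in range(len(levels) - n + 1):
        if tuple(levels[i:i + n]) == tuple(path):
            if ordered[i + n - 1][0] - ordered[i][0] <= within:
                return True
    return False


stay = [
    (datetime(2017, 3, 1, 8), "ER"),
    (datetime(2017, 3, 1, 14), "MEDICAL_FLOOR"),
    (datetime(2017, 3, 2, 2), "ICU"),  # 18 hours after ER arrival
]
```

This kind of query needs the transfers stored at sub-visit granularity, which is what the original proposal asks for.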

Per @Gowtham_Rao Proposal and by others, a consequence of combining this new data with the current VISIT_OCCURRENCE table is “Performance impacts for extremely large data”. Those of us with EHR data will be adding a lot of data to our instances of the CDM. Not every query will be asking for this level of granularity, so I propose we keep the new table separate to lighten the load.

Gowtham_Rao · March 6, 2017, 11:41pm

@MPhilofsky I think we should distinguish between a service and location - and that will probably show how the proposal addresses most of your use case.

The proposal does not address the use-cases that involve the concept of a ‘service’ . But it does address two of your three focus areas - it does not address level of care, it does address location of care and transfer of care.

An example: When I was training as an internist in the hospital, we would have many days when the ICU beds were full. On those days, if the ER was still open and a patient who needed ICU level of care came in to the ER, it made things difficult for everyone. Since the ICU had no physical space, the patient would stay in the ER and the ICU nurse/physician would take care of the patient in the ER. This is an example of ICU service in the ER location. In this example, care_site_id would capture the physical location, i.e. ER, and service_concept_id would capture the ICU service. In the real world, there will be many such examples, such as dialysis service in a rehabilitation floor location, or surgery service in the ER location.
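A minimal sketch of the service versus location distinction in the example above. Plain strings stand in for care_site_id and service_concept_id, and the record type and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CareSegment:
    person_id: int
    care_site: str  # physical location (stands in for care_site_id)
    service: str    # level of care delivered (stands in for service_concept_id)


segments = [
    CareSegment(1, "ER", "ICU"),        # ICU service delivered in the ER
    CareSegment(2, "ICU", "ICU"),       # ICU service in the ICU
    CareSegment(3, "REHAB", "DIALYSIS"),
    CareSegment(4, "ER", "ER"),
]


def service_outside_home_site(segments, service):
    """Persons who received `service` at a care site other than the one named after it."""
    return [s.person_id for s in segments
            if s.service == service and s.care_site != service]
```

With only care_site_id, persons 1 and 4 would look identical; the separate service field is what tells them apart.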

The visit_occurrence table with the proposal addresses the use cases listed that involve location or transfers between locations: e.g.

These are not addressed:

To address these we have to vet out what is a service and what is a level-of-care or care-intensity. We will have to introduce service_concept_id and have a new domain, concept_id etc for service. This is important, happy to support you on that work.

This is also addressed in the proposal - because the change is physical location (a micro visit).

Yes - I am no data modeler; there are others who would know best. It’s a trade-off.

MPhilofsky · March 9, 2017, 5:26pm

@Gowtham_Rao,

You make some good points. Were you the debate club captain in high school?

Our unit of analysis is physical location and level of care. Generally, the questions have asked about patients receiving ICU level care which includes the medical ICU, neuro ICU, etc. Sometimes they ask about a specific care_site (who received aaa procedure in the xxxICU vs the yyyICU). For us, the service_concept_id could be a way to group the differing levels of care (ICU care, acute care, rehab) and the care_site would be a specific site within a location. Unfortunately, the current vocabularies available for use aren’t granular enough to represent the care_site at the level our investigators require. The service_concept_id value set that the PEDSnet folks created would help us distinguish the level of care.

I’m going to let this rest while our friends mull it over. Hopefully, this can be finalized at the F2F so we can add this data to our instance of OMOP.

Melanie

Christian_Reich · March 11, 2017, 3:44pm

@Gowtham_Rao:

Yes, sounds like a good idea. Let me understand: Essentially, instead of a fixed Visit-Encounter-Procedure hierarchy we use one table and Concepts to clarify what it is, and the hierarchical relationships between these Concepts are predefined and live outside, correct?

If so, I like it. It’s no different than in DRUG_EXPOSURE. There, we have only one Concept for the Drug, but that concept tells us whether it is a Product, or a Component, or an Ingredient. And you have the full hierarchical relationships to ask for “Give me all records with treatment with Ingredient XYZ”, and all you have to do is to join the CONCEPT_ANCESTOR table. We could do the same thing here: “Give me all the records with a hospitalization”, which would pull up hospital-based services.
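A minimal sketch of the CONCEPT_ANCESTOR join described above, using an in-memory SQLite database. The tables are cut down to the columns needed, and all concept ids and rows are made-up placeholders; the same pattern would apply to visit-level concepts if such a hierarchy were defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER,
    person_id INTEGER,
    drug_concept_id INTEGER
);
-- Placeholder hierarchy: ingredient 100 has products 101 and 102;
-- ingredient 200 has product 201. Each concept is also its own ancestor.
INSERT INTO concept_ancestor VALUES
    (100, 100), (100, 101), (100, 102), (200, 200), (200, 201);
INSERT INTO drug_exposure VALUES (1, 10, 101), (2, 11, 102), (3, 12, 201);
""")


def exposures_for_ingredient(conn, ingredient_concept_id):
    """Return drug_exposure ids whose concept descends from the given ingredient."""
    rows = conn.execute(
        """
        SELECT de.drug_exposure_id
        FROM drug_exposure de
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = de.drug_concept_id
        WHERE ca.ancestor_concept_id = ?
        ORDER BY de.drug_exposure_id
        """,
        (ingredient_concept_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The query never needs to know at which level each record was stored; the hierarchy lives entirely in CONCEPT_ANCESTOR.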

Except: The derived DRUG_ERAs are in a different table, and they are predefined on the Ingredient level.

We may think of replicating this approach. Here is the reason: There are all these @MPhilofsky and @bailey use cases trying to figure out what goes right and wrong inside the hospital. But 90% of the use cases for the visit table is “Give me all patients with a hospitalization for Condition XYZ”, where “hospitalization” really is a measure of severity. @MPhilofsky has that even inside her hospital world.

So, how about this: We create a new table for any of the different levels Episode, Visit, Encounter, Service (not Procedure), and folks dump in there whatever they find, together with a Concept that defines the level. This is the equivalent of DRUG_EXPOSURE. And VISIT_OCCURRENCE gets inferred and cobbled together, and becomes (stays really) the equivalent of the DRUG_ERA. In all the event tables we add another field: newtable_occurrence_id, which is the equivalent of visit_occurrence_id.

Thoughts?

hripcsa · March 11, 2017, 4:05pm

A new table is fine I guess, but do we really have to add a new column to every data table (drug, condition, etc.)? It doesn’t meet the need anyway if the encounters and services are not hierarchical. Ie, you both switch services while on a floor and switch floors while on the service. Therefore one column is not enough to link to both encounters and services. Isn’t this what the fact_relationship table is supposed to do?

Similar issues are surfacing as we figure out storing microbiology results, although that one is more hierarchical. In our clinical data repository, we have “component” tables that nest with child rows pointing to parent rows within the table. It has worked for almost 30 years, but you have to infer the hierarchy. With fact_relationship we can be more explicit about the parents and children and it need not be a strict hierarchy. Nevertheless, everyone is nervous about relying on the fact_relationship table to store micro results, and I wonder if it is the same for visit information.
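A minimal sketch of how fact_relationship rows could link one condition record to both an encounter and a service without a strict hierarchy. The column names follow the FACT_RELATIONSHIP table; the domain and relationship concept ids and all fact ids are made-up placeholders.

```python
from collections import namedtuple

FactRelationship = namedtuple(
    "FactRelationship",
    "domain_concept_id_1 fact_id_1 domain_concept_id_2 fact_id_2 relationship_concept_id",
)

# Placeholder concept ids (not real vocabulary entries).
CONDITION, VISIT = 1, 2
OCCURS_WITHIN = 99

rows = [
    FactRelationship(CONDITION, 501, VISIT, 11, OCCURS_WITHIN),  # condition 501 -> floor encounter 11
    FactRelationship(CONDITION, 501, VISIT, 22, OCCURS_WITHIN),  # condition 501 -> service 22
]


def related_facts(rows, domain_concept_id, fact_id):
    """Return (domain, fact_id) pairs linked from the given fact."""
    return [
        (r.domain_concept_id_2, r.fact_id_2)
        for r in rows
        if r.domain_concept_id_1 == domain_concept_id and r.fact_id_1 == fact_id
    ]
```

One fact can point at any number of others, which a single foreign-key column cannot do; the cost is that every lookup has to filter on the domain columns as well.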

George

Christian_Reich · March 11, 2017, 4:11pm

@Gowtham_Rao:

Your [proposal][1] makes me think more: Episodes. We have CPT4 concepts 0525F “Initial visit for episode (BkP)” and 0526F “Subsequent visit for episode (bkp)”, which could be used to build them! But we would have no place to have that inferred in a standardized way. Not even sure we can, really, given the data. But it would be cool.

Another thing: Services. We have all these CPT4 and HCPCS codes for services, which we really trash into the Observation table right now:

1013774	Home Services	CPT4	CPT4 Hierarchy
1013775	New Patient Home Services	CPT4	CPT4 Hierarchy
ET	Emergency services	HCPCS	HCPCS Modifier
G0108	Diabetes outpatient self-management training services, individual, per 30 minutes	HCPCS	HCPCS

We could make use of them as well. But how?
[1]: https://docs.google.com/document/d/1qM7cmdbLYgmo61TqpgvlOSw54-TwkDYLJoR3b5NSg_c/edit

Christian_Reich · March 11, 2017, 4:25pm

Yeah, I know, but it’s only one tiny little column.

Why is this a problem? One column connects the event to the VISIT_OCCURRENCE, the other one to NEWTABLE_OCCURRENCE. The data define the latter, and there is usually a one-to-one. In a way, we don’t have that in the DRUG_OCCURRENCE - DRUG_ERA world, where we don’t know which Occurrence belongs to which Era. Here, we would (indirectly).

The FACT_RELATIONSHIP table doesn’t really work, because you cannot join things without an extensive CASE WHEN THEN hierarchy. Not efficient, terrible to write and read. Hence my hatred.

What happened to that proposal: Should we bring it up on the F-T-F?

Gowtham_Rao · March 11, 2017, 7:48pm

Yes - and I think it generalizes our thinking and makes the table both future proof and backward compatible at the same time. By using concepts at the record level of visit_occurrence, and defining the hierarchy between concepts outside the visit_occurrence table – we can meet a lot of use-cases.

This “VISIT_OCCURRENCE gets inferred and cobbled together, and becomes (stays really) the equivalent of the DRUG_ERA” will be a source of confusion for newcomers. Let me explain. “_ERA” is obviously calculated/inferred, i.e. any inferred table is an “_ERA” table. The _OCCURRENCE tables (condition_occurrence, procedure_occurrence, drug_exposure, visit_occurrence), on the other hand, may or may not be inferred and may have 1:1 lineage to source data. If we use visit_occurrence as the inferred table, it will make our table naming convention confusing.

So we use two tables:

visit_occurrence: intended to stay as close to the source data as possible, including the enhancement proposals I have made, with no inferred calculation. This will meet all the use cases I mentioned above (no change to the proposal I put forward).
visit_era (new table, if we really have to): the derived/inferred table, calculated from the visit_occurrence table like drug_era and condition_era. This will meet @MPhilofsky and @bailey use cases?

This will avoid the confusion in the table naming conventions - and everyone is happy.
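A minimal sketch of how a visit_era could be derived from visit_occurrence rows, by merging overlapping or touching date ranges per person in the same spirit as era building for drugs and conditions. The row layout, the `gap_days` parameter, and the function name are assumptions, not an agreed convention.

```python
from datetime import date, timedelta


def build_visit_eras(visits, gap_days=0):
    """Merge (person_id, start_date, end_date) rows into eras.

    A row joins the previous era of the same person when it starts no later
    than that era's end date plus `gap_days`.
    """
    eras = []
    for person_id, start, end in sorted(visits):
        if (eras and eras[-1][0] == person_id
                and start <= eras[-1][2] + timedelta(days=gap_days)):
            _, era_start, era_end = eras[-1]
            eras[-1] = (person_id, era_start, max(era_end, end))
        else:
            eras.append((person_id, start, end))
    return eras


visit_occurrence = [
    (1, date(2017, 1, 1), date(2017, 1, 1)),    # ER
    (1, date(2017, 1, 1), date(2017, 1, 3)),    # medical floor
    (1, date(2017, 1, 3), date(2017, 1, 6)),    # ICU
    (1, date(2017, 1, 20), date(2017, 1, 21)),  # separate later stay
    (2, date(2017, 1, 2), date(2017, 1, 2)),
]

visit_era = build_visit_eras(visit_occurrence)
```

The source-level rows stay untouched in visit_occurrence, and a different `gap_days` or merge rule simply produces a different era build.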

hripcsa · March 11, 2017, 7:30pm

Ie, you both switch services while on a floor and switch floors while on the service
Why is this a problem? One column connects the event to the VISIT_OCCURRENCE, the other one to NEWTABLE_OCCURRENCE. The data define the latter, and there is usually a one-to-one. In a way, we don’t have that in the DRUG_OCCURRENCE - DRUG_ERA world, where we don’t know which Occurrence belongs to which Era. Here, we would (indirectly).
The FACT_RELATIONSHIP table doesn’t really work, because you cannot join things without an extensive CASE WHEN THEN hierarchy. Not efficient, terrible to write and read. Hence my hatred.

I am saying that this new table has multiple things in it and a given row in say the condition table may need to link to two rows in the new table. So if you want to link to both the service and the encounter, you only have one column. You cannot infer the service from the encounter or the encounter from the service, so you need both links. If you are willing to infer them from the timing, then you don’t need the new column at all.
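A minimal sketch of the last option, inferring both links from timing instead of storing a column. It assumes a single person, hypothetical rows of the new table with a level label and inclusive dates, and at most one matching row per level for a given date.

```python
from datetime import date

# Hypothetical rows of the new table: (occurrence_id, level, start_date, end_date)
segments = [
    (1, "encounter", date(2017, 3, 1), date(2017, 3, 4)),  # floor A
    (2, "encounter", date(2017, 3, 5), date(2017, 3, 9)),  # floor B
    (3, "service", date(2017, 3, 1), date(2017, 3, 6)),    # medicine service
    (4, "service", date(2017, 3, 7), date(2017, 3, 9)),    # surgery service
]


def links_by_timing(event_date, segments):
    """Map each level to the id of the segment whose date range contains the event."""
    return {
        level: occurrence_id
        for occurrence_id, level, start, end in segments
        if start <= event_date <= end
    }
```

A condition dated March 5 resolves to floor B and the medicine service, while one dated March 8 resolves to floor B and the surgery service, which shows why neither link can be inferred from the other.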

Fact_relationship is a concern, yes. So I don’t have an answer.

storing microbiology results
What happened to that proposal: Should we bring it up on the F-T-F?

Sounds like a good idea.

George

Gowtham_Rao · March 11, 2017, 8:02pm

I don’t think that makes sense - or maybe I am not understanding. Are you proposing creating a new _occurrence table with its own _occurrence_id AND having that in condition_occurrence, drug_exposure, observation etc.?

I think I agree here. There is not a 1:1 relationship; there could be an m:n relationship.

Yes. The difference between _OCCURRENCE and _ERA may be used here. _ERA may be used for an arbitrary number of builds, including local definitions for builds.

Can we work out a few use-cases?

I have not had an opportunity to use the FACT_RELATIONSHIP table. Its use makes sense, but your hatred towards it makes me think this is not the right approach. I don’t know.

agree with @hripcsa