Defining encounter-level visit grain from claims with inconsistent place-of-service and nested detail lines

esteban.correa · April 14, 2026, 7:36pm

Hi all,

I am reaching out to seek clarity on a topic that I find somewhat confusing, and I believe your expertise could greatly help me understand it better.

I’m working on a healthcare utilization segment for an OMOP CDM 5.4 database and trying to define the visit_occurrence grain when source claims bundle multiple real-world encounters under a single claim_id. We’ve identified three distinct data patterns in our medical claim detail lines (which are usually nested under a claim) and want to validate our proposed approach with the community.

Context

Our source data has one row per claim detail line, with columns for claim_id, medical_claim_detail_id, service_from_date, service_to_date, place_of_service_code, medical_code_type (ICD-10, CPT-4, HCPCS, REV), and medical_code. A single claim can contain a mix of ICD-10 diagnosis headers and CPT-4/HCPCS/REV service lines.

The naïve approach of one visit_occurrence per claim_id collapses distinct encounters — for example, 47 separate psychotherapy sessions billed under one claim would become a single visit spanning six months, but it should be 47 visit occurrences…

Proposed encounter grain

We are proposing a new visit grain: one visit_occurrence per unique combination of person_id + claim_id + place_of_service_code + service_from_date, derived only from service lines (CPT-4, HCPCS, REV). ICD-10 lines never drive visits; they fan out as condition_occurrence rows (or rows in whichever table the vocabulary domain_id dictates) linked to every visit on their parent claim.
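To make the proposed grain concrete, here is a minimal Python sketch of the grouping and the ICD fan-out. It assumes each detail line is a dict keyed by the column names listed above, plus a person_id that has already been joined on; `build_visits` and `SERVICE_CODE_TYPES` are hypothetical names used only for illustration, not part of any OHDSI tooling.

```python
from collections import defaultdict

# Code types treated as service lines (assumption: matches the source's
# medical_code_type values exactly).
SERVICE_CODE_TYPES = {"CPT-4", "HCPCS", "REV"}


def build_visits(lines):
    """Group service lines into visits and fan ICD-10 lines out to them.

    Returns (visits, condition_links):
      visits          -- dict: visit key -> list of service codes
      condition_links -- list of (visit key, ICD-10 code) pairs
    """
    visits = defaultdict(list)
    # (person_id, claim_id) -> insertion-ordered set of visit keys
    visit_keys_by_claim = defaultdict(dict)
    diagnoses = []

    for line in lines:
        claim_key = (line["person_id"], line["claim_id"])
        if line["medical_code_type"] in SERVICE_CODE_TYPES:
            # Four-part grain: person + claim + POS + service_from_date.
            key = claim_key + (
                line["place_of_service_code"],
                line["service_from_date"],
            )
            visits[key].append(line["medical_code"])
            visit_keys_by_claim[claim_key][key] = None
        elif line["medical_code_type"] == "ICD-10":
            # Diagnosis lines never create a visit; remember them for fan-out.
            diagnoses.append((claim_key, line["medical_code"]))

    # Fan each diagnosis out to every visit on its parent claim.
    condition_links = [
        (key, code)
        for claim_key, code in diagnoses
        for key in visit_keys_by_claim.get(claim_key, {})
    ]
    return dict(visits), condition_links
```

A claim whose service lines all have a null POS still groups correctly here, because the key simply carries None in the POS position.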

But we’ve observed three patterns:

Pattern A: POS present on service lines (most common). ICD-10 headers have no POS and spanning dates, but CPT/HCPCS lines carry discrete dates and POS codes. The four-part grain works cleanly. Example: 47 CPT lines across 47 dates at two different POS codes produce 47 visits, while the single ICD header fans out to all 47.

Pattern B: POS present on service lines, single date. An ER encounter where all 32 service lines share the same date and POS. The grain collapses to 1 visit. Three ICD headers fan out to that single visit.

Pattern C: POS missing on all lines. Every line on the claim, including CPT/HCPCS service lines, has a null POS. The grain drops to person_id + claim_id + service_from_date, and visit_concept_id must be inferred from the E&M or service codes (e.g., CPT 99285 signals ER visit → concept 9203, HCPCS G0378 signals observation).
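For Pattern C, the inference could look roughly like the sketch below. It follows the hierarchy we are considering (ER E&M 99281–99285 → 9203, inpatient indicators → 9201, default → 9202). The function name and the `has_inpatient_indicator` flag are hypothetical: what counts as an inpatient indicator is still undecided, so it is passed in rather than derived. Observation codes such as G0378 are deliberately left out because we have not settled on a target concept for them.

```python
# CPT E&M codes 99281-99285 (emergency department visits).
ER_EM_CODES = {str(code) for code in range(99281, 99286)}

VISIT_INPATIENT = 9201
VISIT_OUTPATIENT = 9202
VISIT_ER = 9203


def infer_visit_concept_id(service_codes, has_inpatient_indicator=False):
    """Infer visit_concept_id for a visit whose lines carry no POS.

    service_codes           -- CPT/HCPCS codes billed on the visit
    has_inpatient_indicator -- hypothetical flag computed upstream
    """
    codes = set(service_codes)
    if codes & ER_EM_CODES:
        return VISIT_ER          # any ER E&M code wins
    if has_inpatient_indicator:
        return VISIT_INPATIENT
    return VISIT_OUTPATIENT      # default when nothing else applies
```

With this ordering, an ER E&M code takes precedence even when an inpatient indicator is also present; whether that precedence is right is part of what we are asking.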

My questions for the community

1. Is the code-type-based classification (ICD-10 = always header, everything else = service line that drives visit grain) a sound general rule, or are there known edge cases where ICD lines should drive visits?
2. For the ICD fan-out, we attach each diagnosis to every visit on the same claim. Is this consistent with how other CDM builders handle claim-level diagnoses that lack encounter-level attribution? We recognize this may overcount condition prevalence at the visit level but see no reliable way to attribute a diagnosis to a specific encounter within the claim.
3. For Pattern C (no POS anywhere), is inferring visit_concept_id from E&M codes an accepted convention? We’re considering a hierarchy: ER E&M (99281–99285) → 9203, inpatient indicators → 9201, default → 9202. Has anyone implemented something similar?
The core question: what constitutes an encounter?

Any experience or guidance would be appreciated.

Thanks in advance.

Christian_Reich · April 17, 2026, 10:10am

@esteban.correa:

This is a common question and keeps coming back. We may want to create some convention around this.

In principle, the OMOP CDM describes the healthcare experience of the PATIENT. Claims are how providers (institutional or flesh-and-blood) bill their services to the payers, and are therefore not the granularity we need. So you need to ask yourself: what would the patient say happened to her or him? The answer is actually fairly simple:

For outpatient visits, every encounter is most likely what you want to write into VISIT_OCCURRENCE: from the patient's point of view, they “went to the doctor’s office”.

For inpatient or long-term care encounters, you need to aggregate all your encounters in such a way that they reflect the entire stay in the hospital (or whichever inpatient institution it is). If there are different institutional providers involved it may reflect the fact that the patient moved between different wards or departments - these would go into VISIT_DETAIL.
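As a rough illustration of that aggregation, the sketch below merges the date spans of one person's inpatient claims into continuous stays. The `merge_stays` name and the `max_gap_days` tolerance are assumptions made for the example; how large a gap still counts as the same stay is a local decision, not something the CDM prescribes.

```python
from datetime import date, timedelta


def merge_stays(spans, max_gap_days=1):
    """Merge (start_date, end_date) spans for one person into stays.

    Spans that overlap, or that start within max_gap_days of the
    previous span's end, are treated as one continuous stay.
    """
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + timedelta(days=max_gap_days):
            prev_start, prev_end = merged[-1]
            # Extend the current stay; keep the later of the two end dates.
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

Each merged span would become one VISIT_OCCURRENCE row, and the original claim-level spans are candidates for VISIT_DETAIL if you need them.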

I am not sure how ICD-10-CM codes would help. They are just a justification for the service, and there could be several different codes for, say, a single hospital visit. ICD-10-PCS codes describe higher-level interventions, so they could work for combining claims into a hospital visit. CPT codes describe more confined services, so they will work for outpatient visits, but maybe not for inpatient ones.

@MPhilofsky may have better experience with all this.

MPhilofsky · April 20, 2026, 3:44pm

Hello @esteban.correa,

Per the CDM v5.4 specifications, an encounter is defined as such:

“This table contains Events where Persons engage with the healthcare system for a duration of time. They are often also called “Encounters”. Visits are defined by a configuration of circumstances under which they occur, such as (i) whether the patient comes to a healthcare institution, the other way around, or the interaction is remote, (ii) whether and what kind of trained medical staff is delivering the service during the Visit, and (iii) whether the Visit is transient or for a longer period involving a stay in bed…in US claims outpatient visits that appear to occur within the time period of an inpatient visit can be rolled into one with the same Visit_Occurrence_Id.”

As @Christian_Reich stated, ignore the diagnosis codes since those are used as justification for a procedure or reason for visit.

Generally,

is an acceptable approach. One thing to be mindful of is that a person can be an inpatient in a hospital for days, weeks, or months and still have what looks like an outpatient procedure in the data. That is not what actually happened: the person did not get out of bed and go visit a clinic to get the procedure. But this is the way it is billed in the US and how the data look in a US claims database.

The way I would approach Visits derived from claims data:

1. Define the time period a person is in the hospital as an inpatient.
2. Define ER visits. If they overlap with the inpatient visit in #1, use concept_id = 262 “Emergency room and inpatient”.
3. Define outpatient visits. If they overlap with the inpatient visit in #1 or the ER visit in #2, then they happened during the inpatient or ER visit and you should NOT create a separate OP visit.

If you have a use case, you can create a Visit Detail record and link it to the Visit Occurrence record for the parent IP visit for the claim. If you don’t have a use case, you don’t need to create the Visit Detail record. The Visit Detail table is optional and not utilized in many/most studies which utilize claims data.
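The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: each candidate visit is a dict with `start` and `end` dates for a single person, the function and constant names are hypothetical, and widening the inpatient span to cover an overlapping ER visit is my assumption rather than a CDM rule.

```python
from datetime import date

VISIT_INPATIENT = 9201
VISIT_OUTPATIENT = 9202
VISIT_ER = 9203
VISIT_ER_AND_INPATIENT = 262


def overlaps(a, b):
    """True when two visits share at least one day."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]


def consolidate(inpatient, er, outpatient):
    """Apply the inpatient > ER > outpatient precedence for one person."""
    # Step 1: inpatient stays are taken as given.
    ip_visits = [dict(v, visit_concept_id=VISIT_INPATIENT) for v in inpatient]

    # Step 2: an ER visit overlapping a stay turns that stay into concept 262.
    standalone_er = []
    for e in er:
        hit = next((ip for ip in ip_visits if overlaps(e, ip)), None)
        if hit is not None:
            hit["visit_concept_id"] = VISIT_ER_AND_INPATIENT
            # Assumption: the combined visit spans both records.
            hit["start"] = min(hit["start"], e["start"])
            hit["end"] = max(hit["end"], e["end"])
        else:
            standalone_er.append(dict(e, visit_concept_id=VISIT_ER))

    # Step 3: outpatient visits overlapping any parent get no visit of their own.
    parents = ip_visits + standalone_er
    op_visits = [
        dict(o, visit_concept_id=VISIT_OUTPATIENT)
        for o in outpatient
        if not any(overlaps(o, p) for p in parents)
    ]
    return parents + op_visits
```

The outpatient records dropped in step 3 are the ones you could instead write to Visit Detail if you have a use case for them.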

The Health Systems WG has discussed this topic a couple of times. One recording is here and there’s another one here.

esteban.correa · May 28, 2026, 4:36pm

Hi all,

I just wanted to keep you posted on this.
Taking the advice from both of you into account, we followed a multistep approach that includes clinical ladder precedence, visit consolidation to handle overlaps, and fanning out multiple conditions.

And yes, claims format sucks

Thank you!!

Gowtham_Rao · June 5, 2026, 11:49am

Visit detail and a visit-centric ETL solve most of this. Please see

Gowtham_Rao · June 5, 2026, 11:53am

I think we have made the argument in the HEVA WG that the ETL should start with visit detail as the anchor.