
Data not associated with a person

Hey! I’m currently trying to get familiar with the CDM, using some synthetic data to play around with, since the plan is to implement it in a larger-scale project down the line (which may have very different data sources).

I understand that the model is designed with patient-visit interactions as a central piece, but I’m not sure what the cleanest way would be to store data that is not actually related to any patient. I’m thinking of things like “temperature in OR number 7 at a given time” or “people in the 3rd floor waiting room at a given time”, because I will have to deal with this data on top of the “patient-centric” data in the future, but I would like to stay as close to the OMOP CDM standards as possible.

I’m thinking of something like using measurements or observations in a creative way, with null or placeholder values for person_id so that I keep the structural integrity of the tables, but I feel it’s not an elegant approach.

So, for a use-case like this, what is the community recommendation? Just create new tables? I haven’t been able to find anything regarding this in the forums, so sorry in advance if it is there.
Thanks!
Santi.

Yes, just create new tables. I’d suggest creating them in a separate schema from your CDM schema, so as to make it clear these are extra tables.
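A minimal sketch of that layout, assuming Python's standard-library `sqlite3` as a stand-in for Postgres (SQLite has no `CREATE SCHEMA`, so `ATTACH DATABASE` plays the role of the separate schema). The schema name `facility`, the table `room_measurement`, and all of its columns are hypothetical and not part of the OMOP CDM.

```python
import sqlite3


def build_demo_db():
    """Create a simplified CDM-like table plus a separate namespace for non-person data."""
    conn = sqlite3.connect(":memory:")
    # Attaching a second database gives a separate namespace,
    # roughly comparable to a separate schema in Postgres.
    conn.execute("ATTACH DATABASE ':memory:' AS facility")
    # Heavily simplified stand-in for a CDM table (only two columns).
    conn.execute(
        "CREATE TABLE main.person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
    )
    # Hypothetical extension table: no person_id, keyed by room and time.
    conn.execute(
        """
        CREATE TABLE facility.room_measurement (
            room_measurement_id INTEGER PRIMARY KEY,
            room_name TEXT NOT NULL,
            measured_at TEXT NOT NULL,      -- ISO 8601 timestamp
            measurement_name TEXT NOT NULL,
            value_as_number REAL,
            unit TEXT
        )
        """
    )
    return conn


conn = build_demo_db()
conn.execute(
    "INSERT INTO facility.room_measurement "
    "(room_name, measured_at, measurement_name, value_as_number, unit) "
    "VALUES (?, ?, ?, ?, ?)",
    ("OR 7", "2024-01-01T08:00:00", "temperature", 19.5, "Cel"),
)
rows = conn.execute(
    "SELECT room_name, value_as_number FROM facility.room_measurement"
).fetchall()
print(rows)
```

Keeping the extra table out of the CDM namespace means CDM tooling that inspects the CDM schema should not see it, while queries can still join across both when needed.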


@Srmejiac, what DB, and which version of said DB, are you using? From your examples, this sounds like document data to me.

Welcome to OHDSI, @Srmejiac!

You can do what Chris suggested since it won’t break the CDM. But why waste the time building out new tables to house these data?

What question would you ask of these data?

At the University of Colorado, we have extended our data model quite a bit to meet specific use cases for our on-campus researchers. However, we do leave a lot of data in the original model for the one-off or very rare use cases. Maintaining the OMOP CDM and any extensions or expansions is a lot of work. So, leaving the rarely used data at the source is actually more efficient. I’m unsure OMOP is the best place to house the type of data you list above. You might find a better model for non-person centric data.

If ‘people in 3rd floor waiting room’ is determined by tracking each person’s check-in time (e.g., via a kiosk, along with the kiosk location), you can in fact have a ‘check-in event’ at the person level. The location of the check-in can also be captured.
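As a rough illustration of deriving a count from person-level check-in events, here is a small Python sketch. The record layout is hypothetical, and the `left_waiting_at` field is an assumption (some signal that the patient left the waiting room, e.g. was called in). Note that it only counts patients who checked in, not everyone physically present.

```python
from datetime import datetime

# Hypothetical check-in records; field names are illustrative only.
check_ins = [
    {"person_id": 1, "location": "3rd floor waiting room",
     "checked_in_at": datetime(2024, 1, 1, 9, 0),
     "left_waiting_at": datetime(2024, 1, 1, 9, 40)},
    {"person_id": 2, "location": "3rd floor waiting room",
     "checked_in_at": datetime(2024, 1, 1, 9, 15),
     "left_waiting_at": None},  # still waiting
    {"person_id": 3, "location": "1st floor waiting room",
     "checked_in_at": datetime(2024, 1, 1, 9, 10),
     "left_waiting_at": None},
]


def checked_in_count(records, location, when):
    """Count patients checked in at `location` who have not yet left at time `when`."""
    count = 0
    for rec in records:
        if rec["location"] != location:
            continue
        arrived = rec["checked_in_at"] <= when
        left = rec["left_waiting_at"]
        still_waiting = left is None or when < left
        if arrived and still_waiting:
            count += 1
    return count


count_0930 = checked_in_count(
    check_ins, "3rd floor waiting room", datetime(2024, 1, 1, 9, 30)
)
count_0945 = checked_in_count(
    check_ins, "3rd floor waiting room", datetime(2024, 1, 1, 9, 45)
)
print(count_0930, count_0945)
```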

I have always wanted to get data on PHR (personal health record) login events (patients logging into the MyChart PHR) into the CDM.

The OR temperature is indeed a non-person item.

The above would not capture people in the waiting room. It would, if patients checked in correctly, identify active patients in the waiting room, which is a very different question. I agree with the OP: his example is non-person data.

Thank you all for your insights! I realize I may have left out some relevant information. The project is still in a really early, “brainstorming” kind of phase, and it’s actually necromancing an old idea from my boss, which included an ETL - Data Lake - Dashboard structure with a focus on leveraging the data specifically for predictive modelling of in-hospital logistics (think waiting times, occupancy, time in the hospital).

We’re currently working out how we should structure the backend, looking for synthetic data and deciding on a model (of course, the ETL will depend on the source EHR, so for now I’m just playing around in Postgres using one of the prebaked ETLs for Synthea-generated data). But one of the selling points management wants is a be-all and end-all model to store the data (because of course they do).

Since some of us have been eyeing the work OHDSI is doing for a while, really like the model, and are advocating for data standardization, I’ve been tasked with exploring how the old functionality I talked about could fit into it (which would include EHR-extracted but non-person data).

So regarding your questions:
@Mark I understand you’re referring to leveraging a non-relational database? I will look into it, but we’ve always worked with PostgreSQL.

@Vojtech_Huser That’s actually a really interesting point!

@MPhilofsky I understand. What I’m kind of getting from this is that the most elegant solution to store non-patient-centric data may be to not try to Frankenstein my way into a patient-centric model… which of course makes a lot of sense. Would you have any personal recommendation for non-person-centric standardized models? I’m sure it shows, but I’m new to this field, having just jumped from clinical practice.

If you are on Postgres, then you have much better JSON support than SQL Server, plus full array support AND object support, neither of which SQL Server has. Postgres is all you need to support document data, the specific type of NoSQL that I suspect you use. I have done this before on other projects. There are a few tricks one can use with Postgres’ GiST and GIN indexes to make it almost as fast as table seeks… I did say almost, please remember.
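To make the document-data idea a bit more concrete, here is a hedged Python sketch that only builds Postgres statements as strings (nothing is executed). The schema, table, column, and index names are hypothetical, and the `%s` placeholder assumes a driver such as psycopg. It shows a JSONB column, a GIN index on it, and a containment (`@>`) query of the kind such an index can support.

```python
import json

# Hypothetical names throughout; the SQL is Postgres syntax and is only
# assembled here as strings, not run against a database.
DDL = [
    "CREATE SCHEMA IF NOT EXISTS facility",
    "CREATE TABLE IF NOT EXISTS facility.room_event ("
    " room_event_id BIGSERIAL PRIMARY KEY,"
    " recorded_at TIMESTAMPTZ NOT NULL,"
    " payload JSONB NOT NULL)",
    # A GIN index on a JSONB column can serve containment queries (@>).
    "CREATE INDEX IF NOT EXISTS room_event_payload_gin"
    " ON facility.room_event USING GIN (payload)",
]


def containment_query(filter_doc):
    """Return (sql, params) selecting events whose payload contains filter_doc."""
    sql = (
        "SELECT recorded_at, payload FROM facility.room_event "
        "WHERE payload @> %s::jsonb"
    )
    # Serialize the filter as JSON text; the cast to jsonb happens in SQL.
    return sql, (json.dumps(filter_doc, sort_keys=True),)


sql, params = containment_query({"room": "OR 7", "kind": "temperature"})
print(sql)
print(params)
```

Whether this gets close to plain table lookups depends on data volume and query shape, so it is worth checking with `EXPLAIN` on real data.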


No, I don’t. But from your use case of

A data model focused on health system delivery would fit your use case. Are you familiar with HDAA? I would ask there. They are more focused on healthcare system analytics.
