UCUM terminology subset in OMOP: is this a valid construct?

Hi, I apologize beforehand if this email seems to “challenge” the existing construct of the OMOP infrastructure. That is not my intention at all. I merely want to share some UK and local experience of standardisation so that we can engage the international community in establishing the best approach for both OMOP and participating organisations.

UCUM, strictly speaking, is a very “different standard” in that it doesn’t really need a subset. The point is that, instead of asking whether a concept falls within an artificial subset, we should focus on its “validity”. By that we mean validity with respect to the atomic unit symbols (unit atoms), the multiplier prefixes, and the expression syntax by which these symbols can be combined to yield valid units.
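To make the idea of “validity” concrete, here is a minimal sketch of a validity check built from atoms, prefixes and a combining syntax. It is an illustration under stated assumptions, not the UCUM grammar: the `ATOMS` and `PREFIXES` tables are a tiny hand-picked subset, the function names are hypothetical, and features such as annotations in curly braces, parentheses, or a leading `/` are not handled.

```python
import re

# Assumed, illustrative subset of UCUM unit atoms.
# The value says whether the atom is metric, i.e. may take a prefix.
ATOMS = {
    "g": True, "m": True, "L": True, "mol": True, "s": True,
    "h": False, "min": False,
}

# Assumed, illustrative subset of UCUM multiplier prefixes.
PREFIXES = {"k", "d", "c", "m", "u", "n"}


def is_valid_term(term):
    """Check one term: [prefix]atom, optionally followed by an integer exponent."""
    match = re.fullmatch(r"([A-Za-z]+)(-?\d+)?", term)
    if match is None:
        return False
    symbol = match.group(1)
    if symbol in ATOMS:
        return True
    # Otherwise try to read the symbol as prefix + metric atom.
    for prefix in PREFIXES:
        rest = symbol[len(prefix):]
        if symbol.startswith(prefix) and ATOMS.get(rest, False):
            return True
    return False


def is_valid_unit(expr):
    """Check an expression made of terms joined by '.' (multiply) or '/' (divide)."""
    if not expr:
        return False
    return all(is_valid_term(term) for term in re.split(r"[./]", expr))
```

With these tables, `is_valid_unit("mg/dL")` and `is_valid_unit("mmol/L")` are accepted even though neither string is listed anywhere, while `is_valid_unit("kh")` is rejected because the hour is treated as non-metric. This is the sense in which the set of valid units is open-ended.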

Having standardised units of measure (UoM) locally, we feel that what is really needed is not to keep adding units, or asking organisations to keep submitting the “missing” ones (because, theoretically speaking, the list is endless), but to verify the validity of each submitted unit. There are already tools out there to support this, including “conversion” tools, e.g. h to min. I think OMOP should handle the “standardisation” and conversion internally but allow organisations to submit any “valid” UCUM unit. Basically, reconsider the construct of having a “UCUM subset”.
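As a rough illustration of the kind of conversion mentioned above (h to min), the sketch below converts between a few time units through a common base unit. The `SECONDS_PER_UNIT` table and the `convert_time` function are hypothetical names for this example only; an existing UCUM tool would cover far more units and the full expression syntax.

```python
# Assumed, illustrative table: each supported time unit expressed in seconds.
SECONDS_PER_UNIT = {"s": 1.0, "min": 60.0, "h": 3600.0, "d": 86400.0}


def convert_time(value, from_unit, to_unit):
    """Convert a value between two time units via the base unit (seconds)."""
    for unit in (from_unit, to_unit):
        if unit not in SECONDS_PER_UNIT:
            raise ValueError(f"unsupported unit: {unit!r}")
    return value * SECONDS_PER_UNIT[from_unit] / SECONDS_PER_UNIT[to_unit]


print(convert_time(2, "h", "min"))  # 120.0
```

Going through a base unit means each new unit needs only one factor, rather than one factor per pair of units.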

Having said this, in the interim, our first piece of work has identified some missing units. Is this the place where you would like me to share them?

Hello, @LeileifromUK

The OMOP community likes challenges and has been through a lot over the years :slight_smile:

UCUM is a valid standard for units and no one will be bothered if conversion from hours to minutes happens somewhere in your data. It is a valid strategy, but not considered best practice. Saving original data may be crucial in some cases.

I don’t see how the current system is wrong.

If you have units in your source data but don’t see them in OMOPed UCUM (you found missing ones), ask here. Otherwise, we don’t really need thousands of theoretical units of all kinds in OMOP.

Best regards, Oleg

I was not aware we are only loading a subset of UCUM codes into the OMOP vocabularies. How did we decide which units were part of the subset? Can’t we load all and only mark the subset as standard concepts?

Re: I don’t see how the current system is wrong.
@zhuk UCUM is not wrong. Having a subset of UCUM in OMOP needs to be challenged, because UCUM is “limitless”: the set of valid UCUM expressions is limitless. Why should a service have to do one more mapping step and spend time submitting the missing units to you? I am not going to repeat what I said in the original post. I have tried to share experience in this forum on many other domains and still didn’t seem to get anywhere. This is no exception. But we uphold our view.

@MaximMoinat Unfortunately, we only load a subset in OMOP because of the comments above. That is precisely why we opened up a debate.

This is a valid idea, and it has been discussed. @Vojtech_Huser has a proposal along that line. We just have to execute and implement it.

Exactly. That’s missing. The other missing piece is to collect the right unit(s) for all applicable Measurement concepts. Before we can do that, we need to clean up the Measurement concepts.

If this all sounds like a lot of work - it is. :frowning:

Yes please.

I agree with you: UCUM is not wrong and UCUM is limitless.

So why bother chasing conversion, which also seems almost limitless, rather than adding the missing units you need right now? There are not really many of them.

I understand your idea, I think, but it seems time-consuming and requires a lot of preparatory work, as Christian stated. It also includes finding the right UCUM unit if it is already in OMOP, right? So it is a mapping step anyway. If you already have this system implemented in your workflow, missing UCUM units may be added as 2bil+ concepts (local concepts with a concept_id above 2 billion). It’s a fast, easy and reliable way to run a study on your data. The only problem with this solution is network studies, which require everyone to work on the same version of the vocabularies within a network. And those missing units will be added to OMOP as soon as possible.
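A minimal sketch of the 2bil+ approach, assuming the usual convention that concept_id values above 2,000,000,000 are used for local concepts. The function name, the starting id and the sample units are assumptions for illustration; a real implementation would also write the corresponding rows to the vocabulary tables.

```python
# Assumed starting point for local concept ids (above 2 billion).
LOCAL_CONCEPT_ID_START = 2_000_000_001


def assign_local_unit_concepts(source_units, known_units,
                               start_id=LOCAL_CONCEPT_ID_START):
    """Give each source unit that is missing from the vocabulary a local concept_id."""
    missing = sorted(set(source_units) - set(known_units))
    return {unit: start_id + offset for offset, unit in enumerate(missing)}


# Hypothetical example: one unit is already in the vocabulary, two are not.
local = assign_local_unit_concepts(["mg/dL", "nmol/min", "ug/h"], {"mg/dL"})
print(local)  # {'nmol/min': 2000000001, 'ug/h': 2000000002}
```

Sorting the missing units keeps the assignment reproducible between runs on the same data, but the ids remain local: two sites will not agree on them, which is the network-study problem mentioned above.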
