Units of Measure to Add

Hello,
While mapping units of measure to their standards, I have found a short list of units that appear to be in UCUM but do not have a standard mapping in the OMOP Vocabulary. Is it possible to have these added? Or, if there is a way to map these that I am not understanding, please advise. Thank you!

{copies} – Copies

U/dL – Enzyme unit per deciliter

U/mL – Enzyme unit per milliliter

mL/(kg.min) – Milliliter per kilogram per minute

nmol/h/mg{protein} – Nanomole per hour per milligram of protein

pmol/h/mg{protein} – Picomole per hour per milligram of protein

Seems that some of your units are substance-agnostic, so I’d suggest the following:

{copies} – Copies → can be left without a unit, especially if the LOINC test already gives you the unit in its name
U/dL – Enzyme unit per deciliter → 44777568 unit per deciliter
U/mL – Enzyme unit per milliliter → 8763 unit per milliliter

For nmol/h/mg{protein}, pmol/h/mg{protein} and mL/(kg.min), examples of the substances may be helpful. For example, the protein may be hemoglobin, and then pmol/h/mg{protein} can be mapped to picomole per hour and milligram of hemoglobin, converted to nanomole per hour and milligram, or something else.

Thanks for the guidance @aostropolets!

I know there were some limitations in analytical methods (the value_as_number thing) that require the unit to always be specified.
Don’t we need a dummy unit for such cases? E.g., piece/item/each?

But agree, copies without the denominator (volume of sample) don’t make a lot of sense.

@aostropolets @Alexdavv

I want to circle back to this and provide more context. I am working on mapping values that are on a form, which is why this is substance-agnostic, as Anna mentioned, and also why I would like to do a one-to-one map rather than calculating by substance. For example, on the form there is copies/L and copies/mL, but there is also a box that can be checked just for copies in case there is no denominator information. The form has an open text line to fill in other information, and we are trying to take what we can and automate that process, which is probably why this is coming across as a request that does not make sense. I need to map just the unit information: even if that information standing alone would not normally serve a purpose, it does in this scenario.

Hi!
Not sure if I got you right, but I have a proposal :slight_smile:

Form has ‘copies/ml’ checked → use 8799 {copies}/mL copies per milliliter
Form has ‘copies/L’ checked → use 44777570 {copies}/L copies per liter
Form has no denominator information → leave it without units or if @Alexdavv is right about limitations in analytical methods, use some dummy unit (I like this idea very much).

You can also use conversion during your ETL process to make copies per liter become copies per milliliter
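
A minimal sketch of what that lookup and conversion could look like in an ETL step. The concept IDs are only the ones quoted in this thread, so check them against your own vocabulary release. The names `UNIT_CONCEPT_IDS`, `map_unit` and `copies_per_liter_to_per_milliliter` are made up for illustration and are not part of any OHDSI tool.

```python
from typing import Optional, Tuple

# Concept IDs as quoted in this thread; verify against your vocabulary release.
UNIT_CONCEPT_IDS = {
    "U/dL": 44777568,
    "U/mL": 8763,
    "{copies}/mL": 8799,
    "{copies}/L": 44777570,
}

ML_PER_L = 1000.0


def map_unit(source_unit: Optional[str]) -> int:
    """Return the unit_concept_id for a source unit string, or 0 if unmapped."""
    if source_unit is None:
        return 0
    # UCUM codes are case-sensitive, so only surrounding whitespace is trimmed.
    return UNIT_CONCEPT_IDS.get(source_unit.strip(), 0)


def copies_per_liter_to_per_milliliter(value: float) -> Tuple[float, int]:
    """Convert a copies/L value to copies/mL; return the value and target concept ID."""
    return value / ML_PER_L, UNIT_CONCEPT_IDS["{copies}/mL"]
```

With this, a bare `{copies}` falls through to 0, which is exactly the case being discussed here.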

More examples may be very helpful to understand your problem better.

@aostropolets
@Alexdavv
@zhuk

Thank you for all your responses. But we still need the following units to be added to the OMOP vocabulary:

{copies} – Copies
mL/(kg.min) – Milliliter per kilogram per minute
nmol/h/mg{protein} – Nanomole per hour per milligram of protein
pmol/h/mg{protein} – Picomole per hour per milligram of protein

They are all part of UCUM units, so I would think they should have their applications.

All of these are substance-agnostic. The amounts (nanomole, picomole) are pre-calculated in the source data.

I agree that copies should have a denominator in most cases, but in some rare instances it does not, e.g., the number of copies of a gene, which is expressed frequently in the forms @Meghan_Pettine mentioned.

Thanks,

Qi

@QI_omop Please provide the examples of their use (e.g. substances).

Usually UCUM already has everything you need, so we will have to look into the use cases prior to adding things.
And “protein” is definitely not substance agnostic. Which protein?

We don’t. {copy}={dummies}={each}={piece}=1. A unitless unit. In fact, everything that is in curly brackets is just for ornaments and doesn’t constitute a part of a unit. We need to add that to the description. @clairblacketer?

@Christian_Reich, according to the information provided by LOINC, the unit {copies} exists in UCUM, while we meet it again and again in custom mappings and map it to 0 :slightly_frowning_face:.

Please, see the attachments to find out more.
TableOfExampleUcumCodesForElectronicMessagingWithPreface.pdf (542.0 KB)
TableOfExampleUcumCodesForElectronicMessaging.xlsx (62.4 KB)

Yes. They do. But they carry no meaning. You can put anything you want into the curly brackets. If the unit only contains something in a curly bracket, then it is a unitless unit (concept_id = 0). You can also put in {copies}, {pieces}, {polinas}, 1.
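
To illustrate the point about curly brackets, here is a rough sketch that drops the annotations from a UCUM string and checks whether anything is left. It is a simplification under stated assumptions, not a UCUM parser: it handles flat annotations only, and the function names `strip_annotations` and `is_unitless` are hypothetical.

```python
import re

# A UCUM annotation: anything between a pair of curly brackets.
_ANNOTATION = re.compile(r"\{[^{}]*\}")


def strip_annotations(ucum: str) -> str:
    """Remove {…} annotations; a standalone annotation becomes the unity "1"."""

    def repl(match: re.Match) -> str:
        before = ucum[match.start() - 1] if match.start() > 0 else ""
        after = ucum[match.end()] if match.end() < len(ucum) else ""
        # Standalone means the annotation is a whole term by itself,
        # e.g. "{copies}" or the numerator of "{copies}/mL".
        standalone = before in ("", "/", ".", "(") and after in ("", "/", ".", ")")
        return "1" if standalone else ""

    return _ANNOTATION.sub(repl, ucum)


def is_unitless(ucum: str) -> bool:
    """True if nothing but annotations (or a bare "1") makes up the unit."""
    return strip_annotations(ucum) == "1"
```

Under this sketch `{copies}`, `{pieces}` and `{polinas}` all reduce to `1`, while `nmol/h/mg{protein}` reduces to `nmol/h/mg` and `{copies}/mL` to `1/mL`.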

It may still be relevant if the concept by itself doesn’t explicitly specify how the thing is measured.
E.g., the concept is a mutation analysis of the specified part of the chromosome. The result can be recorded as 3 {loci}/{genes}/{codons}/{nucleotides} affected, each meaning a different thing. And unless the detailed concept is provided, the unit is the only way to preserve the meaning.

Another story is the limitation of the analytical methods that arises when unit_concept_id = 0, so the ETL simply has to use one of the existing units.
What would be a workaround here? Use unit_concept_id = 1 or {polinas}?

Exactly what probably happens. But the subject of the measurement should be in the measurement_concept_id, not in some vague combination with the unit. The unit should contain canonical physical units (m, s, etc.) and non-canonical physical units (Units, barrels, etc.), and mean just that. Let’s not throw good money (units) after bad money (bad Measurement Concepts).

We could have the convention of having the unit_concept_id equal to NULL, 0 or 1, all meaning the same thing. The latter we would have to add as a UCUM. I have no preference, but we need to have @clairblacketer add whatever the community prefers. I’d say we go for 1.

And, if folks want those {} units - bring them on. It’s easy to add them.

Guys, I beg you not to create {polinas} as a new unit. Please.
But I am wondering why we ignore new codes created by UCUM. As far as I know, we always tend to preserve the structure and content of the source vocabularies as is. Why is this case the exception?

By the way, how often do we meet units below in real data?
32705 [wood’U] wood unit
9321 [hd_i] hand

Why exception? We can add what we need. And UCUM doesn’t have units, it is a notation.

What do we need?

It’s not just a question of convention. Analytical methods simply fail if a real unit (one with an exact concept_id) is not specified.

It’s like this. The US government passed an act that requires US transplant centers to submit outcomes data on all allogeneic transplants, both related and unrelated, when either the donor or the recipient resides in the United States, or if the collection or infusion takes place in a US center.

They made a bunch of forms to capture this information. That’s where these units come from.

Instead of poo-pooing on their existence and calling them illegitimate, can OMOP acknowledge they’re real? :pray: :pray: What’s that saying: more flies with honey than vinegar? :wink:

Friends:

I am saying it again: We CAN have these units. We create them using the UCUM notation and instantiate them as an OMOP concept. No need to convince anybody.

(All I was saying is that nominally you don’t need anything that is in curly brackets for the unit to be fully descriptive and complete. A unit of some indivisible thing - a piece, a copy, a Polina - is the same from the perspective of describing the physical entity. I’ll stop now with this. :slight_smile: )

You don’t want your own concept? Some people pay money to have a star named after them. This is free! :slight_smile:

Great to hear we’re on the same page. Let’s create the generic unit for pieces:

concept_code “1”
concept_name “Generic unit of any indivisible thing (piece, part, copy, etc.)”