
Mapping of Danish registries: Existing vocabulary mappings?

Hi All,

At the Danish Medicines Agency we are converting Danish health registries to OMOP, including LSR (the registry of drug sales) and LPR (the National Patient Register), which holds, among other things, information on hospital treatments.
Our tasks include mapping drugs and surgical procedure codes to OMOP standard vocabularies, but we want to avoid repeating work that has already been done.
In LSR each drug pack has a Nordic item number (VNR), an ATC code, a dose form and a drug strength. From this combined information it should be possible to map VNR to Clinical Drug concept IDs. We are very interested in hearing whether someone has already done (part of) this work and is able to share it with us.
In LPR the surgical procedure codes are based on NCSP (NOMESCO Classification of Surgical Procedures), and we would also like to know whether someone has previously mapped NCSP codes to OMOP.
It would be a great help if someone could tell us whether these mappings are available.


I know of people who use ATC to map to ingredient/clinical drug form (namely @mdewilde). What we (the Vocab Team) recommend is to get the ingredients from ATC, combine them with strength and form, and run the automated machinery that we have (open source on GitHub). It takes a larger initial investment, but the output is reliable and the process can be repeated once you have set up the pipeline.

We did some work on ATC, but not from ATC to clinical drug form. For most ATC codes it will not work in that direction.

Within the EHDEN project we worked on the Laegemiddelstyrelsen mappings. I assume this is the underlying drug vocabulary you are trying to map to RxNorm. We worked on this a couple of years ago to help some Danish partners get started with mapping to OMOP. It has not been updated since, and it was never completed. Most mappings were made at the drug level (rather than the ATC level) to keep much more detail of the underlying drug.

If this is the mapping you are talking about, let me know; maybe you can use some of it to get started.

Thanks for your answer. It sounds interesting that you have some mappings to concepts at the drug level, as this is what we are interested in (at minimum we want the Clinical Drug level). Our source data are from ‘The Register of Pharmaceutical Sales’ (Laegemiddelstatistikregisteret), where most drugs have a VNR (item number), an ATC code, a STRNUM value (strength) and a DOSFORM value (form) in a table called Laegemiddeloplysninger. I guess it could be the same data. Does it sound familiar?

Thanks a lot. What is the name of the tool? I think it might do exactly what we want to do. When I started looking for methods, I came across EHDEN/DrugMapping on GitHub, described as a "Tool for mapping drugs in a source vocabulary to RxNorm(Extension) concepts in the OMOP-CDM", but I have not looked much into it. Is that useful?

From what Marcel wrote, I understood that for the Danish data they produced some sort of file, not a tool, which can get you started but is not complete (Marcel, please correct me if I’m wrong :)).

As I said, the vocab team maintains generic code that helps to map source drugs to standard codes in a systematic way. If you have the resources, it would be a good option (and you can reach out if you need further guidance). If you don’t, using what Marcel and the team have already created, or mapping to the ingredient level, would be another possibility.

Hi Anna, I think we are interested in using the ‘boiler’ method (in the ‘mapping only’ mode), which you provided a link to. We do have some resources for it, but unfortunately not much time. If you can provide some guidance on how to get started, it would be much appreciated. It would also be good to know of ways to limit the amount of work.
If we want to skip ‘difficult’ mappings in the beginning, would it be a good idea to exclude combination packs? Are there other time-consuming cases?
Does it make sense to map to Clinical Drug concepts only and then include brand, quantity and supplier when time allows?

Yes, all of this makes perfect sense (excluding packs, brands, suppliers etc. and focusing on ingredient + form + dose combo). Other hard cases that might be easier to handle by hand are vaccines and insulins.

In terms of time needed, I’d plan for at least 40-80 dedicated hours. In our experience, you cannot map to Clinical Drugs with less than that, regardless of the approach you use. If you do have the time, the first step is to read Step 0 in the document and then follow the steps from there. We will of course help you along the way; this thread is a good place to keep the conversation going, and it can serve as a reference for others with similar needs.
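One way to set the hard cases aside before an automated run is to split the source drugs by ATC prefix. The sketch below is only an illustration in Python: the field names `vnr` and `atc`, the helper names and the sample rows are assumptions, and it covers vaccines (ATC group J07) and insulins (A10A) only. Combination packs would need a separate rule based on whatever pack information the source holds.

```python
# ATC prefixes for the groups named above as easier to map by hand.
# J07 = vaccines, A10A = insulins and analogues.
MANUAL_ATC_PREFIXES = ("J07", "A10A")


def needs_manual_review(atc_code: str) -> bool:
    """True if the ATC code falls in a group set aside for manual mapping."""
    return atc_code.strip().upper().startswith(MANUAL_ATC_PREFIXES)


def split_for_mapping(drugs):
    """Partition source drugs into (automated, manual) lists by ATC prefix."""
    automated, manual = [], []
    for drug in drugs:
        target = manual if needs_manual_review(drug["atc"]) else automated
        target.append(drug)
    return automated, manual


# Placeholder rows; the VNR values are made up for illustration.
sample = [
    {"vnr": "000001", "atc": "N02BE01"},  # paracetamol
    {"vnr": "000002", "atc": "A10AB01"},  # insulin (human)
    {"vnr": "000003", "atc": "J07BB02"},  # influenza vaccine
]
automated, manual = split_for_mapping(sample)
print([d["vnr"] for d in automated])
print([d["vnr"] for d in manual])
```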

This sounds like a good solution. Though, we have not made the final decision yet.

Meanwhile I hope you can clarify some things for me regarding the drug strength table (DS_stage), which I think could be a difficult table to fill.
- If we have the drug strength for solutions, gels, creams etc., should we then use this as numerator_value? Is the numerator unit then e.g. MG/ML? And do we not need the volume for this table?
- In many cases, the strength is given as MG per dose, and the pack size is the number of doses. Should we handle this differently?
- Other examples that confuse me are things like solutions in pens/syringes and powder in vials. Is it amount_value we need to fill in here?

Hope you can help.

Yay!!! You read it (it brings me so much joy)!!

In a nutshell:

  • Solid forms like tablets usually have MG in amount_unit. The same goes for things like syringes, where we don’t know the volume but do know the dose. Box size will be the number of tablets/syringes. That seems to be the case in your #2, but give me an example :)

  • Liquid forms use MG/ML as in your #1. For example, the RxNorm concept 40220876 (acetaminophen 10 MG/ML Injection) has 10 in numerator_value, MG in numerator_unit, NULL in denominator_value, ML in denominator_unit. You can also see it in drug_strength:
    select * from drug_strength where drug_concept_id=40220876;
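
The two cases above can be sketched in Python as follows. This is a minimal illustration, not part of the vocabulary tooling: the helper `strength_row` and its rule (a unit containing "/" is a concentration, anything else an absolute amount) are assumptions, and only the six strength fields named in this thread are filled.

```python
from typing import Dict, Optional, Union

StrengthRow = Dict[str, Optional[Union[float, str]]]


def strength_row(value: float, unit: str) -> StrengthRow:
    """Split a source strength such as (10, "MG/ML") into strength fields.

    Hypothetical rule: "MG/ML"-style units go to numerator/denominator,
    plain units such as "MG" go to amount_value/amount_unit.
    """
    row: StrengthRow = {
        "amount_value": None,
        "amount_unit": None,
        "numerator_value": None,
        "numerator_unit": None,
        "denominator_value": None,  # left NULL when the volume is not known
        "denominator_unit": None,
    }
    if "/" in unit:
        numerator_unit, denominator_unit = unit.split("/", 1)
        row["numerator_value"] = value
        row["numerator_unit"] = numerator_unit.strip()
        row["denominator_unit"] = denominator_unit.strip()
    else:
        row["amount_value"] = value
        row["amount_unit"] = unit.strip()
    return row


liquid = strength_row(10, "MG/ML")  # the acetaminophen injection example
solid = strength_row(500, "MG")     # e.g. a tablet; the value is illustrative
print(liquid)
print(solid)
```

Box size (the number of tablets or syringes) is deliberately left out here, since it is a property of the pack rather than of the strength.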

Thanks for your explanation, that was helpful. Unfortunately we have decided to begin with a less time-consuming approach, the EHDEN DrugMapping tool. However, I think we will come back to the boiler method at a later time.

Sorry for the late response. I was on vacation.

Yes, attribute names like VNR and STRNUM sound familiar. Anna was right: EHDEN/DrugMapping was the tool we used to create these mappings. It needs a little preprocessing to create the required standard input files (the specs can be found in the tool's GitHub repository). Just like the vocab boiler, the tool uses the decomposed drugs (ingredients, dose forms, units, etc.). Most ingredients are mapped automatically, because we are lucky that Laegemiddelstatistikregisteret also contains English terms for them. For the dose forms we used a manual mapping, and for a couple of ingredients we also needed some overrules.

@sbru: I will send you the files we used via email.

We did this about 3 years ago, so applying it to a later vocabulary version will probably result in some mappings that are no longer valid and/or mappings that could be done better because new drugs/ingredients/dose forms have been added. But hopefully you can use this as a quick start to get nice drug mappings.
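
A quick way to find mappings that have gone stale is to check each target against the current `concept` table. The Python sketch below assumes the mappings and the relevant concept rows have already been loaded into dictionaries; the function name, the source codes and the concept IDs are placeholders. It relies only on the `standard_concept` and `invalid_reason` columns of the OMOP `concept` table.

```python
def stale_mappings(mappings, concepts):
    """Return source codes whose target is missing, invalid or non-standard.

    mappings: {source_code: target_concept_id}
    concepts: {concept_id: {"standard_concept": ..., "invalid_reason": ...}}
    """
    stale = []
    for source_code, concept_id in mappings.items():
        concept = concepts.get(concept_id)
        if (
            concept is None                          # no longer in the vocabulary
            or concept["invalid_reason"] is not None  # e.g. "D" or "U"
            or concept["standard_concept"] != "S"     # no longer a standard concept
        ):
            stale.append(source_code)
    return stale


# Placeholder data: concept IDs 1-3 and the VNR-like codes are made up.
old_mappings = {"000001": 1, "000002": 2, "000003": 3, "000004": 4}
current_concepts = {
    1: {"standard_concept": "S", "invalid_reason": None},
    2: {"standard_concept": None, "invalid_reason": "U"},
    3: {"standard_concept": None, "invalid_reason": None},
}
to_review = stale_mappings(old_mappings, current_concepts)
print(to_review)
```

This only flags mappings that are no longer valid; it does not detect mappings that could now be done better because a closer concept has been added.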

Thanks a lot for the files. They are very helpful to us, also as a guide, since we are starting to use the EHDEN DrugMapping tool ourselves.
I hope you can also explain the process you used for drugs with multiple ingredients. The documentation says that we need to create a record in the ‘Generic Drugs file’ input file for each ingredient. So do we first need to find the multi-component drugs and identify each ingredient? For the single-ingredient drugs, I think it won’t be necessary to enter ingredient names, right?

The drugmapper does not use the terms of the drugs. Like the OHDSI boiler, it uses the underlying elements (ingredients, amounts, units and dose forms) to find the best possible match in RxNorm (Extension). These elements are mapped first, after which the definition of a drug is known in CDM concepts. Based on this, it tries to find the best match at the drug level, optionally with a margin for the amounts.
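
To make the idea of matching "with a margin for the amounts" concrete, here is a small Python sketch. It is not the DrugMapping tool's actual algorithm: the function names, the dictionary keys, the relative 5% default margin and the sample values are all assumptions chosen for illustration.

```python
def within_margin(source_amount, candidate_amount, margin):
    """True if the source amount is within a relative margin of the candidate."""
    if candidate_amount == 0:
        return source_amount == 0
    return abs(source_amount - candidate_amount) <= margin * candidate_amount


def best_match(source, candidates, margin=0.05):
    """Pick the candidate with the same ingredient, form and unit whose
    amount is closest to the source amount, or None if none is in range."""
    eligible = [
        c for c in candidates
        if c["ingredient"] == source["ingredient"]
        and c["form"] == source["form"]
        and c["unit"] == source["unit"]
        and within_margin(source["amount"], c["amount"], margin)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: abs(c["amount"] - source["amount"]))


# Illustrative values only; the "id" fields are placeholders.
source_drug = {"ingredient": "paracetamol", "form": "tablet", "amount": 500, "unit": "MG"}
candidate_drugs = [
    {"id": "A", "ingredient": "paracetamol", "form": "tablet", "amount": 490, "unit": "MG"},
    {"id": "B", "ingredient": "paracetamol", "form": "tablet", "amount": 1000, "unit": "MG"},
]
loose = best_match(source_drug, candidate_drugs, margin=0.05)
strict = best_match(source_drug, candidate_drugs, margin=0.0)
print(loose, strict)
```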

Therefore you need to specify all of the active ingredients, not only for multi-ingredient drugs but also for single-ingredient drugs. The drugmapper needs the ingredient names and tries to find a match in the vocabulary (via terms, synonyms, relations, etc.). If you provide the English terms and/or CAS codes, the tool will also use these to find the ingredient mapping. If no automatic match is found, you can use the manual fallback mappings. If the automatic match is invalid, you can use the manual overrule mappings.

See https://github.com/EHDEN/DrugMapping/blob/master/README.md for the file specs.
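
As a rough Python illustration of the two points above (one record per active ingredient, and the order in which overrule, automatic and fallback mappings apply), consider the sketch below. The record layout, function names, drug code and concept IDs are hypothetical; the real input columns are defined in the README linked above.

```python
def ingredient_records(drug):
    """Yield one record per active ingredient, also for single-ingredient drugs."""
    for ingredient in drug["ingredients"]:
        yield {
            "drug_code": drug["code"],
            "ingredient_name": ingredient["name"],
            "amount": ingredient["amount"],
            "unit": ingredient["unit"],
        }


def resolve_ingredient(name, automatic, fallback, overrule):
    """Manual overrule beats the automatic match; fallback is used last."""
    key = name.strip().lower()
    if key in overrule:
        return overrule[key]
    if key in automatic:
        return automatic[key]
    return fallback.get(key)  # None if the ingredient is unmapped everywhere


# Placeholder drug and placeholder concept IDs (1, 2, 3, 99).
combo = {
    "code": "000010",
    "ingredients": [
        {"name": "Paracetamol", "amount": 500, "unit": "MG"},
        {"name": "Codeine", "amount": 30, "unit": "MG"},
    ],
}
records = list(ingredient_records(combo))
automatic = {"paracetamol": 1, "codeine": 99}
overrule = {"codeine": 2}       # the automatic match was judged invalid
fallback = {"some extract": 3}  # no automatic match was found

for record in records:
    print(record["ingredient_name"],
          resolve_ingredient(record["ingredient_name"], automatic, fallback, overrule))
```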

Okay, thanks. I just thought the ATC code was enough to specify the ingredient in most cases, since ATC codes are often mapped to one or more RxNorm ingredients in the OMOP vocabularies. But I understand now that we actually need to obtain a list of all ingredient names in some other way.

@sbru we are starting a project to map Danish registries as well. Can we coordinate our efforts?

Hi @mdewilde

Would it be possible for you to send the files you used to me? We’re also in the process of mapping to OMOP. My mail is nreece20@student.aau.dk.

Hi Tomer,
We have come quite far in the work already, as we have converted eight health data registries and our OMOP database is created.
We still need to map the LAB and vaccine registry and for the national patient registry we haven’t mapped procedure codes or Danish versions of ICD10 codes yet.
It could be interesting to hear if you are interested in collaborating on these topics and contributing actively via our partners at NGC. All of our ETL logic is open source and we are happy to share this.

That’s wonderful to hear! We have some resources to help with the LAB, procedure codes and ICD10 codes if you want to collaborate on this. Let’s continue this discussion by email?