OHDSI Home | Forums | Wiki | Github

Call for frequency of LOINC codes in your data

In OMOP vocabularies, SNOMED and LOINC are two foundations of the Measurement domain.

Many measurements or Lab tests exist in both Snomed and LOINC and can be chosen during ETL. They are similar to each other, but not identical, typically LOINC is much more granular than SNOMED. Given the differences in the granularity of the concepts, it is impossible to build simple horizontal ‘Maps to’ relationships in most cases. To harmonize the two vocabularies, we instead started creating a hierarchy, where LOINC concepts are descendants of SNOMED. This will support the analytical use case of selecting a whole group of lab tests by picking only one top-level concept, like what we have in Conditions or Procedures.

The problem is - there are thousands of LOINC codes. To prioritise our work, we need to know LOINCs that are used in the data more often. Therefore, we ask you to calculate some numbers for us.

Please use the scripts below to calculate frequency of LOINC codes in your data (SQL and R adaptation, connection details should be modified in R script). Share the counts here or in PM if you have trouble attaching documents to forum posts.

SQL
SELECT a.concept_code, sum(a.counts) AS counts FROM (

SELECT c.concept_code, count(*) AS counts
FROM procedure_occurrence proc
JOIN concept c
ON proc.procedure_concept_id = c.concept_id
WHERE c.vocabulary_id = 'LOINC'
GROUP BY c.concept_code

UNION ALL

SELECT c.concept_code, count(*) AS counts
FROM observation obs
JOIN concept c
ON obs.observation_concept_id = c.concept_id
WHERE c.vocabulary_id = 'LOINC'
GROUP BY c.concept_code

UNION ALL

SELECT c.concept_code, count(*) AS counts
FROM measurement meas
JOIN concept c
ON meas.measurement_concept_id = c.concept_id
WHERE c.vocabulary_id = 'LOINC'
GROUP BY c.concept_code
) AS a

GROUP BY a.concept_code

ORDER BY counts DESC
;
R code
library(dplyr)
connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = dbms, 
                                                                user = user,
                                                                password = password, 
                                                                server = server, 
                                                                port = port, 
                                                                extraSettings = extraSettings, 
                                                                connectionString =  connectionString, 
                                                                pathToDriver = pathToDriver)
cdmDatabaseSchema <- "YOUR_DATABASE_SCHEMA"

connection <- DatabaseConnector::connect(connectionDetails)

query <- "
SELECT a.concept_code, sum(a.counts) AS counts FROM (
  
  SELECT c.concept_code, count(*)

  AS counts
  FROM @cdm_database_schema.procedure_occurrence proc
  JOIN @cdm_database_schema.concept c
  ON proc.procedure_concept_id = c.concept_id
  WHERE c.vocabulary_id = 'LOINC'
  GROUP BY c.concept_code
  
  UNION ALL
  
  SELECT c.concept_code, count(*)

  AS counts
  FROM @cdm_database_schema.observation obs
  JOIN @cdm_database_schema.concept c
  ON obs.observation_concept_id = c.concept_id
  WHERE c.vocabulary_id = 'LOINC'
  GROUP BY c.concept_code
  
  UNION ALL
  
  SELECT c.concept_code, count(*)

  AS counts
  FROM @cdm_database_schema.measurement meas
  JOIN @cdm_database_schema.concept c
  ON meas.measurement_concept_id = c.concept_id
  WHERE c.vocabulary_id = 'LOINC'
  GROUP BY c.concept_code
) AS a

GROUP BY a.concept_code

ORDER BY counts DESC
;
"

sql <- SqlRender::render(query, cdm_database_schema = cdmDatabaseSchema) %>% 
  SqlRender::translate(connectionDetails$dbms)
result <- DatabaseConnector::executeSql(connection, sql)

Thank you for your collaboration!

This is an excellent initiative, @zhuk! I have had many people ask me how to find “all serum cholesterol” tests and the like before. And telling people they need to use keyword search is usually met with a huge “sigh”.

I’ll send Colorado’s results.

Hi @zhuk I think this would be a great opportunity for us to learn from the OHDSI Evidence Network. Currently the network characterization only contains standard concepts but we can still use the information regarding the standard LOINC codes that are represented around the community.

Hey :slight_smile: I have the data for Stanford. Can I get your email to send them over?
We have around 6.7k LOINC codes (but only 800 or so with more than 10,000 occurrences)

1 Like

Hi Oleg, happy to help.

1 Like
t