Advice on Choosing Concept Granularity in a Researcher Data Request Platform

bflook08307 · May 22, 2025, 12:14pm

Hello everyone,

I’m currently developing a platform that allows researchers to request clinical data across the Condition, Procedure, Drug, and Measurement domains.
When a request is submitted, a linked mobile app will use the selected criteria to recruit matching participants with incentives (e.g., rewards).

I’m facing a challenge around the level of concept granularity that researchers should be allowed to select:

If the platform only offers very abstract concepts (e.g., liver disease, heart disease), the collected data might be too broad and not aligned with the research intent.
If it only offers very specific concepts (e.g., diabetic neuropathy, hypertensive heart disease), recruitment may become difficult due to limited eligible participants.

So I would love to hear your thoughts on:

1. Is it appropriate to limit concept selection to only those with relationship_id = 'Maps to'?
2. How do you recommend determining the optimal level of concept granularity for researchers to choose from?
3. Are there any recommended or community-maintained concept sets (e.g., for common diseases like type 2 diabetes, hypertension, etc.) that would be suitable for researcher selection in this kind of use case?
4. Due to local requirements, we must support KCD7 (Korean Classification of Diseases 7th Revision), which is not a standard vocabulary in OMOP. Would it still be acceptable to use these concepts in research if they do not map to standard concepts?
5. For RxNorm, would using the concept_class_id hierarchy (e.g., Ingredient, Brand Name, Clinical Drug, etc.) be sufficient to define appropriate levels of selection granularity for researchers?

We’re currently working with the following vocabularies: KCD7, EDI, RxNorm, LOINC.

Any insight, best practices, or shared experience would be greatly appreciated!

Christian_Reich · May 25, 2025, 11:19pm

Hi @bflook08307:

Welcome to the family!

You are pointing out an important problem: in a hierarchical system of concepts, are there strata with the right level of granularity for a given purpose? Yours is one use case; the typical rolled-up Table 1 reporting on conditions and treatments is another. This has been discussed before.

We don’t have a solution for that, unfortunately. You could be part of creating the solution!
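In the absence of agreed strata, one pragmatic way to explore the trade-off between broad and narrow concepts is to count how many people would qualify for each candidate concept, including all of its descendants. The sketch below assumes participant data sits in an OMOP-style condition_occurrence table alongside concept_ancestor; the tables are trimmed to the columns used, and every concept ID and person ID is made up for illustration.

```python
import sqlite3

def build_granularity_demo() -> sqlite3.Connection:
    """In-memory database with trimmed OMOP-style tables and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept_ancestor (
            ancestor_concept_id INTEGER,
            descendant_concept_id INTEGER
        );
        CREATE TABLE condition_occurrence (
            person_id INTEGER,
            condition_concept_id INTEGER
        );
        -- Hypothetical hierarchy: 10 is broad, 11 and 12 are its children.
        -- concept_ancestor also holds a self-row for every concept.
        INSERT INTO concept_ancestor VALUES
            (10, 10), (10, 11), (10, 12), (11, 11), (12, 12);
        INSERT INTO condition_occurrence VALUES
            (1, 11), (2, 12), (3, 11), (3, 12), (4, 10);
    """)
    return conn

COUNT_SQL = """
SELECT COUNT(DISTINCT co.person_id)
FROM condition_occurrence AS co
JOIN concept_ancestor AS ca
  ON ca.descendant_concept_id = co.condition_concept_id
WHERE ca.ancestor_concept_id = ?
"""

def eligible_person_counts(conn, candidate_concept_ids):
    """Count distinct persons with the concept or any of its descendants."""
    return {
        concept_id: conn.execute(COUNT_SQL, (concept_id,)).fetchone()[0]
        for concept_id in candidate_concept_ids
    }

if __name__ == "__main__":
    demo = build_granularity_demo()
    print(eligible_person_counts(demo, [10, 11, 12]))
```

Comparing the counts for a parent and its children gives a rough sense of how much recruitment pool is lost by moving one level down the hierarchy.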

If you are working in OMOP, you shouldn’t use source concepts. If they are properly mapped, you should build your concept set from the standard concepts they map to. If they are not mapped, you should get them mapped.
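As a rough illustration of that check, the sketch below follows 'Maps to' rows in concept_relationship from source concepts to standard concepts and lists the source codes that still need mapping. The tables are trimmed to the columns used (the real ones also carry validity dates and other fields), and the concept IDs, codes, and names are invented placeholders, not real KCD7 or SNOMED content.

```python
import sqlite3

def build_mapping_demo() -> sqlite3.Connection:
    """In-memory database with trimmed OMOP vocabulary tables and fake rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT,
            vocabulary_id TEXT,
            concept_code TEXT,
            standard_concept TEXT
        );
        CREATE TABLE concept_relationship (
            concept_id_1 INTEGER,
            concept_id_2 INTEGER,
            relationship_id TEXT
        );
        -- Placeholder concepts: one mapped source code, one unmapped.
        INSERT INTO concept VALUES
            (1001, 'Demo source code A', 'KCD7', 'X00.0', NULL),
            (1002, 'Demo source code B', 'KCD7', 'X00.1', NULL),
            (2001, 'Demo standard condition', 'SNOMED', '000001', 'S');
        INSERT INTO concept_relationship VALUES
            (1001, 2001, 'Maps to'),
            (2001, 2001, 'Maps to');
    """)
    return conn

MAPS_TO_SQL = """
SELECT src.concept_code, std.concept_id
FROM concept AS src
LEFT JOIN concept_relationship AS cr
       ON cr.concept_id_1 = src.concept_id
      AND cr.relationship_id = 'Maps to'
LEFT JOIN concept AS std
       ON std.concept_id = cr.concept_id_2
      AND std.standard_concept = 'S'
WHERE src.vocabulary_id = ?
ORDER BY src.concept_id
"""

def map_source_concepts(conn, vocabulary_id):
    """Return ({source_code: [standard_ids]}, [source codes with no mapping])."""
    rows = conn.execute(MAPS_TO_SQL, (vocabulary_id,)).fetchall()
    mapped = {}
    for code, std_id in rows:
        if std_id is not None:
            mapped.setdefault(code, []).append(std_id)
    # Preserve query order and drop duplicates among unmapped codes.
    unmapped = list(dict.fromkeys(c for c, _ in rows if c not in mapped))
    return mapped, unmapped

if __name__ == "__main__":
    print(map_source_concepts(build_mapping_demo(), "KCD7"))
```

The unmapped list is the work queue: those are the source codes that would need a mapping before they can take part in a standard concept set.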

The only thing I can think of is the phenotype library.

See answer to 1.

Usually, you need to roll up beyond the ingredient into ATC.
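A minimal sketch of such a roll-up, assuming the ATC classes appear as ancestors of RxNorm drug concepts in concept_ancestor and are labelled with concept_class_id values such as 'ATC 3rd' or 'ATC 5th'. The tables are trimmed to the columns used, and all concept IDs, codes, and names are placeholders rather than real RxNorm or ATC entries.

```python
import sqlite3

def build_atc_demo() -> sqlite3.Connection:
    """In-memory database with trimmed OMOP vocabulary tables and fake rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER PRIMARY KEY,
            concept_name TEXT,
            vocabulary_id TEXT,
            concept_class_id TEXT,
            concept_code TEXT,
            standard_concept TEXT
        );
        CREATE TABLE concept_ancestor (
            ancestor_concept_id INTEGER,
            descendant_concept_id INTEGER
        );
        -- Placeholder drug hierarchy: clinical drug -> ingredient -> ATC classes.
        INSERT INTO concept VALUES
            (3001, 'Demo ingredient', 'RxNorm', 'Ingredient', 'R-1', 'S'),
            (3002, 'Demo clinical drug', 'RxNorm', 'Clinical Drug', 'R-2', 'S'),
            (4001, 'Demo ATC 3rd-level class', 'ATC', 'ATC 3rd', 'Z99Z', 'C'),
            (4002, 'Demo ATC 5th-level class', 'ATC', 'ATC 5th', 'Z99ZZ99', 'C');
        INSERT INTO concept_ancestor VALUES
            (4001, 4002), (4001, 3001), (4001, 3002),
            (4002, 3001), (4002, 3002), (3001, 3002);
    """)
    return conn

ATC_ROLLUP_SQL = """
SELECT atc.concept_code, atc.concept_name
FROM concept_ancestor AS ca
JOIN concept AS atc
  ON atc.concept_id = ca.ancestor_concept_id
WHERE ca.descendant_concept_id = ?
  AND atc.vocabulary_id = 'ATC'
  AND atc.concept_class_id = ?
ORDER BY atc.concept_code
"""

def rollup_to_atc(conn, drug_concept_id, atc_class="ATC 3rd"):
    """Return (code, name) pairs of ATC ancestors at the requested level."""
    return conn.execute(ATC_ROLLUP_SQL, (drug_concept_id, atc_class)).fetchall()

if __name__ == "__main__":
    demo = build_atc_demo()
    print(rollup_to_atc(demo, 3002, "ATC 3rd"))
```

Because a drug can sit under more than one ATC class, the function returns a list; a selection screen built on it would need to decide how to present drugs that appear in several classes.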