
Confusion about 80/20 rule in source_to_concept_mapping

Hello everyone,

I am currently building a source_to_concept_map using Usagi, and I am a little confused about the 80/20 rule. For example, we have more than 5500 source codes, and by mapping the 1000 most frequent ones we covered more than 99% of code instances. Should I continue mapping until 80% of the distinct source codes are mapped, or does the 80% refer to instances, in which case we can proceed to the next phase?

tl;dr In your experience, what is the approximate threshold (in %) above which we should proceed to the next phase?

Thank you in advance!


Hello!
We usually map 99% of codes, and then just inspect the remaining concepts for any important unmapped ones.

So, to conclude: we should map approximately 5000 (of the total 5500) codes?

Thank you for your helpful response!

Are your source codes distinct? Maybe the first 500 of your concepts have a few thousand counts each and the others have 5-10 counts. Then the first 500 concepts will cover more than 99% of the mapping. But if every source code has a count of only one, 99% is practically all of the concepts.
If I understand you correctly, your 99% is in 1000 codes.
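
To make the difference between "percent of codes" and "percent of instances" concrete, here is a minimal Python sketch. It assumes you already have a frequency count per source code (for example, from the frequency column you feed into Usagi); the function names and the toy counts are hypothetical and only for illustration.

```python
def coverage_by_top_codes(counts, top_n):
    """Return (share of distinct codes, share of instances)
    covered by the top_n most frequent source codes.

    counts: dict mapping source code -> number of occurrences (assumed input).
    """
    ranked = sorted(counts.values(), reverse=True)
    top_n = min(top_n, len(ranked))
    code_share = top_n / len(ranked)
    instance_share = sum(ranked[:top_n]) / sum(ranked)
    return code_share, instance_share


def codes_needed(counts, target):
    """Return how many of the most frequent codes must be mapped
    to cover at least `target` (e.g. 0.95) of all instances."""
    ranked = sorted(counts.values(), reverse=True)
    total = sum(ranked)
    running = 0
    for n, count in enumerate(ranked, start=1):
        running += count
        if running / total >= target:
            return n
    return len(ranked)


# Toy, made-up counts: two frequent codes and a long tail of rare ones.
toy_counts = {"A": 900, "B": 90, "C": 5, "D": 3, "E": 2}
code_share, instance_share = coverage_by_top_codes(toy_counts, top_n=2)
print(f"{code_share:.0%} of codes cover {instance_share:.0%} of instances")
print("codes needed for 95% of instances:", codes_needed(toy_counts, 0.95))
```

With a skewed distribution like this, a small share of the distinct codes covers almost all instances, which is the situation described in the question.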

Yeah, exactly 99.6% is in 1000 codes; the other 4500 account for only 0.4%.

Just inspect the other 4500; maybe there is something important there, even with a low count.
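
One way to organize that inspection is to pull out the codes that are still unmapped and sort them by count, so the review starts with the most frequent of the rare ones. This is only a sketch: `counts` and `mapped_codes` are hypothetical inputs, not something Usagi produces under these names.

```python
def unmapped_for_review(counts, mapped_codes):
    """Return (code, count) pairs for codes not yet mapped,
    most frequent first, ties broken alphabetically by code.

    counts: dict mapping source code -> number of occurrences (assumed input).
    mapped_codes: set of source codes that already have a mapping (assumed input).
    """
    remaining = [(code, n) for code, n in counts.items()
                 if code not in mapped_codes]
    remaining.sort(key=lambda item: (-item[1], item[0]))
    return remaining


# Toy, made-up data: "A" and "B" are already mapped.
toy_counts = {"A": 900, "B": 90, "C": 5, "D": 3, "E": 5}
for code, n in unmapped_for_review(toy_counts, {"A", "B"}):
    print(code, n)
```

Sorting by count is just a convenient reading order; a low-count code can still be clinically important, so the whole list is worth a look.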


Thank you so much!
