Once they’re deStandardized and mapped over to the OMOP Extensions, I think we can generate such a list as a vocabulary mapping look-up of SNOMED, ICDO3, NAACCR and others, sorted by the count of source concepts mapped to each modifier/variant.
Mapping to SNOMED, ICDO3, NAACCR is an additional step; it is better to look at the source data right away:
there are always these ER, HER2, ERBB2, BRCA1/BRCA2, etc.
We can run a network study to generate the list of the most common biomarkers, and then, when you standardize by mapping to the OMOP Extension, you can focus on those concepts.
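As a rough illustration of what each site in such a network study might compute, here is a minimal Python sketch. It assumes each site can export its raw biomarker source values as a list of strings; the function names, the tiny synonym table, and the sample data are all hypothetical and only show the counting and pooling idea, not an actual OHDSI study package.

```python
from collections import Counter

# Hypothetical synonym table: ERBB2 and HER2 refer to the same gene,
# so they are counted together. A real study would need a curated list.
SYNONYMS = {"ERBB2": "HER2"}


def normalize(source_value: str) -> str:
    """Collapse case, spaces and hyphens so 'her-2' and 'HER2' match."""
    key = source_value.strip().upper().replace("-", "").replace(" ", "")
    return SYNONYMS.get(key, key)


def site_counts(source_values):
    """Count normalized biomarker names at one site (run locally)."""
    return Counter(normalize(v) for v in source_values)


def network_ranking(per_site_counts):
    """Pool the aggregate counts from all sites and rank by frequency."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    return total.most_common()


# Made-up source values for two sites, for illustration only.
site_a = ["ER", "HER2", "her-2", "BRCA1", "ER"]
site_b = ["ERBB2", "ER", "BRCA2", "BRCA1"]

ranking = network_ranking([site_counts(site_a), site_counts(site_b)])
for name, count in ranking:
    print(name, count)
```

Only the aggregate counts leave each site, so no patient-level data would need to be shared; the ranked output is the shortlist to prioritize when mapping to the OMOP Extension.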