Please use this spot to document:
- Describe the issue/topic?
- What do we know about this topic? What has been discussed?
- What are recommendations for how to handle this issue/topic?
- What next steps should be taken?
Related posts:
@Christian_Reich has a very good proposal linked above. It covers smokeless tobacco consumption and nicotine smoking, not marijuana or other substances. Since we don’t have a lot of time during our Themis workshop, we need to be focused on a solution for nicotine.
The proposal:
- Creates a hierarchy with smoking at the top.
- Covers e-cigarettes/vape pens, pipe, passive/second-hand smoke, smokeless tobacco, etc.
- Records intensity of smoking in categories from trivial to very heavy.
- Previous or ex-smoker gets relegated to the “History of” category, like all other “history of” concepts.
One item not covered is the “pack years” concept. There is no standard concept_id for this idea; UK Biobank has a non-standard concept_id = 35811050. Do we make this concept standard, and then have observation.value_as_number contain the number of pack years? We will need to keep it outside the hierarchy, because pack years can’t be classified as trivial versus heavy without taking the person’s age into account. Example: a 20 pack-year record means something very different for a 30-year-old than for an 80-year-old. And as Christian and Oleg point out, pack years measure cumulative damage, whereas cigarettes per day measure current usage.
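As a rough sketch of the idea above: pack years are conventionally computed as packs per day (20 cigarettes per pack) multiplied by years smoked, and the result could go into value_as_number. The function names and the row layout here are illustrative only, and concept_id 35811050 is the non-standard UK Biobank concept mentioned above, used as a placeholder until a standard concept is agreed.

```python
from dataclasses import dataclass
from datetime import date

CIGARETTES_PER_PACK = 20  # conventional pack size in the pack-year definition
# Non-standard UK Biobank concept mentioned in the post; placeholder only.
PACK_YEARS_CONCEPT_ID = 35811050


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Pack years = (cigarettes per day / 20) * years smoked."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day / CIGARETTES_PER_PACK * years_smoked


@dataclass
class ObservationRow:
    """Hypothetical, trimmed-down view of an OBSERVATION record."""
    person_id: int
    observation_concept_id: int
    observation_date: date
    value_as_number: float


def pack_years_observation(person_id: int, observation_date: date,
                           cigarettes_per_day: float,
                           years_smoked: float) -> ObservationRow:
    # The cumulative measure goes into value_as_number; no intensity
    # category is derived from it, in line with keeping it outside the hierarchy.
    return ObservationRow(
        person_id=person_id,
        observation_concept_id=PACK_YEARS_CONCEPT_ID,
        observation_date=observation_date,
        value_as_number=pack_years(cigarettes_per_day, years_smoked),
    )
```

For example, `pack_years_observation(1, date(2022, 10, 16), 20, 20)` would carry `value_as_number = 20.0`, whether the person is 40 or 80 years old.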
This proposal also does not cover:
Survey data. Survey data has a working group and should continue to be modeled and debated there.
Tobacco and nicotine data are not easy for a number of reasons. It’s not always recorded, it’s patient reported, it’s not always reported accurately, it can begin years or decades before our Observation Period starts, there’s lots of free text out there, etc. Our goal is solid conventions covering the majority of use cases. And as our data, vocabularies, recording methods, and use cases evolve, so will the conventions and the CDM.
Thoughts?
On Sunday, October 16th at the CDM workshop we discussed the above issue. We came to the following conclusions:
1. Describe the issue/topic?
Currently, the conventions lack guidance on how tobacco usage should be stored in the CDM. There are multiple standard concept_ids with the same/similar meaning in different domains.
2. What do we know about this topic? What has been discussed?
This has been discussed on the forums for many years, and it was also discussed at the original Themis meeting in 2018. See the forum post above for the most recent and comprehensive discussion.
3. What are recommendations for how to handle this issue/topic?
Proposed Solution: Tobacco and nicotine data are not easy for a number of reasons. It’s not always recorded, it’s patient reported, it’s not always reported accurately, it can begin years or decades before our Observation Period starts, there’s lots of free text out there, etc. Our goal is solid conventions covering the majority of use cases. And as our data, vocabularies, recording methods, and use cases evolve, so will the conventions and the CDM.
* We don’t need the exact number of cigarettes (it is probably false precision anyway).
* Cigarette frequency is for current usage.
You map your data in at the granularity you have. So, if the record is Heavy Cigarette smoker, you pick the concept_id representing this idea. If you have a flag for Cigarette smoker: yes, then you pick the concept_id representing this idea.
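A minimal sketch of "map at the granularity you have", assuming a source with a yes/no cigarette smoker flag and an optional free-text intensity. The field names and the function are hypothetical, and the sketch returns concept names rather than concept_ids because the thread leaves the choice of standard concept_ids to the Vocabulary team.

```python
from typing import Optional

# Intensity categories named in the proposal (trivial to very heavy).
KNOWN_INTENSITIES = {"trivial", "light", "moderate", "heavy", "very heavy"}


def pick_concept_name(is_cigarette_smoker: Optional[bool],
                      intensity: Optional[str]) -> Optional[str]:
    """Return the name of the most specific concept the source record supports.

    Returns None when the record does not say the person smokes cigarettes;
    other statuses (never, ex-smoker, ...) are out of scope for this sketch.
    """
    if not is_cigarette_smoker:
        return None
    if intensity is not None:
        normalized = intensity.strip().lower()
        if normalized in KNOWN_INTENSITIES:
            # Granular record: e.g. "Heavy cigarette smoker".
            return f"{normalized.capitalize()} cigarette smoker"
    # Only a yes/no flag (or an unrecognized intensity): use the general concept.
    return "Cigarette smoker"
```

The point of the fallback is that a less detailed source is never upgraded to a more specific concept than it actually recorded.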
4. What next steps should be taken?
The Vocabulary team will notify us which concept_ids will be standard for each idea above, and we can put this information into the CDM conventions.
The Vocab team will make some concepts non-standard, map them to standard concepts, and create a new hierarchy.
Do we have a concept for each of these, or will we need to create some de-novo?
We’ll recreate the whole story de novo in OMOP Extension so that it is clean and not confused by the old hierarchy, other relationships, and different flavors of meaning.
Like we did in Specialty and Visit?
There we mostly preserved the source semantics, structure, and entire donor vocabularies.
Rather like Cancer Modifier…
Do we have a convention agreed already?
If yes, when can we expect the proper concepts?
We’re planning to release them in December
@Alexdavv what are your thoughts about working with the SDOs to get the content added to the appropriate vocabulary (LOINC or SNOMED) rather than creating the concepts in the OMOP Extension?
We - and probably other organizations - currently store concept codes for tobacco smoking information natively in our EHR (SNOMED concept ids when captured in social history, LOINC codes when captured in flowsheets)
Only concepts from industry standard vocabularies - not OMOP Extension concepts of course - can be mapped natively in the EHR.
Best,
Piper
Because it would take an enormous amount of time. Even within OMOP, it took us years to agree, author the content, and release (hopefully, this week)?! Besides that, in most cases it’s the other way around: they have much more content than we need. And that content is modeled and organized in the hierarchy differently from what we can easily adopt in OMOP.
Dear Community, I want to share the updates we made on 16 January and tell you about our future plans.
We introduced a set of concepts related to tobacco or its derivatives, organized along a few axes in the OMOP Extension vocabulary, to accompany these ETL smoking conventions. The top concept of the hierarchy is Findings of tobacco or its derivatives use or exposure. Tobacco users are now defined according to the type of product they use (Smokeless, Electronic, Cigarettes, Cigars, etc.), while cigarette smokers are also classified according to the severity of smoking (Trivial, Light, Moderate, Heavy, Very heavy). Cigarettes pack-years smoked during life is intended to capture the cumulative consumption of cigarettes.
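For sources that record cigarettes per day rather than a category, the severity axis could be derived along these lines. The cut-offs are an assumption: they are the ranges embedded in the SNOMED concept names (for example "Heavy cigarette smoker (20-39 cigs/day)"), and this post does not state which thresholds the OMOP Extension concepts use.

```python
def cigarette_severity(cigs_per_day: float) -> str:
    """Bin a cigarettes-per-day count into a severity category.

    Thresholds follow the ranges in the SNOMED concept names
    (assumed here, not confirmed for OMOP Extension):
      Trivial     < 1
      Light       1-9
      Moderate    10-19
      Heavy       20-39
      Very heavy  40+
    """
    if cigs_per_day <= 0:
        raise ValueError("a current smoker must have a positive count")
    if cigs_per_day < 1:
        return "Trivial"
    if cigs_per_day < 10:
        return "Light"
    if cigs_per_day < 20:
        return "Moderate"
    if cigs_per_day < 40:
        return "Heavy"
    return "Very heavy"
```

Binning like this keeps only the category, which fits the earlier conclusion that the exact number of cigarettes is probably false precision.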
The newly created concepts are Standard and should be used during ETL processes and mapping. We are going to destandardize concepts from other vocabularies in future releases. For now, our priority is SNOMED Standard concepts, as many Non-Standard concepts from other vocabularies have a “Maps to” link to them.
We would like to share an update on our progress with the Smoking hierarchy: in this v20230531 release, we have remapped smoking-related SNOMED concepts to new OMOP Extension concepts. As a result of these changes, certain concepts from source vocabularies such as ICD9CM, Read, CIEL, etc., have lost their mappings.
Each vocabulary has its own load_stage, meaning that to extend these missing mappings to new OMOP Extension concepts, we must execute the load_stage of each vocabulary. This process often exposes numerous pitfalls due to variations in vocabulary versions, interdependencies, relationships, and domain assignments. Consequently, running each vocabulary and achieving satisfactory results consumes a considerable amount of time. We will address this issue in future releases.
We identified some issues that need to be discussed with the community:
We believe it would be beneficial to have a new concept that serves as a parent for all types of smokers. This would provide a more comprehensive and inclusive representation of smoking behavior within the OMOP Extension.
What do you think about it? Do we really need a new concept that would encompass all smokers?
concept_id_1 | concept_name_1 | relationship_id | concept_id_2 | concept_name_2 |
---|---|---|---|---|
4052949 | Ex-cigar smoker | Maps to | 903651 | Currently doesn’t use tobacco or its derivative |
4052949 | Ex-cigar smoker | Maps to | 1340204 | History of event |
4052949 | Ex-cigar smoker | Maps to value | 903664 | Cigar smoker |
4092281 | Ex-cigarette smoker | Maps to | 903651 | Currently doesn’t use tobacco or its derivative |
4092281 | Ex-cigarette smoker | Maps to | 1340204 | History of event |
4092281 | Ex-cigarette smoker | Maps to value | 903657 | Cigarette smoker |
44811943 | Ex user of electronic cigarette | Maps to | 1340204 | History of event |
44811943 | Ex user of electronic cigarette | Maps to value | 903655 | Electronic cigarette smoker |
4052465 | Ex-pipe smoker | Maps to | 1340204 | History of event |
4052465 | Ex-pipe smoker | Maps to value | 903663 | Pipe smoker |
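To make the table concrete, here is a hedged sketch of how an ETL might turn these relationships into records. The concept_ids are taken from the rows above; everything else is an assumption for illustration: the function name, the field names, and in particular the rule that a “Maps to value” target is attached only to the History of event record when a source concept has more than one “Maps to” target.

```python
HISTORY_OF_EVENT = 1340204

# (concept_id_1, relationship_id, concept_id_2), copied from the table above.
RELATIONSHIPS = [
    (4052949, "Maps to", 903651),         # Ex-cigar smoker -> Currently doesn't use tobacco
    (4052949, "Maps to", 1340204),        # Ex-cigar smoker -> History of event
    (4052949, "Maps to value", 903664),   # value: Cigar smoker
    (44811943, "Maps to", 1340204),       # Ex user of electronic cigarette -> History of event
    (44811943, "Maps to value", 903655),  # value: Electronic cigarette smoker
]


def observation_rows(source_concept_id, relationships=RELATIONSHIPS):
    """Build one record per 'Maps to' target for a source concept."""
    targets = [c2 for c1, rel, c2 in relationships
               if c1 == source_concept_id and rel == "Maps to"]
    values = [c2 for c1, rel, c2 in relationships
              if c1 == source_concept_id and rel == "Maps to value"]
    rows = []
    for target in targets:
        if target == HISTORY_OF_EVENT and values:
            # Assumption: the value qualifies the "History of event" record.
            for value in values:
                rows.append({"observation_concept_id": target,
                             "value_as_concept_id": value})
        else:
            rows.append({"observation_concept_id": target,
                         "value_as_concept_id": None})
    return rows
```

Under these assumptions, `observation_rows(4052949)` yields two records: one for "Currently doesn’t use tobacco or its derivative" with no value, and one for "History of event" with "Cigar smoker" as the value.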
A similar situation arises with the concept Never used tobacco or its derivatives. This concept should not be equated with “Never smoked tobacco.” There may be individuals who have never smoked tobacco but have used other forms of tobacco or derivatives.
We need to establish clear rules for when this concept is appropriate and consider the possibility of adding new concepts or modifying the existing one.
concept_id_1 | concept_name_1 | relationship_id | concept_id_2 | concept_name_2 |
---|---|---|---|---|
42534813 | Maternal tobacco use in pregnancy | Maps to | 4270154 | Pregnancy observable |
42534813 | Maternal tobacco use in pregnancy | Maps to | 903654 | Tobacco or its derivatives user |
40486722 | Stopped smoking before pregnancy | Maps to | 1340204 | History of event |
40486722 | Stopped smoking before pregnancy | Maps to value | 903654 | Tobacco or its derivatives user |
40486721 | Stopped smoking during pregnancy | Maps to | 1340204 | History of event |
40486721 | Stopped smoking during pregnancy | Maps to value | 903654 | Tobacco or its derivatives user |
40486696 | Smoked before confirmation of pregnancy | Maps to | 1340204 | History of event |
40486696 | Smoked before confirmation of pregnancy | Maps to value | 903654 | Tobacco or its derivatives user |
It is important to acknowledge that this mapping decision compromises the granularity and functionality of SNOMED. We would like to ensure that everyone agrees that this is a suitable decision.
concept_id_1 | concept_name_1 | relationship_id | concept_id_2 | concept_name_2 |
---|---|---|---|---|
37109024 | Tobacco dependence caused by chewing tobacco | Maps to | 4209423 | Nicotine dependence |
37109024 | Tobacco dependence caused by chewing tobacco | Maps to | 903667 | Chewing tobacco user |
3655996 | Tobacco dependence with current use | Maps to | 4209423 | Nicotine dependence |
3655996 | Tobacco dependence with current use | Maps to | 903654 | Tobacco or its derivatives user |
602340 | Tobacco dependence caused by cigarettes in remission | Maps to | 602340 | Tobacco dependence in remission |
602340 | Tobacco dependence caused by cigarettes in remission | Maps to | 1340204 | History of event |
602340 | Tobacco dependence caused by cigarettes in remission | Maps to value | 903657 | Cigarette smoker |
37110445 | Nicotine dependence with current use | Maps to | 4209423 | Nicotine dependence |
37110445 | Nicotine dependence with current use | Maps to value | 903654 | Tobacco or its derivatives user |
What is your opinion on this matter?
Good questions
So if your data is granular enough, use the specific type of smoker. If not, use Cigarette smoker. If you need all smokers, just pick all of them while building your concept set / cohort.
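"Pick all of them" could look roughly like the following, with a concept set assembled from the smoker concepts that appear in the tables in this thread. The row layout and function name are hypothetical, and the sketch only checks observation_concept_id, on the assumption that records carrying a smoker concept as a value (for example under History of event) describe past rather than current smoking.

```python
# Smoker concepts quoted in this thread (OMOP Extension).
ALL_SMOKERS = {
    903657: "Cigarette smoker",
    903664: "Cigar smoker",
    903663: "Pipe smoker",
    903655: "Electronic cigarette smoker",
}


def persons_in_concept_set(rows, concept_ids):
    """Return the person_ids with at least one record in the concept set."""
    return {row["person_id"] for row in rows
            if row["observation_concept_id"] in concept_ids}


# Hypothetical records for illustration only.
sample_rows = [
    {"person_id": 1, "observation_concept_id": 903657},  # cigarette smoker
    {"person_id": 2, "observation_concept_id": 903663},  # pipe smoker
    {"person_id": 3, "observation_concept_id": 903667},  # chewing tobacco user
    {"person_id": 4, "observation_concept_id": 1340204,  # history of event
     "value_as_concept_id": 903657},
]

smokers = persons_in_concept_set(sample_rows, ALL_SMOKERS.keys())
```

Here `smokers` contains persons 1 and 2 only: the chewing tobacco user and the ex-smoker are left out without needing a dedicated parent concept for all smokers.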
Let’s assume a patient smoked cigarettes for 5 years, then switched to moist tobacco for 2 years, then smoked 2 packs in a single day for some reason, and then switched to something else. Would that be accurately reflected in the data? How should we treat this patient? Yes, Tobacco or its derivatives user is the right concept.
Now, the never-users. We can call it a feature of the model. You no longer need to care about these small differences.
If the patient never smoked, but used other types of tobacco, he/she is a Tobacco or its derivatives user, period.
“Never smoked” was included in the synonyms list because it is a very common name to discuss tobacco behaviour in clinics and because logically, every smoker used tobacco. So
IF Never used tobacco THEN Never smoked.
It is a small change. We can remove the synonym if it feels confusing.