
Proposal for new Smoking conventions - please weigh in

(Christian Reich) #1


We have been discussing this for a while. Eventually, we started a version of a smoking hierarchy. After reviewing all the existing concepts containing “smoke”, “nicotine”, “cigarette” etc., it turned out that the variety of attributes and their combinations is not that large, really. For example, heavy smoker does not get applied to, say, pipe or hookah. See the result here.

We are proposing a new hierarchy making the following assumptions:

  1. We treat tobacco, smoking and nicotine dependence as synonyms. Even though there are electronic cigarettes, which really are not smoke, in reality these are used that way.
  2. All non-nicotine smoke (or electronic version of that) is out, such as smoking illicit or legal drugs and marijuana.
  3. Family history is out. For the patient’s history see below.
  4. Smoke from fires etc. is out.
  5. All sorts of toxic effects, sequelae of nicotine abuse and allergies are out. They are separate conditions, and we don’t know how the nicotine came into the body, and whether that was an acute event, rather than smoking.
  6. Lab tests for nicotine are out.
  7. Nicotine replacement therapy is due to nicotine abuse, but still not the same thing, and therefore out.
  8. Water pipes, shishas and hookahs are treated as identical.
  9. Second hand smoking and passive smoking are treated the same.
  10. Only cigarette frequency is measured in the existing Concepts; cigars, pipes and hookahs almost never are. The cigarette frequency per day definition is:
    • trivial=0-1 cigarettes per day
    • light=1-9
    • moderate=10-19
    • heavy=20-39
    • very heavy or aggressive=>40
      We don’t need the exact number of cigarettes (it is probably false precision anyway). Source concepts that ask for the exact number per day or week need to be manually mapped to these five categories. If the source concept mentions one of the five frequencies without the type, we automatically assume cigarettes.
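
As a rough illustration of the manual mapping described in item 10, here is a minimal Python sketch. The function names are hypothetical, and the boundary handling is an assumption: the list above leaves exactly 1 cigarette per day in two bands and exactly 40 in none, so this sketch puts 1 into “light” and 40 into “very heavy”.

```python
def smoking_category(cigarettes_per_day: float) -> str:
    """Map a cigarettes-per-day count to one of the five proposed categories.

    Assumed boundaries (not settled in the proposal):
    exactly 1 per day counts as "light", exactly 40 as "very heavy".
    """
    if cigarettes_per_day < 0:
        raise ValueError("cigarettes_per_day must not be negative")
    if cigarettes_per_day < 1:
        return "trivial"
    if cigarettes_per_day < 10:
        return "light"
    if cigarettes_per_day < 20:
        return "moderate"
    if cigarettes_per_day < 40:
        return "heavy"
    return "very heavy"


def smoking_category_from_weekly(cigarettes_per_week: float) -> str:
    """Convert a weekly count to a daily average, then categorize it."""
    return smoking_category(cigarettes_per_week / 7)
```

A source value of 15 cigarettes per day would land in “moderate”, and 5 per week (less than 1 per day on average) in “trivial”.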

This means we have the following dimensions:

  • Type: hookah pipe, cigarette, moist tobacco, chewed tobacco, smokeless, pipe, passive, electronic cigarette, cigar and snuff.
  • Amount as above: trivial, light, moderate, heavy, very heavy
  • Timing: ex (=in remission), in utero, perinatal

The resulting hierarchy is much simpler than expected:


One thing we need to discuss is whether we should create additional concepts where we combine all of these with the various timings. The big one is “ex”, meaning the smoking behavior is in remission but happened in the past. Alternatively, we relegate those to “History of”. The “in utero” and “perinatal” are usually only combined with “Smoking” and “Cigarette”. Again, unless the Observation Period covers this period of time, it might be “History of”.

Here is what we are not modeling, mostly because we didn’t actually find that many concepts:

  • Time since cessation
  • Episodic smoking
  • Age of start of smoking
  • Duration of unsuccessful cessation
  • Overall time of smoking (irrespective of strength)
  • Negative concepts

The latter might cause some stress with folks (as flavors of null usually do).

Please think and discuss, and let us know. In particular, let us know if the omissions make sense.

@aostropolets + @Christian_Reich

(Melanie Philofsky) #2

Very nice @aostropolets & @Christian_Reich!

From Colorado’s point of view, duration of use and/or start and stop dates are missing. Very heavy tobacco consumption for 1 year versus 50 years has very different physiological effects on the body.

(Christian Reich) #3

But do you have codes you are using for that, @MPhilofsky ? We need to hang this on to something.

(Melanie Philofsky) #4

Yes, we use the standard concept_id = 3004518 for pack per day. And we have a custom concept_id for tobacco used in years. Both could be children of cigarette.

(Alexander Davydov) #5

Don’t we want these types? It says they’re very distinctive. Is there any chance it would be recorded?

What about occasional/social smoking? You map it to “Smoker”, but I wouldn’t. I know, they say it’s harmful. Isn’t such an assumption in the terminology the reason for that?
At least, a distinct category is needed to set the borderline between “1 cigarette per week/month/year” and “0-1 cigarette per day”.

It’s pretty much the same with “Passive” being a “Smoker”, where “Passive” means not just a fact, but also a risk of exposure (according to the mapping). Do we want all this in the “Smoker” cohort?

Another thing is the survey data/classification terms you map to the “Smoker” concepts. Would it be “Maps to”? You can’t just map the questions. How would ETL treat it? So unless we have a MAPPING table set, it’s also manual work.

40770347 Have you ever smoked regularly [PhenX]
45508052 [V]Tobacco use
4041306 Tobacco use and exposure
40766305 Have you ever smoked part or all of a cigarette [PhenX]
40766943 Do or did you inhale the cigar smoke [PhenX]

Some negative facts simply confirm that the patient doesn’t smoke, or mean nothing. So no need for any mapping. Just keep them alive, ok?

4196422 Not a passive smoker
45522772 Smoking review not indicated
40664614 Smoking status and exposure to second hand smoke in the home not assessed, reason not given
45508195 Parents do not smoke
45441534 Never smoked tobacco

Also, you map the contextual facts. Is there a real need? Is there any chance that “non-attractive appearance” would be registered, while the entire “smoking” not? I’m not sure about mapping to “Smoker”:

37021066 My smoking makes me less attractive to other people [PROMIS]
37021082 If I quit smoking I will be more attractive to others [PROMIS]
37020314 I crave cigarettes at certain times of day [PROMIS]
40766360 How soon after you wake up do, or did, you smoke your first cigarette [FTND]
36713256 Tobacco cessation education not done
37019831 The idea of not having any cigarettes causes me stress [PROMIS]
36208999 Evaluation and management of smoking cessation note | {Setting}
4263877 Smokers cough
40766306 Have you smoked at least 100 cigarettes in your entire life [PhenX]

If we use “History of…” here, you can’t even know about the remission. While we map all these “did or do you” / “did you” / “have you ever”, is it gonna be all together in the “History of…” group?

Don’t you want to add one more dimension here? @Christian_Reich

(Christian Reich) #6

Can you not translate that to the categories trivial to very heavy?

Don’t think so. Never heard of the distinction, and doubt we have that level of detail in the data. Let alone the lack of use cases.

Why? We need some simple categorical level of smoke exposure. Anything below 1 cigarette per day is negligible compared to the other categories.

Well, if the data tell us there is exposure to smoke it’s smoking. But you are right, these are not very precise distinctions. It should be just good enough.

That’s a question for the survey discussion. I agree, you cannot map a question, only the question-answer combo.
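
To make the question-answer point concrete, here is a minimal Python sketch of a lookup keyed on the combination rather than on the question alone. The two question concept_ids are the PhenX ones quoted in the thread; the answer strings, the target labels and the function name are hypothetical placeholders, not an agreed mapping.

```python
from typing import Optional

# Keyed on (question concept_id, normalized answer), never on the question alone.
# Answer texts and target labels below are illustrative assumptions.
QA_MAP: dict[tuple[int, str], Optional[str]] = {
    (40770347, "yes"): "Smoker",  # Have you ever smoked regularly [PhenX]
    (40770347, "no"): None,       # negative answer: no target in this sketch
    (40766306, "yes"): "Smoker",  # Have you smoked at least 100 cigarettes ... [PhenX]
    (40766306, "no"): None,
}


def map_survey_response(question_concept_id: int, answer: str) -> Optional[str]:
    """Return the target label for a question-answer pair, or None if it has no target.

    Pairs missing from the table raise LookupError so the ETL can flag them
    for manual review instead of silently guessing.
    """
    key = (question_concept_id, answer.strip().lower())
    if key not in QA_MAP:
        raise LookupError(f"no mapping for question-answer pair {key}")
    return QA_MAP[key]
```

The same question yields different results depending on the answer, which is why the question concept by itself cannot carry a “Maps to” relationship.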

Right now, negative facts are conspicuous by their absence in the OMOP CDM, with the exception of the Measurements and Observations.

We don’t need that. If somebody says we do - give me the use case.

Well, that’s the question.

Yes, I thought so too. But then I didn’t find much in the way of codes like that. Only how heavy the smoking was in cigarettes per day. So, I dropped it.