
Phenotype Phebruary Day 11 - Suicide attempts

Azza Shoaibi

Hi team, it’s me again @AzzaShoaibi, posting from @Gowtham_Rao’s account. This is day 11 of Phenotype Phebruary, and I would like to start a discussion about suicide attempt. This is a phenotype that I worked on with my dear friend @conovermitch. In today’s post I will demonstrate:

  1. how important it is to learn from what others have already done (the literature) as a primary input into the phenotyping process
  2. how we can incorporate others’ findings into OHDSI phenotyping practices and tools
  3. how we can, but not necessarily should, use “source codes” when developing phenotypes using OHDSI tools

Suicide attempt/self-harm (clinical description):
One of the biggest challenges when working on this phenotype is agreeing on the target/clinical description. Suicide is a major public health concern, and there is considerable debate about the right target for studies looking at suicide as a study population or as an outcome. There are multiple overlapping but distinct constructs/terms: self-harm, suicide attempt, suicidal ideation/thoughts, suicidality overall, and suicidal behavior.
The table below, from INTRODUCTION - Screening for Suicide Risk in Primary Care - NCBI Bookshelf, provides a nice summary of the clinical definitions of these terms.

I will try to simplify things here and set suicide attempt and/or self-harm (grouped) as my target.

What we know from the literature about suicide attempt in observational data:
Many groups have looked into the utility of suicide-related codes (mainly ICD-9-CM codes) in claims data to identify patients with suicide attempts. I will summarize my findings here:
• ICD-9 codes of E95* (suicide and self-inflicted injury) are the most explicit diagnostic codes for suicide attempts.
• Previous reports have indicated that ICD-9 codes matching E950* may have low sensitivity for detecting suicidal behavior due to coding practices, reimbursement patterns, and the uncertainty of intent.
• To maximize sensitivity of the case definition, many have identified an additional set of ICD-9 injury code categories (including wounds and drug poisoning) and E98* (injury of undetermined intent) as potential indicators of suicide attempts in medical records.
• Different papers report different PPV values for these additional codes. For example, Barak-Corren et al. (Predicting Suicidal Behavior From Longitudinal Electronic Health Records | American Journal of Psychiatry) reported poor PPV for E98* codes and demonstrated through chart reviews that E95* (positive predictive value: 0.82), 965.* (poisoning by analgesics, antipyretics, and antirheumatics; positive predictive value: 0.80), 967.* (poisoning by sedatives and hypnotics; positive predictive value: 0.84), 969.* (poisoning by psychotropic agents; positive predictive value: 0.80), and 881.* (open wound of elbow, forearm, and wrist; positive predictive value: 0.70) should be included. In contrast, Simon et al. (Risk of suicide attempt and suicide death following completion of the Patient Health Questionnaire depression module in community practice) reported that inclusion of injuries and poisonings with undetermined intent (E98*) increases ascertainment of probable suicide attempts by approximately 25%.
• Coding of self-harm or possible suicidal behavior changed significantly with the transition from ICD-9-CM to ICD-10-CM. ICD-10-CM provides higher granularity and more opportunity to record suicide attempts and self-harm, making it easier to identify intentional self-harm.

Cohort development:
Considering this prior knowledge, I developed the following suicide attempts/self-harm cohort.

| # | Cohort | Codes used | Source of included codes |
|---|--------|------------|--------------------------|
| C1 | First event of suicide attempt, 365 days prior observation | Earliest occurrence of suicide attempt using SNOMED standard codes (which translate/map to E95* ICD-9-CM) | PHOEBE informed |
| C2 | First event of suicide attempt, including injuries and poisonings with undetermined intent, 365 days prior observation | Earliest occurrence of suicide attempt using SNOMED standard codes plus ICD-9-CM codes for injury of undetermined intent (E98*) | Replicates Simon et al. |
| C3 | First event of suicide attempt, including drug poisoning and arm injury, 365 days prior observation | Earliest occurrence of suicide attempt using SNOMED standard codes plus ICD-9-CM codes for drug poisoning (965.*, 967.*, 969.*) | Replicates Barak-Corren et al. |

Up to this point, we have been developing cohorts using standard codes only. The second and third cohorts above use ICD-9-CM source codes in addition to SNOMED. You can do that in ATLAS by selecting the “condition source concept” attribute from the drop-down list (+Add attribute) on the right side, as shown below. In this case I used the source codes verbatim as recommended by Simon et al. and Barak-Corren et al., instead of the corresponding standard codes (as Patrick has done before in his examples of replicating cohort definitions from the literature). I had to do that because: 1. the recommendation to use ‘drug poisoning, wounds, and injuries with undetermined intent’ is specific to the US health care context in the ICD-9-CM era; 2. these source codes map to general/broad standard SNOMED codes/clinical ideas that would sweep in unrelated clinical ideas.
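For readers who haven’t used this option: under the hood, a source-concept criterion simply filters CONDITION_OCCURRENCE on condition_source_concept_id instead of condition_concept_id. Here is a minimal sketch of the difference, using a toy table (all concept ids below are made up for illustration, not real vocabulary ids):

```python
import sqlite3

# Toy CONDITION_OCCURRENCE table. A "condition source concept" criterion
# filters on condition_source_concept_id rather than the standard
# condition_concept_id. All concept ids below are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,        -- standard (SNOMED) concept
    condition_source_concept_id INTEGER  -- source (e.g. ICD-9-CM) concept
)""")
con.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?)", [
    (1, 4303690, 44832104),  # source code of interest -> broad standard concept
    (2, 4303690, 44830000),  # same standard concept via a different source code
    (3, 441542, 44832104),   # another record carrying the same source code
])

# Standard-concept criterion (how C1 selects events)
std = [r[0] for r in con.execute(
    "SELECT person_id FROM condition_occurrence "
    "WHERE condition_concept_id = 4303690 ORDER BY person_id")]

# Source-concept criterion (how C2/C3 pick up the verbatim ICD-9-CM codes)
src = [r[0] for r in con.execute(
    "SELECT person_id FROM condition_occurrence "
    "WHERE condition_source_concept_id = 44832104 ORDER BY person_id")]

print("standard-concept matches:", std)  # persons 1 and 2
print("source-concept matches:  ", src)  # persons 1 and 3
```

The two criteria select different rows whenever the source-to-standard mapping is many-to-one or broad, which is exactly the situation described above.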

The use of source codes in the two cohorts above will only make a difference in data sources and time periods where ICD-9-CM is used. For example, in US-based data sources, all three definitions will be identical after the transition to ICD-10-CM.

Cohort evaluation:
I will start here with the incidence rate plot from CohortDiagnostics.

Above are results from two US-based databases (Optum EHR and DOD). C1 shows an increasing trend post-2015. This is the cohort based on standard codes only and is consequently limited to the mapped ICD-9 codes of E95* (which we know have low sensitivity). The fact that we observe an increasing trend post-2015 suggests that the use of ICD-10-CM did indeed improve the system’s ability to capture suicide attempts in medical records (it is unlikely that suicide attempts and self-injury truly increased by 50%-80%).
The next question we can ask: does including the additional set of ICD-9 injury code categories correct for the low incidence rate we observe pre-2015? In other words, will we see a stable incidence rate plot for C2 or C3?
We can see that both C2 and C3 have a much higher rate pre-2015 than that observed in C1; however, the line drops post-2015. The drop in C3 is milder than that in C2. This may suggest that using the ICD-9-CM codes for injury of undetermined intent (E98*) resulted in a much higher rate than that observed post-2015, while using the drug poisoning codes (965.*, 967.*, 969.*) resulted in a trend closer to that observed post-2015. We don’t know from this alone that the extra people added by the drug poisoning codes are true suicide attempt cases, but we do know that the trend became closer to what we observe post-2015.
To check what kind of people we are capturing, we can explore temporal characterization. For this exercise, we will look at a sub-cohort who met the requirements of C3 by having a code belonging to drug poisoning (965.*, 967.*, 969.*) [let’s call it C93] and compare them to a subset of the cohort who met the C1 criteria through the ICD-9 codes of E95* [let’s call them C95]. This will help answer the question: are there similarities between those with a specific suicide code and those with drug poisoning codes? The plots below are taken from Optum DOD.

While the two groups are not identical, there is considerable similarity in the covariate distributions. I will highlight some specific covariates that I investigated:

Please note that “suicidal thoughts” is not in the cohort definition of either of these groups. We observe that 11.2% of C93 had a suicidal thoughts code. Interestingly, suicidal deliberate poisoning is part of the C95 definition but not part of C93; however, 11.0% of C93 have that code.

Occurrence of depressive disorder, bipolar disorder, alcohol use, and opioid use was similar and relatively common in both groups before and after index. These covariates may function as markers of specificity for these cohorts; observing consistent trends in the two cohorts (which are mutually exclusive) is supportive evidence that the ICD-9-CM drug poisoning codes may indeed be capturing patients with suicide attempts. Finally, age and gender distributions were similar across the two groups.

I have shown two diagnostics in CohortDiagnostics that are consistent (at least directionally) with the prior findings of Barak-Corren et al. and others who relied on chart reviews to estimate PPVs of specific codes.
Finally, I will quote my partner @conovermitch and his thoughts about any suicide phenotype: “suicide is a little different than our other phenotypes in terms of our expected capture in our data sources. Administrative claims and EHR data have fundamental limitations when it comes to capturing suicide and suicide attempts. Only a subset of suicides and suicide attempts result in medical encounters that appear in our data. This is important to think through in the context of whatever analysis you are doing, since the misclassification can easily be differential with regard to disease severity (i.e. successful suicide attempts may never appear in the health system).”


Thank you @AzzaShoaibi for your leadership and kicking off this discussion (and to @Gowtham_Rao for sharing your account :slight_smile: )

There are many more people in our community much more expert in this clinical area than me, so I can’t add much to that part of the discussion. I suspect many in our Mental Health WG will be able to weigh in with their thoughts and observations. @Dymshyts @Andrew. I clearly recognize the clinical importance of this area, so I am excited if OHDSI can meaningfully contribute here. @paulstang is very optimistic about the potential opportunities for real-world data in this space (limitations notwithstanding). I have also seen quite a lot of recent publications trying to develop and validate predictive models to anticipate suicidal thoughts and behaviors among depressed patients, including from members of our community (can’t find their tags, but @Victor Castro, @Fei Wang, looking at you). Here are some recent references for those interested: “Temporally informed random forests for suicide risk prediction” (JAMIA 2021) by Bayramli et al. and Nock et al.’s “Prediction of Suicide Attempts Using Clinician Assessment, Patient Self-report, and Electronic Health Records” (JAMA Network Open 2022) both cite using the Barak-Corren et al. algorithm. Tsui et al., “Natural language processing and machine learning of electronic health records for prediction of first-time suicide attempts” (JAMIA Open 2021), used a different algorithm combining structured diagnosis codes and NLP, and conducted a sampled adjudication to assess their cases (and then performed sensitivity analyses with alternative definitions). Su et al., “Machine learning for suicide risk prediction in children and adolescents with electronic health records” (Transl Psychiatry 2020), cited a different algorithm for identifying suicides from Patrick et al. (PDS 2010), and then used an ICD-9-to-ICD-10 mapper to apply the algorithm to more recent data.
Now, I’m sure that @jennareps @RossW @cynthiayang @aniekmarkus @Rijnbeek would have fun reviewing these predictive models from the perspective of design, methods, features, how much of PROBAST they satisfy, and whether OHDSI’s PLP package could be used to train or evaluate similar models. But what I have reflected on when reading these papers is how measurement error in the phenotype can impact our ability to develop reliable prediction models.

If we predict an outcome with imperfect sensitivity, that could mean we underestimate performance, because some of the labeled ‘non-cases’ may actually be true cases in disguise; the model’s apparent ‘misclassification’ may really be identifying outcome misclassification (in a manner akin to @Juan_Banda’s APHRODITE probabilistic phenotyping approach). If an outcome phenotype has imperfect specificity, it could mean that we’re fitting to some of the wrong people (and in particular, I could imagine that those with a suicide code who did not have a suicide attempt could be systematically different from the true positives, for example high-risk ‘watch list’ patients who exhibit different characteristics and health service utilization). I don’t have good intuition about the impact of outcome misclassification on predictive models, in terms of either their internal performance (discrimination and calibration) or their external validation. But my immediate gut is that predicting an outcome using a phenotype with high measurement error in any direction (low sensitivity OR low PPV) may be quite problematic. It seems like it could be an extremely important area of research for the OHDSI community to explore, given that we’re seeing from @jswerdel’s work that many of the outcomes we are dealing with have substantial measurement error, with sensitivity, specificity, and PPV estimates suggesting we’re misclassifying large numbers of patients in all our studies.
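To build some of that intuition, here is a toy simulation (my own sketch, not taken from any of the papers above): even an oracle model, when scored against a phenotype with 60% sensitivity, appears worse than it really is, because the missed true cases sit among the labeled non-cases with high predicted risk.

```python
import random

def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(42)
n = 2000
risk = [random.random() for _ in range(n)]                # true event probability
truth = [1 if random.random() < r else 0 for r in risk]   # true outcome
score = risk                                              # an oracle model's predictions

# Phenotype with ~60% sensitivity: 40% of true cases are missed at random
observed = [y if y == 0 or random.random() < 0.6 else 0 for y in truth]

auc_true = auc(score, truth)
auc_obs = auc(score, observed)
print(f"AUC against true labels:     {auc_true:.3f}")
print(f"AUC against noisy phenotype: {auc_obs:.3f}")  # lower: performance is underestimated
```

Non-differential misses like these shrink measured discrimination toward chance; differential misses (e.g. severity-related, as discussed in Azza’s post) could bias it in either direction.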

The technical point that I want to reinforce, which @AzzaShoaibi very nicely demonstrated above: you CAN create a cohort definition that relies on source codes using ATLAS. I often get the question, ‘if I use the OMOP CDM, am I forced to use the standard vocabulary concepts in all my analyses?’, and the answer is absolutely NO. If you want to, your source codes are in the CDM and accessible in your analyses; nobody is standing in your way. And there are occasional rare edge cases where you may feel you need to make a source-code definition because of how source codes are mapped into the standard. If so, go nuts, have fun, there is NOTHING technically standing in your way. Now, I would argue you SHOULD create cohort definitions that use standard concepts, because that is your path to enabling network analyses and generalizing your findings. In the context of prediction models, as discussed above, it’s critical if you want to perform external validation (which we argue is a required best practice in PLP). Standardizing data and vocabularies is the foundation that makes it possible for us to generate reliable evidence across our community, and we should take advantage of that opportunity as much as we can.

We have worked several years on the problem of inferring uncoded self-harm (which includes both suicidal and non-suicidal self-injury) using noisy label machine learning. Our first work found wildly disparate levels of coded self-harm by US state, and also found disparities in coding practices by gender:

  • Kumar P, Nestsiarovich A, Nelson SJ, Kerner B, Perkins DJ, Lambert CG. Imputation and characterization of uncoded self-harm in major mental illness using machine learning. J Am Med Inform Assoc. 2020 Jan 1;27(1):136–146. PMID: 31651956

We built on this effort to impute the presence of self-harm within inpatient/ER visits and perform a comparative effectiveness study on 65 mono- and polypharmacy drug regimes plus psychotherapy using both the coded and imputed presence of self-harm at different levels of confidence in this paper:

  • Nestsiarovich A, Kumar P, Lauve NR, Hurwitz NG, Mazurie AJ, Cannon DC, Zhu Y, Nelson SJ, Crisanti AS, Kerner B, Tohen M, Perkins DJ, Lambert CG. Using Machine Learning Imputed Outcomes to Assess Drug-Dependent Risk of Self-Harm in Patients with Bipolar Disorder: A Comparative Effectiveness Study. JMIR Ment Health. 2021 Apr 21;8(4):e24522. PMID: 33688834

Because various forms of intentional self-harm as coded in US healthcare data do not have SNOMED equivalents, we found we could not count on the OMOP mappings, and had to enumerate all of the ICD-9 and ICD-10 codes for self-harm and use condition_source_concept_id for our phenotype characterization and imputation efforts.

Appendix 1 of this latter paper has the full list of ICD-9 and ICD-10 codes we used – the ICD-10 codes for self-harm have to be plucked out all over the hierarchy, and I believe it is a comprehensive list for these two vocabularies. The source code repository for this study also has a python file with all of the OMOP concept_id values.

As little as 5%-10% of self-harm is coded in administrative claims data. We have also found in chart reviews of US Veterans Administration data that follow-up visits after an initial self-harm event (especially for suicidality) are often miscoded as new self-harm events when they are just check-ins; that is, the original self-harm code gets reused in subsequent visits. Thus, models that say past self-harm is predictive of future self-harm can be polluted by these false positives.

In short, the coding of self-harm, including both suicidal and non-suicidal self-harm, is so inconsistent, has so much missingness, and is so noisy that noisy-label machine learning approaches, especially those that incorporate patient notes, are going to be essential for inferring high-quality phenotypes in this domain. Our poster at this year’s OHDSI conference covers some of our more recent work: Detecting PTSD and self-harm among US Veterans using positive unlabeled Learning – OHDSI



Thanks @Christophe_Lambert for highlighting your team’s valuable work in this area (including your OHDSI2021 Best Community Contribution Award-winning poster and lightning talk). And it’s a good insight about another example of index date misspecification: if the code you see is really just the follow-up check-in, not the incident event, then it will be quite challenging to use such a phenotype whenever the timing of the cohort entry event matters (as is most often the case when using a phenotype as an outcome in a characterization, estimation, or prediction study). Grouping together codes with some era-gap collapse logic will consolidate follow-up codes with initial codes as long as both are observed, but if only the follow-up code is seen, then you’re out of luck. I understand how your work has shown that machine learning approaches may improve classification of cases, but it’s not clear to me how that would help resolve the index date misspecification problem when observations of the initial event are missing. I know @Juan_Banda has been thinking about how to incorporate the setting of cohort entry and exit dates into APHRODITE so that probabilistic modeling can produce output consistent with the cohort constructs required by all the OHDSI tools. I would be quite curious to see how much improvement in index dates could be achieved (in addition to improvements in net classification).
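For readers less familiar with the era logic mentioned above, here is a minimal sketch of gap-based collapsing (the function name and the 30-day gap are illustrative choices, not OHDSI defaults):

```python
from datetime import date

def collapse_eras(event_dates, gap_days=30):
    """Fold event dates into eras: a date within gap_days of the running
    era's end extends that era; otherwise it starts a new era."""
    eras = []
    for d in sorted(event_dates):
        if eras and (d - eras[-1][1]).days <= gap_days:
            eras[-1][1] = d                # extend the current era
        else:
            eras.append([d, d])            # start a new era
    return [tuple(e) for e in eras]

events = [date(2020, 1, 20), date(2020, 1, 1), date(2020, 6, 1)]
print(collapse_eras(events))
# the two January codes collapse into one era; the June code starts a second era
```

As noted above, this only consolidates a follow-up code with the initial code when both are observed; if only the follow-up visit is coded, the era still starts at the wrong date.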

Ouch!! Can you show us, @Christophe_Lambert?

> Because various forms of intentional self-harm as coded in US healthcare data do not have SNOMED equivalents,

Hi @Christophe_Lambert, if you post a list of the concepts that you need but that don’t currently exist in SNOMED (along with any relevant synonyms and an explicit definition), I will submit requests to ensure these concepts are added. This offer stands for any mental health related concept missing from SNOMED. SNOMED is typically able to turn around mental health concepts within 30-60 days.

Please also feel free to join one of the SNOMED Mental and Behavioral Health Clinical Reference Group (CRG) meetings. We meet the 1st and 3rd Monday of the month at 1600-1730 UTC (noon-1:30 PM ET).

You can also email me a spreadsheet directly at sven0018@umn.edu or piperranallo@gmail.com and I will be happy to submit the request for new concepts.

It’s critical that we close the loop and make concepts needed by clinicians and researchers in the appropriate industry standard terminology.



@Christophe_Lambert, and of course this is the easy part.
The hard part is getting health systems to make appropriate tools available for documenting relevant mental health findings, and getting clinicians to document them consistently.

Here is the link to the SNOMED Mental and Behavioral Health CRG. The team is working hard to clean up outdated content, explicitly model concepts, and fill gaps.


Hi @Christian_Reich,

I have attached PostgreSQL code as a Word file (it exceeds the maximum post size) that generates a list of ICD codes lacking mappings to intentional self-harm codes in SNOMED, along with an Excel file showing the output of that run on my system.

The attached Excel file has 1190 active ICD-10 codes related to intentional self-harm, as well as 756 codes that have no mappings in the concept_relationship table (mainly deprecated, but some non-deprecated). Thus, use of these 756 codes to denote intentional self-harm will yield no standard terms, while use of the 1190 will map to standard terms that lose the fact that the person had intent to harm themselves.
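The shape of that check is roughly the following: left-join the source concepts against CONCEPT_RELATIONSHIP and keep those with no ‘Maps to’ row. This is a toy sqlite sketch with made-up rows, not the attached PostgreSQL code:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE concept (
    concept_id INTEGER, concept_code TEXT,
    vocabulary_id TEXT, concept_name TEXT);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);
""")
# Toy rows: two ICD-10-CM self-harm codes, but only the first has a 'Maps to'
# entry. All ids/codes here are illustrative, not real vocabulary content.
con.executemany("INSERT INTO concept VALUES (?, ?, ?, ?)", [
    (100, "X78.0XXA", "ICD10CM", "Intentional self-harm by sharp object (toy)"),
    (101, "T39.1X2A", "ICD10CM", "Poisoning, intentional self-harm (toy)"),
    (900, "SH-1",     "SNOMED",  "Intentional self harm (toy)"),
])
con.execute("INSERT INTO concept_relationship VALUES (100, 900, 'Maps to')")

# Source concepts with no 'Maps to' relationship at all
unmapped = [r[0] for r in con.execute("""
    SELECT c.concept_code
    FROM concept c
    LEFT JOIN concept_relationship cr
      ON cr.concept_id_1 = c.concept_id AND cr.relationship_id = 'Maps to'
    WHERE c.vocabulary_id = 'ICD10CM' AND cr.concept_id_1 IS NULL
    ORDER BY c.concept_code
""")]
print(unmapped)  # ['T39.1X2A'] -- this code would map to no standard concept
```

A second pass would then check whether the mappings that do exist land on concepts that preserve the self-harm intent, which is the other failure mode described above.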

I realize there are one-to-many mappings that use one term to flag generic intent plus one or more other terms to denote the specific mode of harm (without intent), but this list contains none of these. However, I have an older version of the vocabulary files, and it is possible many of these have been fixed with new one-to-many mappings. That said, this code should flag anything that may have slipped through the cracks, based on our careful curation of ICD-10 self-harm terms.

I apologize in advance for any hassle you may have with converting the postgresql temporary table syntax to a different sql dialect to run on your system, but this seemed like the most modular representation of the code.

Please let me know what you see with the latest vocabularies!

Thank you,

selfharm.sql.docx (26.8 KB)

selfharm_problems.xls (371 KB)


Hi @Piper-Ranallo,

Thanks for weighing in and being willing to help with SNOMED. I am totally maxed out on other commitments for at least a couple months, but the code and output I posted can provide a starting point for identifying these problems. I just don’t have an environment with the latest OMOP vocabularies to give you a clean list right now.



Hi @Christophe_Lambert

I took a quick peek at your spreadsheet and at the vocabulary files, and it looks like the issue (for most of the concepts, anyway) is not that there is no suitable concept in SNOMED, but that the ICD-10 concepts don’t have appropriate entries in the [edit: CONCEPT_RELATIONSHIP, not CONCEPT_MAP] table.

Sounds like @Christian_Reich is your man for this one :grinning:

I will review the spreadsheet in more detail and bring the appropriate concepts the SNOMED CRG - no need to attend the call.


@Christophe_Lambert I believe that the code list I used in the definition above does indeed capture all the self-harm codes:

Hi @Azza_Shoaibi

While you may have captured all of the OMOP standard (SNOMED) codes for self-harm, that is not the issue. The issue is whether the concept_relationship mappings properly map all of the intentional self-harm codes from ICD-10 to the OMOP standard codes related to self-harm. My post showed that this is not the case in my version of the vocabulary.


mappedConcepts.xlsx (82.5 KB)

@Christophe_Lambert, please see the attached list of the codes that map to the standard SNOMED concept set expression I shared in the previous post. I believe it does capture all the self-harm codes you are looking for. I wonder if what you are experiencing is related to prior versions of the vocabulary: I had these issues around two years ago, but we don’t see them in the current version. Hope this helps.

Hi @AzzaShoaibi ,

I took the 148 standard concept codes you have listed under “Suicide, incident (informed by PHOEBE)” at atlas-phenotype.ohdsi.org, and checked whether they captured all of the SNOMED codes I had enumerated as concept_ids in my SQL code above. The attached spreadsheet shows 1381 SNOMED self-harm codes that are missing. I will have to look separately into how well ICD-10 intentional self-harm codes are mapped with the latest vocabularies.
extra.xlsx (42.4 KB)