Phenotype Submission - Self-harm nonstandard vocabulary

Cohort Definition Name: Self-harm nonstandard vocabulary (both with and without suicidal intent) ICD9CM and ICD10CM source vocabularies

Contributor Names: Ruchina Shakya and Christophe Lambert, University of New Mexico


  1. Self-harm initial visit
  2. Self-harm subsequent encounter/sequela
  3. Self-harm ambiguous whether initial or follow-up
  4. Self-harm history
  5. Self-harm thoughts

Self-harm can be divided into suicidal and nonsuicidal self-injury (SSI & NSSI), but the billing codes for ICD-9 and ICD-10 are problematic for distinguishing the two. The ICD-10 vocabulary can specify intent to harm oneself by many different means, but those means do not have a category for suicidal vs. nonsuicidal intent. Only after October 1, 2021 did ICD-10 provide R45.88 as a billable code for nonsuicidal self-harm (e.g. for self-cutting). Therefore we create an umbrella phenotype definition for self-harming behavior without differentiating suicidal intent.

Importantly, as described in this vocabulary issue, the OMOP mappings for self-harm confuse subsequent encounters for self-harm with additional self-harm events: https://github.com/OHDSI/Vocabulary-v5.0/issues/881. Therefore we have defined concept sets using ICD9CM and ICD10CM codes that do not confuse these issues. Note that an earlier OHDSI effort towards self-harm phenotyping does not address this nuance but does provide a more portable SNOMED-based phenotype: Phenotype Phebruary Day 11 - Suicide attempts.

In addition, we differentiate between several categories of concept sets that can be mixed and matched according to specific use cases, and to address variations on outcomes (e.g. first known self-harm):

  1. Self-harm initial visit: This set covers initial encounters with healthcare services for self-harm and excludes subsequent encounters and sequelae of self-harm. ICD-9 has no follow-up codes, so we treat all ICD-9 self-harm codes as initial.
  2. Self-harm subsequent encounter/sequela: This set contains only the codes for subsequent encounters and follow-up visits for self-harm. It is useful for determining that self-harm happened in the past, but it does not represent a current event. ICD-9 has no follow-up codes; all ICD-9 self-harm codes are treated as initial (set 1).
  3. Self-harm ambiguous whether initial or follow-up: These are ICD-10 ancestor terms (e.g., T65.92) that are non-billable and non-specific but are nonetheless sometimes used, even in claims data.
  4. Self-harm history: This set identifies individuals with a documented history of self-harm. It captures only the fact that self-harm occurred at an unspecified point in the past, which is useful for establishing a lifetime history of self-harm. It should be combined with 1 and 2 to capture all history of self-harm. The concept set for self-harm history includes only ICD-10-CM codes because ICD-9-CM has no sufficiently specific code for a history of self-harm (the closest, V15.89, covers other specified personal history presenting hazards to health). Thus one should use this set with caution for data recorded before the introduction of ICD-10.
  5. Self-harm thoughts: It identifies individuals who report thoughts or ideation related to self-harm, but not actions. On its own, this is not self-harm. Also presumably all actions of self-harm would involve thoughts of such, though thoughts are not typically recorded when an action has occurred.
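The split between sets 1, 2, and 3 follows the ICD-10-CM 7th-character convention for injury and poisoning codes (A for initial encounter, D for subsequent encounter, S for sequela). The sketch below illustrates that convention only. It assumes the input is already known to be a self-harm code, it does not apply to the history or thoughts sets, and it is not the procedure used to build the attached concept sets. The function name is hypothetical.

```python
def encounter_category(icd10cm_code: str) -> str:
    """Classify an ICD-10-CM self-harm code by its 7th character.

    Assumption: the code is already known to be an injury/poisoning
    self-harm code. This function does not decide whether a code
    represents self-harm.
    """
    compact = icd10cm_code.replace(".", "").upper()
    if len(compact) == 7:
        seventh = compact[6]
        if seventh == "A":
            return "initial"                # concept set 1
        if seventh in ("D", "S"):
            return "subsequent_or_sequela"  # concept set 2
    # Shorter ancestor codes (e.g. T65.92) carry no encounter information.
    return "ambiguous"                      # concept set 3


if __name__ == "__main__":
    for code in ("T65.92XA", "T65.92XD", "T65.92XS", "T65.92"):
        print(code, encounter_category(code))
```

ICD-9 codes have no equivalent character, which is why all of them are treated as initial.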

Useful combinations could include:

  • 1 + 2 + 3 + 4 ever — to determine whether there ever was self-harm
  • 5 but never 1-4 — an indication of only thoughts of self-harm but no actions
  • 4 but not 1-3 — an indication that self-harm presumably occurred prior to the patient observation period or was not recorded, potentially due to the care for self-harm being done at a different organization whose data are not captured in the current OMOP instance.
  • 1 alone, contrasted with 2, could be used in phenotype imputation to quantify how often follow-ups for self-harm get coded as initial visits.
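The first three combinations above reduce to simple set logic once each person has been tagged with the concept sets (1 to 5) they ever matched. A minimal sketch, assuming that tagging has already been done elsewhere; the function and key names are illustrative and not part of the submission:

```python
def summarize_person(matched_sets: set[int]) -> dict[str, bool]:
    """Derive combination flags from the concept sets a person ever matched.

    matched_sets holds numbers 1-5, following the numbering in this post.
    """
    actions = {1, 2, 3, 4}           # any recorded self-harm action or history
    current_or_followup = {1, 2, 3}  # everything except the history-only set
    return {
        # 1 + 2 + 3 + 4 ever
        "any_self_harm_ever": bool(matched_sets & actions),
        # 5 but never 1-4
        "thoughts_only": 5 in matched_sets and not (matched_sets & actions),
        # 4 but not 1-3
        "history_only": 4 in matched_sets
        and not (matched_sets & current_or_followup),
    }


if __name__ == "__main__":
    print(summarize_person({5}))
    print(summarize_person({4, 5}))
    print(summarize_person({1, 2}))
```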


self-injury, suicide attempt, intentional self-harm, suicidal behavior, suicidal self-injury (SSI), non-suicidal self-injury (NSSI).


While self-harm is often linked to visible physical injuries like cuts, burns, or bruises, there is a broader range of signs and symptoms associated with this behavior. Individuals who engage in self-harm may also grapple with intense emotional distress, frequent mood fluctuations, social isolation, and symptoms indicative of depression or anxiety. Moreover, self-harming tendencies can be accompanied by secretive actions, such as hiding wounds, collecting sharp objects, or making multiple attempts at self-injury. Given the multitude of symptoms and the fact that emotional indicators are often vague, identifying self-harm can be a challenging task, underscoring the importance of early intervention and support for those affected.

Diagnostic Evaluation:

Self-harm is an outcome and not a disease. It is an action that co-occurs with numerous mental health conditions and can have higher or lower risk as a function of different medications. The evaluation for self-harm involves healthcare professionals inquiring about the methods, frequency, and emotional triggers for the behavior. They also assess when and where self-harm occurs in an individual’s life and conduct psychological assessments to gauge emotional well-being and potential co-existing mental health conditions. This comprehensive evaluation is essential for tailoring effective interventions and providing necessary support.

Differential Diagnoses:

Self-harm is an outcome and not a disease. Risk factors often include mental health conditions such as major depressive disorder, borderline personality disorder, bipolar disorders, generalized anxiety disorder, substance abuse or addiction, and post-traumatic stress disorder (PTSD). Importantly there are behaviors such as smoking or substance use that may be considered detrimental to self but are not considered self-harm. One will also find that it is often a judgment call to say whether a drug overdose was self-harm or not, based on a judgment of the presence of suicidal intent. All things being equal, when an overdose occurs, drugs used in substance use disorders may less often be judged to be used in a self-harming manner than overdoses of sleeping pills, for instance. Many times intent is not judged and the nature of the injury is reported (e.g. poisoning). A large fraction of self-harming behavior is not recorded as such but can be detected with machine learning [1-4].

Treatment Plan:

The treatment plan for self-harm often involves a combination of psychotherapy, such as cognitive-behavioral therapy (CBT), to address emotional triggers and develop healthier coping strategies. Additionally, medication, such as lithium, antidepressants, antipsychotics, and/or mood stabilizers, may be prescribed to manage underlying mental health conditions contributing to self-harming behaviors. These approaches work together to provide support and treatment for individuals struggling with self-harm.

Self-harm is associated with repeated episodes. However, many times coding is sloppy and follow-up for self-harm may be coded as if it was a new episode. With effective treatment for underlying mental health conditions and psychosocial support, individuals who engage in self-harm can reduce self-harming behaviors. However, relapses may occur during times of stress. Death is the worst outcome of self-harming behavior (particularly with suicidal intent). Completed suicides are often not captured in EHR or claims data as the event may trigger contact with only a coroner and not the medical system.


Depending on the use case, the 5 concept sets can be used to exclude time periods or events that do not represent the outcome of interest. For instance, subsequent encounters and follow-ups might be excluded in a study of new self-harm outcomes. Conversely, they might be included when asking whether a person has any lifetime self-harm. Distinguishing bad outcomes of substance use disorders from intentional self-harm can be tricky diagnostically and is even more challenging when only billing codes are examined.
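As an illustration of excluding follow-ups when the outcome is new self-harm, the sketch below finds each person's first known initial-encounter date from a list of already-classified records. The record layout `(person_id, event_date, concept_set)` is a hypothetical simplification and not an OMOP table.

```python
from datetime import date


def first_initial_event(records):
    """Return {person_id: earliest date} using concept set 1 records only.

    records: iterable of (person_id, event_date, concept_set) tuples, where
    concept_set is 1-5 as numbered in this post. Records from sets 2-5
    (follow-up, ambiguous, history, thoughts) are ignored.
    """
    first = {}
    for person_id, event_date, concept_set in records:
        if concept_set != 1:
            continue
        if person_id not in first or event_date < first[person_id]:
            first[person_id] = event_date
    return first


if __name__ == "__main__":
    demo = [
        (1, date(2020, 3, 1), 2),  # follow-up, ignored
        (1, date(2020, 1, 5), 1),  # earliest initial encounter for person 1
        (1, date(2021, 1, 1), 1),
        (2, date(2019, 6, 9), 2),  # person 2 has follow-up codes only
    ]
    print(first_initial_event(demo))
```

Because follow-ups are sometimes coded as initial visits, dates obtained this way are only as reliable as the underlying coding.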


Ambiguities in self-harm include distinguishing between non-suicidal self-injury and self-harm with suicidal intent, that is, determining whether self-harm primarily serves as an emotional coping mechanism (NSSI) or involves an intent to take one’s life. This may not be a clear-cut binary delineation but may instead lie along a continuum. Differentiating deliberate self-harm from accidental injuries with similar physical signs can also be challenging. Additionally, it is important to identify co-occurring mental health conditions, which can inform diagnosis and treatment decisions.


Self-harm is typically classified into several subtypes based on the methods and behaviors involved. These subtypes may include cutting, burning, jumping, suffocation, scratching, hitting, and ingestion of harmful substances. Psychiatry also attempts to distinguish between suicidal and non-suicidal self-injury — the latter often tied to managing of difficult emotions or attention-seeking behavior.


  1. Nestsiarovich A, Kumar P, Lauve NR, Hurwitz NG, Mazurie AJ, Cannon DC, Zhu Y, Nelson SJ, Crisanti AS, Kerner B, Tohen M, Perkins DJ, Lambert CG. Using Machine Learning Imputed Outcomes to Assess Drug-Dependent Risk of Self-Harm in Patients with Bipolar Disorder: A Comparative Effectiveness Study. JMIR Ment Health. 2021 Apr 21;8(4):e24522. PMCID: PMC8100888
  2. Kumar P, Davis SE, Matheny ME, Villarreal G, Zhu Y, Tohen M, Perkins DJ, Lambert CG. PULSNAR: Positive Unlabeled Learning Selected Not At Random – towards imputing undocumented conditions in EHRs and estimating their incidence [Internet]. OHDSI 2022 Global Symposium; 2022 Oct 14; Washington, DC.
  3. Kumar P, Lauve NR, Davis SE, Parr SK, Park D, Matheny ME, Villarreal G, Uhl G, Zhu Y, Tohen M, Perkins DJ, Lambert CG. Detecting PTSD and self-harm among US Veterans using positive unlabeled learning [Internet]. OHDSI 2021 Global Symposium; Observational Health Data Sciences and Informatics; 2021.
  4. Kumar P, Nestsiarovich A, Nelson SJ, Kerner B, Perkins DJ, Lambert CG. Imputation and characterization of uncoded self-harm in major mental illness using machine learning. J Am Med Inform Assoc. 2020 Jan 1;27(1):136–146. PMID: 31651956

Logic description:

Cohort definitions: We have specified only concept sets, and further logic for a given study is required to specify a cohort. The source codes map to either the condition_occurrence table or the observation table.

  1. Self-harm initial visit:
    self_harm_initial.txt (269.4 KB)

  2. Self-harm with subsequent encounter/sequela:
    self_harm_followup.txt (467.0 KB)

  3. Self-harm ambiguous whether initial or follow-up:
    self_harm_ambiguous.txt (230.1 KB)

  4. Self-harm history:
    self_harm_history.txt (1.8 KB)

  5. Self-harm thoughts:
    self_harm_ideation.txt (1.2 KB)

Assertion Statement: Simple queries determining the presence of at least one code in the given concept set were executed on at least one real person-level observational health data source, and each resulted in a cohort with at least 1 person.
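A presence check of this kind might look like the sketch below. It is not the query that was actually run. It assumes the source codes are matched on the OMOP CDM columns `condition_source_concept_id` and `observation_source_concept_id`, uses an in-memory SQLite database with only the columns needed, and uses made-up concept IDs (101, 102) as placeholders for the contents of a concept set file.

```python
import sqlite3


def count_persons_with_any_code(conn, source_concept_ids):
    """Count distinct persons with at least one matching source concept
    in either condition_occurrence or observation."""
    ids = list(source_concept_ids)
    if not ids:
        return 0
    marks = ",".join("?" for _ in ids)
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT person_id FROM condition_occurrence
            WHERE condition_source_concept_id IN ({marks})
            UNION  -- UNION removes duplicate person_id values
            SELECT person_id FROM observation
            WHERE observation_source_concept_id IN ({marks})
        )
    """
    return conn.execute(sql, ids + ids).fetchone()[0]


def build_demo_db():
    """Create a toy database; all rows and concept IDs are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence "
        "(person_id INTEGER, condition_source_concept_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE observation "
        "(person_id INTEGER, observation_source_concept_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?)",
        [(1, 101), (1, 101), (2, 999)],
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?)",
        [(1, 102), (3, 102)],
    )
    return conn


if __name__ == "__main__":
    demo_conn = build_demo_db()
    print(count_persons_with_any_code(demo_conn, [101, 102]))
```

A count of at least 1 for a concept set corresponds to the assertion that the set yields a non-empty cohort.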

Issues: As stated earlier, SNOMED codes lack the detailed specificity present in ICD9CM and ICD10CM codes. When source codes are translated into SNOMED codes, the naming of concepts can create the impression of new occurrences of self-harm, so that follow-up visits are counted as multiple incidents. At the same time, the EHR data can be wrong. We have verified in EHR data that follow-up visits and sequelae of suicide attempts and other self-harming behavior are frequently coded as initial events, overstating the amount of self-harm for individuals. Given that ICD-9 does not distinguish initial from follow-up/sequela encounters, some studies may wish to restrict their data to the post-ICD-10 period. Because the concept sets we provide include only the ICD10CM/ICD9CM codes and not their SNOMED mappings, portability is limited for health systems that do not use these source vocabularies. Again, note that an earlier OHDSI effort towards self-harm phenotyping does provide a more portable SNOMED-based phenotype: Phenotype Phebruary Day 11 - Suicide attempts.

We have created two cohort definitions using the source vocabularies described above. They were executed on at least one real person-level observational health data source and resulted in a cohort with at least 1 person.

  1. Self-harm initial encounter ICD9CM and ICD10CM
    self_harm_initial_encounter_ICD9CM_ICD10CM.txt (1.0 MB)

  2. Self-harm initial or follow up encounter ICD9CM and ICD10CM
    self_harm_initial_followup_ICD9CM_ICD10CM.txt (1.0 MB)

These cohorts allow one to contrast using only initial encounter codes for self-harm versus initial + subsequent + sequela encounters.