Phenotype Phebruary 2023 - P5 - Systemic Lupus Erythematosus (SLE)

Building on previous work in Phenotype Phebruary 2022

Phenotype Phebruary Day 6 - Systemic Lupus Erythematosus (SLE)

Ah, remember the old days, say, February 2022, when the phenotype development process in OHDSI was in its early adolescence? Well, it’s grown up a lot in the past year thanks to a lot of work by phenotype development scientists like @Gowtham_Rao and @Azza_Shoaibi. Last year I submitted my phenotypes for SLE as four separate algorithms: two sensitive algorithms for incident and prevalent cases, and two specific algorithms requiring a second code for SLE within 31-365 days post index. This year I’m going to reduce the algorithms down to one for easier review. The single algorithm can be viewed in the OHDSI Phenotype Library Shiny app as phenotype 119.

The algorithm is incident, using the first code for SLE, with up to 90 day index date reclassification.
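For readers who have not met index date reclassification before, here is a minimal sketch of the idea in Python. It is not the cohort definition itself (that lives in the Phenotype Library JSON). The function name, the inclusive window boundaries, and the example dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def reclassified_index_date(first_sle_dx, prior_event_dates, window_days=90):
    """Move the index date back to the earliest qualifying event.

    first_sle_dx: date of the first SLE diagnosis code.
    prior_event_dates: dates of signs, symptoms, or indicative treatments.
    window_days: how far before the diagnosis an event may fall
                 (90 days here; inclusive boundaries are an assumption).
    """
    window_start = first_sle_dx - timedelta(days=window_days)
    candidates = [d for d in prior_event_dates if window_start <= d <= first_sle_dx]
    # Fall back to the diagnosis date when nothing qualifies.
    return min(candidates, default=first_sle_dx)


# Hypothetical person: one symptom too early to count, two inside the window.
events = [date(2019, 12, 1), date(2020, 4, 15), date(2020, 5, 20)]
print(reclassified_index_date(date(2020, 6, 1), events))
```

In this made-up example the December event falls outside the 90-day window, so the index date moves to the earlier of the two events inside it.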

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease of unknown origin. Clinical manifestations include fatigue, arthropathy, and involvement of nearly all organ systems, particularly cardiac and renal (Jump et al, Greco et al, Miner et al, Danila et al). A review by Stojan and Petri of research on multi-country incidence rate estimates found the incidence rate of SLE to be between 1 and 9 cases per 100,000 person-years (PY).

Clinical Description

Overview: Systemic lupus erythematosus (SLE) is an autoimmune disease of unknown etiology that may involve any organ system with a wide range of severity. It is characterized by periods of exacerbation and relative quiescence and occurs predominantly among women of child-bearing age.

Presentation: SLE presents with symptoms such as fatigue, rash (usually malar), arthritis, anemia, and nephritis and occasionally may be life threatening.

Assessment: The presence of anti-nuclear antibodies (ANA) is highly sensitive but not specific to SLE. Under specialist care, testing includes measurement of anti-double-stranded DNA (ds-DNA) antibodies, a more specific version of ANA. When someone with SLE features tests ANA+, testing then includes measurement of complement (C3, C4) as well as ds-DNA. Diagnosis of SLE is difficult and often delayed, especially in the early stages. For many patients, true disease onset may be when symptoms, treatment, or testing for SLE started, which may be weeks or months prior to an SLE diagnosis.

Plan: Conservative treatment includes sun protection, NSAIDs, anti-malarial drugs such as hydroxychloroquine and chloroquine, and methotrexate. More life-threatening states may be managed by systemic glucocorticoids and cytotoxic or immunosuppressive agents.

Prognosis (from UpToDate): Systemic lupus erythematosus (SLE) can run a varied clinical course, ranging from a relatively benign illness to a rapidly progressive disease with fulminant organ failure and death. The five-year survival rate in SLE has dramatically increased since the mid-20th century, from approximately 40 percent in the 1950s to greater than 90 percent since the 1980s. The improvement in patient survival is probably due to multiple factors including increased disease recognition with more sensitive diagnostic tests, earlier diagnosis or treatment, the inclusion of milder cases, increasingly judicious therapy, and prompt treatment of complications. Despite these improvements, patients with SLE still have mortality rates ranging from two to five times higher than that of the general population. Based on mortality data from the United States Centers for Disease Control and Prevention (CDC) from 2000 to 2015, SLE ranked among the top 20 leading causes of death in females between the ages of 5 and 64. In another large population-based study in the United States, mortality risk from SLE was higher among women, African Americans, and residents of the South.

Designated Medical Event - MedDRA PT terms: Granulocytopenia, Has neutropenia as a component: Aplastic anemia, Bone marrow failure, Pancytopenia; Neutropenic colitis, Neutropenic infection, Neutropenic sepsis

Phenotype development:


As discussed last year, after reviewing the literature and finding recommendations from PHOEBE (thanks again @aostropolets), we decided on the following concept set:

We then began building our cohort definitions. We knew from prior literature that SLE often takes a long period to diagnose so we looked for signs, symptoms, and indicative treatment to determine if we should include index date reclassification.

We found the following for signs and symptoms:

And for prior indicative treatment:

We submitted cohort definition #119 to the OHDSI Phenotype Library, where it is currently in peer review status.

Phenotype evaluation:

Here are the incidence rates across several databases:

SLE is more prevalent in females than males.

And tends to peak in the 50-59 YO age group.

The results from PheValuator are:

We see good sensitivity among those with a 365 day lookback period, with a mean of around 94% across the 7 databases tested. Positive predictive value (PPV) was fair, with a mean of around 69%. Remember from last year we found that adding a second code 31-365 days post-index increased the PPV with a concomitant decrease in sensitivity.
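As a reminder of what the two metrics measure, here is a small Python sketch of their definitions. The counts are hypothetical, chosen only so the outputs land near the means quoted above. They are not PheValuator output, and PheValuator estimates these quantities from a predictive model rather than from a simple confusion matrix like this one.

```python
def sensitivity(true_pos, false_neg):
    """Share of true cases that the cohort definition captures."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos, false_pos):
    """Share of cohort members who are true cases."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only.
tp, fn, fp = 940, 60, 422
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV         = {ppv(tp, fp):.2f}")
```

Requiring a second code would typically remove some false positives (raising PPV) while also dropping some true cases (lowering sensitivity), which is the trade-off mentioned above.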

While I am limiting this year’s algorithm to one, this algorithm can be easily converted for other uses such as a conversion to a prevalent cohort by removing the requirement for a 365 day look-back.
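To make the look-back point concrete, here is a hedged Python sketch of a prior-observation check. The function name and the example dates are assumptions for illustration. In practice this requirement is set in the cohort definition rather than in ad hoc code.

```python
from datetime import date


def has_required_lookback(observation_start, index_date, lookback_days=365):
    """True if the person was observed for at least `lookback_days` before index."""
    return (index_date - observation_start).days >= lookback_days


# Hypothetical person observed for about five months before their first SLE code.
obs_start, index = date(2020, 1, 1), date(2020, 6, 1)
print(has_required_lookback(obs_start, index))                   # incident-style rule
print(has_required_lookback(obs_start, index, lookback_days=0))  # look-back removed
```

With the 365-day rule this person is excluded, because we cannot tell whether the code is truly their first. With the look-back removed they are kept, which is what a prevalent cohort would want.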

Thanks, @jswerdel . I’ll be your average reviewer for this one.
Did you consider including measurements as part of the definition? SLE is hardly ever diagnosed without some supporting antibody measurements (as you state already in the summary clinical description above). Of course, the problem is that we may miss this in some databases, but I wonder if incorporating antibodies could increase validity in general, and specificity in particular.

And this is my somewhat structured peer review, based mostly on results from Cohort Diagnostics and pending an answer on the point above re potential inclusion of measurements:

Concepts included and observed:

  1. The concept set used for this phenotype is OK, although somewhat dependent on the observation of complications. These can indeed be the first clue that leads to a diagnosis, but there is always the possibility that these would be long-term sequelae, leading to index date inaccuracies.
  2. As mentioned, I miss the inclusion of key measurements. I would NEVER diagnose SLE without supporting immunology/serologies. I understand some datasets might not have measurements data, but maybe it is worth studying what happens when we include vs exclude these in the phenotype in the datasets with good measurement capture

Orphan concepts (standard only reviewed):
The app was a bit jumpy and did not allow me to review these for all databases at once. In any case, I skimmed through each one of these and could not find any ‘never miss’ orphan concept. Some could be symptoms of SLE (e.g. certain rashes), but of course these can be caused by many other conditions/exposures.

Incidence Rates
Somehow this did not work. I tried a few things but kept getting an error message (Error: there is no package called ‘ggh4x’). Can you see if you also see this @jswerdel ? This is an important bit of information for me, because I want to make sure your phenotype matches my expectations based on previous descriptive epi literature, however scarce that might be.

Index event breakdown
This matched my expectation that most cases would be identified by Concept ID 257628. It also made me realise that we missed the inclusion of hydroxychloroquine or chloroquine among common treatments in the ‘Plan’ section of the Clinical description above.

Visit context
This again matched my expectation that most diagnoses would be made in outpatient clinics, even in data sources with good/granular information on visit types, e.g. Pharmetrics Plus.

Cohort overlap
I guess this was never going to be useful for this one, but I still checked overlap with C225: [P] Drug-induced Lupus (180Pe, 365Era) because I’m cheeky like that. I did not see great overlap, but I did see a bit of that. Somewhat reassuring, but could dive in more if needed, because these are different conditions. Again, antibody measurements would help a great deal

Cohort characterization
This reassured me quite a bit as it showed the profile I expected, particularly in terms of socio-demographics but also in terms of the treatments observed etc

Compare cohort char
This was also useful. I used the ‘drug-induced lupus’ cohort as a “benchmark” of sorts for comparison, and indeed saw the differences I expected in terms of age-sex profile and treatments

It is likely that this is a good working phenotype for incident (newly diagnosed) SLE, which I understand is the target here. Recommendations could include the following:

  • Inclusion of ANA or dsDNA antibody measurements could increase PPV in databases where these are available
  • Include hydroxychloroquine in the clinical description for completeness, as this is indeed a first line treatment for SLE

NOTE: I did not review Incidence Rates as there was some glitch in the app. Let me know if I should redo/re-review if it works for you or you manage to fix this.

Thanks @jswerdel for writing up the evaluation and @Daniel_Prieto for performing the peer review. Nice teamwork!

@Daniel_Prieto , I’m curious, did you find the PheValuator results helpful in determining that this definition was suitable as a phenotype for incident SLE? On your suggestion to add measurement values to increase PPV (beyond the 0.60-0.72 that was estimated across the 7 databases provided): are you thinking that this level of PPV is not adequate ‘as is’, or just thinking about how it can be improved further? Since many databases don’t have measurements, I was trying to think about how to incorporate an inclusion criterion without greatly sacrificing sensitivity. One strategy could be not to require ANA or dsDNA antibodies, but rather to include a person if they have 0 occurrences of those measurements with values incompatible with the phenotype OR >=1 occurrence of measurements with compatible values. This would allow databases without any measurements to still be potentially included.
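The rule is easier to check once written out, so here is a minimal Python sketch of it. The encoding of results (True for compatible, False for incompatible, None for a test performed without a usable value) and the function name are assumptions made for illustration, not part of any cohort definition.

```python
def passes_measurement_criterion(results):
    """Apply the '0 incompatible OR >=1 compatible' inclusion rule.

    results: one entry per ANA/dsDNA measurement for a person.
             True  = value compatible with SLE
             False = value incompatible with SLE
             None  = test recorded, value not available
    """
    has_compatible = any(r is True for r in results)
    has_incompatible = any(r is False for r in results)
    return has_compatible or not has_incompatible


# Hypothetical persons covering each situation.
print(passes_measurement_criterion([]))             # no measurements in the data
print(passes_measurement_criterion([None]))         # tested, result unknown
print(passes_measurement_criterion([False]))        # only incompatible values
print(passes_measurement_criterion([False, True]))  # mixed values
```

Under this reading, only persons whose recorded values are all incompatible get excluded, so a database without measurement results loses nobody.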

Hi @Daniel_Prieto, I agree with you that SLE diagnosis is rarely made without both a supportive serological marker AND the expected constellation of symptoms and signs; and there is a myriad of manifestations and subtleties of clinical presentation that make the diagnosis rely on the experience of the clinician.

I was curious to know how many persons had such antibody measurements and found this

I observed that only about 19% had the testing on index date, but 75% had it sometime during their entire observation period

This data source does not tell us the antibody results (positive, negative, or titer), only that the test was performed. The fact that this test was performed on the majority of the persons at least once makes me want to trust this phenotype. But I am not sure why it was observed less often on the index date. Is this evidence of index date misclassification?

Hi @Patrick_Ryan. The results from PheValuator were useful. Without having read the most recent paper on the current version of the package, I went for ‘let us believe these results’. So I guess what I am saying is that if I basically trust the ‘credibility’ of PheValuator then of course this is interesting. Re PPV, in my previous experience regulatory studies aim for PPV>75% in chart reviews, but of course that is a completely different population etc.
As @Gowtham_Rao says in this same thread, there is quite a lot of testing going on, which is somewhat reassuring. If bandwidth allows, it would be good to create a phenotype that has an additional criterion of ANA or even better dsDNA+ in any database where measurement results are available. We could then compare the results for this vs the current phenotype, or look at cohort overlap. Just a thought.

To @Gowtham_Rao : I agree there is potential evidence of index date misclassification as you say, but I don’t see how we could improve that unless we have access to test results

Having said all that, I believe that the current phenotype works well overall and could be used across a network of databases for all the reasons above.

For others considering chipping in as a ‘Phenotype Peer Reviewer’: this was real fun, so do not hesitate and go for it, especially if there is a phenotype you have plenty of experience with.

BTW this is a good paper on SLE epi in the UK using CPRD. They explicitly excluded drug-induced lupus, which could also be done in our context, and should probably also improve PPV a bit

Thanks all for the review and comments. I’m going to try @Patrick_Ryan’s suggestion for including the measurements and see how it affects the performance characteristics. And also a good suggestion to add hydroxychloroquine or chloroquine to the treatment plan (see changes above).

Is this a typo, or are there different lists? I could only find 119.

No idea what you’re talking about Christian - it clearly says ‘119’ at the top :smile: - thanks for catching this.

@jswerdel would you consider testing a cohort definition that restricts the persons in the base cohort to those who had antibody testing at ANY TIME during their observation period? I would like to see if that improves the PPV without dropping sensitivity by much. There are multiple reasons for this, including data capture issues, complexities in physician decision making for a definitive diagnosis, and the observation in my post above that more than 70% of the persons had at least one antibody test. Note that this observation was on one data source only, and it may not hold for other data sources. You could also consider adding the hydroxychloroquine or chloroquine requirement at any time too.

My expectation is that the combination of the two would improve the PPV without sensitivity taking a big hit, on at least one data source. It may have a bigger sensitivity hit on data sources that do not capture antibody testing or drugs.

@jswerdel based on the recommendation from @Daniel_Prieto this definition has been accepted to the OHDSI Phenotype Library.

The cohort id is 119 and this definition specification (json) is guaranteed to not change. It is expected to be available in https://github.com/OHDSI/PhenotypeLibrary starting with version 3.10.0 (current release is 3.9.0). Once available, it may be retrieved as follows:

PhenotypeLibrary::getPlCohortDefinitionSet(cohortIds = 119)

and can be referenced in any OHDSI HADES software package.

The following meta information has been captured with the cohort id 119. If you would like any additional information, please let me know.

Name: Systemic lupus erythematosus indexed on signs, symptoms, treatment or diagnosis (FP)
Status: Accepted
hashTag: #PhenotypePhebruary, #2023, #SystemicLupusErythematosus
Contributor: Joel Swerdel, Daniel Prieto-Alhambra
Forum post: Phenotype Phebruary 2023 - P5 - Systemic Lupus Erythematosus (SLE)
Logic: first signs and symptoms suggestive of Systemic lupus erythematosus (SLE) or first treatment suggestive of SLE with SLE diagnosis within 90 days.

The power of Dani - after his review of the phenotype for SLE, it is now published!

Congratulations @jswerdel ! It’d be wonderful to see more papers like this for each of our phenotypes that we’re developing and evaluating during Phenotype Phebruary!

@jswerdel That is exactly what I am looking for. Thank you, @Daniel_Prieto, for the clinical review. @Gowtham_Rao, I see that the GitHub phenotype library release is at 3.12, and I found cohort ID 119 there. Great work!