
Phenotype Phebruary 2023 - P3 - Appendicitis - interactive session

Please use the link so that we can work on this together

Meeting invite on MS Teams

Friday, February 3, 2023 2:00 PM - 3:30 PM EST

(Apologies in advance to our colleagues in time zones where attending this live session is not possible. This was the only time @Azza_Shoaibi and @Gowtham_Rao were available. This session is recorded. If needed, we are willing to repeat it at a time that suits a different time zone.)

Clinical Description

Appendicitis is the inflammation of the vermiform appendix. It typically presents acutely, within 24 hours of onset, but can rarely also present as a more chronic condition. Classically, appendicitis initially presents with generalized or periumbilical abdominal pain that later localizes to the right lower quadrant. Appendicitis occurs most often between the ages of 5 and 45, with a mean age of 28. The incidence is approximately 233 per 100,000 people. Males have a slightly higher predisposition to developing acute appendicitis than females, with a lifetime incidence of 8.6% for men and 6.7% for women. There are approximately 300,000 hospital visits yearly in the United States for appendicitis-related issues.

Presentation: Pain may or may not be accompanied by any of the following symptoms: anorexia, nausea/vomiting, fever (40% of patients), diarrhea, generalized malaise, urinary frequency or urgency.

Assessment: Appendicitis is traditionally a clinical diagnosis. However, several imaging modalities are used to support the diagnosis, including abdominal CT scan, ultrasonography, and MRI. Laboratory measurements include total leukocyte count, neutrophil percentage, and C-reactive protein (CRP) concentration.

Plan: The gold-standard treatment for acute appendicitis is an appendectomy. Laparoscopic appendectomy is preferred over the open approach. There is some disagreement regarding preoperative antibiotic administration for uncomplicated appendicitis.

The differential diagnosis includes Crohn ileitis, mesenteric adenitis, the inflammatory process in the cecal diverticulum, mittelschmerz, salpingitis, ruptured ovarian cyst, ectopic pregnancy, tubo-ovarian abscess, musculoskeletal disorders, endometriosis, pelvic inflammatory disease, gastroenteritis, right-sided colitis, renal colic, kidney stones, irritable bowel disease, testicular torsion, ovarian torsion, round ligament syndrome, epididymitis, and other nondescript gastroenterological issues.

Prognosis: If appendicitis is diagnosed and treated early, recovery within 24 to 48 hours is expected, as appendectomy is a relatively safe surgical procedure. Cases that present with advanced abscesses, sepsis, and peritonitis may have a more prolonged and complicated course, possibly requiring additional surgery or other interventions.

Phenotype development:
Since appendicitis is an acute disease that almost always should be managed in an inpatient setting, we decided to limit this cohort to persons who are presenting in an inpatient or emergency room setting.

We explored whether we should remove persons who had a history of appendectomy, as this is biologically not compatible with an event of appendicitis. However, after observing that almost all persons who had an appendectomy had it within a short time (~1 week) prior to the appendicitis diagnosis, we decided not to add that as a rule, as it may represent index date misclassification of a true case of appendicitis.
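The inpatient/ER restriction described above can be sketched roughly as follows. This is only an illustration, not the submitted cohort definition: the `Visit` and `Diagnosis` records and the `qualifying_diagnoses` helper are hypothetical names, and the visit concept IDs (9201 inpatient, 9203 emergency room, 262 emergency room and inpatient) are assumed to be the relevant standard OMOP visit concepts.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: these standard OMOP visit concepts represent inpatient/ER care.
INPATIENT_OR_ER = {9201, 9203, 262}


@dataclass(frozen=True)
class Visit:
    person_id: int
    visit_concept_id: int
    start: date
    end: date


@dataclass(frozen=True)
class Diagnosis:
    person_id: int
    diagnosis_date: date


def qualifying_diagnoses(diagnoses, visits):
    """Keep diagnoses whose date falls inside an inpatient or ER visit."""
    kept = []
    for dx in diagnoses:
        for v in visits:
            if (
                v.person_id == dx.person_id
                and v.visit_concept_id in INPATIENT_OR_ER
                and v.start <= dx.diagnosis_date <= v.end
            ):
                kept.append(dx)
                break  # one matching visit is enough
    return kept


# Made-up example data: person 1 is seen in the ER, person 2 only outpatient.
visits = [
    Visit(1, 9203, date(2020, 3, 1), date(2020, 3, 1)),
    Visit(2, 9202, date(2020, 5, 10), date(2020, 5, 10)),
]
diagnoses = [
    Diagnosis(1, date(2020, 3, 1)),
    Diagnosis(2, date(2020, 5, 10)),
]
cohort = qualifying_diagnoses(diagnoses, visits)
```

In this toy example only person 1 enters the cohort, because person 2's diagnosis was recorded during an outpatient visit.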

Cohort Submission:
See cohort id #234 in the OHDSI Phenotype Library, currently in pending peer review status. Cohort Diagnostics output is available at data.ohdsi.org/PhenotypeLibrary

Potential problems with this phenotype:

Miss rate/False negative rate/Sensitivity - mild forms of appendicitis are thought to go undiagnosed, as persons may never seek care and the condition resolves. It is unlikely that this condition is managed outpatient. In persons who are older or pregnant, symptoms of appendicitis may lead to an alternate diagnosis being evaluated.
Index date misclassification - we don’t expect significant index date misclassification in acute settings especially in typical settings, because persons who have appendicitis and receive care for symptoms of appendicitis are likely to be diagnosed early.
Specificity - we do not know how many persons who have symptoms/signs suggestive of appendicitis (e.g. cholecystitis) may be misdiagnosed as having appendicitis and potentially managed for appendicitis, including surgery. We also do not know if prophylactic appendectomy during other abdominal surgeries is reported as appendicitis.

Phenotype evaluation: We evaluated 11 data sources. Note: we observed 0 counts in many data sources, and this is by design, because the cohort definition requires an inpatient/ER visit. The data sources with 0 counts do not appear to have inpatient-related data.

Incidence rate: in most data sources we observed an incidence rate around 0.8 per 1,000 persons per year, which is approximately (within an order of magnitude of) the rate that has been previously reported. The rates are slightly higher in the 10-30 age deciles compared to other age deciles. This is in line with the expected age of 28 (mean). Males appear to have a higher rate compared to females. We observe slightly higher rates in the 2012-2014 calendar years compared to more recent calendar years. The reason for this is unknown and may represent a higher sensitivity and/or lower specificity in 2012-2014 compared to 2018-2020. This observation was mostly in the 10-30 age deciles.
Index event breakdown: about 10 to 20% of persons appear to have appendicitis with peritonitis, while < 10% have reported peritoneal abscess.
Visit context: note - we require an inpatient or ER visit by design. About 40 to 60% of persons had a simultaneous ER visit.
Pain: On index date 30 to 50% had right lower quadrant pain while 10% to 50% had abdominal pain.
Top concepts: the following concepts were found on the top of the list in characterization when ranked by frequency
condition domain: right upper quadrant pain, abdominal pain, nausea, vomiting, leukocytosis, fever, sepsis, hypovolemia, constipation, acute abdomen, peritonitis
procedure domain: about 50 to 80% had a CT scan, while about 80% had anesthesia administration, and 30%-60% had laparoscopy
drug domain: 40 to 80% appear to be on anti-infectives for systemic use.
no unexpected concepts observed
Source of errors: Index date misclassification - We did observe some appendectomy being performed in the short term window prior to the diagnosis of appendicitis. We did not address this in this cohort definition.
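The incidence rate comparison above can be checked with a little arithmetic. The sketch below puts both rates on the same scale, assuming the previously reported figure of 233 per 100,000 is an annual rate (the clinical description does not state the time unit). The function name is made up for illustration.

```python
import math


def per_1000_person_years(events, per_persons):
    """Rescale a rate expressed per `per_persons` to a rate per 1,000."""
    return events * 1000 / per_persons


# Previously reported rate from the clinical description (assumed annual).
reported = per_1000_person_years(233, 100_000)

# Approximate rate observed in most data sources.
observed = 0.8

# Ratio of reported to observed, and whether the two rates are
# within one order of magnitude (factor of 10) of each other.
ratio = reported / observed
within_order_of_magnitude = abs(math.log10(ratio)) < 1
```

On these numbers the reported rate is about 2.33 per 1,000 per year, roughly three times the observed rate, so the two are within an order of magnitude but the observed rate is clearly the lower one.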

Overall: review of population level characteristics in cohort diagnostics is consistent with the phenotype of interest


Thank you @Gowtham_Rao . Earlier this afternoon, the phenotype development and evaluation work group met and reviewed the submitted appendicitis phenotype. Please see the above link to the recorded meeting. The group accepted the submitted phenotype, but recommends the following modifications/considerations:

  1. Update the clinical description to explicitly state that you are targeting all types of events of appendicitis - acute, chronic, mild and severe, perforated, non-perforated appendicitis.
  2. Phoebe 2.0 recommends the inclusion of ‘Rupture of Appendix’ as an entry criterion - please test and provide evidence as to why that code was not included. We also observed that READ code ‘Appendicitis and other disorders of the appendix’ (OMOP concept id 45456828) is not used in your definition.
  3. We found 2 prior studies that validated cohort definitions that were similar to your definition and had good performance characteristics (PPV: 83.0% (95% CI: 82.2% to 83.7%)). Please reference these studies in your write up and report their findings - Kleif et al., Coward et al.
  4. While limiting events to the inpatient and ER setting probably improves the specificity of your definition, the reviewers are concerned about possible sensitivity error. During the review we observed that the rule requiring an inpatient/ER visit led to a 15-47% reduction in the events captured across data sources. While we agree with the use of this rule of inpatient or ER (also used in the above referenced validated cohort definitions), we would like you to investigate if you are missing events because of incomplete visit representation (i.e. please expand the care setting requirement to include inpatient/ER from the visit table or concepts from the procedure/observation domain that correspond to ER/inpatient). We agree that use of the inpatient/ER inclusion rule will avoid counting events that are follow-up outpatient care.
  5. Following on the point above, please add to your write up a brief description of the clinical characteristics of patients who meet the entry criteria but do NOT meet the ER/inpatient requirement compared to those who do. Also provide PheValuator results that can estimate the tradeoff between specificity and sensitivity associated with the ER/inpatient requirement.
  6. The reviewers agree with your conclusion that the demographic and clinical characteristics of the cohort illustrated in CD are consistent with expectations. Specifically, we agree that the prevalence of symptoms (abdominal pain), diagnostic work ups (imaging), and treatments (appendectomy, antibiotics) may indicate a relatively high positive predictive value (PPV).
  7. As you described in your write up, we observed that 10% of the cohort had events such as appendectomy, imaging, or acute appendicitis in the 30 days to 1 day before index, indicating index event misclassification. We ask that you evaluate the impact of changing your entry criteria to index on diagnosis, treatment, or procedure, while requiring an inpatient/ER diagnosis within 1-2 days. We do not think you should go beyond 2 days because it is unlikely that in clinical care persons with true acute appendicitis would be followed without definitive care (i.e. appendectomy) for more than 1 to 2 days.
  8. Please provide PheValuator results.
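One way to read the re-indexing suggestion in item 7 is sketched below: take the earliest of the diagnosis date and any related treatment or procedure date, but only when that event falls within 2 days before the diagnosis. This is a hypothetical interpretation for discussion, not the reviewers' specification; `reanchored_index` and `MAX_GAP` are made-up names.

```python
from datetime import date, timedelta

# Reviewers' suggested upper bound on the gap between event and diagnosis.
MAX_GAP = timedelta(days=2)


def reanchored_index(diagnosis_date, related_event_dates, max_gap=MAX_GAP):
    """Return the earliest of the diagnosis date and any related event
    (e.g. appendectomy, imaging) dated 0 to max_gap before the diagnosis."""
    candidates = [diagnosis_date]
    for event_date in related_event_dates:
        gap = diagnosis_date - event_date
        if timedelta(0) <= gap <= max_gap:
            candidates.append(event_date)
    return min(candidates)


# Appendectomy recorded one day before the diagnosis: index moves earlier.
shifted = reanchored_index(date(2020, 3, 3), [date(2020, 3, 2)])

# Appendectomy recorded three weeks earlier: outside the window, index unchanged.
unchanged = reanchored_index(date(2020, 3, 3), [date(2020, 2, 10)])
```

Events dated after the diagnosis, or more than 2 days before it, leave the index date where it was, which matches the reviewers' point about not going beyond 2 days.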

Overall, the phenotype submitted is in line with the clinical description and builds on prior validated definitions, and the data observed in the Cohort Diagnostics tool is consistent with the expected demographic and clinical profile of patients with appendicitis. However, the definition may be missing true mild cases of appendicitis (either through missed capture of inpatient/ER care, or because those cases may be managed outpatient without surgery using antibiotics), leading to some loss of sensitivity. In addition, we observed that around 10% of patients may have index event misclassification by a few days.

Thanks @Azza_Shoaibi and @Gowtham_Rao for leading the interactive session on peer reviewing Appendicitis. I was sorry to miss the session live, but REALLY enjoyed listening to the recording (link here for those who’d like to also listen in). I strongly recommend it to anyone who wants to hear various perspectives of the thinking that goes into clinical descriptions, justifying cohort definitions, and elements to look at during a review from @Christian_Reich @Sebastiaan_van_Sandi @Evan_Minty @aostropolets @Andrea_Noel @qifeng Alexander Miller and others.

My big takeaway from listening to the discussion: building trust for a phenotype definition is hard. There is a near-infinite set of ‘what ifs’ that someone can speculate about, but it’s not feasible or reasonable to explore all of those rabbit holes. So, in terms of the intended scope here as a peer review, the question shouldn’t be: ‘how or why did you develop the following phenotype algorithm?’ Instead, the question should be: ‘Is this phenotype algorithm - with its estimated measurement error on the evaluated databases - sufficient to produce reliable evidence?’. If the peer review of the empirical evidence provided by the developer determines that it supports the use of the algorithm, then it should be ‘accepted’. The peer review could determine there’s insufficient evidence provided (e.g. no quantitative assessment by tools like CohortDiagnostics, PheValuator, or some other approach). It’s also possible that the evidence to support the use of the algorithm is not compelling (e.g. the estimated positive predictive value < 20%, the sensitivity < 25%), but here is where I’d prefer for us to establish objective diagnostics with pre-specified decision thresholds for phenotype adequacy determined by the extent to which characterization/estimation/prediction results will be biased by the measurement error, rather than having subjective opinions about how much error is ‘too much’.
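To make the idea of an objective diagnostic a bit more concrete, here is a minimal sketch of how PPV and sensitivity translate into bias in a simple case count. It relies only on the definitions PPV = TP / observed and sensitivity = TP / true cases, so observed / true = sensitivity / PPV. The function names, the example values, and the 20% tolerance are all hypothetical; this is not an established OHDSI diagnostic.

```python
def corrected_count(observed, ppv, sensitivity):
    """Estimate the true number of cases from an observed cohort count.

    True positives = observed * ppv, and true cases = true positives / sensitivity.
    """
    if not (0 < ppv <= 1 and 0 < sensitivity <= 1):
        raise ValueError("ppv and sensitivity must be in (0, 1]")
    return observed * ppv / sensitivity


def relative_bias(ppv, sensitivity):
    """Relative error of the observed count: (observed - true) / true."""
    return sensitivity / ppv - 1


def passes_diagnostic(ppv, sensitivity, max_abs_relative_bias):
    """Pre-specified decision rule: accept if the count bias is small enough."""
    return abs(relative_bias(ppv, sensitivity)) <= max_abs_relative_bias


# Hypothetical inputs: a PPV of 0.8 and two candidate sensitivities,
# judged against a hypothetical tolerance of 20% bias in the count.
estimated_true = corrected_count(800, ppv=0.8, sensitivity=0.6)
low_sensitivity_ok = passes_diagnostic(0.8, 0.6, max_abs_relative_bias=0.2)
higher_sensitivity_ok = passes_diagnostic(0.8, 0.75, max_abs_relative_bias=0.2)
```

With a PPV of 0.8 and sensitivity of 0.6 the observed count understates the true count by 25%, which fails the assumed tolerance, while a sensitivity of 0.75 would pass. The point is only that the threshold is fixed before looking at the result; bias in estimation or prediction would need its own, more involved, calculation.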

It seems to me a second-order question to ask ‘can we improve the algorithm further?’. The discussion uncovered multiple candidate opportunities for revision, but as @aostropolets pointed out, one needs to empirically evaluate those algorithm changes to assess the impact on measurement error before declaring them ‘improvements’. But definition refinement, with its associated development and evaluation, seems a disjoint effort from peer reviewing an existing phenotype definition.

I’m glad to see our appendicitis phenotype nearing the finish line to be a community-accepted phenotype definition for everyone to have the opportunity to review and re-use in their own evidence generation process. Well done team!

@Gowtham_Rao and @Azza_Shoaibi nicely presented. I’d use this phenotype! My only ‘what-if’ is not going to be some clinical rabbit hole but a data what-if: what if we had greater data linkage across time to better capture those distant past appendectomies, to further exclude from denominators those not at risk? Only really important for incidence evaluations. Likely totally non-differential in our comparative cohort studies.

If only every clinical condition was this clear cut: problem with the organ, cut it out, all acute and well defined. Imagine depression if it was acute: cut out that part of the brain that makes one depressed, and life would be so easy for computing phenotypes. I know there are probably several interested in cutting a part of my brain out!