Another beautiful day in Phenotype Phebruary…2-22-22! Today’s phenotype will be HIV, or human immunodeficiency virus. The virus itself is relatively “new” when we think about some of the phenotypes that we have already explored. This virus has a history within the United States (and other developed nations) of carrying social and cultural constructs that have shaped the lives of many from the 1980s to now. While this disease was once life-threatening, it has become more manageable with advances in modern medication. A shoutout to @stephenfortin for helping develop this phenotype with me.
As always let’s start with the clinical definition…
Clinical Definition: According to the CDC, human immunodeficiency virus (HIV) is defined as a virus spread by contact of certain body fluids, which attacks the body’s immune system, specifically CD4 cells. The virus is most commonly spread through unprotected sex (e.g., without a condom or preventative HIV medicines), or sharing of injection drug equipment. HIV reduces the number of CD4 cells in the body thereby weakening an individual’s immune system and rendering them more susceptible to opportunistic infection and cancer. Left untreated, HIV can lead to acquired immunodeficiency syndrome (AIDS).
Diagnosis: The only method to determine HIV status is through screening tests. HIV tests are available at many medical clinics, substance abuse programs, community health centers, and hospitals. Home testing kits are also available at many pharmacies or online. Several types of HIV tests exist, including nucleic acid tests (NAT), antigen/antibody tests, and antibody tests.
Prognosis and Treatment: Without treatment, the average survival of individuals with AIDS is approximately 3 years; however, the occurrence of opportunistic infection decreases life expectancy without treatment to 1 year. That being said, taking HIV medicine (i.e., antiretroviral therapy) may enable individuals with HIV to live long and healthy lives, and prevent the transmission of HIV to their sexual partners. In addition, certain measures can decrease the risk of getting HIV through sex or injection drug equipment, including pre-exposure prophylaxis (PrEP) and post-exposure prophylaxis (PEP).
As we read this definition, there are some key aspects to consider in developing this phenotype. This is an incurable disease BUT it can be suppressed (sometimes all the way to being undetectable by laboratory tests). The only way to know if someone has the disease is by performing a test, and the disease must be managed with drugs. These are important factors to consider because the capture of these items (laboratory measurements, diagnoses, and drugs) can vary across observational databases.
There are many studies conducted on patients with HIV, but most use laboratory measurements to confirm the disease. This is helpful when all of our databases have well-captured lab data (but they don’t!). Here is a summary of some helpful papers.
Author | Year | Database | Algorithm | Sensitivity | Specificity | PMID |
---|---|---|---|---|---|---|
Paul et al | 2018 | EHR | Algorithm 1. Lab + Medications. Algorithm 2. ICD-9 codes, medications, lab tests | Algorithm 1 and 2: 78% and 77%, respectively | Algorithm 1 and 2: 99% and 100%, respectively | 28645207 |
Antoniou et al. | 2011 | Claims | 48 phenotypes tested. Combinations of physician billing codes, hospitalizations, ED visits, prescription claims, over various time frames | See Table 1. Information available only for select algorithms. Sensitivity increases as observation period increases. | Specificity >99% for all definitions except single physician claim. | 21738786 |
In Paul et al., the authors found positive lab values and HIV medications (algorithm 1) or positive lab values and HIV medications and ICD-9 diagnosis code for HIV (algorithm 2) to have a sensitivity of 78% and 77%, respectively, and a specificity of 99% and 100%, respectively. It is important to note that the authors had direct access to patient records, including labs. Meanwhile, Antoniou et al. tested a total of 48 phenotype algorithms, including varying combinations of physician billing codes, hospitalizations, ED visits, and prescription claims associated with HIV occurring over varying time periods. Among other findings, the authors concluded that a combination of 2+ physician claims and/or HIV medications occurring within a 2-year period achieved a high sensitivity (e.g., >90%) and specificity (e.g., >99%).
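As a quick refresher on what those two numbers mean, here is a minimal sketch of how sensitivity and specificity fall out of a validation against a gold standard (e.g., chart review). The counts are made up purely for illustration and are not taken from either paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true HIV cases that the algorithm flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without HIV that the algorithm correctly leaves out."""
    return true_neg / (true_neg + false_pos)


# Made-up validation counts, for illustration only (not from Paul or Antoniou).
tp, fn = 90, 10    # gold-standard cases: flagged vs. missed by the algorithm
tn, fp = 950, 50   # gold-standard non-cases: left out vs. wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

A definition that requires more evidence (labs AND medications, say) will usually trade some sensitivity for specificity, which is the pattern both papers report.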
From the scope of literature reviewed we know that labs, drugs, and diagnosis codes will play a role. So we will start with these cohorts:
• HIV diagnosis only 5442
• HIV diagnosis or laboratory measure 5452
• HIV diagnosis or laboratory measure AND treatment 5451
• HIV diagnosis or laboratory measure AND treatment OR 2nd diagnosis 5445
• HIV diagnosis AND laboratory test OR treatment 5441
Our suspicion here is that the diagnosis or laboratory measure gives us the largest pool of patients, and maybe we see a small drop when we add treatments (the rationale being that we lose some people who were maybe rule-outs, etc.). The series of cohorts addresses all possible avenues of how a patient could be identified, with the assumption that some databases won’t have good laboratory capture, OR that in EHRs a patient may only ever get the diagnosis once.
Here we can see right away that the databases with laboratory measurements show differences between definitions C2 and C5, though these differences are quite small in our US databases. This tells us that not all labs are captured, and we must rely heavily on the diagnosis codes and medications.
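To make the entry logic of the five cohorts concrete, here is a rough sketch in Python. It is not how the cohorts are actually built; the `PatientFlags` fields and the cohort labels are hypothetical names for illustration, the grouping of AND/OR is my reading of the list above, and all timing rules are ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PatientFlags:
    """Hypothetical per-patient summary; field names are illustrative only."""
    n_diagnoses: int      # number of HIV diagnosis records
    has_lab: bool         # any HIV laboratory measurement
    has_treatment: bool   # any HIV medication exposure


def cohort_membership(p: PatientFlags) -> Dict[str, bool]:
    """Return which of the five cohort entry rules the patient satisfies."""
    dx = p.n_diagnoses >= 1
    second_dx = p.n_diagnoses >= 2
    dx_or_lab = dx or p.has_lab
    # Assumed grouping: "(dx OR lab) AND (treatment OR 2nd dx)", etc.
    return {
        "dx_only": dx,
        "dx_or_lab": dx_or_lab,
        "dx_or_lab_and_tx": dx_or_lab and p.has_treatment,
        "dx_or_lab_and_tx_or_2nd_dx": dx_or_lab and (p.has_treatment or second_dx),
        "dx_and_lab_or_tx": dx and (p.has_lab or p.has_treatment),
    }


# A possible "rule-out": one diagnosis code, no lab record, no treatment.
rule_out = PatientFlags(n_diagnoses=1, has_lab=False, has_treatment=False)
for name, is_member in cohort_membership(rule_out).items():
    print(name, is_member)
```

The single-code patient lands in the first two cohorts only, which is the kind of small drop we would expect to see once treatment or a second diagnosis is required.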
Incidence rates:
The incidence rates are higher in males, which is expected, and quite a bit higher in MDCD compared to our other US claims/EHR databases. The incidence rates are telling us at face value what we know about the disease, but we aren’t really at our phenotype yet. Much of the literature and clinical practice rely heavily on laboratory measurements, but we may not have those, so what else can we learn here?
So now, what if we want to understand why some people don’t have medications when they are required to maintain low viral loads? We can take a peek at temporal characterization for those with the diagnosis…
We can see that many people, about 80%, have the diagnosis on index. Diving deeper, we see laboratory measurements for about 40% on index, and on days 31 to 365 we see more diagnoses and labs.
When we look deeper at the cohort for medications, we see antiretrovirals, and they occur mainly after index, and in some cases before (likely due to other diseases).
So this short tour of the phenotype raises a question for our fellow community members to think about: laboratory measurements and diagnosis codes? Tell me what you think…start the conversation, and I hope to add more color in the coming days…
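For anyone less familiar with temporal characterization, the idea is simply the share of cohort members with at least one record of a given kind in a time window relative to index. Here is a toy sketch, assuming a flat list of (person, days from index, event kind) records; the data and the function name are made up for illustration, and this is not the actual characterization tooling.

```python
# (person_id, days_from_index, event_kind); toy records, purely illustrative.
events = [
    (1, 0, "diagnosis"),
    (1, 0, "lab"),
    (1, 45, "drug"),
    (2, 0, "diagnosis"),
    (2, 200, "lab"),
    (3, -10, "drug"),
    (3, 60, "diagnosis"),
]
cohort = {1, 2, 3}


def proportion_in_window(events, cohort, kind, start_day, end_day):
    """Fraction of cohort members with >=1 `kind` event in [start_day, end_day]."""
    hits = {
        pid
        for pid, day, event_kind in events
        if event_kind == kind and start_day <= day <= end_day and pid in cohort
    }
    return len(hits) / len(cohort)


# Windows are relative to index (day 0); negative days are before index.
print("diagnosis on index:", proportion_in_window(events, cohort, "diagnosis", 0, 0))
print("lab on index:", proportion_in_window(events, cohort, "lab", 0, 0))
print("lab, days 31 to 365:", proportion_in_window(events, cohort, "lab", 31, 365))
print("drug before index:", proportion_in_window(events, cohort, "drug", -365, -1))
```

Each person is counted once per window no matter how many records they have, so the proportions read as "percent of the cohort with the feature" rather than record counts.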