
Does your dataset have eGFR results?

Hi all-

We are considering running a study that requires eGFR results, and we were wondering how many OHDSI datasets contain eGFR. If you have eGFR results in your dataset, please let us know. A sample size (N) of patients having eGFR results would also be great.

Thanks in advance!

Best,
~Mary Regina Boland + Jing Huang (@jinghuang) + Yong Chen

As you know, eGFR is calculated based on serum creatinine plus a few demographic factors. You might consider that option as well since some people may not have eGFR in their data, but could calculate it. The one issue that might happen is that different databases will calculate it differently based on all of the different formulas that exist. But since you are looking at feasibility, you might get better results if you also ask for the components needed to calculate eGFR.

That’s a great suggestion, @Mark_Danese.
To calculate eGFR, blood/serum creatinine along with age, sex, and race would be needed.
It would be awesome if we could get eGFR along with serum creatinine, age, sex, and race, to compare for discrepancies (which often occur in EMR data).
Perhaps the question is really how many OHDSI collaborators have access to laboratory data, and then, within that subset, how many have eGFR and/or creatinine?

@Mary_Regina_Boland:

You are living in the era of OHDSI Network studies. Hack a cohort in ATLAS with the LOINC concept for eGFR and post it. If nobody answers, send some nastigrams. You’ll have your answer. And once you are at it, add the creatinine and demographics. All as distributions, so you aren’t asking for patient-level data.

thanks @Christian_Reich, something like this?
Using all concept ids for eGFR that are LOINC based

SELECT count(distinct person_id)
FROM MEASUREMENT
WHERE measurement_concept_id="40758886" OR
measurement_concept_id="40758889" OR
measurement_concept_id="36203240" OR
measurement_concept_id="40758890" OR
measurement_concept_id="40758887" OR
measurement_concept_id="40758888" OR
measurement_concept_id="40758885" OR
measurement_concept_id="40758891" OR
measurement_concept_id="3009224" OR
measurement_concept_id="3014444" OR
measurement_concept_id="3053283" OR
measurement_concept_id="3049187";

Also might talk to Ning (Sunny) Shang at Columbia because she became an eGFR expert because of our eMERGE CKD phenotype. (I don’t mean just to find out about Columbia data, but what are different ways to get it from other sites, coded in an OMOP database.) George

great sounds good @hripcsa

@Mary_Regina_Boland:

Yes, except you want to be nice to your colleagues and:

  • Instead of double quotes use single quotes (that’s how SQL likes it)
  • Put “@.” in front of MEASUREMENT so folks or SQLRender can insert the correct schema name

But then it will work. Takes no time at all. I just ran it on our Ambulatory EMR dataset, and we got 6,683,329 such patients.

awesome thanks @Christian_Reich sounds like a decent number of patients!

The modified query is appended for those using SQLRender:

SELECT count(distinct person_id)
FROM @.MEASUREMENT
WHERE measurement_concept_id='40758886' OR
measurement_concept_id='40758889' OR
measurement_concept_id='36203240' OR
measurement_concept_id='40758890' OR
measurement_concept_id='40758887' OR
measurement_concept_id='40758888' OR
measurement_concept_id='40758885' OR
measurement_concept_id='40758891' OR
measurement_concept_id='3009224' OR
measurement_concept_id='3014444' OR
measurement_concept_id='3053283' OR
measurement_concept_id='3049187';

Hi @Mary_Regina_Boland, I created a cohort definition for you in ATLAS:
http://www.ohdsi.org/web/atlas/#/cohortdefinition/1655678. Anyone could
grab the JSON and run if they have ATLAS, or copy/paste the SQL if they
don’t.

Note, the conceptset I used doesn’t include all those concepts in your
post, because several of those are looking at gene markers. Also, note,
I’m not just looking for a measurement record, but also looking to ensure
that I have a numeric value > 0.
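The two conditions described here (a restricted concept set plus a numeric value > 0) can be tried out on a toy table before touching real data. The following is only a sketch, not the SQL that ATLAS generates: it uses Python's built-in sqlite3 with an in-memory stand-in for MEASUREMENT, the function names and rows are made up for illustration, and it assumes that 3053283 and 3049187 are the two eGFR concepts left from the earlier query once the gene-marker codes are dropped.

```python
import sqlite3

# Assumption: these are the two non-gene concept IDs from the query earlier
# in the thread. Check them against your own vocabulary before relying on them.
EGFR_CONCEPT_IDS = [3053283, 3049187]


def count_egfr_patients(conn, concept_ids):
    """Count distinct persons with a positive numeric result for any given concept."""
    placeholders = ",".join("?" for _ in concept_ids)
    sql = (
        "SELECT COUNT(DISTINCT person_id) FROM measurement "
        f"WHERE measurement_concept_id IN ({placeholders}) "
        "AND value_as_number > 0"
    )
    return conn.execute(sql, list(concept_ids)).fetchone()[0]


def build_toy_db():
    """Build an in-memory table with a few hypothetical measurement rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurement ("
        "person_id INTEGER, measurement_concept_id INTEGER, value_as_number REAL)"
    )
    rows = [
        (1, 3053283, 62.0),
        (1, 3049187, 75.0),  # same person again, counted once
        (2, 3049187, None),  # record without a numeric value, excluded
        (3, 3049187, 0.0),   # zero value, excluded
        (4, 3009224, 1.0),   # gene-marker concept, excluded
        (5, 3053283, 88.5),
    ]
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)", rows)
    return conn


toy_count = count_egfr_patients(build_toy_db(), EGFR_CONCEPT_IDS)
print(toy_count)
```

Only persons 1 and 5 satisfy both conditions in the toy rows, which is the kind of difference from a plain record count that the value filter is meant to expose.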

Based on this, on the Janssen side, it appears we would have data in Truven
CCAE, Truven MDCR, IMS Australia, and Optum Extended. For CPRD, IMS
Germany, IMS France, and JMDC, there’s no eGFR, but there are raw creatinine
values (though the challenge is that different codes and units are used
across different databases).
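On the units point, one common case is serum creatinine reported in µmol/L in some databases and mg/dL in others. A minimal sketch of harmonizing to mg/dL before applying any eGFR formula follows. The unit strings and the function name are hypothetical (real data would carry unit concept IDs that need their own mapping), and the factor of about 88.4 µmol/L per mg/dL comes from creatinine's molar mass of roughly 113.12 g/mol.

```python
# 1 mg/dL of creatinine is about 88.4 umol/L (molar mass ~113.12 g/mol).
UMOL_PER_L_PER_MG_PER_DL = 88.4


def creatinine_to_mg_dl(value, unit):
    """Convert a serum creatinine value to mg/dL.

    `unit` is a free-text unit label; the accepted spellings below are
    illustrative assumptions, not an OMOP unit mapping.
    """
    unit_key = unit.strip().lower()
    if unit_key == "mg/dl":
        return value
    if unit_key in ("umol/l", "µmol/l"):
        return value / UMOL_PER_L_PER_MG_PER_DL
    raise ValueError(f"Unrecognized creatinine unit: {unit!r}")


print(round(creatinine_to_mg_dl(106.0, "umol/L"), 2))
```

Raising on unknown units, rather than passing the value through, keeps a mislabeled record from silently producing an implausible eGFR.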


awesome @Patrick_Ryan

So it sounds like an OHDSI network study using eGFR and creatinine values is doable. However, there will be a number of challenges, including country-specific methods of measuring, testing, and perhaps documenting creatinine.
This should be good from an informatics perspective, though :smile:
We’ll be in touch - working on developing the network study.

Hi @Mary_Regina_Boland - at PEDSnet we created a query (written in postgres) that calculates eGFR from creatinine. I’m not sure if you will find it useful but let me know if you’d like to take a look.

If your studies include historic data you may want to consider this-

One catch that I came across when speaking with clinicians here is that due to changes in reporting, around 2012 the calculations used for eGFR went from MDRD methodology to CKD-EPI.

Here’s the SAS code snippet that I use to make the calculations both ways. I implemented the most precise formulae that I could find. I am sorry that I don’t know how to preserve my indentation here.

Best,
Gerry

/* MDRD calculation (2011 and earlier) */
if Year(CrDate) <= 2011
then
do;
if gender_concept_ID=8507 /* Male */
then gfr = 175 * (crvalue**-1.154) * (testage**-0.203) ; * Male ;
else gfr = 175 * (crvalue**-1.154) * (testage**-0.203) * (0.742) ; * Female;
if race_concept_ID=8516 then gfr=gfr*1.212; /* Adjust if African American */
end;

/* CKD-EPI calculations (2012 and beyond) */
else
do;
if (race_concept_ID=8516) /* African American */
then
select;
/* Female, low crvalue */
when((gender_concept_ID=8532) & (crvalue <= 0.7)) gfr=166 * ((crvalue/0.7)**-0.329) * (0.993**testage);
/* Female, high crvalue */
when((gender_concept_ID=8532) & (crvalue > 0.7)) gfr=166 * ((crvalue/0.7)**-1.209) * (0.993**testage);
/* Male, low crvalue */
when((gender_concept_ID=8507) & (crvalue <= 0.9)) gfr=163 * ((crvalue/0.9)**-0.411) * (0.993**testage);
/* Male, high crvalue */
when((gender_concept_ID=8507) & (crvalue > 0.9)) gfr=163 * ((crvalue/0.9)**-1.209) * (0.993**testage);
otherwise Error "GFR Calculation - Black" Race_concept_ID= gender_concept_ID= testage= crvalue= ;
end;
else /* Not African American */
select;
/* Female, low crvalue */
when((gender_concept_ID=8532) & (crvalue <= 0.7)) gfr=144 * ((crvalue/0.7)**-0.329) * (0.993**testage);
/* Female, high crvalue */
when((gender_concept_ID=8532) & (crvalue > 0.7)) gfr=144 * ((crvalue/0.7)**-1.209) * (0.993**testage);
/* Male, low crvalue */
when((gender_concept_ID=8507) & (crvalue <= 0.9)) gfr=141 * ((crvalue/0.9)**-0.411) * (0.993**testage);
/* Male, high crvalue */
when((gender_concept_ID=8507) & (crvalue > 0.9)) gfr=141 * ((crvalue/0.9)**-1.209) * (0.993**testage);
otherwise Error "GFR Calculation - Not Black" Race_concept_ID= gender_concept_ID= testage= crvalue= ;
end;
end;
run;
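For anyone without SAS, the same two formulas can be sketched in Python. This is a rough port for checking numbers, not a validated implementation: the function names are made up, creatinine is assumed to be in mg/dL and age in years, and the coefficients and concept IDs (8507 male, 8532 female, 8516 for the race adjustment) are copied from the SAS snippet above. One deliberate difference is that the MDRD branch here raises on an unknown gender concept, whereas the SAS `else` treats every non-male record as female.

```python
MALE, FEMALE = 8507, 8532  # gender concept IDs used in the SAS snippet
BLACK = 8516               # race concept ID used in the SAS snippet


def egfr_mdrd(cr, age, gender_id, race_id):
    """MDRD estimate; cr assumed in mg/dL, age in years."""
    if gender_id not in (MALE, FEMALE):
        raise ValueError(f"Unexpected gender concept: {gender_id}")
    gfr = 175.0 * cr ** -1.154 * age ** -0.203
    if gender_id == FEMALE:
        gfr *= 0.742
    if race_id == BLACK:
        gfr *= 1.212
    return gfr


def egfr_ckd_epi(cr, age, gender_id, race_id):
    """CKD-EPI estimate with the sex- and race-specific coefficients above."""
    if gender_id == FEMALE:
        threshold, low_exponent = 0.7, -0.329
        coefficient = 166.0 if race_id == BLACK else 144.0
    elif gender_id == MALE:
        threshold, low_exponent = 0.9, -0.411
        coefficient = 163.0 if race_id == BLACK else 141.0
    else:
        raise ValueError(f"Unexpected gender concept: {gender_id}")
    exponent = low_exponent if cr <= threshold else -1.209
    return coefficient * (cr / threshold) ** exponent * 0.993 ** age


def egfr_by_year(cr, age, gender_id, race_id, year):
    """Pick the formula by test year, mirroring the 2011/2012 cutoff."""
    if year <= 2011:
        return egfr_mdrd(cr, age, gender_id, race_id)
    return egfr_ckd_epi(cr, age, gender_id, race_id)


# The same hypothetical patient under each reporting era.
print(egfr_by_year(1.0, 60, FEMALE, 0, 2010))
print(egfr_by_year(1.0, 60, FEMALE, 0, 2015))
```

Running the same patient through both eras shows how much of an apparent change in eGFR over time can come from the formula switch alone.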

Thanks @razzaghih and @Pulver, your insights are great.
We would love to have the pediatric population as well… it will definitely add some additional caveats to the work, but I think it would greatly enhance it.
We are going to work on making our code here shareable and formalized into a network study, and then we can have a call with all interested parties to discuss further :slight_smile:

@Patrick_Ryan GFR in CPRD is ent type 466 (I happen to be working on CPRD data today). Not sure how well populated it is. Creatinine should be fine. Just wanted to put this out there in case someone looks at this in the future.

Hi @Mary_Regina_Boland,
The EHR data from Ajou University Hospital and the Korean national cohort from the claims registry both have creatinine and the other variables for calculating eGFR.

I’m sure you already know that there are many equations for calculating eGFR, and that the equations for Koreans are different from those for Westerners.

Good luck!!


@Mary_Regina_Boland:

Couple of points about the codes you picked:

  • 3009224 EGFR gene mutations found [Identifier] in Blood or Tissue by Molecular genetics method Nominal
  • 3014444 EGFR gene mutations tested for in Blood or Tissue by Molecular genetics method Nominal
  • 36203240 EGFR gene c.2369C>T actual/normal in Plasma cell-free DNA by Molecular genetics method
  • 40758885 EGFR gene exon 19 deletion [Presence] in Blood or Tissue by Molecular genetics method
  • 40758886 EGFR gene c.2156G>C+2155G>A+2155G>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758887 EGFR gene c.2573T>G [Presence] in Blood or Tissue by Molecular genetics method
  • 40758888 EGFR gene c.2582T>A [Presence] in Blood or Tissue by Molecular genetics method
  • 40758889 EGFR gene c.2303G>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758890 EGFR gene c.2369C>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758891 EGFR gene exon 20 insertion [Presence] in Blood or Tissue by Molecular genetics method

The epidermal growth factor receptor gene variants have nothing to do with glomerular filtration rate.
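To see what is left of the original query once the gene codes are removed, the ten concept IDs listed above can be subtracted from the twelve in the posted SQL. A small sketch, with variable names chosen only for illustration:

```python
# The twelve concept IDs from the query posted earlier in the thread.
ORIGINAL_QUERY_IDS = [
    40758886, 40758889, 36203240, 40758890, 40758887, 40758888,
    40758885, 40758891, 3009224, 3014444, 3053283, 3049187,
]

# The ten EGFR gene (epidermal growth factor receptor) concepts listed above.
EGFR_GENE_IDS = {
    3009224, 3014444, 36203240, 40758885, 40758886,
    40758887, 40758888, 40758889, 40758890, 40758891,
}

# Keep only the IDs that are not gene-marker concepts, preserving query order.
remaining_ids = [cid for cid in ORIGINAL_QUERY_IDS if cid not in EGFR_GENE_IDS]
print(remaining_ids)
```

Only 3053283 and 3049187 survive the filter, so ten of the twelve conditions in the original query matched gene tests rather than kidney function.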

In addition to the eGFR for blacks and non-blacks, and the normalized rate per 1.73 m2 (which is the area of skin the average human has) @Patrick_Ryan picked you may want to consider these:

  • 42869913 Glomerular filtration rate/1.73 sq M predicted among males [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (MDRD)
  • 3029829 Glomerular filtration rate/1.73 sq M predicted among females [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (MDRD)
  • 3030104 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (Schwartz)
  • 46236952 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (MDRD)
  • 3004917 Creatinine renal clearance in 8 hour
  • 3005770 Creatinine renal clearance in 24 hour
  • 3006563 Creatinine dialysis fluid clearance
  • 3006873 Creatinine renal clearance in 12 hour
  • 3007659 Creatinine renal clearance/1.73 sq M in 12 hour
  • 3018252 Creatinine renal clearance in 4 hour
  • 3018775 Creatinine renal clearance in 6 hour
  • 3022053 Creatinine renal clearance/1.73 sq M in 4 hour
  • 3022988 Creatinine renal clearance in 2 hour
  • 3025137 Creatinine renal clearance/1.73 sq M in 6 hour
  • 3026583 Creatinine renal clearance/1.73 sq M in 8 hour
  • 3027108 Creatinine renal clearance/1.73 sq M in 24 hour
  • 3032462 Creatinine dialysis fluid clearance/1.73 sq M
  • 3034605 Creatinine renal clearance/1.73 sq M in 2 hour
  • 3035532 Creatinine renal clearance/1.73 sq M in collected for unspecified duration

And then you can calculate it yourself if you pick up serum creatinine

  • 3016723 Creatinine serum/plasma
  • 3032033 Creatinine [Mass or Moles/volume] in Serum or Plasma
  • 3022243 Creatinine [Mass/volume] in Serum or Plasma --pre dialysis

But then you also need gender, age decile, and race (as per @razzaghih, @pulver, or the formulas you pick up on the internet). ATLAS won’t do that for you.

Thanks @Christian_Reich: most databases will not have the genetic test result. Curious to see if any actually have that result…
I’m a little curious about whether actual glomerular filtration rate (GFR) is measured, since in theory it should be measurable. eGFR is the estimated version, which is highly dependent on things like ethnicity, etc. Unfortunately, ethnicity is often poorly captured in EHRs, and therefore the quality of eGFR is probably not great.
Creatinine clearance values would be great if those are recorded (again, my experience with EHRs is that there are probably patients with an eGFR result without a creatinine clearance result, and probably patients with a creatinine clearance result without an eGFR). Therefore, capturing both along with the other values (ethnicity, age, gender, etc.) would be great if doable. Height would also be great if available: height and weight are also ethnicity dependent and also affect the estimated GFR calculations… I think using one standard height is a bit naive.
We currently have a statistical method (@jinghuang) and are working on making it OHDSI-compatible. Obviously ATLAS is not going to function as an actual informatics algorithm; I was just trying to get a ballpark figure of the sample sizes at different institutions, because I was not sure that laboratory results were accurately captured in the OHDSI network… I’m glad to see that they are.

@Mary_Regina_Boland:

Yes, you can measure it. I don’t think it happens that often, but hey, you can query. The way it works is that you challenge the body with a defined amount of creatinine and keep measuring it over time. The LOINC codes for that are:

3042610 Creatinine [Mass/volume] in Serum or Plasma --30 minutes post XXX challenge
3041723 Creatinine [Mass/volume] in Serum or Plasma --1 hour post XXX challenge
3038809 Creatinine [Mass/volume] in Serum or Plasma --1.5 hours post XXX challenge
3038530 Creatinine [Mass/volume] in Serum or Plasma --2 hours post XXX challenge
3040706 Creatinine [Mass/volume] in Serum or Plasma --3 hours post XXX challenge
3038298 Creatinine [Mass/volume] in Serum or Plasma --10 hours post XXX challenge
3042043 Creatinine [Mass/volume] in Serum or Plasma --18 hours post XXX challenge
3040585 Creatinine [Mass/volume] in Serum or Plasma --2 days post XXX challenge
3041199 Creatinine [Mass/volume] in Serum or Plasma --30 minutes pre XXX challenge
3043912 Creatinine [Mass/volume] in Serum or Plasma --4 hours post XXX challenge
3038833 Creatinine [Mass/volume] in Serum or Plasma --5 hours post XXX challenge
etc.

Height: If you find one measurement like that in the data you should be good.
Gender: Should be fine.
Race: Agreed, not easy. If you ask in other countries you don’t have that problem - the European or Asian societies are usually not that diverse.

Just for sake of adding an un-needed distraction -
I wonder whether using an estimate of skin surface area based on height and weight, in place of the “1.73 m2” standard measure, would provide a more accurate eGFR.
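As a rough illustration of that idea, a body surface area (BSA) estimate can be used to convert an eGFR indexed to 1.73 m2 into an absolute rate for the individual. The sketch below uses the Du Bois formula, which is one of several published BSA estimates and is my choice here, not something proposed in this thread. Function names are hypothetical; weight is assumed in kg and height in cm.

```python
def bsa_du_bois(weight_kg, height_cm):
    """Body surface area in m^2 by the Du Bois formula (one of several estimates)."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def deindex_egfr(egfr_per_173, bsa_m2):
    """Convert eGFR in mL/min/1.73 m^2 to an absolute rate in mL/min."""
    return egfr_per_173 * bsa_m2 / 1.73


# Hypothetical patient: 70 kg, 170 cm, reported eGFR of 90 mL/min/1.73 m^2.
bsa = bsa_du_bois(70.0, 170.0)
print(round(bsa, 2), round(deindex_egfr(90.0, bsa), 1))
```

A person whose BSA is exactly 1.73 m2 gets the same number back, so the adjustment only matters for patients who are noticeably smaller or larger than that reference.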
