
Does your dataset have eGFR results?

Hi @Mary_Regina_Boland - at PEDSnet we created a query (written in postgres) that calculates eGFR from creatinine. I’m not sure if you will find it useful but let me know if you’d like to take a look.

If your studies include historic data you may want to consider this-

One catch that I came across when speaking with clinicians here is that due to changes in reporting, around 2012 the calculations used for eGFR went from MDRD methodology to CKD-EPI.

Here’s the SAS code snippet that I use to make the calculations both ways. I implemented the most precise formulae that I could find. I am sorry that I don’t know how to preserve my indentation here.

Best,
Gerry

/* MDRD calculation (2011 and earlier) */
if year(CrDate) <= 2011
then
do;
    if gender_concept_ID = 8507 /* Male */
        then gfr = 175 * (crvalue**-1.154) * (testage**-0.203);         /* Male */
        else gfr = 175 * (crvalue**-1.154) * (testage**-0.203) * 0.742; /* Female */
    if race_concept_ID = 8516 then gfr = gfr * 1.212; /* Adjust if African American */
end;

/* CKD-EPI calculations (2012 and beyond) */
else
do;
    if (race_concept_ID = 8516) /* African American */
    then
        select;
            /* Female, low crvalue */
            when ((gender_concept_ID = 8532) & (crvalue <= 0.7)) gfr = 166 * ((crvalue/0.7)**-0.329) * (0.993**testage);
            /* Female, high crvalue */
            when ((gender_concept_ID = 8532) & (crvalue > 0.7)) gfr = 166 * ((crvalue/0.7)**-1.209) * (0.993**testage);
            /* Male, low crvalue */
            when ((gender_concept_ID = 8507) & (crvalue <= 0.9)) gfr = 163 * ((crvalue/0.9)**-0.411) * (0.993**testage);
            /* Male, high crvalue */
            when ((gender_concept_ID = 8507) & (crvalue > 0.9)) gfr = 163 * ((crvalue/0.9)**-1.209) * (0.993**testage);
            otherwise error "GFR Calculation - Black" race_concept_ID= gender_concept_ID= testage= crvalue= ;
        end;
    else /* Not African American */
        select;
            /* Female, low crvalue */
            when ((gender_concept_ID = 8532) & (crvalue <= 0.7)) gfr = 144 * ((crvalue/0.7)**-0.329) * (0.993**testage);
            /* Female, high crvalue */
            when ((gender_concept_ID = 8532) & (crvalue > 0.7)) gfr = 144 * ((crvalue/0.7)**-1.209) * (0.993**testage);
            /* Male, low crvalue */
            when ((gender_concept_ID = 8507) & (crvalue <= 0.9)) gfr = 141 * ((crvalue/0.9)**-0.411) * (0.993**testage);
            /* Male, high crvalue */
            when ((gender_concept_ID = 8507) & (crvalue > 0.9)) gfr = 141 * ((crvalue/0.9)**-1.209) * (0.993**testage);
            otherwise error "GFR Calculation - Not Black" race_concept_ID= gender_concept_ID= testage= crvalue= ;
        end;
end;
run;
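
For anyone who does not have SAS at hand, here is a minimal Python sketch of the same two calculations. It is only a transcription under assumptions: the constants and concept IDs are the ones in the SAS snippet above, creatinine is assumed to be in mg/dL and age in years, and the function names (`egfr_mdrd`, `egfr_ckd_epi`, `egfr_by_year`) are made up for this illustration.

```python
MALE, FEMALE = 8507, 8532   # gender concept IDs used in the SAS snippet
BLACK = 8516                # race concept ID used in the SAS snippet


def egfr_mdrd(cr, age, gender, race):
    """MDRD estimate (mL/min/1.73 m2); cr assumed in mg/dL, age in years."""
    gfr = 175 * cr ** -1.154 * age ** -0.203
    if gender != MALE:
        # As in the SAS code, anything that is not male gets the female factor.
        gfr *= 0.742
    if race == BLACK:
        gfr *= 1.212
    return gfr


def egfr_ckd_epi(cr, age, gender, race):
    """CKD-EPI estimate using the piecewise constants from the SAS snippet."""
    if gender == FEMALE:
        kappa, low_exponent = 0.7, -0.329
        scale = 166 if race == BLACK else 144
    elif gender == MALE:
        kappa, low_exponent = 0.9, -0.411
        scale = 163 if race == BLACK else 141
    else:
        # Mirrors the SAS "otherwise error ..." branch.
        raise ValueError(f"unsupported gender concept: {gender}")
    exponent = low_exponent if cr <= kappa else -1.209
    return scale * (cr / kappa) ** exponent * 0.993 ** age


def egfr_by_year(cr, age, gender, race, year):
    """Choose the formula by result year, as the SAS code does (<= 2011: MDRD)."""
    if year <= 2011:
        return egfr_mdrd(cr, age, gender, race)
    return egfr_ckd_epi(cr, age, gender, race)
```

Calling `egfr_by_year(1.0, 60, FEMALE, 0, 2010)` takes the MDRD branch, and the same inputs with `year=2015` take the CKD-EPI branch, so a single patient's series can show a step around the switch that has nothing to do with kidney function.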

thanks @razzaghih and @Pulver your insights are great.
We would love to have the pediatric population as well…it will definitely add some additional caveats to the work, but i think would greatly enhance it.
We are going to work on making our code here shareable, and formalized into a network study, and then we can have a call with all interested parties to discuss further :slight_smile:

@Patrick_Ryan GFR in CPRD is ent type 466 (happen to be working on CPRD data today). not sure how well populated it is. creatinine should be fine. just wanted to put this out there in case someone looks at this in the future.

Hi @Mary_Regina_Boland,
The EHR data from Ajou university hospital and Korean National cohort from claim registry both have creatinine and other variables for calculating eGFR.

I’m sure you already know that there are many equations for calculating eGFR, and the equations for Koreans are different from those for Westerners.

Good luck!!

@Mary_Regina_Boland:

Couple of points about the codes you picked:

  • 3009224 EGFR gene mutations found [Identifier] in Blood or Tissue by Molecular genetics method Nominal
  • 3014444 EGFR gene mutations tested for in Blood or Tissue by Molecular genetics method Nominal
  • 36203240 EGFR gene c.2369C>T actual/normal in Plasma cell-free DNA by Molecular genetics method
  • 40758885 EGFR gene exon 19 deletion [Presence] in Blood or Tissue by Molecular genetics method
  • 40758886 EGFR gene c.2156G>C+2155G>A+2155G>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758887 EGFR gene c.2573T>G [Presence] in Blood or Tissue by Molecular genetics method
  • 40758888 EGFR gene c.2582T>A [Presence] in Blood or Tissue by Molecular genetics method
  • 40758889 EGFR gene c.2303G>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758890 EGFR gene c.2369C>T [Presence] in Blood or Tissue by Molecular genetics method
  • 40758891 EGFR gene exon 20 insertion [Presence] in Blood or Tissue by Molecular genetics method

The epidermal growth factor receptor gene variants have nothing to do with glomerular filtration rate.

In addition to the eGFR for blacks and non-blacks, and the normalized rate per 1.73 m2 (which is the area of skin the average human has) @Patrick_Ryan picked you may want to consider these:

  • 42869913 Glomerular filtration rate/1.73 sq M predicted among males [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (MDRD)
  • 3029829 Glomerular filtration rate/1.73 sq M predicted among females [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (MDRD)
  • 3030104 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum or Plasma by Creatinine-based formula (Schwartz)
  • 46236952 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (MDRD)
  • 3004917 Creatinine renal clearance in 8 hour
  • 3005770 Creatinine renal clearance in 24 hour
  • 3006563 Creatinine dialysis fluid clearance
  • 3006873 Creatinine renal clearance in 12 hour
  • 3007659 Creatinine renal clearance/1.73 sq M in 12 hour
  • 3018252 Creatinine renal clearance in 4 hour
  • 3018775 Creatinine renal clearance in 6 hour
  • 3022053 Creatinine renal clearance/1.73 sq M in 4 hour
  • 3022988 Creatinine renal clearance in 2 hour
  • 3025137 Creatinine renal clearance/1.73 sq M in 6 hour
  • 3026583 Creatinine renal clearance/1.73 sq M in 8 hour
  • 3027108 Creatinine renal clearance/1.73 sq M in 24 hour
  • 3032462 Creatinine dialysis fluid clearance/1.73 sq M
  • 3034605 Creatinine renal clearance/1.73 sq M in 2 hour
  • 3035532 Creatinine renal clearance/1.73 sq M in collected for unspecified duration

And then you can calculate it yourself if you pick up serum creatinine

  • 3016723 Creatinine serum/plasma
  • 3032033 Creatinine [Mass or Moles/volume] in Serum or Plasma
  • 3022243 Creatinine [Mass/volume] in Serum or Plasma --pre dialysis

But then you also need gender, age decile and race (as per @razzaghih, @pulver or the formulas you pick up on the internet). Atlas won’t do that for you.
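
One practical wrinkle when picking up serum creatinine yourself: 3032033 is "Mass or Moles/volume", so values may arrive in mixed units, while the MDRD and CKD-EPI constants expect mg/dL. A minimal Python sketch of the filtering and unit normalisation follows. The row layout (a dict with a plain-text `unit` key) and the function name are assumptions for illustration, not the CDM's actual column set; the concept IDs are the three listed above.

```python
# Serum/plasma creatinine concept IDs listed in the post above.
CREATININE_CONCEPTS = {3016723, 3032033, 3022243}

UMOL_PER_L_PER_MG_PER_DL = 88.4  # 1 mg/dL creatinine = 88.4 umol/L


def creatinine_mg_dl(row):
    """Return creatinine in mg/dL, or None if the row is unusable.

    `row` is a hypothetical dict with the keys measurement_concept_id,
    value_as_number and unit (a plain string, assumed for this sketch).
    """
    if row["measurement_concept_id"] not in CREATININE_CONCEPTS:
        return None
    value = row["value_as_number"]
    if value is None or value <= 0:
        return None
    unit = row["unit"]
    if unit == "mg/dL":
        return value
    if unit == "umol/L":
        return value / UMOL_PER_L_PER_MG_PER_DL
    return None  # unknown unit: skip the row rather than guess
```

Rows with an unrecognised unit are dropped rather than guessed at, since a µmol/L value read as mg/dL would produce an eGFR close to zero.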

thanks @Christian_Reich: most databases will not have the genetic test result - curious to see if any actually have that result…
i’m a little curious about whether or not actual Glomerular filtration rate (GFR) is measured - since in theory it should be measurable. eGFR is the estimated version, which is highly dependent on things like ethnicity, etc. - unfortunately ethnicity is often poorly captured in EHRs and therefore the quality of eGFR is probably not great.
Creatinine clearance values would be great if those are recorded (again my experience with EHRs is that there are probably patients with an eGFR result without a creatinine clearance result and probably patients with a creatinine clearance result without an eGFR) - therefore capturing both along with the other values: ethnicity, age, gender, etc. would be great if doable. Height would also be great if that is available - height and weight are also ethnicity dependent and also affect the estimated GFR calculations…I think using 1 standard height is a bit naive
We have a statistical method currently (@jinghuang), and are working on making that OHDSI-compatible… obviously ATLAS is not going to function as an actual informatics algorithm - just trying to get a ballpark figure of the sample sizes at different institutions because I was not sure that laboratory results were accurately captured in the OHDSI network… I’m glad to see that they are

@Mary_Regina_Boland:

Yes, you can measure it. I don’t think it happens that often, but hey, you can query. The way it works is that you challenge the body with a defined amount of creatinine and keep measuring it over time. The LOINC codes for that are:

3042610 Creatinine [Mass/volume] in Serum or Plasma --30 minutes post XXX challenge
3041723 Creatinine [Mass/volume] in Serum or Plasma --1 hour post XXX challenge
3038809 Creatinine [Mass/volume] in Serum or Plasma --1.5 hours post XXX challenge
3038530 Creatinine [Mass/volume] in Serum or Plasma --2 hours post XXX challenge
3040706 Creatinine [Mass/volume] in Serum or Plasma --3 hours post XXX challenge
3038298 Creatinine [Mass/volume] in Serum or Plasma --10 hours post XXX challenge
3042043 Creatinine [Mass/volume] in Serum or Plasma --18 hours post XXX challenge
3040585 Creatinine [Mass/volume] in Serum or Plasma --2 days post XXX challenge
3041199 Creatinine [Mass/volume] in Serum or Plasma --30 minutes pre XXX challenge
3043912 Creatinine [Mass/volume] in Serum or Plasma --4 hours post XXX challenge
3038833 Creatinine [Mass/volume] in Serum or Plasma --5 hours post XXX challenge
etc.

Height: If you find one measurement like that in the data you should be good.
Gender: Should be fine.
Race: Agreed, not easy. If you ask in other countries you don’t have that problem - the European or Asian societies are usually not that diverse.

Just for sake of adding an un-needed distraction -
I wonder whether using an estimate of skin surface area based on height and weight, in place of the “1.73 m2” standard measure, would provide a more accurate eGFR.
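
To make the idea concrete, here is a minimal Python sketch that estimates body surface area from height and weight and uses it to convert an indexed eGFR (mL/min/1.73 m2) into an absolute value (mL/min). The Du Bois formula is one common choice that I have swapped in here; nobody in this thread proposed a specific formula, the function names are invented for illustration, and whether the result is actually more accurate is exactly the open question.

```python
def bsa_du_bois(height_cm, weight_kg):
    """Body surface area in m2 by the Du Bois formula."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def deindexed_gfr(egfr_per_173, height_cm=None, weight_kg=None):
    """Convert mL/min/1.73 m2 to mL/min using the patient's estimated BSA.

    When height or weight is missing, the value is returned unchanged,
    which is the same as assuming the standard 1.73 m2.
    """
    if height_cm is None or weight_kg is None:
        return egfr_per_173
    return egfr_per_173 * bsa_du_bois(height_cm, weight_kg) / 1.73
```

For a person of 170 cm and 70 kg the estimated surface area comes out at roughly 1.8 m2, so the adjustment is small for average-sized adults and larger at the extremes.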

thanks @Christian_Reich very useful and yes @Pulver if we have height, weight, or BMI we can get a little more accurate with estimating the eGFR.
If we do not have height, weight, etc. then perhaps we can use race/ethnicity proxies - but again there are issues with that…but i agree with several of you that have noted that using one skin surface measure for all datasets in the world is not a good approach

@Mary_Regina_Boland:

Good idea by @Pulver, but the point here is not to use the actual skin surface. In other words, it is not important if a person is fat or skinny. The thing is that the physiological eGFR is proportional to the square of height, so they use the 1.73 m2 as an artificial benchmark. Most eGFR calculators don’t even ask for height, only for children.

Hi Mary, hope all is well! I also want to add that, from a clinical epi
perspective, the eGFR estimates from MDRD, CKD-EPI, and Cockcroft-Gault
don’t always match. Not sure if the estimation methods are all represented
in the data model (I see MDRD above).

I think this goes back to whether you want to estimate directly from Cr
which is a direct measurement.

Best,

Kye

A lot of eGFR calculators don’t ask for height, but then they make adjustments based on race/ethnicity, etc. - so there might be some height adjustment baked into those formulas (even if they don’t adjust directly for height)
Better to have creatinine clearance, height and race/ethnicity - then the estimated GFR would be closer to the ‘true’ GFR… however, it’s important to note that all of the eGFRs are estimated… better to use the ‘raw’ creatinine clearance values - unless the actual challenge data is obtainable (which would be ideal)…

@Mary_Regina_Boland:

Well, it is for children: http://www.calculator.net/gfr-calculator.html.

For adults the variation in height is not that great, so most places don’t bother. It’s not that exact a measurement anyway. It gives you corridors of renal impairment:

Kidney damage stage   Description                                       Estimated GFR (mL/min/1.73 m2)
1                     Normal or minimal kidney damage with normal GFR   90+
2                     Mild decrease in GFR                              60-89
3                     Moderate decrease in GFR                          30-59
4                     Severe decrease in GFR                            15-29
5                     Kidney failure                                    <15
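
If you want those corridors as a derived variable, the stage bands listed above can be sketched as a small Python function. The function name is invented for illustration, and the cut-offs are simply the ones in the list.

```python
def ckd_stage(egfr):
    """Map an eGFR in mL/min/1.73 m2 to the stage bands listed above."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5
```

Using lower bounds only means a calculated value such as 89.6 lands in stage 2 instead of falling into the gap between "60-89" and "90+".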

@Christian_Reich
So by coalescing actual height with a default of 1.73m, we’d get slightly more accurate estimate for those subjects for whom we have height and be no worse off for the others.
Thanks,
Gerry

agreed - use actual height when possible otherwise use some default height…

I created a package that runs any set of phenotypes defined on the public atlas server. (using utilities from Marc and Martijn). I made it a while ago for George’s cancer study but it can be re-used for other phenotypes as well.

So today I took Patrick’s atlas definition (the ID of it), added it to a CSV file, and ran the package maintenance code. It can produce the estimates with no SQL coding or rendering needed.

I suggest that any site use this package to get cohort counts (and please report back in this forum on how the package worked).

Here is the package full URL and readme instructions:

The output file looks like this: (with two previous phenotypes shown only)

The settings file now has egfr phenotype. See it here: StudyProtocolSandbox/CohortsToCreate.csv at master · OHDSI/StudyProtocolSandbox · GitHub

So our network already has a nice mechanism for quickly generating cohort estimates for multiple phenotypes. Use that package. I am planning improvements to compute “table 1” for each phenotype using Feature Extraction package.

A couple of quick comments from a clinical perspective here (disclaimer: internist, not a nephrologist)

  1. RE: measurement of true GFR. Rarely done. I’ve only seen it done in pre- kidney transplant evaluations. Uptodate mentions it might be done in some areas of chemotherapy due to narrow therapeutic windows of a particular drug, I’ve not seen it used in that context.

  2. RE: eGFR. I believe an assumption in all equations is that the creatinine is in steady state. If you’re going to use eGFR measures in a study, you’d want to parameterize any particular measure to ensure you aren’t evaluating this in the context of a kidney injury and dynamically changing renal function.

Checking a Cr measurement before and after (with a suitable time delta) to ensure you’re in a ‘steady state’ would be a way to implement this in practice?
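
A rough Python sketch of what such a before-and-after check might look like. The 90-day window and the 20% tolerance are placeholders that I made up for illustration, not clinically validated thresholds, and the function name and tuple layout are likewise assumptions.

```python
from datetime import date, timedelta


def is_steady_state(index, measurements, window_days=90, max_rel_change=0.2):
    """Check that creatinine is stable around an index measurement.

    index: (date, value) tuple for the measurement of interest.
    measurements: list of (date, value) tuples for the same patient.
    Requires at least one value before and one after the index date,
    both inside the window, and every such value within max_rel_change
    of the index value.
    """
    index_date, index_value = index
    window = timedelta(days=window_days)
    before = [v for d, v in measurements if index_date - window <= d < index_date]
    after = [v for d, v in measurements if index_date < d <= index_date + window]
    if not before or not after:
        return False  # cannot confirm stability without neighbours

    def close(value):
        return abs(value - index_value) / index_value <= max_rel_change

    return all(close(v) for v in before + after)


# Example: three results about 50 days apart, all within 10% of each other.
history = [(date(2015, 1, 10), 1.0), (date(2015, 3, 1), 1.1), (date(2015, 4, 20), 1.05)]
stable = is_steady_state(history[1], history)
```

Returning False when a neighbouring measurement is missing is a conservative choice; with sparse EHR data it will exclude many results, so a study might prefer to flag those cases separately rather than drop them.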

This assumes the research use case does really need eGFR.

I do see a reasonable count in Stanford Stride v6 but none in v7 - I can ask around as to why this is if there’s something moving forward…

Evan

thanks @Evan_Minty: all good points.

I agree that true GFR is probably rarely done - however, the algorithm will probably extract it if it is present (again this might only be useful for data coming from cancer centers). I do believe the context of the measurements is important. Also regarding steady state, that may be tricky in EHRs where there is often a degree of missing-ness. However, I think it will be possible to design an algorithm that addresses this issue. We are working on finalizing the study - and hope to launch in the near future…so if you could find out about increased data accessibility in the interim that would be great.

Thanks!

thanks so much @Vojtech_Huser very useful!
