OHDSI Home | Forums | Wiki | Github

Medicare ETL development

To @amatcho and the rest of the Medicare ETL Development community,

In case some of you may have missed this, Outcomes Insights has created a lite ETL of SEER Medicare in CDM v4 format. The documentation and SAS code is open to the public and can be accessed here:

There are a lot of similarities between SynPuf and SEER Medicare and we are using a lot of the logic documented in our SEER Medicare ETL for our SynPuf ETL. Please keep in mind that this ETL is a “lite” version and does not populate all of the CDM tables. This is used more as a starting point and a general guideline. Specifically, I hope this proves useful for starting @amatcho’s ETL for SEER Medicare.

Side Note: Please forgive me for the novice programming skills.

@ericaVoss et al.:

What is carrier claim? The institution’s claim? They also give 13 diagnoses?

@jenniferduryea: SAS? You are travellling back in time, using proprietary (and pathetically expensive for the value) technology. Well, I guess we can’t ask much for a free contribution. Gift horse and all. :smile:

HAHAHA @Christian_Reich! :joy: I have to agree about SAS! Though I’m probably doing SAS more of a disservice with my atrocious coding skills! I guess there is a method to my madness.

@jenniferduryea:

I was thinking the same thing. In fact, I was contemplating this session for a while. The next two days aren’t good because I am in good old England (8 hours away from you), but what about middle of next week? Let me push out a doodle.

Yes, thank you, I was going too fast and wasn’t thinking.

@jenniferduryea is the one teaching me about this . . .

“The Carrier file (also known as the Physician/Supplier Part B claims file) contains final action fee-for-service claims submitted on a CMS-1500 claim form. Most of the claims are from non-institutional providers, such as physicians, physician assistants, clinical social workers, nurse practitioners. Claims for other providers, such as free-standing facilities are also found in the Carrier file. Examples include independent clinical laboratories, ambulance providers, and free-standing ambulatory surgical centers.” [1]

@Christian_Reich,

Okay, I need to edit this list - here are some new codes that we need. We can discuss at our next meeting.

CONDITION_TYPE

  • Carrier Claims header - 1st position
  • Carrier Claims header - 2nd position
  • Carrier Claims header - 3rd position
  • Carrier Claims header - 4th position
  • Carrier Claims header - 5th position
  • Carrier Claims header - 6th position
  • Carrier Claims header - 7th position
  • Carrier Claims header - 8th position
  • Carrier Claims details - 1st position
  • Carrier Claims details - 2nd position
  • Carrier Claims details - 3rd position
  • Carrier Claims details - 4th position
  • Carrier Claims details - 5th position
  • Carrier Claims details - 6th position
  • Carrier Claims details - 7th position
  • Carrier Claims details - 8th position
  • Carrier Claims details - 9th position
  • Carrier Claims details - 10th position
  • Carrier Claims details - 11th position
  • Carrier Claims details - 12th position
  • Carrier Claims details - 13th position

PS - @jenniferduryea - I think we should change these to DETAILS not HEADERS because otherwise we will be requesting all these codes for the HEADER, which is unlikely .

PROCEDURE_TYPE

  • Outpatient header - 7th position
  • Outpatient header - 8th position
  • Outpatient header - 9th position
  • Outpatient header - 10th position
  • Outpatient header - 11th position
  • Outpatient header - 12th position
  • Outpatient header - 13th position
  • Outpatient header - 14th position
  • Outpatient header - 15th position
  • Outpatient header - 16th position
  • Outpatient header - 17th position
  • Outpatient header - 18th position
  • Outpatient header - 19th position
  • Outpatient header - 20th position
  • Outpatient header - 21th position
  • Outpatient header - 22th position
  • Outpatient header - 23th position
  • Outpatient header - 24th position
  • Outpatient header - 25th position
  • Outpatient header - 26th position
  • Outpatient header - 27th position
  • Outpatient header - 28th position
  • Outpatient header - 29th position
  • Outpatient header - 30th position
  • Outpatient header - 31th position
  • Outpatient header - 32th position
  • Outpatient header - 33th position
  • Outpatient header - 34th position
  • Outpatient header - 35th position
  • Outpatient header - 36th position
  • Outpatient header - 37th position
  • Outpatient header - 38th position
  • Outpatient header - 39th position
  • Outpatient header - 40th position
  • Outpatient header - 41th position
  • Outpatient header - 42th position
  • Outpatient header - 43th position
  • Outpatient header - 44th position
  • Outpatient header - 45th position
  • Carrier Claims header - 1st position
  • Carrier Claims header - 2nd position
  • Carrier Claims header - 3rd position
  • Carrier Claims header - 4th position
  • Carrier Claims header - 5th position
  • Carrier Claims header - 6th position
  • Carrier Claims header - 7th position
  • Carrier Claims header - 8th position
  • Carrier Claims header - 9th position
  • Carrier Claims header - 10th position
  • Carrier Claims header - 11th position
  • Carrier Claims header - 12th position
  • Carrier Claims header - 13th position

@Mark_Danese & @jenniferduryea,

Some follow-ups from Today’s 2/27/15 Meeting:

  1. Drew the 45 PROCEDURE_OCCURRENCE arrows from OP_CLAIMS :expressionless:
  2. Added the PROCEDURE_OCCURRENCE types
  3. Added back in the CONDITION_OCCURRENCE types
  4. Added in the death data.
  5. Uploaded to GitHub

Note I added some DEATH date logic for OP_CLAIMS, many CLM_THRU_DT = 0, if that is true, take the CLM_FROM_DT.

Follow-ups from Today’s 3/9 meeting:

  1. Completed COHORT and PROCEDURE Cost Table
  2. updated document and Rabbit-in-a-hat here.
  3. Action Items:
  • @Mark_Danese to send out a draft of the AMIA abstract tonight to @Patrick_Ryan, et al.
  • @jenniferduryea & @Mark_Danese to pull together COHORT_DEF_NAMES and COHORT_DEF_DESCRIPTIONS for the cohorts
  • @ericaVoss needs to still finish up the arros between CARRIER_CLAIMS and PROCEDURE_COST
  • @ericaVoss needs to add the CONCEPT_IDs given by Christian for the CONDITON_TYPES, PROCEDURE_TYPES, and DOMAINS

Follow-Ups from today’s 3/11 meeting:

  1. Completed VISIT_COST, DRUG COST, and DEVICE EXPOSURE.
  2. Action Items
    • Finish AMIA draft, @Mark_Danese to submit tomorrow
    • @jenniferduryea to provide definition to ESRD.
    • @ericaVoss needs to still finish up the arrows between CARRIER_CLAIMS and PROCEDURE_COST and CARRIER_CLAIMS and DRUG_COST
    • @ericaVoss needs to add the CONCEPT_IDs given by Christian for the CONDITON_TYPES, PROCEDURE_TYPES, and DOMAINS
  • @ericaVoss to add COHORT_DEFINITIONS in provided

Follow-ups from March 12 Meeting:

  1. Completed the following:
  • COHORT_DEFINITIONS entered
  • DEVICE_COST
  • CARE_SITE
  • FACT_RELATIONSHIP
  • DRUG_EXPOSURE (non-RX)
  • Millions of arrows for
    • OUTPATIENT_CLAIMS–>DRUG_EXPOSURE
    • CARRIER_CLAIMS --> DEVICE_COST
    • CARRIER_CLAIMS --> PROCEDURE_COST
    • CARRIER_CLAIMS --> DRUG_COST
  • CARRIER_CLAIM CONDITION_TYPES, PROCEDURE_TYPES
  • Share final Rabbit-in-a-Hat file and WORD file for feedback
  • Action Items
  • @jenniferduryea to provide line process indicator codes
  • AMIA Discussion – schedule?
  • Outstanding Items:
    • Final document clean up
    • Final DOMAIN discussion (e.g. HCPC that shouldn’t end up in PROCEDURE_OCCURRENCE, review how the VOCAB is handling it and if appropriate for our situation)
      • Plan to follow-up on the DOMAIN discussion with Christian either to provide feedback or get thoughts on how to implement.
    • DRG in conditions?? Vocabulary usage (FORUM LINK)
      • We have a reply on the OBSERVATION_CONCEPT_ID, just use DRG.
    • Line Process Indicator Codes (claims accepted or denied) - LINE_PRCSG_IND_CD_1 in CARRIER_CLAIMS need to develop logic to keep or reject records
  • Testing Discussion

Sorry for no updates for a couple of weeks — I was on vacation. So, thanks to @ericaVoss and @jenniferduryea , we now have a pretty solid ETL spec. The next step is to program it, so we can get data to the community.

There are a couple of options:

  1. If you want to do it all yourself, you are free to do so, and can ask Jen and/or Erica for clarification at any time. Use whatever language you know best.
  2. If you want to contribute and do part of the ETL with others, we will have to agree on a language, and a division of labor.

Possible options for option 2 include Java, Python, SQL + database of your choice, and R. Of course, if you want to use Cobol, Perl, Ruby, or something else, you can take option #1 and possibly do it in parallel with someone else (good for QC). Even if you pick something more “mainstream", you may also be doing it in parallel if nobody else is comfortable with it.

The goal for this is to get it done, and to get the data out there, available for everybody. Depending on the person, and how it is done, we may also want to make the ETL code available via GitHub, but this is not absolutely essential. In other words, we would rather get it done fast, than get it done pretty.

So, if you are willing to contribute via option 1 or 2, please email me back. Once I have that list, I think it will be pretty easy to figure out how this will work.

Note: Before we finish, we will also create a test ETL dataset to test the code against. So, if you don’t want to do the ETL, you can contribute by making up data.

Also, I will put this out on the forums, so other who are interested can join. @Frank, @Patrick_Ryan, @Christian_Reich, @wstephens, @lee_evans, @donohara, @aguynamedryan, @mlgleeson

Regards,
Mark

Any progress on the ETL implementation front? @Mark_Danese, did you get any volunteers to move this forward? It came up again this morning in the HERACLES / OLYMPUS workgroup that having the SynPUF data in CDM format would be useful for both the developers who are trying to build out new OHDSI tools and well as for those in the community who’d like to install and play with the OHDSI tools as they come online.

@donohara has volunteered to take this on as soon as he gets some time. (Thanks Don!)

Thanks @donohara ! This will be a huge contribution to the OHDSI
community. Let us know how we can help you turn the ETL spec into a CDM
that we can all benefit from.

Cheers,

Patrick

My data analyst has been reviewing the documentation and we’ve been brainstorming about potential implementations, including a new approach that I started testing for a client in December. We’ll contribute where we can to help the implementation.

Bill

We can meet with you and/or your analyst one-on-one if you want us to go over the details with you guys (or anyone who wants to).

@Mark_Danese; @donohara and friends:

Do you still need help and hands on the table with this?

@donohara will have to reply – I know he has made some progress but I am not sure of the current status.

Hi all - been on the road and offline. Will be posting updates soon.
Don

t