Medicare ETL development

jenniferduryea · February 25, 2015, 11:05pm

To @amatcho and the rest of the Medicare ETL Development community,

In case some of you may have missed this, Outcomes Insights has created a lite ETL of SEER Medicare in CDM v4 format. The documentation and SAS code is open to the public and can be accessed here:

There are a lot of similarities between SynPuf and SEER Medicare and we are using a lot of the logic documented in our SEER Medicare ETL for our SynPuf ETL. Please keep in mind that this ETL is a “lite” version and does not populate all of the CDM tables. This is used more as a starting point and a general guideline. Specifically, I hope this proves useful for starting @amatcho’s ETL for SEER Medicare.

Side Note: Please forgive me for the novice programming skills.

Christian_Reich · February 25, 2015, 11:47pm

@ericaVoss et al.:

What is carrier claim? The institution’s claim? They also give 13 diagnoses?

@jenniferduryea: SAS? You are travellling back in time, using proprietary (and pathetically expensive for the value) technology. Well, I guess we can’t ask much for a free contribution. Gift horse and all.

jenniferduryea · February 26, 2015, 12:51am

HAHAHA @Christian_Reich! I have to agree about SAS! Though I’m probably doing SAS more of a disservice with my atrocious coding skills! I guess there is a method to my madness.

Christian_Reich · February 26, 2015, 1:33am

@jenniferduryea:

I was thinking the same thing. In fact, I was contemplating this session for a while. The next two days aren’t good because I am in good old England (8 hours away from you), but what about middle of next week? Let me push out a doodle.

ericaVoss · February 26, 2015, 2:21am

Yes, thank you, I was going too fast and wasn’t thinking.

ericaVoss · February 26, 2015, 2:25am

@jenniferduryea is the one teaching me about this . . .

“The Carrier file (also known as the Physician/Supplier Part B claims file) contains final action fee-for-service claims submitted on a CMS-1500 claim form. Most of the claims are from non-institutional providers, such as physicians, physician assistants, clinical social workers, nurse practitioners. Claims for other providers, such as free-standing facilities are also found in the Carrier file. Examples include independent clinical laboratories, ambulance providers, and free-standing ambulatory surgical centers.” [1]

ericaVoss · February 27, 2015, 10:31pm

@Christian_Reich,

Okay, I need to edit this list - here are some new codes that we need. We can discuss at our next meeting.

CONDITION_TYPE

Carrier Claims header - 1st position
Carrier Claims header - 2nd position
Carrier Claims header - 3rd position
Carrier Claims header - 4th position
Carrier Claims header - 5th position
Carrier Claims header - 6th position
Carrier Claims header - 7th position
Carrier Claims header - 8th position
Carrier Claims details - 1st position
Carrier Claims details - 2nd position
Carrier Claims details - 3rd position
Carrier Claims details - 4th position
Carrier Claims details - 5th position
Carrier Claims details - 6th position
Carrier Claims details - 7th position
Carrier Claims details - 8th position
Carrier Claims details - 9th position
Carrier Claims details - 10th position
Carrier Claims details - 11th position
Carrier Claims details - 12th position
Carrier Claims details - 13th position

PS - @jenniferduryea - I think we should change these to DETAILS not HEADERS because otherwise we will be requesting all these codes for the HEADER, which is unlikely .

PROCEDURE_TYPE

Outpatient header - 7th position
Outpatient header - 8th position
Outpatient header - 9th position
Outpatient header - 10th position
Outpatient header - 11th position
Outpatient header - 12th position
Outpatient header - 13th position
Outpatient header - 14th position
Outpatient header - 15th position
Outpatient header - 16th position
Outpatient header - 17th position
Outpatient header - 18th position
Outpatient header - 19th position
Outpatient header - 20th position
Outpatient header - 21th position
Outpatient header - 22th position
Outpatient header - 23th position
Outpatient header - 24th position
Outpatient header - 25th position
Outpatient header - 26th position
Outpatient header - 27th position
Outpatient header - 28th position
Outpatient header - 29th position
Outpatient header - 30th position
Outpatient header - 31th position
Outpatient header - 32th position
Outpatient header - 33th position
Outpatient header - 34th position
Outpatient header - 35th position
Outpatient header - 36th position
Outpatient header - 37th position
Outpatient header - 38th position
Outpatient header - 39th position
Outpatient header - 40th position
Outpatient header - 41th position
Outpatient header - 42th position
Outpatient header - 43th position
Outpatient header - 44th position
Outpatient header - 45th position
Carrier Claims header - 1st position
Carrier Claims header - 2nd position
Carrier Claims header - 3rd position
Carrier Claims header - 4th position
Carrier Claims header - 5th position
Carrier Claims header - 6th position
Carrier Claims header - 7th position
Carrier Claims header - 8th position
Carrier Claims header - 9th position
Carrier Claims header - 10th position
Carrier Claims header - 11th position
Carrier Claims header - 12th position
Carrier Claims header - 13th position

ericaVoss · February 27, 2015, 10:48pm

@Mark_Danese & @jenniferduryea,

Some follow-ups from Today’s 2/27/15 Meeting:

Drew the 45 PROCEDURE_OCCURRENCE arrows from OP_CLAIMS
Added the PROCEDURE_OCCURRENCE types
Added back in the CONDITION_OCCURRENCE types
Added in the death data.
Uploaded to GitHub

Note I added some DEATH date logic for OP_CLAIMS, many CLM_THRU_DT = 0, if that is true, take the CLM_FROM_DT.

ericaVoss · March 9, 2015, 9:55pm

Follow-ups from Today’s 3/9 meeting:

Completed COHORT and PROCEDURE Cost Table
updated document and Rabbit-in-a-hat here.
Action Items:

@Mark_Danese to send out a draft of the AMIA abstract tonight to @Patrick_Ryan, et al.
@jenniferduryea & @Mark_Danese to pull together COHORT_DEF_NAMES and COHORT_DEF_DESCRIPTIONS for the cohorts
@ericaVoss needs to still finish up the arros between CARRIER_CLAIMS and PROCEDURE_COST
@ericaVoss needs to add the CONCEPT_IDs given by Christian for the CONDITON_TYPES, PROCEDURE_TYPES, and DOMAINS

ericaVoss · March 11, 2015, 9:50pm

Follow-Ups from today’s 3/11 meeting:

Completed VISIT_COST, DRUG COST, and DEVICE EXPOSURE.
Action Items
- Finish AMIA draft, @Mark_Danese to submit tomorrow
- @jenniferduryea to provide definition to ESRD.
- @ericaVoss needs to still finish up the arrows between CARRIER_CLAIMS and PROCEDURE_COST and CARRIER_CLAIMS and DRUG_COST
- @ericaVoss needs to add the CONCEPT_IDs given by Christian for the CONDITON_TYPES, PROCEDURE_TYPES, and DOMAINS

@ericaVoss to add COHORT_DEFINITIONS in provided

ericaVoss · March 12, 2015, 10:43pm

Follow-ups from March 12 Meeting:

Completed the following:

COHORT_DEFINITIONS entered
DEVICE_COST
CARE_SITE
FACT_RELATIONSHIP
DRUG_EXPOSURE (non-RX)
Millions of arrows for
- OUTPATIENT_CLAIMS–>DRUG_EXPOSURE
- CARRIER_CLAIMS --> DEVICE_COST
- CARRIER_CLAIMS --> PROCEDURE_COST
- CARRIER_CLAIMS --> DRUG_COST
CARRIER_CLAIM CONDITION_TYPES, PROCEDURE_TYPES
Share final Rabbit-in-a-Hat file and WORD file for feedback
Action Items
@jenniferduryea to provide line process indicator codes
AMIA Discussion – schedule?
Outstanding Items:
- Final document clean up
- Final DOMAIN discussion (e.g. HCPC that shouldn’t end up in PROCEDURE_OCCURRENCE, review how the VOCAB is handling it and if appropriate for our situation)
  - Plan to follow-up on the DOMAIN discussion with Christian either to provide feedback or get thoughts on how to implement.
- DRG in conditions?? Vocabulary usage (FORUM LINK)
  - We have a reply on the OBSERVATION_CONCEPT_ID, just use DRG.
- Line Process Indicator Codes (claims accepted or denied) - LINE_PRCSG_IND_CD_1 in CARRIER_CLAIMS need to develop logic to keep or reject records
Testing Discussion

Mark_Danese · April 6, 2015, 11:53pm

Sorry for no updates for a couple of weeks — I was on vacation. So, thanks to @ericaVoss and @jenniferduryea , we now have a pretty solid ETL spec. The next step is to program it, so we can get data to the community.

There are a couple of options:

If you want to do it all yourself, you are free to do so, and can ask Jen and/or Erica for clarification at any time. Use whatever language you know best.
If you want to contribute and do part of the ETL with others, we will have to agree on a language, and a division of labor.

Possible options for option 2 include Java, Python, SQL + database of your choice, and R. Of course, if you want to use Cobol, Perl, Ruby, or something else, you can take option #1 and possibly do it in parallel with someone else (good for QC). Even if you pick something more “mainstream", you may also be doing it in parallel if nobody else is comfortable with it.

The goal for this is to get it done, and to get the data out there, available for everybody. Depending on the person, and how it is done, we may also want to make the ETL code available via GitHub, but this is not absolutely essential. In other words, we would rather get it done fast, than get it done pretty.

So, if you are willing to contribute via option 1 or 2, please email me back. Once I have that list, I think it will be pretty easy to figure out how this will work.

Note: Before we finish, we will also create a test ETL dataset to test the code against. So, if you don’t want to do the ETL, you can contribute by making up data.

Also, I will put this out on the forums, so other who are interested can join. @Frank, @Patrick_Ryan, @Christian_Reich, @wstephens, @lee_evans, @donohara, @aguynamedryan, @mlgleeson

Regards,
Mark

Patrick_Ryan · April 21, 2015, 3:36pm

Any progress on the ETL implementation front? @Mark_Danese, did you get any volunteers to move this forward? It came up again this morning in the HERACLES / OLYMPUS workgroup that having the SynPUF data in CDM format would be useful for both the developers who are trying to build out new OHDSI tools and well as for those in the community who’d like to install and play with the OHDSI tools as they come online.

Mark_Danese · April 21, 2015, 3:52pm

@donohara has volunteered to take this on as soon as he gets some time. (Thanks Don!)

Patrick_Ryan · April 21, 2015, 4:18pm

Thanks @donohara ! This will be a huge contribution to the OHDSI
community. Let us know how we can help you turn the ETL spec into a CDM
that we can all benefit from.

Cheers,

Patrick

wstephens · April 21, 2015, 6:37pm

My data analyst has been reviewing the documentation and we’ve been brainstorming about potential implementations, including a new approach that I started testing for a client in December. We’ll contribute where we can to help the implementation.

Bill

Mark_Danese · April 21, 2015, 6:47pm

We can meet with you and/or your analyst one-on-one if you want us to go over the details with you guys (or anyone who wants to).

Christian_Reich · May 5, 2015, 5:11pm

@Mark_Danese; @donohara and friends:

Do you still need help and hands on the table with this?

Mark_Danese · May 5, 2015, 11:49pm

@donohara will have to reply – I know he has made some progress but I am not sure of the current status.

donohara · May 6, 2015, 10:21pm

Hi all - been on the road and offline. Will be posting updates soon.
Don