
Medicare ETL development

Friends:

Did we agree on anything? I need clear marching orders and don’t want to cobble them together from the comment cascade. Could somebody please summarize?

You meant the other thread on procedures, right?

Yes.

Just a reminder that we will touch base as a group at noon (Pacific) today. We can summarize where the mapping to the CDM stands (Erica, Jen, Amy, Michelle), and also talk about loading the data and planning for the ETL (Ryan, Lee, Don, and Bill).

I’ve updated the Rabbit-In-A-Hat file on GitHub and generated a DRAFT document:


I had two follow-ups:

(1) What gets used for RACE_CONCEPT_ID/RACE_SOURCE_VALUE/RACE_SOURCE_CONCEPT_ID and for ETHNICITY_CONCEPT_ID/ETHNICITY_SOURCE_VALUE/ETHNICITY_SOURCE_CONCEPT_ID?

For the RACE_CONCEPT_ID and ETHNICITY_CONCEPT_ID we would use the following:

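-- Valid concepts available for RACE_CONCEPT_ID
SELECT *
FROM CONCEPT c
WHERE c.vocabulary_id = 'Race'
AND c.invalid_reason IS NULL;

-- Valid concepts available for ETHNICITY_CONCEPT_ID
SELECT *
FROM CONCEPT c
WHERE c.vocabulary_id = 'Ethnicity'
AND c.invalid_reason IS NULL;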

Obviously we will just use BENE_RACE_CD for RACE_SOURCE_VALUE and ETHNICITY_SOURCE_VALUE.

However, I do not think we will populate the RACE_SOURCE_CONCEPT_ID and ETHNICITY_SOURCE_CONCEPT_ID. The documentation does not call out specific lookups. Instead we will take these lookups and put them in the SOURCE_TO_CONCEPT_MAP.
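To make the plan above concrete, here is a minimal Python sketch of how SOURCE_TO_CONCEPT_MAP-style rows for BENE_RACE_CD could drive the PERSON race and ethnicity fields. Everything in it is an assumption for illustration: the row layout, the "BENE_RACE_CD" source vocabulary label, the function names, and especially the code meanings and concept IDs, which should be checked against the DE-SynPUF codebook and the two CONCEPT queries above before anyone relies on them.

```python
# Hypothetical SOURCE_TO_CONCEPT_MAP rows for BENE_RACE_CD.
# Code meanings and target concept IDs are assumptions to verify against
# the DE-SynPUF codebook and the Race/Ethnicity CONCEPT queries.
SOURCE_TO_CONCEPT_MAP = [
    # (source_code, source_vocabulary_id, target_vocabulary_id, target_concept_id)
    ("1", "BENE_RACE_CD", "Race", 8527),           # assumed: White
    ("2", "BENE_RACE_CD", "Race", 8516),           # assumed: Black
    ("5", "BENE_RACE_CD", "Ethnicity", 38003563),  # assumed: Hispanic
]


def lookup(source_code, target_vocabulary_id):
    """Return the mapped concept ID, or 0 when no mapping row exists."""
    for code, _src_vocab, tgt_vocab, concept_id in SOURCE_TO_CONCEPT_MAP:
        if code == source_code and tgt_vocab == target_vocabulary_id:
            return concept_id
    return 0


def person_race_ethnicity(bene_race_cd):
    """Build the race/ethnicity fields of a PERSON row from BENE_RACE_CD."""
    code = str(bene_race_cd).strip()
    return {
        "race_concept_id": lookup(code, "Race"),
        "race_source_value": code,
        "ethnicity_concept_id": lookup(code, "Ethnicity"),
        "ethnicity_source_value": code,
    }
```

Codes without a mapping row fall back to concept ID 0, and the raw BENE_RACE_CD is carried into both source value fields, as described above. The source concept ID fields are left out on purpose, since we are not planning to populate them.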

Centers for Medicare and Medicaid Services (CMS) Linkable 2008–2010 Medicare Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF)


(2) County Lookup (with Mark)

For BENE_COUNTY_CD we will use the SSA county codes (http://www.resdac.org/cms-data/variables/County-Code) as the lookup for the county code:

THIS CODE SPECIFIES THE SSA CODE FOR THE COUNTY OF RESIDENCE OF
THE BENEFICIARY. EACH STATE HAS A SERIES OF CODES BEGINNING WITH ‘000’
FOR EACH COUNTY WITHIN THAT STATE. CERTAIN CITIES WITHIN THAT STATE
HAVE THEIR OWN CODE. COUNTY CODES MUST BE COMBINED WITH STATE
CODES IN ORDER TO LOCATE THE SPECIFIC COUNTY. THE CODING SYSTEM IS
THE SSA SYSTEM, NOT THE FEDERAL INFORMATION PROCESSING STANDARD
(FIPS).
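Since the county code only identifies a county once it is combined with the state code, a lookup key could be built roughly as follows. This is a sketch under assumptions: the two-digit state width, the function names, and the placeholder lookup entries are all hypothetical, and the real table would be loaded from the SSA code list.

```python
def ssa_county_key(state_code, county_code):
    """Combine an SSA state code and county code into one lookup key.

    Assumes a 2-digit state code and a 3-digit county code, zero-padded.
    """
    state = str(state_code).strip().zfill(2)
    county = str(county_code).strip().zfill(3)
    return state + county


# Hypothetical lookup table keyed by the combined code.
# The entries are placeholders, not real SSA assignments.
SSA_COUNTY_LOOKUP = {
    "99001": "Placeholder County A",
    "99002": "Placeholder County B",
}


def county_name(state_code, county_code):
    """Return the county name for a state/county pair, or None if unknown."""
    return SSA_COUNTY_LOOKUP.get(ssa_county_key(state_code, county_code))
```

Zero-padding matters here because the codes may arrive as integers or as strings with stripped leading zeros, depending on how the files are loaded.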

Just a reminder that the meeting on Monday will be for a more detailed discussion of the ETL process. If @lee_evans, @Frank, @aguynamedryan, @donohara, and @wstephens can make it, that would be helpful. (Others are certainly more than welcome.)

The main discussion will be to start thinking about how we might take advantage of everyone’s skills and write an ETL process for the Medicare data.

Thanks for the reminder, I will definitely be attending.

Here is a quick-and-dirty SynPUF data viewer. It will be up for today’s call:

http://54.213.235.187:5000/

Hi @Mark_Danese

Unfortunately I wasn’t able to attend the call today. As mentioned on previous calls, I think my contribution can be to help out with the hosting of the CMS synthetic data and associated tools.

Lee.

This is VERY nice.

I love how you can see all of the claims (across all files) for one patient id. Very cool.

Some Follow-Ups from Today’s 2/25/15 Meeting:

  1. I added all the CONDITION_TYPES we needed for CARRIER_CLAIMS, INPATIENT_CLAIMS, and OUTPATIENT_CLAIMS.
  2. Put our updated Rabbit-In-A-Hat file and DOC on GitHub.

@Christian_Reich,

We would like to be able to tell the difference between different types of claims on the CONDITION_OCCURRENCE table:

  • INPATIENT_CLAIMS
  • OUTPATIENT_CLAIMS
  • CARRIER_CLAIMS

We want to be able to have a TYPE for CARRIER_CLAIMS just like we do IP and OP. What are your thoughts? If you need more information we can chat about it.

We are thinking we need something like this. We could maybe generalize the titles.

  • Carrier Claims header - 1st position
  • Carrier Claims header - 2nd position
  • Carrier Claims header - 3rd position
  • Carrier Claims header - 4th position
  • Carrier Claims header - 5th position
  • Carrier Claims header - 6th position
  • Carrier Claims header - 7th position
  • Carrier Claims header - 8th position

  • Carrier Claims details - 1st position
  • Carrier Claims details - 2nd position
  • Carrier Claims details - 3rd position
  • Carrier Claims details - 4th position
  • Carrier Claims details - 5th position
  • Carrier Claims details - 6th position
  • Carrier Claims details - 7th position
  • Carrier Claims details - 8th position
  • Carrier Claims details - 9th position
  • Carrier Claims details - 10th position
  • Carrier Claims details - 11th position
  • Carrier Claims details - 12th position
  • Carrier Claims details - 13th position
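If it helps when requesting the new concepts, the proposed type names can be generated rather than typed by hand. This is only a sketch: the function names are hypothetical, and the counts (8 header positions, 13 detail positions) simply mirror the lists above.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def carrier_condition_types(header_positions=8, detail_positions=13):
    """List the proposed CONDITION_TYPE names for CARRIER_CLAIMS."""
    names = [
        f"Carrier Claims header - {ordinal(i)} position"
        for i in range(1, header_positions + 1)
    ]
    names += [
        f"Carrier Claims details - {ordinal(i)} position"
        for i in range(1, detail_positions + 1)
    ]
    return names
```

With the default arguments this yields 21 names, in the same order as the two lists above.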

@ericaVoss - Instead of “Carrier Claims - #th position”, I believe we want “Carrier Claims Detail - #th position” up to position 13. This will more closely follow the header/detail format of Inpatient vs Outpatient.

To @amatcho and the rest of the Medicare ETL Development community,

In case some of you missed this, Outcomes Insights has created a lite ETL of SEER Medicare in CDM v4 format. The documentation and SAS code are open to the public and can be accessed here:

There are a lot of similarities between SynPuf and SEER Medicare and we are using a lot of the logic documented in our SEER Medicare ETL for our SynPuf ETL. Please keep in mind that this ETL is a “lite” version and does not populate all of the CDM tables. This is used more as a starting point and a general guideline. Specifically, I hope this proves useful for starting @amatcho’s ETL for SEER Medicare.

Side Note: Please forgive me for the novice programming skills.

@ericaVoss et al.:

What is a carrier claim? The institution’s claim? They also give 13 diagnoses?

@jenniferduryea: SAS? You are travelling back in time, using proprietary (and pathetically expensive for the value) technology. Well, I guess we can’t ask much for a free contribution. Gift horse and all. :smile:

HAHAHA @Christian_Reich! :joy: I have to agree about SAS! Though I’m probably doing SAS more of a disservice with my atrocious coding skills! I guess there is a method to my madness.

@jenniferduryea:

I was thinking the same thing. In fact, I was contemplating this session for a while. The next two days aren’t good because I am in good old England (8 hours away from you), but what about middle of next week? Let me push out a doodle.

Yes, thank you, I was going too fast and wasn’t thinking.

@jenniferduryea is the one teaching me about this . . .

“The Carrier file (also known as the Physician/Supplier Part B claims file) contains final action fee-for-service claims submitted on a CMS-1500 claim form. Most of the claims are from non-institutional providers, such as physicians, physician assistants, clinical social workers, nurse practitioners. Claims for other providers, such as free-standing facilities are also found in the Carrier file. Examples include independent clinical laboratories, ambulance providers, and free-standing ambulatory surgical centers.” [1]
