I’m in the process of doing the mimicIII ETL using this project. I lost my session in the middle of the actual ETL step, but I think it ran to completion. There’s a post test you can run to check your ETL results, but I can’t tell if the issues it reports are critical or not
[dteuser@dtk8scentos02 mimic-omop]$ psql “$MIMIC” -f “etl/check_etl.sql”
psql:etl/check_etl.sql:1: unrecognized value “” for “ECHO”
Available values are: none, errors, queries, all.
Output format is unaligned.
Pager usage is off.
psql:etl/check_etl.sql:15: etl/StandardizedVocabularies/CONCEPT/check_etl.sql: No such file or directory
psql:etl/check_etl.sql:16: etl/StandardizedVocabularies/COHORT_DEFINITION/check_etl.sql: No such file or directory
psql:etl/check_etl.sql:17: etl/StandardizedVocabularies/ATTRIBUTE_DEFINITION/check_etl.sql: No such file or directory
psql:etl/check_etl.sql:18: etl/StandardizedDerivedElements/COHORT_ATTRIBUTE/check_etl.sql: No such file or directory
psql:etl/check_etl.sql:19: etl/StandardizedHealthSystemDataTables/CARE_SITE/check_etl.sql: No such file or directory
1…4
ok 1 - PERSON – number patients match
ok 2 - PERSON – gender distribution matches
ok 3 - PERSON – date of birth year distributions match
ok 4 - PERSON – no births after deaths
1…1
not ok 1 - number of unique patients who die in the database
Failed test 1: “number of unique patients who die in the database”
Results differ beginning at row 1:
have: (15759)
want: (14849)
Looks like you failed 1 test of 1
1…7
ok 1 - Visit_occurrence table – same number admission
ok 2 - number of hospital admissions who die in-hospital match
ok 3 - Visit_occurrence table – same distribution adm
ok 4 - Visit_occurrence table – distribution admit source value
ok 5 - Visit_occurrence table – repartition discharge_to_source_value
ok 6 - Visit_occurrence table – links checker
ok 7 - Visit_occurrence table – start_date > end_date
1…1
ok 1 - OBSERVATION_PERIOD – row count matches visit_occurrence
1…7
not ok 1 - VISIT_DETAIL – test same number transfers
Failed test 1: “VISIT_DETAIL – test same number transfers”
Results differ beginning at row 1:
have: NULL
want: (184142,19)
ok 2 - VISIT_DETAIL – test visit_source_value and visit_source_concept_id match
not ok 3 - VISIT_DETAIL – test admitting_source_concept_id and admitting_source_value match
Failed test 3: “VISIT_DETAIL – test admitting_source_concept_id and admitting_source_value match”
Results differ beginning at row 1:
have: (0)
want: (139516)
not ok 4 - VISIT_DETAIL – test patients number in visit_detail/icustays
Failed test 4: “VISIT_DETAIL – test patients number in visit_detail/icustays”
Results differ beginning at row 1:
have: (0,0)
want: (46476,57786)
not ok 5 - VISIT_DETAIL – check start_datetime < end_datetime
Failed test 5: “VISIT_DETAIL – check start_datetime < end_datetime”
Number of columns or their types differ between the queries:
have: (161)
want: (0)
not ok 6 - VISIT_DETAIL – check start_date < end_date
Failed test 6: “VISIT_DETAIL – check start_date < end_date”
Number of columns or their types differ between the queries:
have: (35)
want: (0)
ok 7 - VISIT_DETAIL – check care site is never null
Looks like you failed 5 tests of 7
1…7
not ok 1 - MEASUREMENT – check distribution of all labs match
Failed test 1: “MEASUREMENT – check distribution of all labs match”
Results differ beginning at row 168:
have: (33,3,49,219)
want: (33,3,50,219)
psql:etl/StandardizedClinicalDataTables/MEASUREMENT/check_etl.sql:67: ERROR: syntax error at or near “AS”
LINE 7: AS org_name
^
QUERY:
SELECT org_name::TEXT, count(1)
FROM
(
SELECT DISTINCT ON
(hadm_id, spec_type_desc, org_name, coalesce(charttime, chartdate))
AS org_name
FROM microbiologyevents
WHERE org_name IS NOT NULL
) tmp
GROUP BY org_name ORDER BY 2, 1 desc;
CONTEXT: PL/pgSQL function results_eq(text,text,text) line 7 at OPEN
ok 2 - MEASUREMENT – there is source concept in measurement not described
ok 3 - MEASUREMENT – check for duplicate primary keys
ok 4 - MEASUREMENT – standard concept checker
not ok 5 - MEASUREMENT – check row counts match
Failed test 5: “MEASUREMENT – check row counts match”
Results differ beginning at row 1:
have: (768,3097)
want: (1,6330)
ok 6 - Measurement pass, w00t!
Looks like you planned 7 tests but ran 6
1…5
ok 1 - PROCEDURE_OCCURRENCE – check all procedureevents_mv rows inserted
ok 2 - PROCEDURE_OCCURRENCE – check label is consistent with source_value
ok 3 - PROCEDURE_OCCURRENCE – check all CPT code rows inserted
ok 4 - PROCEDURE_OCCURRENCE – check CPT subsections mapped correctly
ok 5 - PROCEDURE_OCCURRENCE – check ICD procedure rows inserted
1…1
ok 1 - PROVIDER – check caregivers/providers match
1…8
ok 1 - CONDITION OCCURRENCE – check ICD diagnoses row count matches
ok 2 - CONDITION OCCURRENCE – diagnosis in admission same
ok 3 - CONDITION OCCURRENCE – distrib diagnosis the same
ok 4 - CONDITION OCCURRENCE – there is source concept in measurement not described
ok 5 - CONDITION OCCURRENCE – primary key checker
ok 6 - CONDITION OCCURRENCE – standard concept checker
not ok 7 - CONDITION OCCURRENCE – start_datetime should be > end_datetime
Failed test 7: “CONDITION OCCURRENCE – start_datetime should be > end_datetime”
Results differ beginning at row 1:
have: (585)
want: (0)
ok 8 - CONDITION OCCURRENCE – start_date should be > end_date
Looks like you failed 1 test of 8
1…10
not ok 1 - OBSERVATION – religion distribution matches (concept 4052017)
Failed test 1: “OBSERVATION – religion distribution matches (concept 4052017)”
Results differ beginning at row 15:
have: (OTHER,2696)
want: (JEWISH,5314)
ok 2 - OBSERVATION – language distribution matches (concept 40758030)
ok 3 - OBSERVATION – marital distribution matches (concept 40766231)
ok 4 - OBSERVATION – insurance distribution matches (concept 46235654)
ok 5 - OBSERVATION – ethnicity distribution matches (concept 44803968)
ok 6 - OBSERVATION – source concept described
ok 7 - OBSERVATION – primary key is always unique
not ok 8 - OBSERVATION – datetimeevents number
Failed test 8: “OBSERVATION – datetimeevents number”
Results differ beginning at row 1:
have: (0)
want: (4485342)
not ok 9 - OBSERVATION – Standard concept checker
Failed test 9: “OBSERVATION – Standard concept checker”
Results differ beginning at row 1:
have: (79027)
want: (0)
Looks like you planned 10 tests but ran 9
1…4
ok 1 - DRUG_EXPOSURE – check number of patients with prescription matches
not ok 2 - DRUG_EXPOSURE – check drug_source_value matches source
Failed test 2: “DRUG_EXPOSURE – check drug_source_value matches source”
Results differ beginning at row 1:
have: (76329301205,1)
want: ("zz ",1)
ok 3 - DRUG_EXPOSURE – is concept source id full filled
ok 4 - DRUG_EXPOSURE – Standard concept checker
Looks like you failed 1 test of 4
1…2
ok 1 - visit occurrence nb
ok 2 - check radio nb
1…3
ok 1 - primary key checker
ok 2 - source concept described
ok 3 - source concept described
ok 4 - Note Nlp pass, w00t!
Looks like you planned 3 tests but ran 4
[dteuser@dtk8scentos02 mimic-omop]$