Hi there!
Currently to create a cdm schema I’m copying the synpuf 5% sample .csv files (https://www.ohdsi.org/data-standardization/) using the Postgresql scripts here (https://github.com/OHDSI/CommonDataModel/tree/v5.2.2/PostgreSQL)
However I have encountered the following errors:
device_exposure:
COPY cdm.device_exposure FROM ‘/tmp/CDM5VOCAB/device_exposure.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: invalid input syntax for integer: "C1894"
SQL state: 22P02
Context: COPY device_exposure, line 2, column visit_detail_id: "C1894"
condition_occurrence:
COPY cdm.condition_occurrence FROM ‘/tmp/CDM5VOCAB/condition_occurrence.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: missing data for column "condition_status_concept_id"
SQL state: 22P04
Context: COPY condition_occurrence, line 2: "12 1 134736 2009-07-25 2009-07-25 00:00:00 2009-07-25 38000230 6 6 7245 44825708 7245 0"
drug_exposure
COPY cdm.drug_exposure FROM ‘/tmp/CDM5VOCAB/drug_exposure.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: missing data for column "dose_unit_source_value"
SQL state: 22P04
Context: COPY drug_exposure, line 2: "97 1 19040307 2008-04-10 2008-04-10 00:00:00 2008-07-09 2008-07-09 00:00:00 38000175 30 90 0..."
measurement:
COPY cdm.measurement FROM ‘/tmp/CDM5VOCAB/measurement.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: missing data for column "unit_source_value"
CONTEXT: COPY measurement, line 2: "1 1 2212731 2008-09-04 45754907 0 2 2 85610 2212731 "
observation:
COPY cdm.observation FROM ‘/tmp/CDM5VOCAB/observation.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: invalid input syntax for integer: "V4502"
CONTEXT: COPY observation, line 2, column visit_detail_id: "V4502"
payer_plan_period:
COPY cdm.payer_plan_period FROM ‘/tmp/CDM5VOCAB/payer_plan_period.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: missing data for column "plan_concept_id"
SQL state: 22P04
Context: COPY payer_plan_period, line 2: "1 1 2008-01-01 2010-12-31 Medicare Part A "
procedure_occurrence:
COPY cdm.procedure_occurrence FROM ‘/tmp/CDM5VOCAB/procedure_occurrence.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: invalid input syntax for integer: "V5841"
SQL state: 22P02
Context: COPY procedure_occurrence, line 2, column visit_detail_id: "V5841"
vocabulary:
COPY cdm.vocabulary FROM ‘/tmp/CDM5VOCAB/vocabulary.csv’ WITH DELIMITER E’\t’ CSV HEADER QUOTE E’\b’;
ERROR: null value in column "vocabulary_version" violates not-null constraint
SQL state: 23502
Detail: Failing row contains (Cohort Type, OMOP Cohort Type, OMOP generated, null, 44819234).
Context: COPY vocabulary, line 3: "Cohort Type OMOP Cohort Type OMOP generated 44819234"
For the invalid input syntax, I’m guessing the data dosen’t match the datatype specified for the column, e.g. for procedure_occurence, visit_detail_id should be an integer but the data is a string “VS841”?
I’m puzzled over the error regarding null values as I’ve looked at the .csv files and the number of columns of the data seem to match the number of columns in the table?
Thanks in advance,
Hui Xing