CDM document on WIKI

I am new to OHDSI and have recently been reading the data model document.
In the section below, how should I understand the “foreign keys” description, given that concept_id is the primary key in the CONCEPT table?

Representation of content through Concepts

In CDM data tables the meaning of the content of each record is
represented using Concepts. Concepts are stored with their concept_id as
foreign keys to the CONCEPT table in the Standardized Vocabularies,
which contains Concepts necessary to describe the healthcare experience
of a patient. If a Standard Concept does not exist or cannot be
identified, the Concept with the concept_id 0 is used, representing a
non-existing or unmappable concept.

Wang Hai:

Very simple. The concept_id is the foreign key used in the data tables to the primary key in the CONCEPT table. Let’s say your Concept is a Condition A with the concept_id=123. Then 123 is the primary key in the CONCEPT table, and the record with the concept_id=123 represents Condition A, while 123 is the foreign key in the condition_concept_id field of the CONDITION_OCCURRENCE table.
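To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two tables are cut down to a few columns, and the person_id value and the concept_id 999 are made up for illustration; the real CDM tables have many more fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Simplified CONCEPT table: concept_id is the primary key.
conn.execute("""
    CREATE TABLE concept (
        concept_id   INTEGER PRIMARY KEY,
        concept_name TEXT NOT NULL
    )""")

# Simplified CONDITION_OCCURRENCE table: condition_concept_id is a
# foreign key pointing at concept.concept_id.
conn.execute("""
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id               INTEGER NOT NULL,
        condition_concept_id    INTEGER NOT NULL
            REFERENCES concept (concept_id)
    )""")

conn.execute("INSERT INTO concept VALUES (123, 'Condition A')")
conn.execute("INSERT INTO condition_occurrence VALUES (1, 1001, 123)")

# Following the foreign key back to CONCEPT recovers the meaning of the record.
row = conn.execute("""
    SELECT c.concept_name
    FROM condition_occurrence co
    JOIN concept c ON c.concept_id = co.condition_concept_id
    WHERE co.condition_occurrence_id = 1""").fetchone()
resolved_name = row[0]

# A concept_id that does not exist in CONCEPT (999 here, hypothetical)
# violates the foreign key and is rejected.
try:
    conn.execute("INSERT INTO condition_occurrence VALUES (2, 1001, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The same 123 plays both roles: primary key in CONCEPT, foreign key in CONDITION_OCCURRENCE.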

Thanks very much. Now I realize that in the expression "their concept_id as foreign keys to the CONCEPT table", "their concept_id" refers to the *_CONCEPT_ID fields in the data tables.

In the section "Difference between Concept IDs and Source Values": why are Source Concepts not used for controlled vocabularies? What is the difference between controlled vocabularies and common healthcare code systems?

Source Concepts are the concepts that represent the code used in the source. **Source Concepts are only used for common healthcare code systems, but not for controlled vocabularies.** Source Concepts are stored in the source_concept_id field in the data tables.

Oh. Good catch. That is not very well said, I agree. I fixed it. Thanks very much, Wang Hai.

I read the Standardized Vocabulary V5.0 document, but the following section, shown in red, seems to be incomplete. Is that true, or did I just follow the wrong link?
Is there a Skype channel or something like that?
Kind regards

Correct. We are writing it as we speak. Give it a couple of weeks and you should have it. Let me know if you have any questions right now.

another one

    Standard Concepts (designated as 'S' in the standard_concept field) may appear in CDM tables in all *_concept_id fields, whereas Classification Concepts ('C') should not appear in the CDM data, but participate in the construction of the CONCEPT_ANCESTOR table and can be used to identify Descendants that may appear in the data. See CONCEPT_ANCESTOR table. Non-standard Concepts can only appear in *_source_concept_id fields and are not used in CONCEPT_ANCESTOR table. Please refer to the Standardized Vocabularies Specifications for details of the Standard Concept designation.

Does this mean a Standard Concept can exist in any table that contains a *_concept_id field? I don't understand the expression "tables in all *_concept_id fields".

Yes. In any table that contains a field with a name ending in "_concept_id". For example, in the CONDITION_OCCURRENCE table you have the field condition_concept_id, which will take Concepts that have standard_concept = 'S'. You cannot put there Concepts with standard_concept = NULL or standard_concept = 'C'. In the field condition_source_concept_id you put the Concept that corresponds to the code in the source vocabulary. For example, if your source vocabulary is ICD-10, you put the Concept that corresponds to that ICD-10 code. This Concept may or may not be a Standard Concept, and if it is not, it will have standard_concept = NULL.

Speaking of which: there is another restriction by Domain in CDM V5.0. The condition_concept_id field only takes Concepts that have the domain “Condition”.
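Putting the two restrictions together, the check for condition_concept_id could be sketched roughly like this. The function name and the sample concept_id values are made up for illustration, and the concept_id 0 exception comes from the passage quoted at the top of the thread; treat it as a sketch of the rule, not as an official validation routine.

```python
from typing import Optional, TypedDict


class Concept(TypedDict):
    concept_id: int
    domain_id: str
    standard_concept: Optional[str]  # 'S', 'C', or None (NULL)


def allowed_in_condition_concept_id(concept: Concept) -> bool:
    """Return True if the Concept may be stored in condition_concept_id."""
    # concept_id 0 stands for a non-existing or unmappable concept.
    if concept["concept_id"] == 0:
        return True
    # Otherwise it must be a Standard Concept in the Condition domain.
    return (
        concept["standard_concept"] == "S"
        and concept["domain_id"] == "Condition"
    )


# Hypothetical sample Concepts (the IDs are invented for this example).
standard_condition: Concept = {
    "concept_id": 123, "domain_id": "Condition", "standard_concept": "S"}
classification_condition: Concept = {
    "concept_id": 456, "domain_id": "Condition", "standard_concept": "C"}
source_only_condition: Concept = {
    "concept_id": 789, "domain_id": "Condition", "standard_concept": None}
standard_drug: Concept = {
    "concept_id": 321, "domain_id": "Drug", "standard_concept": "S"}

checks = [
    allowed_in_condition_concept_id(c)
    for c in (standard_condition, classification_condition,
              source_only_condition, standard_drug)
]
```

Only the first sample passes. The non-standard one could still go into condition_source_concept_id.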

Having that document on the web is a great advantage.

For evolution and maintenance, it would be nice if comments could be made that are not part of the spec.

Something like comments on this page (MySQL manual)

An alternative to comments is to edit the wiki directly (I have turned a few things into hyperlinks in the past to help guide people).
However, one-sided editing may not be optimal, and vetting may be needed.

I understand the forum is for commenting on the CDM, but having the specs and the discussion in one place has some advantages too.

It would be a great help if we could host the wiki document in a GitHub project.
Translation would be easier, too.

Here is another one, about the OBSERVATION_PERIOD table:
1. Take me as an example: over the past three years I went to one hospital twice, once for a fever in 2012 and once for a skin problem in 2014, so there would definitely be some records for me. Is this one period or two? How do you distinguish between different periods? Are there any principles?
2. The observation_period_start_date field is described as "The start date of the observation period for which data are available from the data source". If there is a period, let's say 2011-11-01 to 2015-12-02, but records for this person were only collected from 2012-11-01 onward, which date does this start date refer to?

OBSERVATION_PERIOD represents the spans of time for which a person is
at-risk to have clinical events recorded within the source systems, even if
no events in fact are recorded (healthy patient with no healthcare

Some observational data sources have this information explicitly available
in the source data. For example, many payers have insurance claims data
where the beneficiary enrollment records define the periods of time when
you should see data (because it's the time when a person can submit claims
for reimbursement).

Other systems do not have this information explicitly, so the observation periods need to
be inferred. For example, some sources that contain electronic health
records have inferred the observation period as the duration from the first
observation date to the last observation date. Other sources have tried to
define the spans of time when you have confidence in observing data.
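As a rough sketch of the first inference approach (first observation date to last observation date), assuming all you have is a list of event dates per person. The function name and the exact dates are made up; they just mirror the fever-in-2012 and skin-problem-in-2014 example from the question.

```python
from datetime import date
from typing import Optional, Sequence, Tuple


def infer_observation_period(
    event_dates: Sequence[date],
) -> Optional[Tuple[date, date]]:
    """Infer one observation period spanning the first to the last event.

    Returns None when the person has no recorded events, because this
    simple rule has nothing to infer a period from in that case.
    """
    if not event_dates:
        return None
    return min(event_dates), max(event_dates)


# Hypothetical visit dates for one person: a fever in 2012 and a skin
# problem in 2014.
visits = [date(2014, 7, 19), date(2012, 3, 5)]
period = infer_observation_period(visits)
```

Under this rule the two visits fall into a single period running from the first visit to the last one. Other rules, such as only counting spans where you are confident data would be observed, could split them differently.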

There have been a couple of threads on this topic before; you can search for
others in the forums, but one that you may find useful is here: