Thank you @schuemie, that sounds like a good solution, especially row_number() over (order by subject_id, cohort_start_date).
Shouldn’t we make row_number() the default? This seems like a potential gotcha.
Default how? The cohort table is the input to FeatureExtraction. Whether or not it contains a row_id field (created using ROW_NUMBER) is not within FeatureExtraction’s control.
Thank you @schuemie, it sounds like we have a known problem when there is more than one record per subject_id. The default behavior of FeatureExtraction is to use subject_id as row_id. So, given a situation like this
We have a problem with the default behavior of rowId = ‘subject_id’, because we will have difficulty differentiating between features generated for the same subject_id with different cohort_start_date values, such as 5/1/2016 and 2/15/2017.
In this case, we need a structure something like this, with a new column that uniquely identifies every record within the same cohort_definition_id
where rowId = ‘cohort_row_id’. Current standard tools don’t do this by default, and the cohort table does not have a cohort_row_id column. So we have to do it outside, by creating a new rowId field using row_number() over (partition by cohort_definition_id order by subject_id, cohort_start_date).
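To make the row_number() idea concrete, here is a minimal sketch that runs the window function against an in-memory SQLite database from Python. The sample rows, the SQLite backend, and the column name cohort_row_id are assumptions for illustration only; the exact SQL dialect on a real CDM database may differ, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database with a cohort table in the standard layout.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cohort (
        cohort_definition_id INTEGER,
        subject_id INTEGER,
        cohort_start_date TEXT,
        cohort_end_date TEXT
    )
""")

# Made-up rows: subject 101 enters cohort 1 twice, on different dates,
# so subject_id alone cannot serve as a unique row ID.
conn.executemany(
    "INSERT INTO cohort VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2016-05-01", "2016-06-01"),
        (1, 101, "2017-02-15", "2017-03-15"),
        (1, 102, "2016-07-10", "2016-08-10"),
        (2, 101, "2016-05-01", "2016-06-01"),
    ],
)

# cohort_row_id (hypothetical name) restarts at 1 for each cohort_definition_id
# and is unique within it.
rows = conn.execute("""
    SELECT
        ROW_NUMBER() OVER (
            PARTITION BY cohort_definition_id
            ORDER BY subject_id, cohort_start_date
        ) AS cohort_row_id,
        cohort_definition_id,
        subject_id,
        cohort_start_date
    FROM cohort
    ORDER BY cohort_definition_id, cohort_row_id
""").fetchall()

for row in rows:
    print(row)
```

With a column like this in place, the two entries for the same subject_id get different row IDs, so features computed for each cohort_start_date can be told apart.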
@schuemie, can you please help me with the logic of creating custom covariates?
Looking at the vignette, I found the following:
cohort_definition_id: A key to link to the cohort table. Note that this will become the covariate
ID, so you should take care that these IDs do not overlap with IDs of other covariate builders that may
be used as well.
I want to assign cohort_definition_id values so that they do not overlap with the other default covariates I’m going to use. It seems I can’t just use the cohort_definition_id values from Atlas and need to reassign them as well.
Is there a range of numbers that is not used for default covariate_id values?
I tried to work it out by reverse engineering, but failed =(