ATLAS generated SQL - how to fill in Target_Cohort statements

krfeeney · June 28, 2020, 3:36pm

@Chris_Knoll, @Patrick_Ryan @Konstantin_Yaroshove,

I’ve long ignored this line at the end of the ATLAS generated cohort definition SQL:

DELETE FROM @target_database_schema.@target_cohort_table where cohort_definition_id = @target_cohort_id;
INSERT INTO @target_database_schema.@target_cohort_table (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)
select @target_cohort_id as cohort_definition_id, person_id, start_date, end_date 
FROM #final_cohort CO
;

I’m in a spot where I need to use this to run a cohort. Can you help me understand what goes in these global variables?

I’m assuming I need to make the @target_cohort_table:

CREATE TABLE @target_database_schema.@target_cohort_table  ( person_id varchar(50)  NOT NULL,
	start_date			date  NOT NULL,
	end_date			date  NOT NULL
)
DISTKEY(person_id);

But what do I put in for @target_cohort_id? Is that an arbitrary value I assign to the cohort I’m building?

Appreciate your help demystifying this!
Kristin

Ajit_Londhe · June 28, 2020, 4:20pm

It’s a cohort table (when Atlas runs, it uses the cohort table). This is where the actual cohort content gets written.

But if you’re using a custom one, it just needs the same cohort table fields:
cohort_definition_id, subject_id, cohort_start_date, cohort_end_date

Atlas uses the cohort definition id to fulfill @target_cohort_id.

Konstantin_Yaroshove · June 29, 2020, 7:43am

@krfeeney, yes @Ajit_Londhe is right. @target_cohort_id is ATLAS cohort ID (its design ID).

krfeeney · June 29, 2020, 12:35pm

So if I’m just running this as SQL without access to the ATLAS tables, it’s just an arbitrary ID. Got it.

Thanks!

Patrick_Ryan · June 29, 2020, 12:47pm

@krfeeney, you can use that query to populate your own designated table and give if your own identifier. Then, in your local analysis, you can reference the table name and identifier you created to conduct your analysis (ex. CohortMethod or PatientLevelPrediction). So, yes, its ‘abritrary’, insofaras you have complete freedom to organize your own local cohort instances as you see fit. The main thing you need to avoid is collision of identifiers, if you were to accidentally use the same table name and id, then you’d get yourself in trouble. When I’m doing a local analysis, I tend to create a table name to correspond to the study (e.g. COHORT_EPI123) and then identify the cohorts used within that study, numbering them 1…n. That way I know I only have to ensure that I uniquely identify the cohorts within the study and don’t need to worry about cross-study contamination:)