Inflated Incidence rate counts


My colleagues and I are looking into Incidence rates in ATLAS and we are seeing some strange counts. When we look at the # of people in the target cohort under the Cohort definition tab we see a lot fewer persons than the # of persons in the target cohort under the Incidence rates tab. How could this be?

For a bit more context, the target cohort is all persons with colorectal cancer that have been correctly classified as N-stage N0, and the outcome cohort is death.

Soham (and colleagues)

Sorry for my first reply, I misunderstood your question. That is a very strange result…I can’t tell from the screenshot of your cohort generation, but can you tell me how many ‘events’ were captured in the cohort generation? Maybe there’s an issue where it’s using multiple events per person?
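One way to check the people-versus-events question directly is to count distinct subjects and total rows for the cohort. The sketch below is only an illustration: it uses an in-memory SQLite table with the standard OMOP cohort columns (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date) and made-up rows, and the helper name `people_and_records` is hypothetical. Against a real results schema you would run the same SELECT with your own table name and cohort id.

```python
import sqlite3


def people_and_records(conn, cohort_id, cohort_table="cohort"):
    """Return (distinct people, total records) for one cohort definition.

    cohort_table is interpolated into the SQL, so pass only a trusted name.
    """
    sql = (
        "SELECT COUNT(DISTINCT subject_id) AS people, COUNT(*) AS records "
        f"FROM {cohort_table} WHERE cohort_definition_id = ?"
    )
    return conn.execute(sql, (cohort_id,)).fetchone()


# Toy data (made up): person 101 has two events, person 102 has one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cohort ("
    "cohort_definition_id INTEGER, subject_id INTEGER, "
    "cohort_start_date TEXT, cohort_end_date TEXT)"
)
conn.executemany(
    "INSERT INTO cohort VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2015-01-01", "2016-01-01"),
        (1, 101, "2017-03-01", "2018-03-01"),
        (1, 102, "2015-06-01", "2016-06-01"),
    ],
)

people, records = people_and_records(conn, 1)
print(f"people={people}, records={records}")
```

If records is larger than people, the cohort holds more than one event for at least one person.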

Hi @Chris_Knoll,

Under the cohort generation tab, we see 2,896 People and 2,896 Records for the target cohort.

The outcome cohort (which is a simple death occurrence cohort) has 43,855 People and 43,855 Records under the cohort generation tab (screenshot not included).

Ok, so it looks like your target cohort can contribute at most 2,896 people to the incidence rate calculation, and I have no idea why you would get 12,621. Can you delete your incidence rate design and start a new one? Is it somehow possible that results from one execution got overwritten into the results of another? It doesn’t seem possible…but it’s worth trying.

In addition, you can look at the analysis SQL here. In order to run this yourself, you need to replace a few tokens:
@cohortInserts: replace this with

SELECT 1 as cohort_id, 0 as is_outcome
UNION ALL
SELECT 2 as cohort_id, 1 as is_outcome

Replace the 1 with your target cohort, 2 with your outcome cohort. This will let you run the query at line 1 to create your target-outcome pairs.

Next, run the query on line 8 to create all the combinations (you will only have 1 T-O pair)

Finally, you will need to replace @temp_database_schema.@cohort_table with your cohort table name, and @cdm_database_schema with your cdm schema. Then run the query on line 14. This will return the people from your target cohort that should be included in your analysis. This query should not return 12k rows; it should only return the 2,896 people from your target cohort. If it returns more, see if you can identify why you are getting more people in your analysis than actually exist in your cohort.
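If doing the token replacement by hand gets tedious, it can be scripted. This is a rough sketch rather than how ATLAS does it internally: the helper `fill_tokens` is hypothetical, the `template` string is a made-up stand-in for the real analysis SQL (including the `#cohort_ids` table name), and it assumes the two @cohortInserts SELECT statements need a UNION ALL between them to form one statement. Only the token names come from the steps above.

```python
def fill_tokens(sql_template, target_id, outcome_id, cohort_table, cdm_schema):
    """Replace the analysis SQL tokens with concrete values (hypothetical helper)."""
    # Assumption: the target and outcome rows are combined with UNION ALL.
    cohort_inserts = (
        f"SELECT {int(target_id)} as cohort_id, 0 as is_outcome\n"
        "UNION ALL\n"
        f"SELECT {int(outcome_id)} as cohort_id, 1 as is_outcome"
    )
    replacements = {
        "@cohortInserts": cohort_inserts,
        # The combined schema.table token is swapped for a full table name.
        "@temp_database_schema.@cohort_table": cohort_table,
        "@cdm_database_schema": cdm_schema,
    }
    for token, value in replacements.items():
        sql_template = sql_template.replace(token, value)
    return sql_template


# Made-up stand-in for the real analysis SQL, just to show the substitution.
template = (
    "INSERT INTO #cohort_ids\n"
    "@cohortInserts;\n"
    "SELECT c.subject_id\n"
    "FROM @temp_database_schema.@cohort_table c\n"
    "JOIN @cdm_database_schema.observation_period op\n"
    "  ON op.person_id = c.subject_id;"
)

sql = fill_tokens(template, 1, 2, "results.cohort", "cdm")
print(sql)
```

Here 1 and 2 stand for the target and outcome cohort ids, and `results.cohort` and `cdm` are placeholder names for your own cohort table and CDM schema.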