
Randomized data - Atlas Cohort Generation


(Selva Muthu Kumaran Sathappan) #1

Hello Everyone,

I have raw data from a hospital in which all the date fields are randomized (by design, they don’t follow any order). None of the dates in any table match those in related tables. All dates are unique and in the future, like 12/30/2900. I would like to generate a cohort using Atlas, but generation fails because the primary criteria in Atlas require observation period start and end dates.

I tried different combinations of drop-down values for dates in Atlas, but cohort generation still fails.

Am I right to understand that cohort generation fails when the SQL returns zero results? I ask because I ran the Atlas-generated SQL part by part and saw that it returns zero records for the date condition in the “Where” clause (op.start date, op.end date).

So is the only solution to create a dummy table with valid dates? We would like to run a demo using our data in Atlas. Any input is highly appreciated.

Can you guide me on how to handle this scenario, or let me know if there is any specific combination of values that can help me generate a cohort?


(Selva Muthu Kumaran Sathappan) #2

@Chris_Knoll - Will you be able to help me with this scenario?

(Chris Knoll) #3


As a rule, cohort entry events can only be used if they fall within an observation period, so that we can then determine how long the person is in the cohort (by default, it uses the observation period end date as the cohort exit date).

Since you are doing this as a demo, you can probably inject ‘fake’ observation_period records based on the min(condition_start_date) and max(condition_start_date) + 1:

insert into observation_period (observation_period_id, person_id, observation_period_start_date, observation_period_end_date, period_type_concept_id)
select row_number() over (order by person_id) as observation_period_id,
  person_id,
  min(condition_start_date) as observation_period_start_date,
  dateadd(d, 1, max(condition_start_date)) as observation_period_end_date,
  0 as period_type_concept_id
from condition_occurrence
group by person_id

I do not know which dialect of sql you need, so you’ll have to translate the above to your desired SQL dbms.

This query will create 1 record per person based on the min/max of dates in condition_occurrence. If you need to create cohort entry events from any of the other domain tables, then you will have to take the min/max of all the other domain tables (by person) to define your observation periods.
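To illustrate taking the min/max across more than one domain table, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is only an illustration under stated assumptions: the tables are cut down to the columns used, dates are stored as ISO text, drug_exposure is picked as an example second domain table, the sample rows are made up, and `date(..., '+1 day')` is SQLite's stand-in for `dateadd`. It also needs a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# In-memory database with simplified stand-ins for the CDM tables (assumption:
# only the columns needed here, dates kept as ISO-8601 text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table condition_occurrence (person_id integer, condition_start_date text);
create table drug_exposure (person_id integer, drug_exposure_start_date text);
create table observation_period (
    observation_period_id integer,
    person_id integer,
    observation_period_start_date text,
    observation_period_end_date text,
    period_type_concept_id integer
);
-- Made-up rows with randomized, far-future dates.
insert into condition_occurrence values
    (1, '2900-12-30'), (1, '2901-03-15'), (2, '2950-01-01');
insert into drug_exposure values
    (1, '2900-06-01'), (2, '2950-02-10');
""")

# One fake observation period per person, spanning the earliest event date
# to the latest event date + 1 day across both domain tables.
conn.execute("""
insert into observation_period
    (observation_period_id, person_id, observation_period_start_date,
     observation_period_end_date, period_type_concept_id)
select row_number() over (order by person_id),
       person_id,
       min(event_date),
       date(max(event_date), '+1 day'),
       0
from (
    select person_id, condition_start_date as event_date from condition_occurrence
    union all
    select person_id, drug_exposure_start_date from drug_exposure
) all_events
group by person_id
""")

periods = conn.execute(
    "select person_id, observation_period_start_date, observation_period_end_date "
    "from observation_period order by person_id"
).fetchall()
print(periods)
```

Further domain tables can be added as extra `union all` branches inside the subquery; the outer query stays the same.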


(Selva Muthu Kumaran Sathappan) #4

Thanks for the response Chris @Chris_Knoll. Much appreciated. I understand the need for observation periods in cohort entry criteria, but is there any possibility (that you foresee) of making it an optional field?

(Selva Muthu Kumaran Sathappan) #5

Hello @Chris_Knoll ,

In addition, when I update the below ‘Target’ items using Public as schema, ‘Cohort’ as table and ‘24’ as id, my SQL runs without any issues but my cohort generation still fails. Am I missing anything else?


Update - Found this link. Will try the below solution.

If I still face any issues, I will get back to you. Thanks for your time and patience.

(Chris Knoll) #6

No. In order to enter a cohort, we need to know the continuous observation before and after the entry event.

(Selva Muthu Kumaran Sathappan) #7

@Chris_Knoll - Thanks for your response. The above thread helped us fix the issue and we are able to generate the cohort successfully now. However, I am facing an issue under the ‘Characterisation’ tab.

Can you please help me with the below post?


(Chris Knoll) #8

Check your console’s log (your chrome debug console, ctrl-shift-i on windows) when you load the report. See if there are any errors fetching the report data.

(Selva Muthu Kumaran Sathappan) #9

@Chris_Knoll -

Our cohort characterization is failing continuously.

  1. Can you please let us know what the reason could be? I have attached the JSON code (.pdf file) from the Utilities section for two of the cohorts in case that helps.

JSON_COHORT_Characterization.pdf (14.1 KB)

astrovastatin_cohort_characterisation.pdf (13.9 KB)

  2. Can you please let me know whether the ‘Feature analyses parameter’ tab is mandatory? I mean, when I leave that field empty as shown in the screenshot below, will the cohort characterization fail?

We couldn’t find any error message in the console log when ‘Cohort characterization’ failed.

However, for cohort characterizations that were generated successfully a few days back, we get an error during report generation. I have attached the screenshots at the bottom.

How can we generate ‘Cohort Characterization’ successfully without any error?




We require your assistance with two things:

  1. Cohort Characterization - Failed issue

  2. Report generation issue - Not sure whether issue 1 is causing this issue. The above screenshots are for this reporting issue. They were taken from the Google Chrome console. I am sharing them only to help you get a better understanding of the problem, as I am not sure which issue is causing the other.

Can you please help us?


(Chris Knoll) #10

The errors shown in your console log indicate that there was an error retrieving information on the server side. However, it is not clear which ‘resource’ was requested that resulted in the Error 500 status, since it’s not shown in your console.

The “failed” result of generation also indicates a server-side problem.

To troubleshoot, you can try two things:
1: From the UI side (in Chrome), if you look in the ‘network’ tab, you will see the requested resource that failed (Error 500); it will be highlighted in red. You can right-click on that request and choose ‘open in new tab’, which will make the request to the server and present the error message. The message may give you information as to the core problem.

2: From the WebAPI side: you should get the logs and look for exceptions/errors related to cohort characterization. If you search for the text ‘[FAILED]’, that is the text associated with a job failing. You will need to look above the job failure message for the underlying error. It could be a malformed SQL exception or a permission denied error. I’m unsure which, so you need to interrogate the log and look for errors.
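The log search in step 2 can be sketched as a small Python helper that finds each line containing ‘[FAILED]’ and keeps the few lines just above it, where the underlying exception is likely to be. This is only a sketch: the function name `failures_with_context` is hypothetical, and the sample log lines are made up for illustration and are not real WebAPI output.

```python
from collections import deque

def failures_with_context(lines, marker="[FAILED]", context=5):
    """Return (preceding_lines, failed_line) for every line containing marker."""
    recent = deque(maxlen=context)  # rolling window of the last `context` lines
    hits = []
    for line in lines:
        if marker in line:
            hits.append((list(recent), line))
        recent.append(line)
    return hits

# Made-up log lines purely for illustration (not the actual WebAPI log format).
sample_log = [
    "INFO  starting job 42",
    "ERROR bad SQL grammar: table not found",
    "INFO  job 42 [FAILED]",
]

hits = failures_with_context(sample_log, context=2)
for before, failed in hits:
    print(failed)
    for earlier in before:
        print("    above:", earlier)
```

For a real log, the lines could come from `open(path)` with whatever path your WebAPI installation writes its log to.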

(Selva Muthu Kumaran Sathappan) #11

Hi @Chris_Knoll

Thanks for your time and help in answering my queries. I believe my team and I are just close to resolving this issue but somehow we don’t get it right.

As shown in the screenshot below (t1.PNG), I don’t see any red-color highlighted error message. As said earlier we don’t see any error message when cohort characterization failed. Am I missing anything here?

Does the message at the bottom of the screen indicate any useful information?

The screenshot shows 29 requests; I waited for the rest of the requests to complete, but I still don’t see any error messages.

In addition, I tried again after updating my Atlas/WebAPI to 2.7.1, but the cohort characterization still fails. When I clicked the “Failed” status button, I was able to see an exception message, which I have attached for your reference. Can you please help us on how to fix this?

Java_Exception_Bad_SQL_Cohort_Characterization.pdf (11.1 KB)

What should we do to fix this issue?

Apologies if my questions are basic; we are quite new to this area and to setting up servers.

(Selva Muthu Kumaran Sathappan) #12

@Chris_Knoll - We fixed it. The cc_results table was missing. Thanks a lot for your time and patience. You are awesome.

In closing, I would like to confirm my understanding of cohort entry criteria for one last time with you.

  1. I am looking to build a cohort of patients who had taken ‘Paracetamol’. So I have a drug exposure filter with the “Paracetamol” concept.

  2. Can you please help me understand the need for the observation period?

Let’s say my database has patient records from 2008-2013. Do the criteria that I have set for the observation period (7 and 120 days, as shown in the screenshot) mean that I am trying to understand what led the patient to consume paracetamol by looking at his past history (7 days), and that after consumption we observe him for another 120 days to understand the effect of taking paracetamol? Is my understanding correct? Can you help me understand this in layman’s terms? I have no background in healthcare/medical science.

  3. We can’t do anything with just one cohort. Am I right? I mean, we can generate summary statistics or derive some baseline characteristics, but we always have to compare our ‘Paracetamol’ cohort to another cohort (e.g. people who didn’t take paracetamol or any other related drugs).

  4. Limit Initial Events - “earliest” gives the first (only one) of all records, “latest” gives the last (only one) of all records, and “all events” gives all the events of paracetamol consumption. Am I right?

(Chris Knoll) #13

7 and 120 means you are requiring the person to have at least 7 days of continuous observation prior to the exposure record start date, and 120 days after the exposure record start date. You use these settings to enforce that you have enough ‘look back’ time for each person (you may not be able to determine if the drug exposure is ‘new’ if you only require 7 days of prior observation; maybe something like 180d is more appropriate). To require 120 days after means the person must have survived 120 days. In certain studies, this would introduce ‘immortal time bias’, but in other studies (like prediction) it may be appropriate. It depends!
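As a rough illustration of the 7/120 rule described above, the check can be written as a date comparison against a single observation period. This is a sketch, not the SQL that Atlas generates: the function name `has_required_observation` and the example dates are hypothetical, and it assumes the event is tested against one observation period at a time.

```python
from datetime import date, timedelta

def has_required_observation(event_date, op_start, op_end,
                             prior_days=7, post_days=120):
    """True if the observation period covers at least prior_days before
    and post_days after the event date."""
    enough_before = op_start <= event_date - timedelta(days=prior_days)
    enough_after = event_date + timedelta(days=post_days) <= op_end
    return enough_before and enough_after

# Hypothetical observation period covering 2008 through 2013.
op_start, op_end = date(2008, 1, 1), date(2013, 12, 31)

too_early = has_required_observation(date(2008, 1, 5), op_start, op_end)
in_range = has_required_observation(date(2010, 6, 1), op_start, op_end)
too_late = has_required_observation(date(2013, 11, 1), op_start, op_end)
print(too_early, in_range, too_late)
```

The first exposure has only 4 days of prior observation and the third has fewer than 120 days of follow-up before the observation period ends, so only the middle one would qualify as a cohort entry event under these settings.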

You can characterize it, but yes, for other operations you need a target cohort and an outcome cohort to do things like prediction and incidence rate.