Instead of using getDbCohortMethodData() to generate a CohortMethodData object, how do I load my own dataset and apply createPs() to it?
Thanks!
That is actually very hard, because CohortMethodData is a very specific type of data object.
Could you help me understand your specific use case? What prevents you from using getDbCohortMethodData()?
Thanks for replying, @schuemie. My exposure is not a single drug. The types of exposure I am looking at consist of thousands of concept IDs, and I am trying to assign each person to exactly one exposure category.
This is a hypothetical example. Imagine I want to study the effect of three drug categories (antibiotics, antineoplastic drugs, and antidiabetic drugs) on renal function. If a person receives all three categories of drugs during one visit, I would categorize them under antineoplastic drugs, because I consider their effect to be more significant than that of antibiotics and antidiabetic drugs. Since my exposure groups are not straightforward, I wonder if I can supply data in the same format that getDbCohortMethodData() produces, so I can apply createPs().
It seems the columns from getDbCohortMethodData() are: cohort_definition_id (including IDs for the exposure groups and the outcome), subject_id, cohort_start_date (exposure date), and cohort_end_date.
Thank you!
CohortMethodData is an S4 class derived from the Andromeda class. It has 4 Andromeda tables (outcomes, cohorts, covariates, covariateRef) and a set of metadata attributes. As I said: it is non-trivial to construct yourself.
I would highly recommend creating your exposure cohorts in either ATLAS or Capr. These tools will be able to implement the logic you’re looking for. Once you have your exposure cohorts in place, you can run CohortMethod with just a few R calls, as described in the vignette.
Alternatively, if you do not wish to learn how to use ATLAS or Capr, you could create a cohort table on your server yourself. A cohort table has 4 columns: cohort_definition_id, subject_id, cohort_start_date, and cohort_end_date. (The subject_id is the person_id).
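To make the "build your own cohort table" route concrete, here is a minimal base R sketch of the hierarchical assignment from the hypothetical example earlier in the thread. It is only an illustration on toy data: the priority order beyond "antineoplastic first", the cohort IDs 1 to 3, and the choice to set cohort_end_date equal to the exposure date are all assumptions made for the example, not something CohortMethod prescribes.

```r
# Toy exposure records: one row per person, date, and drug category.
# All names and values here are hypothetical.
exposures <- data.frame(
  person_id     = c(1, 1, 1, 2, 2, 3),
  exposure_date = as.Date(c("2020-01-05", "2020-01-05", "2020-01-05",
                            "2020-02-10", "2020-02-10", "2020-03-01")),
  category      = c("antibiotic", "antineoplastic", "antidiabetic",
                    "antibiotic", "antidiabetic", "antibiotic"),
  stringsAsFactors = FALSE
)

# Priority order: earlier in the vector wins (assumed ordering).
priority <- c("antineoplastic", "antidiabetic", "antibiotic")

# Hypothetical cohort_definition_id for each category.
cohortIds <- c(antineoplastic = 1, antidiabetic = 2, antibiotic = 3)

# Rank every record, sort so the highest-priority record comes first,
# then keep only the first record per person and date.
exposures$rank <- match(exposures$category, priority)
exposures <- exposures[order(exposures$person_id,
                             exposures$exposure_date,
                             exposures$rank), ]
keep <- !duplicated(exposures[, c("person_id", "exposure_date")])
assigned <- exposures[keep, ]

# Shape the result like a cohort table (4 columns).
# cohort_end_date is a placeholder here; a real study would use the exposure end.
cohort <- data.frame(
  cohort_definition_id = unname(cohortIds[assigned$category]),
  subject_id           = assigned$person_id,
  cohort_start_date    = assigned$exposure_date,
  cohort_end_date      = assigned$exposure_date
)
print(cohort)
```

A data frame like this could then be uploaded to the server as a cohort table, for example with DatabaseConnector::insertTable(), and referenced from getDbCohortMethodData().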
@schuemie Thanks for taking the time to reply.
Follow-up questions regarding cohort_definition_id, subject_id, cohort_start_date, and cohort_end_date (where subject_id is the same as person_id):
Yes, I think all of those statements are correct.
When calling getDbCohortMethodData(), the targetId, comparatorId, and outcomeIds arguments will correspond to the cohort_definition_id field in the cohort table.
For exposures, we almost always indeed set cohort start and end to correspond to exposure start and end. Importantly, we typically combine subsequent prescriptions into a single exposure cohort entry (usually allowing for a gap between prescriptions). Note that, with the createStudyPopulation() function, you can set the time at risk based on the cohort start and end using the riskWindowStart, startAnchor, riskWindowEnd, and endAnchor arguments. By default, the time at risk start and end are identical to the cohort start and end.
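The two ideas in that paragraph, merging prescriptions into one cohort entry and deriving the time at risk from the cohort dates, can be sketched in base R as follows. This is not CohortMethod's implementation: collapseEras() and riskWindow() are hypothetical helpers, the 30-day gap is an assumed value, and the real createStudyPopulation() applies further logic that is not modelled here. The sketch only shows how the four arguments relate to the cohort dates.

```r
# Toy prescriptions for one person in one exposure category (hypothetical data).
rx <- data.frame(
  subject_id = c(1, 1, 1, 1),
  start_date = as.Date(c("2020-01-01", "2020-01-25", "2020-02-20", "2020-06-01")),
  end_date   = as.Date(c("2020-01-30", "2020-02-15", "2020-03-10", "2020-06-30"))
)

# Merge a subject's prescriptions whenever the gap to the running era end
# is at most maxGap days (hypothetical helper, assumed 30-day gap).
collapseEras <- function(rx, maxGap = 30) {
  rx <- rx[order(rx$subject_id, rx$start_date), ]
  eras <- list()
  addEra <- function(id, start, end) {
    data.frame(subject_id = id, cohort_start_date = start, cohort_end_date = end)
  }
  for (id in unique(rx$subject_id)) {
    x <- rx[rx$subject_id == id, ]
    eraStart <- x$start_date[1]
    eraEnd <- x$end_date[1]
    for (i in seq_len(nrow(x))[-1]) {
      if (as.numeric(x$start_date[i] - eraEnd) <= maxGap) {
        eraEnd <- max(eraEnd, x$end_date[i])   # extend the current era
      } else {
        eras[[length(eras) + 1]] <- addEra(id, eraStart, eraEnd)
        eraStart <- x$start_date[i]            # start a new era
        eraEnd <- x$end_date[i]
      }
    }
    eras[[length(eras) + 1]] <- addEra(id, eraStart, eraEnd)
  }
  do.call(rbind, eras)
}

# Derive risk window dates from the cohort dates (hypothetical helper that
# mirrors only the meaning of the four arguments named above).
riskWindow <- function(cohort, riskWindowStart = 0, startAnchor = "cohort start",
                       riskWindowEnd = 0, endAnchor = "cohort end") {
  anchorDate <- function(anchor) {
    stopifnot(anchor %in% c("cohort start", "cohort end"))
    if (anchor == "cohort start") cohort$cohort_start_date else cohort$cohort_end_date
  }
  cohort$risk_start <- anchorDate(startAnchor) + riskWindowStart
  cohort$risk_end <- anchorDate(endAnchor) + riskWindowEnd
  cohort
}

eras <- collapseEras(rx)

# Defaults: the risk window equals the cohort start and end.
defaultTar <- riskWindow(eras)

# Alternative: risk starts 1 day after cohort start and ends 30 days after it.
customTar <- riskWindow(eras, riskWindowStart = 1, startAnchor = "cohort start",
                        riskWindowEnd = 30, endAnchor = "cohort start")
print(defaultTar)
print(customTar)
```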
@schuemie For plotKaplanMeier(), is there a function like summary(survfit(Surv(...))) that we can use to check the risk at each time point? Thank you!
No, sorry! You'll have to hack plotKaplanMeier() for that, e.g. copy the function's source code and add some code of your own to export the data object.
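As a possible workaround that avoids touching plotKaplanMeier(), one could fit the curves directly with the survival package and ask summary() for specific time points. The sketch below runs on simulated data; the column names treatment, survivalTime, and outcomeCount are assumed to match the study population data frame, so check them against your own object before reusing this.

```r
library(survival)

# Simulated stand-in for a study population (hypothetical values).
# Column names are an assumption about what the population object contains.
set.seed(1)
n <- 200
population <- data.frame(
  treatment    = rep(c(1, 0), each = n / 2),
  survivalTime = sample(1:365, n, replace = TRUE),
  outcomeCount = rbinom(n, 1, 0.3)
)

# Event indicator: 1 if the person had at least one outcome, 0 if censored.
population$y <- as.integer(population$outcomeCount != 0)

# Kaplan-Meier fit per treatment group.
fit <- survfit(Surv(survivalTime, y) ~ treatment, data = population)

# Number at risk, events, and survival at chosen time points (in days).
atTimes <- summary(fit, times = c(30, 90, 180, 365), extend = TRUE)
riskTable <- data.frame(
  strata   = atTimes$strata,
  time     = atTimes$time,
  nRisk    = atTimes$n.risk,
  nEvent   = atTimes$n.event,
  survival = atTimes$surv,
  lower    = atTimes$lower,
  upper    = atTimes$upper
)
print(riskTable)
```

Note that this plain fit ignores any stratification or matching, so the numbers may differ from an adjusted curve drawn for a matched or stratified population.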