CohortDiagnostics: What is the purpose of JSON / can one avoid providing it?

Hello! I’m interested in setting up CohortDiagnostics for a cohort that I’ve created using “hand written” SQL, i.e. no ATLAS.

But it seems that I’m not allowed to: CohortDiagnostics wants a JSON file. I don’t understand why that’s needed, and would like to know if there’s some workaround, for instance a “convertSQLtoJSON” function?

That’s a good question.

Starting with version 3, CohortDiagnostics does NOT generate cohorts, so it does not need the JSON for that purpose.

CohortDiagnostics has diagnostics related to concept sets that require the JSON, and the incidence rate computation requires it as well. Both rely on SQL that is generated using Circe.

If you are willing to turn those two diagnostics off, then you can provide any "dummy" JSON and you should get the other diagnostics.

I have not tested this out. Please let me know your experience.

Thank you very much for answering, most helpful.

In terms of dummy JSON, we actually put together a JSON file manually, which at least makes it possible to run CohortDiagnostics. But we felt uncertain about how it is used in relation to the SQL: does the JSON affect anything, in case we messed up constructing it?

Is it correct to say that as long as we’re turning off the incidence and concept set-tabs, we can provide whatever JSON that makes CohortDiagnostics run, and the output will be based on the SQL and not the JSON?

Perfectly OK to advise that we should go through the code and turn off the incidence and concept set calculations as well, just asking in case we could save ourselves that little endeavor.

It's not a simple answer, but in theory you can do what you are trying to do.

Note: CohortDiagnostics requires its input to be an object called ‘cohortDefinitionSet’, which is defined by the OHDSI/CohortGenerator package. In addition, we require that object to have the fields json, cohortId, cohortName and sql. The reason is that the cohort JSON, as generated by the OHDSI Circe library, is parsed by CohortDiagnostics and its companion DiagnosticsExplorer Shiny app.

The check is performed here.

We pretty much export the content of this object, including JSON, as output - here https://github.com/OHDSI/CohortDiagnostics/blob/5f4d80e9f4210ffaf2ac7d4ab2b26102e4987c58/R/RunDiagnostics.R#L435

So you will have to skip all these processes, and provide a dummy JSON to avoid an error.
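
To make the shape of that input concrete, here is a minimal, language-neutral sketch written in Python rather than R. It is not CohortDiagnostics code: `REQUIRED_FIELDS` mirrors the four fields named above, while `missing_fields`, the example SQL (with SqlRender-style `@` parameters, assumed here) and the `"{}"` placeholder are hypothetical. Whether an empty JSON object is enough for CohortDiagnostics to run has not been tested.

```python
import json

# The four fields the post above says a cohortDefinitionSet must carry.
REQUIRED_FIELDS = ("cohortId", "cohortName", "sql", "json")


def missing_fields(cohort_definition_set):
    """Return {row index: [missing or empty fields]} for rows failing the check."""
    problems = {}
    for index, row in enumerate(cohort_definition_set):
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            problems[index] = missing
    return problems


# Hypothetical hand-written cohort: real SQL, placeholder JSON.
hand_written = {
    "cohortId": 1,
    "cohortName": "Hand-written SQL cohort",
    "sql": (
        "INSERT INTO @target_database_schema.@target_cohort_table "
        "(cohort_definition_id, subject_id, cohort_start_date, cohort_end_date) "
        "SELECT @target_cohort_id, person_id, "
        "observation_period_start_date, observation_period_end_date "
        "FROM @cdm_database_schema.observation_period;"
    ),
    # Assumption: a syntactically valid but empty JSON document as the dummy.
    "json": json.dumps({}),
}

# A row without JSON is reported; the row with a dummy JSON passes.
incomplete = {"cohortId": 2, "cohortName": "No JSON", "sql": "SELECT 1;"}
print(missing_fields([hand_written, incomplete]))
```

The point of the sketch is only that the JSON column has to be present and non-empty for the input check; in this scenario the cohort itself is still defined by the SQL.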

@OskarGauffin you are welcome to make a proposal for this functionality in the OHDSI/CohortDiagnostics issue tracker on GitHub.

I think it would be useful in many scenarios. For example - it is possible that we take as an input an instantiated cohort table i.e. a table with cohort_definition_id, subject_id, cohort_start_date, cohort_end_date. Can we just point to that cohortTable and runCohortDiagnostics?

I think that would be valuable, as we would get a dashboard (the DiagnosticsExplorer Shiny app) that provides:

  • cohort counts
  • cohort overlap
  • visit context
  • cohort characterization/temporal characterization, including cohorts as features, which is new in version 3
  • cohort time series, also new in version 3
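
As a rough illustration of why the first two items need nothing beyond an instantiated cohort table, here is a small Python sketch (not CohortDiagnostics code). The rows, `cohort_counts` and `cohort_overlap` are hypothetical, and the overlap is simplified to "subjects present in both cohorts", ignoring the dates.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations

# Hypothetical rows:
# (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)
cohort_table = [
    (1, 101, date(2020, 1, 1), date(2020, 6, 30)),
    (1, 102, date(2020, 2, 1), date(2020, 3, 1)),
    (1, 102, date(2021, 2, 1), date(2021, 3, 1)),
    (2, 102, date(2020, 2, 15), date(2020, 4, 1)),
    (2, 103, date(2020, 5, 1), date(2020, 5, 31)),
]


def subjects_by_cohort(rows):
    """Group distinct subject ids by cohort_definition_id."""
    subjects = defaultdict(set)
    for cohort_id, subject_id, _start, _end in rows:
        subjects[cohort_id].add(subject_id)
    return subjects


def cohort_counts(rows):
    """Return {cohort_id: (number of entries, number of distinct subjects)}."""
    entries = defaultdict(int)
    for cohort_id, _subject_id, _start, _end in rows:
        entries[cohort_id] += 1
    subjects = subjects_by_cohort(rows)
    return {cid: (entries[cid], len(subjects[cid])) for cid in sorted(entries)}


def cohort_overlap(rows):
    """Return {(cohort_a, cohort_b): subjects appearing in both cohorts}."""
    subjects = subjects_by_cohort(rows)
    return {
        (a, b): len(subjects[a] & subjects[b])
        for a, b in combinations(sorted(subjects), 2)
    }


print(cohort_counts(cohort_table))   # {1: (3, 2), 2: (2, 2)}
print(cohort_overlap(cohort_table))  # {(1, 2): 1}
```

Neither function looks at a cohort definition, JSON or SQL; both work from the four columns alone.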

@jpegilbert is the new maintainer of CohortDiagnostics. So he will have to consider this issue.