[The Book of OHDSI] Data Quality, evidence quality, and general validity

schuemie · March 16, 2019, 5:57am

A chapter on data quality is currently being written for the Book of OHDSI by @Vojtech_Huser, but we don’t have any specific chapters covering other aspects of validity.

In @jon_duke’s talk on validation (see slides here), Jon distinguishes four aspects of validation that contribute to evidence quality:

Data validation
Software validation
Clinical validation
Methods validation

I would argue we’d need to cover all four aspects in our book, and as you can see in Jon’s slides, we already have quite a lot to say on each topic.

As currently organized, the book would cover some of the non-data aspects throughout other chapters. For example, software validity would be covered in the chapter on OHDSI tools.

An alternative would be to have an entire book section, with four chapters corresponding to the four aspects highlighted by Jon (Data, software, clinical, and methods validity).

Yet another alternative would be to change the Data Quality chapter to be a generic validity chapter, and cover all aspects in a single chapter.

Let me know what you think! Looping in @Rijnbeek and @David_Madigan.

saradempster · March 19, 2019, 6:48pm

Hi @schuemie,
This is an interesting question and thanks for the link to @jon_duke’s nice slide deck.

Overall, I’d vote for a more holistic integration of the quality discussions. The subject of quality may be hard to break out into its own chapters. Quality and validation are threads that would naturally flow through many of the specific discussions in existing chapters, and those chapters provide the context for these discussions. For example, regarding data quality alone, there are at least 4 aspects of data quality impacting evidence quality: (1) reviewing vocab mappings, (2) checking source data quality, (3) ensuring ETL conventions are not violated, (4) evaluating the fidelity and completeness of source data in the CDM form. Each of those is pertinent to existing chapters in the current book outline. For example, a discussion of how and why to evaluate the ETL conversion of a particular database seems most natural in the ETL chapter.

Similarly for clinical validation, there are aspects that are integral to the study design and cohort building discussions. A methods chapter would naturally discuss choosing an appropriate method for the scientific question, being clear about limitations and assumptions, checking those assumptions, and checking things like convergence and reproducibility.

That said, a summary chapter for each of the four overall areas that you listed, highlighting general principles and tools for quality, would be useful. The more general discussion could be supplemented by specific discussions within the detailed chapters, e.g. the ETL chapter discusses how to confirm that ETL conventions are followed and why the conventions are important.

In addition, in the intro, the Open Science chapter could give a general description of the four areas and discuss how community participation enables constant iteration on quality across all of them.

Todd_Price · March 19, 2019, 6:53pm

If you have one chapter on validity that spoon-feeds the key information, you can then take all of the other, more fleshed-out writing and use it for PPT and instructor resource material, which could be an add-on to the book.

tp