Metadata/Annotations WG: DDL and concepts feedback

Ajit_Londhe · August 13, 2018, 6:54am

Hi all,

Over the past few weeks, @Vojtech_Huser and I have been using the Metadata and Annotations WG calls to discuss new standard concepts and concept hierarchies that can be used to represent our Metadata use cases in a standard fashion. These concepts (governed by the hierarchies) can then be used to represent Metadata facts (data quality, ETL/design, source provenance, or data content) and Annotations from data or clinical experts. These representations can then be stored in a new Metadata schema of tables that can also maintain authorship and log activities.

Before our upcoming WG call on Friday, August 17, I’d like to ask those involved or interested in Metadata and Annotations to provide feedback on the drafts of these efforts by adding comments to the documents linked below. The sooner we can reach a consensus, the sooner we can bring this to the CDM WG and start developing a branch of WebAPI and Atlas that can use these new tables and conventions. Additionally, to support sites that don’t use Atlas, we will need to establish a library of standard queries and conventions to ensure that this new Metadata schema is used in a consistent manner regardless of implementation.

In these Google Drawings files, you will find the concept hierarchies for Metadata and Annotation.

In this Google Sheets file, you will find worksheets detailing the new concepts we’re proposing, and a draft of the Metadata schema design.

Additionally, I will be attending @jon_duke’s Chart Review Question Interface WG this Wednesday to discuss these drafts and see where we need to make adjustments to support that effort.

Thanks,
Ajit

Ajit_Londhe · August 16, 2018, 8:16pm

All,

An update from the Chart Review WG call with @jon_duke and @Andrew: we reviewed the concept hierarchies, concepts, and DDL, and it was generally well received. The tables defined in the Google Doc would comprise a new metadata schema that would be separate from the CDM or Results schemas, but still reside within a CDM database. The usage of separate dimension tables for time validity, authorship, transactional logging, and value storage could lend itself to hosting a variety of data quality, ETL/design, data content, and source provenance information. We could even eventually bring Achilles (the original OHDSI metadata) results into this schema.
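
As a rough illustration of that layout (this is not the draft DDL from the shared files; every table and column name below is a hypothetical placeholder), a central fact table could reference separate authorship, time-validity, and value tables, with a log table recording activity. A minimal sketch, assuming an in-memory SQLite database stands in for the metadata schema:

```python
import sqlite3

# Hypothetical sketch only: all names are placeholders, not the WG's draft DDL.
SCHEMA = """
CREATE TABLE author (
    author_id   INTEGER PRIMARY KEY,
    author_name TEXT NOT NULL
);
CREATE TABLE metadata_fact (
    fact_id             INTEGER PRIMARY KEY,
    metadata_concept_id INTEGER NOT NULL,  -- assumed concept from the proposed hierarchy
    author_id           INTEGER NOT NULL REFERENCES author(author_id)
);
CREATE TABLE fact_validity (
    fact_id          INTEGER NOT NULL REFERENCES metadata_fact(fact_id),
    valid_start_date TEXT NOT NULL,
    valid_end_date   TEXT
);
CREATE TABLE fact_value (
    fact_id         INTEGER NOT NULL REFERENCES metadata_fact(fact_id),
    value_as_string TEXT,
    value_as_number REAL
);
CREATE TABLE activity_log (
    log_id    INTEGER PRIMARY KEY,
    fact_id   INTEGER NOT NULL REFERENCES metadata_fact(fact_id),
    activity  TEXT NOT NULL,
    logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def build_demo():
    """Create the placeholder tables and load one made-up annotation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO author VALUES (1, 'data_expert')")
    conn.execute("INSERT INTO metadata_fact VALUES (1, 1001, 1)")
    conn.execute("INSERT INTO fact_validity VALUES (1, '2018-08-01', NULL)")
    conn.execute("INSERT INTO fact_value VALUES (1, 'example annotation text', NULL)")
    conn.execute("INSERT INTO activity_log (fact_id, activity) VALUES (1, 'created')")
    conn.commit()
    return conn


def facts_with_details(conn):
    """Join each fact to its author, validity window, and stored value."""
    return conn.execute(
        """
        SELECT f.fact_id, a.author_name, v.valid_start_date, val.value_as_string
        FROM metadata_fact f
        JOIN author a ON a.author_id = f.author_id
        JOIN fact_validity v ON v.fact_id = f.fact_id
        JOIN fact_value val ON val.fact_id = f.fact_id
        """
    ).fetchall()
```

Calling `facts_with_details(build_demo())` returns the single demo fact joined to its author, start date, and value. Keeping authorship, validity, and values in their own tables is what would let different kinds of metadata share one fact table; whether the real design splits things this way is for the draft itself to settle.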

A key part of the discussion was about how the Metadata schema could host Chart Review data. We agreed that while this is a priority, we need to strike a balance between the goal of avoiding data redundancy in our ecosystem and the goal of allowing rapid development of new features. The current Chart Review app has some custom application tables that may not be a good fit for the Metadata schema at present, and others that could fold more seamlessly into the Metadata tables. For the former, the right schema may be the OHDSI Repository schema rather than the Metadata schema.

@jon_duke, to help evaluate more tangibly the intersection between your schema and the Metadata schema, could you share an export of some dummy data in your current schema? I’m curious to see how difficult it would be to land the results of the Chart Reviews into our current metadata schema design. In the meantime, I’ve begun working on an Authorship table API, as that was one of the key tables and web services you’d mentioned that could be leveraged by the Chart Review application straight away.

Lastly, I’d like to “poke the bear” a little and see if @razzaghih, @Andrew, @Vojtech_Huser, or @Frank have any feedback on the materials I described in the first post before our meeting tomorrow (8/17) at 2 pm est.

Thanks,
Ajit

Andrew · August 17, 2018, 2:35pm

Ajit,
I will repeat here the gist of my reply to the direct email version of this post that you sent.
I hope the excellent model you and Vojtech developed will support the same predictive modeling functions as the PEDSnet metadata warehouse. I don’t know the schema and vocab of their metadata warehouse well enough to evaluate that myself, so I’m hoping Connor Callahan of PEDSnet might provide that.

Based on the call with Jon’s group, I’ll add two comments.

1. The model is clear and intuitively appealing. Having seen your walk through and thought about it more, I see no obvious problems for using it to represent the use cases we’ve discussed in the WG. Thanks for the great work!
2. The potential interest in using it as an alternative to Achilles results tables was a new use case. I’m not opposed to that. The model’s ability to handle that for Achilles and other “working tables” needed by other apps isn’t as clear to me. How focused should we be on evaluating that capacity at this point?

Ajit_Londhe · August 17, 2018, 4:29pm

Could you connect us with Connor? Perhaps he can join us for a future call.

Regarding the Achilles results, I think we should assume that this will become a future use of the Metadata schema as it is actually observable Metadata. We should strive towards simplifying our data footprint as much as possible, so storing Achilles results in the Metadata schema allows us to then make annotations on Heel results directly without having to copy anything over from the Achilles tables.

I think the current design can work, but I welcome @t_abdul_basser’s feedback on this, as he is a key maintainer of the Achilles package and heavily involved in WebAPI development / Architecture.

t_abdul_basser · August 23, 2018, 4:04pm

Thank you @Ajit_Londhe. I have been following the Metadata WG work with pleasure. I agree with you that a significant subset of the current Achilles results are “observable metadata”. I look forward to working with you, @Vojtech_Huser, and others to enhance Achilles so that it produces and uses data that conforms to the emerging Metadata schema. Design issues, such as which subset of the currently generated Achilles results should be “moved” from the results schema to the Metadata schema and how such changes should be phased over the Achilles roadmap (coming soon!), should probably be discussed once the Metadata WG has had a bit more time to finish and stabilize its work.

Ajit_Londhe · August 30, 2018, 6:49pm

All,

We have our next Metadata and Annotations WG call tomorrow, August 31, at 2 pm est. Call-in details are located here:
http://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:metadata_and_annotations

Thanks,
Ajit