
How do we define "Metadata" and "Annotation?"

Hi all,

As part of our kickoff Metadata and Annotations WG call, we discussed how formalizing the OHDSI definitions for “metadata” and “annotation” is an important step to complete before we can start creating standards and conventions.

I took a shot at writing the definitions, and we did a bit of editing during the call, but I’d like to open this up to those who couldn’t attend and to the broader community to help us refine the definitions.

Metadata is information that can be directly observed, indirectly inferred, or externally obtained about an observational data set that provides us with a more complete understanding of the patient experience being represented.

Annotations are topical notes about metadata authored by those with relevant experience or expertise that are intended to improve study design for other researchers.




Hi @Ajit_Londhe,

On the wiki, the meeting schedule says May 16, but the time is not listed. Can you post?

To me, metadata should be its literal definition – data about data. So any information about the dataset itself would be metadata. Editing your entry, I’d say Metadata is information about an observational data set that provides a more complete understanding of the data contained.

Annotations seem to imply a statement of opinion rather than fact, but otherwise would be just the same. Annotations are expert assertions intended to provide a more complete understanding of the data contained within a dataset.



The generic philosophical definition is great, but why don’t you start putting substance on it by creating some use cases? What metadata and annotations are you keen to put in there? How can we standardize them?

Hello Ajit, I agree with @jon_duke that the metadata definition should be broader, not only observed from the data itself. For example, valuable metadata could also include contact information like that posted on the Data Network page. This can be enriched with, e.g., accessibility information.

The FAIR metadata principles might also be helpful to guide what metadata is needed and in what format.


Sorry about that, the meeting time is updated: 2 pm EST today.

All – we’re using a new Skype conference URL, please check the Wiki

Hello Ajit,

Would it be of interest to align this with the FAIR Data Principles? For example, https://www.w3.org/TR/hcls-dataset/ offers a great set of standardized vocabularies to build the metadata set around.

I’m sorry I cannot call in today, but I am very keen to be involved in this, as we also have some major development tasks around this in the forthcoming IMI EHDEN project, and we can build on a lot of expertise from both the FAIR community and previous experiences in IMI EMIF (Catalogue).

I was hoping we could work on this in the Bio IT FAIR Hackathon as I proposed at the forum, but that was this week and a bit too close to some other face-to-face meetings. We could still try to organize something, or do this next year at Bio IT if it makes sense.

Met vriendelijke groet / Kind regards,

Kees van Bochove

