Some examples of metadata (data that provide information about other data) are familiar in medical publishing. Digital object identifiers (DOIs), for example, make digital content easy to locate and are linked to other metadata embedded within an article. These may include tags identifying the title, author, journal and International Standard Serial Number (ISSN). Such metadata are used by programs like Altmetric to track online activity surrounding a specific publication. However, current processes do not allow the full potential of metadata to be realised in the evolving scholarly communications landscape.
Metadata 2020 is an international collaboration involving stakeholders from across scholarly communications.
Its mission involves “advocating for richer, connected, and reusable, open metadata for all research outputs in order to advance scholarly pursuits for the benefit of society”.
This work builds on that of Crossref, which facilitates the finding, citing, linking and reuse of research outputs by tagging such content with metadata. In an article in the journal Research Ideas and Outcomes, Kathryn Kaiser et al lay out the Metadata 2020 Principles for the use of metadata in scholarly communications.
The Metadata 2020 Principles
For metadata to support the community, it should be:
Compatible: provide a guide to content for machines and people.
- Metadata must be as open, interoperable (allowing information exchange), parsable (containing characters that can be broken down into recognisable components analysable by a computer), machine actionable (structured for recognition by computer programs), and human readable as possible.
- Persistent identifiers (PIDs), such as ORCID iDs, should be used to support interoperability. In contrast, a URL is not a PID: it can break and so does not reliably refer to a digital entity.
Complete: reflect the content, components and relationships as published.
- To achieve this, metadata must be as complete and comprehensive as possible.
Credible: enable content discoverability and longevity.
- To meet this principle, metadata must be of clear provenance, trustworthy and accurate.
Curated: reflect updates and new elements.
- The authors note that metadata will always be evolving, so metadata must be maintained and updated over time.
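As a rough illustration of what the Compatible and Complete principles can mean in practice, the Python sketch below stores an article's metadata as a structured record, checks that its persistent identifiers are well formed, and lists any missing fields. It is not taken from the Metadata 2020 article: the field names, the list of required fields, the DOI pattern and the example DOI and ISSN are all assumptions made for illustration. The ORCID iD shown is the well-known sample identifier used in ORCID's own documentation, and the check-digit routine follows the ISO 7064 MOD 11-2 scheme that ORCID describes.

```python
import json
import re

# Assumed, simplified patterns: real DOIs can be more varied than this.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")

# Hypothetical set of fields a publisher might treat as the minimum.
REQUIRED_FIELDS = ("title", "authors", "journal", "issn", "doi")


def is_valid_doi(doi: str) -> bool:
    """Check that a string looks like a DOI (format only, not registration)."""
    return DOI_PATTERN.match(doi) is not None


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and the ISO 7064 MOD 11-2 check digit of an ORCID iD."""
    if ORCID_PATTERN.match(orcid) is None:
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative record: the DOI and ISSN are placeholders, not real identifiers.
record = {
    "title": "An example article",
    "authors": [{"name": "Josiah Carberry", "orcid": "0000-0002-1825-0097"}],
    "journal": "An Example Journal",
    "issn": "1234-5679",
    "doi": "10.1234/example.5678",
}

# JSON is parsable by machines and still readable by people.
serialised = json.dumps(record, indent=2)

# A DOI is resolved through the doi.org proxy rather than a fixed web address.
resolver_url = "https://doi.org/" + record["doi"]
```

Calling `missing_fields(record)` on the record above returns an empty list, whereas a record holding only a title would be reported as lacking its authors, journal, ISSN and DOI. A check like this covers form only; whether the metadata are accurate and up to date, as the Credible and Curated principles require, still depends on provenance and ongoing maintenance.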
These principles aim to shape a common understanding of the metadata needed to support scholarly communications. The authors note that the language of metadata may be specific to a particular discipline, so best practice approaches must allow flexibility in metadata use across communities. In addition, existing metadata standards should be considered and reused as much as possible to improve the interoperability of different metadata schemas, while minimising redundancy.
Applying these principles may also help stimulate further development of ideas and workflows in the scholarly ecosystem. The future work of the Metadata 2020 group will focus on demonstrating the potential benefits of better metadata for stakeholders, including researchers, through a set of Metadata Practices.