Controlled Vocabularies

What is a controlled vocabulary?

A controlled vocabulary can be defined as:

an information tool that contains standardized words and phrases used to refer to ideas, physical characteristics, people, places, events, subject matter, and many other concepts. Controlled vocabularies allow for the categorization, indexing, and retrieval of information.

Introduction to controlled vocabularies (Harpring, 2010, p. 12)

A controlled vocabulary provides a definition and a code for each term it includes. This makes metadata easier to create and information easier to retrieve or exchange. Controlled vocabularies vary in their level of detail, from short one-dimensional lists to complex vocabularies with hierarchical relationships.
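As an illustration, a controlled vocabulary can be thought of as a small data structure in which every term carries a code, a label, and a definition, and optionally a pointer to a broader term. The following is a minimal sketch only: the class name Term, the helper functions, and all codes and terms are hypothetical and are not taken from any published vocabulary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    code: str                      # short, stable identifier stored in the data
    label: str                     # human-readable preferred term
    definition: str                # what the term means
    broader: Optional[str] = None  # code of the parent term; None at top level


def build_vocabulary(terms: List[Term]) -> Dict[str, Term]:
    """Index terms by code and reject duplicate codes."""
    vocab: Dict[str, Term] = {}
    for term in terms:
        if term.code in vocab:
            raise ValueError(f"duplicate code: {term.code}")
        vocab[term.code] = term
    return vocab


def path_from_root(vocab: Dict[str, Term], code: str) -> List[str]:
    """Return the labels from the top-level term down to the given term."""
    labels: List[str] = []
    seen = set()
    current: Optional[str] = code
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in hierarchy at {current}")
        seen.add(current)
        term = vocab[current]
        labels.append(term.label)
        current = term.broader
    return list(reversed(labels))


# Hypothetical flat (one-dimensional) vocabulary: no term has a parent.
RESPONSE_FORMAT = build_vocabulary([
    Term("RF1", "Single choice", "Exactly one option can be selected."),
    Term("RF2", "Multiple choice", "Several options can be selected."),
    Term("RF3", "Free text", "The answer is entered as open text."),
])

# Hypothetical hierarchical vocabulary: terms point to a broader term.
TOPICS = build_vocabulary([
    Term("T1", "Cognition", "Mental processes of acquiring and using knowledge."),
    Term("T1.1", "Memory", "Encoding, storing and retrieving information.", broader="T1"),
    Term("T1.1.1", "Working memory", "Short-term holding and manipulation of information.", broader="T1.1"),
])

print(path_from_root(TOPICS, "T1.1.1"))  # ['Cognition', 'Memory', 'Working memory']
```

The flat list and the hierarchy share the same structure. The only difference is whether the broader field is filled in, which is one simple way to picture the range from short lists to complex vocabularies.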

Why use controlled vocabularies?

Controlled vocabularies play an important role in metadata standards (see also the knowledge base’s section on metadata) because they define the meaning of metadata elements and the values that are allowed in an element or attribute. They can also help users find relevant data and provide information on how to interpret data, for both humans and machines. Using a controlled vocabulary improves the interoperability of your data, because vocabularies facilitate the interpretation and harmonization of data, especially if other researchers employ the same vocabulary for their data.
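To make the idea of allowed values concrete, the sketch below checks a metadata record against the vocabulary attached to each element. It is an assumption-laden illustration: the element names, the permitted values, and the function validate_record are all hypothetical and do not come from any particular metadata standard.

```python
from typing import Dict, List, Set

# Hypothetical metadata elements, each with its set of permitted values.
ALLOWED_VALUES: Dict[str, Set[str]] = {
    "study_design": {"experimental", "observational", "longitudinal"},
    "sample_type": {"students", "general_population", "clinical"},
}


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return one message per value that is not in its element's vocabulary."""
    problems: List[str] = []
    for element, value in record.items():
        allowed = ALLOWED_VALUES.get(element)
        if allowed is None:
            continue  # element is not controlled, so any value is accepted
        if value not in allowed:
            problems.append(f"{element}: '{value}' is not in the vocabulary")
    return problems


record = {
    "title": "Attention and recall",  # free-text element, not controlled
    "study_design": "Experiment",     # not an allowed value
    "sample_type": "students",        # allowed value
}
for message in validate_record(record):
    print(message)  # study_design: 'Experiment' is not in the vocabulary
```

Because every dataset that uses the same vocabulary records the same design with the same value, records from different researchers can be compared or merged without first reconciling spelling variants.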

Controlled vocabularies and ontologies

Controlled vocabularies can also be part of an ontology. An ontology (sometimes also referred to as a vocabulary) defines not only the meaning of individual elements, as controlled vocabularies do, but also the relationships between these elements and their hierarchical structure, and thereby the structure of a domain. Because these relationships are stated explicitly, machines can interpret the data as well.
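A rough way to picture the difference is to store an ontology as a list of subject, relation, object statements and let a program follow them. The sketch below assumes a toy set of statements; the concept names, the relation names is_a and measured_by, and both functions are hypothetical and are not drawn from any existing ontology or ontology library.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical statements about a small part of a domain.
TRIPLES: List[Triple] = [
    ("WorkingMemory", "is_a", "Memory"),
    ("Memory", "is_a", "CognitiveProcess"),
    ("WorkingMemory", "measured_by", "NBackTask"),
    ("NBackTask", "is_a", "Task"),
]


def superclasses(concept: str) -> Set[str]:
    """Collect every concept reachable by following is_a links transitively."""
    found: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for subject, relation, obj in TRIPLES:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def is_kind_of(concept: str, category: str) -> bool:
    """True if concept equals category or inherits from it via is_a."""
    return concept == category or category in superclasses(concept)


# No statement links WorkingMemory to CognitiveProcess directly;
# the program derives the link from the two is_a statements.
print(is_kind_of("WorkingMemory", "CognitiveProcess"))  # True
print(is_kind_of("NBackTask", "CognitiveProcess"))      # False
```

The derived answer in the first call is a small instance of machine interpretation: the relationship was never written down, but it follows from the stated structure.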

Which controlled vocabularies and ontologies exist for psychological data?

The most important source for controlled vocabularies in psychology is the American Psychological Association’s (APA) thesaurus for psychological terms, which is also available as a digital ontology on the BioPortal platform. A German translation of this thesaurus is provided by the ZPID.

For the social sciences, the DDI (Data Documentation Initiative) Controlled Vocabularies Group has created sets of controlled vocabularies that can be applied when using the DDI metadata standard (see also the knowledge base’s section on metadata) or on their own.

Other controlled vocabularies exist for specific types of research-related information, such as the Open Funder Registry.

Further Resources

  • Taina Jääskeläinen, Meinhard Moschner and Joachim Wackerow (2009) published the paper Controlled Vocabularies for DDI 3: Enhancing Machine-Actionability, which provides further information on controlled vocabularies.
  • The collaborative knowledge-building project Cognitive Atlas aims to create a knowledge base/ontology for cognitive science, which makes it relevant for psychologists working in cognitive science or neuroscience.
  • The Unified Medical Language System (UMLS for short), a set of files and software, combines a variety of vocabularies and standards from health and biomedicine to enable interoperability between different systems.
  • Protégé is a widely used tool for building and maintaining ontologies. It is free and open source.
  • linkedscience.org provides vocabularies for interconnecting scientific outputs.

References