Standards:Vocabularies
THO Standards for Cultural Heritage Digitization Projects
Controlled Vocabulary Standards
Introduction
Controlled vocabularies allow catalogers and collection managers to use descriptive terminology consistently. They may take the form of authority lists, in which one form of a term is considered definitive; thesauri, which describe relationships among terms; and structured classifications, which generally create hierarchical and/or faceted relationships between terms. Controlled vocabularies are primarily used for subject headings, but they can also establish the "correct" spellings of individual terms and the preferred structure of complex terms, such as personal names formed as "last name, first name" or dates formed as "two-digit-date three-letter-month four-digit-year."
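The preferred structures mentioned above can be enforced at data entry rather than left to each cataloger. The following Python sketch is illustrative only: the function names and the sample values are assumptions and are not part of any THO tool.

```python
from datetime import date

# Month abbreviations are listed explicitly so the output does not depend
# on the locale of the machine running the code.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_personal_name(first_name: str, last_name: str) -> str:
    """Return a personal name in the "last name, first name" form."""
    return f"{last_name.strip()}, {first_name.strip()}"


def format_date(value: date) -> str:
    """Return a date as two-digit day, three-letter month, four-digit year."""
    return f"{value.day:02d} {MONTHS[value.month - 1]} {value.year:04d}"


# Hypothetical sample values, used only to show the resulting forms.
name = format_personal_name("Jane", "Doe")   # "Doe, Jane"
day = format_date(date(1900, 1, 5))          # "05 Jan 1900"
```

Keeping the formatting rules in one place means every record that passes through them shares the same structure, which is the purpose of the controlled form.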
A properly developed controlled vocabulary supports both the search and browse functions of information retrieval, as described in Marcia J. Bates' article "The Design of Browsing and Berrypicking Techniques for the Online Search Interface." A controlled vocabulary may be used directly as a browse list. Alternatively, each subject heading displayed in a search result can be a hyperlink that launches a search for other items described by the same heading. The advantage of controlled vocabulary over uncontrolled keywords is that this hyperlinking can be managed without additional data mining or processing. While uncontrolled keywords remain useful tools for description (and are particularly useful in the emerging area of folksonomies), they should be considered a supplement to controlled vocabularies rather than a replacement.
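The browse list and the subject hyperlinks described above both follow from a simple index of records by heading. The sketch below assumes a hypothetical record layout (an "id" and a list of "subjects") and a hypothetical "/search" URL; neither is defined by THO.

```python
from collections import defaultdict
from urllib.parse import quote


def build_subject_index(records):
    """Map each subject heading to the ids of the records that use it."""
    index = defaultdict(list)
    for record in records:
        for heading in record["subjects"]:
            index[heading].append(record["id"])
    return dict(index)


def browse_list(index):
    """Return the headings in alphabetical order for use as a browse list."""
    return sorted(index)


def subject_link(heading, base_url="/search"):
    """Build a link that searches for every item sharing the heading."""
    return f"{base_url}?subject={quote(heading)}"


# Hypothetical records; the identifiers and headings are illustrative only.
records = [
    {"id": "photo-001", "subjects": ["Cattle drives", "Ranches"]},
    {"id": "letter-002", "subjects": ["Ranches"]},
]
index = build_subject_index(records)
headings = browse_list(index)         # ["Cattle drives", "Ranches"]
related = index["Ranches"]            # ["photo-001", "letter-002"]
link = subject_link("Cattle drives")  # "/search?subject=Cattle%20drives"
```

Because both records carry exactly the same heading string, the lookup is a plain dictionary access. With uncontrolled keywords, variants such as "ranch" and "Ranches" would land in separate entries and would need extra processing to reconcile.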
The Getty's Introduction to Vocabularies: A Guide to Enhancing Access to Art and Material Culture Information (Lanzi and Harpring, 2000) provides a good overview of controlled vocabulary. It is scheduled to be updated, and participants are urged to consult the latest version when it becomes available.
Controlled Vocabulary Best Practices
The preferred THO authority list for local and regional terms is The Handbook of Texas Online, which includes articles on people, places, and events related to Texas history and heritage. The authoritative version of any given term is the form used in the article title, which may be supplemented at the local level with a date facet. The Handbook of Texas Online is limited, however: it does not include articles on living people, and it may lack articles on people, places, and events of strictly regional interest. Participants should therefore supplement the Handbook of Texas Online with other sources of controlled vocabulary.
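One way to apply this practice is to consult the preferred authority list first and fall back to supplementary sources only when it has no entry. The Python sketch below is a minimal illustration under stated assumptions: the lookup tables hold placeholder entries, not actual Handbook of Texas Online article titles, and the function name is hypothetical.

```python
def resolve_term(term, sources):
    """Return (source name, authoritative form) from the first source
    that has an entry for the term, or None if no source has one."""
    key = term.casefold()
    for source_name, authority in sources:
        if key in authority:
            return source_name, authority[key]
    return None


# Placeholder authority lists keyed by lower-cased variant forms.
# These entries are invented for illustration only.
handbook = {"example person": "PERSON, EXAMPLE"}
local = {
    "example person": "Person, Example (local form)",
    "example creek bridge": "Example Creek Bridge",
}

# The preferred list is checked first, then the supplementary one.
sources = [("handbook", handbook), ("local", local)]

first = resolve_term("Example Person", sources)
second = resolve_term("Example Creek Bridge", sources)
missing = resolve_term("Unknown Term", sources)
```

Here `first` comes from the preferred list even though the local list also has an entry, `second` falls through to the local list, and `missing` is `None`, which signals a term that may need a new authority entry.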
Levels of Controlled Vocabulary
THO recognizes three levels of controlled vocabulary use:
Minimal
Participants will use a locally developed and maintained controlled vocabulary for names and subject headings. This is particularly appropriate for regional terms, but participants should be aware that use of a non-standard controlled vocabulary may result in omission of some records from search results.
Basic
Participants will use one or more standard sources for controlled vocabulary, such as the Getty Vocabularies, Library of Congress Subject Headings and Authorities, or Chenhall's Nomenclature. Whenever possible, the source of the controlled vocabulary will be indicated by a namespace or code in the descriptive metadata record.
Some common sources of controlled vocabulary include the following:
- Alexandria Digital Library Gazetteer Server
http://www.alexandria.ucsb.edu/clients/gazetteer/
- Alexandria Digital Library Geographic Feature Type Thesaurus
http://www.alexandria.ucsb.edu/gazetteer/FeatureTypes/ver070302/
- Art and Architecture Thesaurus (AAT)
http://www.getty.edu/research/conducting_research/vocabularies/aat/
- Getty Thesaurus of Geographic Names (TGN)
http://www.getty.edu/research/conducting_research/vocabularies/tgn/
- Geographic Names Information System (GNIS)
- Handbook of Texas Online
http://www.tsha.utexas.edu/handbook/online/
- Library of Congress Authorities
- Thesaurus for Graphic Materials I: Subject Terms
http://lcweb.loc.gov/rr/print/tgm1/toc.html
- Thesaurus for Graphic Materials II: Genre and Physical Characteristic Terms
http://lcweb.loc.gov/rr/print/tgm2/
- Union List of Artist Names (ULAN)
http://www.getty.edu/research/conducting_research/vocabularies/ulan/
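Indicating the source of each term by a code, as the Basic level asks, might look like the following Python sketch. The element name "subject", the attribute name "source", the source codes, and the sample terms are all assumptions made for illustration; THO does not prescribe this layout, and the terms have not been checked against the vocabularies named.

```python
import xml.etree.ElementTree as ET


def subject_element(term, source):
    """Build a subject element that records which vocabulary the term is from."""
    element = ET.Element("subject", {"source": source})
    element.text = term
    return element


def terms_from(record, source):
    """Return the subject terms in a record that cite the given source code."""
    return [s.text for s in record.findall("subject") if s.get("source") == source]


# Hypothetical record with illustrative terms and source codes.
record = ET.Element("record")
record.append(subject_element("Ranches", "lcsh"))
record.append(subject_element("photographs", "aat"))

xml_text = ET.tostring(record, encoding="unicode")
lcsh_terms = terms_from(record, "lcsh")  # ["Ranches"]
```

Recording the source alongside the term lets a harvesting system treat the same string differently depending on the vocabulary it came from, and lets it identify records that rely only on local terms.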
Enhanced
Participants will contribute to the development of controlled vocabulary appropriate to their projects, regions, and resources. For the Handbook of Texas Online, participants may use the "suggest an article topic" link if no article is available for a given term, or the "report an error or correction" link for articles that are incorrect. Additional tools for controlled vocabulary development may be available for the Handbook of Texas Online in the future. The Getty Vocabularies also have a formal method for contributions from participants.
Participants may also wish to explore alternate methods for vocabulary development, including new applications for controlled vocabulary such as the OCLC FAST project, tools for generating uncontrolled vocabulary, sometimes referred to as "folksonomies," or systems such as collabularies that fall somewhere in between. However, folksonomies should be used primarily as a supplement to controlled vocabularies rather than a replacement. Participants using locally developed vocabularies will post their local vocabularies online and establish a formal ontology that can be used in an XML namespace declaration.
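A locally developed vocabulary that has been posted online can be referenced from a metadata record through an XML namespace declaration. The Python sketch below shows the general mechanism only: the namespace URI, the "localvocab" prefix, the element names, and the sample term are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical address where a participant has posted a local vocabulary.
LOCAL_VOCAB_URI = "http://example.org/vocab/local#"

# Ask ElementTree to use a readable prefix for the namespace when serializing.
ET.register_namespace("localvocab", LOCAL_VOCAB_URI)

record = ET.Element("record")
subject = ET.SubElement(record, f"{{{LOCAL_VOCAB_URI}}}subject")
subject.text = "Example County courthouse"

# The serialized record carries the xmlns:localvocab declaration on its root.
xml_text = ET.tostring(record, encoding="unicode")

# A consumer can select local terms by namespace, whatever prefix was used.
parsed = ET.fromstring(xml_text)
local_terms = [e.text for e in parsed.findall(f"{{{LOCAL_VOCAB_URI}}}subject")]
```

The namespace URI does the identifying work here; the prefix is only a shorthand within the document, so two participants can choose different prefixes and still point to the same published vocabulary.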
References
- Lanzi, Elisa, and Patricia Harpring. (2000). Introduction to Vocabularies: A Guide to Enhancing Access to Art and Material Culture Information. Retrieved March 6, 2006, from http://www.getty.edu/research/conducting_research/vocabularies/introvocabs/
- Shirky, Clay. (2005). Ontology is Overrated: Categories, Links, and Tags. Retrieved March 13, 2006, from http://www.shirky.com/writings/ontology_overrated.html
- Texas State Historical Association. (2005). The Handbook of Texas Online. Available from http://www.tsha.utexas.edu/handbook/online/