USGS - science for a changing world

USGS Data Management

Figure: The USGS Science Data Lifecycle diagram. Elements shown: Plan; Acquire; Preserve; Publish/Share; Manage Quality; Describe (Metadata, Documentation); Backup & Secure.

Data Dictionaries and Thesauri

Data dictionaries define and describe the structured names used in a data resource, and thesauri provide terms that make your data easier to discover.

Key Points

  • A Data Dictionary is a repository of information that defines and describes a data resource.
  • A Thesaurus is a structured list of preferred terms around a subject.
  • Use widely known keywords and tags for your data in order to make your data more searchable and discoverable.
  • Find preferred terms and keywords with a thesaurus (e.g., USGS Biocomplexity Thesaurus).

Definition: Data Dictionary

According to the DOI Data Management Guide (2008), a data dictionary is a repository of data (metadata) defining and describing the data resource.

Definition: Thesaurus

A thesaurus is a structured list of preferred terms or subjects that indicates the relationships between those terms. Preferred terms are focal points where all information about a concept is collected. A preferred term can be broader than, narrower than, or otherwise related to another preferred term.

A thesaurus also indicates non-preferred terms, which are terms indexers and searchers should not use. A good thesaurus makes clear what a term is meant to cover by providing preferred terms, their relationships with other preferred terms, and non-preferred terms.
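The structure described above can be sketched as a small lookup table. This is only an illustration, not how the USGS Biocomplexity Thesaurus is implemented: the `Term` class, the `resolve` function, and every term listed are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One preferred term and its links to other preferred terms."""
    label: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)


# Hypothetical preferred terms, for illustration only.
PREFERRED = {
    "fishes": Term("fishes", narrower=["trout"]),
    "trout": Term("trout", broader=["fishes"], related=["fisheries"]),
    "fisheries": Term("fisheries", related=["trout"]),
}

# Non-preferred terms point to the preferred term to use instead.
NON_PREFERRED = {"fish": "fishes", "trouts": "trout"}


def resolve(keyword: str) -> Optional[Term]:
    """Return the preferred term for a keyword, or None if it is unknown."""
    key = keyword.strip().lower()
    key = NON_PREFERRED.get(key, key)  # swap a non-preferred term for its preferred one
    return PREFERRED.get(key)
```

With this sketch, `resolve("Fish")` returns the preferred term `fishes`, so an indexer who types a non-preferred keyword is steered to the term that searchers will also use.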

Data Dictionary Definitions

Example:

Entity: Fish Measurements - A sample of the physical measurements of rainbow trout in Lake Superior, MN collected on 07/14/2010.

Attributes:

  • Attribute Type: fish_totl
  • Attribute Definition: Fish Total Length - Measured total length (cm) of the fish from the mouth to the tip of the fin, with the mouth shut and the fin pinched closed.
  • Domain Range: 0.00-300.00;
    -999 = NA
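One reason to record a domain range and a missing-value code is that incoming values can then be checked against them. The sketch below encodes the example attribute above and validates a value. The `Attribute` class, its field names, and `check_value` are assumptions made for this illustration and are not part of any USGS tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """A data dictionary entry for one attribute (hypothetical layout)."""
    name: str
    definition: str
    units: str
    minimum: float
    maximum: float
    missing_code: float


# Values taken from the fish_totl example above.
FISH_TOTL = Attribute(
    name="fish_totl",
    definition="Fish Total Length: measured total length of the fish",
    units="cm",
    minimum=0.00,
    maximum=300.00,
    missing_code=-999,
)


def check_value(attr: Attribute, raw) -> Optional[float]:
    """Return the value as a float, or None if it is the missing-value code.

    Raises ValueError if the value is not numeric or falls outside the domain range.
    """
    value = float(raw)
    if value == attr.missing_code:
        return None
    if not attr.minimum <= value <= attr.maximum:
        raise ValueError(
            f"{attr.name}: {value} is outside {attr.minimum}-{attr.maximum} {attr.units}"
        )
    return value
```

Calling `check_value(FISH_TOTL, "42.5")` returns the length as a number, `check_value(FISH_TOTL, -999)` returns `None` for the NA code, and a value such as 301 raises an error because it falls outside the documented range.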

Entity definitions:

  • Define a person, place, or thing about which data can be stored
  • Must be clearly understood before attributes can be named or defined

Data element (attribute) definitions:

  • Describe the inherent nature of the data
  • Do NOT describe the entity that the attribute contains information about
  • Do NOT describe the uses of the data (where, when, how, or by whom)
  • Do NOT describe the codes or the values the codes represent

Tools

  • USGS Biocomplexity Thesaurus Project
    Description: The Biocomplexity Thesaurus Project is a thesaurus and dictionary database of terms and concepts in nearly every scientific field. The Biocomplexity Thesaurus serves as a controlled vocabulary for facilitating improved access and retrieval of data and information. Users can query the thesaurus for matching and related terms, both specific and broad.
    URL: http://www.usgs.gov/core_science_systems/csas/biocomplexity_thesaurus/index.html


U.S. Department of the Interior | U.S. Geological Survey
URL: http://www2.usgs.gov/datamanagement/describe/dictionaries.php
Page Last Modified: Tuesday, June 07, 2016