Metadata describe a dataset so that it can be understood, re-used, and integrated with other datasets. A metadata record includes information such as where the data were collected, who is responsible for the dataset, why the dataset was created, and how the data are organized. Metadata generally follow a standard format, which makes it easier to compare datasets and to transfer files electronically.
Why Do We Need Metadata?
- Data are not complete without a metadata record.
- Use metadata to understand and re-use data.
- Document everything about the data in the metadata record.
- Use mandated Federal metadata standards and tools to create metadata.
- Validate metadata to ensure they follow metadata standards.
- Share metadata with catalogs to improve discovery and access to the data.
- Metadata are an important component of a USGS data release.
Metadata are crucial for any potential use or re-use of data. No one can responsibly re-use or interpret data without accompanying metadata that explain how and why the dataset was created, where it is geographically located, and how the data are structured and what they mean.
There are many uses for metadata, even beyond the simple discovery of datasets. Metadata can be used for understanding data, analysis and synthesis, maintaining longevity of a dataset for an organization, tracking the progress of a research project, and demonstrating the return on investment for research at an institution.
For more information about metadata as it pertains to the USGS data release process, visit Metadata for Scientific Data FAQs.
How to create a metadata record:
- Getting started
- Creating metadata records
- Validating metadata records
- My metadata is created, what’s next?
1. Getting started
Gather content for the metadata record
- Understand what goes into a metadata record (e.g. title, abstract, methods, keywords, etc.).
- Use the Metadata Questionnaire [PDF] or Metadata in Plain Language to gather content for building a metadata record, or use metadata creation tools, which will ask you the same questions about your data.
What does a metadata record look like?
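Federal agencies are mandated by Executive Order 12906 to use metadata standards endorsed by the Federal Geographic Data Committee (FGDC): the FGDC Content Standard for Digital Geospatial Metadata (FGDC-CSDGM) and the International Organization for Standardization (ISO) suite of standards.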
Both FGDC-CSDGM and ISO require metadata to be formatted in Extensible Markup Language (.xml), although a stylesheet can be applied to the XML to make it easier to read. Learn more about XML for Advanced Users.
Examples of metadata records in FGDC-CSDGM are available for different types of information products. Each record can be viewed in its native XML code or with a stylesheet applied to make it easier to read.
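To give a feel for what the XML looks like, the following is a minimal sketch in Python that assembles only part of the identification section of an FGDC-CSDGM-style record (title, originator, publication date, abstract, purpose, and keywords). It is not a complete or valid record: a real record needs many more sections. The function name `build_record`, the originator, the date, and the placeholder text are all hypothetical values chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_record(title, originator, pubdate, abstract, purpose, keywords):
    """Build a partial FGDC-CSDGM-style record (identification section only)."""
    metadata = ET.Element("metadata")
    idinfo = ET.SubElement(metadata, "idinfo")

    # Citation: who produced the dataset, when, and what it is called.
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "pubdate").text = pubdate
    ET.SubElement(citeinfo, "title").text = title

    # Description: what the dataset contains and why it was created.
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    ET.SubElement(descript, "purpose").text = purpose

    # Theme keywords; "None" marks keywords not taken from a formal thesaurus.
    theme = ET.SubElement(ET.SubElement(idinfo, "keywords"), "theme")
    ET.SubElement(theme, "themekt").text = "None"
    for word in keywords:
        ET.SubElement(theme, "themekey").text = word
    return metadata


record = build_record(
    title="Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service "
          "Visitor Maps between 1961-1983",
    originator="Example Originator",  # hypothetical value
    pubdate="20240101",               # hypothetical value, YYYYMMDD
    abstract="Placeholder abstract describing what the dataset contains.",
    purpose="Placeholder statement of why the dataset was created.",
    keywords=["rivers", "hydrography"],
)
ET.indent(record)  # pretty-print; requires Python 3.9 or later
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

In practice, the metadata creation tools described on this page produce the full record, so a hand-built skeleton like this is mainly useful for understanding how the content you gather maps onto XML elements.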
An example of a metadata record in ISO 19115-2. Please note that it may contain only certain sections of the ISO standard.
- Alaska Data Integration Working Group (ADIwg) [XML]
2. Creating metadata records
The following free tools create or edit FGDC CSDGM metadata in XML. For a wider selection of tools see the FGDC Metadata Tools. For a list of tools for the ISO metadata standard, refer to the FGDC ISO Metadata Editor Review.
- USGS Online Metadata Editor (OME) - An online form for USGS staff to create FGDC-CSDGM metadata by answering simple questions about their data. Best for biological and non-biological datasets. Log in to start new records or to upload and edit existing ones. Save completed or ongoing records for later or download them directly to your computer.
- USGS Metadata Wizard - A Python toolbox in Esri ArcGIS Desktop for creating FGDC-CSDGM metadata for geospatial datasets. The tool ingests geospatial files and, through a semi-automated workflow, creates and updates metadata records in Esri’s 10.x software. Best for geospatial data (e.g. rasters and shapefiles) and tabular data (e.g. Esri geodatabase or database file). Comma-separated value (CSV) files can be used but must first be converted into Esri formats.
- USGS TKME - A Windows tool for creating FGDC-CSDGM metadata that can be configured for the Biological Data Profile and other extensions. The program is closely aligned with the Metadata Parser and can be configured for French and Spanish.
- USDA Metavist - A desktop metadata editor for creating FGDC-CSDGM for geospatial metadata. Includes the Biological Data Profile (version 1.6). Produced and maintained by the USDA Forest Service.
- Microsoft XML Notepad - A simple, intuitive user interface for browsing and editing XML files. It does not automatically produce FGDC-CSDGM records but allows easy editing and validating of existing metadata records. See Advanced Users to learn how to configure this tool.
- Gather all information together, especially if multiple people have information that you need.
- Use information that is already developed.
- Re-use text from grant or funding proposals (e.g. abstract, purpose, date, etc.).
- If you created a data dictionary during data collection and processing, reference it; it can be included in the metadata.
- Choose a descriptive title for your dataset that incorporates who, what, where, why, and scale.
- Example: Greater Yellowstone Rivers from 1:126,700 U.S. Forest Service Visitor Maps between 1961-1983
- Choose keywords wisely: Consider all of the possible interpretations of your word choices and use a thesaurus to add descriptive terms you may not have otherwise selected.
- Include as many details as you can in the metadata record for future users of the data.
- Review your metadata for completeness and accuracy.
- Ask someone unfamiliar with the project to review your metadata objectively.
- Check for clarity and omissions.
- Use the best practices described in the Systems Level Applications or Collections [PDF] for large data systems or when describing "collections" of datasets.
3. Validating metadata records
You must validate metadata to ensure they have been created properly and all required elements have been filled in. Validation checks the XML metadata record against the metadata standard to ensure the record conforms to the structure of the standard. For FGDC-CSDGM metadata, see the best practices in Checking Metadata with Data [PDF]. Many metadata creation and editing tools (such as OME and Metadata Wizard) validate automatically, so a second validation may not be necessary.
- USGS Metadata Parser - A tool that validates XML metadata records against the FGDC-CSDGM standard and generates a report of any errors it finds. Good for geospatial and non-geospatial datasets. Users can view XML metadata records in easy-to-read formats (HTML, text). It is multilingual (English, French, and Spanish) and can be configured for the Biological Data Profile and other extensions. Advanced users can learn how to Run MP from the Command Line window [PDF].
- Microsoft XML Notepad - The tool offers the ability to validate records but requires a schema package. See Advanced Users to validate metadata.
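The idea of checking that required elements are present and filled in can be illustrated with a short Python sketch. This is not a substitute for the Metadata Parser or for schema validation: it checks only a small, illustrative subset of FGDC-CSDGM element paths, and it does not check element order or content rules. The list `REQUIRED_PATHS`, the function name `find_missing`, and the sample record are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of element paths (an assumption for this sketch);
# the standard itself defines many more required elements.
REQUIRED_PATHS = [
    "idinfo/citation/citeinfo/origin",
    "idinfo/citation/citeinfo/pubdate",
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "metainfo/metd",
    "metainfo/metstdn",
]


def find_missing(xml_text, required_paths=REQUIRED_PATHS):
    """Return the paths that are absent or contain only whitespace."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in required_paths:
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            missing.append(path)
    return missing


# Hypothetical, deliberately incomplete record.
sample = """<metadata>
  <idinfo>
    <citation><citeinfo>
      <origin>Example Originator</origin>
      <pubdate>20240101</pubdate>
      <title>Example dataset title</title>
    </citeinfo></citation>
    <descript><abstract>   </abstract></descript>
  </idinfo>
</metadata>"""

missing = find_missing(sample)
for path in missing:
    print("Missing or empty:", path)
```

For the sample record, the sketch reports the blank abstract, the absent purpose, and the two absent metadata reference elements. A check like this can catch obvious omissions early, but the record should still be run through a validating tool.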
4. My metadata is created, what’s next?
- USGS policy requires a formal review of the data and metadata if intended as a USGS data release.
- Package your data and metadata together whenever possible since the metadata record is critical to understanding the data.
- Work with your organization to identify how metadata should be shared, or visit Publish and Share for more information. Sharing metadata improves discoverability, access, and reuse of the data. The USGS Science Data Catalog is the approved mechanism for serving USGS metadata to catalogs such as data.doi.gov, data.gov, and geoplatform.gov.
To create, edit, and/or validate metadata records directly in XML code, use an XML editor such as the Microsoft XML Notepad:
- Instructions for using XML Notepad [PDF]
- Sample Starter Template [XML] - A starter metadata record that can be filled in with content.
- Stylesheets: Use XML Notepad with stylesheets [XSL] to display metadata in a human-readable form.
- Configuring XML Notepad to validate metadata: Use a schema package (.xsd) to validate the record and ensure the fields are in the proper order as defined by the FGDC-CSDGM standard.
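As a rough illustration of what a stylesheet does for readability, the following Python sketch turns a metadata record into an indented outline of element names and values. It does not use XSLT and is not how XML Notepad works; it is only a simplified stand-in, and the function name `to_outline` and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_outline(element, depth=0):
    """Return indented 'tag: text' lines for an element and its descendants."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(to_outline(child, depth + 1))
    return lines


# Hypothetical one-element record used only to show the output shape.
sample = (
    "<metadata><idinfo><descript>"
    "<abstract>Example abstract text.</abstract>"
    "</descript></idinfo></metadata>"
)
outline = to_outline(ET.fromstring(sample))
print("\n".join(outline))
```

A real stylesheet can also rename the short element tags to their full names and add headings, which this outline does not attempt.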
What the U.S. Geological Survey Manual Says:
The USGS Instructional Memorandum IM OSQI 2015-2 Fundamental Science Practices: Metadata for Scientific Data, Software, and Other Information Products discusses metadata requirements for data products.
Policy: Metadata must accompany all USGS scientific data, software, and other information products approved for release. The requirement for metadata applies to all scientific data, prior to approval for release. This includes geospatial and non-geospatial datasets, databases, and Web data services that are created, collected, or compiled by USGS employees, volunteers, contractors, or data from other sources that are subsequently made part of a USGS dataset, database, or service.
Metadata records for datasets and databases must comply with one of the following FGDC standards: FGDC Content Standard for Digital Geospatial Metadata or the International Organization for Standardization suite of standards (refer to http://www.fgdc.gov/metadata).
The USGS Instructional Memorandum IM OSQI 2015-3 Fundamental Science Practices: Review and Approval of Scientific Data for Release discusses when metadata requirements apply and when a metadata review is required.
Policy: Approved data must comply with the metadata policy (refer to IM OSQI 2015-2), and the metadata must accompany the data approved for release when released.
USGS releases both approved and provisional data. Until they are approved, data are considered provisional or preliminary. Provisional or preliminary data released to the public are not required to meet the metadata requirements of IM OSQI 2015-2.
For additional guidance, please refer to the Fundamental Science Practices FAQ: Metadata for USGS Scientific Data.