FSP FAQs: Metadata for USGS Scientific Data
Note: Terms used in these FAQs will be referred to by their acronyms (in parentheses) as follows: U.S. Geological Survey (USGS), Survey Manual (SM), Instructional Memorandum (IM), Office of Science Quality and Integrity (OSQI), Fundamental Science Practices (FSP), Bureau Approving Officials (BAOs), Office of Science and Technology Policy (OSTP), Office of Management and Budget (OMB), Information Product Data System (IPDS), National Water Information System (NWIS), Freedom of Information Act (FOIA), Data Management Plan (DMP).
These frequently asked questions (FAQs) supplement IM OSQI 2015-02 and apply only to metadata for USGS scientific data. Additional guidance and information on metadata for USGS data products is available at http://www.usgs.gov/datamanagement/describe.php. FAQs on release of scientific data are also available.
Updates and additions to the FAQs will be posted as they occur (month/year). Questions about FSP policies and procedures that are not addressed here should be directed to the FSP Advisory Committee or a BAO in the OSQI.
- What is metadata?
- Why do we need metadata for data?
- What do metadata records look like?
- How do I create metadata?
- What is a metadata review and who can perform it?
- When do I create metadata?
- I have a lot of data packaged in different datasets and databases. For which packages of data do I produce a metadata record?
- Are metadata records required for any size dataset?
- Are metadata records needed for datasets and databases that are provided by non-USGS authors and are subsequently included in USGS datasets, databases, or publications?
- Are the output data generated by a model simulation subject to the metadata requirement?
- Do summary data tables in information products such as USGS series or outside publications (e.g., journals) need metadata?
- Is USGS Science Publishing Network (SPN) editing required for metadata records?
- Where do the metadata records go once we have created them?
- Where can I find additional guidance or information about metadata?
1. What is metadata?
Metadata document important aspects of the data: where, when, and why the data were collected; who collected the data; what types of data were collected; what processes were used to create the data; what quality assurance controls were used; and where the collected data are located. Metadata are provided in a human-readable form as well as in a format that is machine-readable (i.e., XML) for automated use.
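As an illustration of the machine-readable form, the following minimal Python sketch builds a small XML record with the standard library. The element short names (metadata, idinfo, citeinfo, descript, dataqual, and so on) follow the FGDC CSDGM convention, but the record is deliberately incomplete and would not pass validation. The function name build_record, the dataset title, and the example.org link are hypothetical values used only for illustration.

```python
import xml.etree.ElementTree as ET


def build_record(title, originator, pubdate, abstract, purpose, process_step, online_link):
    """Build a small, incomplete FGDC-CSDGM-style record (illustrative only)."""
    metadata = ET.Element("metadata")

    # Identification: who created the data, when, and what they are called.
    idinfo = ET.SubElement(metadata, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "pubdate").text = pubdate
    ET.SubElement(citeinfo, "title").text = title
    ET.SubElement(citeinfo, "onlink").text = online_link  # where the data are located

    # Description: what the data are and why they were collected.
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    ET.SubElement(descript, "purpose").text = purpose

    # Data quality: processes used to create the data.
    dataqual = ET.SubElement(metadata, "dataqual")
    procstep = ET.SubElement(ET.SubElement(dataqual, "lineage"), "procstep")
    ET.SubElement(procstep, "procdesc").text = process_step

    return metadata


# All values below are hypothetical placeholders.
record = build_record(
    title="Example stream temperature measurements",
    originator="U.S. Geological Survey",
    pubdate="2015",
    abstract="Hourly stream temperature readings from a hypothetical site.",
    purpose="Collected to illustrate the structure of a metadata record.",
    process_step="Raw sensor readings were checked against field calibration notes.",
    online_link="https://example.org/data/stream-temperature",
)

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A complete record contains many more sections (for example, time period, keywords, and contact information); the metadata tools referenced in these FAQs are the appropriate way to produce one.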
2. Why do we need metadata for data?
Metadata enable users to find, understand, and reuse the data, thus extending the life of the data for those who may want to use or refer to them. In addition, a metadata record is required by the USGS for including data in the Science Data Catalog. Further, Federal Government mandates (including Executive Order 12906 and OMB Circular A-16) and USGS policy (IM OSQI 2015-02) require that metadata be an integral part of data released to the public.
3. What do metadata records look like?
Examples of metadata for data include those found at http://sofia.usgs.gov/metadata/sflwww/SOFIA_Cape_Sable.html and http://sofia.usgs.gov/metadata/sflwww/gachemca.html. Refer to these and other examples on the USGS Data Management Web page at http://www.usgs.gov/datamanagement/describe/metadata.php#hide-FGDC-CSDGM-Standard-Metadata-Examples.
4. How do I create metadata?
Various tools for creating metadata are available on the USGS Data Management Metadata Web page at http://www.usgs.gov/datamanagement/describe/metadata.php#hide-FGDC-CSDGM-Standard-Metadata-Tools. For example, the Online Metadata Editor (OME) helps you create a valid FGDC metadata record by answering common questions about your data. The tool can be used to start new records, upload and edit existing records, and save completed or ongoing records to the tool or directly to your desktop. The OME tool (https://www1.usgs.gov/csas/ome/) is available to USGS staff.
5. What is a metadata review and who can perform it?
A metadata review includes both checking for compliance with metadata standards using a recommended metadata validation tool and performing quality checks. At least one metadata review by a qualified reviewer is required for all USGS scientific data approved for release. A written report of all metadata reviews (reviewer comments and how they were reconciled) must be included in the internal IPDS package that is submitted for Bureau approval (refer to http://www.usgs.gov/fsp/faqs_datarelease.asp#faq1).
The role of the metadata reviewer is to evaluate the accuracy, completeness, and usability of the metadata. The metadata review can be conducted as part of the peer review, data review, or editorial review, or can be conducted separately as appropriate. Center Directors and Supervisors determine who serves as qualified metadata reviewers for the data produced by authors they supervise. A checklist that provides guidelines to reviewers of metadata is available (http://www.usgs.gov/datamanagement/documents/MetadataReviewChecklist_2014.pdf). For more information on metadata reviews, visit the USGS Data Management Metadata Web page at http://www.usgs.gov/datamanagement/describe/metadata.php#hide-FGDC-CSDGM-Standard-Metadata-Tools.
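To make the completeness part of a review concrete, the sketch below checks whether a record contains non-empty text for a short list of elements. It is only an illustration: the list EXPECTED_PATHS and the function missing_elements are hypothetical, the element names assume FGDC CSDGM short names, and the check does not replace a recommended metadata validation tool or a reviewer's judgment about accuracy and usability.

```python
import xml.etree.ElementTree as ET

# Hypothetical checklist of element paths a reviewer might expect to be filled in.
EXPECTED_PATHS = [
    "idinfo/citation/citeinfo/origin",
    "idinfo/citation/citeinfo/title",
    "idinfo/descript/abstract",
    "idinfo/descript/purpose",
    "dataqual/lineage/procstep/procdesc",
]


def missing_elements(xml_text, expected_paths=EXPECTED_PATHS):
    """Return the expected paths that are absent or contain only whitespace."""
    root = ET.fromstring(xml_text)
    missing = []
    for path in expected_paths:
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            missing.append(path)
    return missing


# Hypothetical record: the purpose is empty and there is no lineage section.
sample = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>U.S. Geological Survey</origin>
        <title>Example stream temperature measurements</title>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Hourly readings from a hypothetical site.</abstract>
      <purpose> </purpose>
    </descript>
  </idinfo>
</metadata>"""

for path in missing_elements(sample):
    print("Missing or empty:", path)
```

A list like the one printed here could feed the written review report, with each item recorded alongside how the author reconciled it.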
6. When do I create metadata?
A metadata record needs to be finalized and disseminated when the data are ready to be released to others. Planning for metadata should take place at the data management planning stage (refer to IM OSQI 2015-01). Metadata are a living resource that is collected, used, and revised throughout the data lifecycle; therefore, metadata collection should start as soon as the project begins. A metadata record compiled throughout the lifecycle of the data is likely to be more accurate, and to require less effort, than one started at the end of the project. The metadata record must be updated periodically to document any changes to the data, such as corrections and additions.
7. I have a lot of data packaged in different datasets and databases. For which packages of data do I produce a metadata record?
It depends on how the data will be used. You need a metadata record that describes the data package that will be cited, which is generally also the package that will be searched for in the Science Data Catalog and public search engines. Additional metadata records might be needed for different parts of data packages that have different creation or processing details.
8. Are metadata records required for any size dataset?
There is no established dataset size that determines when a metadata record is required. For example, a separate metadata record may not be needed if only a few sample results are presented in their entirety in a published table. However, if the table contains analytical or summary results, or if a small number of records were extracted from a larger dataset to create a table for a USGS series or an outside publication, then a metadata record is also appropriate for that larger source dataset.
When data are released, they must be accompanied by a metadata record. If a report uses a subset or summary of separately released data, no additional metadata record is needed.
9. Are metadata records needed for datasets and databases that are provided by non-USGS authors and are subsequently included in USGS datasets, databases, or publications?
Generally, yes, because these items become part of USGS information products. When incorporated into USGS information products, these datasets or databases from non-USGS sources need to comply with USGS data release requirements, including review and approval of data, and documentation of the source data. This is important because metadata records establish the provenance of incorporated data, which includes access to the original source data. If sufficient metadata do not exist, record(s) should be created and included as part of the new product's data package, or cited in the metadata for the new USGS data product.
10. Are the output data generated by a model simulation subject to the metadata requirement?
Yes, model data that will be made publicly available through the data release process need metadata. Source data used for the model should be well documented and cited in the metadata to allow the work to be understood and replicated by others.
11. Do summary data tables in information products such as USGS series or outside publications (e.g., journals) need metadata?
No. However, the data behind the summary table will need metadata and will need to go through the data release process.
12. Is USGS Science Publishing Network (SPN) editing required for metadata records?
No, an SPN editorial review is not required, but Science Centers have the option of obtaining such a review at their discretion.
13. Where do the metadata records go once we have created them?
The metadata record must stay with its associated data. Upon formal release of the data, metadata records for all USGS data-related products, including non-geospatial data, must also be registered with the USGS Science Data Catalog. For more information about how to include metadata in the Science Data Catalog, refer to http://data.usgs.gov/info/about.html.
14. Where can I find additional guidance or information about metadata?
Additional guidance on metadata creation, quality control and content review, tools, best practices, and clearinghouses is available on the USGS Data Management Web site's metadata page at http://www.usgs.gov/datamanagement/describe/metadata.php.