U.S. Geological Survey Manual
U.S. Geological Survey Instructional Memorandum
No: IM OSQI 2015-01
Issuance Date: February 19, 2015
Expiration Date: Retain Until Superseded
Subject: Scientific Data Management Foundation
1. Purpose. This Instructional Memorandum (IM) provides interim policy for establishing a U.S. Geological Survey (USGS) data management foundation following a data lifecycle model. This interim policy guidance is issued to allow the time needed for USGS science activities to fully implement the data management requirements herein and will be retained until superseded by a permanent Bureau Survey Manual (SM) policy chapter. Associated IMs related to some elements of the data management lifecycle will be issued separately.
2. Background. USGS scientific data encompass a wide variety of information including textual and numeric information, instrument readouts, statistics, images (fixed or moving), diagrams, maps, and audio or video recordings. They include raw or processed, published, and archived data; for example, data generated by experiments, models, and simulations, data from observations of natural phenomena at explicit times and locations, and data stored on any type of media.
3. Policy. USGS scientific data shall be managed throughout the data lifecycle as described in section 4 below and, when approved, the data must be released to the public in a machine-readable form under the authority of USGS Fundamental Science Practices (FSP) requirements (SM 502.1). Guidance and procedures that support this interim policy are available at the USGS Data Management Web site (http://www.usgs.gov/datamanagement). Other applicable FSP requirements related to review, approval, and release of all USGS science data and information must also be followed (SM 502.2 and SM 502.4). Refer to IM OSQI 2015-02, IM OSQI 2015-03, and IM OSQI 2015-04 for associated requirements.
4. Elements of the USGS Science Data Lifecycle Model. The descriptions of the USGS science data lifecycle elements below represent an overview of results to be achieved. Refer to the Data Management Web site (http://www.usgs.gov/datamanagement) for more detailed information.
A. Plan. The overall project work plan of every research project (SM 502.2) must include planning for data management. A data management plan describes standards and intended actions for acquiring, processing, analyzing, preserving, publishing/sharing, describing, managing quality, backing up, and securing the data holdings (http://www.usgs.gov/datamanagement/plan.php). The data management plan, like the work plan, should be updated during the research phase to reflect the reality of the project activities.
B. Acquire. Methods and techniques for acquiring research data are planned and documented to ensure that scientific findings are verifiable. Data acquisition encompasses collecting new data or adding to existing data holdings and may include data purchased or otherwise acquired for use in a USGS data product. Appropriate methods, agreements, and other requirements for acquiring the data are also considered (http://www.usgs.gov/datamanagement/acquire.php).
C. Process. Data processing denotes those actions or steps performed to verify, organize, transform, repair, integrate, and produce data in an appropriate output form for subsequent use. Methods of processing are documented to ensure the utility, quality, and integrity of the data (SM 502.2).
D. Analyze. Analysis involves actions and methods performed that help describe facts, detect patterns, develop explanations, and test hypotheses (for example, statistical data analysis, computational modeling, and interpretation of results). Methods of analysis are documented to ensure the utility and integrity of the data (SM 502.2).
E. Preserve. Preservation includes actions and procedures that are performed to ensure that data are retained and accessible consistent with the USGS Records Disposition Schedules and other applicable regulations. Archiving or transfers to an appropriate data repository or the National Archives and Records Administration are integral aspects of data preservation (http://www.usgs.gov/datamanagement/preserve.php). Controls are in place to protect proprietary and predecisional data (SM 502.5) and the integrity of the data (refer to Departmental Manual (DM) chapter 305 DM 3).
F. Publish/Share. USGS scientific data may be released or disseminated in a variety of ways, for example in datasets and databases, software, and other information products including USGS series publications (SM 1100.3), outside publications (SM 1100.4), and USGS Web pages. Publishing and sharing data benefit USGS scientists, the scientific community, the general public, and other stakeholders. The USGS supports data sharing by providing data services and guidelines, such as guidance on proper citation and on the use of persistent identifiers (for example, digital object identifiers or DOIs) for data and associated metadata (http://www.usgs.gov/datamanagement/share.php). All USGS scientific publications must identify their associated datasets. Refer to SM 502.4 for review, approval, and release requirements for USGS information products.
G. Describe (Metadata, Documentation). Metadata describe USGS scientific data and how they were collected and processed, and are essential for reproducibility of research results. Metadata include specific attributes, such as spatial coverage, scale, collection methods used, citation information, abstract, purpose of the data, and information about how data are formatted. USGS metadata records follow standards that are endorsed by the Federal Geographic Data Committee (http://www.fgdc.gov), which include metadata standards for both spatial and non-spatial data. Additional documentation provides information about data in the context of their use in specific systems, applications, and settings, and includes ancillary materials (such as field notes) as a supplement to standardized metadata (http://www.usgs.gov/datamanagement/describe.php).
H. Manage Quality. Data management activities (including use of standard methods and best practice techniques) are performed in a consistent, objective, and replicable manner to help ensure that high-quality and verifiable results are achieved (refer to SM 502.2). Quality assurance checks are performed at all stages of the science data lifecycle (http://www.usgs.gov/datamanagement/qaqc.php).
I. Backup and Secure. During all data management processes, backup copies of data are created to protect against loss that can result from events such as human error, hardware failure, computer viruses, power failure, or natural disaster. Best practices for backup and securing data at all stages of the data lifecycle are available (http://www.usgs.gov/datamanagement/backup.php).
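As an illustration of element I only, the following minimal Python sketch copies a data file to a backup location and confirms that the copy matches the original. It is not part of this policy or of the USGS guidance: the function names (sha256_of, backup_with_verification), the use of SHA-256 checksums, and the directory layout are assumptions chosen for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_verification(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and verify the copy by checksum.

    Hypothetical helper: the checksum comparison is an assumed integrity
    check, not a requirement stated in the memorandum.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies file contents and timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} failed checksum verification")
    return target
```

A call such as backup_with_verification(Path("observations.csv"), Path("backup")) would return the path of the verified copy or raise an error if the copy differs from the original. The file and directory names here are placeholders.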
5. Responsibilities. Everyone in the USGS involved in scientific data-related activities described in section 4 above is responsible for complying with this IM. Designated officials have specific roles in establishing the policies and other requirements that underpin data management:
A. Associate Directors and Regional Directors. Associate Directors (ADs) and Regional Directors (RDs), as members of the USGS Executive Leadership Team (ELT), set policy for how scientific investigations, research, and related activities are carried out and how data and information products are reviewed and approved for release and dissemination. The ADs and RDs provide oversight and support for the data management processes in their mission and regional areas. They collaborate with each other to address issues or take corrective action with regard to the data management lifecycle processes.
B. Office of Science Quality and Integrity and Core Science Systems. The Office of Science Quality and Integrity (OSQI) and Core Science Systems (CSS) are responsible for jointly developing USGS data management policy and collaborating on the development of related guidance and procedures. The OSQI coordinates with the ADs, RDs, or the entire ELT as needed to address and resolve issues regarding the execution of this policy and related data management review and approval processes. The CSS maintains comprehensive guidance and procedures related to data management (refer to the USGS Data Management Web site). The OSQI maintains and communicates this and other FSP-related policy documents.
C. Science Center Directors. Science Center Directors or their designees ensure compliance with data management requirements for data produced in their centers or offices and consult with their ADs, RDs, Managers (program and project), scientists, and others on their staff as needed with regard to carrying out data management activities, including ensuring the development of data management plans. They also assign or ensure the assigning of data managers to oversee or steward the lifecycle activities of their respective data products.
D. Approving Officials. Approving Officials, including Science Center Directors (or designees) and Bureau Approving Officials in the OSQI, ensure that USGS standards for scientific quality are followed by confirming that appropriate review, approval, and release requirements are met before they grant Bureau approval of the data products they have authority to approve.
/s/ Alan D. Thornhill February 19, 2015
Alan D. Thornhill Date
Director, Office of Science Quality and Integrity