R09: The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way.
Preservation Plan
To meet its legislative mandate to make collections publicly available over the long term, NTL performs curation activities, including preservation, migration, and transformations, to ensure permanent access to its records and research. As stewards of digital collections, NTL staff use current, widely accepted digital curation policies and practices where possible. NTL’s goal is to preserve all digital information at the bit level, at a minimum. This means that NTL will protect digital information from bit rot and media failure, so that future devices can faithfully reproduce the sequence of bits encoded in a digital information object. To achieve this goal, NTL employs the following preservation practices: descriptive metadata, persistent identifiers, the “3-2-1” backup rule, daily server backups, an extensive disaster recovery plan, and format migration. We evaluate all datasets for their scope and relevance to ROSA P. We then determine the level of preservation responsibility for each dataset by using the Levels of Curation below:
Preservation Strategies
NTL uses an international standard structured metadata format for interoperability and exports in XML to enable compatibility with future search technology. NTL uses a combination of Dublin Core Metadata Initiative terms and locally created terms. To ensure that intramural and extramural digital research data are handled consistently with OMB Memorandum M-13-13, DOT will require that the metadata for scientific data include, at a minimum, the common core metadata schema in use by the Federal government, DCAT-US Schema v1.1 (https://resources.data.gov/resources/dcat-us/). These metadata files are currently created in the JavaScript Object Notation (JSON) format, which allows them to be validated against the DCAT-US Schema v1.1. This ensures that our dataset metadata files are not only machine readable but also compliant with this federally mandated schema.
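As an illustration of what validating a JSON metadata file can involve, the following minimal Python sketch checks a single dataset record for the dataset-level fields that DCAT-US Schema v1.1 marks as always required, and for a recognized accessLevel value. It is not NTL's actual validation workflow. The function name check_dataset_record and the sample record are hypothetical, and a full validation would use the published DCAT-US JSON Schema rather than this simplified field check.

```python
import json

# Dataset-level fields that DCAT-US v1.1 marks as always required.
# Conditionally required fields (for example bureauCode) are not checked here.
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)

# The three accessLevel values defined by DCAT-US v1.1.
ACCESS_LEVELS = {"public", "restricted public", "non-public"}


def check_dataset_record(record: dict) -> list[str]:
    """Return a list of problems found in one dataset record (empty if none)."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if record.get(name) in ("", [], None)
    ]
    level = record.get("accessLevel")
    if level is not None and level not in ACCESS_LEVELS:
        problems.append(f"unrecognized accessLevel: {level}")
    return problems


# Hypothetical sample record, parsed from JSON text as a metadata file would be.
sample = json.loads("""
{
  "title": "Example transportation dataset",
  "description": "Illustrative record only.",
  "keyword": ["transportation"],
  "modified": "2023-01-15",
  "publisher": {"name": "Example Publisher"},
  "identifier": "https://example.org/dataset/1",
  "accessLevel": "open"
}
""")

for problem in check_dataset_record(sample):
    print(problem)
```

In this sample, the record lacks a contactPoint and uses an accessLevel value outside the schema's vocabulary, so both problems are reported.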
To protect digital information and data from loss, NTL employs the “3-2-1” backup rule. NTL maintains:
Currently, NTL maintains a copy of its repository content and metadata in the following locations:
Backups on the USDOT-managed Microsoft Azure cloud environment are in the disaster recovery site, located in a different geographical area than USDOT headquarters. Backups on the CDC Public Access Platform are in the disaster recovery site on the US West Coast, a different geographic area than CDC headquarters. The disaster recovery site is updated daily. All daily backups of the staging server and weekly backups of the production servers are kept for 45 days.
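One common way to confirm that copies held in different locations remain identical at the bit level is to compare cryptographic checksums. The Python sketch below illustrates that general technique only; it is an assumption and does not describe the tooling NTL or its hosting platforms actually use. The function names sha256_of and mismatched_copies are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(reference: Path, copies: list[Path]) -> list[Path]:
    """Return the copies whose bits differ from the reference file."""
    expected = sha256_of(reference)
    return [copy for copy in copies if sha256_of(copy) != expected]
```

A copy that appears in the returned list no longer matches the reference bit for bit and would need to be restored from a copy that does.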
Format Preferences
Per NTL’s Collection Development and Maintenance Policy (https://doi.org/10.21949/1530598), format preferences for content submitted to NTL are non-proprietary and open electronic file formats as described by the Library of Congress in Sustainability of Digital Formats (https://www.loc.gov/preservation/digital/formats/index.shtml). Curatorial activities include migrating data from one format into another when earlier formats or devices become obsolete, and as NTL resources permit (Curation Level A. Active Preservation).
When content is migrated from one format to another, NTL:
Typical file formats received by ROSA P include:
Regardless of whether an item in ROSA P is in a proprietary or nonproprietary format, we provide information on open source software that users can use to view the files for that record. This information is recorded as a second paragraph in the abstract metadata field for each dataset. The text is generated from our file format dictionary (https://transportation.libguides.com/researchdatamanagement/fileformatdictionary), which is updated as needed. An example of this is provided below.
Example: (https://doi.org/10.21949/1530621)
Storage Policy
All items are held permanently by NTL. Because NTL is a permanent archive of transportation information, no items are to be removed from the repository. Once deposited with NTL, an item is added to the collection and made publicly available. Public access to an item may be interdicted only if one of the following criteria is met:
All research funded by the Department of Transportation is required to be submitted to NTL for long-term preservation. Preservation of research is outlined in each project’s Data Management Plan, which is required for all research. All elements of our preservation plan are outlined in our Digital Curation Policy (https://doi.org/10.21949/1530599).