Generic NSF Data Management Plan

This example Data Management Plan describes a hypothetical research project submitted to the National Science Foundation. It responds to the information requested by the NSF programs that support the generic data management plan format. It was completed using the DMPTool.

Types of data produced

In this project, three primary types of data will be used and produced.

  • Raw data will be collected at the field sites in Excel files and uploaded to the project's Box folder on a nightly basis.
  • Aggregate and intermediate data sets will be generated by combining and cleaning the raw data files using scripts developed by project staff.
  • Existing comparison data for California are publicly available from the Department of Agriculture (link).
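
The combining and cleaning step in the second item could be sketched as follows. This is only an illustration under stated assumptions, not the project's actual scripts: it assumes the raw Excel files have first been exported to CSV, and the function names (`clean_row`, `combine_raw_files`) and the `source_file` column are hypothetical.

```python
import csv
import io


def clean_row(row):
    """Strip whitespace from column names and values; return None for blank rows."""
    cleaned = {
        key.strip(): (value or "").strip()
        for key, value in row.items()
        if key is not None  # ignore overflow cells beyond the header
    }
    if not any(cleaned.values()):
        return None
    return cleaned


def combine_raw_files(sources):
    """Combine raw CSV exports into one list of cleaned, de-duplicated rows.

    `sources` maps a file name to an open text stream (hypothetical interface).
    Each output row records the file it first appeared in.
    """
    combined = []
    seen = set()
    for name, stream in sources.items():
        for row in csv.DictReader(stream):
            cleaned = clean_row(row)
            if cleaned is None:
                continue
            fingerprint = tuple(sorted(cleaned.items()))
            if fingerprint in seen:
                continue  # identical record already uploaded in another file
            seen.add(fingerprint)
            cleaned["source_file"] = name
            combined.append(cleaned)
    return combined


if __name__ == "__main__":
    night_1 = io.StringIO("site,species,count\nA1, oak ,3\n,,\nA2,pine,5\n")
    night_2 = io.StringIO("site,species,count\nA2,pine,5\nB1,fir,2\n")
    for record in combine_raw_files({"night_1.csv": night_1, "night_2.csv": night_2}):
        print(record)
```

Recording the source file on each row keeps intermediate data sets traceable back to the raw upload they came from.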

Data and metadata standards

Frequency and distribution data will be collected using the EML (Ecological Metadata Language) standard schema, with additional fields added that are specific to the project's primary research question. Digital Object Identifiers will be assigned to collected samples using the EZID service, developed by the California Digital Library and supported by the UC Berkeley Library. Standard file naming practices will be applied to keep the data organized and to support future reuse.

Policies for access and sharing

During the active phase of the project, the project team staff listed in the proposal will have access to raw data sets and intermediate processed files as they are uploaded from the field site into our project folder stored in Box. UC Berkeley licenses and supports Box as a cloud-based collaboration service with unlimited storage provided free for research projects. Group permissions will be set up in Box so that the raw data files can only be edited directly by the PIs and field staff. No data are subject to sensitive data protections.

In-progress results and limited data sets will be published on the project web site, which is hosted on the Open Berkeley platform. These limited data sets will provide summary information about the results until the full data are published at the conclusion of the project.

Policies for re-use, redistribution

On conclusion of the project and as the final publication is released, raw data files, intermediate results, and final results will be published in DASH, a data archiving and sharing solution developed by the California Digital Library and implemented at UC Berkeley. A readme file will be included with the data to provide context and documentation about the data sets. Those interested will also be pointed to the final publications, where additional information about the project and its data can be found.
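
One way the readme file mentioned above could be drafted is to generate a file listing automatically and add the descriptive text by hand. The sketch below is an assumption for illustration only: the plan does not specify a readme layout, and the function names (`sha256_of`, `build_readme`) and the one-line-per-file format are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_readme(data_dir, title, description):
    """Build plain-text readme content listing every file under data_dir.

    Each file is listed with its relative path, size in bytes, and checksum
    (a hypothetical layout, not a DASH requirement).
    """
    data_dir = Path(data_dir)
    lines = [title, "", description, "", "Files", ""]
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(data_dir).as_posix()
        size = path.stat().st_size
        lines.append(f"{relative}  {size} bytes  sha256:{sha256_of(path)}")
    return "\n".join(lines) + "\n"
```

Calling `build_readme("path/to/data", "Project data", "Short description")` returns the text, which can then be written to a readme file alongside the data. Listing checksums lets later users confirm that the files they downloaded match what was deposited.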

Plans for archiving & preservation

DASH will provide a long-term access point to the project's data sets. In addition, data will be backed up in the department's long-term archive, which is periodically saved to the CDL Merritt repository.