Where can I store my data? New guidance available for researchers

Active Research Data Storage grid

I have…

terabytes of data that I need to analyze…
a data use agreement…
video footage from my classroom experiment…
data and files from my former institution…
data that I don’t want to lose accidentally.

What’s best for me? Box? bDrive? Amazon? A host in the campus data center? A hard drive on a shelf in the lab?

Researchers in the active phase of a project, across all disciplines and within professional schools as well, face a difficult challenge in selecting the right place to store their data. Security and confidentiality, proximity to computing resources, cost -- these factors and others shape the decision.

A new online resource helps campus researchers assess their storage needs and identify appropriate options. The Active Research Data Storage Guidance Grid presents a range of services available from UC Berkeley, from other UC campuses, from commercial cloud vendors and from national agencies. Accompanying the grid, the article “How to think about storing your data” (forthcoming) describes an approach that leads researchers through the decision-making process. These documents have been developed by the Research Data Management (RDM) program in close partnership with researchers, research staff and IT colleagues across campus.

Together, these documents provide a starting place for understanding the characteristics of various types of storage and designing a storage strategy. For some researchers, the information will be sufficient to point to a satisfactory solution. For those who have outstanding questions or would simply like to review their situation, RDM consultants (researchdata@berkeley.edu) are available to help navigate this complex terrain.

Active Research Data Storage Guidance Grid

The Active Research Data Storage Guidance Grid presents an at-a-glance view of available services and briefly describes the strengths, limitations and costs of each.

The chart includes:

  • Campus-provided services -- such as the bConnected suite of applications or hosted, mountable storage -- that are applicable to research (some of which are available for free)

  • Services offered by other UC campuses, vetted by the RDM team

  • Commercial cloud services

  • National infrastructure built for and available to researchers.

RDM expects this guide to grow as it continues to investigate and identify new services. One goal of authoring this chart is to identify gaps in the resources available to campus today.

“How to think about storing your data”

Over the past year, Research Data Management has distilled a set of key questions to guide the decision-making process. Starting from the question of whether the data must be handled in ways that meet security or confidentiality requirements, the article “How to think about storing your data” (forthcoming) directs the researcher through a series of considerations that help identify appropriate data storage options.

Key decision points include:

  • The purpose, or goal, of the storage, which will differ at various stages of activity. This is often related to the type of computer analysis that will be run on the data and the environment in which that processing will occur.

  • The project’s budget and timeframe

  • The characteristics of the data itself: How much data needs to be stored? How many files? How large are the individual files?

  • The technical comfort and tool preferences of the researcher or research team.

All of these issues factor into the decision, steering the researcher towards some options and away from others.
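For researchers who are unsure how to answer the questions about the data itself (how much, how many files, how large), a short script can produce rough numbers to bring to a consultation. The following is a minimal Python sketch, not part of the RDM guidance: the function name summarize_directory and the shape of its result are assumptions made for illustration, and it only measures files on a locally mounted folder.

```python
import os
from pathlib import Path


def summarize_directory(root):
    """Walk a folder and report file count, total size, and largest file size.

    Hypothetical helper for illustration only; sizes are in bytes.
    """
    file_count = 0
    total_bytes = 0
    largest_file_bytes = 0

    # os.walk visits every subfolder; by default it does not follow
    # symbolic links to directories.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                # Skip symbolic links so linked data is not counted twice
                # and broken links do not raise an error.
                continue
            size = path.stat().st_size
            file_count += 1
            total_bytes += size
            largest_file_bytes = max(largest_file_bytes, size)

    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "largest_file_bytes": largest_file_bytes,
    }
```

Calling summarize_directory on a project folder returns a small dictionary with the three figures. These numbers are only a starting point: they say nothing about security requirements, budget or how the data will be analyzed, which the other decision points address.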

Where to get help

Research Data Management consultants are available (at no cost to researchers) to discuss your situation. We can help you understand your data storage requirements, unknot your data workflows, and find the services that best fit your needs. Contact us at researchdata@berkeley.edu.