Active Research Data Storage Guidance Grid

What services can I use to store data during the active phase of my research project?  

Data storage during the active phase of research presents challenges, whether you are housing an acquired data set or collecting and analyzing your own data. Whether you work with text or time series, sensors or surveys, recordings or RNA sequences, many factors influence the choice of storage, and the best option may change at different stages of your work. This guidance grid summarizes many of the active data storage options available to campus researchers. See "Where can I store my data? New guidance available for researchers" for a discussion of key issues to consider when designing a data storage strategy.

The Research Data Management program evaluates the storage needs of campus scholars and develops services to fit those needs. We can help you navigate this complex landscape; please contact RDM Consulting (researchdata@berkeley.edu) for assistance.

Note about Data Protection and Security: The “Data protection” column refers to data protection levels (PL0 to PL3) that are defined in UC Berkeley’s Data Classification Standard. In addition to the services listed below, there are other services or solutions that can be used for storing and managing sensitive and restricted data. Please contact RDM Consulting to discuss your sensitive or restricted data needs and options.

Storage provided by Berkeley (directly or via contract)

Title | Best suited for | Key capabilities | Not well suited for | Limitations | Data protection | Cost details | Connection methods
bDrive
  • Gathering data, source materials, and documentation; parking data for later preparation, analysis
  • Collaboration
  • File-sharing with a limited number of collaborators
  • Off-site protection copy
  • Unlimited storage
  • Share outside UCB
  • Use of Google’s familiar & powerful authoring tools
  • Backup
  • High-speed, large volume data transfer
  • 5 TB maximum file size
  • No FTPS access
  • PL1
  • Unlimited storage, sponsored by campus at no cost to researchers
  • Web browser, mobile app, sync, desktop tool, command line (via add-ons), API
Berkeley Box
  • Unlimited storage
  • Share outside UCB
  • Some additional role capabilities
  • Collaborative editing using Microsoft Office Online
  • Backup
  • High-speed, large volume data transfer
  • 15 GB maximum file size
  • Some FTPS limitations for very large transfers
  • PL1
  • Unlimited storage, sponsored by campus at no cost to researchers
  • Web browser, mobile app, sync, desktop tool, command line (FTP and WebDAV), API
CalShare
  • Project/site dashboard
  • Integrates well with Microsoft Office, OneDrive for Business
  • Off-site disaster recovery
  • Storing more than a small amount of data
  • 1 GB maximum file size
  • Primarily browser-based access
  • Share outside UCB requires CalNet Guest accounts
  • PL2
  • $53/month for a site with 1 GB of storage; $1/GB/mo for additional storage
  • Web browser, API, Sync via OneDrive for Business
IST Performance Storage
  • Storing data during data preparation and high performance computation
  • Large group file sharing, retaining group or departmental ownership of files
  • Off-site protection copies
  • Managed storage for high-performance applications
  • Hosted in UCB data center (with UCSD backup available)
  • No direct web access
  • PL0 to PL2 (Please contact RDM for a consultation)
  • $0.20/GB/mo, minimum initial purchase of 100 GB
  • Mountable file-based storage or block storage
IST Utility Storage
  • Storage of large volumes of data with limited I/O requirements
  • Low performance computation
  • File shares
  • Web content delivery
  • Backups
  • Managed storage for less I/O intensive needs or less frequently accessed materials
  • Hosted in UCB data center (with UCSD backup available)
  • High performance applications
  • No direct web access
  • PL0 to PL2 (Please contact RDM for a consultation)
  • $0.05/GB/mo, minimum initial purchase of 1 TB
  • Mountable file-based storage or block storage
Savio HPC Condo Storage
  • Users or research groups that need to import, work on, and store large data sets to support their use of Savio
  • Performance is very good, but users whose computation includes heavy I/O should stage data on the parallel filesystem
  • Limited to Savio Condo Cluster Service contributors
  • PL0
  • $7K for 25 TB, for 5 years ($59/TB/yr), 25 TB minimum purchase
  • Service page
  • Please contact BRC to consult
Savio HPC Parallel File System
  • Temporary read/write storage during computations on Savio with moderate to heavy I/O demands
  • Performant "global scratch" storage close to Savio/HPC computation
  • Large pool (885 TB), shared among all Savio users
  • Limited to users of Savio cluster
  • Periodically purged
  • PL0
  • No additional cost
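
To compare the paid options above for a given data volume, the listed rates can be turned into a rough monthly estimate. The sketch below is illustrative only and rests on several assumptions that the grid does not state: 1 TB is taken as 1000 GB, each "minimum initial purchase" is treated as a billing floor, and Savio condo storage is assumed to be bought in whole 25 TB blocks. It amortizes the $7K Savio price directly over 5 years, so its per-TB figure can differ slightly from the listed $59/TB/yr. The function names are hypothetical; confirm actual pricing with RDM Consulting or BRC.

```python
import math

GB_PER_TB = 1000  # assumption: decimal terabytes; the grid does not specify


def calshare(gb):
    # $53/month for a site with 1 GB; $1/GB/mo for additional storage
    return 53.0 + max(0.0, gb - 1) * 1.0


def ist_performance(gb):
    # $0.20/GB/mo; the 100 GB minimum purchase is treated as a billing floor
    return 0.20 * max(gb, 100)


def ist_utility(gb):
    # $0.05/GB/mo; the 1 TB minimum purchase is treated as a billing floor
    return 0.05 * max(gb, GB_PER_TB)


def savio_condo(gb):
    # $7K buys 25 TB for 5 years; buying in whole 25 TB blocks is an assumption
    blocks = max(1, math.ceil(gb / (25 * GB_PER_TB)))
    return blocks * 7000 / (5 * 12)  # amortized over 60 months


def monthly_costs(gb):
    """Return {service name: estimated monthly cost in dollars} for `gb` of data."""
    return {
        "CalShare": calshare(gb),
        "IST Performance Storage": ist_performance(gb),
        "IST Utility Storage": ist_utility(gb),
        "Savio HPC Condo Storage": savio_condo(gb),
    }


if __name__ == "__main__":
    # Example: estimate costs for a 500 GB project, cheapest first
    for name, cost in sorted(monthly_costs(500).items(), key=lambda kv: kv[1]):
        print(f"{name:<26} ${cost:,.2f}/month")
```

Cost is only one factor: the estimate ignores data protection level, file size limits, and access restrictions (for example, Savio condo storage is limited to Condo Cluster Service contributors), which may rule out the cheapest option.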

Storage provided by others (with UC Berkeley guidance)

Title | Best suited for | Key capabilities | Not well suited for | Limitations | Data protection | Cost details | Connection methods
Cloud Archival Storage Solution (CASS) (UCLA)
  • Gathering data, source materials, and documentation; parking data for later preparation, analysis
  • Offsite protection copy
  • Backup (with CrashPlan ProE)
  • Convenient, resilient, and fast data transfers via Globus
  • Cannot limit access by individual or group
  • PL0
  • Approximately $137/TB/yr ($0.01/GB/mo), minimum purchase of 1 TB
  • Mountable file-based storage or block storage
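
The CASS price is quoted per TB per year, while the IST services are quoted per GB per month. A small conversion makes the two comparable. This is a sketch with hypothetical helper names, assuming 1 TB = 1000 GB unless stated otherwise.

```python
def per_gb_month(dollars_per_tb_year, gb_per_tb=1000):
    """Convert a $/TB/yr rate to $/GB/mo (gb_per_tb is an assumption)."""
    return dollars_per_tb_year / gb_per_tb / 12


def per_tb_year(dollars_per_gb_month, gb_per_tb=1000):
    """Convert a $/GB/mo rate to $/TB/yr (gb_per_tb is an assumption)."""
    return dollars_per_gb_month * gb_per_tb * 12


if __name__ == "__main__":
    # CASS: approximately $137/TB/yr, which rounds to about $0.01/GB/mo
    print(f"CASS: ${per_gb_month(137):.4f}/GB/mo")
    # IST Utility Storage: $0.05/GB/mo expressed per TB per year
    print(f"IST Utility: ${per_tb_year(0.05):.0f}/TB/yr")
```

Under either the 1000 GB or the 1024 GB convention, $137/TB/yr rounds to the $0.01/GB/mo shown in the grid.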
Amazon Web Services (AWS) Storage Services

Amazon offers several kinds of storage, priced differently and suited to different use cases. Please contact RDM Consulting if you are interested in these storage options.

XSEDE Storage Services

XSEDE is a single virtual system that scientists can use to interactively share computing resources, data, and expertise. People around the world use these resources and services (such as supercomputers, collections of data, and new tools) to improve our planet. XSEDE resources include several services for storing research data. Please contact RDM Consulting if you are interested in these storage options.

Other storage services being evaluated

The RDM Program is evaluating storage services offered by other UC campuses (including UCSD’s Project Storage service) as well as other cloud services (such as those provided by Google and Microsoft).  This guide will be updated on a regular basis.  Please contact RDM Consulting if you are interested in these storage options.