An Interview with Jon Stiles and Rick Jaffe of the D-Lab’s Securing Research Data Working Group

Picture of working group members

The Securing Research Data working group, run by Jon Stiles of the D-Lab and Rick Jaffe of Research IT, provides a forum for researchers and staff to discuss issues concerning sensitive, confidential, and restricted-use data. Beyond simply sharing information, the group focuses on improving conditions for research. The group meets on the fourth Monday of every month and is open to anyone. The next meeting will take place on Monday, February 25th, from 2-3pm in the D-Lab (356 Barrows Hall).

Research Data Management sat down with Jon and Rick to learn more about the group. This interview has been edited and condensed for clarity.

What was the impetus for this group developing?
Jon Stiles: Rick and I, along with Patty Frontiera, went to UC Davis at the end of 2016 for a presentation about CITRIS (Center for Information Technology Research in the Interest of Society) seed grants. From that we put together a proposal related to secure research data, which led to creating this working group in the D-Lab. It was a natural fit for the D-Lab because one leg of our “tripod” of service areas – training, community and consulting – involved creating working groups that would revisit topics or domains over time. It was a natural fit for Research IT because they were exploring what we as a campus should be doing in this domain of sensitive data.

Rick Jaffe: Chris Hoffman, others from Research Data Management, and I were serving as front-line support for the Information Security and Policy team and the Industry Alliances Office. We recognized that researchers needed – and wanted – to know what to do to protect their computer environments and their data. I was providing RDM consulting through the D-Lab and knew that Jon had been creating cold rooms and other secure environments for the social sciences for a long time.

JS: One of the first things we did when D-Lab started up – since we knew that there was a need for secure environments – was to fold in the existing Federal Statistical Research Data Center (FSRDC), which provides access to non-public data from the Census Bureau. Other federal agencies also make data they collect available in the FSRDC, and researchers find those kinds of data really valuable for their research. This led to a partnership between the Census Bureau and UC Berkeley to provide access in a secure way. At the same time, we also created a separate cold room for secure projects that didn’t fit the FSRDC model.

There were other cold rooms and restricted-use environments at IRLE (Institute for Research on Labor and Employment) and the Economics Department and up at the Haas Business School. These places all served individual clienteles, and we weren’t taking advantage of the opportunity to learn from each other.

It was not just that some people knew how to do it and some people didn’t, but that the environment was changing so quickly as more and more kinds of data became available. Those data had an independent existence that persisted, and new techniques were being created for linking them across sources in all sorts of ways, which was potentially very threatening to individual confidentiality and privacy.

RJ: So we invited researchers that we knew, as well as staff from Information Security and Policy and the Industry Alliances Office. We also included people in research administration who manage the grants and contracts that involve research data; members of the Committee for the Protection of Human Subjects; staff in labs and research units; Campus Shared Services IT colleagues who directly support researchers; and the people from Information Services and Technology who run the central campus computing, storage, and database services.

JS: These meetings were unique in that a wide variety of folks attended, along with researchers working with all kinds of data. This would spark everybody’s involvement, as people who had dealt with certain types of data in the past brought their perspectives to the group.

RJ: Researchers would come in to tell their story and one of the first things we realized was that few of them knew who they needed to talk to. That was one of those ‘Ah-ha!’ moments. So, through this group, we started making the path visible for researchers while getting the offices to talk to each other.

What is the structure of the meetings?
RJ: We tend to have researchers come in and speak for half the time and use the other half to cover other agenda items that are timely, which could be related to what the researcher is talking about. This usually results in a great discussion and a lot of input from different perspectives.

JS: Some of the researchers who originally presented have turned into regular attendees and give wonderful, down-to-earth, practical advice on what the implications of following a policy are, as they have usually ended up spending a lot of time figuring these things out.

What are the goals for this group?
RJ: One goal is to coordinate support across campus; to learn from each other and be able to provide help collaboratively when individually we don’t know enough. Another goal is to figure out solutions and advocate for them. Researchers describe the barriers they face and when we hear that, we want to respond. What we have learned has directly informed a secure research data and computing initiative now before campus leadership.

JS: One of the groups that came and spoke to us was the California Policy Labs. They were working on a set of agreements with state agencies that would put into place the legal framework for researchers at the UCs to have access to non-public data from those agencies. They were also building out a collaboration with UCLA to create a computing infrastructure to host that data, manage its retention over time, and handle providing access to other researchers. I think for all of us that is somewhat of a holy grail: turning individual solutions from individual researchers into larger, sustainable data collections that people can build on over time.

How has this group changed over time (goals, membership, direction)?
JS: There are several pieces to this. There is the piece about the interaction between researchers and support organizations, and there is the piece about understanding and addressing what researchers’ needs are. The third piece has been to define a set of infrastructure accomplishments. We had very specific sets of infrastructure goals we were trying to achieve in the first year. We were talking about finding the low-hanging fruit, and that shifted into identifying the things that we were solving already and making them understandable and broadly available. I think this is where we are right now: working on solutions and disseminating things that work.

RJ: We’ve realized that we need to document campus processes. We need to help researchers understand the security frameworks governing a particular set of data and the requisite controls that must be in place. We need to publicize all of this and simplify where possible. There is concrete work that can be done.

Are there topics the group keeps coming back to?
RJ: We always want to hear from researchers. Hear their stories. That is not a topic per se, but it is a thread for sure. We also keep coming back to UC Berkeley and UCOP policy on data security and privacy.

JS: I do think that in terms of topics we end up revisiting the technical pieces of storage and computing environments over and over again in different contexts with different needs. The other thing that comes up is the social organization that is necessary to drive the research process forward and to create relationships with data providers while making sure that the non-technical parts of compliance actually get achieved. There is a wide range in enthusiasm, and people can comply on paper without really protecting data. So a lot of this is getting people engaged and enthusiastic about doing the right thing.

Who is this group for and is it open to anyone?
JS: The group is open to anyone, though it has a core set of attendees from organizations that are supporting it. We have had grad students come in who provide access to restricted-use data, such as supermarket scanner data, or who are working with restricted-use data and want to understand more about what that means. It can get technical, because security gets technical, and it can dip into the policy weeds. However, we try to keep it so no one feels in over their head with technical and policy talk and acronyms and stuff like that. To be successful, we must continually recruit researchers to attend, along with people and organizations that specifically support this type of research. If that is you, please contact us!

Information about the Securing Research Data Working Group can be found on the D-Lab website.