Sarah Telford and Stefaan G. Verhulst at Understanding Risk Forum: “….In creating the policy, OCHA partnered with the NYU Governance Lab (GovLab) and Leiden University to understand the policy and privacy landscape, best practices of partner organizations, and how to assess the data it manages in terms of potential harm to people.
We seek to share our findings with the UR community to get feedback and start a conversation around the risk to using certain types of data in humanitarian and development efforts and when understanding risk.
What is High-Risk Data?
High-risk data is generally understood as data that includes attributes about individuals. This is commonly referred to as PII or personally identifiable information. Data can also create risk when it identifies communities or demographics within a group and ties them to a place (i.e., women of a certain age group in a specific location). The risk comes when this type of data is collected and shared without proper authorization from the individual or the organization acting as the data steward; or when the data is being used for purposes other than what was initially stated during collection.
The potential harms of inappropriately collecting, storing or sharing personal data can affect individuals and communities that may feel exploited or vulnerable as the result of how data is used. This became apparent during the Ebola outbreak of 2014, when a number of data projects were implemented without appropriate risk management measures. One notable example was the collection and use of aggregated call data records (CDRs) to monitor the spread of Ebola, which not only had limited success in controlling the virus, but also compromised the personal information of those in Ebola-affected countries. (See Ebola: A Big Data Disaster).
A Data-Risk Framework
Regardless of an organization’s data requirements, it is useful to think through the potential risks and harms for its collection, storage and use. Together with the Harvard Humanitarian Initiative, we have set up a four-step data risk process that includes doing an assessment and inventory, understanding risks and harms, and taking measures to counter them.
- Assessment – The first step is to understand the context within which the data is being generated and shared. The key questions to ask include: What is the anticipated benefit of using the data? Who has access to the data? What constitutes the actionable information for a potential perpetrator? What could set off the threat to the data being used inappropriately?
- Data Inventory – The second step is to take inventory of the data and how it is being stored. Key questions include: Where is the data – is it stored locally or hosted by a third party? Where could the data be housed later? Who might gain access to the data in the future? How will we know – is data access being monitored?
- Risks and Harms – The next step is to identify potential ways in which risk might materialize. Thinking through various risk-producing scenarios will help prepare staff for incidents. Examples of risks include: your organization’s data being correlated with other data sources to expose individuals; your organization’s raw data being publicly released; and/or your organization’s data system being maliciously breached.
- Counter-Measures – The next step is to determine what measures would prevent risk from materializing. Methods and tools include developing data handling policies, implementing access controls to the data, and training staff on how to use data responsibly….(More)