CARE Principles for Indigenous Data Governance


The Global Indigenous Data Alliance: “The current movement toward open data and open science does not fully engage with Indigenous Peoples’ rights and interests. Existing principles within the open data movement (e.g. FAIR: findable, accessible, interoperable, reusable) primarily focus on characteristics of data that will facilitate increased data sharing among entities while ignoring power differentials and historical contexts. The emphasis on greater data sharing alone creates a tension for Indigenous Peoples who are also asserting greater control over the application and use of Indigenous data and Indigenous Knowledge for collective benefit.

This includes the right to create value from Indigenous data in ways that are grounded in Indigenous worldviews and realise opportunities within the knowledge economy. The CARE Principles for Indigenous Data Governance are people- and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination. These principles complement the existing FAIR principles, encouraging open and other data movements to consider both people and purpose in their advocacy and pursuits….(More)”.

Milwaukee’s Amani Neighborhood Uses Data to Target Traffic Safety and Build Trust


Article by Kassie Scott: “People in Milwaukee’s Amani neighborhood are using data to identify safety issues and build relationships with the police. It’s a story of community-engaged research at its best.

In 2017, the Milwaukee Police Department received a grant under the federal Byrne Criminal Justice Innovation program, now called the Community Based Crime Reduction Program, whose purpose is to bridge the gap between practitioners and researchers and advance the use of data in making communities safer. Because of its close ties in the Amani neighborhood, the Dominican Center was selected to lead this initiative, known as the Amani Safety Initiative, and they partnered with local churches, the district attorney’s office, LISC-Milwaukee, and others. To support the effort with data and coaching, the police department contracted with Data You Can Use.

Together with Data You Can Use, the Amani Safety Initiative team first implemented a survey to gauge perceptions of public safety and police legitimacy. Neighborhood ambassadors were trained (and paid) to conduct the survey themselves, going door to door to gather the information from nearly 300 of their neighbors. The ambassadors shared these results with their neighborhood during what they called “data chats.” They also printed summary survey results on door hangers, which they distributed throughout the neighborhood.

Neighbors and community organizations were surprised by the survey results. Though violent crime and mistrust in the police were commonly thought to be the biggest issues, the data showed that residents were most concerned about traffic safety. Ultimately, residents decided to post slow-down signs in intersections.

This project stands out for letting the people in the neighborhood lead the way. Neighbors collected data, shared results, and took action. The partnership between neighbors, police, and local organizations shows how people can drive decision-making for their neighborhood.

The larger story is one of social cohesion and mutual trust. Through participating in the initiative and learning more about their neighborhood, Amani neighbors built stronger relationships with the police. The police began coming to neighborhood community meetings, which helped them build relationships with people in the community and understand the challenges they face….(More)”.

Wanted: Data Stewards: (Re-)Defining The Roles and Responsibilities of Data Stewards for an Age of Data Collaboration


Stefaan G. Verhulst, Andrew Zahuranec, Andrew Young and Michelle Winowatan at Data & Policy: “As data grows increasingly prevalent in our economy, it is increasingly clear, too, that tremendous societal value can be derived from reusing and combining previously separate datasets. One avenue that holds particular promise is data collaboratives. Data collaboratives are a new form of partnership in which data (such as data owned by corporations) or data expertise is made accessible for external parties (such as academics or statistical offices) working in the public interest. By bringing a wide range of inter-sectoral expertise to bear on the data, collaboration can result in new insights and innovations, and can help unlock the public good potential of previously siloed data or expertise.

Yet, not all data collaboratives are successful or go beyond pilots. Based on research and analysis of hundreds of data collaboratives, one factor seems to stand out as determinative of success above all others — whether there exist individuals or teams within data-holding organizations who are empowered to proactively initiate, facilitate and coordinate data collaboratives toward the public interest. We call these individuals and teams “data stewards.”

They systematize the process of partnering, and help scale efforts when there are fledgling signs of success. Data stewards are essential for accelerating the re-use of data in the public interest by providing functional access, and more generally, to unlock the potential of our data age. Data stewards form an important — and new — link in the data value chain.

In its final report, the European Commission’s High-Level Expert Group on Business-to-Government (B2G) Data Sharing also noted the need for data stewards to enable responsible, accountable data sharing for the public interest. In their report, they write:

“A key success factor in setting up sustainable and responsible B2G partnerships is the existence, within both public- and private-sector organisations, of individuals or teams that are empowered to proactively initiate, facilitate and coordinate B2G data sharing when necessary. As such, ‘data stewards’ should become a recognised function.”

The report goes on further to acknowledge the need to scope, design, and establish a network or a community of practice around data stewardship.

Wanted: Data Stewards

A new position paper, released by The GovLab within the context of the UN Statistical Commission High-Level Forum on Official Statistics which focused on “Data stewardship — a solution for official statistics’ predicament?” seeks to begin that work. The paper, titled “Wanted: Data Stewards: (Re-)Defining The Roles and Responsibilities of Data Stewards for an Age of Data Collaboration” tackles questions regarding the profile and potential of data stewards. It aims to provide an operational roadmap to support the implementation (or expansion) of data stewardship functions in public- and private-sector entities; and to start building a community of expertise.

Moreover, it addresses the tendency to conflate the roles of data stewards with those of individuals or groups who might better be described as chief privacy, chief data or chief security officers. This slippage is perhaps understandable, but we need to redefine the role as somewhat broader. While data management, privacy and security are key components of trusted and effective data collaboratives, the real goal is to re-use data for broader social goals (while preventing any potential harms that may result from sharing).

In particular, the position paper, which captures the lived experience of numerous data stewards, seeks to provide more clarity on how data stewards can accomplish these duties by:

  • Defining the responsibilities of a data steward; and
  • Identifying the roles which a data steward must fill to achieve these responsibilities…(More)”.

Is Your Data Being Collected? These Signs Will Tell You Where


Flavie Halais at Wired: “Alphabet’s Sidewalk Labs is testing icons that provide “digital transparency” when information is collected in public spaces….

As cities incorporate digital technologies into their landscapes, they face the challenge of informing people of the many sensors, cameras, and other smart technologies that surround them. Few people have the patience to read through the lengthy privacy notice on a website or smartphone app. So how can a city let them know how they’re being monitored?

Sidewalk Labs, the Google sister company that applies technology to urban problems, is taking a shot. Through a project called Digital Transparency in the Public Realm, or DTPR, the company is demonstrating a set of icons, to be displayed in public spaces, that shows where and what kinds of data are being collected. The icons are being tested as part of Sidewalk Labs’ flagship project in Toronto, where it plans to redevelop a 12-acre stretch of the city’s waterfront. The signs would be displayed at each location where data would be collected: streets, parks, businesses, and courtyards.

Data collection is a core feature of the project, called Sidewalk Toronto, and the source of much of the controversy surrounding it. In 2017, Waterfront Toronto, the organization in charge of administering the redevelopment of the city’s eastern waterfront, awarded Sidewalk Labs the contract to develop the waterfront site. The project has ambitious goals: It says it could create 44,000 direct jobs by 2040 and has the potential to be the largest “climate-positive” community—removing more CO2 from the atmosphere than it produces—in North America. It will make use of new urban technology like modular street pavers and underground freight delivery. Sensors, cameras, and Wi-Fi hotspots will monitor and control traffic flows, building temperature, and crosswalk signals.

All that monitoring raises inevitable concerns about privacy, which Sidewalk aims to address—at least partly—by posting signs in the places where data is being collected.

The signs display a set of icons in the form of stackable hexagons, derived in part from a set of design rules developed by Google in 2014. Some describe the purpose for collecting the data (mobility, energy efficiency, or waste management, for example). Others refer to the type of data that’s collected, such as photos, air quality, or sound. When the data is identifiable, meaning it can be associated with a person, the hexagon is yellow. When the information is stripped of personal identifiers, the hexagon is blue…(More)”.
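The colour rule described in the excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Hexagon` and `build_sign` are hypothetical, the excerpt does not say which hexagon carries the colour (here it is placed on the data-type hexagon), and it gives no colour for the purpose hexagon, so that field is left unset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hexagon:
    """One stackable icon on a sign (hypothetical model, not the DTPR spec)."""
    label: str
    color: Optional[str]  # None where the excerpt does not specify a colour


def build_sign(purpose: str, data_type: str, identifiable: bool) -> List[Hexagon]:
    """Compose a sign from a purpose hexagon and a data-type hexagon.

    Assumption: the identifiability colour is attached to the data-type
    hexagon. Yellow means the data can be associated with a person; blue
    means personal identifiers have been stripped.
    """
    color = "yellow" if identifiable else "blue"
    return [
        Hexagon(label=purpose, color=None),
        Hexagon(label=data_type, color=color),
    ]


# Example: a camera collecting identifiable photos for mobility purposes.
camera_sign = build_sign("mobility", "photos", identifiable=True)
# Example: an air-quality sensor whose readings carry no personal identifiers.
air_sign = build_sign("energy efficiency", "air quality", identifiable=False)
```

Keeping identifiability as a single boolean mirrors the two-colour scheme in the excerpt. A real deployment would presumably need more categories than this sketch models.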

Eurobarometer survey shows support for sustainability and data sharing


Press Release: “Europeans want their digital devices to be easier to repair or recycle and are willing to share their personal information to improve public services, as a special Eurobarometer survey shows. The survey, released today, measured attitudes towards the impact of digitalisation on daily lives of Europeans in 27 EU Member States and the United Kingdom. It covers several different areas including digitalisation and the environment, sharing personal information, disinformation, digital skills and the use of digital ID….

Overall, 59% of respondents would be willing to share some of their personal information securely to improve public services. In particular, most respondents are willing to share their data to improve medical research and care (42%), to improve the response to crisis (31%) or to improve public transport and reduce air pollution (26%).

An overwhelming majority of respondents who use their social media accounts to log in to other online services (74%) want to know how their data is used. A large majority would consider it useful to have a secure single digital ID that could serve for all online services and give them control over the use of their data….

In addition to the Special Eurobarometer report, the last iteration of the Standard Eurobarometer conducted in November 2019 also tested public perceptions related to Artificial Intelligence. The findings were also published in a separate report today.

Around half of the respondents (51%) said that public policy intervention is needed to ensure ethical applications. Half of the respondents (50%) mention the healthcare sector as the area where AI could be most beneficial. A strong majority (80%) of the respondents think that they should be informed when a digital service or mobile application uses AI in various situations….(More)”.

Beyond Randomized Controlled Trials


Iqbal Dhaliwal, John Floretta & Sam Friedlander at SSIR: “…In its post-Nobel phase, one of J-PAL’s priorities is to unleash the treasure troves of big digital data in the hands of governments, nonprofits, and private firms. Primary data collection is by far the most time-, money-, and labor-intensive component of the vast majority of experiments that evaluate social policies. Randomized evaluations have been constrained by simple numbers: Some questions are just too big or expensive to answer. Leveraging administrative data has the potential to dramatically expand the types of questions we can ask and the experiments we can run, as well as implement quicker, less expensive, larger, and more reliable RCTs, an invaluable opportunity to scale up evidence-informed policymaking massively without dramatically increasing evaluation budgets.

Although administrative data hasn’t always been of the highest quality, recent advances have significantly increased the reliability and accuracy of GPS coordinates, biometrics, and digital methods of collection. But despite good intentions, many implementers—governments, businesses, and big NGOs—aren’t currently using the data they already collect on program participants and outcomes to improve anti-poverty programs and policies. This may be because they aren’t aware of its potential, don’t have the in-house technical capacity necessary to create use and privacy guidelines or analyze the data, or don’t have established partnerships with researchers who can collaborate to design innovative programs and run rigorous experiments to determine which are the most impactful. 

At J-PAL, we are leveraging this opportunity through a new global research initiative we are calling the “Innovations in Data and Experiments for Action” Initiative (IDEA). IDEA supports implementers to make their administrative data accessible, analyze it to improve decision-making, and partner with researchers in using this data to design innovative programs, evaluate impact through RCTs, and scale up successful ideas. IDEA will also build the capacity of governments and NGOs to conduct these types of activities with their own data in the future….(More)”.

Car Data Facts


About: “Welcome to CarDataFacts.eu! This website provides a fact-based overview on everything related to the sharing of vehicle-generated data with third parties. Through a series of educational infographics, this website answers the most common questions about access to car data in a clear and simple way.

CarDataFacts.eu also addresses consumer concerns about sharing data in a safe and a secure way, as well as explaining some of the complex and technical terminology surrounding the debate.

CarDataFacts.eu is brought to you by ACEA, the European Automobile Manufacturers’ Association, which represents the 15 Europe-based car, van, truck and bus makers….(More)”.

Invest 5% of research funds in ensuring data are reusable


Barend Mons at Nature: “It is irresponsible to support research but not data stewardship…

Many of the world’s hardest problems can be tackled only with data-intensive, computer-assisted research. And I’d speculate that the vast majority of research data are never published. Huge sums of taxpayer funds go to waste because such data cannot be reused. Policies for data reuse are falling into place, but fixing the situation will require more resources than the scientific community is willing to face.

In 2013, I was part of a group of Dutch experts from many disciplines that called on our national science funder to support data stewardship. Seven years later, policies that I helped to draft are starting to be put into practice. These require data created by machines and humans to meet the FAIR principles (that is, they are findable, accessible, interoperable and reusable). I now direct an international Global Open FAIR office tasked with helping communities to implement the guidelines, and I am convinced that doing so will require a large cadre of professionals, about one for every 20 researchers.

Even when data are shared, the metadata, expertise, technologies and infrastructure necessary for reuse are lacking. Most published data sets are scattered into ‘supplemental files’ that are often impossible for machines or even humans to find. These and other sloppy data practices keep researchers from building on each other’s work. In cases of disease outbreaks, for instance, this might even cost lives….(More)”.

Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London


Paper by Luca Maria Aiello, Daniele Quercia, Rossano Schifanella & Lucia Del Prete: “We present the Tesco Grocery 1.0 dataset: a record of 420 M food items purchased by 1.6 M fidelity card owners who shopped at the 411 Tesco stores in Greater London over the course of the entire year of 2015, aggregated at the level of census areas to preserve anonymity. For each area, we report the number of transactions and nutritional properties of the typical food item bought including the average caloric intake and the composition of nutrients.

The set of global trade international numbers (barcodes) for each food type is also included. To establish data validity we: i) compare food purchase volumes to population from census to assess representativeness, and ii) match nutrient and energy intake to official statistics of food-related illnesses to appraise the extent to which the dataset is ecologically valid. Given its unprecedented scale and geographic granularity, the data can be used to link food purchases to a number of geographically-salient indicators, which enables studies on health outcomes, cultural aspects, and economic factors….(More)”.
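As a rough illustration of the two ideas in the abstract (area-level aggregation and a representativeness check against census population), here is a minimal Python sketch. All records, area names and figures below are invented for the example, and the functions `aggregate_by_area` and `pearson` are hypothetical helpers rather than the authors' code. The paper does not say here which statistic it uses for the comparison; a plain Pearson correlation is assumed.

```python
from collections import defaultdict
from math import sqrt

# Invented item-level purchases: (census_area, transaction_id, calories_kcal).
purchases = [
    ("area_A", "t1", 250.0),
    ("area_A", "t1", 150.0),
    ("area_A", "t2", 200.0),
    ("area_B", "t3", 100.0),
    ("area_B", "t4", 300.0),
    ("area_C", "t5", 400.0),
]

# Invented census population per area.
population = {"area_A": 3000, "area_B": 2000, "area_C": 1000}


def aggregate_by_area(rows):
    """Collapse item-level rows into per-area summaries, dropping card-level detail."""
    items = defaultdict(int)
    calories = defaultdict(float)
    transactions = defaultdict(set)
    for area, tx_id, kcal in rows:
        items[area] += 1
        calories[area] += kcal
        transactions[area].add(tx_id)
    return {
        area: {
            "num_items": items[area],
            "num_transactions": len(transactions[area]),
            "avg_kcal_per_item": calories[area] / items[area],
        }
        for area in items
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


area_stats = aggregate_by_area(purchases)
areas = sorted(area_stats)
# Representativeness check: do purchase volumes track census population?
r = pearson(
    [area_stats[a]["num_items"] for a in areas],
    [population[a] for a in areas],
)
```

With the toy data above, purchase volume rises in step with population, so the correlation is high by construction. On real data the value of `r` would indicate how well the card-holder sample tracks the resident population.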

Monitoring of the Venezuelan exodus through Facebook’s advertising platform


Paper by Palotti et al: “Venezuela is going through the worst economic, political and social crisis in its modern history. Basic products like food or medicine are scarce and hyperinflation is combined with economic depression. This situation is creating an unprecedented refugee and migrant crisis in the region. Governments and international agencies have not been able to consistently leverage reliable information using traditional methods. Therefore, to organize and deploy any kind of humanitarian response, it is crucial to evaluate new methodologies to measure the number and location of Venezuelan refugees and migrants across Latin America.

In this paper, we propose to use Facebook’s advertising platform as an additional data source for monitoring the ongoing crisis. We estimate and validate national and sub-national numbers of refugees and migrants and break down their socio-economic profiles to further understand the complexity of the phenomenon. Although limitations exist, we believe that the presented methodology can be of value for real-time assessment of refugee and migrant crises worldwide….(More)”.