AI Isn’t a Solution to All Our Problems


Article by Griffin McCutcheon, John Malloy, Caitlyn Hall, and Nivedita Mahesh: “From the esoteric worlds of predictive health care and cybersecurity to Google’s e-mail completion and translation apps, the impacts of AI are increasingly being felt in our everyday lived experience. The way it has crept into our lives in such diverse ways and its proficiency in low-level knowledge shows that AI is here to stay. But like any helpful new tool, it has notable flaws, and there are consequences to adopting it blindly. 

AI is a tool—not a cure-all for modern problems….

Connecterra is trying to use TensorFlow to address global hunger through AI-enabled efficient farming and sustainable food development. The company uses AI-equipped sensors to track cattle health, helping farmers look for signs of illness early on. But this only benefits one type of farmer: those rearing cattle who can afford the devices to outfit their entire herd. Applied this way, AI can only improve the productivity of specific resource-intensive dairy farms and is unlikely to meet Connecterra’s goal of ending world hunger.

This solution, and others like it, ignores the wider social context of AI’s application. The belief that AI is a cure-all that will magically deliver solutions if only you can collect enough data is misleading and ultimately dangerous, as it prevents other effective solutions from being explored or implemented sooner. Instead, we need to both build AI responsibly and understand where it can reasonably be applied. 

Challenges with AI are exacerbated because these tools often come to the public as “black boxes”—easy to use but entirely opaque in nature. This shields the user from understanding what biases and risks may be involved, and this lack of public understanding of AI tools and their limitations is a serious problem. We shouldn’t put our complete trust in programs whose workings even their creators cannot interpret. The poorly understood conclusions such systems produce create risk for the individual users, companies, and government projects that rely on them. 

With AI’s pervasiveness and the slow change of policy, where do we go from here? We need a more rigorous system in place to evaluate and manage risk for AI tools….(More)”.

Barriers to Working With National Health Service England’s Open Data


Paper by Ben Goldacre and Seb Bacon: “Open data is information made freely available to third parties in structured formats without restrictive licensing conditions, permitting commercial and noncommercial organizations to innovate. In the context of National Health Service (NHS) data, this is intended to improve patient outcomes and efficiency. EBM DataLab is a research group with a focus on online tools which turn our research findings into actionable monthly outputs. We regularly import and process more than 15 different NHS open datasets to deliver OpenPrescribing.net, one of the highest-impact use cases for NHS England’s open data, with over 15,000 unique users each month. In this paper, we have described the many breaches of best practices around NHS open data that we have encountered. Examples include datasets that repeatedly change location without warning or forwarding; datasets that are needlessly behind a “CAPTCHA” and so cannot be automatically downloaded; longitudinal datasets that change their structure without warning or documentation; near-duplicate datasets with unexplained differences; datasets that are impossible to locate, and thus may or may not exist; poor or absent documentation; and withholding of data for dubious reasons. We propose new open ways of working that will support better analytics for all users of the NHS. These include better curation, better documentation, and systems for better dialogue with technical teams….(More)”.
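
The catalogue of breaches above suggests what an automated importer has to defend against. The sketch below is purely illustrative and is not EBM DataLab’s actual pipeline: the URL, column names, and checks are all assumptions, but they show how an import job might fail loudly when a dataset moves, hides behind an HTML interstitial such as a CAPTCHA, or silently changes structure.

```python
# Illustrative sketch only, not EBM DataLab's pipeline. The URL and
# expected columns below are hypothetical placeholders.
import csv
import io
import sys

import requests

DATASET_URL = "https://example.nhs.uk/open-data/prescribing-latest.csv"  # hypothetical
EXPECTED_COLUMNS = ["practice_code", "bnf_code", "items", "net_cost"]    # hypothetical

def fetch_and_validate(url):
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    # A moved dataset or a CAPTCHA page typically returns HTML where
    # CSV was expected.
    if "html" in resp.headers.get("Content-Type", "").lower():
        sys.exit(f"Expected CSV but got HTML from {url}; dataset may have moved")

    reader = csv.DictReader(io.StringIO(resp.text))
    header = reader.fieldnames or []
    # Fail loudly if the longitudinal structure changed without warning,
    # rather than silently importing misaligned columns.
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        sys.exit(f"Schema changed: missing columns {missing} (got {header})")
    return list(reader)

if __name__ == "__main__":
    rows = fetch_and_validate(DATASET_URL)
    print(f"Imported {len(rows)} rows")
```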

Making Public Transit Fairer to Women Demands Way More Data


Flavie Halais at Wired: “Public transportation is sexist. This may be unintentional or implicit, but it’s also easy to see. Women around the world do more care and domestic work than men, and their resulting mobility habits are hobbled by most transport systems. The demands of running errands and caring for children and other family members mean repeatedly getting on and off the bus, which means paying more fares. Strollers and shopping bags make travel cumbersome. A 2018 study of New Yorkers found women were harassed on the subway far more frequently than men were, and as a result paid more money to avoid transit in favor of taxis and ride-hail….

What is not measured is not known, and the world of transit data is still largely blind to women and other vulnerable populations. Getting that data, though, isn’t easy. Traditional sources like national censuses and user surveys provide reliable information that serves as the basis for policies and decision-making. But surveys are costly to run, and it can take years for a government to go through the process of adding a question to its national census.

Before pouring resources into costly data collection to find answers about women’s transport needs, cities could first turn to the trove of unconventional gender-disaggregated data that’s already produced. These include data exhaust, or the trail of data we leave behind as a result of our interactions with digital products and services like mobile phones, credit cards, and social media. Last year, researchers in Santiago, Chile, released a report based on their parsing of anonymized call detail records of female mobile phone users, to extract location information and analyze their mobility patterns. They found that women tended to travel to fewer locations than men, and within smaller geographical areas. When researchers cross-referenced location information with census data, they found a higher gender gap among lower-income residents, as poorer women made even shorter trips. And when using data from the local transit agency, they saw that living close to a public transit stop increased mobility for both men and women, but didn’t close the gender gap for poorer residents.
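
As a rough illustration of the kind of analysis described above (not the Santiago team’s actual code), the following sketch takes anonymized call detail records carrying a user identifier, a gender label, and the serving cell tower, and compares how many distinct locations men and women visit. The file and column names are hypothetical.

```python
# Illustrative sketch, not the Santiago study's actual pipeline.
# Assumed columns: user_id, gender, tower_id, timestamp.
import pandas as pd

cdr = pd.read_csv("cdr_anonymized.csv")  # hypothetical export

# Count how many distinct cell-tower locations each user appears at,
# keeping the gender label alongside for the comparison.
per_user = (
    cdr.groupby(["user_id", "gender"])
       .agg(distinct_locations=("tower_id", "nunique"),
            events=("tower_id", "size"))
       .reset_index()
)

# The study's headline finding is this kind of aggregate: the typical
# number of distinct places visited, split by gender.
print(per_user.groupby("gender")["distinct_locations"].median())
```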

To encourage private companies to share such info, Stefaan Verhulst advocates for data collaboratives, flexible partnerships between data providers and researchers. Verhulst is the head of research and development at GovLab, a research center at New York University that contributed to the research in Santiago. And that’s how GovLab and its local research partner, Universidad del Desarrollo, got access to the phone records owned by the Chilean phone company, Telefónica. Data collaboratives can enhance access to private data without exposing companies to competition or privacy concerns. “We need to find ways to access data according to different shades of openness,” Verhulst says….(More)”.

Reuse of open data in Quebec: from economic development to government transparency


Paper by Christian Boudreau: “Based on the history of open data in Quebec, this article discusses the reuse of these data by various actors within society, with the aim of securing desired economic, administrative and democratic benefits. Drawing on an analysis of government measures and community practices in the field of data reuse, the study shows that the benefits of open data appear to be inconclusive in terms of economic growth. On the other hand, their benefits seem promising from the point of view of government transparency in that they allow various civil society actors to monitor the integrity and performance of government activities. In the age of digital data and networks, the state must be seen not only as a platform conducive to innovation, but also as a rich field of study that is closely monitored by various actors driven by political and social goals….

Although the economic benefits of open data have been inconclusive so far, governments, at least in Quebec, must not stop investing in opening up their data. In terms of transparency, the results of the study suggest that the benefits of open data are sufficiently promising to continue releasing government data, if only to support the evaluation and planning activities of public programmes and services….(More)”.

How Aid Groups Map Refugee Camps That Officially Don't Exist


Abby Sewell at Wired: “On the outskirts of Zahle, a town in Lebanon’s Beqaa Valley, a pair of aid workers carrying clipboards and cell phones walk through a small refugee camp, home to 11 makeshift shelters built from wood and tarps.

A camp resident leading them through the settlement—one of many in the Beqaa, a wide agricultural plain between Beirut and Damascus with scattered villages of cinderblock houses—points out a tent being renovated for the winter. He leads them into the kitchen of another tent, highlighting cracking wood supports and leaks in the ceiling. The aid workers record the number of residents in each tent, as well as the number of latrines and kitchens in the settlement.

The visit is part of an initiative by the Switzerland-based NGO Medair to map the locations of the thousands of informal refugee settlements in Lebanon, a country where even many city buildings have no street addresses, much less tents on a dusty country road.

“I always say that this project is giving an address to people that lost their home, which is giving back part of their dignity in a way,” says Reine Hanna, Medair’s information management project manager, who helped develop the mapping project.

The initiative relies on GIS technology, though the raw data is collected the old-school way, without high-tech mapping aids like drones. Mapping teams criss-cross the country year round, stopping at each camp to speak to residents and conduct a survey. They enter the coordinates of new camps or changes in the population or facilities of old ones into a database that’s shared with UNHCR, the UN refugee agency, and other NGOs working in the camps. The maps can be accessed via a mobile app by workers heading to the field to distribute aid or respond to emergencies.
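
To make the shared database concrete, here is a minimal sketch of what one settlement record could look like as a GeoJSON feature, a common interchange format for point data in GIS tools and mobile apps. The schema and values are hypothetical illustrations drawn from the visit described above, not Medair’s actual data model.

```python
# Hypothetical settlement record as a GeoJSON feature; the field names
# and values are illustrative, not Medair's actual schema.
import json

settlement = {
    "type": "Feature",
    "geometry": {
        # GeoJSON order is [longitude, latitude]; approximate Zahle area.
        "type": "Point",
        "coordinates": [35.905, 33.846],
    },
    "properties": {
        "settlement_id": "BEQ-0001",  # hypothetical identifier
        "shelters": 11,
        "latrines": 3,
        "kitchens": 2,
        "last_surveyed": "2019-11-12",
    },
}

# A shared file like this can be loaded by GIS software or a mobile app.
with open("settlements.geojson", "w") as f:
    json.dump({"type": "FeatureCollection", "features": [settlement]}, f, indent=2)
```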

Lebanon, a small country with an estimated native population of about 4 million, hosts more than 900,000 registered Syrian refugees and potentially hundreds of thousands more unregistered, making it the country with the highest number of refugees per capita in the world.

But there are no official refugee camps run by the government or the UN refugee agency in Lebanon, where refugees are a sensitive subject. The country is not a signatory to the 1951 Refugee Convention, and government officials refer to the Syrians as “displaced,” not “refugees.”

Lebanese officials have been wary of the Syrians settling permanently, as Palestinian refugees did beginning in 1948. Today, more than 70 years later, there are some 470,000 Palestinian refugees registered in Lebanon, though the number living in the country is believed to be much lower….(More)”.

Maps compiled by UNHCR showing the growth in the number of informal refugee camps in one area of Lebanon over the past six years (image courtesy of UNHCR).

Hospitals Give Tech Giants Access to Detailed Medical Records


Melanie Evans at the Wall Street Journal: “Hospitals have granted Microsoft Corp., International Business Machines Corp. and Amazon.com Inc. the ability to access identifiable patient information under deals to crunch millions of health records, the latest examples of hospitals’ growing influence in the data economy.

The breadth of access wasn’t always spelled out by hospitals and tech giants when the deals were struck.

The scope of data sharing in these and other recently reported agreements reveals a powerful new role that hospitals play—as brokers to technology companies racing into the $3 trillion health-care sector. Rapid digitization of health records and privacy laws enabling companies to swap patient data have positioned hospitals as a primary arbiter of how such sensitive data is shared. 

“Hospitals are massive containers of patient data,” said Lisa Bari, a consultant and former lead for health information technology for the Centers for Medicare and Medicaid Services Innovation Center. 

Hospitals can share patient data as long as they follow federal privacy laws, which contain limited consumer protections, she said. “The data belongs to whoever has it.”…

Digitizing patients’ medical histories, laboratory results and diagnoses has created a booming market in which tech giants are looking to store and crunch data, with potential for groundbreaking discoveries and lucrative products.

There is no indication of wrongdoing in the deals. Officials at the companies and hospitals say they have safeguards to protect patients. Hospitals control the data, with privacy training and close tracking of the tech employees who have access, they said. Tech companies cannot independently combine the health data with other data….(More)”.

Information literacy in the age of algorithms


Report by Alison J. Head, Barbara Fister, and Margy MacMillan: “…Three sets of questions guided this report’s inquiry:

  1. What is the nature of our current information environment, and how has it influenced how we access, evaluate, and create knowledge today? What do findings from a decade of PIL research tell us about the information skills and habits students will need for the future?
  2. How aware are current students of the algorithms that filter and shape the news and information they encounter daily? What concerns do they have about how automated decision-making systems may influence us, divide us, and deepen inequalities?
  3. What must higher education do to prepare students to understand the new media landscape so they will be able to participate in sharing and creating information responsibly in a changing and challenged world?

To investigate these questions, we draw on qualitative data that PIL researchers collected from student focus groups and faculty interviews during fall 2019 at eight U.S. colleges and universities. Findings from a sample of 103 students and 37 professors reveal levels of awareness and concerns about the age of algorithms on college campuses. They are presented as research takeaways….(More)”.

Finding the Blank Spots in Big Data


Eye on Design: “How often do we think of data as missing? Data is everywhere—it’s used to decide what products to stock in stores, to determine which diseases we’re most at risk for, to train AI models to think more like humans. It’s collected by our governments and used to make civic decisions. It’s mined by major tech companies to tailor our online experiences and sell to advertisers. As our data becomes an increasingly valuable commodity—usually profiting others, sometimes at our own expense—to not be “seen” or counted might seem like a good thing. But when data is used at such an enormous scale, gaps in the data take on an outsized importance, leading to erasure, reinforcing bias, and, ultimately, creating a distorted view of humanity. As Tea Uglow, director of Google’s Creative Lab, has said in reference to the exclusion of queer and transgender communities, “If the data does not exist, you do not exist.”

“In spaces that are oversaturated with data, there are blank spots where there’s nothing collected at all.”

This is something that artists and designers working in the digital realm understand better than most, and a growing number of them are working on projects that bring in the nuance, ethical outlook, and humanist approach necessary to take on the problem of data bias. This group includes artists like Mimi Onuoha, who have the vision to seek out and highlight these absences (and offer a blueprint for others), as well as those like artist and software engineer Omayeli Arenyeka, who are working on projects that collect necessary data. It also includes artist and researcher Caroline Sinders and the collective Feminist Internet, who are working on building AI models, chatbots, and systems that take into account data bias and exclusion in every step of their processes. Others are academics like Catherine D’Ignazio and Lauren Klein, whose book Data Feminism considers how a feminist approach to data science would curb widespread bias. Still others are activists, like María Salguero, who saw there was a lack of comprehensive data on gender-based killings in Mexico and decided to collect it herself….(More)”.

Social media firms 'should hand over data amid suicide risk'


Denis Campbell at the Guardian: “Social media firms such as Facebook and Instagram should be forced to hand over data about who their users are and why they use the sites, to reduce suicide among children and young people, psychiatrists have said.

The call from the Royal College of Psychiatrists comes as ministers finalise plans to crack down on issues caused by people viewing unsavoury material and messages online.

The college, which represents the UK’s 18,000 psychiatrists, wants the government to make social media platforms hand over the data to academics so that they can study what sort of content users are viewing.

“We will never understand the risks and benefits of social media use unless the likes of Twitter, Facebook and Instagram share their data with researchers,” said Dr Bernadka Dubicka, chair of the college’s child and adolescent mental health faculty. “Their research will help shine a light on how young people are interacting with social media, not just how much time they spend online.”

Data passed to academics would show the type of material viewed and how long users were spending on such platforms but would be anonymous, the college said.
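
The article leaves open what anonymous sharing would look like in practice. One common approach, sketched below as an assumption rather than the college’s specific proposal, is to release only aggregates and to suppress any group small enough to risk identifying individuals. The column names and threshold are illustrative.

```python
# Hedged sketch of aggregate-only sharing; columns and the threshold
# are illustrative assumptions, not the college's proposal.
import pandas as pd

K = 10  # minimum group size before a row may be released

usage = pd.read_csv("platform_usage.csv")  # hypothetical export
# assumed columns: age_band, content_category, minutes_viewed

agg = (
    usage.groupby(["age_band", "content_category"])
         .agg(users=("minutes_viewed", "size"),
              median_minutes=("minutes_viewed", "median"))
         .reset_index()
)

# Suppress small cells, a simple k-anonymity-style safeguard, so that
# no released row describes fewer than K users.
safe = agg[agg["users"] >= K]
safe.to_csv("shareable_aggregates.csv", index=False)
```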

The government plans to set up a new online safety regulator and the college says it should be given the power to compel firms to hand over data. It is also calling for the forthcoming 2% “turnover tax” on social media companies’ income to be extended so that it covers their turnover internationally, not just from the UK.

“Self-regulation is not working. It is time for government to step up and take decisive action to hold social media companies to account for escalating harmful content to vulnerable children and young people,” said Dubicka.

The college’s demands come amid growing concern that young people are being harmed by material that, for example, encourages self-harm, suicide and eating disorders. The demands are included in a new position statement on technology use and the mental health of children and young people.

NHS England challenged firms to hand over the sort of information that the college is suggesting. Claire Murdoch, its national director for mental health, said that action was needed “to rein in potentially misleading or harmful online content and behaviours”.

She said: “If these tech giants really want to be a force for good, put a premium on users’ wellbeing and take their responsibilities seriously, then they should do all they can to help researchers better understand how they operate and the risks posed. Until then, they cannot confidently say whether the good outweighs the bad.”

The demands have also been backed by Ian Russell, who has become a campaigner against social media harm since his 14-year-old daughter Molly killed herself in November 2017….(More)”.

Human-centred policy? Blending ‘big data’ and ‘thick data’ in national policy


Policy Lab (UK): “….Compared with quantitative data, ethnography creates different forms of data – what anthropologists call ‘thick data’. Complex social problems benefit from insights beyond linear, standardised evidence and this is where thick data shows its worth. In Policy Lab we have generated ethnographic films and analysis to sit alongside quantitative data, helping policy-makers to build a rich picture of current circumstances. 

On the other hand, much has been written about big data – data generated through digital interactions – whether it be traditional ledgers and spreadsheets or the emerging use of artificial intelligence and the internet of things. The ever-growing zettabytes of data can reveal a lot, providing a (sometimes real-time) digital trail capturing and aggregating our individual choices, preferences, behaviours and actions.

Much hyped, this quantitative data has great potential to inform future policy, but must be handled ethically, and also requires careful preparation and analysis to avoid biases and false assumptions creeping in. Three issues we have seen in our projects relate to:

  • partial data, for example not having data on people who are not digitally active, biasing the sample
  • the time-consuming challenge of cleaning up data, in a political context where time is often of the essence
  • the lack of data interoperability, where different localities/organisations capture different metrics (one way of reconciling such records is sketched below)
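
The interoperability problem in the last bullet is concrete enough to sketch. The example below is an illustration under assumed field names, not a Policy Lab tool: two localities record the same concept under different column names and units, so each export is mapped onto a shared schema before the records are pooled.

```python
# Illustration of harmonising two localities' exports; all field names
# are assumed, not drawn from any real dataset.
import pandas as pd

# Each locality captures the same concepts under different names/units.
COLUMN_MAPS = {
    "locality_a": {"svc_users": "service_users", "wait_days": "wait_days"},
    "locality_b": {"clients": "service_users", "wait_weeks": "wait_days"},
}

def harmonise(df, locality):
    out = df.rename(columns=COLUMN_MAPS[locality])
    if locality == "locality_b":
        out["wait_days"] = out["wait_days"] * 7  # convert weeks to days
    out["locality"] = locality
    return out

pooled = pd.concat(
    [harmonise(pd.read_csv("locality_a.csv"), "locality_a"),
     harmonise(pd.read_csv("locality_b.csv"), "locality_b")],
    ignore_index=True,
)
```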

Through a number of Policy Lab projects we have used big data to see the big picture before then using thick data to zoom in to the detail of people’s lived experience. Whereas big data can give us cumulative evidence at a macro, often systemic level, thick data provides insights at an individual or group level. We have found the blending of ‘big data’ and ‘thick data’ to be the sweet spot.

Diagram: Policy Lab’s model for combining big data and thick data (2020)

Policy Lab’s work develops data and insights into ideas for potential policy intervention which we can start to test as prototypes with real people. These operate at the ‘meso’ level (in the middle of the diagram above), informed by both the thick data from individual experiences and the big data at a population or national level. We have written a lot about prototyping for policy and are continuing to explore how you prototype a policy compared to, say, a digital service….(More)”.