Democratizing data in a 5G world


Blog by Dimitrios Dosis at Mastercard: “The next generation of mobile technology has arrived, and it’s more powerful than anything we’ve experienced before. 5G can move data faster, with little delay — in fact, with 5G, you could’ve downloaded a movie in the time you’ve read this far. 5G will also create a vast network of connected machines. The Internet of Things will finally deliver on its promise to fuse all our smart products — vehicles, appliances, personal devices — into a single streamlined ecosystem.

My smartwatch could monitor my blood pressure and schedule a doctor’s appointment, while my car could collect data on how I drive and how much gas I use while behind the wheel. In some cities, petrol trucks already act as roving gas stations, receiving pings when cars are low on gas and refueling them as needed, wherever they are.

This amounts to an incredible proliferation of data. By 2025, every connected person will conduct nearly 5,000 data interactions every day — one every 18 seconds — whether they know it or not. 

Enticing and convenient as new 5G-powered developments may be, they also raise complex questions about data. Namely, who is privy to our personal information? As your smart refrigerator records the foods you buy, will the refrigerator’s manufacturer be able to see your eating habits? Could it sell that information to a consumer food product company for market research without your knowledge? And where would the information go from there? 

People are already asking critical questions about data privacy. In fact, 72% of them say they are paying attention to how companies collect and use their data, according to a global survey released last year by the Harvard Business Review Analytic Services. The survey, sponsored by Mastercard, also found that while 60% of executives believed consumers think the value they get in exchange for sharing their data is worthwhile, only 44% of consumers actually felt that way.

There are many reasons for this data disconnect, including the lack of transparency that currently exists in data sharing and the tension between an individual’s need for privacy and his or her desire for personalization.

This paradox can be solved by putting data in the hands of the people who create it — giving consumers the ability to manage, control and share their own personal information when they want to, with whom they want to, and in a way that benefits them.

That’s the basis of Mastercard’s core set of principles regarding data responsibility – and in this 5G world, it’s more important than ever. We will be able to gain from these new technologies, but this change must come with trust and user control at its core. The data ecosystem needs to evolve from schemes dominated by third parties, where some data brokers collect inferred, often unreliable and inaccurate data, then share it without the consumer’s knowledge….(More)”.

Governance models for redistribution of data value


Essay by Maria Savona: “The growth of interest in personal data has been unprecedented. Issues of privacy violation, power abuse, practices of electoral behaviour manipulation unveiled in the Cambridge Analytica scandal, and a sense of imminent impingement of our democracies are at the forefront of policy debates. Yet, these concerns seem to overlook the issue of concentration of equity value (stemming from data value, which I use interchangeably here) that underpins the current structure of big tech business models. Whilst these quasi-monopolies own the digital infrastructure, they do not own the personal data that provide the raw material for data analytics. 

The European Commission has been at the forefront of global action to promote convergence of the governance of data (privacy), including, but not limited to, the General Data Protection Regulation (GDPR) (European Commission 2016), enforced in May 2018. Attempts to enforce similar regulations are emerging around the world, including the California Consumer Privacy Act, which came into effect on 1 January 2020. Notwithstanding greater awareness among citizens around the use of their data, companies find that complying with GDPR is, at best, a useless nuisance. 

Data have been seen as ‘innovation investment’ since the beginning of the 1990s. The first edition of the Oslo Manual, the OECD’s international guidelines for collecting and using data on innovation in firms, dates back to 1992 and included the collection of databases on employee best practices as innovation investments. Data are also measured as an ‘intangible asset’ (Corrado et al. 2009 was one of the pioneering studies). What has changed over the last decade? The scale of data generation today is such that its management and control might have already gone well beyond the capacity of the very tech giants we are all feeding. Concerns around data governance and data privacy might be too little and too late. 

In this column, I argue that economists have failed twice: first, to predict the massive concentration of data value in the hands of large platforms; and second, to account for the complexity of the political economy aspects of data accumulation. Based on a pair of recent papers (Savona 2019a, 2019b), I systematise recent research and propose a novel data rights approach to redistribute data value whilst not undermining the range of ethical, legal, and governance challenges that this poses….(More)”.

From satisficing to artificing: The evolution of administrative decision-making in the age of the algorithm


Paper by Thea Snow at Data & Policy: “Algorithmic decision tools (ADTs) are being introduced into public sector organizations to support more accurate and consistent decision-making. Whether they succeed turns, in large part, on how administrators use these tools. This is one of the first empirical studies to explore how ADTs are being used by Street Level Bureaucrats (SLBs). The author develops an original conceptual framework and uses in-depth interviews to explore whether SLBs are ignoring ADTs (algorithm aversion); deferring to ADTs (automation bias); or using ADTs together with their own judgment (an approach the author calls “artificing”). Interviews reveal that artificing is the most common use-type, followed by aversion, while deference is rare. Five conditions appear to influence how practitioners use ADTs: (a) understanding of the tool; (b) perception of human judgment; (c) seeing value in the tool; (d) being offered opportunities to modify the tool; and (e) alignment of the tool with expectations….(More)”.

Using “Big Data” to forecast migration


Blog Post by Jasper Tjaden, Andres Arau, Muertizha Nuermaimaiti, Imge Cetin, Eduardo Acostamadiedo, Marzia Rango: Act 1 — High Expectations

“Data is the new oil,” they say. ‘Big Data’ is even bigger than that. The “data revolution” will contribute to solving societies’ problems and help governments adopt better policies and run more effective programs. In the migration field, digital trace data are seen as a potentially powerful tool to improve migration management processes (visa applications, asylum decisions and the geographic allocation of asylum seekers, facilitating integration, “smart borders,” etc.).

Forecasting migration is one particular area where big data seems to excite data nerds (like us) and policymakers alike. If there is one way big data has already made a difference, it is its ability to bring different actors together — data scientists, business people and policy makers — to sit through countless slides with numbers, tables and graphs. Traditional migration data sources, like censuses, administrative data and surveys, have never quite managed to generate the same level of excitement.

Many EU countries are currently heavily investing in new ways to forecast migration. Relatively large numbers of asylum seekers in 2014, 2015 and 2016 strained the capacity of many EU governments. Better forecasting tools are meant to help governments prepare in advance.

In a recent European Migration Network study, 10 out of the 22 EU governments surveyed said they make use of forecasting methods, many using open source data for “early warning and risk analysis” purposes. The 2020 European Migration Network conference was dedicated entirely to the theme of forecasting migration, hosting more than 15 expert presentations on the topic. The recently proposed EU Pact on Migration and Asylum outlines a “Migration Preparedness and Crisis Blueprint” which “should provide timely and adequate information in order to establish the updated migration situational awareness and provide for early warning/forecasting, as well as increase resilience to efficiently deal with any type of migration crisis.” (p. 4) The European Commission is currently finalizing a feasibility study on the use of artificial intelligence for predicting migration to the EU; Frontex — the EU Border Agency — is scaling up efforts to forecast irregular border crossings; EASO — the European Asylum Support Office — is devising a composite “push-factor index” and experimenting with forecasting asylum-related migration flows using machine learning and data at scale. In Fall 2020, during Germany’s EU Council Presidency, the German Interior Ministry organized a workshop series around “Migration 4.0,” highlighting the benefits of various ways to “digitalize” migration management. At the same time, the EU is investing substantial resources in migration forecasting research under its Horizon 2020 programme, including QuantMig, ITFLOWS, and HumMingBird.
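The forecasting methods behind these efforts range from simple trend extrapolation to machine learning on large, unconventional data sources. As a purely illustrative sketch (synthetic numbers and a deliberately simple least-squares trend, not any agency's actual model), an early-warning extrapolation might look like:

```python
# Illustrative only: fit a linear trend to synthetic monthly counts and
# extrapolate three months ahead. Real early-warning systems combine many
# sources (administrative data, search trends, etc.) and richer models.

def linear_trend_forecast(series, horizon):
    """Least-squares line through (t, y) points, extrapolated `horizon` steps."""
    n = len(series)
    ts = range(n)
    mean_t = sum(ts) / n
    mean_y = sum(series) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, series))
    var = sum((t - mean_t) ** 2 for t in ts)
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return [intercept + slope * (n + h) for h in range(horizon)]

# Synthetic monthly application counts (hypothetical numbers, not real data).
monthly = [1200, 1250, 1300, 1380, 1420, 1500]
print(linear_trend_forecast(monthly, 3))
```

Real systems would add seasonality, uncertainty intervals, and external covariates; the point here is only the shape of the task: past counts in, future counts out.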

Is all this excitement warranted?

Yes, it is….(More)” See also: Big Data for Migration Alliance

These crowdsourced maps will show exactly where surveillance cameras are watching


Mark Sullivan at FastCompany: “Amnesty International is producing a map of all the places in New York City where surveillance cameras are scanning residents’ faces.

The project will enlist volunteers to use their smartphones to identify, photograph, and locate government-owned surveillance cameras capable of shooting video that could be matched against people’s faces in a database through AI-powered facial recognition.

The map that will eventually result is meant to give New Yorkers the power of information against an invasive technology whose usage and purpose are often not fully disclosed to the public. It’s also meant to put pressure on the New York City Council to write and pass a law restricting or banning it. Other U.S. cities, such as Boston, Portland, and San Francisco, have already passed such laws.

Facial recognition technology can be developed by scraping millions of images from social media profiles and driver’s licenses without people’s consent, Amnesty says. Software from companies like Clearview AI can then use computer vision algorithms to match those images against facial images captured by closed-circuit television (CCTV) or other video surveillance cameras and stored in a database.
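Under the hood, systems of this kind typically reduce each face image to a numeric embedding vector and compare vectors for similarity. The sketch below is a hedged illustration with made-up four-dimensional vectors (real embeddings are learned by deep networks and have hundreds of dimensions; the names, vectors, and 0.95 threshold are all invented):

```python
# Toy illustration of embedding comparison; all vectors here are fabricated.
# Real facial-recognition pipelines derive embeddings from deep networks.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_match(probe, database, threshold=0.95):
    """Return the database key most similar to the probe, if above threshold."""
    scored = {name: cosine_similarity(probe, vec) for name, vec in database.items()}
    name, score = max(scored.items(), key=lambda kv: kv[1])
    return name if score >= threshold else None

db = {"person_a": [0.1, 0.9, 0.3, 0.4], "person_b": [0.8, 0.1, 0.5, 0.2]}
print(best_match([0.11, 0.88, 0.31, 0.42], db))  # prints "person_a"
```

The threshold choice matters: set it looser and false matches multiply, which is where the misidentification risks described in this article arise.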

Starting in May, volunteers will be able to use a software tool to identify all the facial recognition cameras within their view—like at an intersection where numerous cameras can often be found. The tool, which runs on a phone’s browser, lets users place a square around any cameras they see. The software integrates Google Street View and Google Earth to help volunteers label and attach geolocation data to the cameras they spot.

The map is part of a larger campaign called “Ban the Scan” that’s meant to educate people around the world on the civil rights dangers of facial recognition. Research has shown that facial recognition systems aren’t as accurate when it comes to analyzing dark-skinned faces, putting Black people at risk of being misidentified. Even when accurate, the technology exacerbates systemic racism because it is disproportionately used to identify people of color, who are already subject to discrimination by law enforcement officials. The campaign is sponsored by Amnesty in partnership with a number of other tech advocacy, privacy, and civil liberties groups.

In the initial phase of the project, which was announced last Thursday, Amnesty and its partners launched a website that New Yorkers can use to generate public comments on the New York Police Department’s (NYPD’s) use of facial recognition….(More)”.

Inside India’s booming dark data economy


Snigdha Poonam and Samarth Bansal at Rest of World: “…The black market for data, as it exists online in India, resembles those for wholesale vegetables or smuggled goods. Customers are encouraged to buy in bulk, and the variety of what’s on offer is mind-boggling: There are databases about parents, cable customers, pregnant women, pizza eaters, mutual funds investors, and almost any niche group one can imagine. A typical database consists of a spreadsheet with row after row of names and key details: Sheila Gupta, 35, lives in Kolkata, runs a travel agency, and owns a BMW; Irfaan Khan, 52, lives in Greater Noida, and has a son who just applied to engineering college. The databases are usually updated every three months (the older one is, the less it is worth), and if you buy several at the same time, you’ll get a discount. Business is always brisk, and transactions are conducted quickly. No one will ask you for your name, let alone inquire why you want the phone numbers of five million people who have applied for bank loans.

There isn’t a reliable estimate of the size of India’s data economy or of how much money it generates annually. Regarding the former, each broker we spoke to had a different guess: One said only about one or two hundred professionals make up the top tier, another that every big Indian city has at least a thousand people trading data. To find them, potential customers need only look for their ads on social media or run searches with industry keywords and hashtags — “data,” “leads,” “database” — combined with detailed information about the kind of data they want and the city they want it from.

Privacy experts believe that the data-brokering industry has existed since the early days of the internet’s arrival in India. “Databases have been bought and sold in India for at least 15 years now. I remember a case from way back in 2006 of leaked employee data from Naukri.com (one of India’s first online job portals) being sold on CDs,” says Nikhil Pahwa, the editor and publisher of MediaNama, which covers technology policy. By 2009, data brokers were running SMS-marketing companies that offered complementary services: procuring targeted data and sending text messages in bulk. Back then, there was simply less data, “and those who had it could sell it at whatever price,” says Himanshu Bhatt, a data broker who claims to be retired. That is no longer the case: “Today, everyone has every kind of data,” he said.

No broker we contacted would openly discuss their methods of hunting, harvesting, and selling data. But the day-to-day work generally consists of following the trails that people leave during their travels around the internet. Brokers trawl data storage websites armed with a digital fishing net. “I was shocked when I was surfing [cloud-hosted data sites] one day and came across Aadhaar cards,” Bhatt remarked, referring to India’s state-issued biometric ID cards. Images of them were available to download in bulk, alongside completed loan applications and salary sheets.

Again, the legal boundaries here are far from clear. Anybody who has ever filled out a form on a coupon website or requested a refund for a movie ticket has effectively entered their information into a database that can be sold without their consent by the company it belongs to. A neighborhood cell phone store can sell demographic information to a political party for hyperlocal campaigning, and a fintech company can stealthily transfer an individual’s details from an astrology app onto its own server, to gauge that person’s creditworthiness. When somebody shares employment history on LinkedIn or contact details on a public directory, brokers can use basic software such as web scrapers to extract that data.
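The "basic software such as web scrapers" mentioned above can be only a few dozen lines. This hedged sketch parses a fabricated directory page held in a string (no real site is fetched; the markup is invented for illustration, reusing names from the spreadsheet example earlier):

```python
# Minimal scraper sketch using only the standard library. The HTML below is
# a fabricated stand-in for a public directory page; real scrapers would
# fetch pages over HTTP and must respect terms of service and privacy law.
from html.parser import HTMLParser

PAGE = """
<ul class="directory">
  <li class="entry">Sheila Gupta, Kolkata</li>
  <li class="entry">Irfaan Khan, Greater Noida</li>
</ul>
"""

class DirectoryParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_entry = False
        self.entries = []

    def handle_starttag(self, tag, attrs):
        # Flag list items marked as directory entries.
        if tag == "li" and ("class", "entry") in attrs:
            self.in_entry = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_entry = False

    def handle_data(self, data):
        if self.in_entry and data.strip():
            self.entries.append(data.strip())

parser = DirectoryParser()
parser.feed(PAGE)
print(parser.entries)  # ['Sheila Gupta, Kolkata', 'Irfaan Khan, Greater Noida']
```

A production scraper would walk pagination and handle messy markup, but the extraction step is often no more sophisticated than this.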

But why bother hacking into a database when you can buy it outright? More often, “brokers will directly approach a bank employee and tell them, ‘I need the high-end database’,” Bhatt said. And as demand for information increases, so, too, does data vulnerability. A 2019 survey found that 69% of Indian companies haven’t set up reliable data security systems; 44% have experienced at least one breach already. “In the past 12 months, we have seen an increasing trend of Indians’ data [appearing] on the dark web,” says Beenu Arora, the CEO of the global cyberintelligence firm Cyble….(More)”.

The Politics of Technology in Latin America


Book edited by Avery Plaw, Barbara Carvalho Gurgel and David Ramírez Plascencia: “This book analyses the arrival of emerging and traditional information and technology for public and economic use in Latin America. It focuses on the governmental, economic and security issues and the study of the complex relationship between citizens and government.

The book is divided into three parts:

• ‘Digital data and privacy, prospects and barriers’ centers on the debates between the right to privacy and the loss of intimacy on the Internet,

• ‘Homeland security and human rights’ focuses on how novel technologies such as drones and autonomous weapons systems reconfigure the strategies of police authorities and organized crime,

• ‘Labor Markets, digital media and emerging technologies’ emphasizes the legal, economic and social perils and challenges caused by the increased presence of social media, blockchain-based applications, artificial intelligence and automation technologies in the Latin American economy….(More)”.

Enslaved.org


About: “As of December 2020, we have built a robust, open-source architecture to discover and explore nearly a half million people records and 5 million data points. From archival fragments and spreadsheet entries, we see the lives of the enslaved in richer detail. Yet there’s much more work to do, and with the help of scholars, educators, and family historians, Enslaved.org will be rapidly expanding in 2021. We are just getting started….

In recent years, a growing number of archives, databases, and collections that organize and make sense of records of enslavement have become freely and readily accessible for scholarly and public consumption. This proliferation of projects and databases presents a number of challenges:

  • Disambiguating and merging individuals across multiple datasets is nearly impossible given their current, siloed nature;
  • Searching, browsing, and quantitative analysis across projects is extremely difficult;
  • It is often difficult to find projects and databases;
  • There are no best practices for digital data creation;
  • Many projects and datasets are in danger of going offline and disappearing.

In response to these challenges, Matrix: The Center for Digital Humanities & Social Sciences at Michigan State University (MSU), in partnership with the MSU Department of History, University of Maryland, and scholars at multiple institutions, developed Enslaved: Peoples of the Historical Slave Trade. Enslaved.org’s primary focus is people—individuals who were enslaved, owned slaves, or participated in slave trading….(More)”.

Reclaiming Free Speech for Democracy and Human Rights in a Digitally Networked World


Paper by Rebecca MacKinnon: “…divided into three sections. The first section discusses the relevance of international human rights standards to U.S. internet platforms and universities. The second section identifies three common challenges to universities and internet platforms, with clear policy implications. The third section recommends approaches to internet policy that can better protect human rights and strengthen democracy. The paper concludes with proposals for how universities can contribute to the creation of a more robust digital information ecosystem that protects free speech along with other human rights, and advances social justice.

1) International human rights standards are an essential complement to the First Amendment. While the First Amendment does not apply to how privately owned and operated digital platforms set and enforce rules governing their users’ speech, international human rights standards set forth a clear framework to which companies and any other type of private organization can and should be held accountable. Scholars of international law and freedom of expression point out that Article 19 of the International Covenant on Civil and Political Rights encompasses not only free speech, but also the right to access information and to formulate opinions without interference. Notably, this aspect of international human rights law is relevant in addressing the harms caused by disinformation campaigns aided by algorithms and targeted profiling. In protecting freedom of expression, private companies and organizations must also protect and respect other human rights, including privacy, non-discrimination, assembly, the right to political participation, and the basic right to security of person.

2) Three core challenges are common to universities and internet platforms. These common challenges must be addressed in order to protect free speech alongside other fundamental human rights including non-discrimination:

Challenge 1: The pretense of neutrality amplifies bias in an unjust world. In an inequitable and unjust world, “neutral” platforms and institutions will perpetuate and even exacerbate inequities and power imbalances unless they understand and adjust for those inequities and imbalances. This fundamental civil rights concept is better understood by the leaders of universities than by those in charge of social media platforms, which have clear impact on public discourse and civic engagement.

Challenge 2: Rules and enforcement are inadequate without strong leadership and cultural norms. Rules governing speech, and their enforcement, can be ineffective and even counterproductive unless they are accompanied by values-based leadership. Institutional cultures should take into account the context and circumstances of unique situations, individuals, and communities. For rules to have legitimacy, communities that are governed by them must be actively engaged in building a shared culture of responsibility.

Challenge 3: Communities need to be able to shape how and where they enable discourse and conduct learning. Different types of discourse that serve different purposes require differently designed spaces—be they physical or digital. It is important for communities to be able to set their own rules of engagement, and shape their spaces for different types of discourse. Overdependence upon a small number of corporate-controlled platforms does not serve communities well. Online free speech will be better served not only by policies that foster competition and strengthen antitrust law, but also by policies and resources that support the development of nonprofit, open source, and community-driven digital public infrastructure.

3) A clear and consistent policy environment that supports civil rights objectives and is compatible with human rights standards is essential to ensure that the digital public sphere evolves in a way that genuinely protects free speech and advances social justice. Analysis of twenty different consensus declarations, charters, and principles produced by international coalitions of civil society organizations reveals broad consensus with U.S.-based advocates of civil rights-compatible technology policy….(More)”.

Using artificial intelligence to make decisions: Addressing the problem of algorithmic bias (2020)


Foreword of a Report by the Australian Human Rights Commission: “Artificial intelligence (AI) promises better, smarter decision making.

Governments are starting to use AI to make decisions in welfare, policing and law enforcement, immigration, and many other areas. Meanwhile, the private sector is already using AI to make decisions about pricing and risk, to determine what sorts of people make the ‘best’ customers… In fact, the use cases for AI are limited only by our imagination.

However, using AI carries with it the risk of algorithmic bias. Unless we fully understand and address this risk, the promise of AI will be hollow.

Algorithmic bias is a kind of error associated with the use of AI in decision making, and often results in unfairness. Algorithmic bias can arise in many ways. Sometimes the problem is with the design of the AI-powered decision-making tool itself. Sometimes the problem lies with the data set that was used to train the AI tool, which could replicate or even make worse existing problems, including societal inequality.

Algorithmic bias can cause real harm. It can lead to a person being unfairly treated, or even suffering unlawful discrimination, on the basis of characteristics such as their race, age, sex or disability.

This project started by simulating a typical decision-making process. In this technical paper, we explore how algorithmic bias can ‘creep in’ to AI systems and, most importantly, how this problem can be addressed.

To ground our discussion, we chose a hypothetical scenario: an electricity retailer uses an AI-powered tool to decide how to offer its products to customers, and on what terms. The general principles and solutions for mitigating the problem, however, will be relevant far beyond this specific situation.
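A first step in the rigorous testing the report calls for is simply measuring outcomes by group. The sketch below is a minimal illustration under stated assumptions (synthetic decisions, an invented group attribute, and demographic parity as the chosen fairness metric; the report itself explores richer scenarios and mitigation techniques):

```python
# Synthetic illustration of a demographic-parity check: compare the rate of
# favourable outcomes across groups. Numbers and group labels are invented.

def favourable_rate(decisions, group):
    """Fraction of approvals among decisions for the given group."""
    subset = [d["approved"] for d in decisions if d["group"] == group]
    return sum(subset) / len(subset)

decisions = (
    [{"group": "A", "approved": True}] * 80 + [{"group": "A", "approved": False}] * 20 +
    [{"group": "B", "approved": True}] * 50 + [{"group": "B", "approved": False}] * 50
)

rate_a = favourable_rate(decisions, "A")  # 0.8
rate_b = favourable_rate(decisions, "B")  # 0.5
print(f"parity gap: {rate_a - rate_b:.2f}")  # prints "parity gap: 0.30"
```

Demographic parity is only one of several competing fairness metrics, and the right one depends on the decision context, which is precisely why the report grounds its discussion in a concrete scenario.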

Because algorithmic bias can result in unlawful activity, there is a legal imperative to address this risk. However, good businesses go further than the bare minimum legal requirements, to ensure they always act ethically and do not jeopardise their good name.

Rigorous design, testing and monitoring can avoid algorithmic bias. This technical paper offers some guidance for companies to ensure that when they use AI, their decisions are fair, accurate and comply with human rights….(More)”