Ethics and Data Science


(Open) Ebook by Mike Loukides, Hilary Mason and DJ Patil: “As the impact of data science continues to grow on society there is an increased need to discuss how data is appropriately used and how to address misuse. Yet, ethical principles for working with data have been available for decades. The real issue today is how to put those principles into action. With this report, authors Mike Loukides, Hilary Mason, and DJ Patil examine practical ways for making ethical data standards part of your work every day.

To help you consider all of the possible ramifications of your work on data projects, this report includes:

  • A sample checklist that you can adapt for your own procedures
  • Five framing guidelines (the Five C’s) for building data products: consent, clarity, consistency, control, and consequences
  • Suggestions for building ethics into your data-driven culture

Now is the time to invest in a deliberate practice of data ethics, for better products, better teams, and better outcomes….(More)”.

Decentralisation: the next big step for the world wide web


Zoë Corbyn at The Observer: “The decentralised web, or DWeb, could be a chance to take control of our data back from the big tech firms. So how does it work and when will it be here?...What is the decentralised web? 
It is supposed to be like the web you know but without relying on centralised operators. In the early days of the world wide web, which came into existence in 1989, you connected directly with your friends through desktop computers that talked to each other. But from the early 2000s, with the advent of Web 2.0, we began to communicate with each other and share information through centralised services provided by big companies such as Google, Facebook, Microsoft and Amazon. It is now on Facebook’s platform, in its so-called “walled garden”, that you talk to your friends. “Our laptops have become just screens. They cannot do anything useful without the cloud,” says Muneeb Ali, co-founder of Blockstack, a platform for building decentralised apps. The DWeb is about re-decentralising things – so we aren’t reliant on these intermediaries to connect us. Instead users keep control of their data and connect and interact and exchange messages directly with others in their network.

Why do we need an alternative? 
With the current web, all that user data concentrated in the hands of a few creates risk that our data will be hacked. It also makes it easier for governments to conduct surveillance and impose censorship. And if any of these centralised entities shuts down, your data and connections are lost. Then there are privacy concerns stemming from the business models of many of the companies, which use the private information we provide freely to target us with ads. “The services are kind of creepy in how much they know about you,” says Brewster Kahle, the founder of the Internet Archive. The DWeb, say proponents, is about giving people a choice: the same services, but decentralised and not creepy. It promises control and privacy, and things can’t all of a sudden disappear because someone decides they should. On the DWeb, it would be harder for the Chinese government to block a site it didn’t like, because the information can come from other places.

How does the DWeb work that is different? 

There are two big differences in how the DWeb works compared to the world wide web, explains Matt Zumwalt, the programme manager at Protocol Labs, which builds systems and tools for the DWeb. First, there is this peer-to-peer connectivity, where your computer not only requests services but provides them. Second, how information is stored and retrieved is different. Currently we use http and https links to identify information on the web. Those links point to content by its location, telling our computers to find and retrieve things from those locations using the http protocol. By contrast, DWeb protocols use links that identify information based on its content – what it is rather than where it is. This content-addressed approach makes it possible for websites and files to be stored and passed around in many ways from computer to computer rather than always relying on a single server as the one conduit for exchanging information. “[In the traditional web] we are pointing to this location and pretending [the information] exists in only one place,” says Zumwalt. “And from this comes this whole monopolisation that has followed… because whoever controls the location controls access to the information.”…(More)”.
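The content-addressed approach Zumwalt describes can be illustrated with a minimal Python sketch. It assumes SHA-256 hex digests as addresses and in-memory dictionaries standing in for peers; `ContentStore` and `fetch` are hypothetical names, and real DWeb protocols use their own address formats and network layers.

```python
import hashlib
from typing import Iterable, Optional


class ContentStore:
    """Toy peer: stores blocks keyed by the SHA-256 digest of their bytes."""

    def __init__(self) -> None:
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from where it lives.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> Optional[bytes]:
        return self._blocks.get(address)


def fetch(address: str, peers: Iterable[ContentStore]) -> bytes:
    """Ask each peer in turn and accept the first copy whose hash matches."""
    for peer in peers:
        data = peer.get(address)
        # Re-hashing lets the requester verify a copy from any peer.
        if data is not None and hashlib.sha256(data).hexdigest() == address:
            return data
    raise KeyError(address)


# Two peers hold the same page; the address is identical on both.
alice, bob = ContentStore(), ContentStore()
page = b"<h1>Hello, DWeb</h1>"
address = alice.put(page)
bob.put(page)

# A peer without the content is skipped; any peer with a valid copy can serve it.
offline = ContentStore()
print(fetch(address, [offline, bob]) == page)
```

Because the address is a hash of the bytes, the requester does not need to trust any particular server: a copy from any peer can be checked against the address it was requested by.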

Commonism: A New Aesthetics of the Real


Book edited by Nico Dockx and Pascal Gielen: “After half a century of neoliberalism, a new radical, practice-based ideology is making its way from the margins: commonism, with an o in the middle. It is based on the values of sharing, common (intellectual) ownership and new social co-operations. Commoners assert that social relationships can replace money (contract) relationships. They advocate solidarity and they trust in peer-to-peer relationships to develop new ways of production.

Commonism maps those new ideological thoughts. How do they work and, especially, what is their aesthetics? How do they shape the reality of our living together? Is there another, more just future imaginable through the commons? What strategies and what aesthetics do commoners adopt? This book explores this new political belief system, alternating between theoretical analysis, wild artistic speculation, inspiring art examples, almost empirical observations and critical reflection….(More)”.

Google launches new search engine to help scientists find the datasets they need


James Vincent at The Verge: “The service, called Dataset Search, launches today, and it will be a companion of sorts to Google Scholar, the company’s popular search engine for academic studies and reports. Institutions that publish their data online, like universities and governments, will need to include metadata tags in their webpages that describe their data, including who created it, when it was published, how it was collected, and so on. This information will then be indexed by Google’s search engine and combined with information from the Knowledge Graph. (So if dataset X was published by CERN, a little information about the institute will also be included in the search.)

Speaking to The Verge, Natasha Noy, a research scientist at Google AI who helped create Dataset Search, says the aim is to unify the tens of thousands of different repositories for datasets online. “We want to make that data discoverable, but keep it where it is,” says Noy.

At the moment, dataset publication is extremely fragmented. Different scientific domains have their own preferred repositories, as do different governments and local authorities. “Scientists say, ‘I know where I need to go to find my datasets, but that’s not what I always want,’” says Noy. “Once they step out of their unique community, that’s when it gets hard.”

Noy gives the example of a climate scientist she spoke to recently who told her she’d been looking for a specific dataset on ocean temperatures for an upcoming study but couldn’t find it anywhere. She didn’t track it down until she ran into a colleague at a conference who recognized the dataset and told her where it was hosted. Only then could she continue with her work. “And this wasn’t even a particularly boutique depository,” says Noy. “The dataset was well written up in a fairly prominent place, but it was still difficult to find.”

[Image: An example search for weather records in Google Dataset Search. Credit: Google]

The initial release of Dataset Search will cover the environmental and social sciences, government data, and datasets from news organizations like ProPublica. However, if the service becomes popular, the amount of data it indexes should quickly snowball as institutions and scientists scramble to make their information accessible….(More)”.
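The metadata tags mentioned in the excerpt (who created a dataset, when it was published, how it was collected) can be sketched as follows. This example assumes the schema.org `Dataset` vocabulary embedded as JSON-LD, which is one common way to publish such markup; the helper names and all field values are hypothetical placeholders, not taken from the article.

```python
import json


def dataset_jsonld(name, description, creator, date_published, method):
    """Build a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": {"@type": "Organization", "name": creator},  # who created it
        "datePublished": date_published,                        # when it was published
        "measurementTechnique": method,                         # how it was collected
    }


def as_script_tag(metadata):
    """Wrap the metadata in the script element a publisher would embed in a page."""
    body = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical example values, for illustration only.
example = dataset_jsonld(
    name="Example ocean temperature records",
    description="Illustrative placeholder description of a dataset.",
    creator="Example University",
    date_published="2018-01-01",
    method="Illustrative placeholder: buoy sensor readings",
)
print(as_script_tag(example))
```

The data itself stays on the publisher's site; only this description needs to be readable by a crawler, which matches Noy's aim of making data discoverable while keeping it where it is.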

Reflecting the Past, Shaping the Future: Making AI Work for International Development


USAID Report: “We are in the midst of an unprecedented surge of interest in machine learning (ML) and artificial intelligence (AI) technologies. These tools, which allow computers to make data-derived predictions and automate decisions, have become part of daily life for billions of people. Ubiquitous digital services such as interactive maps, tailored advertisements, and voice-activated personal assistants are likely only the beginning. Some AI advocates even claim that AI’s impact will be as profound as “electricity or fire” and that it will revolutionize nearly every field of human activity. This enthusiasm has reached international development as well. Emerging ML/AI applications promise to reshape healthcare, agriculture, and democracy in the developing world. ML and AI show tremendous potential for helping to achieve sustainable development objectives globally. They can improve efficiency by automating labor-intensive tasks, or offer new insights by finding patterns in large, complex datasets. A recent report suggests that AI advances could double economic growth rates and increase labor productivity 40% by 2035. At the same time, the very nature of these tools — their ability to codify and reproduce patterns they detect — introduces significant concerns alongside promise.

In developed countries, ML tools have sometimes been found to automate racial profiling, to foster surveillance, and to perpetuate racial stereotypes. Algorithms may be used, either intentionally or unintentionally, in ways that result in disparate or unfair outcomes between minority and majority populations. Complex models can make it difficult to establish accountability or seek redress when models make mistakes. These shortcomings are not restricted to developed countries. They can manifest in any setting, especially in places with histories of ethnic conflict or inequality. As the development community adopts tools enabled by ML and AI, we need a clear-eyed understanding of how to ensure their application is effective, inclusive, and fair. This requires knowing when ML and AI offer a suitable solution to the challenge at hand. It also requires appreciating that these technologies can do harm — and committing to addressing and mitigating these harms.

ML and AI applications may sometimes seem like science fiction, and the technical intricacies of ML and AI can be off-putting for those who haven’t been formally trained in the field. However, there is a critical role for development actors to play as we begin to lean on these tools more and more in our work. Even without technical training in ML, development professionals have the ability — and the responsibility — to meaningfully influence how these technologies impact people.

You don’t need to be an ML or AI expert to shape the development and use of these tools. All of us can learn to ask the hard questions that will keep solutions working for, and not against, the development challenges we care about. Development practitioners already have deep expertise in their respective sectors or regions. They bring necessary experience in engaging local stakeholders, working with complex social systems, and identifying structural inequities that undermine inclusive progress. Unless this expert perspective informs the construction and adoption of ML/AI technologies, ML and AI will fail to reach their transformative potential in development.

This document aims to inform and empower those who may have limited technical experience as they navigate an emerging ML/AI landscape in developing countries. Donors, implementers, and other development partners should expect to come away with a basic grasp of common ML techniques and the problems ML is uniquely well-suited to solve. We will also explore some of the ways in which ML/AI may fail or be ill-suited for deployment in developing-country contexts. Awareness of these risks, and acknowledgement of our role in perpetuating or minimizing them, will help us work together to protect against harmful outcomes and ensure that AI and ML are contributing to a fair, equitable, and empowering future…(More)”.

The Role of Scholarly Communication in a Democratic Society


Introduction to Special Issue of the Journal of Librarianship and Scholarly Communication by Yasmeen Shorish: “The pillars of a democratic society (equity, a free press, fair elections, engaged citizens, and the equal application of laws) are directly impacted by the availability, accessibility, and accuracy of information. Additionally, engaged, critically thinking individuals require an understanding of how knowledge is produced and shared, who has the power to make that information available, and how they—as information consumers and producers—are involved in those processes. Proposed and adopted government policies and actions that limit transparency and engagement, the increasing commodification of learning, the framing of education as a measure of return on investment (ROI) in real dollars, and the rapid transition of the research landscape to an increasingly monopolized walled garden have been in motion for some time but come into sharp focus through the lens of scholarly communication.

Scholarly communication is a broad domain that covers how information and knowledge are created and shared, what levels of access to that information are available, and how economic factors influence information communication. This system affects both the production and consumption of information and knowledge.

As such, the question of democratic or equitable processes is internal (Is the scholarly communication domain democratic and equitable?) and external (How does scholarly communication affect a democratic society?). The scholarly communication and research landscapes have never been level playing fields for all interested parties. Funding constraints, prejudices, and politics have all been factors in the amplification and suppression of people’s perspectives. In this special issue, I wanted to investigate how librarians and other information professionals are interrogating those practices and situating their scholarly communication work within the frame of an equitable and democratic society. What are the challenges and the opportunities? Where are we making progress? Where is there disenfranchisement? …(More)”.

Keeping Democracy Alive in Cities


Myung J. Lee at the Stanford Social Innovation Review: “It seems everywhere I go these days, people are talking and writing and podcasting about America’s lack of trust—how people don’t trust government and don’t trust each other. President Trump discourages us from trusting anything, especially the media. Even nonprofit organizations, which comprise the heart of civil society, are not exempt: A recent study found that trust in NGOs dropped by nine percent between 2017 and 2018. This fundamental lack of trust is eroding the shared public space where progress and even governance can happen, putting democracy at risk.

How did we get here? Perhaps it’s because Americans have taken our democratic way of life for granted. Perhaps it’s because people’s individual and collective beliefs are more polarized—and more out in the open—than ever before. Perhaps we’ve stopped believing we can solve problems together.

There are, however, opportunities to rebuild and fortify our sense of trust. This is especially true at the local level, where citizens can engage directly with elected leaders, nonprofit organizations, and each other.

As French political scientist Alexis de Tocqueville observed in Democracy in America, “Municipal institutions constitute the strength of free nations. Town meetings are to liberty what primary schools are to science; they bring it within the people’s reach; they teach men how to use and how to enjoy it.” Through town halls and other means, cities are where citizens, elected leaders, and nonprofit organizations can most easily connect and work together to improve their communities.

Research shows that, while trust in government is low everywhere, it is highest in local government. This is likely because people can see that their votes influence issues they care about, and they can directly interact with their mayors and city council members. Unlike with members of Congress, citizens can form real relationships with local leaders through events like “walks with the mayor” and neighborhood cleanups. Some mayors do even more to connect with their constituents. In Detroit, for example, Mayor Michael Duggan meets with residents in their homes to help them solve problems and answer questions in person. Many mayors also join in neighborhood projects. San Jose Mayor Sam Liccardo, for example, participates in a different community cleanup almost every week. Engaged citizens who participate in these activities are more likely to feel that their participation in democratic society is valuable and effective.

The role of nonprofit and community-based organizations, then, is partly to sustain democracy by being the bridge between city governments and citizens, helping them work together to solve concrete problems. It’s hard and important work. Time and again, this kind of relationship- and trust-building through action creates ripple effects that grow over time.

In my work with Cities of Service, which helps mayors and other city leaders effectively engage their citizens to solve problems, I’ve learned that local government works better when it is open to the ideas and talents of citizens. Citizen collaboration can take many forms, including defining and prioritizing problems, generating solutions, and volunteering time, creativity, and expertise to set positive change in motion. Citizens can leverage their own deep expertise about what’s best for their families and communities to deliver better services and solve public problems….(More)”.

Message and Environment: a framework for nudges and choice architecture


Paper by Luca Congiu and Ivan Moscati in Behavioural Public Policy: “We argue that the diverse components of a choice architecture can be classified into two main dimensions – Message and Environment – and that the distinction between them is useful in order to better understand how nudges work. In the first part of this paper, we define what we mean by nudge, explain what Message and Environment are, argue that the distinction between them is conceptually robust and show that it is also orthogonal to other distinctions advanced in the nudge literature. In the second part, we review some common types of nudges and show they target either Message or Environment or both dimensions of the choice architecture. We then apply the Message–Environment framework to discuss some features of Amazon’s website and, finally, we indicate how the proposed framework could help a choice architect to design a new choice architecture….(More)”.

The UK’s Gender Pay Gap Open Data Law Has Flaws, But Is A Positive Step Forward


Article by Michael McLaughlin: “Last year, the United Kingdom enacted a new regulation requiring companies to report information about their gender pay gap—a measure of the difference in average pay between men and women. The new rules are a good example of how open data can drive social change. However, the regulations have produced some misleading statistics, highlighting the importance of carefully crafting reporting requirements to ensure that they produce useful data.

In the UK, nearly 11,000 companies have filed gender pay gap reports, which include both the difference between the mean and median hourly pay rates for men and women as well as the difference in bonuses. And the initial data reveals several interesting findings. Median pay for men is 11.8 percent higher than for women, on average, and nearly 87 percent of companies pay men more than women on average. In addition, over 1,000 firms had a median pay gap greater than 30 percent. The sectors with the highest pay gaps—construction, finance, and insurance—each pay men at least 20 percent more than women. A major reason for the gap is a lack of women in senior positions—UK women actually make more than men between the ages of 22-29. The total pay gap is also a result of more women holding part-time jobs.

However, as detractors note, the UK’s data can be misleading. For example, the data overstates the pay gap on bonuses because it does not adjust these figures for hours worked. More women work part-time than men, so it makes sense that women would receive less in bonus pay when they work less. The data also understates the pay gap because it excludes the high compensation of partners in organizations such as law firms, a group that includes few women. And it is important to note that—by definition—the pay gap data does not compare the wages of men and women working the same jobs, so the data says nothing about whether women receive equal pay for equal work.

Still, publication of the data has sparked an important national conversation. Google searches in the UK for the phrase “gender pay gap” experienced a 12-month high the week the regulations began enforcement, and major news sites like the Financial Times have provided significant coverage of the issue by analyzing the reported data. While it is too soon to tell if the law will change employer behavior, such as businesses hiring more female executives, or employee behavior, such as women leaving companies or fields that pay less, countries with similar reporting requirements, such as Belgium, have seen the pay gap narrow following implementation of their rules.

Requiring companies to report this data to the government may be the only way to obtain gender pay gap data, because evidence suggests that the private sector will not produce this data on its own. Only 300 UK organizations joined a voluntary government program to report their gender pay gap in 2011, and as few as 11 actually published the data. Crowdsourced efforts, where women voluntarily report their pay, have also suffered from incomplete data. And even complete data does not illuminate variables such as why women may work in a field that pays less….(More)”.
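The article's distinction between the mean and median pay gap, and its caveat that the gap says nothing about equal pay for equal work, can be made concrete with a small worked example. The sketch below assumes the gap is expressed as a percentage of men's pay, and the firm and its pay rates are entirely hypothetical.

```python
from statistics import mean, median


def pay_gap_percent(men, women, summary):
    """Gap between men's and women's pay as a percentage of men's pay.

    `summary` is the averaging function to apply (mean or median).
    The choice of men's pay as the denominator is an assumption of this sketch.
    """
    m, w = summary(men), summary(women)
    return 100 * (m - w) / m


# Hypothetical firm: everyone in the same role earns the same hourly rate.
JUNIOR, SENIOR = 15, 40
men = [JUNIOR] * 6 + [SENIOR] * 4    # 4 of 10 men hold senior roles
women = [JUNIOR] * 8 + [SENIOR] * 2  # 2 of 10 women hold senior roles

print("mean gap:  ", pay_gap_percent(men, women, mean))
print("median gap:", pay_gap_percent(men, women, median))
```

In this made-up firm the mean gap is 20 percent while the median gap is 0, even though men and women in the same role are paid identically. The entire gap comes from who holds the senior positions, which is the kind of composition effect the article attributes to a lack of women in senior roles.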

This co-op lets patients monetize their own health data


Eillie Anzilotti at FastCompany: “Diagnosed with juvenile arthritis as a kid, Jen Horonjeff knew she wanted to enter the medical field to help others navigate the healthcare system in America. She went on to get her Ph.D. in environmental medicine, hoping to better understand the social and contextual factors that surround the strict biology of a disease. Throughout her studies, though, something began to irk her. In both the practice of and research around medicine, she found that the perspective of the patient was all but nonexistent.

So in 2016, Horonjeff, along with her co-founder Ronnie Sharpe, who grew up with cystic fibrosis and founded a social network for others with the disease, started Savvy, a platform to bridge the gap between patients and practitioners. The platform officially launched in the fall of 2017, and recently became a public benefit corporation….

But Savvy also tackles another imbalance in the patient-practitioner relationship. Whenever a patient is seen by a doctor, or enters their information into a medical app or platform, they’re providing the health community an invaluable resource: their data. But they’re not getting compensated for it. To ensure that patients participating in Savvy get something in return, Horonjeff and Sharpe set their platform up as a cooperative, owned collectively by the patients that contribute to it. Any patient who wants to become a Savvy member pays a buy-in fee of $34, which establishes them as a member of the co-op (the fee is waived for patients who cannot afford it, and some other members give more than the base membership fee to subsidize others). “When people become members, they have a voice in what we do, and they also share in our profits,” Horonjeff says….(More)”.