Chapter by Bettina Lange and Fiona Haines in the book Regulatory Transformations: “Regulation is no longer the prerogative of either states or markets. Increasingly citizens in association with businesses catalyse regulation which marks the rise of a social sphere in regulation. Around the world, in San Francisco, Melbourne, Munich and Mexico City, citizens have sought to transform how and to what end economic transactions are conducted. For instance, ‘carrot mob’ initiatives use positive economic incentives, not provided by a state legal system, but by a collective of civil society actors in order to change business behaviour. In contrast to ‘negative’ consumer boycotts, ‘carrotmob’ events use ‘buycotts’. They harness competition between businesses as the lever for changing how and for what purpose business transactions are conducted. Through new social media ‘carrotmobs’ mobilize groups of citizens to purchase goods at a particular time in a specific shop. The business that promises to spend the greatest percentage of its takings on, for instance, environmental improvements, such as switching to a supplier of renewable energy, will be selected for an organized shopping spree and financially benefit from the extra income it receives from the ‘carrot mob’ event.’Carrot mob’ campaigns chime with other fundamental challenges to conventional economic activity, such as the shared use of consumer goods through citizens collective consumption which questions traditional conceptions of private property….(More; Other Chapters)”
Keynote by Robert M. Goerge at the 2016 Third International Conference on eDemocracy & eGovernment (ICEDEG) Open data portals are springing up around the world. Municipalities, states and countries have made available data that has never been as accessible to the general public. These data have led to many applications that have informed the public of new urban conditions or provided information to make urban life easier. However, it should be clear that these data have limitations in the effort to solve many urban problems because in may cases they do not provide all of the information that is needed by government and NGOs to get at the cause or at least correlations of the problem at hand. It is still necessary to have access to data that cannot be made public to address some of most serious urban problems. While this seems just to apply to public access, it is also the case that government employees or those with legitimate access to the necessary non-open data lack access because of legal, organizational, privacy, or bureaucratic issues. This limits the promise of increasing data-driven efforts to address the most critical urban issues. Solutions to these problems in the context of ethical behavior will be discussed….(More)”
Pasi Sahlberg and Jonathan Hasak in the Washington Post: “One thing that distinguishes schools in the United States from schools around the world is how data walls, which typically reflect standardized test results, decorate hallways and teacher lounges. Green, yellow, and red colors indicate levels of performance of students and classrooms. For serious reformers, this is the type of transparency that reveals more data about schools and is seen as part of the solution to how to conduct effective school improvement. These data sets, however, often don’t spark insight about teaching and learning in classrooms; they are based on analytics and statistics, not on emotions and relationships that drive learning in schools. They also report outputs and outcomes, not the impacts of learning on the lives and minds of learners….
If you are a leader of any modern education system, you probably care a lot about collecting, analyzing, storing, and communicating massive amounts of information about your schools, teachers, and students based on these data sets. This information is “big data,” a term that first appeared around 2000, which refers to data sets that are so large and complex that processing them by conventional data processing applications isn’t possible. Two decades ago, the type of data education management systems processed were input factors of education system, such as student enrollments, teacher characteristics, or education expenditures handled by education department’s statistical officer. Today, however, big data covers a range of indicators about teaching and learning processes, and increasingly reports on student achievement trends over time.
With the outpouring of data, international organizations continue to build regional and global data banks. Whether it’s the United Nations, the World Bank, the European Commission, or the Organization for Economic Cooperation and Development, today’s international reformers are collecting and handling more data about human development than before. Beyond government agencies, there are global education and consulting enterprises like Pearson and McKinsey that see business opportunities in big data markets.
Among the best known today is the OECD’s Program for International Student Assessment (PISA), which measures reading, mathematical, and scientific literacy of 15-year-olds around the world. OECD now also administers an Education GPS, or a global positioning system, that aims to tell policymakers where their education systems place in a global grid and how to move to desired destinations. OECD has clearly become a world leader in the big data movement in education.
Despite all this new information and benefits that come with it, there are clear handicaps in how big data has been used in education reforms. In fact, pundits and policymakers often forget that Big data, at best, only reveals correlations between variables in education, not causality. As any introduction to statistics course will tell you, correlation does not imply causation….
We believe that it is becoming evident that big data alone won’t be able to fix education systems. Decision-makers need to gain a better understanding of what good teaching is and how it leads to better learning in schools. This is where information about details, relationships and narratives in schools become important. These are what Martin Lindstrom calls “small data”: small clues that uncover huge trends. In education, these small clues are often hidden in the invisible fabric of schools. Understanding this fabric must become a priority for improving education.
To be sure, there is not one right way to gather small data in education. Perhaps the most important next step is to realize the limitations of current big data-driven policies and practices. Too strong reliance on externally collected data may be misleading in policy-making. This is an example of what small data look like in practice:
It reduces census-based national student assessments to the necessary minimum and transfer saved resources to enhance the quality of formative assessments in schools and teacher education on other alternative assessment methods. Evidence shows that formative and other school-based assessments are much more likely to improve quality of education than conventional standardized tests.
It strengthens collective autonomy of schools by giving teachers more independence from bureaucracy and investing in teamwork in schools. This would enhance social capital that is proved to be critical aspects of building trust within education and enhancing student learning.
It empowers students by involving them in assessing and reflecting their own learning and then incorporating that information into collective human judgment about teaching and learning (supported by national big data). Because there are different ways students can be smart in schools, no one way of measuring student achievement will reveal success. Students’ voices about their own growth may be those tiny clues that can uncover important trends of improving learning.
Edwards Deming once said that “without data you are another person with an opinion.” But Deming couldn’t have imagined the size and speed of data systems we have today….(More)”
Pradip Sigdyal at CNBC: “…The often cited case of big data discrimination points to a research conducted few years ago by Latanya Sweeny, who heads the Data Privacy Lab at Harvard University.
The case involves Google ad results when searching for certain kinds of names on the internet. In her research, Sweeney found that distinct sounding names often associated with blacks showed up with a disproportionately higher number of arrest record ads compared to white sounding names by roughly 18 percent of the time. Google has since fixed the issue, although they never publicly stated what they did to correct the problem.
According to some advocates, the problem is not so much that ‘big data discriminates’, but that failures by data professionals risk misinterpreting the findings at the heart of data mining and statistical learning. They add that the benefits far outweigh the concerns.
“In my academic research and industry consulting, I have seen tremendous benefits accruing to firms, organizations and consumers alike from the use of data-driven decision-making, data science, and business analytics,” Anindya Ghose, the director of Center for Business Analytics at New York University’s Stern School of Business, said.
“To be perfectly honest, I do not at all understand these big-data cynics who engage in fear mongering about the implications of data analytics,” Ghose said.
“Here is my message to the cynics and those who keep cautioning us: ‘Deal with it, big data analytics is here to stay forever’.”…(More)”
Clayton A Davis et al at Peer J. PrePrint: “The study of social phenomena is becoming increasingly reliant on big data from online social networks. Broad access to social media data, however, requires software development skills that not all researchers possess. Here we present the IUNI Observatory on Social Media, an open analytics platform designed to facilitate computational social science. The system leverages a historical, ongoing collection of over 70 billion public messages from Twitter. We illustrate a number of interactive open-source tools to retrieve, visualize, and analyze derived data from this collection. The Observatory, now available at osome.iuni.iu.edu, is the result of a large, six-year collaborative effort coordinated by the Indiana University Network Science Institute.”…(More)”
César A. Hidalgo at Scientific American: “Imagine shopping in a supermarket where every item is stored in boxes that look exactly the same. Some are filled with cereal, others with apples, and others with shampoo. Shopping would be an absolute nightmare! The design of most open data sites—the (usually government) sites that distribute census, economic and other data to be used and redistributed freely—is not exactly equivalent to this nightmarish supermarket. But it’s pretty close.
During the last decade, such sites—data.gov, data.gov.uk, data.gob.cl,data.gouv.fr, and many others—have been created throughout the world. Most of them, however, still deliver data as sets of links to tables, or links to other sites that are also hard to comprehend. In the best cases, data is delivered through APIs, or application program interfaces, which are simple data query languages that require a user to have a basic knowledge of programming. So understanding what is inside each dataset requires downloading, opening, and exploring the set in ways that are extremely taxing for users. The analogy of the nightmarish supermarket is not that far off.
THE U.S. GOVERNMENT’S OPEN DATA SITE
The consensus among those who have participated in the creation of open data sites is that current efforts have failed and we need new options. Pointing your browser to these sites should show you why. Most open data sites are badly designed, and here I am not talking about their aesthetics—which are also subpar—but about the conceptual model used to organize and deliver data to users. The design of most open data sites follows a throwing-spaghetti-against-the-wall strategy, where opening more data, instead of opening data better, has been the driving force.
Some of the design flaws of current open data sites are pretty obvious. The datasets that are more important, or could potentially be more useful, are not brought into the surface of these sites or are properly organized. In our supermarket analogy, not only all boxes look the same, but also they are sorted in the order they came. This cannot be the best we can do.
There are other design problems that are important, even though they are less obvious. The first one is that most sites deliver data in the way in which it is collected, instead of used. People are often looking for data about a particular place, occupation, industry, or about an indicator (such as income, or population). If the data they need comes from the national survey of X, or the bureau of Y, it is secondary and often—although not always—irrelevant to the user. Yet, even though this is not the way we should be giving data back to users, this is often what open data sites do.
The second non-obvious design problem, which is probably the most important, is that most open data sites bury data in what is known as the deep web. The deep web is the fraction of the Internet that is not accessible to search engines, or that cannot be indexed properly. The surface of the web is made of text, pictures, and video, which search engines know how to index. But search engines are not good at knowing that the number that you are searching for is hidden in row 17,354 of a comma separated file that is inside a zip file linked in a poorly described page of an open data site. In some cases, pressing a radio button and selecting options from a number of dropdown menus can get you the desired number, but this does not help search engines either, because crawlers cannot explore dropdown menus. To make open data really open, we need to make it searchable, and for that we need to bring data to the surface of the web.
So how do we that? The solution may not be simple, but it starts by taking design seriously. This is something that I’ve been doing for more than half a decade when creating data visualization engines at MIT. The latest iteration of our design principles are now embodied in DataUSA, a site we created in a collaboration between Deloitte, Datawheel, and my group at MIT.
So what is design, and how do we use it to improve open data sites? My definition of design is simple. Design is discovering the forms that best fulfill a function….(More)”
Sarah Telford and Stefaan G. Verhulst at Understanding Risk Forum: “….In creating the policy, OCHA partnered with the NYU Governance Lab (GovLab) and Leiden University to understand the policy and privacy landscape, best practices of partner organizations, and how to assess the data it manages in terms of potential harm to people.
We seek to share our findings with the UR community to get feedback and start a conversation around the risk to using certain types of data in humanitarian and development efforts and when understanding risk.
What is High-Risk Data?
High-risk data is generally understood as data that includes attributes about individuals. This is commonly referred to as PII or personally identifiable information. Data can also create risk when it identifies communities or demographics within a group and ties them to a place (i.e., women of a certain age group in a specific location). The risk comes when this type of data is collected and shared without proper authorization from the individual or the organization acting as the data steward; or when the data is being used for purposes other than what was initially stated during collection.
The potential harms of inappropriately collecting, storing or sharing personal data can affect individuals and communities that may feel exploited or vulnerable as the result of how data is used. This became apparent during the Ebola outbreak of 2014, when a number of data projects were implemented without appropriate risk management measures. One notable example was the collection and use of aggregated call data records (CDRs) to monitor the spread of Ebola, which not only had limited success in controlling the virus, but also compromised the personal information of those in Ebola-affected countries. (See Ebola: A Big Data Disaster).
A Data-Risk Framework
Regardless of an organization’s data requirements, it is useful to think through the potential risks and harms for its collection, storage and use. Together with the Harvard Humanitarian Initiative, we have set up a four-step data risk process that includes doing an assessment and inventory, understanding risks and harms, and taking measures to counter them.
Assessment – The first step is to understand the context within which the data is being generated and shared. The key questions to ask include: What is the anticipated benefit of using the data? Who has access to the data? What constitutes the actionable information for a potential perpetrator? What could set off the threat to the data being used inappropriately?
Data Inventory – The second step is to take inventory of the data and how it is being stored. Key questions include: Where is the data – is it stored locally or hosted by a third party? Where could the data be housed later? Who might gain access to the data in the future? How will we know – is data access being monitored?
Risks and Harms – The next step is to identify potential ways in which risk might materialize. Thinking through various risk-producing scenarios will help prepare staff for incidents. Examples of risks include: your organization’s data being correlated with other data sources to expose individuals; your organization’s raw data being publicly released; and/or your organization’s data system being maliciously breached.
Counter-Measures – The next step is to determine what measures would prevent risk from materializing. Methods and tools include developing data handling policies, implementing access controls to the data, and training staff on how to use data responsibly….(More)
Latest White House report on Big Data charts pathways for fairness and opportunity but also cautions against re-encoding bias and discrimination into algorithmic systems: ” Advertisements tailored to reflect previous purchasing decisions; targeted job postings based on your degree and social networks; reams of data informing predictions around college admissions and financial aid. Need a loan? There’s an app for that.
As technology advances and our economic, social, and civic lives become increasingly digital, we are faced with ethical questions of great consequence. Big data and associated technologies create enormous new opportunities to revisit assumptions and instead make data-driven decisions. Properly harnessed, big data can be a tool for overcoming longstanding bias and rooting out discrimination.
The era of big data is also full of risk. The algorithmic systems that turn data into information are not infallible—they rely on the imperfect inputs, logic, probability, and people who design them. Predictors of success can become barriers to entry; careful marketing can be rooted in stereotype. Without deliberate care, these innovations can easily hardwire discrimination, reinforce bias, and mask opportunity.
Because technological innovation presents both great opportunity and great risk, the White House has released several reports on “big data” intended to prompt conversation and advance these important issues. The topics of previous reports on data analytics included privacy, prices in the marketplace, and consumer protection laws. Today, we are announcing the latest report on big data, one centered on algorithmic systems, opportunity, and civil rights.
The first big data report warned of “the potential of encoding discrimination in automated decisions”—that is, discrimination may “be the inadvertent outcome of the way big data technologies are structured and used.” A commitment to understanding these risks and harnessing technology for good prompted us to specifically examine the intersection between big data and civil rights.
Using case studies on credit lending, employment, higher education, and criminal justice, the report we are releasing today illustrates how big data techniques can be used to detect bias and prevent discrimination. It also demonstrates the risks involved, particularly how technologies can deliberately or inadvertently perpetuate, exacerbate, or mask discrimination.
The purpose of the report is not to offer remedies to the issues it raises, but rather to identify these issues and prompt conversation, research—and action—among technologists, academics, policy makers, and citizens, alike.
The report includes a number of recommendations for advancing work in this nascent field of data and ethics. These include investing in research, broadening and diversifying technical leadership, cross-training, and expanded literacy on data discrimination, bolstering accountability, and creating standards for use within both the government and the private sector. It also calls on computer and data science programs and professionals to promote fairness and opportunity as part of an overall commitment to the responsible and ethical use of data.
Big data is here to stay; the question is how it will be used: to advance civil rights and opportunity, or to undermine them….(More)”
Paper by Bezuidenhout, L, Rappert, B, Kelly, A and Leonelli, S: “Poor provision of information and communication technologies in low/middle-income countries represents a concern for promoting Open Data. This is often framed as a “digital divide” and addressed through initiatives that increase the availability of information and communication technologies to researchers based in low-resourced environments, as well as the amount of resources freely accessible online, including data themselves. Using empirical data from a qualitative study of lab-based research in Africa we highlight the limitations of such framing and emphasize the range of additional factors necessary to effectively utilize data available online. We adopt the ‘Capabilities Approach’ proposed by Sen to highlight the distinction between simply making resources available, and doing so while fostering researchers’ ability to use them. This provides an alternative orientation that highlights the persistence of deep inequalities within the seemingly egalitarian-inspired Open Data landscape. The extent and manner of future data sharing, we propose, will hinge on the ability to respond to the heterogeneity of research environments…(More)“
Conference Paper by Satchell, Christine et al :”Social media platforms risk polarising public opinions by employing proprietary algorithms that produce filter bubbles and echo chambers. As a result, the ability of citizens and communities to engage in robust debate in the public sphere is diminished. In response, this paper highlights the capacity of urban interfaces, such as pervasive displays, to counteract this trend by exposing citizens to the socio-cultural diversity of the city. Engagement with different ideas, networks and communities is crucial to both innovation and the functioning of democracy. We discuss examples of urban interfaces designed to play a key role in fostering this engagement. Based on an analysis of works empirically-grounded in field observations and design research, we call for a theoretical framework that positions pervasive displays and other urban interfaces as civic media. We argue that when designed for more than wayfinding, advertisement or television broadcasts, urban screens as civic media can rectify some of the pitfalls of social media by allowing the polarised user to break out of their filter bubble and embrace the cultural diversity and richness of the city….(More)”