‘Big data’ was supposed to fix education. It didn’t. It’s time for ‘small data’


Pasi Sahlberg and Jonathan Hasak in the Washington Post: “One thing that distinguishes schools in the United States from schools around the world is how data walls, which typically reflect standardized test results, decorate hallways and teacher lounges. Green, yellow, and red colors indicate levels of performance of students and classrooms. For serious reformers, this is the type of transparency that reveals more data about schools and is seen as part of the solution to effective school improvement. These data sets, however, often don’t spark insight about teaching and learning in classrooms; they are based on analytics and statistics, not on the emotions and relationships that drive learning in schools. They also report outputs and outcomes, not the impacts of learning on the lives and minds of learners….

If you are a leader of any modern education system, you probably care a lot about collecting, analyzing, storing, and communicating massive amounts of information about your schools, teachers, and students based on these data sets. This information is “big data,” a term that first appeared around 2000 and refers to data sets so large and complex that processing them with conventional data-processing applications isn’t possible. Two decades ago, the types of data education management systems processed were input factors of the education system, such as student enrollments, teacher characteristics, or education expenditures, handled by the education department’s statistical officers. Today, however, big data covers a range of indicators about teaching and learning processes, and increasingly reports on student achievement trends over time.

With the outpouring of data, international organizations continue to build regional and global data banks. Whether it’s the United Nations, the World Bank, the European Commission, or the Organization for Economic Cooperation and Development, today’s international reformers are collecting and handling more data about human development than ever before. Beyond government agencies, there are global education and consulting enterprises like Pearson and McKinsey that see business opportunities in big data markets.

Among the best known today is the OECD’s Program for International Student Assessment (PISA), which measures the reading, mathematical, and scientific literacy of 15-year-olds around the world. OECD now also administers an Education GPS, or global positioning system, which aims to tell policymakers where their education systems stand in a global grid and how to move toward desired destinations. OECD has clearly become a world leader in the big data movement in education.

Despite all this new information and the benefits that come with it, there are clear handicaps in how big data has been used in education reforms. In fact, pundits and policymakers often forget that big data, at best, only reveals correlations between variables in education, not causality. As any introductory statistics course will tell you, correlation does not imply causation….
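
The statistical point is easy to demonstrate in a few lines of code: two variables driven by a shared confounder correlate strongly even when neither causes the other. The toy scenario and numbers below are our own invention for illustration, not data from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: household income drives both access to tutoring and test
# scores; in this simulation tutoring has no causal effect on scores.
income = rng.normal(50, 10, 10_000)
tutoring_hours = income + rng.normal(0, 5, 10_000)
test_scores = income + rng.normal(0, 5, 10_000)

# The two outcome variables are still strongly correlated (~0.8),
# purely because they share a common cause.
print(np.corrcoef(tutoring_hours, test_scores)[0, 1])
```

A policymaker looking only at the correlation would conclude tutoring raises scores; by construction here, it does not.
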
We believe it is becoming evident that big data alone won’t be able to fix education systems. Decision-makers need to gain a better understanding of what good teaching is and how it leads to better learning in schools. This is where information about details, relationships, and narratives in schools becomes important. These are what Martin Lindstrom calls “small data”: small clues that uncover huge trends. In education, these small clues are often hidden in the invisible fabric of schools. Understanding this fabric must become a priority for improving education.

To be sure, there is no one right way to gather small data in education. Perhaps the most important next step is to realize the limitations of current big-data-driven policies and practices. Relying too heavily on externally collected data can be misleading in policymaking. Here is what small data looks like in practice:

  • It reduces census-based national student assessments to the necessary minimum and transfers the saved resources to enhancing the quality of formative assessments in schools and teacher education in alternative assessment methods. Evidence shows that formative and other school-based assessments are much more likely to improve the quality of education than conventional standardized tests.
  • It strengthens the collective autonomy of schools by giving teachers more independence from bureaucracy and by investing in teamwork in schools. This would enhance social capital, which has proved to be a critical aspect of building trust within education and enhancing student learning.
  • It empowers students by involving them in assessing and reflecting on their own learning and then incorporating that information into collective human judgment about teaching and learning (supported by national big data). Because there are different ways students can be smart in schools, no single way of measuring student achievement will reveal success. Students’ voices about their own growth may be those tiny clues that can uncover important trends for improving learning.

W. Edwards Deming once said that “without data you are just another person with an opinion.” But Deming couldn’t have imagined the size and speed of the data systems we have today….(More)”

OSoMe: The IUNI observatory on social media


Clayton A. Davis et al. at PeerJ Preprints: “The study of social phenomena is becoming increasingly reliant on big data from online social networks. Broad access to social media data, however, requires software development skills that not all researchers possess. Here we present the IUNI Observatory on Social Media, an open analytics platform designed to facilitate computational social science. The system leverages a historical, ongoing collection of over 70 billion public messages from Twitter. We illustrate a number of interactive open-source tools to retrieve, visualize, and analyze derived data from this collection. The Observatory, now available at osome.iuni.iu.edu, is the result of a large, six-year collaborative effort coordinated by the Indiana University Network Science Institute.”…(More)”
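
For readers curious what programmatic retrieval from such a platform might look like, here is a hypothetical sketch of querying an Observatory-style HTTP endpoint from Python. The base URL, endpoint name, parameters, and response shape are all assumptions made for illustration, not the documented OSoMe API; consult osome.iuni.iu.edu for the actual tools.

```python
import requests

# Assumed base URL and endpoint, for illustration only.
BASE = "https://osome.iuni.iu.edu/api"

resp = requests.get(
    f"{BASE}/counts",  # hypothetical endpoint name
    params={"query": "#openscience", "start": "2016-01-01", "end": "2016-03-31"},
    timeout=30,
)
resp.raise_for_status()

# Assumed response shape: a daily timeline of matching-message counts.
for day in resp.json().get("timeline", []):
    print(day["date"], day["count"])
```
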

A Framework for Understanding Data Risk


Sarah Telford and Stefaan G. Verhulst at Understanding Risk Forum: “….In creating the policy, OCHA partnered with the NYU Governance Lab (GovLab) and Leiden University to understand the policy and privacy landscape, best practices of partner organizations, and how to assess the data it manages in terms of potential harm to people.

We seek to share our findings with the UR community to get feedback and start a conversation around the risks of using certain types of data in humanitarian and development efforts.

What is High-Risk Data?

High-risk data is generally understood as data that includes attributes about individuals. This is commonly referred to as PII, or personally identifiable information. Data can also create risk when it identifies communities or demographics within a group and ties them to a place (e.g., women of a certain age group in a specific location). The risk comes when this type of data is collected and shared without proper authorization from the individual or the organization acting as the data steward, or when the data is used for purposes other than those initially stated during collection.

The potential harms of inappropriately collecting, storing or sharing personal data can affect individuals and communities that may feel exploited or vulnerable as the result of how data is used. This became apparent during the Ebola outbreak of 2014, when a number of data projects were implemented without appropriate risk management measures. One notable example was the collection and use of aggregated call data records (CDRs) to monitor the spread of Ebola, which not only had limited success in controlling the virus, but also compromised the personal information of those in Ebola-affected countries. (See Ebola: A Big Data Disaster).

A Data-Risk Framework

Regardless of an organization’s data requirements, it is useful to think through the potential risks and harms of its collection, storage, and use. Together with the Harvard Humanitarian Initiative, we have set up a four-step data-risk process: assessing the context, taking inventory of the data, understanding risks and harms, and adopting counter-measures. The steps are described below, followed by a sketch of how they might be captured as a simple checklist.

  1. Assessment – The first step is to understand the context within which the data is being generated and shared. The key questions to ask include: What is the anticipated benefit of using the data? Who has access to the data? What constitutes actionable information for a potential perpetrator? What could trigger the threat of the data being used inappropriately?
  2. Data Inventory – The second step is to take inventory of the data and how it is being stored. Key questions include: Where is the data – is it stored locally or hosted by a third party? Where could the data be housed later? Who might gain access to the data in the future? How will we know – is data access being monitored?
  3. Risks and Harms – The next step is to identify potential ways in which risk might materialize. Thinking through various risk-producing scenarios will help prepare staff for incidents. Examples of risks include: your organization’s data being correlated with other data sources to expose individuals; your organization’s raw data being publicly released; and/or your organization’s data system being maliciously breached.
  4. Counter-Measures – The final step is to determine what measures would prevent risk from materializing. Methods and tools include developing data handling policies, implementing access controls to the data, and training staff on how to use data responsibly….(More)
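
As referenced above, here is a minimal sketch of the four-step process expressed as a checklist data structure. The class and field names are our own illustration under stated assumptions, not part of the OCHA/Harvard Humanitarian Initiative framework itself.

```python
from dataclasses import dataclass, field

@dataclass
class DataRiskReview:
    dataset: str
    # 1. Assessment: the context in which data is generated and shared
    anticipated_benefit: str = ""
    who_has_access: list = field(default_factory=list)
    # 2. Data inventory: where the data lives, now and later
    storage_location: str = ""  # e.g. "local" or "third-party host"
    access_monitored: bool = False
    # 3. Risks and harms: scenarios in which risk could materialize
    risk_scenarios: list = field(default_factory=list)
    # 4. Counter-measures: steps to keep those risks from materializing
    counter_measures: list = field(default_factory=list)

    def unmitigated(self) -> bool:
        """True when risks are identified but no counter-measures are."""
        return bool(self.risk_scenarios) and not self.counter_measures

# Hypothetical usage, loosely echoing the CDR example above:
review = DataRiskReview(
    dataset="aggregated call data records",
    anticipated_benefit="model population movement during an outbreak",
    who_has_access=["internal analysts", "partner agency"],
    storage_location="third-party host",
    risk_scenarios=["re-identification by correlating with other data"],
)
print(review.unmitigated())  # True: counter-measures still needed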

Impact of open government: Mapping the research landscape


Stephen Davenport at OGP Blog: “Government reformers and development practitioners in the open government space are experiencing the heady times associated with a newly defined agenda. The opportunity for innovation and positive change can at times feel boundless. Yet working in a nascent field also means a relative lack of “proven” tools and solutions (to the extent that they ever exist in development).

More research on the potential for open government initiatives to improve lives is well underway. However, keeping up with the rapidly evolving landscape of ongoing research, emerging hypotheses, and high-priority knowledge gaps has been a challenge, even as investment in open government activities has accelerated. This becomes increasingly important as we gather to talk progress at the OGP Africa Regional Meeting 2016 and GIFT consultations in Cape Town next week (May 4-6).

Who’s doing what?
To advance the state of play, a new report commissioned by the World Bank, “Open Government Impact and Outcomes: Mapping the Landscape of Ongoing Research”, categorizes and takes stock of existing research. The report represents the first output of a newly formed consortium that aims to generate practical, evidence-based guidance for open government stakeholders, building on and complementing the work of organizations across the academic-practitioner spectrum.

The mapping exercise led to the creation of an interactive platform with detailed information on how to find out more about each of the research projects covered, organized by a new typology for open government interventions. The inventory is limited in scope given practical and other considerations: it includes only projects that are currently underway and that are relatively large and international in nature, and it is meant to be a forward-looking overview rather than a literature review.

Charting a course: How can the World Bank add value?
The scope for increasing the open government knowledge base remains vast. The report suggests that, given its role as a lender, convener, and policy advisor, the World Bank is well positioned to complement and support existing research in a number of ways, such as:

  • Taking a demand-driven approach, focusing on specific areas where it can identify lessons for stakeholders seeking to turn open government enthusiasm into tangible results.
  • Linking researchers with governments and practitioners to study specific areas of interest (in particular, access to information and social accountability interventions).
  • Evaluating the impact of open government reforms against baseline data that may not be public yet, but that are accessible to the World Bank.
  • Contributing to a better understanding of the role and impact of ICTs through work like the recently published study that examines the relationship between digital citizen engagement and government responsiveness.
  • Ensuring that World Bank loans and projects are conceived as opportunities for knowledge generation, while incorporating the most relevant and up-to-date evidence on what works in different contexts.
  • Leveraging its involvement in the Open Government Partnership to help stakeholders make evidence-based reform commitments….(More)

Four Steps to Enhanced Crowdsourcing


Kendra L. Smith and Lindsey Collins at Planetizen: “Over the past decade, crowdsourcing has grown to significance through crowdfunding, crowd collaboration, crowd voting, and crowd labor. The idea behind crowdsourcing is simple: decentralize decision-making by utilizing large groups of people to assist with solving problems, generating ideas, funding, generating data, and making decisions. We have seen crowdsourcing used in both the private and public sectors. In a previous article, “Empowered Design, By ‘the Crowd,’” we discuss the significant role crowdsourcing can play in urban planning through citizen engagement.

Crowdsourcing in the public sector represents a more inclusive form of governance that incorporates a multi-stakeholder approach; it goes beyond regular forms of community engagement and allows citizens to participate in decision-making. When citizens help inform decision-making, new opportunities are created for cities—opportunities that are beginning to unfold for planners. However, despite its obvious utility, crowdsourcing is underutilized by planners. A key reason for its underuse is a lack of credibility and accountability in crowdsourcing endeavors.

Crowdsourcing credibility speaks to the capacity to trust a source and discern whether information is, indeed, true. While it can be difficult to know if any information is definitively true, indicators of fact or truth include where information was collected, how information was collected, and how rigorously it was fact-checked or peer reviewed. However, in the digital universe of today, individuals can make a habit of posting inaccurate, salacious, malicious, and flat-out false information. The realities of contemporary media make it more difficult to trust crowdsourced information for decision-making, especially for the public sector, where the use of inaccurate information can impact the lives of many and the trajectory of a city. As a result, there is a need to establish accountability measures to enhance crowdsourcing in urban planning.

Establishing Accountability Measures

For urban planners considering crowdsourcing, establishing a system of accountability measures might seem like more effort than it is worth. However, that is simply not true. Recent evidence shows that participation in traditional community engagement (e.g., town halls, forums, city council meetings) is lower than ever. Current engagement also tends to focus on problems in the community rather than the development of the community. Crowdsourcing offers new opportunities for ongoing and sustainable engagement with the community. It can be simple as well.

The following four methods can be used separately or together (we hope they are used together) to help establish accountability and credibility in the crowdsourcing process:

  1. Agenda setting
  2. Growing a crowdsourcing community
  3. Facilitators/subject matter experts (SMEs)
  4. Microtasking

In addition to boosting credibility, building a framework of accountability measures can help planners and crowdsourcing communities clearly define their work, engage the community, sustain community engagement, acquire help with tasks, obtain diverse opinions, and become more inclusive….(More)”
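
The excerpt stops short of explaining microtasking, but the core idea is simple enough to sketch: split one large review job, such as vetting crowdsourced submissions, into small batches that individual volunteers can claim. The function and batch size below are our own illustration, not from the Planetizen piece.

```python
def microtasks(items, batch_size=5):
    """Yield successive small batches that individual reviewers can claim."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Hypothetical example: 17 crowdsourced comments become 4 small tasks.
submissions = [f"comment-{n}" for n in range(17)]
for task_id, batch in enumerate(microtasks(submissions)):
    print(f"task {task_id}: {batch}")
```
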

Supply and demand of open data in Mexico: A diagnostic report on the government’s new open data portal


Report by Juan Ortiz Freuler: “Following a promising and already well-established trend, in February 2014 the Office of the President of Mexico launched its open data portal (datos.gob.mx). This diagnostic, carried out between July and September of 2015, is designed to brief international donors and stakeholders such as members of the Open Government Partnership Steering Committee. It provides the reader with contextual information to understand the state of supply and demand for open data from the portal, and the specific challenges the Mexican government is facing in its quest to implement the policy. The insights offered through data processing and interviews with key stakeholders indicate the need to promote: i) a sense of ownership of datos.gob.mx by the user community, but particularly by the officials in charge of implementing the policy within each government unit; ii) the development of tools and mechanisms to increase the quality of the data provided through the portal; and iii) civic hacking of the portal to promote innovation, and a sense of appropriation that would increase the policy’s long-term resilience to partisan and leadership change….(More)”

See also underlying data: http://bit.ly/dataMXEng1 | Spanish here: http://bit.ly/DataMxCastell | Underlying data: http://bit.ly/dataMX2

A Political Economy Framework for the Urban Data Revolution


Research Report by Ben Edwards, Solomon Greene and G. Thomas Kingsley: “With cities growing rapidly throughout much of the developing world, the global development community increasingly recognizes the need to build the capacities of local leaders to analyze and apply data to improve urban policymaking and service delivery. Civil society leaders, development advocates, and local governments are calling for an “urban data revolution” to accompany the new UN Sustainable Development Goals (SDGs), a revolution that would provide city leaders new tools and resources for data-driven governance. The need for improved data and analytic capacity in rapidly growing cities is clear, as is the exponential increase in the volume and types of data available for policymaking. However, the institutional arrangements that will allow city leaders to use data effectively remain incompletely theorized and poorly articulated.

This paper begins to fill that gap with a political economy framework that introduces three new concepts: permission, incentive, and institutionalization. We argue that without addressing the permission constraints and competing incentives that local government officials face in using data, investments in improved data collection at the local level will fail to achieve smarter urban policies. Granting permission and aligning incentives are also necessary to institutionalize data-driven governance at the local level and create a culture of evidence-based decisionmaking that outlives individual political administrations. Lastly, we suggest how the SDGs could support a truly transformative urban data revolution in which city leaders are empowered and incentivized to use data to drive decisionmaking for sustainable development…(More)”

Crowdsourcing global governance: sustainable development goals, civil society, and the pursuit of democratic legitimacy


Paper by Joshua C. Gellers in International Environmental Agreements: Politics, Law and Economics: “To what extent can crowdsourcing help members of civil society overcome the democratic deficit in global environmental governance? In this paper, I evaluate the utility of crowdsourcing as a tool for participatory agenda-setting in the realm of post-2015 sustainable development policy. In particular, I analyze the descriptive representativeness (e.g., the degree to which participation mirrors the demographic attributes of non-state actors comprising global civil society) of participants in two United Nations orchestrated crowdsourcing processes—the MY World survey and e-discussions regarding environmental sustainability. I find that there exists a perceptible demographic imbalance among contributors to the MY World survey and considerable dissonance between the characteristics of participants in the e-discussions and those whose voices were included in the resulting summary report. The results suggest that although crowdsourcing may present an attractive technological approach to expand participation in global governance, ultimately the representativeness of that participation and the legitimacy of policy outputs depend on the manner in which contributions are solicited and filtered by international institutions….(More)”
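
One simple way to make “descriptive representativeness” concrete is an index of dissimilarity between participants’ demographic shares and those of the reference population. The sketch below uses invented numbers rather than figures from the paper, and the index is a generic measure, not necessarily the one Gellers employs.

```python
# Demographic shares of participants vs. the reference population.
# All figures are invented placeholders for illustration.
participants = {"under_30": 0.62, "30_to_50": 0.28, "over_50": 0.10}
population = {"under_30": 0.45, "30_to_50": 0.35, "over_50": 0.20}

# Index of dissimilarity: half the sum of absolute share differences.
# 0 means perfectly representative; 1 means complete divergence.
dissimilarity = 0.5 * sum(
    abs(participants[g] - population[g]) for g in population
)
print(f"{dissimilarity:.2f}")  # 0.17 for these toy numbers
```
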

UN-Habitat Urban Data Portal


Data Driven Journalism: UN-Habitat has launched a new web portal featuring a wealth of city data based on its repository of research on urban trends.

Launched during the 25th Governing Council, the Urban Data Portal allows users to explore data from 741 cities in 220 countries, and compare these for 103 indicators such as slum prevalence and city prosperity.

Image: A comparison of share in national urban population and average annual rate of urban population change for San Salvador, El Salvador, and Asuncion, Paraguay.

The urban indicators data available are analyzed, compiled and published by UN-Habitat’s Global Urban Observatory, which supports governments, local authorities and civil society organizations to develop urban indicators, data and statistics.

Leveraging GIS technology, the Observatory collects data by taking aerial photographs, zooming into particular areas, and then sending in survey teams to answer any remaining questions about the area’s urban development.

The Portal also contains data collected by national statistics authorities, via household surveys and censuses, with analysis conducted by leading urbanists in UN-HABITAT’s State of the World’s Cities and the Global Report on Human Settlements report series.

For the first time, these datasets are available for use under an open licence agreement, and can be downloaded in straightforward database formats like CSV and JSON….(More)
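
Because the exports come as CSV and JSON, working with them needs nothing beyond the standard library. The records and column names below are invented placeholders; an actual export from the Urban Data Portal may be structured differently.

```python
import csv
import io
import json

# Placeholder CSV content standing in for a downloaded export.
csv_export = io.StringIO(
    "city,indicator,value\n"
    "San Salvador,city prosperity,0.62\n"
    "Asuncion,city prosperity,0.71\n"
)
rows = list(csv.DictReader(csv_export))

# The same data downloaded as JSON parses just as directly.
json_export = '[{"city": "San Salvador", "indicator": "city prosperity", "value": 0.62}]'
records = json.loads(json_export)

print(rows[1]["city"], records[0]["value"])
```
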

The Wisdom of the Many in Global Governance: An Epistemic-Democratic Defence of Diversity and Inclusion


Paper by H. Stevenson: “Over the past two decades, a growing body of literature has highlighted moral reasons for taking global democracy seriously. This literature justifies democracy on the grounds of its intrinsic value. But democracy also has instrumental value: the rule of the many is epistemically superior to the rule of one or the rule of the few. This paper draws on the tradition of epistemic democracy to develop an instrumentalist justification for democratizing global governance. The tradition of epistemic democracy is enjoying a renaissance within political theory and popular non-fiction, yet its relevance for international relations remains unexplored. I develop an epistemic-democratic framework for evaluating political institutions, which is constituted by three principles. The likelihood of making correct decisions within institutions of global governance will be greater when (1) human development and capacity for participation is maximised; (2) the internal cognitive diversity of global institutions is maximised; and (3) public opportunities for sharing objective and subjective knowledge are maximised. Applying this framework to global governance produces a better understanding of the nature and extent of the ‘democratic deficit’ of global governance, as well as the actions required to address this deficit….(More)”