When Guarding Student Data Endangers Valuable Research


Susan M. Dynarski in the New York Times: “There is widespread concern over threats to privacy posed by the extensive personal data collected by private companies and public agencies.

Some of the potential danger comes from the government: The National Security Agency has swept up the telephone records of millions of people, in what it describes as a search for terrorists. Other threats are posed by hackers, who have exploited security gaps to steal data from retail giants like Target and from the federal Office of Personnel Management.

Resistance to data collection was inevitable — and it has been particularly intense in education.

Privacy laws have already been strengthened in some states, and multiple bills now pending in state legislatures and in Congress would tighten the security and privacy of student data. Some of this proposed legislation is so broadly written, however, that it could unintentionally choke off the use of student data for its original purpose: assessing and improving education. This data has already exposed inequities, allowing researchers and advocates to pinpoint where poor, nonwhite and non-English-speaking children have been educated inadequately by their schools.

Data gathering in education is indeed extensive: Across the United States, large, comprehensive administrative data sets now track the academic progress of tens of millions of students. Educators parse this data to understand what is working in their schools. Advocates plumb the data to expose unfair disparities in test scores and graduation rates, building cases to target more resources for the poor. Researchers rely on this data when measuring the effectiveness of education interventions.

To my knowledge there has been no large-scale, Target-like theft of private student records — probably because students’ test scores don’t have the market value of consumers’ credit card numbers. Parents’ concerns have mainly centered not on theft, but on the sharing of student data with third parties, including education technology companies. Last year, parents resisted efforts by the tech start-up InBloom to draw data on millions of students into the cloud and return it to schools as teacher-friendly “data dashboards.” Parents were deeply uncomfortable with a third party receiving and analyzing data about their children.

In response to such concerns, some pending legislation would scale back the authority of schools, districts and states to share student data with third parties, including researchers. Perhaps the most stringent of these proposals, sponsored by Senator David Vitter, a Louisiana Republican, would effectively end the analysis of student data by outside social scientists. This legislation would have banned recent prominent research documenting the benefits of smaller classes, the value of excellent teachers and the varied performance of charter schools.

Under current law, education agencies can share data with outside researchers only to benefit students and improve education. Collaborations with researchers allow districts and states to tap specialized expertise that they otherwise couldn’t afford. The Boston public school district, for example, has teamed up with early-childhood experts at Harvard to plan and evaluate its universal prekindergarten program.

In one of the longest-standing research partnerships, the University of Chicago works with the Chicago Public Schools to improve education. Partnerships like Chicago’s exist across the nation, funded by foundations and the United States Department of Education. In one initiative, a Chicago research consortium compiled reports showing high school principals that many of the seniors they had sent off to college swiftly dropped out without earning a degree. This information spurred efforts to improve high school counseling and college placement.

Specific, tailored information in the hands of teachers, principals or superintendents empowers them to do better by their students. No national survey could have told Chicago’s principals how their students were doing in college. Administrative data can provide this information, cheaply and accurately…(More)”

‘Beating the news’ with EMBERS: Forecasting Civil Unrest using Open Source Indicators


Paper by Naren Ramakrishnan et al.: “We describe the design, implementation, and evaluation of EMBERS, an automated, 24×7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future since Nov 2012 which have been (and continue to be) evaluated by an independent T&E team (MITRE). Of note, EMBERS has successfully forecast the uptick and downtick of incidents during the June 2013 protests in Brazil. We outline the system architecture of EMBERS, individual models that leverage specific data sources, and a fusion and suppression engine that supports trading off specific evaluation criteria. EMBERS also provides an audit trail interface that enables the investigation of why specific predictions were made along with the data utilized for forecasting. Through numerous evaluations, we demonstrate the superiority of EMBERS over baserate methods and its capability to forecast significant societal happenings….(More)”

Big Data’s Impact on Public Transportation


InnovationEnterprise: “Getting around any big city can be a real pain. Traffic jams seem to be a constant complaint, and simply getting to work can turn into a chore, even on the best of days. With more people than ever before flocking to the world’s major metropolitan areas, the issues of crowding and inefficient transportation only stand to get much worse. Luckily, the traditional methods of managing public transportation could be on the verge of changing thanks to advances in big data. While big data use cases have been a part of the business world for years now, city planners and transportation experts are quickly realizing how valuable it can be when making improvements to city transportation. That hour-long commute may no longer be something travelers will have to worry about in the future.

In much the same way that big data has transformed businesses around the world by offering greater insight into the behavior of their customers, it can also provide a deeper look at travellers. Like retail customers, commuters have certain patterns they like to keep to when on the road or riding the rails. Travellers also have their own motivations and desires, and getting to the heart of their actions is all part of what big data analytics is about. By analyzing these actions and the factors that go into them, transportation experts can gain a better understanding of why people choose certain routes or why they prefer one method of transportation over another. Based on these findings, planners can then figure out where to focus their efforts and respond to the needs of millions of commuters.

Gathering the accurate data needed to make knowledgeable decisions regarding city transportation can be a challenge in itself, especially considering how many people commute to work in a major city. New methods of data collection have made that effort easier and a lot less costly. One way that’s been implemented is through the gathering of call data records (CDR). From regular transactions made from mobile devices, information about location, time, and duration of an action (like a phone call) can give data scientists the necessary details on where people are traveling to, how long it takes them to get to their destination, and other useful statistics. The valuable part of this data is the sample size, which provides a much bigger picture of the transportation patterns of travellers.

That’s not the only way cities are using big data to improve public transportation though. Melbourne in Australia has long been considered one of the world’s best cities for public transit, and much of that is thanks to big data. With big data and ad hoc analysis, Melbourne’s acclaimed tram system can automatically reconfigure routes in response to sudden problems or challenges, such as a major city event or natural disaster. Data is also used in this system to fix problems before they turn serious. Sensors located in equipment like tram cars and tracks can detect when maintenance is needed on a specific part. Crews are quickly dispatched to repair what needs fixing, and the tram system continues to run smoothly. This is similar to the idea of the Internet of Things, wherein embedded sensors collect data that is then analyzed to identify problems and improve efficiency.

Sao Paulo, Brazil is another city that sees the value of using big data for its public transportation. The city’s efforts concentrate on improving the management of its bus fleet. With big data collected in real time, the city can get a more accurate picture of just how many people are riding the buses, which routes are on time, how drivers respond to changing conditions, and many other factors. Based on this information, Sao Paulo can optimize its operations, providing added vehicles where demand is genuine whilst finding which routes are the most efficient. Without big data analytics, this process would have taken a very long time and would likely be hit-or-miss in terms of accuracy, but now, big data provides more certainty in a shorter amount of time….(More)”
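
The call data record approach mentioned in the excerpt above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (subscriber, timestamp, cell), the sample rows, and the function names are hypothetical and do not come from the article. It treats two consecutive events by the same subscriber at different cells as one trip, so the elapsed time is only an upper bound on the real travel time.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical CDR rows: (subscriber_id, ISO timestamp, cell_id).
RECORDS = [
    ("u1", "2015-06-01T08:00:00", "A"),
    ("u1", "2015-06-01T08:40:00", "B"),
    ("u2", "2015-06-01T08:10:00", "A"),
    ("u2", "2015-06-01T09:10:00", "B"),
    ("u2", "2015-06-01T17:30:00", "B"),
    ("u1", "2015-06-01T18:00:00", "B"),
    ("u1", "2015-06-01T18:30:00", "A"),
]


def travel_times(records):
    """Return {(origin_cell, dest_cell): [minutes, ...]} from consecutive events."""
    # Group events per subscriber so each person's day can be ordered in time.
    by_user = defaultdict(list)
    for user, timestamp, cell in records:
        by_user[user].append((datetime.fromisoformat(timestamp), cell))

    trips = defaultdict(list)
    for events in by_user.values():
        events.sort()
        # Compare each event with the next one by the same subscriber.
        for (t0, c0), (t1, c1) in zip(events, events[1:]):
            if c0 != c1:  # a change of cell is treated as a trip
                trips[(c0, c1)].append((t1 - t0).total_seconds() / 60)
    return dict(trips)


def median_travel_times(records):
    """Summarise each origin/destination pair by its median elapsed minutes."""
    return {pair: median(minutes) for pair, minutes in travel_times(records).items()}


if __name__ == "__main__":
    for (origin, dest), minutes in median_travel_times(RECORDS).items():
        print(f"{origin} -> {dest}: {minutes:.0f} min")
```

The median is used rather than the mean because long gaps between calls would otherwise inflate the estimate. A real analysis would also need anonymised identifiers and far larger samples.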

Handbook: How to Catalyze Humanitarian Innovation in Computing Research Institutes


Patrick Meier: “The handbook below provides practical collaboration guidelines for both humanitarian organizations & computing research institutes on how to catalyze humanitarian innovation through successful partnerships. These actionable guidelines are directly applicable now and draw on extensive interviews with leading humanitarian groups and CRIs including the International Committee of the Red Cross (ICRC), United Nations Office for the Coordination of Humanitarian Affairs (OCHA), United Nations Children’s Fund (UNICEF), United Nations High Commissioner for Refugees (UNHCR), UN Global Pulse, Carnegie Mellon University (CMU), International Business Machines (IBM), Microsoft Research, Data Science for Social Good Program at the University of Chicago and others.

This handbook, which is the first of its kind, also draws directly on years of experience and lessons learned from the Qatar Computing Research Institute’s (QCRI) active collaboration and unique partnerships with multiple international humanitarian organizations. The aim of this blog post is to actively solicit feedback on this first, complete working draft, which is available here as an open and editable Google Doc. …(More)”

Want to fix the world? Start by making clean energy a default setting


Chris Mooney in the Washington Post: “In recent years, psychologists and behavioral scientists have begun to decipher why we make the choices that we do when it comes to using energy. And the bottom line is that it’s hard to characterize those choices as fully “rational.”

Rather than acting like perfect homo economicuses, they’ve found, we’re highly swayed by the energy use of our neighbors and friends — peer pressure, basically. At the same time, we’re also heavily biased by the status quo — we delay in switching to new energy choices, even when they make a great deal of economic sense.

All of which has led to the popular idea of “nudging,” or the idea that you can subtly sway people to change their behavior by changing, say, the environment in which they make choices, or the kinds of information they receive. Not in a coercive way, but rather, through gentle tweaks and prompts. And now, a major study in Nature Climate Change demonstrates that one very popular form of energy-use nudging that might be called “default switching,” or the “default effect,” does indeed work — and indeed, could possibly work at a very large scale.

“This is the first demonstration of a large-scale nudging effect using defaults in the domain of energy choices,” says Sebastian Lotz of Stanford University and the University of Lausanne in Switzerland, who conducted the research with Felix Ebeling of the University of Cologne in Germany….(More)”

Confidence in U.S. Institutions Still Below Historical Norms


Jeffrey M. Jones at Gallup: “Americans’ confidence in most major U.S. institutions remains below the historical average for each one. Only the military (72%) and small business (67%) — the highest-rated institutions in this year’s poll — are currently rated higher than their historical norms, based on the percentage expressing “a great deal” or “quite a lot” of confidence in the institution.

Confidence in U.S. Institutions, 2015 vs. Historical Average for Each Institution

These results are based on a June 2-7 Gallup poll that included Gallup’s latest update on confidence in U.S. institutions. Gallup first measured confidence ratings in 1973 and has updated them each year since 1993.

Americans’ confidence in most major institutions has been down for many years as the nation has dealt with prolonged wars in Iraq and Afghanistan, a major recession and sluggish economic improvement, and partisan gridlock in Washington. In fact, 2004 was the last year most institutions were at or above their historical average levels of confidence. Perhaps not coincidentally, 2004 was also the last year Americans’ satisfaction with the way things are going in the United States averaged better than 40%. Currently, 28% of Americans are satisfied with the state of the nation.

From a broad perspective, Americans’ confidence in all institutions over the last two years has been the lowest since Gallup began systematic updates of a larger set of institutions in 1993. The average confidence rating of the 14 institutions asked about annually since 1993 — excluding small business, asked annually since 2007 — is 32% this year. This is one percentage point above the all-institution average of 31% last year. Americans were generally more confident in all institutions in the late 1990s and early 2000s as the country enjoyed a strong economy and a rally in support for U.S. institutions after the 9/11 terrorist attacks.

Trend: Average Confidence Rating Across All Institutions, by Year

Confidence in Political, Financial and Religious Institutions Especially Low

Today’s confidence ratings of Congress, organized religion, banks, the Supreme Court and the presidency show the greatest deficits compared with their historical averages, all running at least 10 points below that mark. Americans’ frustration with the government’s performance has eroded the trust they have in all U.S. political institutions….(More)”

Why open data should be central to Fifa reform


Gavin Starks in The Guardian: “Over the past two weeks, Fifa has faced mounting pressure to radically improve its transparency and governance in the wake of corruption allegations. David Cameron has called for reforms including expanding the use of open data.

Open data is information made available by governments, businesses and other groups for anyone to read, use and share. Data.gov.uk was launched as the home of UK open government data in January 2010 and now has almost 21,000 published datasets, including on government spending.

Allowing citizens to freely access data related to the institutions that govern them is essential to a well-functioning democratic society. It is the first step towards holding leaders to account for failures and wrongdoing.

Fifa has a responsibility for the shared interests of millions of fans around the world. Football’s popularity means that Fifa’s governance has wide-ranging implications for society, too. This is particularly true of decisions about hosting the World Cup, which is often tied to large-scale government investment in infrastructure and even extends to law-making. Brazil spent up to £10bn hosting the 2014 World Cup and had to legalise the sale of beer at matches.

Following Sepp Blatter’s resignation, Fifa will gather its executive committee in July to plan for a presidential election, expected to take place in mid-December. Open data should form the cornerstone of any prospective candidate’s manifesto. It can help Fifa make better spending decisions, ensure partners deliver value for money, and restore the trust of the international football community.

Fifa’s lengthy annual financial report gives summaries of financial expenditure, budgeted at £184m for operations and governance alone in 2016, but individual transactions are not published. Publishing spending data incentivises better spending decisions. If all Fifa’s outgoings – which totalled around £3.5bn between 2011 and 2014 – were made open, it would encourage much more efficiency….(More)”

Exploring Open Energy Data in Urban Areas


The World Bank: “…Energy efficiency – using less energy input to deliver the same level of service – has been described by many as the ‘first fuel’ of our societies. However, lack of adequate data to accurately predict and measure energy efficiency savings, particularly at the city level, has limited the realization of its promise over the past two decades.
Why Open Energy Data?
Open Data can be a powerful tool to reduce information asymmetry in markets, increase transparency and help achieve local economic development goals. Several sectors like transport, public sector management and agriculture have started to benefit from Open Data practices. Energy markets are often characterized by less-than-optimal conditions with high system inefficiencies, misaligned incentives and low levels of transparency. As such, the sector has a lot to potentially gain from embracing Open Data principles.
The United States is a leader in this field with its ‘Energy Data’ initiative. This initiative makes data easy to find, understand and apply, helping to fuel a clean energy economy. For example, the Energy Information Administration’s (EIA) open application programming interface (API) has more than 1.2 million time series of data and is frequently visited by users from the private sector, civil society and media. In addition, the Green Button initiative is empowering American citizens to have access to their own energy usage data, and OpenEI.org is an Open Energy Information platform to help people find energy information, share their knowledge and connect to other energy stakeholders.
Introducing the Open Energy Data Assessment
To address this data gap in emerging and developing countries, the World Bank is conducting a series of Open Energy Data Assessments in urban areas. The objective is to identify important energy-related data, raise awareness of the benefits of Open Data principles and improve the flow of data between traditional energy stakeholders and others interested in the sector.
The first cities we assessed were Accra, Ghana and Nairobi, Kenya. Both are among the fastest-growing cities in the world, with dynamic entrepreneurial and technology sectors, and both are capitals of countries with an ongoing National Open Data Initiative. The two cities have also been selected to be part of the Negawatt Challenge, a World Bank international competition supporting technology innovation to solve local energy challenges.
The ecosystem approach
The starting point for the exercise was to consider the urban energy sector as an ecosystem, comprised of data suppliers, data users, key datasets, a legal framework, funding mechanisms, and ICT infrastructure. The methodology that we used adapted the established World Bank Open Data Readiness Assessment (ODRA), which highlights valuable connections between data suppliers and data demand. The assessment showcases how to match pressing urban challenges with the opportunity to release and use data to address them, creating a longer-term commitment to the process. Mobilizing key stakeholders to provide quick, tangible results is also key to this approach….(More) …See also World Bank Open Government Data Toolkit.”

Waze and the Traffic Panopticon


In the New Yorker: “In April, during his second annual State of the City address, Los Angeles Mayor Eric Garcetti announced a data-sharing agreement with Waze, the Google-owned, Israel-based navigation service. Waze is different from most navigation apps, including Google Maps, in that it relies heavily on real-time, user-generated data. Some of this data is produced actively—a driver or passenger sees a stalled vehicle, then uses a voice command or taps a stalled-vehicle icon on the app to alert others—while other data, such as the user’s location and average speed, is gathered passively, via smartphones. The agreement will see the city provide Waze with some of the active data it collects, alerting drivers to road closures, construction, and parades, among other things. From Waze, the city will get real-time data on traffic and road conditions. Garcetti said that the partnership would mean “less congestion, better routing, and a more livable L.A.” Di-Ann Eisnor, Waze’s head of growth, acknowledged to me that these kinds of deals can cause discomfort to the people working inside city government. “It’s exciting, but people inside are also fearful because it seems like too much work, or it seems so unknown,” she said.

Indeed, the deal promises to help the city improve some of its traffic and infrastructure systems (L.A. still uses paper to manage pothole patching, for example), but it also acknowledges Waze’s role in the complex new reality of urban traffic planning. Traditionally, traffic management has been a largely top-down process. In Los Angeles, it is coördinated in a bunker downtown, several stories below the sidewalk, where engineers stare at blinking lights representing traffic and live camera feeds of street intersections. L.A.’s sensor-and-algorithm-driven Automated Traffic Surveillance and Control System is already one of the world’s most sophisticated traffic-mitigation tools, but it can only do so much to manage the city’s eternally unsophisticated gridlock. Los Angeles appears to see its partnership with Waze as an important step toward improving the bridge between its subterranean panopticon and the rest of the city still further, much like other metropolises that have struck deals with Waze under the company’s Connected Cities program.
Among the early adopters is Rio de Janeiro, whose urban command center tracks everything from accidents to hyperlocal weather conditions, pulling data from thirty departments and private companies, including Waze. “In Rio,” Eisnor said, traffic managers “were able to change the garbage routes, figure out where to install cameras, and deploy traffic personnel” because of the program. She also pointed out that Connected Cities has helped municipal workers in Washington, D.C., patch potholes within forty-eight hours of their being identified on Waze. “We’re helping reframe city planning through not just space but space and time,” she said….(More)”

The Tragedy of the Digital Commons


J. Nathan Matias in the Atlantic: “….Milland and other regular Turkers navigate this precariously free market with Turkopticon, a DIY technology for rating employers created in 2008. To use it, workers install a browser plugin that extends Amazon’s website with special rating features. Before accepting a new task, workers check how others have rated the employer. After finishing, they can also leave their own rating of how well they were treated. Collective rating on Turkopticon is an act of citizenship in the digital world. This digital citizenship acknowledges that online experiences are as much a part of our common life as our schools, sidewalks, and rivers—requiring as much stewardship, vigilance, and improvement as anything else we share.

“How do you fix a broken system that isn’t yours to repair?” That’s the question that motivated the researchers Lilly Irani and Six Silberman to create Turkopticon, and it’s one that comes up frequently in digital environments dominated by large platforms with hands-off policies. (On social networks like Twitter, for example, harassment is a problem for many users.) Irani and Silberman describe Turkopticon as a “mutual aid for accountability” technology, a system that coordinates peer support to hold others accountable when platforms choose not to step in.

Mutual aid accountability is a growing response to the complex social problems people face online. On Twitter, systems like The Block Bot and BlockTogether coordinate collective judgments about alleged online harassers. The systems then collectively block tweets from accounts that a group prefers not to hear from. Last month, the advocacy organization Hollaback raised over $20,000 on Kickstarter to create support networks for people experiencing harassment. In November, I worked with the advocacy organization Women, Action, and the Media, which took a role as “authorized reporter” with Twitter. For three weeks WAM! accepted reports, sorted evidence, and forwarded serious cases to Twitter. In response, the company warned, suspended, and deleted the accounts of many alleged harassers.
These mutual aid technologies operate in the shadow of larger systems with gaps in how people are supported—even when platforms do step in, says Stuart Geiger, a Berkeley Ph.D. student. In other words, sometimes a platform’s system-wide solutions to a problem can create their own problems. For several years, Geiger and his colleague Aaron Halfaker, now a researcher at Wikimedia, were concerned that Wikipedia’s semi-automated anti-vandalism systems might be making the site unfriendly. As a graduate student unable to change Wikipedia’s code, Halfaker created Snuggle, a mutual-aid mentorship technology that tracks the site’s spam responders. When Snuggle users think a newcomer’s edits were mistakenly flagged as spam, the software coordinates Wikipedians to help those users recover from the negative experience of getting revoked.

By organizing peer support at scale, the designers of Turkopticon and its cousins draw attention to common problems, hoping to influence longer-term change on a complex issue. In time, the idea goes, requesters on Mechanical Turk might change their treatment of workers, Amazon might change its policies and software, or regulators might set new rules for digital labor. This is an approach with a long history in an area that might seem unlikely: the conservation movement. (Silberman and Irani cite the movement as inspiration for Turkopticon.)

To better understand how this approach might influence digital citizenship, I followed the history of mutual-aid accountability in a precious common network that the city of Boston enjoys every day: the Charles River. Planned, re-routed, exploited and contested, it has inspired and supported human life since before written history….(More)”
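
The rating workflow that the excerpt attributes to Turkopticon (check how others rated an employer before accepting a task, then leave a rating afterwards) can be sketched in a few lines of Python. This is only an illustration of collective rating in general, not Turkopticon's actual implementation: the class name `RatingBoard`, the 1 to 5 scale, and the employer identifiers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class RatingBoard:
    """Hypothetical in-memory store of worker ratings, keyed by employer."""

    def __init__(self):
        self._ratings = defaultdict(list)

    def rate(self, employer_id, score):
        """Record one worker's rating after finishing a task (assumed 1-5 scale)."""
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5")
        self._ratings[employer_id].append(score)

    def summary(self, employer_id):
        """Return what a worker would check before accepting a task, or None."""
        scores = self._ratings.get(employer_id, [])
        if not scores:
            return None  # nobody has rated this employer yet
        return {"count": len(scores), "average": mean(scores)}


if __name__ == "__main__":
    board = RatingBoard()
    board.rate("requester-1", 5)
    board.rate("requester-1", 4)
    print(board.summary("requester-1"))
    print(board.summary("requester-2"))
```

Showing the number of ratings alongside the average lets a worker judge how much weight a score deserves, since one rating says much less than fifty.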