Data-enriched research, data-enhanced impact: the importance of UK data infrastructure.


Matthew Woollard at LSE Impact Blog: “…Data made available for reuse, such as those in the UK Data Service collection, have huge potential. They can unlock new discoveries in research, provide evidence for policy decisions and help promote core data skills in the next generation of researchers. By being part of a single infrastructure, data owners and data creators can work together with the UK Data Service – rather than duplicating efforts – to engage with the people who can drive the impact of their research further to provide real benefit to society. As a service we are also identifying new ways to understand and promote our impact, and our Impact Fellow and Director of Impact and Communications, Victoria Moody, is focusing on raising the visibility of the UK Data Service holdings and developing and promoting the use and impact of the data and resources in policy-relevant research, especially to new audiences such as policymakers, government sectors, charities, the private sector and the media…

We are improving how we demonstrate the impact of both the Service and the data which we hold, by focusing on generating more and more authentic user corroboration. Our emphasis is on drawing together evidence about the reach and significance of the impact of our data and resources, and of the Service as a whole through our infrastructure and expertise. Headline impact indicators through which we will better understand our impact cover a range of areas (outlined above) where the Service brings efficiency to data access and re-use, benefit to its users and a financial and social return on investment.

We are working to understand more about how Service data contribute to impact by tracking their use in a range of initiatives focused on developing impact from research, and by developing our insight into how our users make use of our data. Data in the collection have featured in a range of impact case studies in the Research Excellence Framework 2014. We are also developing a focus on understanding the specific beneficial effect of data – as they appear in policy, debate or the evidential process – rather than simply noting that data were used in an output (although that remains important). Our early thinking is that, ideally, cited data can be tracked through the specific beneficial outcome and on to an evidenced effect, corroborated by the end user.


Our impact case studies demonstrate how the data have supported research which has led to policy change in a range of areas, including the development of mathematical models for Practice-based Commissioning budgets for adult mental health in the UK and informing public policy on obesity, both using the Health Survey for England. Service data have also informed the development of impact around understanding public attitudes towards the police and other legal institutions, using the Crime Survey for England and Wales, and research to support the development of the national minimum wage, using the Labour Force Survey. The cutting-edge new Demos Integration Hub maps the changing face of Britain’s diversity, revealing a mixed picture in the integration and upward mobility of ethnic minority communities; it uses 2011 Census aggregate data (England and Wales) and Understanding Society….(More)”

Of Remixology: Ethics and Aesthetics after Remix


New book by David J. Gunkel: “Remix—or the practice of recombining preexisting content—has proliferated across media both digital and analog. Fans celebrate it as a revolutionary new creative practice; critics characterize it as a lazy and cheap (and often illegal) recycling of other people’s work. In Of Remixology, David Gunkel argues that to understand remix, we need to change the terms of the debate. The two sides of the remix controversy, Gunkel contends, share certain underlying values—originality, innovation, artistic integrity. And each side seeks to protect these values from the threat that is represented by the other. In reevaluating these shared philosophical assumptions, Gunkel not only provides a new way to understand remix, he also offers an innovative theory of moral and aesthetic value for the twenty-first century.

In a section called “Premix,” Gunkel examines the terminology of remix (including “collage,” “sample,” “bootleg,” and “mashup”) and its material preconditions, the technology of recording. In “Remix,” he takes on the distinction between original and copy; makes a case for repetition; and considers the question of authorship in a world of seemingly endless recompiled and repurposed content. Finally, in “Postmix,” Gunkel outlines a new theory of moral and aesthetic value that can accommodate remix and its cultural significance, remixing—or reconfiguring and recombining—traditional philosophical approaches in the process….(More)”

How Big Data is Helping to Tackle Climate Change


Bernard Marr at DataInformed: “Climate scientists have been gathering a great deal of data for a long time, but analytics technology has only recently caught up. Now that cloud, distributed storage, and massive amounts of processing power are affordable for almost everyone, those data sets are being put to use. On top of that, the growing number of Internet of Things devices we carry around is adding to the amount of data we collect. And the rise of social media means more and more people are reporting environmental data and uploading photos and videos of their environment, which also can be analyzed for clues.

Perhaps one of the most ambitious projects that employ big data to study the environment is Microsoft’s Madingley, which is being developed with the intention of creating a simulation of all life on Earth. The project already provides a working simulation of the global carbon cycle, and it is hoped that, eventually, everything from deforestation to animal migration, pollution, and overfishing will be modeled in a real-time “virtual biosphere.” Just a few years ago, the idea of a simulation of the entire planet’s ecosphere would have seemed like ridiculous, pie-in-the-sky thinking. But today it’s something into which one of the world’s biggest companies is pouring serious money. Microsoft is doing this because it believes that analytical technology has finally caught up with the ability to collect and store data.

Another data giant that is developing tools to facilitate analysis of climate and ecological data is EMC. Working with scientists at Acadia National Park in Maine, the company has developed platforms to pull in crowd-sourced data from citizen science portals such as eBird and iNaturalist. This allows park administrators to monitor the impact of climate change on wildlife populations as well as to plan and implement conservation strategies.

Last year, the United Nations, under its Global Pulse data analytics initiative, launched the Big Data Climate Challenge, a competition aimed at promoting innovative data-driven climate change projects. Among the first to receive recognition under the program is Global Forest Watch, which combines satellite imagery, crowd-sourced witness accounts, and public datasets to track deforestation around the world, which is believed to be a leading man-made cause of climate change. The project has been promoted as a way for ethical businesses to ensure that their supply chains are not complicit in deforestation.

Other initiatives are targeted at a more personal level, for example by analyzing transit routes that could be used for individual journeys, using Google Maps, and making recommendations based on carbon emissions for each route.

The idea of “smart cities” is central to the concept of the Internet of Things – the idea that everyday objects and tools are becoming increasingly connected, interactive, and intelligent, and capable of communicating with each other independently of humans. Many of the ideas put forward by smart-city pioneers are grounded in climate awareness, such as reducing carbon dioxide emissions and energy waste across urban areas. Smart metering allows utility companies to increase or restrict the flow of electricity, gas, or water to reduce waste and ensure adequate supply at peak periods. Public transport can be efficiently planned to avoid wasted journeys and provide a reliable service that will encourage citizens to leave their cars at home.

These examples raise an important point: It’s apparent that data – big or small – can tell us if, how, and why climate change is happening. But, of course, this is only really valuable to us if it also can tell us what we can do about it. Some projects, such as Weathersafe, which helps coffee growers adapt to changing weather patterns and soil conditions, are designed to help humans deal with climate change. Others are designed to tackle the problem at the root, by highlighting the factors that cause it in the first place and showing us how we can change our behavior to minimize damage….(More)”

Anonymous hackers could be Islamic State’s online nemesis


 at the Conversation: “One of the key issues the West has had to face in countering Islamic State (IS) is the jihadi group’s mastery of online propaganda, seen in hundreds of thousands of messages celebrating the atrocities against civilians and spreading the message of radicalisation. It seems clear that efforts to counter IS online are missing the mark.

An internal US State Department assessment noted in June 2015 how the violent narrative of IS had “trumped” the efforts of the world’s richest and most technologically advanced nations. Meanwhile in Europe, Interpol planned to track and take down social media accounts linked to IS, as if that would solve the problem – when in fact doing so meant potentially missing out on intelligence-gathering opportunities.

Into this vacuum has stepped Anonymous, a loose, fragmented network of hacktivists that has for years launched occasional cyberattacks against government, corporate and civil society organisations. The group announced its intention to take on IS and its propaganda online, using its networks to crowd-source the identities of IS-linked accounts. Under the banners of #OpIsis and #OpParis, Anonymous published lists of thousands of Twitter accounts claimed to belong to IS members or sympathisers, claiming more than 5,500 had been removed.

The group pursued a similar approach following the attacks on Charlie Hebdo magazine in January 2015, with @OpCharlieHebdo taking down more than 200 jihadist Twitter accounts, bringing down the website Ansar-Alhaqq.net and publishing a list of 25,000 accounts alongside a guide on how to locate pro-IS material online….

Members of Anonymous have been prosecuted for cyberattacks in many countries under cybercrime laws, as their activities are not seen as legitimate protest. It is worth noting the ethical debate around hacktivism: some see cyberattacks that take down accounts or websites as infringing on others’ freedom of expression, while others argue that hacktivism should instead create technologies to circumvent censorship, enable digital equality and open access to information….(More)”

Crowdsourced phone camera footage maps conflicts


Springwise: “The UN requires accurate proof when investigating possible war crimes, but with different sides of a conflict providing contradictory evidence, and the unsafe nature of the environment, gaining genuine insight can be problematic. A team based at Goldsmiths, University of London is using amateur footage to investigate.

Forensic Architecture makes use of the increasingly prevalent smartphone footage on social media networks. By crowdsourcing several viewpoints around a given location on an accurately rendered 3D map, the team are able to determine where explosive devices were used, and of what calibre. Key resources are smoke plumes from explosions, which have a unique shape at any given moment, allowing the team to identify the same smoke at the same instant from various viewpoints and so build up a dossier of evidence of a war crime.
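The core geometric step, fixing an event's position from several known viewpoints, can be sketched as simple ray intersection. This is a toy illustration only, not Forensic Architecture's actual pipeline: the camera positions, the bearings, and the flat 2D geometry are all assumptions made for the example.

```python
import math

def intersect_bearings(p1, theta1, p2, theta2):
    """Intersect two rays given by origin points and bearings (radians,
    measured from the +x axis). Returns the (x, y) intersection, i.e.
    the estimated location of the observed event."""
    # Unit direction vectors of each ray.
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Solve p1 + t*d1 == p2 + s*d2 for t via a 2x2 cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        raise ValueError("bearings are parallel; no unique fix")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Two hypothetical phone cameras film the same smoke plume: one at the
# origin sees it at 45 degrees, one at (10, 0) sees it at 135 degrees,
# so the plume must sit at (5, 5).
fix = intersect_bearings((0, 0), math.radians(45), (10, 0), math.radians(135))
print(fix)
```

With more than two viewpoints the same idea generalizes to a least-squares fit, which is what makes many independent smartphone clips more valuable than any single one.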

While Forensic Architecture’s method has been developed to verify war crime atrocities, the potential uses in other areas where satellite data are not available are numerous — forest fire sources could be located based on smoke plumes, and potential crowd crush scenarios might be spotted before they occur….(More)”

The promise and perils of predictive policing based on big data


H. V. Jagadish in the Conversation: “Police departments, like everyone else, would like to be more effective while spending less. Given the tremendous attention to big data in recent years, and the value it has provided in fields ranging from astronomy to medicine, it should be no surprise that police departments are using data analysis to inform deployment of scarce resources. Enter the era of what is called “predictive policing.”

Some form of predictive policing is likely now in force in a city near you. Memphis was an early adopter. Cities from Minneapolis to Miami have embraced predictive policing. Time magazine named predictive policing (with particular reference to the city of Santa Cruz) one of the 50 best inventions of 2011. New York City Police Commissioner William Bratton recently said that predictive policing is “the wave of the future.”

The term “predictive policing” suggests that the police can anticipate a crime and be there to stop it before it happens and/or apprehend the culprits right away. As the Los Angeles Times points out, it depends on “sophisticated computer analysis of information about previous crimes, to predict where and when crimes will occur.”

At a very basic level, it’s easy for anyone to read a crime map and identify neighborhoods with higher crime rates. It’s also easy to recognize that burglars tend to target businesses at night, when they are unoccupied, and to target homes during the day, when residents are away at work. The challenge is to take a combination of dozens of such factors to determine where crimes are more likely to happen and who is more likely to commit them. Predictive policing algorithms are getting increasingly good at such analysis. Indeed, such was the premise of the movie Minority Report, in which the police can arrest and convict murderers before they commit their crime.

Predicting a crime with certainty is something that science fiction can have a field day with. But as a data scientist, I can assure you that in reality we can come nowhere close to certainty, even with advanced technology. To begin with, predictions can be only as good as the input data, and quite often these input data have errors.

But even with perfect, error-free input data and unbiased processing, ultimately what the algorithms are determining are correlations. Even if we have perfect knowledge of your troubled childhood, your socializing with gang members, your lack of steady employment, your wacko posts on social media and your recent gun purchases, all that the best algorithm can do is to say it is likely, but not certain, that you will commit a violent crime. After all, to believe such predictions as guaranteed is to deny free will….

What data can do is give us probabilities, rather than certainty. Good data coupled with good analysis can give us very good estimates of probability. If you sum probabilities over many instances, you can usually get a robust estimate of the total.

For example, data analysis can provide a probability that a particular house will be broken into on a particular day based on historical records for similar houses in that neighborhood on similar days. An insurance company may add this up over all days in a year to decide how much to charge for insuring that house….(More)”
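That insurance arithmetic can be made concrete. The sketch below uses entirely hypothetical daily probabilities and claim costs; it only illustrates how summing per-day probabilities yields an expected annual total by linearity of expectation, exactly the "sum probabilities over many instances" point above.

```python
# Hypothetical daily break-in probabilities for one house over a year:
# slightly elevated on weekend days, lower on weekdays.
daily_p = [0.004 if day % 7 in (5, 6) else 0.001 for day in range(365)]

# By linearity of expectation, summing the daily probabilities gives the
# expected number of break-ins per year, even though no single day is
# predictable with certainty.
expected_burglaries = sum(daily_p)  # 104 weekend-type days + 261 others
print(round(expected_burglaries, 3))  # → 0.677

# An insurer could turn that into a fair annual premium by multiplying
# by a (hypothetical) average claim cost.
avg_claim = 8000
fair_premium = expected_burglaries * avg_claim
print(round(fair_premium))  # → 5416
```

No individual day's prediction is reliable, but the yearly aggregate is robust, which is why probabilistic forecasts are useful for resource allocation even when they cannot name a specific crime in advance.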

Does Open Data Need Journalism?


Paper by Jonathan Stoneman at the Reuters Institute for the Study of Journalism: “The Open Data movement really came into being when President Obama issued his first policy paper on his first day in office in January 2009. The US government opened up thousands of datasets to scrutiny by the public, by journalists, by policy-makers. Coders and developers were also invited to make the data useful to people and businesses in all manner of ways. Other governments across the globe followed suit, opening up data to their populations.

Opening data in this way has not resulted in genuine openness, save in a few isolated cases. In the USA and a few European countries, developers have created apps and websites which draw on Open Data, but these are not reaching a mass audience.

At the same time, journalists are not seen by government as the end users of these data. Data releases, even in the best cases, are uneven and slow, and do not meet the needs of journalists. Although thousands of journalists have been learning and adopting the new skills of data journalism, they have tended to work with data obtained through Freedom of Information (FOI) legislation.

Stories which have resulted from data journalists’ efforts have rarely been front-page news; in many cases data-driven stories have ended up as lesser stories on inside pages, or as infographics, which relatively few people look at.

In this context, therefore, Open Data remains outside the mainstream of journalism, and out of the consciousness of the electorate, begging the question, “what are Open Data for?”, or as one developer put it – “if Open Data is the answer, what was the question?” Openness is seen as a badge of honour – scores of national governments have signed pledges to make data open, often repeating the same kind of idealistic official language as the previous announcement of a conversion to openness. But these acts are “top down”, and soon run out of momentum, becoming simply openness for its own sake. Looking at specific examples, the United States is the nearest to a success story: there is a rich ecosystem – made up of government departments, interest groups and NGOs, the media, civil society – which allows data driven projects the space to grow and the airtime to make an impact. (It probably helped that the media in the US were facing an existential challenge urgent enough to force them to embrace new, inexpensive, ways of carrying out investigative reporting).

Elsewhere data are making less impact on journalism. In the UK the new openness is being exploited by only a small minority. Where data are published on the data.gov.uk website they are frequently out of date, incomplete, or of limited value, so the data that do drive stories tend to be those released under FOI legislation, and the resulting stories take the form of statistics and/or infographics.

In developing countries where Open Data portals have been launched with a fanfare – such as Kenya, and more recently Burkina Faso – there has been little uptake by coders, journalists, or citizens. The number of fresh datasets being published soon drops to a trickle; small, apparently randomly selected datasets quickly go out of date, and inertia sets in.

The British Conservative Party, pledging greater openness in its 2010 manifesto, foresaw armies of “Armchair Auditors” who would comb through the data and present the government with ideas for greater efficiency in the use of public funds. Almost needless to say, these armies have never materialised: in countries like Britain large amounts of data are being published but go (probably) unread and unscrutinised by anybody. At the same time, the journalists who want to make use of data are getting what they need through FOI, or even by gathering data themselves. Open Data is thus being bypassed, and could become an irrelevance. Yet the media could be vital agents in the quest for the release of meaningful, relevant, timely data.

Governments seem in no hurry to expand the “comfort zone” from which they release the data which shows their policies at their most effective, and keeping to themselves data which paints a gloomier picture. Journalists seem likely to remain in their comfort zone, where they make use of FOI and traditional sources of information. For their part, journalists should push for better data and use it more, working in collaboration with open data activists. They need to change the habits of a lifetime and discuss their sources: revealing the source and quality of data used in a story would in itself be as much a part of the advocacy as of the actual reporting.

If Open Data are to be part of a new system of democratic accountability, they need to be more than a gesture of openness. Nor should Open Data remain largely the preserve of companies using them for commercial purposes. Governments should improve the quality and relevance of published data, making them genuinely useful for journalists and citizens alike….(More)”

The Transformation of Human Rights Fact-Finding


Book edited by Philip Alston and Sarah Knuckey: “Fact-finding is at the heart of human rights advocacy, and is often at the center of international controversies about alleged government abuses. In recent years, human rights fact-finding has greatly proliferated and become more sophisticated and complex, while also being subjected to stronger scrutiny from governments. Nevertheless, despite the prominence of fact-finding, it remains strikingly under-studied and under-theorized. Too little has been done to bring forth the assumptions, methodologies, and techniques of this rapidly developing field, or to open human rights fact-finding to critical and constructive scrutiny.

The Transformation of Human Rights Fact-Finding offers a multidisciplinary approach to the study of fact-finding with rigorous and critical analysis of the field of practice, while providing a range of accounts of what actually happens. It deepens the study and practice of human rights investigations, and fosters fact-finding as a discretely studied topic, while mapping crucial transformations in the field. The contributions to this book are the result of a major international conference organized by New York University Law School’s Center for Human Rights and Global Justice. Engaging the expertise and experience of the editors and contributing authors, it offers a broad approach encompassing contemporary issues and analysis across the human rights spectrum in law, international relations, and critical theory. This book addresses the major areas of human rights fact-finding such as victim and witness issues; fact-finding for advocacy, enforcement, and litigation; the role of interdisciplinary expertise and methodologies; crowdsourcing, social media, and big data; and international guidelines for fact-finding….(More)”

Role of Citizens in India’s Smart Cities Challenge


Florence Engasser and Tom Saunders at the World Policy Blog: “India faces a wide range of urban challenges — from serious air pollution and poor local governance, to badly planned cities and a lack of decent housing. India’s Smart Cities Challenge, which has now selected 98 of the 100 cities that will receive funding, could go a long way in addressing these issues.

According to Prime Minister Narendra Modi, there are five key instruments that make a “smart” city: the use of clean technologies, the use of information and communications technology (ICT), private sector involvement, citizen participation and smart governance. There are good examples of new practices for each of these pillars.

For example, New Delhi recently launched a program to replace streetlights with energy efficient LEDs. The Digital India program is designed to upgrade the country’s IT infrastructure and includes plans to build “broadband highways” across the country. As for private sector participation, the Indian government is trying to encourage it by listing sectors and opportunities for public-private partnerships.

Citizen participation is one of Modi’s five key instruments, but this is an area in which smart city pilots around the world have tended to perform least well. While people are the implied beneficiaries of programs that aim to improve efficiency and reduce waste, they are rarely given a chance to participate in the design or delivery of smart city projects, which are usually implemented and managed by experts who have only a vague idea of the challenges that local communities face.

Citizen Participation

Engaging citizens is especially important in an Indian context because there have already been several striking examples of failed urban redevelopments that have blatantly lacked any type of community consultation or participation….

In practice, how can Indian cities engage residents in their smart city projects?

There are many tools available to policymakers — from traditional community engagement activities such as community meetings, to websites like Mygov.in that ask for feedback on policies. Now, there are a number of reasons to think smartphones could be an important tool to help improve collaboration between residents and city governments in Indian cities.

First, while only around 10 percent of Indians currently own a smartphone, this is predicted to rise to around half by 2020, and will be much higher in urban areas. A key driver of this is local manufacturing giants like Micromax, which have revolutionized low-cost technology in India, with smartphones costing as little as $30 (compared to around $800 for the newest iPhone).

Second, smartphone apps give city governments the potential to interact directly with citizens to make the most of what they know and feel about their communities. This can happen passively, for example, the Waze Connected Citizens program, which shares user location data with city governments to help improve transport planning. It can also be more active, for example, FixMyStreet, which allows people to report maintenance issues like potholes to their city government.

Third, smartphones are one of the main ways for people to access social media, and researchers are now developing a range of new and innovative solutions to address urban challenges using these platforms. This includes Petajakarta, which creates crowd-sourced maps of flooding in Jakarta by aggregating tweets that mention the word ‘flood.’
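The keyword-plus-geotag aggregation behind a project like Petajakarta can be sketched in a few lines. The tweet data, the grid size, and the simple substring matching rule below are all invented for illustration; the real system is considerably more careful about language, deduplication, and verification.

```python
from collections import Counter

def grid_counts(tweets, keyword, cell=0.01):
    """Bin geotagged posts containing `keyword` into lat/lon grid cells
    of `cell` degrees; returns a Counter mapping cell -> mention count."""
    counts = Counter()
    for text, lat, lon in tweets:
        if keyword.lower() in text.lower():
            # Snap coordinates to the lower-left corner of their cell.
            key = (round(lat // cell * cell, 4), round(lon // cell * cell, 4))
            counts[key] += 1
    return counts

# Hypothetical posts: (text, latitude, longitude), roughly Jakarta.
tweets = [
    ("Flood near the station, water knee deep", -6.211, 106.851),
    ("flood water rising fast on our street",   -6.212, 106.852),
    ("lovely sunny day at the park",            -6.300, 106.800),
]
hotspots = grid_counts(tweets, "flood")
print(hotspots.most_common(1))  # busiest cell, with 2 mentions
```

Two nearby reports land in the same cell while the unrelated post is ignored, turning a noisy stream into a crude heat map that responders can act on.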

Made in India

Considering some of the above trends, it is interesting to think about the role smartphones could play in the governance of Indian cities and in better engaging communities. India is far from being behind in the field, and there are already a few really good examples of innovative smartphone applications made in India.

Swachh Bharat Abhiyan (translated as Clean India Initiative) is a campaign launched by Modi in October 2014, covering over 4,000 towns all over the country, with the aim to clean India’s streets. The Clean India mobile application, launched at the end of 2014 to coincide with Modi’s initiative, was developed by Mahek Shah and allows users to take, geotag, and timestamp pictures to report streets that need cleaning or problems to be fixed by the local authorities.

Similar to FixMyStreet, users are able to tag their reports with keywords to categorize problems. Today, Clean India has been downloaded over 12,000 times and has 5,000 active users. Although still at a very early stage, Clean India has great potential to facilitate the complaint and reporting process by empowering people to become the eyes and ears of municipalities on the ground, who are often completely unaware of issues that matter to residents.
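The report model described above (a photo plus location, timestamp, and keyword tags) is straightforward to sketch. The field names, tags, and coordinates below are assumptions for the example, not Clean India's actual schema.

```python
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class CivicReport:
    """One citizen report: a photo reference plus location, time, tags."""
    photo_url: str
    lat: float
    lon: float
    timestamp: str                      # ISO 8601 string
    tags: list = field(default_factory=list)

def group_by_tag(reports):
    """Index reports by tag so a municipality can triage by category."""
    index = defaultdict(list)
    for r in reports:
        for tag in r.tags:
            index[tag].append(r)
    return index

reports = [
    CivicReport("img/101.jpg", 23.03, 72.58, "2015-01-05T09:30:00", ["garbage"]),
    CivicReport("img/102.jpg", 23.04, 72.57, "2015-01-05T10:10:00", ["pothole"]),
    CivicReport("img/103.jpg", 23.03, 72.59, "2015-01-06T08:00:00", ["garbage"]),
]
by_tag = group_by_tag(reports)
print({tag: len(rs) for tag, rs in by_tag.items()})  # → {'garbage': 2, 'pothole': 1}
```

Even this minimal structure shows why tagging matters: it lets a city route garbage complaints and road repairs to different departments without anyone reading every report.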

In Bangalore, an initiative by the MOD Institute, a local nongovernmental organization, enabled residents to come together, online and offline, to create a community vision for the redevelopment of Shanthinagar, a neighborhood of the city. The project, Next Bengaluru, used new technologies to engage local residents in urban planning and tap into their knowledge of the area to promote a vision matching their real needs.

The initiative was very successful. In just three months, between December 2014 and March 2015, over 1,200 neighbors and residents visited the on-site community space, and the team crowd-sourced more than 600 ideas for redevelopment and planning both on-site and through the Next Bangalore website.

The MOD Institute now intends to work with local urban planners to try to get these ideas adopted by the city government. The project has also developed a pilot app that will enable people to map abandoned urban spaces via smartphone and messaging service in the future.

Finally, Safecity India is a nonprofit organization providing a platform for anyone to share, anonymously or not, personal stories of sexual harassment and abuse in public spaces. Men and women can report different types of abuses — from ogling, whistles and comments, to stalking, groping and sexual assault. The aggregated data is then mapped, allowing citizens and governments to better understand crime trends at hyper-local levels.

Since its launch in 2012, SafeCity has received more than 4,000 reports of sexual crime and harassment in over 50 cities across India and Nepal. SafeCity helps generate greater awareness, breaks the cultural stigma associated with reporting sexual abuse and gives voice to grassroots movements and campaigns such as Sayfty, Protsahan, and Stop Street Harassment, forcing authorities to take action….(More)

Using Crowdsourcing to Track the Next Viral Disease Outbreak


The TakeAway: “Last year’s Ebola outbreak in West Africa killed more than 11,000 people. The pandemic may be diminished, but public health officials think that another major outbreak of infectious disease is fast-approaching, and they’re busy preparing for it.

Boston public radio station WGBH recently partnered with The GroundTruth Project and NOVA Next on a series called “Next Outbreak.” As part of the series, they reported on an innovative global online monitoring system called HealthMap, which uses the power of the internet and crowdsourcing to detect and track emerging infectious diseases, and also more common ailments like the flu.

Researchers at Boston Children’s Hospital are the ones behind HealthMap, and they use it to tap into tens of thousands of sources of online data, including social media, news reports, and blogs, to curate information about outbreaks. Dr. John Brownstein, chief innovation officer at Boston Children’s Hospital and co-founder of HealthMap, says that smarter data collection can help to quickly detect and track emerging infectious diseases, fatal or not.

“Traditional public health is really slowed down by the communication process: People get sick, they’re seen by healthcare providers, they get laboratory confirmed, information flows up the channels to state and local health [agencies], national governments, and then to places like the WHO,” says Dr. Brownstein. “Each one of those stages can take days, weeks, or even months, and that’s the problem if you’re thinking about a virus that can spread around the world in a matter of days.”
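Dr. Brownstein's point about stage-by-stage delays can be made concrete with back-of-the-envelope arithmetic. Every number below is hypothetical, but it shows why shaving weeks off detection matters so much when case counts grow exponentially.

```python
# Hypothetical delays (in days) at each stage of the traditional
# reporting chain Dr. Brownstein describes.
traditional = {
    "patient seen by a healthcare provider": 3,
    "laboratory confirmation": 7,
    "reporting up to state and local health agencies": 10,
    "national government notifies the WHO": 14,
}
traditional_delay = sum(traditional.values())  # 34 days end to end

# Online signals (news scraping, social media) can surface in about a day.
digital_delay = 1

# If cases double roughly every 3 days, a 33-day head start is decisive.
doubling_time = 3
extra_doublings = (traditional_delay - digital_delay) / doubling_time
print(traditional_delay, digital_delay)  # → 34 1
print(f"≈{2 ** extra_doublings:.0f}x more cases before the slow channel reports")
```

With these illustrative figures the epidemic is roughly two thousand times larger by the time the formal chain completes, which is the gap systems like HealthMap aim to close.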

The HealthMap team looks at a variety of communication channels to undo the existing hierarchy of health information.

“We make everyone a stakeholder when it comes to data about outbreaks, including consumers,” says Dr. Brownstein. “There are a suite of different tools that public health officials have at their disposal. What we’re trying to do is think about how to communicate and empower individuals to really understand what the risks are, what the true information is about a disease event, and what they can do to protect themselves and their families. It’s all about trying to demystify outbreaks.”

In addition to the map itself, the HealthMap team has a number of interactive tools that individuals can both use and contribute to. Dr. Brownstein hopes these resources will enable the public to care more about disease outbreaks that may be happening around them—it’s a way to put the “public” back in “public health,” he says.

“We have an app called Outbreaks Near Me that allows people to know about what disease outbreaks are happening in their neighborhood,” Dr. Brownstein says. “Flu Near You is an app that people use to self-report symptoms; Vaccine Finder is a tool that allows people to know what vaccines are available to them and their community.”

In addition to developing its own apps, the HealthMap team has partnered with existing tech firms like Uber to spread the word about public health.

“We worked closely with Uber last year and actually put nurses in Uber cars and delivered vaccines to people,” Dr. Brownstein says. “The closest vaccine location might still be only a block away for people, but people are still hesitant to get it done.”…(More)”