Liesbet van Zoonen in Government Information Quarterly: “In this paper a framework is constructed to hypothesize if and how smart city technologies and urban big data produce privacy concerns among the people in these cities (as inhabitants, workers, visitors, and otherwise). The framework is built on the basis of two recurring dimensions in research about people’s concerns about privacy: one dimension represents that people perceive particular data as more personal and sensitive than others, the other dimension represents that people’s privacy concerns differ according to the purpose for which data is collected, with the contrast between service and surveillance purposes most paramount. These two dimensions produce a 2 × 2 framework that hypothesizes which technologies and data-applications in smart cities are likely to raise people’s privacy concerns, distinguishing between raising hardly any concern (impersonal data, service purpose), to raising controversy (personal data, surveillance purpose). Specific examples from the city of Rotterdam are used to further explore and illustrate the academic and practical usefulness of the framework. It is argued that the general hypothesis of the framework offers clear directions for further empirical research and theory building about privacy concerns in smart cities, and that it provides a sensitizing instrument for local governments to identify the absence, presence, or emergence of privacy concerns among their citizens….(More)”
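The 2 × 2 framework described in the abstract can be pictured as a small lookup keyed on its two dimensions. This is only an illustrative sketch: the abstract names the expected reaction for two of the four cells, so the other two are left unfilled rather than guessed, and the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Data(Enum):
    IMPERSONAL = "impersonal"
    PERSONAL = "personal"


class Purpose(Enum):
    SERVICE = "service"
    SURVEILLANCE = "surveillance"


# Only the two cells named in the abstract are filled in.
EXPECTED_CONCERN = {
    (Data.IMPERSONAL, Purpose.SERVICE): "hardly any concern",
    (Data.PERSONAL, Purpose.SURVEILLANCE): "controversy",
}


def expected_concern(data: Data, purpose: Purpose) -> Optional[str]:
    """Return the hypothesized reaction, or None where the abstract is silent."""
    return EXPECTED_CONCERN.get((data, purpose))
```

A city could use a table of this shape as a first-pass checklist, tagging each planned data application with its data type and purpose before deciding where to look for emerging concerns.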
The Surprising History of the Infographic
Clive Thompson at the Smithsonian magazine: “As the 2016 election approaches, we’re hearing a lot about “red states” and “blue states.” That idiom has become so ingrained that we’ve almost forgotten where it originally came from: a data visualization.
In the 2000 presidential election, the race between Al Gore and George W. Bush was so razor close that broadcasters pored over electoral college maps—which they typically colored red and blue. What’s more, they talked about those shadings. NBC’s Tim Russert wondered aloud how George Bush would “get those remaining 61 electoral red states, if you will,” and that language became lodged in the popular imagination. America became divided into two colors—data spun into pure metaphor. Now Americans even talk routinely about “purple” states, a mental visualization of political information.
We live in an age of data visualization. Go to any news website and you’ll see graphics charting support for the presidential candidates; open your iPhone and the Health app will generate personalized graphs showing how active you’ve been this week, month or year. Sites publish charts showing how the climate is changing, how schools are segregating, how much housework mothers do versus fathers. And newspapers are increasingly finding that readers love “dataviz”: In 2013, the New York Times’ most-read story for the entire year was a visualization of regional accents across the United States. It makes sense. We live in an age of Big Data. If we’re going to understand our complex world, one powerful way is to graph it.
But this isn’t the first time we’ve discovered the pleasures of making information into pictures. Over a hundred years ago, scientists and thinkers found themselves drowning in their own flood of data—and to help understand it, they invented the very idea of infographics.
**********
The idea of visualizing data is old: After all, that’s what a map is—a representation of geographic information—and we’ve had maps for about 8,000 years. But it was rare to graph anything other than geography. Only a few examples exist: Around the 11th century, a now-anonymous scribe created a chart of how the planets moved through the sky. By the 18th century, scientists were warming to the idea of arranging knowledge visually. The British polymath Joseph Priestley produced a “Chart of Biography,” plotting the lives of about 2,000 historical figures on a timeline. A picture, he argued, conveyed the information “with more exactness, and in much less time, than it [would take] by reading.”
Still, data visualization was rare because data was rare. That began to change rapidly in the early 19th century, because countries began to collect—and publish—reams of information about their weather, economic activity and population. “For the first time, you could deal with important social issues with hard facts, if you could find a way to analyze it,” says Michael Friendly, a professor of psychology at York University who studies the history of data visualization. “The age of data really began.”
An early innovator was the Scottish inventor and economist William Playfair. As a teenager he apprenticed to James Watt, the Scottish inventor who perfected the steam engine. Playfair was tasked with drawing up patents, which required him to develop excellent drafting and picture-drawing skills. After he left Watt’s lab, Playfair became interested in economics and convinced that he could use his facility for illustration to make data come alive.
“An average political economist would have certainly been able to produce a table for publication, but not necessarily a graph,” notes Ian Spence, a psychologist at the University of Toronto who’s writing a biography of Playfair. Playfair, who understood both data and art, was perfectly positioned to create this new discipline.
In one famous chart, he plotted the price of wheat in the United Kingdom against the cost of labor. People often complained about the high cost of wheat and thought wages were driving the price up. Playfair’s chart showed this wasn’t true: Wages were rising much more slowly than the cost of the product.
“He wanted to discover,” Spence notes. “He wanted to find regularities or points of change.” Playfair’s illustrations often look amazingly modern: In one, he drew pie charts (his invention, too) and lines that compared the size of various countries’ populations against their tax revenues. Once again, the chart produced a new, crisp analysis: The British paid far higher taxes than citizens of other nations.
Neurology was not yet a robust science, but Playfair seemed to intuit some of its principles. He suspected the brain processed images more readily than words: A picture really was worth a thousand words. “He said things that sound almost like a 20th-century vision researcher,” Spence adds. Data, Playfair wrote, should “speak to the eyes”—because they were “the best judge of proportion, being able to estimate it with more quickness and accuracy than any other of our organs.” A really good data visualization, he argued, “produces form and shape to a number of separate ideas, which are otherwise abstract and unconnected.”
Soon, intellectuals across Europe were using data visualization to grapple with the travails of urbanization, such as crime and disease….(More)”
DARPA wants to design an army of ultimate automated data scientists
Michael Cooney in NetworkWorld: “Because of a plethora of data from sensor networks, Internet of Things devices and big data resources combined with a dearth of data scientists to effectively mold that data, we are leaving many important applications – from intelligence to science and workforce management – on the table.
It is a situation the researchers at DARPA want to remedy with a new program called Data-Driven Discovery of Models (D3M). The goal of D3M is to develop algorithms and software to help overcome the data-science expertise gap by facilitating non-experts to construct complex empirical models through automation of large parts of the model-creation process. If successful, researchers using D3M tools will effectively have access to an army of “virtual data scientists,” DARPA stated.
This army of virtual data scientists is needed because some experts project deficits of 140,000 to 190,000 data scientists worldwide in 2016 alone, and increasing shortfalls in coming years. Also, because the process to build empirical models is so manual, their relative sophistication and value is often limited, DARPA stated.
“We have an urgent need to develop machine-based modeling for users with no data-science background. We believe it’s possible to automate certain aspects of data science, and specifically to have machines learn from prior example how to construct new models,” said Wade Shen, program manager in DARPA’s Information Innovation Office in a statement….(More)”
Big Data Challenges: Society, Security, Innovation and Ethics
Book edited by Bunnik, A., Cawley, A., Mulqueen, M., Zwitter, A: “This book brings together an impressive range of academic and intelligence professional perspectives to interrogate the social, ethical and security upheavals in a world increasingly driven by data. Written in a clear and accessible style, it offers fresh insights to the deep reaching implications of Big Data for communication, privacy and organisational decision-making. It seeks to demystify developments around Big Data before evaluating their current and likely future implications for areas as diverse as corporate innovation, law enforcement, data science, journalism, and food security. The contributors call for a rethinking of the legal, ethical and philosophical frameworks that inform the responsibilities and behaviours of state, corporate, institutional and individual actors in a more networked, data-centric society. In doing so, the book addresses the real world risks, opportunities and potentialities of Big Data….(More)”
The Billions We’re Wasting in Our Jails
Stephen Goldsmith and Jane Wiseman in Governing: “By using data analytics to make decisions about pretrial detention, local governments could find substantial savings while making their communities safer….
Few areas of local government spending present better opportunities for dramatic savings than those that surround pretrial detention. Cities and counties are wasting more than $3 billion a year, and often inducing crime and job loss, by holding the wrong people while they await trial. The problem: Only 10 percent of jurisdictions use risk data analytics when deciding which defendants should be detained.
As a result, dangerous people are out in our communities, while many who could be safely in the community are behind bars. Vast numbers of people accused of petty offenses spend their pretrial detention time jailed alongside hardened convicts, learning from them how to be better criminals….
In this era of big data, analytics not only can predict and prevent crime but also can discern who should be diverted from jail to treatment for underlying mental health or substance abuse issues. Avoided costs aggregating in the billions could be better spent on detaining high-risk individuals, more mental health and substance abuse treatment, more police officers and other public safety services.
Jurisdictions that do use data to make pretrial decisions have achieved not only lower costs but also greater fairness and lower crime rates. Washington, D.C., releases 85 percent of defendants awaiting trial. Compared to the national average, those released in D.C. are two and a half times more likely to remain arrest-free and one and a half times as likely to show up for court.
Louisville, Ky., implemented risk-based decision-making using a tool developed by the Laura and John Arnold Foundation and now releases 70 percent of defendants before trial. Those released have turned out to be twice as likely to return to court and to stay arrest-free as those in other jurisdictions. Mesa County, Colo., and Allegheny County, Pa., both have achieved significant savings from reduced jail populations due to data-driven release of low-risk defendants.
Data-driven approaches are beginning to produce benefits not only in the area of pretrial detention but throughout the criminal justice process. Dashboards now in use in a handful of jurisdictions allow not only administrators but also the public to see court waiting times by offender type and to identify and address processing bottlenecks….(More)”
Civic Data Initiatives
Burak Arikan at Medium: “Big data is the term used to define the perpetual and massive data gathered by corporations and governments on consumers and citizens. When the subject of data is not necessarily individuals but governments and companies themselves, we can call it civic data, and when systematically generated in large amounts, civic big data. Increasingly, a new generation of initiatives are generating and organizing structured data on particular societal issues from human rights violations, to auditing government budgets, from labor crimes to climate justice.
These civic data initiatives diverge from the traditional civil society organizations in their outcomes, in that they don’t just publish their research as reports, but also open it to the public as a database. Civic data initiatives are quite different in their data work than international non-governmental organizations such as UN, OECD, World Bank and other similar bodies. Such organizations track social, economical, political conditions of countries and concentrate upon producing general statistical data, whereas civic data initiatives aim to produce actionable data on issues that impact individuals directly. The change in the GDP value of a country is useless for people struggling for free transportation in their city. Incarceration rate of a country does not help the struggle of the imprisoned journalists. Corruption indicators may serve as a parameter in a country’s credit score, but do not help to resolve monopolization created with public procurement. Carbon emission statistics do not prevent the energy deals between corrupt governments that destroy the nature in their region.
Needless to say, civic data initiatives also differ from governmental institutions, which are reluctant to share any more than they are legally obligated to. Many governments in the world simply dump scanned hardcopies of documents on official websites instead of releasing machine-readable data, which prevents systematic auditing of government activities. Civic data initiatives, on the other hand, make it a priority to structure and release their data in formats that are both accessible and queryable.
Civic data initiatives also deviate from general purpose information commons such as Wikipedia, because they consistently engage with problems, closely watch a particular societal issue, make frequent updates, and even record from the field to generate and organize highly granular data about the matter….
Several civic data initiatives generate data on a variety of issues at different geographies, scopes, and scales. The non-exhaustive list below has information on founders, data sources, and financial support. It is sorted according to each initiative’s founding year. Please send your suggestions to contact at graphcommons.com. See more detailed information and updates on the spreadsheet of civic data initiatives.
Open Secrets tracks data about the money flow in the US government, so it becomes more accessible for journalists, researchers, and advocates. Founded as a non-profit in 1983 by Center for Responsive Politics, it gets support from a variety of institutions.
PolitiFact is a fact-checking website that rates the accuracy of claims by elected officials and others who speak up in American politics. Uses on-the-record interviews as its data source. Founded in 2007 as a non-profit organization by Tampa Bay Times. Supported by Democracy Fund, Bill & Melinda Gates Foundation, John S. and James L. Knight Foundation, Ford Foundation, Knight Foundation, Craigslist Charitable Fund, and the Collins Center for Public Policy….
La Fabrique de La loi (The Law Factory) maps issues of local-regional socio-economic development, public investments, and ecology in France. Started in 2014, the project builds a database by tracking bills from government sources, provides a search engine as well as an API. The partners of the project are CEE Sciences Po, médialab Sciences Po, RegardsCitoyens, and Density Design.
Mapping Media Freedom identifies threats, violations and limitations faced by members of the press throughout European Union member states, candidates for entry and neighbouring countries. Initiated by Index on Censorship and European Commission in 2004, the project…(More)”
The Racist Algorithm?
Anupam Chander in the Michigan Law Review (2017 Forthcoming) : “Are we on the verge of an apartheid by algorithm? Will the age of big data lead to decisions that unfairly favor one race over others, or men over women? At the dawn of the Information Age, legal scholars are sounding warnings about the ubiquity of automated algorithms that increasingly govern our lives. In his new book, The Black Box Society: The Hidden Algorithms Behind Money and Information, Frank Pasquale forcefully argues that human beings are increasingly relying on computerized algorithms that make decisions about what information we receive, how much we can borrow, where we go for dinner, or even whom we date. Pasquale’s central claim is that these algorithms will mask invidious discrimination, undermining democracy and worsening inequality. In this review, I rebut this prominent claim. I argue that any fair assessment of algorithms must be made against their alternative. Algorithms are certainly obscure and mysterious, but often no more so than the committees or individuals they replace. The ultimate black box is the human mind. Relying on contemporary theories of unconscious discrimination, I show that the consciously racist or sexist algorithm is less likely than the consciously or unconsciously racist or sexist human decision-maker it replaces. The principal problem of algorithmic discrimination lies elsewhere, in a process I label viral discrimination: algorithms trained or operated on a world pervaded by discriminatory effects are likely to reproduce that discrimination.
I argue that the solution to this problem lies in a kind of algorithmic affirmative action. This would require training algorithms on data that includes diverse communities and continually assessing the results for disparate impacts. Instead of insisting on race or gender neutrality and blindness, this would require decision-makers to approach algorithmic design and assessment in a race and gender conscious manner….(More)”
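The step of “continually assessing the results for disparate impacts” can be pictured with a small calculation. The sketch below is not taken from Chander’s article: it computes each group’s approval rate and compares it with the highest group rate. The group labels, the shape of the input, and the 0.8 threshold (borrowed from the “four-fifths” rule of thumb in US employment-discrimination guidance) are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def selection_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> {group: approval rate}."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def impact_ratios(decisions):
    """Each group's rate divided by the highest group rate (1.0 means parity)."""
    rates = selection_rates(decisions)
    best = max(rates.values())
    if best == 0:
        # Nobody was approved in any group, so there is no disparity to measure.
        return {g: 1.0 for g in rates}
    return {g: rate / best for g, rate in rates.items()}


def flagged_groups(decisions, threshold=0.8):
    """Groups whose ratio falls below the (assumed) four-fifths threshold."""
    ratios = impact_ratios(decisions)
    return sorted(g for g, ratio in ratios.items() if ratio < threshold)
```

Running a check like this on every batch of decisions, rather than once at design time, is one way to read the “continual” part of the proposal; a flagged group is a prompt for review, not proof of discrimination.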
Are we too obsessed with data?
Lauren Woodman of Nethope: “Data: Everyone’s talking about it, everyone wants more of it….
Still, I’d posit that we’re too obsessed with data. Not just us in the humanitarian space, of course, but everyone. How many likes did that Facebook post get? How many airline miles did I fly last year? How many hours of sleep did I get last week?…
The problem is that data by itself isn’t that helpful: information is.
We need to develop a new obsession, around making sure that data is actionable, that it is relevant in the context in which we work, and on making sure that we’re using the data as effectively as we are collecting it.
In my talk at ICT4D, I referenced the example of 7-Eleven in Japan. In the 1970s, 7-Eleven in Japan became independent from its parent, Southland Corporation. The CEO had to build a viable business in a tough economy. Every month, each store manager would receive reams of data, but it wasn’t effective until the CEO stripped out the noise and provided just four critical data points that had the greatest relevance to drive the local purchasing that each store was empowered to do on their own.
Those points – what sold the day before, what sold the same day a year ago, what sold the last time the weather was the same, and what other stores sold the day before – were transformative. Within a year, 7-Eleven had turned a corner, and for 30 years, remained the most profitable retailer in Japan. It wasn’t about the Big Data; it was figuring out what data was relevant, actionable and empowered local managers to make nimble decisions.
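As a rough illustration of those four data points, the sketch below pulls them out of a small in-memory sales log. The record layout and the `four_signals` helper are hypothetical, invented for this example; nothing here reflects 7-Eleven’s actual systems.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Sale(NamedTuple):
    store: str
    day: date
    weather: str
    units: int


def four_signals(log, store, today, weather_today):
    """Return the four data points for one store (hypothetical helper)."""
    yesterday = today - timedelta(days=1)
    # Simplification: "a year ago" is taken as 365 days back, ignoring leap years.
    year_ago = today - timedelta(days=365)

    def units_on(which_store, day):
        return sum(r.units for r in log if r.store == which_store and r.day == day)

    # Most recent earlier day at this store with the same weather, if any.
    same_weather_days = [
        r.day for r in log
        if r.store == store and r.weather == weather_today and r.day < today
    ]
    last_same = max(same_weather_days) if same_weather_days else None

    other_stores = sorted({r.store for r in log if r.store != store})
    return {
        "sold_yesterday": units_on(store, yesterday),
        "sold_same_day_last_year": units_on(store, year_ago),
        "sold_last_same_weather": (
            units_on(store, last_same) if last_same is not None else None
        ),
        "other_stores_yesterday": {s: units_on(s, yesterday) for s in other_stores},
    }
```

The point of the sketch is how little it returns: four values per store, each tied to a purchasing decision the local manager can actually make.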
For our sector to get there, we need to do the front-end work that transforms our data into information that we can use. That, after all, is where the magic happens.
A few examples provide more clarity as to why this is so critical.
We know that adaptive decision-making requires access to real-time data. By knowing what is happening in real-time, or near-real-time, we can adjust our approaches and interventions to be most impactful. But to do so, our data has to be accessible to those that are empowered to make decisions. To achieve that, we have to make investments in training, infrastructure, and capacity-building at the organizational level. But in the nonprofit sector, such investments are rarely supported by donors and beyond the limited unrestricted funding available to most organizations. As a result, the sector has, so far, been able to take only limited steps towards effective data usage, hampering our ability to transform the massive amounts of data we have into useful information.
Another big question about data, and particularly in the humanitarian space, is whether it should be open, closed or somewhere in between. Privacy is certainly paramount, and for some types of data, the need for close protection is very clear. For many other data, however, the rules are far less clear. Every country has its own rules about how data can and cannot be used or shared, and more work is needed to provide clarity and predictability so that appropriate data-sharing can evolve.
And perhaps more importantly, we need to think about not just the data, but the use cases. Most of us would agree, for example, that sharing information during a crisis situation can be hugely beneficial to the people and the communities we serve – but in a world where rules are unclear, that ambiguity limits what we can do with the data we have. Here again, the context in which data will be used is critically important.
Finally, all of us in the sector have to realize that the journey to transforming data into information is one we’re on together. We have to be willing to give and take. Having data is great; sharing information is better. Sometimes, we have to co-create that basis to ensure we all benefit….(More)”
Leveraging ‘big data’ analytics in the public sector
Pandula Gamage in Public Money & Management: “This article examines the opportunities presented by effectively harnessing big data in the public sector context. The article is exploratory and reviews both academic- and practitioner–oriented literature related to big data developments. The findings suggest that big data will have an impact on the future role of public sector organizations in functional areas. However, the author also reveals that there are challenges to be addressed by governments in adopting big data applications. To realize the benefits of big data, policy-makers need to: invest in research; create incentives for private and public sector entities to share data; and set up programmes to develop appropriate skills….(More)”
Is artificial intelligence key to dengue prevention?
BreakDengue: “Dengue fever outbreaks are increasing in both frequency and magnitude. Not only that, the number of countries that could potentially be affected by the disease is growing all the time.
This growth has led to renewed efforts to address the disease, and a pioneering Malaysian researcher was recently recognized for his efforts to harness the power of big data and artificial intelligence to accurately predict dengue outbreaks.
Dr. Dhesi Baha Raja received the Pistoia Alliance Life Science Award at King’s College London in April of this year, for developing a disease prediction platform that employs technology and data to give people prior warning of when disease outbreaks occur. The medical doctor and epidemiologist has spent years working to develop AIME (Artificial Intelligence in Medical Epidemiology)…
it relies on a complex algorithm, which analyses a wide range of data collected by local government and also satellite image recognition systems. Over 20 variables such as weather, wind speed, wind direction, thunderstorm, solar radiation and rainfall schedule are included and analyzed. Population models and geographical terrain are also included. The ultimate result of this intersection between epidemiology, public health and technology is a map, which clearly illustrates the probability and location of the next dengue outbreak.
The ground-breaking platform can predict dengue fever outbreaks up to two or three months in advance, with an accuracy approaching 88.7 per cent and within a 400m radius. Dr. Dhesi has just returned from Rio de Janeiro, where the platform was employed in a bid to fight dengue in advance of this summer’s Olympics. In Brazil, its perceived accuracy was around 84 per cent, whereas in Malaysia it was over 88 per cent – giving it an average accuracy of 86.37 per cent.
The web-based application has been tested in two states within Malaysia, Kuala Lumpur and Selangor, and the first ever mobile app is due to be deployed across Malaysia soon. Once its capability is adequately tested there, it will be rolled out globally. Dr. Dhesi’s team are working closely with mobile digital service provider Webe on this.
Making the app free to download will ensure the service becomes accessible to all, Dr. Dhesi explains.
“With the web-based application, this could only be used by public health officials and agencies. We recognized the need for us to democratize this health service to the community, and the only way to do this is to provide the community with the mobile app.”
This will also enable the gathering of even greater knowledge on the possibility of dengue outbreaks in high-risk areas, as well as monitoring the changing risks as people move to different areas, he adds….(More)”