The Economist: “Rich countries are deluged with data; developing ones are suffering from drought…
AFRICA is the continent of missing data. Fewer than half of births are recorded; some countries have not taken a census in several decades. On maps only big cities and main streets are identified; the rest looks as empty as the Sahara. Lack of data afflicts other developing regions, too. The self-built slums that ring many Latin American cities are poorly mapped, and even estimates of their population are vague. Afghanistan is still using census figures from 1979—and that count was cut short after census-takers were killed by mujahideen.
As rich countries collect and analyse data from as many objects and activities as possible—including thermostats, fitness trackers and location-based services such as Foursquare—a data divide has opened up. The lack of reliable data in poor countries thwarts both development and disaster-relief. When Médecins Sans Frontières (MSF), a charity, moved into Liberia to combat Ebola earlier this year, maps of the capital, Monrovia, fell far short of what was needed to provide aid or track the disease’s spread. Major roads were marked, but not minor ones or individual buildings.
Poor data afflict even the highest-profile international development effort: the Millennium Development Goals (MDGs). The targets, which include ending extreme poverty, cutting infant mortality and getting all children into primary school, were set by UN members in 2000, to be achieved by 2015. But, according to a report by an independent UN advisory group published on November 6th, as the deadline approaches, the figures used to track progress are shaky. The availability of data on 55 core indicators for 157 countries has never exceeded 70%, it found (see chart)….
Some of the data gaps are now starting to be filled from non-government sources. A volunteer effort called Humanitarian OpenStreetMap Team (HOT) improves maps with information from locals and hosts “mapathons” to identify objects shown in satellite images. Spurred by pleas from those fighting Ebola, the group has intensified its efforts in Monrovia since August; most of the city’s roads and many buildings have now been filled in (see maps). Identifying individual buildings is essential, since in dense slums without formal roads they are the landmarks by which outbreaks can be tracked and assistance targeted.
On November 7th a group of charities including MSF, Red Cross and HOT unveiled MissingMaps.org, a joint initiative to produce free, detailed maps of cities across the developing world—before humanitarian crises erupt, not during them. The co-ordinated effort is needed, says Ivan Gayton of MSF: aid workers will not use a map with too little detail, and are unlikely, without a reason, to put work into improving a map they do not use. The hope is that the backing of large charities means the locals they work with will help.
In Kenya and Namibia mobile-phone operators have made call-data records available to researchers, who have used them to combat malaria. By comparing users’ movements with data on outbreaks, epidemiologists are better able to predict where the disease might spread. mTrac, a Ugandan programme that replaces paper reports from health workers with texts sent from their mobile phones, has made data on medical cases and supplies more complete and timely. The share of facilities that have run out of malaria treatments has fallen from 80% to 15% since it was introduced.
Private-sector data are also being used to spot trends before official sources become aware of them. Premise, a startup in Silicon Valley that compiles economics data in emerging markets, has found that as the number of cases of Ebola rose in Liberia, the price of staple foods soared: a health crisis risked becoming a hunger crisis. In recent weeks, as the number of new cases fell, prices did, too. The authorities already knew that travel restrictions and closed borders would push up food prices; they now have a way to measure and track price shifts as they happen….”
A New Taxonomy of Smart City Projects
New paper by Guido Perboli et al: “City logistics proposes an integrated vision of freight transportation systems within urban areas and aims at optimizing them as a whole in terms of efficiency, security, safety, viability and environmental sustainability. Recently, this perspective has been extended by the Smart City concept to include other aspects of city management: building, energy, environment, government, living, mobility, education, health and so on. To the best of our knowledge, a classification of Smart City projects has not yet been created. This paper introduces such a classification, highlighting success factors and analyzing new trends in Smart Cities.”
Giving Americans Easier Access to Their Own Data
the White House Blog: “…One of the newest My Data efforts is the IRS tool, Get Transcript. Launched in 2014, Get Transcript allows taxpayers to securely view, print, and download a PDF record of the last three years of their IRS tax account. Get Transcript has produced over 17 million tax transcripts, reducing phone, mail, or in-person requests by approximately 40% from last year. Secure access to your own tax data makes it easier to demonstrate your income to prospective lenders and employers, or to help with tax preparation. What was a paper-based process that took multiple days is now instantaneous and easy for the American taxpayer.
The IRS is an agency that serves virtually every American, and runs one of the nation’s largest customer service operations. To give an idea of the size and scope of responsibilities, the Internal Revenue Service:
- receives over 80 million phone calls per year, mostly from people eager to hear the status of their refund, understand a notice, make a payment, or update their account;
- sends out nearly 200 million paper notices annually; and
- receives over 50 million unique visitors to its website each month during filing season.
Meeting this demand from citizens is a challenge with limited staff and resources. Nonetheless, the IRS is committed to improving service to citizens across all of its channels – whether it’s by phone, walk-ins, or especially its digital services.
Building on the initial success of Get Transcript, there are more exciting improvements to IRS services in the pipeline. For instance, millions of taxpayers contact the IRS every year to ask about their tax status, whether their filing was received, if their refund was processed, or if their payment posted. In the future, taxpayers will be able to answer these types of questions independently by signing in to a mobile-friendly, personalized online account to conduct transactions and see all of their tax information in one place. Users will be able to view account history and balance, make payments or see payment status, or even authorize their tax preparer to view or make changes to their tax return. This will also include the ability to download personal tax information in an easy-to-use, machine-readable format so that taxpayers can share it with trusted recipients if desired….”
The Creepy New Wave of the Internet
Review by Sue Halpern in the New York Review of Books of:
The Zero Marginal Cost Society: The Internet of Things, the Collaborative Commons, and the Eclipse of Capitalism
Enchanted Objects: Design, Human Desire, and the Internet of Things
Age of Context: Mobile, Sensors, Data and the Future of Privacy
More Awesome Than Money: Four Boys and Their Heroic Quest to Save Your Privacy from Facebook
…So here comes the Internet’s Third Wave. In its wake jobs will disappear, work will morph, and a lot of money will be made by the companies, consultants, and investment banks that saw it coming. Privacy will disappear, too, and our intimate spaces will become advertising platforms—last December Google sent a letter to the SEC explaining how it might run ads on home appliances—and we may be too busy trying to get our toaster to communicate with our bathroom scale to notice. Technology, which allows us to augment and extend our native capabilities, tends to evolve haphazardly, and the future that is imagined for it—good or bad—is almost always historical, which is to say, naive.”
Could digital badges clarify the roles of co-authors?
AAAS Science Magazine: “Ever look at a research paper and wonder how the half-dozen or more authors contributed to the work? After all, it’s usually only the first or last author who gets all the media attention or the scientific credit when people are considered for jobs, grants, awards, and more. Some journals try to address this issue with the “authors’ contributions” sections within a paper, but a collection of science, publishing, and software groups is now developing a more modern solution—digital “badges,” assigned on publication of a paper online, that detail what each author did for the work and that the authors can link to their profiles elsewhere on the Web.
Those organizations include publishers BioMed Central and the Public Library of Science; the Wellcome Trust research charity; software development groups Mozilla Science Lab (a group of researchers, developers, librarians, and publishers) and Digital Science (a software and technology firm); and ORCID, an effort to assign researchers digital identifiers. The collaboration presented its progress on the project at the Mozilla Festival in London that ended last week. (Mozilla is the open software community behind the Firefox browser and other programs.)
The infrastructure of the badges is still being established, with early prototypes scheduled to launch early next year, according to Amye Kenall, the journal development manager of open data initiatives and journals at BioMed Central. She envisions the badge process in the following way: Once an article is published, the publisher would alert software maintained by Mozilla to automatically set up an online form, where authors fill out roles using a detailed contributor taxonomy. After the authors have completed this, the badges would then appear next to their names on the journal article, and double-clicking on a badge would lead to the ORCID site for that particular author, where the author’s badges, integrated with their publishing record, live….
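The workflow Kenall describes, in which authors declare roles from a detailed contributor taxonomy and badges are then attached to their names on the article, can be sketched as a minimal data model. The role names and class shapes below are illustrative assumptions, not the actual Mozilla/ORCID schema:

```python
from dataclasses import dataclass, field

# Illustrative contributor taxonomy; the real project uses a more
# detailed taxonomy, so these role names are assumptions.
TAXONOMY = {"conceptualization", "data-curation", "analysis",
            "funding-acquisition", "software", "writing"}

@dataclass
class Author:
    name: str
    orcid: str                       # e.g. "0000-0002-1825-0097"
    badges: set = field(default_factory=set)

def assign_badges(author: Author, roles: set) -> Author:
    """Attach one badge per declared role, rejecting any role
    that falls outside the agreed taxonomy."""
    unknown = roles - TAXONOMY
    if unknown:
        raise ValueError(f"roles not in taxonomy: {unknown}")
    author.badges |= roles
    return author

a = assign_badges(Author("J. Doe", "0000-0002-1825-0097"),
                  {"software", "analysis"})
print(sorted(a.badges))  # badges now linkable from the author's ORCID profile
```

Validating roles against a fixed taxonomy is the key design choice here: it is what makes the badges comparable across journals rather than free-text claims.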
The parties behind the digital badge effort are “looking to change behavior” of scientists in the competitive dog-eat-dog world of academia by acknowledging contributions, says Kaitlin Thaney, director of Mozilla Science Lab. Amy Brand, vice president of academic and research relations and VP of North America at Digital Science, says that the collaboration believes that the badges should be optional, to accommodate old-fashioned or less tech-savvy authors. She says that the digital credentials may improve lab culture, countering situations where junior scientists are caught up in lab politics and the “star,” who didn’t do much of the actual research apart from obtaining the funding, gets to be the first author of the paper and receive the most credit. “All of this calls out for more transparency,” Brand says….”
The collision between big data and privacy law
Finding Collaborators: Toward Interactive Discovery Tools for Research Network Systems
New paper by Charles D Borromeo, Titus K Schleyer, Michael J Becich, and Harry Hochheiser: “Background: Research networking systems hold great promise for helping biomedical scientists identify collaborators with the expertise needed to build interdisciplinary teams. Although efforts to date have focused primarily on collecting and aggregating information, less attention has been paid to the design of end-user tools for using these collections to identify collaborators. To be effective, collaborator search tools must provide researchers with easy access to information relevant to their collaboration needs.
Objective: The aim was to study user requirements and preferences for research networking system collaborator search tools and to design and evaluate a functional prototype.
Methods: Paper prototypes exploring possible interface designs were presented to 18 participants in semistructured interviews aimed at eliciting collaborator search needs. Interview data were coded and analyzed to identify recurrent themes and related software requirements. Analysis results and elements from paper prototypes were used to design a Web-based prototype using the D3 JavaScript library and VIVO data. Preliminary usability studies asked 20 participants to use the tool and to provide feedback through semistructured interviews and completion of the System Usability Scale (SUS).
Results: Initial interviews identified consensus regarding several novel requirements for collaborator search tools, including chronological display of publication and research funding information, the need for conjunctive keyword searches, and tools for tracking candidate collaborators. Participant responses were positive (SUS score: mean 76.4%, SD 13.9). Opportunities for improving the interface design were identified.
Conclusions: Interactive, timeline-based displays that support comparison of researcher productivity in funding and publication have the potential to effectively support searching for collaborators. Further refinement and longitudinal studies may be needed to better understand the implications of collaborator search tools for researcher workflows.”
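One requirement the interviews surfaced, conjunctive keyword search, is simple to illustrate. The sketch below assumes researcher profiles with a free-text expertise field; the field names and data are hypothetical, not taken from VIVO:

```python
def conjunctive_search(profiles, keywords):
    """Return researchers whose profile text contains ALL of the
    keywords (AND semantics), rather than the more common OR match."""
    kws = [k.lower() for k in keywords]
    return [p["name"] for p in profiles
            if all(k in p["expertise"].lower() for k in kws)]

profiles = [
    {"name": "Ada",  "expertise": "Machine learning for genomics"},
    {"name": "Ben",  "expertise": "Clinical genomics and biobanks"},
    {"name": "Cara", "expertise": "Machine learning for imaging"},
]
print(conjunctive_search(profiles, ["genomics", "machine learning"]))
# only the researcher matching every keyword is returned
```

AND semantics matter for collaborator search because a candidate must combine several competencies at once; OR matching would flood the result list with partial matches.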
Law is Code: A Software Engineering Approach to Analyzing the United States Code
New Paper by William Li, Pablo Azar, David Larochelle, Phil Hill & Andrew Lo: “The agglomeration of rules and regulations over time has produced a body of legal code that no single individual can fully comprehend. This complexity produces inefficiencies, makes the processes of understanding and changing the law difficult, and frustrates the fundamental principle that the law should provide fair notice to the governed. In this article, we take a quantitative, unbiased, and software-engineering approach to analyze the evolution of the United States Code from 1926 to today. Software engineers frequently face the challenge of understanding and managing large, structured collections of instructions, directives, and conditional statements, and we adapt and apply their techniques to the U.S. Code over time. Our work produces insights into the structure of the U.S. Code as a whole, its strengths and vulnerabilities, and new ways of thinking about individual laws. For example, we identify the first appearance and spread of important terms in the U.S. Code like “whistleblower” and “privacy.” We also analyze and visualize the network structure of certain substantial reforms, including the Patient Protection and Affordable Care Act (PPACA) and the Dodd-Frank Wall Street Reform and Consumer Protection Act, and show how the interconnections of references can increase complexity and create the potential for unintended consequences. Our work is a timely illustration of computational approaches to law as the legal profession embraces technology for scholarship, to increase efficiency, and to improve access to justice.”
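The paper's core move, treating the Code as a structured collection and analyzing its cross-references as a network, can be sketched in miniature. The section texts and citation syntax below are invented for illustration; the authors work with the real U.S. Code, not this toy:

```python
import re
from collections import defaultdict

# Toy "code": section number -> text, with cross-references like "§ 103".
sections = {
    "101": "Definitions. For whistleblower protections see § 103.",
    "102": "Privacy of records. Exceptions are listed in § 103.",
    "103": "Enforcement. Penalties under § 101 apply.",
}

def reference_graph(secs):
    """Directed edges: citing section -> set of cited sections."""
    edges = defaultdict(set)
    for num, text in secs.items():
        for target in re.findall(r"§ (\d+)", text):
            edges[num].add(target)
    return edges

def in_degree(edges):
    """How often each section is cited -- a crude centrality measure
    for spotting sections whose amendment would ripple widely."""
    counts = defaultdict(int)
    for src, targets in edges.items():
        for t in targets:
            counts[t] += 1
    return dict(counts)

g = reference_graph(sections)
print(in_degree(g))  # the most-cited section stands out immediately
```

Even this toy version shows how dense interconnection creates the potential for unintended consequences the authors describe: amending a heavily cited section silently changes the meaning of every section that references it.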
Crowd-Sourcing Corruption: What Petrified Forests, Street Music, Bath Towels and the Taxman Can Tell Us About the Prospects for Its Future
Paper by Dieter Zinnbauer: “This article seeks to map out the prospects of crowd-sourcing technologies in the area of corruption-reporting. A flurry of initiatives and concomitant media hype in this area has led to exuberant hopes that the end of impunity is no longer such a distant possibility – at least not for the most blatant, ubiquitous and visible forms of administrative corruption, such as bribes and extortion payments that, on average, almost a quarter of citizens report facing year in, year out in their daily lives in so many countries around the world (Transparency International 2013).
Only with hindsight will we be able to tell if these hopes were justified. However, a closer look at an interdisciplinary body of literature on corruption and social mobilisation can help shed some interesting light on these questions and offer a fresh perspective on the potential of social-media-based crowd-sourcing for better governance and less corruption. So far the potential of crowd-sourcing has mainly been approached from a technology-centred perspective. Where challenges are identified, pondered, and worked upon, they are primarily technical and managerial in nature, ranging from issues of privacy protection and fighting off hacker attacks to challenges of data management, information validation or fundraising.
In contrast, short shrift is given to insights from a substantive, multi-disciplinary and growing body of literature on how corruption works, how it can be fought and, more generally, how observed logics of collective action and social mobilisation interact with technological affordances and condition the success of these efforts.
This imbalanced debate is not really surprising as it seems to follow the trajectory of the hype-and-bust cycle that we have seen in the public debate for a variety of other technology applications. From electronic health cards to smart government, to intelligent transport systems, all these and many other highly ambitious initiatives start with technology-centric visions of transformational impact. However, over time – with some hard lessons learnt and large sums spent – they all arrive at a more pragmatic and nuanced view on how social and economic forces shape the implementation of such technologies and require a more shrewd design approach, in order to make it more likely that potential actually translates into impact….”
“Open” disclosure of innovations, incentives and follow-on reuse: Theory on processes of cumulative innovation and a field experiment in computational biology
Paper by Kevin J. Boudreau and Karim R. Lakhani: “Most of society’s innovation systems – academic science, the patent system, open source, etc. – are “open” in the sense that they are designed to facilitate knowledge disclosure among innovators. An essential difference across innovation systems is whether disclosure is of intermediate progress and solutions or of completed innovations. We theorize and present experimental evidence linking intermediate versus final disclosure to an ‘incentives-versus-reuse’ tradeoff and to a transformation of the innovation search process. We find intermediate disclosure has the advantage of efficiently steering development towards improving existing solution approaches, but also has the effect of limiting experimentation and narrowing technological search. We discuss the comparative advantages of intermediate versus final disclosure policies in fostering innovation.”