DATA – Page 527 – The Living Library

The collision between big data and privacy law

Curated on November 6, 2014 (updated August 3, 2018) by Stefaan Verhulst

Paper by Stephen Wilson in the Australian Journal of Telecommunications and the Digital Economy: “We live in an age where billionaires are self-made on the back of the most intangible of assets – the information they have about us. The digital economy is awash with data. It’s a new and endlessly re-useable raw material, increasingly left behind by ordinary people going about their lives online. Many information businesses proceed on the basis that raw data is up for grabs; if an entrepreneur is clever enough to find a new vein of it, they can feel entitled to tap it in any way they like. However, some tacit assumptions underpinning today’s digital business models are naive. Conventional data protection laws, older than the Internet, limit how Personal Information is allowed to flow. These laws turn out to be surprisingly powerful in the face of ‘Big Data’ and the ‘Internet of Things’. On the other hand, orthodox privacy management was not framed for new Personal Information being synthesised tomorrow from raw data collected today. This paper seeks to bridge a conceptual gap between data analytics and privacy, and sets out extended Privacy Principles to better deal with Big Data.”

Urban Observatory Is Snapping 9,000 Images A Day Of New York City

Curated on November 5, 2014 (updated August 3, 2018) by Stefaan Verhulst

FastCo-Exist: “Astronomers have long built observatories to capture the night sky and beyond. Now researchers at NYU are borrowing astronomy’s methods and turning their cameras towards Manhattan’s famous skyline.
NYU’s Center for Urban Science and Progress has been running what’s likely the world’s first “urban observatory” of its kind for about a year. From atop a tall building in downtown Brooklyn (NYU won’t say its address, due to security concerns), two cameras—one regular one and one that captures infrared wavelengths—take panoramic images of lower and midtown Manhattan. One photo is snapped every 10 seconds. That’s 8,640 images a day, or more than 3 million since the project began (or about 50 terabytes of data).

“The real power of the urban observatory is that you have this synoptic imaging. By synoptic imaging, I mean these large swaths of the city,” says the project’s chief scientist Gregory Dobler, a former astrophysicist at Harvard University and the University of California, Santa Barbara who now heads the 15-person observatory team at NYU.
Dobler’s team is collaborating with New York City officials on the project, which is now expanding to set up stations that study other parts of Manhattan and Brooklyn. Its major goal is to discover information about the urban landscape that can’t be seen at other scales. Such data could lead to applications like tracking which buildings are leaking energy (with the infrared camera), or measuring occupancy patterns of buildings at night, or perhaps detecting releases of toxic chemicals in an emergency.
The video above is an example. The top panel cycles through a one-minute slice of observatory images. The bottom panel is an analysis of the same images in which everything that remains static in each image is removed, such as buildings, trees, and roads. What’s left is an imprint of everything in flux within the scene—the clouds, the cars on the FDR Drive, the boat moving down the East River, and, importantly, a plume of smoke that puffs out of a building.
“Periodically, a building will burp,” says Dobler. “It’s hard to see the puffs of smoke . . . but we can isolate that plume and essentially identify it.” (As Dobler has done by highlighting it in red in the top panel).
In response to the natural privacy concerns about this kind of program, Dobler emphasizes that the pictures come from only an 8-megapixel camera (the same found in the iPhone 6) and aren’t clear enough to see inside a window or make out individuals. As a further privacy safeguard, the images are analyzed only for “aggregate” measures—such as the patterns of nighttime energy usage—rather than for specific buildings. “We’re not really interested in looking at a given building, and saying, hey, these guys are particular offenders,” he says. (He also says the team is not looking at uses for the data in security applications.) However, Dobler was not able to answer a question as to whether the project’s partners at city agencies are able to access data analysis for individual buildings….”
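The static-removal analysis described in the excerpt can be illustrated with a minimal sketch. The article does not say how the NYU team actually does it; per-pixel median background subtraction is swapped in here as one common technique, and the function name `moving_mask`, the threshold, and the tiny synthetic frames are all assumptions made for illustration. The first line also checks the capture-rate arithmetic quoted above.

```python
from statistics import median

# One photo every 10 seconds, as stated in the article.
images_per_day = 24 * 60 * 60 // 10  # 8640

def moving_mask(frames, threshold=10):
    """For each frame, mark pixels that differ from the static background.

    frames: list of equally sized 2-D lists of grayscale values.
    The background is estimated as the per-pixel median over time
    (an assumed technique, not the observatory's documented method).
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]
    return [
        [
            [abs(frame[y][x] - background[y][x]) > threshold for x in range(width)]
            for y in range(height)
        ]
        for frame in frames
    ]

# Synthetic 2x3 frames: a bright "plume" drifts across the top row,
# while the bottom row (a "building") never changes.
frames = [
    [[50, 50, 50], [80, 80, 80]],
    [[50, 200, 50], [80, 80, 80]],
    [[50, 50, 200], [80, 80, 80]],
    [[50, 50, 50], [80, 80, 80]],
    [[50, 50, 50], [80, 80, 80]],
]
masks = moving_mask(frames)
```

In this toy data only the drifting bright pixel is flagged; everything constant across frames drops out, which is the effect the bottom panel of the video is described as showing.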

Finding Collaborators: Toward Interactive Discovery Tools for Research Network Systems

Curated on November 4, 2014 (updated October 30, 2018) by Stefaan Verhulst

New paper by Charles D Borromeo, Titus K Schleyer, Michael J Becich, and Harry Hochheiser: “Background: Research networking systems hold great promise for helping biomedical scientists identify collaborators with the expertise needed to build interdisciplinary teams. Although efforts to date have focused primarily on collecting and aggregating information, less attention has been paid to the design of end-user tools for using these collections to identify collaborators. To be effective, collaborator search tools must provide researchers with easy access to information relevant to their collaboration needs.
Objective: The aim was to study user requirements and preferences for research networking system collaborator search tools and to design and evaluate a functional prototype.
Methods: Paper prototypes exploring possible interface designs were presented to 18 participants in semistructured interviews aimed at eliciting collaborator search needs. Interview data were coded and analyzed to identify recurrent themes and related software requirements. Analysis results and elements from paper prototypes were used to design a Web-based prototype using the D3 JavaScript library and VIVO data. Preliminary usability studies asked 20 participants to use the tool and to provide feedback through semistructured interviews and completion of the System Usability Scale (SUS).
Results: Initial interviews identified consensus regarding several novel requirements for collaborator search tools, including chronological display of publication and research funding information, the need for conjunctive keyword searches, and tools for tracking candidate collaborators. Participant responses were positive (SUS score: mean 76.4%, SD 13.9). Opportunities for improving the interface design were identified.
Conclusions: Interactive, timeline-based displays that support comparison of researcher productivity in funding and publication have the potential to effectively support searching for collaborators. Further refinement and longitudinal studies may be needed to better understand the implications of collaborator search tools for researcher workflows.”
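For readers unfamiliar with the System Usability Scale mentioned in the abstract, here is a minimal sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a 0 to 100 score. The abstract does not describe the authors' exact procedure, so the function name `sus_score` and the sample ratings are assumptions for illustration only.

```python
def sus_score(responses):
    """Score one participant's SUS questionnaire (ten ratings, each 1-5)."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5

# Hypothetical ratings, not data from the study.
example = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
```

A study-level figure such as the mean of 76.4 reported above would be the average of these per-participant scores.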

The New Thing in Google Flu Trends Is Traditional Data

Curated on November 3, 2014 (updated August 3, 2018) by Stefaan Verhulst

Steve Lohr in the New York Times: “Google is giving its Flu Trends service an overhaul — “a brand new engine,” as it announced in a blog post on Friday.

The new thing is actually traditional data from the Centers for Disease Control and Prevention that is being integrated into the Google flu-tracking model. The goal is greater accuracy after the Google service had been criticized for consistently over-estimating flu outbreaks in recent years.

The main critique came in an analysis done by four quantitative social scientists, published earlier this year in an article in Science magazine, “The Parable of Google Flu: Traps in Big Data Analysis.” The researchers found that the most accurate flu predictor was a data mash-up that combined Google Flu Trends, which monitored flu-related search terms, with the official C.D.C. reports from doctors on influenza-like illness.

The Google Flu Trends team is heeding that advice. In the blog post, Christian Stefansen, a Google senior software engineer, wrote, “We’re launching a new Flu Trends model in the United States that — like many of the best performing methods in the literature — takes official CDC flu data into account as the flu season progresses.”

Google’s flu-tracking service has had its ups and downs. Its triumph came in 2009, when it gave an advance signal of the severity of the H1N1 outbreak, two weeks or so ahead of official statistics. In a 2009 article in Nature explaining how Google Flu Trends worked, the company’s researchers did, as the Friday post notes, say that the Google service was not intended to replace official flu surveillance methods and that it was susceptible to “false alerts” — anything that might prompt a surge in flu-related search queries.

Yet those caveats came a couple of pages into the Nature article. And Google Flu Trends became a symbol of the superiority of the new, big data approach — computer algorithms mining data trails for collective intelligence in real time. To enthusiasts, it seemed so superior to the antiquated method of collecting health data that involved doctors talking to patients, inspecting them and filing reports.

But Google’s flu service greatly overestimated the number of cases in the United States in the 2012-13 flu season — a well-known miss — and, according to the research published this year, has persistently overstated flu cases over the years. In the Science article, the social scientists called it “big data hubris.”

Law is Code: A Software Engineering Approach to Analyzing the United States Code

Curated on November 3, 2014 (updated August 3, 2018) by Stefaan Verhulst

New Paper by William Li, Pablo Azar, David Larochelle, Phil Hill & Andrew Lo: “The agglomeration of rules and regulations over time has produced a body of legal code that no single individual can fully comprehend. This complexity produces inefficiencies, makes the processes of understanding and changing the law difficult, and frustrates the fundamental principle that the law should provide fair notice to the governed. In this article, we take a quantitative, unbiased, and software-engineering approach to analyze the evolution of the United States Code from 1926 to today. Software engineers frequently face the challenge of understanding and managing large, structured collections of instructions, directives, and conditional statements, and we adapt and apply their techniques to the U.S. Code over time. Our work produces insights into the structure of the U.S. Code as a whole, its strengths and vulnerabilities, and new ways of thinking about individual laws. For example, we identify the first appearance and spread of important terms in the U.S. Code like “whistleblower” and “privacy.” We also analyze and visualize the network structure of certain substantial reforms, including the Patient Protection and Affordable Care Act (PPACA) and the Dodd-Frank Wall Street Reform and Consumer Protection Act, and show how the interconnections of references can increase complexity and create the potential for unintended consequences. Our work is a timely illustration of computational approaches to law as the legal profession embraces technology for scholarship, to increase efficiency, and to improve access to justice.”
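Two of the ideas in the abstract, finding the first appearance of a term and mapping cross-references between sections, can be sketched in a few lines. This is only an illustration under stated assumptions: the authors' actual pipeline is not described in the excerpt, the helper names `first_appearance` and `reference_edges` are hypothetical, and the snapshot text below is invented filler rather than real statutory language or real dates.

```python
import re

def first_appearance(snapshots, term):
    """Return the earliest year whose text contains the term, or None.

    snapshots: dict mapping a year to the full text of that year's edition.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    for year in sorted(snapshots):
        if pattern.search(snapshots[year]):
            return year
    return None

def reference_edges(sections):
    """Return (source, target) pairs for every 'section N' mention.

    sections: dict mapping a section id (string) to its text.
    Only references to sections present in the dict are kept.
    """
    edges = set()
    for source, text in sections.items():
        for target in re.findall(r"\bsection (\d+)\b", text, re.IGNORECASE):
            if target != source and target in sections:
                edges.add((source, target))
    return edges

# Invented toy data for illustration only.
snapshots = {
    1950: "Rules on records and notice.",
    1980: "Rules on records, notice, and privacy.",
    2010: "Rules on records, privacy, and whistleblower protection.",
}
sections = {
    "1": "Definitions used in section 2 and section 3.",
    "2": "Applies subject to section 3.",
    "3": "General provisions.",
}
privacy_year = first_appearance(snapshots, "privacy")
edges = reference_edges(sections)
```

The resulting edge set is the kind of structure on which network measures, such as how many sections depend on a given provision, could then be computed.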

Open Data – Searching for the right questions

Curated on November 3, 2014 (updated May 29, 2019) by Stefaan Verhulst

Talk by Boyan Yurukov at TEDxBG: “Working on various projects, Boyan started a sort of quest for better transparency. It came with the promise of access that would yield answers to what is wrong and what is right with governments today. Over time, he realized that better transparency and more open data bring us almost no relevant answers. Instead, we get more questions, and that’s great news. Questions help us see what is relevant, what is hidden, what our assumptions are. That’s the true value of data.
Boyan Yurukov is a software engineer and open data advocate based in Frankfurt. He graduated in Computational Engineering with Data Mining from TU Darmstadt. He is involved in data liberation, crowd sourcing and visualization projects focused on various issues in Bulgaria as well as open data legislation….”

Ten Leaders In the Civic Space

Curated on November 1, 2014 (updated October 10, 2018) by Stefaan Verhulst

List Developed by SeeClickFix:

1. Granicus

Granicus is the leading provider of government webcasting and public meeting software, maintaining the world’s largest network of legislative content…

2. Socrata

Socrata is a cloud software company that aims to democratize access to government data through their open data and open performance platform….

3. Cityworks

Cityworks is the leading provider of GIS-centric asset management solutions, performing cost-effective inspection, monitoring, and condition assessment.


4. NeighborWorks

NeighborWorks is a community development hub that supports more than 240 U.S. development organizations through grants and technical assistance.

5. OpenGov Hub

The OpenGov Hub seeks to bring together existing small and medium-sized organizations working on the broader open government agenda. …

6. Blexting

Blexting is a mobile app that lets individuals photographically survey properties and update condition information for posting and sharing. …

7. Code for America

Code for America aims to forge connections between the public and private sector by organizing a network of people to build technology that makes government services better….

8. NationBuilder

NationBuilder is a cost-effective, accessible software platform that helps communities organize and people build relationships.

9. Emerging Local Government Leaders

ELGL is a group of innovative local government leaders who are hungry to make an impact. …


10. ArchiveSocial

ArchiveSocial is a social media archiving solution that automates record keeping from social media networks like Facebook and Twitter. …

Crowd-Sourcing Corruption: What Petrified Forests, Street Music, Bath Towels and the Taxman Can Tell Us About the Prospects for Its Future

Curated on October 28, 2014 (updated August 3, 2018) by Stefaan Verhulst

Paper by Dieter Zinnbauer: “This article seeks to map out the prospects of crowd-sourcing technologies in the area of corruption-reporting. A flurry of initiative and concomitant media hype in this area has led to exuberant hopes that the end of impunity is not such a distant possibility any more – at least not for the most blatant, ubiquitous and visible forms of administrative corruption, such as bribes and extortion payments that on average almost a quarter of citizens reported to face year in, year out in their daily lives in so many countries around the world (Transparency International 2013).
Only with hindsight will we be able to tell if these hopes were justified. However, a closer look at an interdisciplinary body of literature on corruption and social mobilisation can help shed some interesting light on these questions and offer a fresh perspective on the potential of social-media-based crowd-sourcing for better governance and less corruption. So far, the potential of crowd-sourcing is mainly approached from a technology-centred perspective. Where challenges are identified, pondered, and worked upon, they are primarily technical and managerial in nature, ranging from issues of privacy protection and fighting off hacker attacks to challenges of data management, information validation or fundraising.
In contrast, short shrift is being paid to insights from a substantive, multi-disciplinary and growing body of literature on how corruption works, how it can be fought and more generally how observed logics of collective action and social mobilisation interact with technological affordances and condition the success of these efforts.
This imbalanced debate is not really surprising as it seems to follow the trajectory of the hype-and-bust cycle that we have seen in the public debate for a variety of other technology applications. From electronic health cards to smart government, to intelligent transport systems, all these and many other highly ambitious initiatives start with technology-centric visions of transformational impact. However, over time – with some hard lessons learnt and large sums spent – they all arrive at a more pragmatic and nuanced view on how social and economic forces shape the implementation of such technologies and require a more shrewd design approach, in order to make it more likely that potential actually translates into impact….”

When Experts Are a Waste of Money

Curated on October 28, 2014 (updated August 3, 2018) by Stefaan Verhulst

Vivek Wadhwa at the Wall Street Journal: “Corporations have always relied on industry analysts, management consultants and in-house gurus for advice on strategy and competitiveness. Since these experts understand the products, markets and industry trends, they also get paid the big bucks.
But what experts do is analyze historical trends, extrapolate forward on a linear basis and protect the status quo — their field of expertise. And technologies are not progressing linearly anymore; they are advancing exponentially. Technology is advancing so rapidly that listening to people who just have domain knowledge and vested interests will put a company on the fastest path to failure. Experts are no longer the right people to turn to; they are a waste of money.
Just as the processing power of our computers doubles every 18 months, with prices falling and devices becoming smaller, fields such as medicine, robotics, artificial intelligence and synthetic biology are seeing accelerated change. Competition now comes from the places you least expect it to. The health-care industry, for example, is about to be disrupted by advances in sensors and artificial intelligence; lodging and transportation, by mobile apps; communications, by Wi-Fi and the Internet; and manufacturing, by robotics and 3-D printing.
To see the competition coming and develop strategies for survival, companies now need armies of people, not experts. The best knowledge comes from employees, customers and outside observers who aren’t constrained by their expertise or personal agendas. It is they who can best identify the new opportunities. The collective insight of large numbers of individuals is superior because of the diversity of ideas and breadth of knowledge that they bring. Companies need to learn from people with different skills and backgrounds — not from those confined to a department.
When used properly, crowdsourcing can be the most effective, least expensive way of solving problems.
Crowdsourcing can be as simple as asking employees to submit ideas via email or via online discussion boards, or it can assemble cross-disciplinary groups to exchange ideas and brainstorm. Internet platforms such as Zoho Connect, IdeaScale and GroupTie can facilitate group ideation by providing the ability to pose questions to a large number of people and having them discuss responses with each other.
Many of the ideas proposed by the crowd as well as the discussions will seem outlandish — especially if anonymity is allowed on discussion forums. And companies will surely hear things they won’t like. But this is exactly the input and out-of-the-box thinking that they need in order to survive and thrive in this era of exponential technologies….
Another way of harnessing the power of the crowd is to hold incentive competitions. These can solve problems, foster innovation and even create industries — just as the first XPRIZE did. Sponsored by the Ansari family, it offered a prize of $10 million to any team that could build a spacecraft capable of carrying three people to 100 kilometers above the earth’s surface, twice within two weeks. It was won by Burt Rutan in 2004, who launched a spacecraft called SpaceShipOne. Twenty-six teams, from seven countries, spent more than $100 million in competing. Since then, more than $1.5 billion has been invested in private space flight by companies such as Virgin Galactic, Armadillo Aerospace and Blue Origin, according to the XPRIZE Foundation….
Competitions needn’t be so grand. InnoCentive and HeroX, a spinoff from the XPRIZE Foundation, for example, allow prizes as small as a few thousand dollars for solving problems. A company or an individual can specify a problem and offer prizes for whoever comes up with the best idea to solve it. InnoCentive has already run thousands of public and inter-company competitions. The solutions they have crowdsourced have ranged from the development of biomarkers for Amyotrophic lateral sclerosis disease to dual-purpose solar lights for African villages….”

VoteATX

Curated on October 28, 2014 (updated August 3, 2018) by Stefaan Verhulst

Press Release: “Local volunteers have released a free application that helps Austin area residents find the best place to vote. The application, Vote ATX, is available at http://voteatx.us
Travis County voters have many options for voting. The Vote ATX application tries to answer the simple question, “Where is the best place I can go vote right now?” The application is location and calendar aware, and helps identify available voting places – even mobile voting locations that move during the day.
The City of Austin has incorporated the Vote ATX technology to power the voting place finder on its election page at http://www.austintexas.gov/vote
The Vote ATX application was developed by volunteers at Open Austin, and is provided as a free public service. … Open Austin is a citizen volunteer group that promotes open government, open data, and civic application development in Austin, Texas. Open Austin was formed in 2009 by citizens interested in the City of Austin web strategy. Open Austin is non-partisan and non-endorsing. It has conducted voter outreach campaigns in every City of Austin municipal election since 2011. Open Austin is on the web at www.open-austin.org”
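The "location and calendar aware" lookup described in the press release can be sketched as a two-step filter: keep only the places open at the current time, then pick the nearest one. This is a rough illustration and not the Vote ATX code; the `VotingPlace` type, the `best_place` function, and the sample sites, coordinates, and hours are all hypothetical. A mobile voting location is modeled simply as a separate entry for each stop and time window.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

@dataclass
class VotingPlace:
    name: str
    lat: float
    lon: float
    opens: datetime
    closes: datetime

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def best_place(places, lat, lon, now):
    """Return the nearest place that is open at `now`, or None."""
    open_now = [p for p in places if p.opens <= now < p.closes]
    if not open_now:
        return None
    return min(open_now, key=lambda p: haversine_km(lat, lon, p.lat, p.lon))

# Hypothetical sites and hours, for illustration only.
day = (2014, 10, 28)
places = [
    VotingPlace("Site A (fixed)", 30.30, -97.70,
                datetime(*day, 7, 0), datetime(*day, 19, 0)),
    VotingPlace("Mobile B (morning stop)", 30.27, -97.74,
                datetime(*day, 8, 0), datetime(*day, 12, 0)),
]
morning = best_place(places, 30.27, -97.74, datetime(*day, 10, 0))
afternoon = best_place(places, 30.27, -97.74, datetime(*day, 15, 0))
evening = best_place(places, 30.27, -97.74, datetime(*day, 20, 0))
```

With this data the mobile stop wins in the morning because it is closest, the fixed site wins in the afternoon once the mobile stop has moved on, and nothing is returned after closing time.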