Gov.uk quietly disrupts the problem of online identity login


The Guardian: “A new “verified identity” scheme for gov.uk is making it simpler to apply for a new driving licence or passport, or to file a tax return online, allowing users to register securely using one login that connects to and securely stores their personal data.
After nearly a year of closed testing with a few thousand Britons, the “Gov.UK Verify” scheme quietly opened to general users on 14 October, expanding across more services. It could have as many as half a million users within a year.
The most popular services are expected to be tax credit renewals and CAP farm information – both expected to have around 100,000 users by April next year, and between them making up nearly half of the total use.
The team behind the system claim this is a world first. Those countries that have developed advanced government services online, such as Estonia, rely on state identity cards – which the UK has rejected.
“This is a federated model of identity, not a centralised one,” said Janet Hughes, head of policy and engagement at the Government Digital Service’s identity assurance program, which developed and tested the system.
How it works
The Verify system has taken three years to develop, and involves checking a user’s identity against details from a range of sources, including credit reference agencies, utility bills, driving licences and mobile provider bills.
But it does not retain those pieces of information, and the credit checking companies do not know what service is being used. Only a mobile or landline number is kept in order to send verification codes for subsequent logins.
When people subsequently log in, they have to provide a user ID and password, and then verify their identity by entering a code sent to the stored phone number.
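As a rough illustration of that two-step login, here is a minimal sketch of a password check followed by a one-time code sent to the stored number. The names, the in-memory storage and the five-minute code lifetime are assumptions for illustration only; this is not GOV.UK Verify’s actual implementation.

```python
import hashlib
import hmac
import secrets
import time

CODE_TTL_SECONDS = 300  # assumed: one-time codes expire after five minutes

# Illustrative store: only a password hash and a phone number are retained.
users = {
    "alice": {
        "password_hash": hashlib.sha256(b"correct horse").hexdigest(),
        "phone": "+44 7700 900123",
        "pending_code": None,
        "code_issued_at": None,
    }
}

def send_sms(phone: str, message: str) -> None:
    """Stand-in for an SMS gateway call."""
    print(f"SMS to {phone}: {message}")

def start_login(user_id: str, password: str) -> bool:
    """Step 1: check user ID and password, then send a one-time code."""
    user = users.get(user_id)
    if not user:
        return False
    supplied = hashlib.sha256(password.encode()).hexdigest()
    if not hmac.compare_digest(supplied, user["password_hash"]):
        return False
    code = f"{secrets.randbelow(10**6):06d}"  # six-digit verification code
    user["pending_code"] = code
    user["code_issued_at"] = time.time()
    send_sms(user["phone"], f"Your verification code is {code}")
    return True

def finish_login(user_id: str, code: str) -> bool:
    """Step 2: the entered code must match and still be fresh."""
    user = users.get(user_id)
    if not user or user["pending_code"] is None:
        return False
    fresh = time.time() - user["code_issued_at"] < CODE_TTL_SECONDS
    ok = fresh and hmac.compare_digest(code, user["pending_code"])
    user["pending_code"] = None  # codes are single-use
    return ok
```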
To enrol in the system, users have to be over 19, living in the UK, and have been resident for over 12 months. A faked passport would not be sufficient: “they would need a very full false ID, and have to not appear on any list of fraudulent identities,” one source at the GDS told the Guardian.
Banks now following gov.uk’s lead
Government developers are confident that it presents a higher barrier to authentication fraud than any other digital service, so that fraudulent transactions will be minimised. That has drawn interest from banks, which are understood to be exploring use of the same service to verify customer identities through an arm’s-length verification system.
The government system would not pass on people’s data, but would instead verify that someone is who they claim to be, much like Twitter and Facebook verify users’ identity to log in to third party sites, yet don’t share their users’ data.
The US, Canada and New Zealand have also expressed interest in following the UK’s lead with the system, which requires users to provide separate pieces of verified information about themselves from different sources.
The system then cross-references that verified information with credit reference agencies and other sources, which can include a mobile phone provider, passport, bank account, utility bill or driving licence.
Confidence in an individual’s identity is graded into four levels. The lowest is for the creation of simple accounts to receive reports or updates: “we don’t need to know who it is, only that it’s the same person returning,” said Hughes.
Level 2 requires that “on the balance of probability” someone is who they say they are – which is the level to which Verify will be able to identify people. Hughes says that this will cover the majority of services.
Level 3 requires identity “beyond reasonable doubt” – perhaps including the first application for a passport – and Level 4 would require biometric information to confirm individual identity.
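To make the tiering concrete, the sketch below encodes the four levels as a small enum and gates hypothetical services on a minimum level. The mapping of services to levels is an assumption for illustration, not an official one.

```python
from enum import IntEnum

class AssuranceLevel(IntEnum):
    """The four identity-assurance tiers described above."""
    LEVEL_1 = 1  # same returning person; who they are is not needed
    LEVEL_2 = 2  # "on the balance of probability" — the level Verify provides
    LEVEL_3 = 3  # "beyond reasonable doubt", e.g. a first passport application
    LEVEL_4 = 4  # biometric confirmation of identity

# Hypothetical mapping of services to the minimum level they would require.
REQUIRED_LEVEL = {
    "tax_credit_renewal": AssuranceLevel.LEVEL_2,
    "first_passport_application": AssuranceLevel.LEVEL_3,
}

def can_access(service: str, proven_level: AssuranceLevel) -> bool:
    """A user may use a service if their proven level meets its requirement."""
    return proven_level >= REQUIRED_LEVEL.get(service, AssuranceLevel.LEVEL_1)

print(can_access("tax_credit_renewal", AssuranceLevel.LEVEL_2))          # True
print(can_access("first_passport_application", AssuranceLevel.LEVEL_2))  # False
```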

Seattle Launches Sweeping, Ethics-Based Privacy Overhaul


From the Privacy Advisor: “The City of Seattle this week launched a citywide privacy initiative aimed at providing greater transparency into the city’s data collection and use practices.
To that end, the city has convened a group of stakeholders, the Privacy Advisory Committee, comprising various government departments, to look at the ways the city is using data collected from practices as common as utility bill payments and renewing pet licenses, or during the administration of emergency services like police and fire. By this summer, the committee will deliver suggested principles and a “privacy statement” to the City Council to provide direction on privacy practices citywide.
In addition, the city has partnered with the University of Washington, where Jan Whittington, assistant professor of urban design and planning and associate director at the Center for Information Assurance and Cybersecurity, has been given a $50,000 grant to look at open data, privacy and digital equity and how municipal data collection could harm consumers.
Responsible for all things privacy in this progressive city is Michael Mattmiller, who was hired to the position of chief technology officer (CTO) for the City of Seattle in June. Before his current gig, he worked as a senior strategist in enterprise cloud privacy for Microsoft. He said it’s an exciting time to be at the helm of the office because there’s momentum, there’s talent and there’s intention.
“We’re at this really interesting time where we have a City Council that strongly cares about privacy … We have a new police chief who wants to be very good on privacy … We also have a mayor who is focused on the city being an innovative leader in the way we interact with the public,” he said.
In fact, some City Council members have taken it upon themselves to meet with various groups and coalitions. “We have a really good, solid environment we think we can leverage to do something meaningful,” Mattmiller said….
Armbruster said the end goal is to create policies that will hold weight over time.
“I think when looking at privacy principles, from an ethical foundation, the idea is to create something that will last while technology dances around us,” she said, adding the principles should answer the question, “What do we stand for as a city and how do we want to move forward? So any technology that falls into our laps, we can evaluate and tailor or perhaps take a pass on as it falls under our ethical framework.”
The bottom line, Mattmiller said, is making a decision that says something about Seattle and where it stands.
“How do we craft a privacy policy that establishes who we want to be as a city and how we want to operate?” Mattmiller asked.”

Could digital badges clarify the roles of co-authors?


At AAAS Science Magazine: “Ever look at a research paper and wonder how the half-dozen or more authors contributed to the work? After all, it’s usually only the first or last author who gets all the media attention or the scientific credit when people are considered for jobs, grants, awards, and more. Some journals try to address this issue with the “authors’ contributions” sections within a paper, but a collection of science, publishing, and software groups is now developing a more modern solution—digital “badges,” assigned on publication of a paper online, that detail what each author did for the work and that the authors can link to their profiles elsewhere on the Web.


Those organizations include publishers BioMed Central and the Public Library of Science; The Wellcome Trust research charity; software development groups Mozilla Science Lab (a group of researchers, developers, librarians, and publishers) and Digital Science (a software and technology firm); and ORCID, an effort to assign researchers digital identifiers. The collaboration presented its progress on the project at the Mozilla Festival in London that ended last week. (Mozilla is the open software community behind the Firefox browser and other programs.)
The infrastructure of the badges is still being established, with early prototypes scheduled to launch early next year, according to Amye Kenall, the journal development manager of open data initiatives and journals at BioMed Central. She envisions the badge process in the following way: Once an article is published, the publisher would alert software maintained by Mozilla to automatically set up an online form, where authors fill out roles using a detailed contributor taxonomy. After the authors have completed this, the badges would then appear next to their names on the journal article, and double-clicking on a badge would lead to the ORCID site for that particular author, where the author’s badges, integrated with their publishing record, live….
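A sketch of the kind of record such a badge system might hold for each contribution, assuming a simplified, invented role list in place of the detailed contributor taxonomy; the DOI and ORCID iD shown are example values only, not a real paper or author.

```python
from dataclasses import dataclass

# Illustrative role names only; the project's contributor taxonomy is more detailed.
CONTRIBUTOR_ROLES = {
    "conceived_study",
    "collected_data",
    "analysed_data",
    "wrote_manuscript",
    "obtained_funding",
}

@dataclass
class ContributionBadge:
    """One badge: a role a given author performed on a given paper."""
    article_doi: str   # example value used below; not a real paper
    orcid_id: str      # the author's ORCID identifier
    role: str

    def __post_init__(self):
        if self.role not in CONTRIBUTOR_ROLES:
            raise ValueError(f"unknown role: {self.role}")

    def orcid_url(self) -> str:
        """Where clicking the badge would lead: the author's ORCID page."""
        return f"https://orcid.org/{self.orcid_id}"

# Badges would be created once the post-publication form is filled in.
badge = ContributionBadge("10.1234/example.5678", "0000-0002-1825-0097", "analysed_data")
print(badge.orcid_url())
```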
The parties behind the digital badge effort are “looking to change behavior” of scientists in the competitive dog-eat-dog world of academia by acknowledging contributions, says Kaitlin Thaney, director of Mozilla Science Lab. Amy Brand, vice president of academic and research relations and VP of North America at Digital Science, says that the collaboration believes that the badges should be optional, to accommodate old-fashioned or less tech-savvy authors. She says that the digital credentials may improve lab culture, countering situations where junior scientists are caught up in lab politics and the “star,” who didn’t do much of the actual research apart from obtaining the funding, gets to be the first author of the paper and receive the most credit. “All of this calls out for more transparency,” Brand says….”

Urban Observatory Is Snapping 9,000 Images A Day Of New York City


FastCo-Exist: “Astronomers have long built observatories to capture the night sky and beyond. Now researchers at NYU are borrowing astronomy’s methods and turning their cameras towards Manhattan’s famous skyline.
NYU’s Center for Urban Science and Progress has been running what’s likely the world’s first “urban observatory” of its kind for about a year. From atop a tall building in downtown Brooklyn (NYU won’t say its address, due to security concerns), two cameras—one regular and one that captures infrared wavelengths—take panoramic images of lower and midtown Manhattan. One photo is snapped every 10 seconds. That’s 8,640 images a day, or more than 3 million since the project began (about 50 terabytes of data).

“The real power of the urban observatory is that you have this synoptic imaging. By synoptic imaging, I mean these large swaths of the city,” says the project’s chief scientist Gregory Dobler, a former astrophysicist at Harvard University and the University of California, Santa Barbara, who now heads the 15-person observatory team at NYU.
Dobler’s team is collaborating with New York City officials on the project, which is now expanding to set up stations that study other parts of Manhattan and Brooklyn. Its major goal is to discover information about the urban landscape that can’t be seen at other scales. Such data could lead to applications like tracking which buildings are leaking energy (with the infrared camera), or measuring occupancy patterns of buildings at night, or perhaps detecting releases of toxic chemicals in an emergency.
The video above is an example. The top panel cycles through a one-minute slice of observatory images. The bottom panel is an analysis of the same images in which everything that remains static in each image is removed, such as buildings, trees, and roads. What’s left is an imprint of everything in flux within the scene—the clouds, the cars on the FDR Drive, the boat moving down the East River, and, importantly, a plume of smoke that puffs out of a building.
“Periodically, a building will burp,” says Dobler. “It’s hard to see the puffs of smoke . . . but we can isolate that plume and essentially identify it.” (As Dobler has done by highlighting it in red in the top panel).
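A generic sketch of that kind of static-scene removal: treat the per-pixel median across a stack of frames as the background and keep only what deviates from it. This illustrates the idea with synthetic data; it is not the observatory team’s actual pipeline.

```python
import numpy as np

def isolate_transients(frames: np.ndarray, threshold: float = 20.0) -> np.ndarray:
    """
    frames: array of shape (n_frames, height, width), grayscale intensities.
    Returns a boolean mask per frame marking pixels that differ from the
    static background (the per-pixel median over the sequence).
    """
    background = np.median(frames, axis=0)   # buildings, roads, trees
    residual = np.abs(frames - background)   # what changes frame to frame
    return residual > threshold              # clouds, cars, boats, plumes

# Tiny synthetic example: 6 "frames" of a flat scene with a drifting bright blob.
rng = np.random.default_rng(0)
frames = rng.normal(100, 2, size=(6, 64, 64))
for t in range(6):
    frames[t, 10:14, 10 + 5 * t: 14 + 5 * t] += 80  # the "plume" moving right

mask = isolate_transients(frames)
print(mask.sum(axis=(1, 2)))  # roughly the blob area detected in each frame
```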
In response to the natural privacy concerns about this kind of program, Dobler emphasizes that the pictures come from an 8-megapixel camera (the same resolution found in the iPhone 6) and aren’t clear enough to see inside a window or make out individuals. As a further privacy safeguard, the images are analyzed only for “aggregate” measures—such as the patterns of nighttime energy usage—rather than for specific buildings. “We’re not really interested in looking at a given building, and saying, hey, these guys are particular offenders,” he says. (He also says the team is not looking at uses for the data in security applications.) However, Dobler was not able to answer a question as to whether the project’s partners at city agencies are able to access data analysis for individual buildings….

How Wikipedia Data Is Revolutionizing Flu Forecasting


Hickmann and co say their model has the potential to transform flu forecasting from a black art to a modern science as well-founded as weather forecasting.
Flu takes between 3,000 and 49,000 lives each year in the U.S., so an accurate forecast can have a significant impact on the way society prepares for the epidemic. The current method of monitoring flu outbreaks is somewhat antiquated. It relies on a voluntary system in which public health officials report the percentage of patients they see each week with influenza-like illness, defined as a temperature higher than 100 degrees, a cough, and no explanation other than flu.
These numbers give a sense of the incidence of flu at any instant but the accuracy is clearly limited. They do not, for example, account for people with flu who do not seek treatment or people with flu-like symptoms who seek treatment but do not have flu.
There is another significant problem. The network that reports this data is relatively slow. It takes about two weeks for the numbers to filter through the system so the data is always weeks old.
That’s why the CDC is interested in finding new ways to monitor the spread of flu in real time. Google, in particular, has used the number of searches for flu and flu-like symptoms to forecast flu in various parts of the world. That approach has had considerable success but also some puzzling failures. One problem, however, is that Google does not make its data freely available and this lack of transparency is a potential source of trouble for this kind of research.
So Hickmann and co have turned to Wikipedia. Their idea is that the variation in numbers of people accessing articles about flu is an indicator of the spread of the disease. And since Wikipedia makes this data freely available to any interested party, it is an entirely transparent source that is likely to be available for the foreseeable future….
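A toy sketch of the underlying idea: fit a simple least-squares relationship between weekly page-view counts for flu-related Wikipedia articles and the CDC’s reported ILI percentage, then use current page views, which arrive without the two-week reporting lag, to make a nowcast. The numbers below are invented, and the paper’s actual model is more sophisticated.

```python
import numpy as np

# Made-up weekly data: page views of flu-related articles and CDC ILI %.
pageviews = np.array([120_000, 150_000, 210_000, 340_000, 520_000, 480_000], dtype=float)
ili_percent = np.array([1.1, 1.4, 1.9, 2.8, 4.1, 3.8])

# Fit ili ≈ a * pageviews + b by ordinary least squares.
X = np.column_stack([pageviews, np.ones_like(pageviews)])
(a, b), *_ = np.linalg.lstsq(X, ili_percent, rcond=None)

# "Nowcast" the current week from page views alone, which are available
# immediately, unlike the official numbers that take about two weeks.
this_week_views = 600_000
print(f"predicted ILI: {a * this_week_views + b:.2f}%")
```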
Ref: arxiv.org/abs/1410.7716: Forecasting the 2013–2014 Influenza Season using Wikipedia”

The New Thing in Google Flu Trends Is Traditional Data


In the New York Times: “Google is giving its Flu Trends service an overhaul — “a brand new engine,” as it announced in a blog post on Friday.

The new thing is actually traditional data from the Centers for Disease Control and Prevention that is being integrated into the Google flu-tracking model. The goal is greater accuracy after the Google service had been criticized for consistently over-estimating flu outbreaks in recent years.

The main critique came in an analysis done by four quantitative social scientists, published earlier this year in an article in Science magazine, “The Parable of Google Flu: Traps in Big Data Analysis.” The researchers found that the most accurate flu predictor was a data mash-up that combined Google Flu Trends, which monitored flu-related search terms, with the official C.D.C. reports from doctors on influenza-like illness.

The Google Flu Trends team is heeding that advice. In the blog post, Christian Stefansen, a Google senior software engineer, wrote: “We’re launching a new Flu Trends model in the United States that — like many of the best performing methods in the literature — takes official CDC flu data into account as the flu season progresses.”
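One simple way to combine a timely search-based estimate with the slower official figures is an inverse-variance blend, sketched below with invented numbers. This is a generic illustration of mixing the two signals, not Google’s actual model.

```python
# Invented example: two estimates of this week's flu activity, one from
# search-query volume (timely but noisy) and one carried forward from the
# most recent CDC report (accurate but roughly two weeks old).
search_estimate = 4.8       # ILI %, inferred from flu-related searches
cdc_carried_forward = 3.9   # last official ILI %, now out of date

# Invented historical errors of each signal, used to weight them.
rmse_search, rmse_cdc = 1.2, 0.6

# Inverse-variance weighting: trust each source in proportion to its accuracy.
w_search = 1 / rmse_search ** 2
w_cdc = 1 / rmse_cdc ** 2
blended = (w_search * search_estimate + w_cdc * cdc_carried_forward) / (w_search + w_cdc)
print(f"blended ILI estimate: {blended:.2f}%")
```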

Google’s flu-tracking service has had its ups and downs. Its triumph came in 2009, when it gave an advance signal of the severity of the H1N1 outbreak, two weeks or so ahead of official statistics. In a 2009 article in Nature explaining how Google Flu Trends worked, the company’s researchers did, as the Friday post notes, say that the Google service was not intended to replace official flu surveillance methods and that it was susceptible to “false alerts” — anything that might prompt a surge in flu-related search queries.

Yet those caveats came a couple of pages into the Nature article. And Google Flu Trends became a symbol of the superiority of the new, big data approach — computer algorithms mining data trails for collective intelligence in real time. To enthusiasts, it seemed so superior to the antiquated method of collecting health data that involved doctors talking to patients, inspecting them and filing reports.

But Google’s flu service greatly overestimated the number of cases in the United States in the 2012-13 flu season — a well-known miss — and, according to the research published this year, has persistently overstated flu cases over the years. In the Science article, the social scientists called it “big data hubris.”

Governing the Smart, Connected City


Blog by Susan Crawford at HBR: “As politics at the federal level becomes increasingly corrosive and polarized, with trust in Congress and the President at historic lows, Americans still celebrate their cities. And cities are where the action is when it comes to using technology to thicken the mesh of civic goods — more and more cities are using data to animate and inform interactions between government and citizens to improve wellbeing.
Every day, I learn about some new civic improvement that will become possible when we can assume the presence of ubiquitous, cheap, and unlimited data connectivity in cities. Some of these are made possible by the proliferation of smartphones; others rely on the increasing number of internet-connected sensors embedded in the built environment. In both cases, the constant is data. (My new book, The Responsive City, written with co-author Stephen Goldsmith, tells stories from Chicago, Boston, New York City and elsewhere about recent developments along these lines.)
For example, with open fiber networks in place, sending video messages will become as accessible and routine as sending email is now. Take a look at rhinobird.tv, a free lightweight, open-source video service that works in browsers (no special download needed) and allows anyone to create a hashtag-driven “channel” for particular events and places. A debate or protest could be viewed from a thousand perspectives. Elected officials and public employees could easily hold streaming, virtual town hall meetings.
Given all that video and all those livestreams, we’ll need curation and aggregation to make sense of the flow. That’s why visualization norms, still in their infancy, will become a greater part of literacy. When the Internet Archive attempted late last year to “map” 400,000 hours of television news against worldwide locations, it came up with pulsing blobs of attention. Although visionary Kevin Kelly has been talking about data visualization as a new form of literacy for years, city governments still struggle with presenting complex and changing information in standard, easy-to-consume ways.
Plenar.io is one attempt to resolve this. It’s a platform developed by former Chicago Chief Data Officer Brett Goldstein that allows public datasets to be combined and mapped with easy-to-see relationships among weather and crime, for example, on a single city block. (A sample question anyone can ask of Plenar.io: “Tell me the story of 700 Howard Street in San Francisco.”) Right now, Plenar.io’s visual norm is a map, but it’s easy to imagine other forms of presentation that could become standard. All the city has to do is open up its widely varying datasets…”
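A rough sketch of the kind of combination Plenar.io makes routine: joining two public datasets on a shared date, restricted to a single block. The tables, columns and values below are invented stand-ins, not Plenar.io’s actual API or schema.

```python
import pandas as pd

# Tiny stand-ins for two open city datasets (invented values).
crime = pd.DataFrame({
    "block": ["700 HOWARD ST"] * 3 + ["800 MARKET ST"],
    "date": pd.to_datetime(["2014-06-01", "2014-06-02", "2014-06-03", "2014-06-01"]),
    "incidents": [2, 0, 1, 3],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2014-06-01", "2014-06-02", "2014-06-03"]),
    "temp_f": [61, 64, 58],
    "precip_in": [0.0, 0.1, 0.4],
})

# "Tell me the story of 700 Howard Street": align that block's incidents
# with the weather on the same days.
block = crime[crime["block"] == "700 HOWARD ST"]
story = block.merge(weather, on="date", how="left").sort_values("date")
print(story[["date", "incidents", "temp_f", "precip_in"]])
```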

Law is Code: A Software Engineering Approach to Analyzing the United States Code


New Paper by William Li, Pablo Azar, David Larochelle, Phil Hill & Andrew Lo: “The agglomeration of rules and regulations over time has produced a body of legal code that no single individual can fully comprehend. This complexity produces inefficiencies, makes the processes of understanding and changing the law difficult, and frustrates the fundamental principle that the law should provide fair notice to the governed. In this article, we take a quantitative, unbiased, and software-engineering approach to analyze the evolution of the United States Code from 1926 to today. Software engineers frequently face the challenge of understanding and managing large, structured collections of instructions, directives, and conditional statements, and we adapt and apply their techniques to the U.S. Code over time. Our work produces insights into the structure of the U.S. Code as a whole, its strengths and vulnerabilities, and new ways of thinking about individual laws. For example, we identify the first appearance and spread of important terms in the U.S. Code like “whistleblower” and “privacy.” We also analyze and visualize the network structure of certain substantial reforms, including the Patient Protection and Affordable Care Act (PPACA) and the Dodd-Frank Wall Street Reform and Consumer Protection Act, and show how the interconnections of references can increase complexity and create the potential for unintended consequences. Our work is a timely illustration of computational approaches to law as the legal profession embraces technology for scholarship, to increase efficiency, and to improve access to justice.”
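A toy sketch of two of the analyses described: finding the edition in which a term first appears, and building a directed graph of cross-references between sections. The miniature “editions” and section labels are invented; the paper works over the full U.S. Code from 1926 onward.

```python
import re
import networkx as nx

# Invented miniature "editions" of a code: year -> {section: text}.
editions = {
    1926: {"s1": "General provisions.",
           "s2": "Duties of officers; see section s1."},
    1978: {"s1": "General provisions.",
           "s2": "Duties of officers; see section s1.",
           "s3": "Protection of any whistleblower who reports waste; see section s2."},
}

def first_appearance(term: str):
    """Earliest edition year in which a term occurs anywhere in the code."""
    for year in sorted(editions):
        if any(term in text.lower() for text in editions[year].values()):
            return year
    return None

def reference_graph(year: int) -> nx.DiGraph:
    """Directed graph with an edge s -> t when section s cites section t."""
    g = nx.DiGraph()
    for section, text in editions[year].items():
        g.add_node(section)
        for target in re.findall(r"see section (\w+)", text):
            g.add_edge(section, target)
    return g

print(first_appearance("whistleblower"))    # 1978 in this toy example
print(list(reference_graph(1978).edges()))  # [('s2', 's1'), ('s3', 's2')]
```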

Research Handbook On Transparency


New book edited by Padideh Ala’i and Robert G. Vaughn: ‘”Transparency” has multiple, contested meanings. This broad-ranging volume accepts that complexity and thoughtfully contrasts alternative views through conceptual pieces, country cases, and assessments of policies–such as freedom of information laws, whistleblower protections, financial disclosure, and participatory policymaking procedures.’
– Susan Rose-Ackerman, Yale University Law School, US
In the last two decades transparency has become a ubiquitous and stubbornly ambiguous term. Typically understood to promote rule of law, democratic participation, anti-corruption initiatives, human rights, and economic efficiency, transparency can also legitimate bureaucratic power, advance undemocratic forms of governance, and aid in global centralization of power. This path-breaking volume, comprising original contributions on a range of countries and environments, exposes the many faces of transparency by allowing readers to see the uncertainties, inconsistencies and surprises contained within the current conceptions and applications of the term….
The expert contributors identify the goals, purposes and ramifications of transparency while presenting both its advantages and shortcomings. Through this framework, they explore transparency from a number of international and comparative perspectives. Some chapters emphasize cultural and national aspects of the issue, with country-specific examples from China, Mexico, the US and the UK, while others focus on transparency within global organizations such as the World Bank and the WTO. A number of relevant legal considerations are also discussed, including freedom of information laws, financial disclosure of public officials and whistleblower protection…”

Mapping the Age of Every Building in Manhattan


Kriston Capps at CityLab: “The Harlem Renaissance was the epicenter of new movements in dance, poetry, painting, and literature, and its impact still registers in all those art forms. If you want to trace the Harlem Renaissance, though, best look to Harlem itself.
Many if not most of the buildings in Harlem today rose between 1900 and 1940—and a new mapping tool called Urban Layers reveals exactly where and when. Harlem boasts very few of the oldest buildings in Manhattan today, but it does represent the island’s densest concentration of buildings constructed during the Great Migration.
Thanks to Morphocode’s Urban Layers, it’s possible to locate nearly every 19th-century building still standing in Manhattan today. That’s just one of the things that you can isolate with the map, which combines two New York City building datasets (PLUTO and Building Footprints) and Mapbox GL JS vector technology to generate an interactive architectural history.
So, looking specifically at Harlem again (with some of the Upper West Side thrown in for good measure), it’s easy to see that very few of the buildings that went up between 1765 and 1860 still stand today….”
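A minimal sketch of the filter behind such a map, assuming a PLUTO-style table with a year-built column; the rows and column names are invented placeholders.

```python
import pandas as pd

# Invented rows standing in for a PLUTO-style extract (one row per lot),
# with the year-built column the Urban Layers map filters on.
buildings = pd.DataFrame({
    "address": ["12 OLD ST", "34 BROWNSTONE AVE", "56 TENEMENT PL", "78 MIDCENTURY BLVD"],
    "yearbuilt": [1842, 1895, 1921, 1958],
})

# Buildings that went up between 1765 and 1860 and still stand.
pre_1860 = buildings[buildings["yearbuilt"].between(1765, 1860)]

# Share of the stock built in the Great Migration decades (1900-1940),
# the band that dominates Harlem in the map.
share_1900_1940 = buildings["yearbuilt"].between(1900, 1940).mean()

print(pre_1860)
print(f"{share_1900_1940:.0%} built 1900-1940")
```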