Automation Beyond the Physical: AI in the Public Sector


Ben Miller at Government Technology: “…The technology is, by nature, broadly applicable. If a thing involves data — “data” itself being a nebulous word — then it probably has room for AI. AI can help manage the data, analyze it and find patterns that humans might not have thought of. When it comes to big data, or data sets so big that they become difficult for humans to manually interact with, AI leverages the speedy nature of computing to find relationships that might otherwise be proverbial haystack needles.

One early area of government application is in customer service chatbots. As state and local governments started putting information on websites in the past couple of decades, they found that they could use those portals as a means of answering questions that constituents used to have to call an office to ask.

Ideally that results in a cyclical victory: Government offices didn’t have as many calls to answer, so they could devote more time and resources to other functions. And when somebody did call in, their call might be answered faster.

With chatbots, governments are betting they can answer even more of those questions. When he was the chief technology and innovation officer of North Carolina, Eric Ellis oversaw the setup of a system that did just that for IT help desk calls.

Turned out, more than 80 percent of the help desk’s calls were people who wanted to change their passwords. For something like that, where the process is largely the same each time, a bot can speed up the process with a little help from AI. Then, just like with the government Web portal, workers are freed up to respond to the more complicated calls faster….
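
To make the pattern concrete, here is a minimal sketch of the kind of intent routing such a help-desk bot might perform. The keyword lists and canned responses are hypothetical illustrations, not details of North Carolina’s actual system:

```python
# A minimal sketch of intent routing for an IT help-desk bot.
# Keywords and responses are invented for illustration.

INTENT_KEYWORDS = {
    "password_reset": {"password", "reset", "locked", "login"},
    "vpn_help": {"vpn", "remote", "connection"},
}

def classify(message: str) -> str:
    """Pick the intent whose keyword set best overlaps the message."""
    words = set(message.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "human_agent"

def respond(message: str) -> str:
    """Automate the routine case; hand everything else to a person."""
    intent = classify(message)
    if intent == "password_reset":
        return "I can reset that. Please confirm your employee ID."
    if intent == "vpn_help":
        return "Let's check your VPN settings first."
    return "Connecting you to a staff member."

print(respond("I'm locked out and need to reset my password"))
```

A production bot would use a trained language model rather than keyword overlap, but the division of labor is the same: the repetitive majority of requests gets automated, and everything else falls through to a human.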

Others are using AI to recognize and report objects in photographs and videos — guns, waterfowl, cracked concrete, pedestrians, semi-trucks, everything. Others are using AI to help translate between languages dynamically. Some want to use it to analyze the tone of emails. Some are using it to try to keep up with cybersecurity threats even as they morph and evolve. After all, if AI can learn to beat professional poker players, then why can’t it learn how digital black hats operate?

Castro sees another use for the technology, a more introspective one. The problem is this: The government workforce is a lot older than the private sector, and that can make it hard to create culture change. According to U.S. Census Bureau data, about 27 percent of public-sector workers are millennials, compared with 38 percent in the private sector.

“The traditional view [of government work] is you fill out a lot of forms, there are a lot of boring meetings. There’s a lot of bureaucracy in government,” Castro said. “AI has the opportunity to change a lot of that, things like filling out forms … going to routine meetings and stuff.”

As AI becomes more and more ubiquitous, people who work both inside and with government are coming up with an ever-expanding list of ways to use it. Here’s an inexhaustive list of specific use cases — some of which are already up and running and some of which are still just ideas….(More)”.

The Mobility Space Report: What the Street!?


What the Street!? grew out of the question “How do new and old mobility concepts change our cities?”, raised by Michael Szell and Stephan Bogner during their residency at moovel lab. With the support of the lab team, they set out to wrangle data from cities around the world to develop and design this unique Mobility Space Report.

What the Street!? was built from open-source software and resources. Thanks to the OpenStreetMap contributors and many other open projects, we could put together the puzzle of urban mobility space….

If you take a snapshot of Berlin from space at a typical time of day, you see 60,000 cars on the streets and 1,200,000 cars parked. Why are so many cars parked? Because cars are used only 36 minutes per day; more than 95% of the time they just stand around unused. In Berlin, these 1.2 million parking spots take up the area of 64,000 playgrounds, or the area of 4 Central Parks.

If you look around the world, wasted public space is not peculiar to Berlin – many cities have the same problem. But why is so much space wasted in the first place? How “fairly” is space distributed among other forms of mobility like bikes and trams? Is there an arrogance of space? If so, how could we raise awareness or even improve the situation?

Who “owns” the city?

Let us first look at how much space there is in a city for moving around, and how it is allocated between bikes, rails, and cars. With What the Street!? – The Mobility Space Report, we set out to provide a public tool for exploring this urban mobility space and to answer our questions systematically, interactively, and above all, in a fun way. Inspired by recently developed techniques in data visualization of unrolling, packing, and ordering irregular shapes, we packed and rolled all mobility spaces into rectangular bins to visualize the areas they take up.

How do you visualize the total area taken by parking spaces? – You pack them tightly.
How do you visualize the total area taken by streets and tracks? – You roll them up tightly.…(More)”.
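
As a rough illustration of the “pack them tightly” step, the sketch below shelf-packs identical rectangles into a fixed-width bin. It is a simplified stand-in for the irregular-shape packing the project actually used, and the dimensions are invented rather than real city data:

```python
# Minimal shelf-packing sketch: place (width, height) rectangles row by row
# into a fixed-width bin. Dimensions are illustrative, not real city data.

def shelf_pack(rects, bin_width):
    """Pack rectangles left to right in shelves; return placements and height."""
    placements, x, y, shelf_h = [], 0.0, 0.0, 0.0
    for w, h in sorted(rects, key=lambda r: -r[1]):  # tallest first
        if x + w > bin_width:                        # shelf full: start a new one
            x, y, shelf_h = 0.0, y + shelf_h, 0.0
        placements.append((x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return placements, y + shelf_h

# 1,000 parking spots of 2 m x 5 m packed into a 100 m-wide bin:
spots = [(2.0, 5.0)] * 1000
placed, height = shelf_pack(spots, bin_width=100.0)
print(f"Packed {len(placed)} spots into a 100 m x {height:.0f} m rectangle")
```

Shelf packing is crude next to what the visualization required, but it shows the core move: turning thousands of scattered parcels of space into one rectangle whose total area you can grasp at a glance.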

Massive Ebola data site planned to combat outbreaks


Amy Maxmen at Nature: “More than 11,000 people died when Ebola tore through West Africa between 2014 and 2016, and yet clinicians still lack data that would enable them to reliably identify the disease when a person first walks into a clinic. To fill that gap and others before the next outbreak hits, researchers are developing a platform to organize and share Ebola data that have so far been scattered beyond reach.

The information system is coordinated by the Infectious Diseases Data Observatory (IDDO), an international research network based at the University of Oxford, UK, and is expected to launch by the end of the year. …

During the outbreak, for example, a widespread rumour claimed that the plague was an experiment conducted by the West, which led some people to resist going to clinics and helped Ebola to spread.

Merson and her collaborators want to avoid the kind of data fragmentation that hindered efforts to stop the outbreak in Liberia, Guinea and Sierra Leone. As the Ebola crisis was escalating in October 2014, she visited treatment units in the three countries to advise on research. Merson found tremendous variation in practices, which complicated attempts to merge and analyse the information. For instance, some record books listed lethargy and hiccups as symptoms, whereas others recorded fatigue but not hiccups.
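
The remedy for that kind of fragmentation is a harmonization layer that maps each site’s ad-hoc labels onto one shared vocabulary. Here is a minimal sketch; the label variants, the mapping, and the sample records are all invented for illustration rather than taken from the IDDO platform:

```python
# Sketch of harmonizing site-specific symptom labels to a shared vocabulary.
# The mapping below (e.g., treating "lethargy" as "fatigue") is one invented,
# defensible choice, not the IDDO platform's actual schema.

CANONICAL = {
    "lethargy": "fatigue",
    "fatigue": "fatigue",
    "hiccups": "hiccups",
    "hiccoughs": "hiccups",
}

def harmonize(record: dict) -> dict:
    """Rewrite one site's record against the shared symptom vocabulary."""
    mapped = {CANONICAL[s] for s in record["symptoms"] if s in CANONICAL}
    unmapped = [s for s in record["symptoms"] if s not in CANONICAL]
    return {"site": record["site"], "symptoms": sorted(mapped), "unmapped": unmapped}

site_a = {"site": "A", "symptoms": ["lethargy", "hiccups"]}
site_b = {"site": "B", "symptoms": ["fatigue", "fever"]}  # never recorded hiccups

for rec in (site_a, site_b):
    print(harmonize(rec))
```

Note what such a mapping cannot do: if a site never recorded hiccups at all, no relabeling will recover that information, which is precisely the argument for agreeing on collection practices before the next outbreak.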

“People were just collecting what they could,” she recalls. Non-governmental organizations “were keeping their data private; academics take a year to get it out; and West Africa had set up surveillance but they were siloed from the international systems”, she says. …

In July 2015, the IDDO received pilot funds from the UK charity the Wellcome Trust to pool anonymized data from the medical records of people who contracted Ebola — and those who survived it — as well as data from clinical trials and public health projects during outbreaks in West Africa, Uganda and the Democratic Republic of Congo. The hope is that a researcher could search for data to help in diagnosing, treating and understanding the disease. The platform would also provide a home for new data as they emerge. A draft research agenda lists questions that the information might answer, such as how long the virus can survive outside the human body, and what factors are associated with psychological issues in those who survive Ebola.

One sensitive issue is deciding who will control the data. …It’s vital that these discussions happen now, in a period of relative calm, says Jeremy Farrar, director of the Wellcome Trust in London. When the virus emerges again, clinicians, scientists, and regulatory boards will need fast access to data so as not to repeat mistakes made last time. “We need to sit down and make sure we have a data platform in place so that we can respond to a new case of Ebola in hours and days, and not in months and years,” he says. “A great danger is that the world will move on and forget the horror of Ebola in West Africa.”…(More)”

Data-Driven Policy Making: The Policy Lab Approach


Paper by Anne Fleur van Veenstra and Bas Kotterink: “Societal challenges such as migration, poverty, and climate change can be considered ‘wicked problems’ for which no optimal solution exists. To address such problems, public administrations increasingly aim for data-driven policy making. Data-driven policy making aims to make optimal use of sensor data and to collaborate with citizens to co-create policy. However, few public administrations have realized this so far. Therefore, in this paper an approach for data-driven policy making is developed that can be used in the setting of a Policy Lab. A Policy Lab is an experimental environment in which stakeholders collaborate to develop and test policy. Based on the literature, we first identify innovations in data-driven policy making. Subsequently, we map these innovations to the stages of the policy cycle. We found that most innovations are concerned with using new data sources in traditional statistics and that methodologies capturing the benefits of data-driven policy making are still under development. Further research should focus on policy experimentation while developing new methodologies for data-driven policy making at the same time….(More)”.

Dictionaries and crowdsourcing, wikis and user-generated content


Living Reference Work Entry by Michael Rundel: “It is tempting to dismiss crowdsourcing as a largely trivial recent development which has nothing useful to contribute to serious lexicography. This temptation should be resisted. When applied to dictionary-making, the broad term “crowdsourcing” in fact describes a range of distinct methods for creating or gathering linguistic data. A provisional typology is proposed, distinguishing three approaches which are often lumped under the heading “crowdsourcing.” These are: user-generated content (UGC), the wiki model, and what is referred to here as “crowd-sourcing proper.” Each approach is explained, and examples are given of their applications in linguistic and lexicographic projects. The main argument of this chapter is that each of these methods – if properly understood and carefully managed – has significant potential for lexicography. The strengths and weaknesses of each model are identified, and suggestions are made for exploiting them in order to facilitate or enhance different operations within the process of developing descriptions of language. Crowdsourcing – in its various forms – should be seen as an opportunity rather than as a threat or diversion….(More)”.

The Case for Sharing All of America’s Data on Mosquitoes


Ed Yong in the Atlantic: “The U.S. is sitting on one of the largest data sets on any animal group, but most of it is inaccessible and restricted to local agencies….For decades, agencies around the United States have been collecting data on mosquitoes. Biologists set traps, dissect captured insects, and identify which species they belong to. They’ve done this for millions of mosquitoes, creating an unprecedented trove of information—easily one of the biggest long-term attempts to monitor any group of animals, if not the very biggest.

The problem, according to Micaela Elvira Martinez from Princeton University and Samuel Rund from the University of Notre Dame, is that this treasure trove of data isn’t all in the same place, and only a small fraction of it is public. The rest is inaccessible, hoarded by local mosquito-control agencies around the country.

Currently, these agencies can use their data to check if their attempts to curtail mosquito populations are working. Are they doing enough to remove stagnant water, for example? Do they need to spray pesticides? But if they shared their findings, Martinez and Rund say that scientists could do much more. They could better understand the ecology of these insects, predict the spread of mosquito-borne diseases like dengue fever or Zika, coordinate control efforts across states and counties, and quickly spot the arrival of new invasive species.

That’s why Martinez and Rund are now calling for the creation of a national database of mosquito records that anyone can access. “There’s a huge amount of taxpayer investment and human effort that goes into setting traps, checking them weekly, dissecting all those mosquitoes under a microscope, and tabulating the data,” says Martinez. “It would be a big bang for our buck to collate all that data and make it available.”

Martinez is a disease modeler—someone who uses real-world data to build simulations that reveal how infections rise, spread, and fall. She typically works with childhood diseases like measles and polio, where researchers are almost spoiled for data. Physicians are legally bound to report any cases, and the Centers for Disease Control and Prevention (CDC) compiles and publishes this information as a weekly report.

The same applies to cases of mosquito-borne diseases like dengue and Zika, but not to populations of the insects themselves. So, during last year’s Zika epidemic, when Martinez wanted to study the Aedes aegypti mosquito that spreads the disease, she had a tough time. “I was really surprised that I couldn’t find data on Aedes aegypti numbers,” she says. Her colleagues explained that scientists use climate variables like temperature and humidity to predict where mosquitoes are going to be abundant. That seemed ludicrous to her, especially since organizations collect information on the actual insects. It’s just that no one ever gathers those figures together….

Together with Rund and a team of undergraduate students, she found that there are more than 1,000 separate agencies in the United States that collect mosquito data—at least one in every county or jurisdiction. Only 152 agencies make their data publicly available in some way. The team collated everything they could find since 2009, and ended up with information about more than 15 million mosquitoes. Imagine what they’d have if all the datasets were open, especially since some go back decades.
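
What would a shared national database need at minimum? A common record format and an agreed vocabulary for species, places, and dates. The sketch below is hypothetical; the field names and sample rows are invented, not Martinez and Rund’s actual schema:

```python
# Hypothetical shared schema for mosquito trap records, with a toy
# aggregation across agencies. Field names and rows are invented.

from dataclasses import dataclass
from collections import Counter

@dataclass
class TrapRecord:
    agency: str    # reporting mosquito-control agency
    county: str
    date: str      # ISO date the trap was checked
    species: str   # e.g., "Aedes aegypti"
    count: int     # mosquitoes of that species in the trap

records = [
    TrapRecord("Agency-1", "Harris, TX", "2016-07-04", "Aedes aegypti", 17),
    TrapRecord("Agency-2", "Dade, FL", "2016-07-05", "Aedes aegypti", 42),
    TrapRecord("Agency-2", "Dade, FL", "2016-07-05", "Culex pipiens", 8),
]

# With one schema, "where is Aedes aegypti abundant?" is a simple aggregation:
by_county = Counter()
for r in records:
    if r.species == "Aedes aegypti":
        by_county[r.county] += r.count
print(by_county.most_common())
```

With a format like this in place, the question Martinez could not answer during the Zika epidemic becomes a one-line query instead of a round of phone calls to a thousand separate agencies.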

A few mosquito-related databases do exist, but none are quite right. ArboNET, which is managed by the CDC and state health departments, mainly stores data about mosquito-borne diseases, and whatever information it has on the insects themselves isn’t precise enough in either time or space to be useful for modeling. MosquitoNET, which was developed by the CDC, does track mosquitoes, but “it’s a completely closed system, and hardly anyone has access to it,” says Rund. The Smithsonian Institution’s VectorMap is better in that it’s accessible, “but it lacks any real-time data from the continental United States,” says Rund. “When I checked a few months ago, it had just one record of Aedes aegypti since 2013.”…

Some scientists who work on mosquito control apparently disagree, and negative reviews have stopped Martinez and Rund from publishing their ideas in prominent academic journals. (For now, they’ve uploaded a paper describing their vision to the preprint repository bioRxiv.) “Some control boards say: What if people want to sue us because we’re showing that they have mosquito vectors near their homes, or if their house prices go down?” says Martinez. “And one mosquito-control scientist told me that no one should be able to work with mosquito data unless they’ve gone out and trapped mosquitoes themselves.”…

“Data should be made available without having to justify exactly what’s going to be done with it,” Martinez says. “We should put it out there for scientists to start unlocking it. I think there are a ton of biologists who will come up with cool things to do.”…(More)”.

Debating big data: A literature review on realizing value from big data


Wendy Arianne Günther et al in The Journal of Strategic Information Systems: “Big data has been considered to be a breakthrough technological development over recent years. Notwithstanding, we have as yet limited understanding of how organizations translate its potential into actual social and economic value. We conduct an in-depth systematic review of IS literature on the topic and identify six debates central to how organizations realize value from big data, at different levels of analysis. Based on this review, we identify two socio-technical features of big data that influence value realization: portability and interconnectivity. We argue that, in practice, organizations need to continuously realign work practices, organizational models, and stakeholder interests in order to reap the benefits from big data. We synthesize the findings by means of an integrated model….(More)”.

From Katrina To Harvey: How Disaster Relief Is Evolving With Technology


Cale Guthrie Weissman at Fast Company: “Open data may sound like a nerdy thing, but this weekend has proven it’s also a lifesaver in more ways than one.

As Hurricane Harvey pelted the southern coast of Texas, a local open-data resource helped provide accurate and up-to-date information to the state’s residents. Inside Harris County’s intricate bayou system–intended to both collect water and effectively drain it–gauges were installed to sense when water is overflowing. The sensors transmit the data to a website, which has become a vital go-to for Houston residents….
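
The mechanics behind such a site are simple enough to sketch. Assuming a hypothetical JSON feed of gauge readings (Harris County’s real service will differ in its URL and field names), flagging overflowing gauges takes only a few lines:

```python
# Sketch of flagging overflowing gauges from a sensor feed. The endpoint
# and JSON field names here are hypothetical, not Harris County's real API.

import json
from urllib.request import urlopen

FEED_URL = "https://example.org/gauges.json"  # hypothetical feed

def overflowing_gauges(feed_url: str) -> list:
    """Return gauges whose water level is at or above their flood stage."""
    with urlopen(feed_url) as resp:
        gauges = json.load(resp)  # expected: a list of gauge dicts
    return [g for g in gauges if g["level_ft"] >= g["flood_stage_ft"]]

# Example use, once pointed at a real feed:
# for g in overflowing_gauges(FEED_URL):
#     print(f'{g["name"]}: {g["level_ft"]} ft (flood stage {g["flood_stage_ft"]} ft)')
```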

This open access to flood gauges is just one of the many ways new tech-driven projects have helped improve responses to disasters over the years. “There’s no question that technology has played a much more significant role,” says Lemaitre, “since even Hurricane Sandy.”

While the response to Sandy in 2012 was noted for connecting people through Twitter hashtags and other relatively nascent social apps like Instagram, the last few years have brought a paradigm shift in how emergency relief organizations integrate technology into their responses….

Social media isn’t just for the residents. Local and national agencies–including FEMA–rely on this information and are using it to help create faster and more effective disaster responses. Following the Hurricane Katrina disaster, FEMA has worked over the last decade to revamp its culture and its methods for reacting to these sorts of situations. “You’re seeing the federal government adapt pretty quickly,” says Lemaitre.

There are a few examples of this. For instance, FEMA now has an app to push necessary information about disaster preparedness. The agency also employs people to cull the open web for information that would help make its efforts better and more effective. These “social listeners” look at all the available Facebook, Snapchat, and other social media posts in aggregate. Crews are brought on during disasters to gather intelligence, and then report about areas that need relief efforts–getting “the right information to the right people,” says Lemaitre.

There’s also been a change in how this information is used. Often, when disasters are predicted, people send supplies to the affected areas as a way to try to help out. Yet they don’t know exactly where to send them, and local organizations sometimes become inundated. This creates a huge logistical nightmare for relief organizations that are sitting on thousands of blankets and tarps in one place when they should be actively dispersing them across hundreds of miles.

“Before, you would just have a deluge of things dropped on top of a disaster that weren’t particularly helpful at times,” says Lemaitre. Now people are using sites like Facebook to ask where they should direct the supplies. For example, after a bad flood in Louisiana last year, a woman announced she had food and other necessities on Facebook and was able to direct the supplies to an area in need. This, says Lemaitre, is “the most effective way.”

Put together, Lemaitre has seen agencies evolve with technology to create better systems for quicker disaster relief. This has also created a culture of learning, updating, and reacting in real time. Meanwhile, more data is becoming open, which is helping people and agencies alike. (The National Weather Service, which has long trumpeted its open data for all, has become a revered stalwart for such information, and has already proven indispensable in Houston.)

Most important, the pace of technology has caused organizations to change their own procedures. Twelve years ago, during Katrina, the protocol was to wait for an assessment before deploying any assistance. Now organizations like FEMA know that just doesn’t work. “You can’t afford to lose time,” says Lemaitre. “Deploy as much as you can and be fast about it–you can always scale back.”

It’s important to note that, even with rapid technological improvements, there’s no way to compare one disaster response to another–it’s simply not apples to apples. All the same, organizations are still learning about where they should be looking and how to react, connecting people to their local communities when they need them most….(More)”.

Bridging Governments’ Borders


Robyn Scott & Lisa Witter at SSIR: “…Our research found that “disconnection” falls into five negatively reinforcing categories in the public sector; a closer look at these categories may help policymakers see the challenge before them more clearly:

1. Disconnected Governments

There is a truism in politics and government that all policy is local and context-dependent. Whether this was ever an accurate statement is questionable; it is certainly no longer. While all policy must ultimately be customized for local conditions, it is absurd to assume there is little or nothing to learn from other countries. Three trends, in fact, indicate that solutions will become increasingly fungible between countries…..

2. Disconnected Issues

What climate change policy can endure without a job-creation strategy? What sensible criminal justice reform does not consider education? Yet even within countries, departments and their employees often remain as foreign to each other as do nations….

3. Disconnected Public Servants

The isolation of governments, and of government departments, is caused by and reinforces the isolation of people working in government, who have few incentives—and plenty of disincentives—to share what they are working on…..

4. Disconnected Citizens

…There are areas of increasingly visible progress in bridging the disconnections of government, citizen engagement being one. We’re still in the early stages, but private sector fashions such as human-centered design and design thinking have become government buzzwords. And platforms enabling new types of citizen engagement—from participatory budgeting to apps that people use to report potholes—are increasingly popping up around the world…..

5. Disconnected Ideas

According to the World Bank’s own data, one third of its reports are never read, even once. Foundations and academia pour tens of millions of dollars into policy research with few targeted channels to reach policymakers; they also tend to produce and deliver information in formats that policymakers don’t find useful. People in government, like everyone else, are frequently on their mobile phones, and short of time….(More)”


What does it mean to be differentially private?


Paul Francis at IAPP: “Back in June 2016, Apple announced it will use differential privacy to protect individual privacy for certain data that it collects. Though already a hot research topic for over a decade, this announcement introduced differential privacy to the broader public. Before that announcement, Google had already been using differential privacy for collecting Chrome usage statistics. And within the last month, Uber announced that they too are using differential privacy.

If you’ve done a little homework on differential privacy, you may have learned that it provides provable guarantees of privacy and concluded that a database that is differentially private is, well, private — in other words, that it protects individual privacy. But that isn’t necessarily the case. When someone says, “a database is differentially private,” they don’t mean that the database is private. Rather, they mean, “the privacy of the database can be measured.”
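
What gets measured is how much any one person’s record can shift the system’s outputs. Formally (this is the standard definition from the research literature, not anything specific to Apple’s, Google’s, or Uber’s systems), a randomized mechanism M is ε-differentially private if, for every pair of datasets D and D′ that differ in a single record, and every set S of possible outputs:

```latex
% epsilon-differential privacy (standard definition):
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```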

Really, it is like saying that “a bridge is weight limited.” If you know the weight limit of a bridge, then yes, you can use the bridge safely. But the bridge isn’t safe under all conditions. You can exceed the weight limit and hurt yourself.

The weight limit of bridges is expressed in tons, kilograms or number of people. Simplifying here a bit, the amount of privacy afforded by a differentially private database is expressed as a number, by convention labeled ε (epsilon). Lower ε means more private.

All bridges have a weight limit. Everybody knows this, so it sounds dumb to say, “a bridge is weight limited.” And guess what? All databases are differentially private. Or, more precisely, all databases have an ε. A database with no privacy protections at all has an ε of infinity. It is pretty misleading to call such a database differentially private, but mathematically speaking, it is not incorrect to do so. A database that can’t be queried at all has an ε of zero. Private, but useless.

In their paper on differential privacy for statistics, Cynthia Dwork and Adam Smith write, “The choice of ε is essentially a social question. We tend to think of ε as, say, 0.01, 0.1, or in some cases, ln 2 or ln 3.” The natural logarithm of 3 (ln 3) is around 1.1….(More)”.
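
To see what those ε values do in practice, here is a minimal sketch of the textbook Laplace mechanism for a counting query. This is the standard construction from the differential-privacy literature, not the specific machinery Apple, Google, or Uber deploy:

```python
# Textbook Laplace mechanism for a counting query (a sketch, not any
# vendor's implementation). A count has sensitivity 1: adding or removing
# one person changes it by at most 1, so noise of scale 1/epsilon suffices.

import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count: int, epsilon: float) -> float:
    """Answer a count query with epsilon-differential privacy."""
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(0)
for eps in (0.01, 0.1, math.log(3)):  # the values Dwork and Smith mention
    answers = [round(private_count(1000, eps), 1) for _ in range(3)]
    print(f"epsilon={eps:.2f}: {answers}")
```

At ε = 0.01 the noisy answers wander hundreds away from the true count of 1,000; at ln 3 ≈ 1.1 they stay within a few. That gap is the “social question” Dwork and Smith describe: how much distortion to accept in exchange for how much privacy.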