IBM using Watson to build a “SIRI for Cities”


 at FastCompany: “A new app that incorporates IBM’s Watson cognitive computing platform is like Siri for ordering city services.

IBM said today that the city of Surrey, in British Columbia, Canada, has rolled out the new app, which leverages Watson’s sophisticated language and data analysis system to allow residents to make requests for things like finding out why their trash wasn’t picked up or how to find a lost cat using natural language.

Watson is best known as the computer system that autonomously vanquished the world’s best Jeopardy players during a highly publicized competition in 2011. In the years since, IBM has applied the system to a wide range of computing problems in industries like health care, banking, retail, and education. The system is based on Watson’s ability to understand natural language queries and to analyze huge data sets.

Recently, Watson rolled out a tool designed to help people detect the tone in their writing.

Surrey worked with the developer Purple Forge to build the new city services app, which will be combined with the city’s existing “My Surrey” mobile and web tools. IBM said that residents can ask a wide range of questions on devices like smartphones, laptops, or even Apple Watches. Big Blue said Surrey’s app is the first time Watson has been utilized in a “citizen services” app.

The tool offers a series of frequently asked questions, but also allows residents in the city of nearly half a million to come up with their own. IBM said Surrey officials are hopeful that the app will help them be more responsive to residents’ concerns.

Among the services users can ask about are those provided by Surrey’s police and fire departments, animal control, parking enforcement, trash pickup, and others….(More)”

The Trouble With Disclosure: It Doesn’t Work


Jesse Eisinger at ProPublica: “Louis Brandeis was wrong. The lawyer and Supreme Court justice famously declared that sunlight is the best disinfectant, and we have unquestioningly embraced that advice ever since.

Over the last century, disclosure and transparency have become our regulatory crutch, the answer to every vexing problem. We require corporations and government to release reams of information on food, medicine, household products, consumer financial tools, campaign finance and crime statistics. We have a booming “report card” industry for a range of services, including hospitals, public schools and restaurants.

All this sunlight is blinding. As new scholarship is demonstrating, the value of all this information is unproved. Paradoxically, disclosure can be useless — and sometimes actually harmful or counterproductive.

“We are doing disclosure as a regulatory move all over the board,” says Adam J. Levitin, a law professor at Georgetown. “The funny thing is, we are doing this despite very little evidence of its efficacy.”

Let’s start with something everyone knows about — the “terms of service” agreements for the likes of iTunes. Like everybody else, I click the “I agree” box, feeling a flash of resentment. I’m certain that in Paragraph 184 is a clause signing away my firstborn to a life of indentured servitude to Timothy D. Cook as his chief caviar spoon keeper.

Our legal theoreticians have determined these opaque monstrosities work because someone, somewhere reads the fine print in these contracts and keeps corporations honest. It turns out what we laymen intuit is true: No one reads them, according to research by a New York University law professor, Florencia Marotta-Wurgler.

In real life, there is no critical mass of readers policing the agreements. And if there were an eagle-eyed crew of legal experts combing through these agreements, what recourse would they have? Most people don’t even know that the Supreme Court has gutted their rights to sue in court, and they instead have to go into arbitration, which usually favors corporations.

The disclosure bonanza is easy to explain. Nobody is against it. It’s politically expedient. Companies prefer such rules, especially in lieu of actual regulations that would curtail bad products or behavior. The opacity lobby — the remora fish class of lawyers, lobbyists and consultants in New York and Washington — knows that disclosure requirements are no bar to dodgy practices. You just have to explain what you’re doing in sufficiently incomprehensible language, a task that earns those lawyers a hefty fee.

Of course, some disclosure works. Professor Levitin cites two examples. The first is an olfactory disclosure. Methane doesn’t have any scent, but a foul smell is added to alert people to a gas leak. The second is ATM fees. A study in Australia showed that once fees were disclosed, people avoided the high-fee machines and took out more when they had to go to them.

But to Omri Ben-Shahar, co-author of a recent book, “More Than You Wanted To Know: The Failure of Mandated Disclosure,” these are cherry-picked examples in a world awash in useless disclosures. Of course, information is valuable. But disclosure as a regulatory mechanism doesn’t work nearly well enough, he argues….(More)

Algorithms and Bias


Q. and A. With Cynthia Dwork in the New York Times: “Algorithms have become one of the most powerful arbiters in our lives. They make decisions about the news we read, the jobs we get, the people we meet, the schools we attend and the ads we see.

Yet there is growing evidence that algorithms and other types of software can discriminate. The people who write them incorporate their biases, and algorithms often learn from human behavior, so they reflect the biases we hold. For instance, research has shown that ad-targeting algorithms have shown ads for high-paying jobs to men but not women, and ads for high-interest loans to people in low-income neighborhoods.

Cynthia Dwork, a computer scientist at Microsoft Research in Silicon Valley, is one of the leading thinkers on these issues. In an Upshot interview, which has been edited, she discussed how algorithms learn to discriminate, who’s responsible when they do, and the trade-offs between fairness and privacy.

Q: Some people have argued that algorithms eliminate discrimination because they make decisions based on data, free of human bias. Others say algorithms reflect and perpetuate human biases. What do you think?

A: Algorithms do not automatically eliminate bias. Suppose a university, with admission and rejection records dating back for decades and faced with growing numbers of applicants, decides to use a machine learning algorithm that, using the historical records, identifies candidates who are more likely to be admitted. Historical biases in the training data will be learned by the algorithm, and past discrimination will lead to future discrimination.

Q: Are there examples of that happening?

A: A famous example of a system that has wrestled with bias is the resident matching program that matches graduating medical students with residency programs at hospitals. The matching could be slanted to maximize the happiness of the residency programs, or to maximize the happiness of the medical students. Prior to 1997, the match was mostly about the happiness of the programs.

This changed in 1997 in response to “a crisis of confidence concerning whether the matching algorithm was unreasonably favorable to employers at the expense of applicants, and whether applicants could ‘game the system,’ ” according to a paper by Alvin Roth and Elliott Peranson published in The American Economic Review.

Q: You have studied both privacy and algorithm design, and co-wrote a paper, “Fairness Through Awareness,” that came to some surprising conclusions about discriminatory algorithms and people’s privacy. Could you summarize those?

A: “Fairness Through Awareness” makes the observation that sometimes, in order to be fair, it is important to make use of sensitive information while carrying out the classification task. This may be a little counterintuitive: The instinct might be to hide information that could be the basis of discrimination….

Q: The law protects certain groups from discrimination. Is it possible to teach an algorithm to do the same?

A: This is a relatively new problem area in computer science, and there are grounds for optimism — for example, resources from the Fairness, Accountability and Transparency in Machine Learning workshop, which considers the role that machines play in consequential decisions in areas like employment, health care and policing. This is an exciting and valuable area for research. …(More)”
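
The residency-match answer in the interview above says a matching can be slanted toward the programs or toward the students. The interview does not give the algorithm, so the following is only a minimal sketch using the textbook deferred acceptance procedure with one seat per program and complete preference lists. It is not the procedure the actual match uses, and the names `ana`, `ben`, `city` and `rural` are hypothetical. It shows that the side that proposes gets its preferred stable outcome.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Return {proposer: receiver} using one-to-one deferred acceptance.

    Assumes every receiver ranks every proposer (complete lists).
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    held = {}                                     # receiver -> proposer held for now
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if current is None:
            held[r] = p                 # r was free, so it holds p
        elif rank[r][p] < rank[r][current]:
            held[r] = p                 # r prefers p and releases current
            free.append(current)
        else:
            free.append(p)              # r rejects p, who tries again later
    return {p: r for r, p in held.items()}


# Hypothetical preferences in which the two sides want opposite outcomes.
students = {"ana": ["city", "rural"], "ben": ["rural", "city"]}
programs = {"city": ["ben", "ana"], "rural": ["ana", "ben"]}

by_students = deferred_acceptance(students, programs)  # students propose
by_programs = deferred_acceptance(programs, students)  # programs propose
print(by_students)
print(by_programs)
```

When students propose, each student in this toy case gets a first choice. When programs propose, each program does, and both students end up with their second choice. Both results are stable, which is why the choice of proposing side is a design decision rather than a technical detail.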

Yelp’s Consumer Protection Initiative: ProPublica Partnership Brings Medical Info to Yelp


Yelp Official Blog: “…exists to empower and protect consumers, and we’re continually focused on how we can enhance our service while enhancing the ability for consumers to make smart transactional decisions along the way.

A few years ago, we partnered with local governments to launch the LIVES open data standard. Now, millions of consumers find restaurant inspection scores when that information is most relevant: while they’re in the middle of making a dining decision (instead of when they’re signing the check). Studies have shown that displaying this information more prominently has a positive impact.

Today we’re excited to announce we’ve joined forces with ProPublica to incorporate health care statistics and consumer opinion survey data onto the Yelp business pages of more than 25,000 medical treatment facilities. Read more in today’s Washington Post story.

We couldn’t be more excited to partner with ProPublica, the Pulitzer Prize winning non-profit newsroom that produces investigative journalism in the public interest.

The information is compiled by ProPublica from their own research and the Centers for Medicare and Medicaid Services (CMS) for 4,600 hospitals, 15,000 nursing homes, and 6,300 dialysis clinics in the US and will be updated quarterly. Hover text on the business page will explain the statistics, which include number of serious deficiencies and fines per nursing home and emergency room wait times for hospitals. For example, West Kendall Baptist Hospital has better than average doctor communication and an average 33 minute ER wait time, Beachside Nursing Center currently has no deficiencies, and San Mateo Dialysis Center has a better than average patient survival rate.

Now the millions of consumers who use Yelp to find and evaluate everything from restaurants to retail will have even more information at their fingertips when they are in the midst of the most critical life decisions, like which hospital to choose for a sick child or which nursing home will provide the best care for aging parents….(More)

Print Wikipedia


Print Wikipedia is both a utilitarian visualization of the largest accumulation of human knowledge and a poetic gesture towards the futility of the scale of big data. Michael Mandiberg has written software that parses the entirety of the English-language Wikipedia database and programmatically lays out 7600 volumes, complete with covers, and then uploads them to Lulu.com. In addition, he has compiled a Wikipedia Table of Contents, and a Wikipedia Contributor Appendix….

Michael Mandiberg is an interdisciplinary artist, scholar, and educator living in Brooklyn, New York. He received his M.F.A. from the California Institute of the Arts and his B.A. from Brown University. His work traces the lines of political and symbolic power online, working on the Internet in order to comment on and intercede in the real flows of information. His work lives at Mandiberg.com.


How We’re Changing the Way We Respond to Petitions


Jason Goldman (White House) at Medium: “…In 2011 (years before I arrived at the White House), the team here developed a petitions platform called We the People. It provided a clear and easy way for the American people to petition their government — along with a threshold for action. Namely — once a petition gains 100,000 signatures.

This was a new system for the United States government, announced as a flagship effort in the first U.S. Open Government National Action Plan. Right now it exists only for the White House (Hey, Congress! We have an open API! Get in touch!) Some other countries, including Germany and the United Kingdom, do online petitions, too. In fact, the European Parliament has even started its own online petitioning platform.

For the most part, we’ve been pretty good about responding — before today, the Obama Administration had responded to 255 petitions that had collectively gathered more than 11 million signatures. That’s more than 91 percent of the petitions that have met our threshold requiring a response. Some responses have taken a little longer than others. But now, I’m happy to say, we have caught up.

Today, the White House is responding to every petition in our We the People backlog: 20 in all.

This means that nearly 2.5 million people who had petitioned us to take action on something heard back today. And it’s our goal to make that response the start of the conversation, not the final page. The White House is made up of offices that research and analyze the kinds of policy issues raised by these petitions, and leaders from those offices will be taking questions today, and in the weeks to come, from petition signers, on topics such as vaccination policy, community policing, and other petition subjects.

Take a look at more We the People stats here.

We’ll start the conversation on Twitter. Follow @WeThePeople, and join the conversation using hashtag #WeThePeople. (I’ll be personally taking your questions on @Goldman44 about how we’re changing the platform specifically at 3:30 p.m. Eastern.)

We the People, Moving Forward

We’re going to be changing a few things about We the People.

  1. First, from now on, if a petition meets the signature goal within a designated period of time, we will aim to respond to it — with an update or policy statement — within 60 days wherever possible. You can read about the details of our policy in the We the People Terms of Participation.
  2. Second, other outside petitions platforms are starting to tap into the We the People platform. We’re excited to announce today that Change.org is choosing to integrate with the We the People platform, meaning the future signatures of its 100 million users will count toward the threshold for getting an official response from the Administration. We’re also opening up the code behind petitions.whitehouse.gov on Drupal.org and GitHub, which empowers other governments and outside organizations to create their own versions of this platform to engage their own citizens and constituencies.
  3. Third, and most importantly, the process of hearing from us about your petition is going to look a little different. We’ve assembled a team of people responsible for taking your questions and requests and bringing them to the right people — whether within the White House or in an agency within the Administration — who may be in a position to say something about your request….(More)

A Visual Introduction to Machine Learning


R2D3 introduction: “In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.

Keep scrolling. Using a data set about homes, we will create a machine learning model to distinguish homes in New York from homes in San Francisco….

  1. Machine learning identifies patterns using statistical learning and computers by unearthing boundaries in data sets. You can use it to make predictions.
  2. One method for making predictions is called a decision tree, which uses a series of if-then statements to identify boundaries and define patterns in the data.
  3. Overfitting happens when some boundaries are based on distinctions that don’t make a difference. You can see if a model overfits by having test data flow through the model….(More)”
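
The three points above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two features (`elevation` and `price`), the thresholds, and every data row are invented and are not taken from the R2D3 data set. The trees are written by hand as if-then statements rather than learned, so the sketch shows only how test data exposes boundaries that do not generalize.

```python
# Hypothetical rows: (elevation in metres, price per square foot, city).
TRAIN = [
    (5, 1200, "NY"), (8, 900, "NY"), (10, 1500, "NY"), (35, 2000, "NY"),
    (40, 800, "SF"), (60, 1000, "SF"), (75, 1100, "SF"), (12, 700, "SF"),
]
TEST = [
    (6, 1100, "NY"), (9, 1300, "NY"), (36, 2100, "SF"), (50, 900, "SF"),
    (70, 1050, "SF"), (11, 650, "NY"), (20, 1000, "SF"),
]


def simple_tree(elevation, price):
    """One if-then boundary: high elevation suggests San Francisco."""
    if elevation > 30:
        return "SF"
    return "NY"


def overfit_tree(elevation, price):
    """Same boundary plus branches that memorise two training rows."""
    if elevation > 30:
        if price >= 2000:   # carved out for a single training home
            return "NY"
        return "SF"
    if price <= 700:        # carved out for a single training home
        return "SF"
    return "NY"


def accuracy(model, rows):
    """Share of rows whose city the model predicts correctly."""
    hits = sum(model(e, p) == city for e, p, city in rows)
    return hits / len(rows)


for name, model in [("simple", simple_tree), ("overfit", overfit_tree)]:
    print(name, "train:", accuracy(model, TRAIN), "test:", accuracy(model, TEST))
```

The deeper tree is perfect on the rows it was shaped around and worse than the one-rule tree on rows it has not seen, which is the pattern the third point describes.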

Urban Informatics


Special issue of Data Engineering: “Most data related to people and the built world originates in urban settings. There is increasing demand to capture and exploit this data to support efforts in areas such as Smart Cities, City Science and Intelligent Transportation Systems. Urban informatics deals with the collection, organization, dissemination and analysis of urban information used in such applications. However, the dramatic growth in the volume of this urban data creates challenges for existing data-management and analysis techniques. The collected data is also increasingly diverse, with a wide variety of sensor, GIS, imagery and graph data arising in cities. To address these challenges, urban informatics requires development of advanced data-management approaches, analysis methods, and visualization techniques. It also provides an opportunity to confront the “Variety” axis of Big Data head on. The contributions in this issue cross the spectrum of urban information, from its origin, to archiving and retrieval, to analysis and visualization. …

Collaborative Sensing for Urban Transportation (by Sergio Ilarri et al)

Open Civic Data: Of the People, For the People, By the People (by Arnaud Sahuguet et al, The GovLab)

Plenario: An Open Data Discovery and Exploration Platform for Urban Science (by Charlie Catlett et al)

Riding from Urban Data to Insight Using New York City Taxis (by Juliana Freire et al)…(More)”

The Causes, Costs and Consequences of Bad Government Data


Katherine Barrett & Richard Greene in Governing: “Data is the lifeblood of state government. It’s the crucial commodity that’s necessary to manage projects, avoid fraud, assess program performance, keep the books in balance and deliver services efficiently. But even as the trend toward greater reliance on data has accelerated over the past decades, the information itself has fallen dangerously short of the mark. Sometimes it doesn’t exist at all. But worse than that, all too often it’s just wrong.

There are examples everywhere. Last year, the California auditor’s office issued a report that looked at accounting records at the State Controller’s Office to see whether it was accurately recording sick leave and vacation credits. “We found circumstances where instead of eight hours, it was 80 and in one case, 800,” says Elaine Howle, the California state auditor. “And the system didn’t have controls to say that’s impossible.” The audit found 200,000 questionable hours of leave due to data entry errors, with a value of $6 million.

Mistakes like that are embarrassing, and can lead to unequal treatment of valued employees. Sometimes, however, decisions made with bad data can have deeper consequences. In 2012, the secretary of environmental protection in Pennsylvania told Congress that there was no evidence the state’s water quality had been affected by fracking. “Tens of thousands of wells have been hydraulically fractured in Pennsylvania,” he said, “without any indication that groundwater quality has been impacted.”

But by August 2014, the same department published a list of 248 incidents of damage to well water due to gas development. Why didn’t the department pick up on the water problems sooner? A key reason was that the data collected by its six regional offices had not been forwarded to the central office. At the same time, the regions differed greatly in how they collected, stored, transmitted and dealt with the information. An audit concluded that Pennsylvania’s complaint tracking system for water quality was ineffective and failed to provide “reliable information to effectively manage the program.”

When data is flawed, the consequences can reach throughout the entire government enterprise. Services are needlessly duplicated; evaluation of successful programs is difficult; tax dollars go uncollected; infrastructure maintenance is conducted inefficiently; health-care dollars are wasted. The list goes on and on. Increasingly, states are becoming aware of just how serious the problem is. “The poor quality of government data,” says Dave Yost, Ohio’s state auditor, “is probably the most important emerging trend for government executives, across the board, at all levels.”

Just how widespread a problem is data quality? In a Governing telephone survey with more than 75 officials in 46 states, about 7 out of 10 said that data problems were frequently or often an impediment to doing their business effectively. No one who worked with program data said this was rarely the case. (View the full results of the survey in this infographic.)…(More)

See also: Bad Data Is at All Levels of Government and The Next Big Thing in Data Analytics
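
The California auditor quoted above says the leave system had no control that could reject an entry of 80 or 800 hours where 8 was meant. A minimal sketch of such a control follows. The function name `validate_leave_entry` and the 24-hour ceiling `MAX_HOURS_PER_ENTRY` are hypothetical, chosen on the assumption that one entry covers a single day. The article does not describe how the real system stores or checks entries.

```python
MAX_HOURS_PER_ENTRY = 24  # hypothetical ceiling: assumes one entry per calendar day


def validate_leave_entry(hours):
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    if hours <= 0:
        problems.append("hours must be positive")
    if hours > MAX_HOURS_PER_ENTRY:
        problems.append(
            f"{hours} hours exceeds the {MAX_HOURS_PER_ENTRY}-hour ceiling"
        )
    return problems


# The three values from the auditor's example: the intended 8 and two typos.
entries = [8, 80, 800]
flagged = [h for h in entries if validate_leave_entry(h)]
print(flagged)
```

A check like this cannot tell whether 8 hours is the right figure, only that 80 and 800 are impossible under the stated assumption, so it would complement rather than replace an audit.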

Disruptive Technology that Could Transform Government-Citizen Relationships


David Raths at GovTech: “William Gibson, the science fiction writer who coined the term “cyberspace,” once said: “The future is already here — it’s just not very evenly distributed.” That may be exactly the way to look at the selection of disruptive technologies we have chosen to highlight in eight critical areas of government, ranging from public safety to health to transportation. ….

PUBLIC SAFETY: WEARABLE TECH IS TRANSFORMING EMERGENCY RESPONSE

The wearable technology market is expected to grow from $20 billion in 2015 to almost $70 billion in 2025, according to research firm IDTechEx. As commercial applications bloom, more will find their way into the public sector and emergency response.

This year has seen an increase in the number of police departments using body cameras. And already under development are wireless devices that monitor a responder’s breathing, heart rate and blood pressure, as well as potentially harmful environmental conditions, and relay concerns back to incident command.

But rather than sitting back and waiting for the market to develop, the U.S. Department of Homeland Security is determined to spur innovation in the field. DHS’ research and development arm is funding a startup accelerator program called Emerge managed by the Center for Innovative Technology (CIT), a Virginia-based nonprofit. Two accelerators, in Texas and Illinois, will work with 10 to 15 startups this year to develop wearable products and adapt them for first responder use….

HEALTH & HUMAN SERVICES: ‘HOT-SPOTTING’ FOR POPULATION HEALTH MANAGEMENT

A hot health-care trend is population health management: using data to improve health at a community level as well as an individual level. The growth in sophistication of GIS tools has allowed public health researchers to more clearly identify and start addressing health resource disparities.

Dr. Jeffrey Brenner, a Camden, N.J.-based physician, uses data gathered in a health information exchange (HIE) to target high-cost individuals. The Camden Coalition of Healthcare Providers uses the HIE data to identify high-cost “hot spots” — high-rise buildings where a large number of hospital emergency room “super users” live. By identifying and working with these individuals on patient-centered care coordination issues, the coalition has been able to reduce emergency room use and in-patient stays….

PARKS & RECREATION: TRACKING TREES FOR A BETTER FUTURE

A combination of advances in mobile data collection systems and geocoding lets natural resources and parks agencies be more proactive about collecting tree data, managing urban forests and quantifying their value, as forests become increasingly important resources in an era of climate change.

Philadelphia Parks and Recreation has added approximately 2 million trees to its database in the past few years. It plans to create a digital management system for all of them. Los Angeles City Parks uses the Davey Tree Expert Co.’s Web-based TreeKeeper management software to manage existing tree inventories and administer work orders. The department can also more easily look at species balance to manage against pests, disease and drought….

CORRECTIONS: VIDEO-BASED TOOLS TRANSFORM PRISONS AND JAILS

Videoconferencing is disrupting business as usual in U.S. jails and prisons in two ways: One is the rising use of telemedicine to reduce inmate health-care costs and to increase access to certain types of care for prisoners. The other is video visitation between inmates and families.

A March 2015 report by Southern California Public Radio noted that the federal court-appointed receiver overseeing inmate health care in California is reviewing telemedicine capabilities to reduce costly overtime billing by physicians and nurses at prisons. In one year, overtime has more than doubled for this branch of corrections, from more than $12 million to nearly $30 million….

FINANCE & BUDGETING: DATA PORTALS OFFER TRANSPARENCY AT UNPRECEDENTED LEVELS

The transparency and open data movements have hit the government finance sector in a big way and promise to be an area of innovation in the years ahead.

A partnership between Ohio Treasurer Josh Mandel and the finance visualization startup OpenGov will result in one of the most sweeping statewide transparency efforts to date.

The initiative offers 3,900-plus local governments — from townships, cities and counties to school districts and more — a chance to place revenues and expenditures online free of charge through the state’s budget transparency site OhioCheckbook.com. Citizens will be able to track local government revenues and expenditures via interactive graphs that illustrate not only a bird’s-eye view of a budget, but also the granular details of check-by-check spending….

DMV: DRIVERS’ LICENSES: THERE WILL SOON BE AN APP FOR THAT

The laminated driver’s license you keep in your wallet may eventually give way to an app on your smartphone, and that change may have wider significance for how citizens interact digitally with their government. Legislatures in at least three states have seen bills introduced authorizing their transportation departments to begin piloting digital drivers’ licenses…..

TRANSPORTATION & MASS TRANSIT: BIG BREAKTHROUGHS ARE JUST AROUND THE CORNER

Nothing is likely to be more disruptive to transportation, mass transit and urban planning than the double whammy of connected vehicle technology and autonomous vehicles.

The U.S. Department of Transportation expects great things from the connected vehicles of the future, and that future may be just around the corner. Vehicle-to-infrastructure communication capabilities and anonymous information from passengers’ wireless devices relayed through dedicated short-range connections could provide transportation agencies with improved traffic, transit and parking data, making it easier to manage transportation systems and improve traffic safety….(More)”