To Reduce Privacy Risks, the Census Plans to Report Less Accurate Data


Mark Hansen at the New York Times: “When the Census Bureau gathered data in 2010, it made two promises. The form would be “quick and easy,” it said. And “your answers are protected by law.”

But mathematical breakthroughs, easy access to more powerful computing, and widespread availability of large and varied public data sets have made the bureau reconsider whether the protection it offers Americans is strong enough. To preserve confidentiality, the bureau’s directors have determined they need to adopt a “formal privacy” approach, one that adds uncertainty to census data before it is published and achieves privacy assurances that are provable mathematically.

The census has always added some uncertainty to its data, but a key innovation of this new framework, known as “differential privacy,” is a numerical value describing how much privacy loss a person will experience. It determines the amount of randomness — “noise” — that needs to be added to a data set before it is released, and sets up a balancing act between accuracy and privacy. Too much noise would mean the data would not be accurate enough to be useful — in redistricting, in enforcing the Voting Rights Act or in conducting academic research. But too little, and someone’s personal data could be revealed.
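For readers who want a concrete sense of how that noise is calibrated, the sketch below applies the textbook Laplace mechanism to a single hypothetical block count. The privacy-loss parameter (conventionally called epsilon) sets the noise scale; the counts and epsilon values here are purely illustrative, and the Census Bureau's actual disclosure-avoidance system is considerably more elaborate than this.

```python
import numpy as np

def laplace_count(true_count, epsilon, sensitivity=1.0):
    """Publish a count with Laplace noise of scale sensitivity/epsilon.
    Smaller epsilon means more noise and stronger privacy; larger epsilon
    means less noise and better accuracy."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

true_block_count = 142  # hypothetical voting-age population of one census block
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: published count = {laplace_count(true_block_count, eps):.1f}")
```

Run repeatedly, the epsilon = 0.1 counts swing by dozens of people while the epsilon = 10 counts barely move, which is exactly the accuracy-versus-privacy balancing act described above.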

On Thursday, the bureau will announce the trade-off it has chosen for data publications from the 2018 End-to-End Census Test it conducted in Rhode Island, the only dress rehearsal before the actual census in 2020. The bureau has decided to enforce stronger privacy protections than companies like Apple or Google had when they each first took up differential privacy….

In presentation materials for Thursday’s announcement, special attention is paid to lessening any problems with redistricting: the potential complications of using noisy counts of voting-age people to draw district lines. (By contrast, in 2000 and 2010 the swapping mechanism produced exact counts of potential voters down to the block level.)

The Census Bureau has been an early adopter of differential privacy. Still, instituting the framework on such a large scale is not an easy task, and even some of the big technology firms have had difficulties. For example, shortly after Apple’s announcement in 2016 that it would use differential privacy for data collected from its macOS and iOS operating systems, it was revealed that the actual privacy loss of its systems was much higher than advertised.

Some scholars question the bureau’s abandonment of techniques like swapping in favor of differential privacy. Steven Ruggles, Regents Professor of history and population studies at the University of Minnesota, has relied on census data for decades. Through the Integrated Public Use Microdata Series, he and his team have regularized census data dating to 1850, providing consistency between questionnaires as the forms have changed, and enabling researchers to analyze data across years.

“All of the sudden, Title 13 gets equated with differential privacy — it’s not,” he said, adding that if you make a guess about someone’s identity from looking at census data, you are probably wrong. “That has been regarded in the past as protection of privacy. They want to make it so that you can’t even guess.”

“There is a trade-off between usability and risk,” he added. “I am concerned they may go far too far on privileging an absolutist standard of risk.”

In a working paper published Friday, he said that with the number of private services offering personal data, a prospective hacker would have little incentive to turn to public data such as the census “in an attempt to uncover uncertain, imprecise and outdated information about a particular individual.”…(More)”.

Chatbots Are a Danger to Democracy


Jamie Susskind in the New York Times: “As we survey the fallout from the midterm elections, it would be easy to miss the longer-term threats to democracy that are waiting around the corner. Perhaps the most serious is political artificial intelligence in the form of automated “chatbots,” which masquerade as humans and try to hijack the political process.

Chatbots are software programs that are capable of conversing with human beings on social media using natural language. Increasingly, they take the form of machine learning systems that are not painstakingly “taught” vocabulary, grammar and syntax but rather “learn” to respond appropriately using probabilistic inference from large data sets, together with some human guidance.

Some chatbots, like the award-winning Mitsuku, can hold passable levels of conversation. Politics, however, is not Mitsuku’s strong suit. When asked “What do you think of the midterms?” Mitsuku replies, “I have never heard of midterms. Please enlighten me.” Reflecting the imperfect state of the art, Mitsuku will often give answers that are entertainingly weird. Asked, “What do you think of The New York Times?” Mitsuku replies, “I didn’t even know there was a new one.”

Most political bots these days are similarly crude, limited to the repetition of slogans like “#LockHerUp” or “#MAGA.” But a glance at recent political history suggests that chatbots have already begun to have an appreciable impact on political discourse. In the buildup to the midterms, for instance, an estimated 60 percent of the online chatter relating to “the caravan” of Central American migrants was initiated by chatbots.

In the days following the disappearance of the columnist Jamal Khashoggi, Arabic-language social media erupted in support for Crown Prince Mohammed bin Salman, who was widely rumored to have ordered his murder. On a single day in October, the phrase “we all have trust in Mohammed bin Salman” featured in 250,000 tweets. “We have to stand by our leader” was posted more than 60,000 times, along with 100,000 messages imploring Saudis to “Unfollow enemies of the nation.” In all likelihood, the majority of these messages were generated by chatbots.

Chatbots aren’t a recent phenomenon. Two years ago, around a fifth of all tweets discussing the 2016 presidential election are believed to have been the work of chatbots. And a third of all traffic on Twitter before the 2016 referendum on Britain’s membership in the European Union was said to come from chatbots, principally in support of the Leave side….

We should also be exploring more imaginative forms of regulation. Why not introduce a rule, coded into platforms themselves, that bots may make only up to a specific number of online contributions per day, or a specific number of responses to a particular human? Bots peddling suspect information could be challenged by moderator-bots to provide recognized sources for their claims within seconds. Those that fail would face removal.
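To make the proposal concrete, here is a minimal sketch of the kind of rule Susskind imagines being coded into a platform: a cap on how many contributions a registered bot may make per day, and how many times it may reply to any one human. The class name, limits, and identifiers are hypothetical and do not reflect any actual platform API.

```python
from collections import defaultdict
from datetime import date

class BotContributionLimiter:
    """Enforce a per-bot cap on daily posts and on replies to any one human."""

    def __init__(self, max_posts_per_day=50, max_replies_per_human=5):
        self.max_posts_per_day = max_posts_per_day
        self.max_replies_per_human = max_replies_per_human
        self._posts = defaultdict(int)    # (bot_id, day) -> posts made today
        self._replies = defaultdict(int)  # (bot_id, human_id, day) -> replies today

    def allow_post(self, bot_id):
        key = (bot_id, date.today())
        if self._posts[key] >= self.max_posts_per_day:
            return False
        self._posts[key] += 1
        return True

    def allow_reply(self, bot_id, human_id):
        key = (bot_id, human_id, date.today())
        if self._replies[key] >= self.max_replies_per_human:
            return False
        self._replies[key] += 1
        return True

limiter = BotContributionLimiter()
print(limiter.allow_post("bot-42"))            # True until the daily cap is reached
print(limiter.allow_reply("bot-42", "user-7"))
```

A moderator-bot challenge of the sort described above could sit behind the same interface, revoking posting rights for accounts that fail to produce recognized sources in time.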

We need not treat the speech of chatbots with the same reverence that we treat human speech. Moreover, bots are too fast and tricky to be subject to ordinary rules of debate. For both those reasons, the methods we use to regulate bots must be more robust than those we apply to people. There can be no half-measures when democracy is at stake….(More)”.

New methods help identify what drives sensitive or socially unacceptable behaviors


Mary Guiden at Physorg: “Conservation scientists and statisticians at Colorado State University have teamed up to solve a key problem for the study of sensitive behaviors like poaching, harassment, bribery, and drug use.

Sensitive behaviors—defined as socially unacceptable or not compliant with rules and regulations—are notoriously hard to study, researchers say, because people often do not want to answer direct questions about them.

To overcome this challenge, scientists have developed indirect questioning approaches that protect respondents’ identities. However, these methods also make it difficult to predict which sectors of a population are more likely to participate in sensitive behaviors, and which factors, such as knowledge of laws, education, or income, influence the probability that an individual will engage in a sensitive behavior.

Assistant Professor Jennifer Solomon and Associate Professor Michael Gavin of the Department of Human Dimensions of Natural Resources at CSU, and Abu Conteh from MacEwan University in Alberta, Canada, have teamed up with Professor Jay Breidt and doctoral student Meng Cao in the CSU Department of Statistics to develop a new method to solve the problem.

The study, “Understanding the drivers of sensitive behavior using Poisson regression from quantitative randomized response technique data,” was published recently in PLOS One.

Conteh, who, as a doctoral student, worked with Gavin in New Zealand, used a specific technique, known as quantitative randomized response, to elicit confidential answers to questions on behaviors related to non-compliance with natural resource regulations in a protected area in Sierra Leone.

In this technique, the researcher conducting interviews has a large container of pingpong balls, some with numbers and some without. The interviewer asks the respondent to pick a ball at random, without revealing it to the interviewer. If the ball has a number, the respondent tells the interviewer the number. If the ball does not have a number, the respondent reveals how many times he or she illegally hunted animals in a given time period….
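As a rough illustration of why this works, the simulation below (with made-up parameters, not the values from the PLOS ONE study) shows that an interviewer who only ever sees the reported numbers can still recover the average rate of the sensitive behavior, even though no single answer can be attributed with certainty. The study itself goes further, fitting a Poisson regression that links covariates to the hidden counts; this sketch only backs out the population mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters only; not the values used in the PLOS ONE study.
p_numbered = 0.7                  # share of balls pre-printed with a number
ball_numbers = np.arange(0, 6)    # numbers printed on those balls (known distribution)
mu_ball = ball_numbers.mean()
n_respondents = 2000
true_counts = rng.poisson(lam=1.3, size=n_respondents)  # hidden sensitive counts

# Each respondent draws a ball in private: numbered ball -> report the ball's
# number; blank ball -> report the true count of the sensitive behavior.
numbered = rng.random(n_respondents) < p_numbered
reported = np.where(numbered,
                    rng.choice(ball_numbers, size=n_respondents),
                    true_counts)

# The interviewer only sees `reported`, yet because
# E[report] = p * mu_ball + (1 - p) * mu_true, the mean is still recoverable.
mu_true_hat = (reported.mean() - p_numbered * mu_ball) / (1 - p_numbered)
print(f"true mean {true_counts.mean():.2f}  vs  estimate {mu_true_hat:.2f}")
```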

Armed with the new computer program, the scientists found that people from rural communities with less access to jobs in urban centers were more likely to hunt in the reserve. People in communities with a greater proportion of people displaced by Sierra Leone’s 10-year civil war were also more likely to hunt illegally….

The researchers said that collaborating across disciplines was and is key to addressing complex problems like this one. It is commonplace for people to be noncompliant with rules and regulations and equally important for social scientists to analyze these behaviors….(More)”

The Constitution of Knowledge


Jonathan Rauch at National Affairs: “America has faced many challenges to its political culture, but this is the first time we have seen a national-level epistemic attack: a systematic attack, emanating from the very highest reaches of power, on our collective ability to distinguish truth from falsehood. “These are truly uncharted waters for the country,” wrote Michael Hayden, former CIA director, in the Washington Post in April. “We have in the past argued over the values to be applied to objective reality, or occasionally over what constituted objective reality, but never the existence or relevance of objective reality itself.” To make the point another way: Trump and his troll armies seek to undermine the constitution of knowledge….

The attack, Hayden noted, is on “the existence or relevance of objective reality itself.” But what is objective reality?

In everyday vernacular, reality often refers to the world out there: things as they really are, independent of human perception and error. Reality also often describes those things that we feel certain about, things that we believe no amount of wishful thinking could change. But, of course, humans have no direct access to an objective world independent of our minds and senses, and subjective certainty is in no way a guarantee of truth. Philosophers have wrestled with these problems for centuries, and today they have a pretty good working definition of objective reality. It is a set of propositions: propositions that have been validated in some way, and have thereby been shown to be at least conditionally true — true, that is, unless debunked. Some of these propositions reflect the world as we perceive it (e.g., “The sky is blue”). Others, like claims made by quantum physicists and abstract mathematicians, appear completely removed from the world of everyday experience.

It is worth noting, however, that the locution “validated in some way” hides a cheat. In what way? Some Americans believe Elvis Presley is alive. Should we send him a Social Security check? Many people believe that vaccines cause autism, or that Barack Obama was born in Africa, or that the murder rate has risen. Who should decide who is right? And who should decide who gets to decide?

This is the problem of social epistemology, which concerns itself with how societies come to some kind of public understanding about truth. It is a fundamental problem for every culture and country, and the attempts to resolve it go back at least to Plato, who concluded that a philosopher king (presumably someone like Plato himself) should rule over reality. Traditional tribal communities frequently use oracles to settle questions about reality. Religious communities use holy texts as interpreted by priests. Totalitarian states put the government in charge of objectivity.

There are many other ways to settle questions about reality. Most of them are terrible because they rely on authoritarianism, violence, or, usually, both. As the great American philosopher Charles Sanders Peirce said in 1877, “When complete agreement could not otherwise be reached, a general massacre of all who have not thought in a certain way has proved a very effective means of settling opinion in a country.”

As Peirce implied, one way to avoid a massacre would be to attain unanimity, at least on certain core issues. No wonder we hanker for consensus. Something you often hear today is that, as Senator Ben Sasse put it in an interview on CNN, “[W]e have a risk of getting to a place where we don’t have shared public facts. A republic will not work if we don’t have shared facts.”

But that is not quite the right answer, either. Disagreement about core issues and even core facts is inherent in human nature and essential in a free society. If unanimity on core propositions is not possible or even desirable, what is necessary to have a functional social reality? The answer is that we need an elite consensus, and hopefully also something approaching a public consensus, on the method of validating propositions. We needn’t and can’t all agree that the same things are true, but a critical mass needs to agree on what it is we do that distinguishes truth from falsehood, and more important, on who does it.

Who can be trusted to resolve questions about objective truth? The best answer turns out to be no one in particular….(More)”.

Library of Congress Launches Crowdsourcing Platform


Matt Enis at the Library Journal: “The Library of Congress (LC) last month launched crowd.loc.gov, a new crowdsourcing platform that will improve discovery and access to the Library’s digital collections with the help of volunteer transcription and tagging. The project kicked off with the “Letters to Lincoln Challenge,” a campaign encouraging volunteers to transcribe 10,000 digitized versions of documents written by or to Abraham Lincoln, which will make these materials full-text searchable for the first time….

The new project is the earliest example of LC’s new Digital Strategy, which complements the library’s new 2019–23 strategic plan. Announced in October, the strategic plan, “Enriching the User Experience,” outlines four high-level goals—expanding access, enhancing services, optimizing resources, and measuring results—while the digital strategy outlines how LC plans to accomplish these goals with its digital resources, described as “throwing open the treasure chest, connecting, and investing in our future”…

LC aims to use crowdsourcing to enrich the user experience in two key ways, Zwaard said.

“First, it helps with the legibility of our collections,” she explained. “The Library of Congress is home to so many historic treasures, but the handwriting can be hard to read…. For example, we have this amazing letter from Abraham Lincoln to his first fiancée. It’s really quite lovely, but at a glance, if you’re not familiar with historic handwriting, it’s hard to read.”…

Second, crowdsourcing “invites people into the collections,” she added. “The library is very optimized around answering specific research questions. One of the things we’re thinking about is how to serve users who don’t have a specific research question—who just want to see all of the cool stuff. We have so much cool stuff! But it can be hard for people to find purchase when they are just browsing and don’t have anything specific in mind. One of the ways we can [showcase interesting content] is by offering them a window into the collections by asking for their help.”…

To facilitate ongoing engagement with these varied projects, LC has set up an online forum on History Hub, a site hosted by the National Archives, to encourage crowd.loc.gov participants to ask questions, discuss projects, and meet other volunteers. …

Crowd.loc.gov is not LC’s first crowdsourcing project. Followers of the library’s official Flickr account have added tens of thousands of descriptive tags to digitized historical photos since the account debuted in 2007. And last year, the debut of labs.loc.gov—which aims to encourage creative use of LOC’s digital collections—included the Beyond Words crowdsourcing project developed by LC software developer Tong Wang….(More)”

Why We Need to Audit Algorithms


James Guszcza, Iyad Rahwan, Will Bible, Manuel Cebrian and Vic Katyal at Harvard Business Review: “Algorithmic decision-making and artificial intelligence (AI) hold enormous potential and are likely to be economic blockbusters, but we worry that the hype has led many people to overlook the serious problems of introducing algorithms into business and society. Indeed, we see many succumbing to what Microsoft’s Kate Crawford calls “data fundamentalism” — the notion that massive datasets are repositories that yield reliable and objective truths, if only we can extract them using machine learning tools. A more nuanced view is needed. It is by now abundantly clear that, left unchecked, AI algorithms embedded in digital and social technologies can encode societal biases, accelerate the spread of rumors and disinformation, amplify echo chambers of public opinion, hijack our attention, and even impair our mental wellbeing.

Ensuring that societal values are reflected in algorithms and AI technologies will require no less creativity, hard work, and innovation than developing the AI technologies themselves. We have a proposal for a good place to start: auditing. Companies have long been required to issue audited financial statements for the benefit of financial markets and other stakeholders. That’s because — like algorithms — companies’ internal operations appear as “black boxes” to those on the outside. This gives managers an informational advantage over the investing public which could be abused by unethical actors. Requiring managers to report periodically on their operations provides a check on that advantage. To bolster the trustworthiness of these reports, independent auditors are hired to provide reasonable assurance that the reports coming from the “black box” are free of material misstatement. Should we not subject societally impactful “black box” algorithms to comparable scrutiny?

Indeed, some forward thinking regulators are beginning to explore this possibility. For example, the EU’s General Data Protection Regulation (GDPR) requires that organizations be able to explain their algorithmic decisions. The city of New York recently assembled a task force to study possible biases in algorithmic decision systems. It is reasonable to anticipate that emerging regulations might be met with market pull for services involving algorithmic accountability.

So what might an algorithm auditing discipline look like? First, it should adopt a holistic perspective. Computer science and machine learning methods will be necessary, but likely not sufficient foundations for an algorithm auditing discipline. Strategic thinking, contextually informed professional judgment, communication, and the scientific method are also required.
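As one concrete example of the machine-learning piece of such an audit, a reviewer might compare an algorithm's positive-decision rates across demographic groups, in the spirit of the "four-fifths rule" used in employment law. The column names and threshold below are assumptions for illustration, not a prescribed auditing standard.

```python
import pandas as pd

def selection_rate_audit(df, group_col, decision_col):
    """Compare positive-decision rates across groups and report each group's
    ratio to the most-favored group (the 'four-fifths rule' heuristic)."""
    rates = df.groupby(group_col)[decision_col].mean()
    ratios = rates / rates.max()
    return pd.DataFrame({"selection_rate": rates, "ratio_to_best": ratios})

# Hypothetical audit data: group label and the algorithm's yes/no decision.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "B", "A"],
    "approved": [1,   1,   0,   1,   0,   0,   0,   1],
})
report = selection_rate_audit(decisions, "group", "approved")
print(report)
flagged = report[report["ratio_to_best"] < 0.8]  # groups below the 80% heuristic
```

A statistic like this is only a starting point; deciding whether a disparity is justified requires exactly the contextual judgment and strategic thinking the authors call for.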

As a result, algorithm auditing must be interdisciplinary in order for it to succeed….(More)”.

Nudging compliance in government: A human-centered approach to public sector program design


Article by Michelle Cho, Joshua Schoop, Timothy Murphy: “What are the biggest challenges facing government? Bureaucracy? Gridlock? A shrinking pool of resources?

Chances are compliance—when people act in accordance with preset rules, policies, and/or expectations—doesn’t top the list for many. Yet maybe it should. Compliance touches nearly every aspect of public policy implementation. Over the past 10 years, US government spending on compliance reached US$7.5 billion.

Even the most sophisticated and well-planned policies often require cooperation and input from real humans to be successful. From voluntary tax filing at the Internal Revenue Service (IRS) to reducing greenhouse emissions at the Environmental Protection Agency (EPA), to achieving the public policy outcomes decision-makers intend, compliance is fundamental.

Consider these examples of noncompliance and their costs:

  • Taxes. By law, the IRS requires all income-earning, eligible constituents to file and pay their owed taxes. Tax evasion—the illegal nonpayment or underpayment of tax—cost the federal government an average of US$458 billion per year between 2008 and 2010. The IRS believes it will recover just 11 percent of the amount lost in that time frame.
  • The environment. The incorrect disposal of recyclable materials has cost more than US$744 million in the state of Washington since 2009. A city audit in San Diego found that 76 percent of materials disposed of citywide are recyclable and estimated that those recyclables could power 181,000 households for a year or conserve 3.4 million barrels of oil.

Those who fail to comply with these rules could face direct and indirect consequences, including penalties and even jail time. Yet a significant subset of the population still behaves in a noncompliant manner. Why?

Behavioral sciences offer some clues. Through the combination of psychology, economics, and neuroscience, behavioral sciences demonstrate that people do not always do what is asked of them, even when it seems in their best interest to do so. Often, people choose a noncompliant path for one of these reasons: They are unaware of their improper behavior, they find the “right” choice too complex to decipher, or they simply are not intrinsically motivated to make the compliant choice.

For any of these reasons, when a cognitive hurdle emerges, some people resort to noncompliant behavior. But these hurdles can be overcome. Policymakers can use these same behavioral insights to understand why noncompliance occurs and alternatively, employ behavioral-inspired tools to encourage compliant behavior in a more agile and resource-efficient fashion.

In this spirit, leaders can take a more human-centered approach to program design by using behavioral science lessons to develop policies and programs in a manner that can make compliance easier and more appealing. In our article, we discuss three common reasons behind noncompliance and how better, more human-centered design can help policymakers achieve more positive results….(More)”.

Waze-fed AI platform helps Las Vegas cut car crashes by almost 20%


Liam Tung at ZDNet: “An AI-led, road-safety pilot program between analytics firm Waycare and Nevada transportation agencies has helped reduce crashes along the busy I-15 in Las Vegas.

The Silicon Valley-based Waycare system uses data from connected cars, road cameras and apps like Waze to build an overview of a city’s roads and then shares that data with local authorities to improve road safety.

Waycare struck a deal with Google-owned Waze earlier this year to “enable cities to communicate back with drivers and warn of dangerous roads, hazards, and incidents ahead”. Waze’s crowdsourced data also feeds into Waycare’s traffic management system, offering more data for cities to manage traffic.

Waycare has now wrapped up a year-long pilot with the Regional Transportation Commission of Southern Nevada (RTC), Nevada Highway Patrol (NHP), and the Nevada Department of Transportation (NDOT).

RTC reports that Waycare helped the city reduce the number of primary crashes by 17 percent along Interstate 15 in Las Vegas.

Waycare’s data, as well as its predictive analytics, gave the city’s safety and traffic management agencies the ability to take preventative measures in high risk areas….(More)”.

Beijing to Judge Every Resident Based on Behavior by End of 2020


Bloomberg News: “China’s plan to judge each of its 1.3 billion people based on their social behavior is moving a step closer to reality, with Beijing set to adopt a lifelong points program by 2021 that assigns personalized ratings for each resident.

The capital city will pool data from several departments to reward and punish some 22 million citizens based on their actions and reputations by the end of 2020, according to a plan posted on the Beijing municipal government’s website on Monday. Those with better so-called social credit will get “green channel” benefits while those who violate laws will find life more difficult.

The Beijing project will improve blacklist systems so that those deemed untrustworthy will be “unable to move even a single step,” according to the government’s plan. Xinhua reported on the proposal Tuesday, while the report posted on the municipal government’s website is dated July 18.

China has long experimented with systems that grade its citizens, rewarding good behavior with streamlined services while punishing bad actions with restrictions and penalties. Critics say such moves are fraught with risks and could lead to systems that reduce humans to little more than a report card.

Ambitious Plan

Beijing’s efforts represent the most ambitious yet among more than a dozen cities that are moving ahead with similar programs.

Hangzhou rolled out its personal credit system earlier this year, rewarding “pro-social behaviors” such as volunteer work and blood donations while punishing those who violate traffic laws and charge under-the-table fees. By the end of May, people with bad credit in China had been blocked from booking more than 11 million flights and 4 million high-speed train trips, according to the National Development and Reform Commission.

According to the Beijing government’s plan, different agencies will link databases to get a more detailed picture of every resident’s interactions across a swathe of services….(More)”.

Using Artificial Intelligence to Promote Diversity


Paul R. Daugherty, H. James Wilson, and Rumman Chowdhury at MIT Sloan Management Review:  “Artificial intelligence has had some justifiably bad press recently. Some of the worst stories have been about systems that exhibit racial or gender bias in facial recognition applications or in evaluating people for jobs, loans, or other considerations. One program was routinely recommending longer prison sentences for blacks than for whites on the basis of the flawed use of recidivism data.

But what if instead of perpetuating harmful biases, AI helped us overcome them and make fairer decisions? That could eventually result in a more diverse and inclusive world. What if, for instance, intelligent machines could help organizations recognize all worthy job candidates by avoiding the usual hidden prejudices that derail applicants who don’t look or sound like those in power or who don’t have the “right” institutions listed on their résumés? What if software programs were able to account for the inequities that have limited the access of minorities to mortgages and other loans? In other words, what if our systems were taught to ignore data about race, gender, sexual orientation, and other characteristics that aren’t relevant to the decisions at hand?

AI can do all of this — with guidance from the human experts who create, train, and refine its systems. Specifically, the people working with the technology must do a much better job of building inclusion and diversity into AI design by using the right data to train AI systems to be inclusive and thinking about gender roles and diversity when developing bots and other applications that engage with the public.

Design for Inclusion

Software development remains the province of males — only about one-quarter of computer scientists in the United States are women — and minority racial groups, including blacks and Hispanics, are underrepresented in tech work, too. Groups like Girls Who Code and AI4ALL have been founded to help close those gaps. Girls Who Code has reached almost 90,000 girls from various backgrounds in all 50 states, and AI4ALL specifically targets girls in minority communities….(More)”.