Unleashing the Power of Data to Serve the American People


Memorandum: Unleashing the Power of Data to Serve the American People
To: The American People
From: Dr. DJ Patil, Deputy U.S. CTO for Data Policy and Chief Data Scientist

….While there is a rich history of companies using data to their competitive advantage, the disproportionate beneficiaries of big data and data science have been Internet technologies like social media, search, and e-commerce. Yet transformative uses of data in other spheres are just around the corner. Precision medicine and other forms of smarter health care delivery, individualized education, and the “Internet of Things” (which refers to devices like cars or thermostats communicating with each other using embedded sensors linked through wired and wireless networks) are just a few of the ways in which innovative data science applications will transform our future.

The Obama administration has embraced the use of data to improve the operation of the U.S. government and the interactions that people have with it. On May 9, 2013, President Obama signed Executive Order 13642, which made open and machine-readable data the new default for government information. Over the past few years, the Administration has launched a number of Open Data Initiatives aimed at scaling up open data efforts across the government, helping make troves of valuable data — data that taxpayers have already paid for — easily accessible to anyone. In fact, I used data made available by the National Oceanic and Atmospheric Administration to improve numerical methods of weather forecasting as part of my doctoral work. So I know firsthand just how valuable this data can be — it helped get me through school!

Given the substantial benefits that responsibly and creatively deployed data can provide to us and our nation, it is essential that we work together to push the frontiers of data science. Given the importance this Administration has placed on data, along with the momentum that has been created, now is a unique time to establish a legacy of data supporting the public good. That is why, after a long time in the private sector, I am returning to the federal government as the Deputy Chief Technology Officer for Data Policy and Chief Data Scientist.

Organizations are increasingly realizing that in order to maximize their benefit from data, they require dedicated leadership with the relevant skills. Many corporations, local governments, federal agencies, and others have already created such a role, which is usually called the Chief Data Officer (CDO) or the Chief Data Scientist (CDS). The role of an organization’s CDO or CDS is to help their organization acquire, process, and leverage data in a timely fashion to create efficiencies, iterate on and develop new products, and navigate the competitive landscape.

The Role of the First-Ever U.S. Chief Data Scientist

Similarly, my role as the U.S. CDS will be to responsibly source, process, and leverage data in a timely fashion to enable transparency, provide security, and foster innovation for the benefit of the American public, in order to maximize the nation’s return on its investment in data.

So what specifically am I here to do? As I start, I plan to focus on these four activities:

…(More)”

Amid Open Data Push, Agencies Feel Urge for Analytics


Jack Moore at NextGov: “Federal agencies, thanks to their unique missions, have long been collectors of valuable, vital and, no doubt, arcane data. Under a nearly two-year-old executive order from President Barack Obama, agencies are releasing more of this data in machine-readable formats to the public and entrepreneurs than ever before.
But agencies still need a little help parsing through this data for their own purposes. They are turning to industry, academia and outside researchers for cutting-edge analytics tools to parse through their data to derive insights and to use those insights to drive decision-making.
Take the U.S. Agency for International Development, for example. The agency administers U.S. foreign aid programs aimed at ending extreme poverty and helping support democratic societies around the globe.
Under the agency’s own recent open data policy, it’s started collecting reams of data from its overseas missions. Starting Oct. 1, organizations doing development work on the ground – including through grants and contracts – have been directed to also collect data generated by their work and submit it back to agency headquarters. Teams go through the data, scrub it to remove sensitive material and then publish it.
The data spans the gamut from information on land ownership in South Sudan to livestock demographics in Senegal and HIV prevention activities in Zambia….The agency took the first step in solving that problem with a Jan. 20 request for information from outside groups for cutting-edge data analytics tools.
“Operating units within USAID are sometimes constrained by existing capacity to transform data into insights that could inform development programming,” the RFI stated.
The RFI queries industry on their capabilities in data mining and social media analytics and forecasting and systems modeling.
USAID is far from alone in its quest for data-driven decision-making.
A Jan. 26 RFI from the Transportation Department’s Federal Highway Administration also seeks innovative ideas from industry for “advanced analytical capabilities.”…(More)”

Opinion Mining in Social Big Data


New Paper by Wlodarczak, Peter and Ally, Mustafa and Soar, Jeffrey: “Opinion mining has rapidly gained importance due to the unprecedented amount of opinionated data on the Internet. People share their opinions on products, services, they rate movies, restaurants or vacation destinations. Social Media such as Facebook or Twitter has made it easier than ever for users to share their views and make it accessible for anybody on the Web. The economic potential has been recognized by companies who want to improve their products and services, detect new trends and business opportunities or find out how effective their online marketing efforts are. However, opinion mining using social media faces many challenges due to the amount and the heterogeneity of the available data. Also, spam or fake opinions have become a serious issue. There are also language related challenges like the usage of slang and jargon on social media or special characters like smileys that are widely adopted on social media sites.
These challenges create many interesting research problems such as determining the influence of social media on people’s actions, understanding opinion dissemination or determining the online reputation of a company. Not surprisingly opinion mining using social media has become a very active area of research, and a lot of progress has been made over the last years. This article describes the current state of research and the technologies that have been used in recent studies….(More)”
 

The Tricky Task of Rating Neighborhoods on 'Livability'


Tanvi Misra at CityLab: “Jokubas Neciunas was looking to buy an apartment almost two years back in Vilnius, Lithuania. He consulted real estate platforms and government data to help him decide the best option for him. In the process, he realized that there was a lot of information out there, but no one was really using it very well.
Fast-forward two years, and Neciunas and his colleagues have created PlaceILive.com—a start-up trying to leverage open data from cities and information from social media to create a holistic, accessible tool that measures the “livability” of any apartment or house in a city.
“Smart cities are the ones that have smart citizens,” says PlaceILive co-founder Sarunas Legeckas.
The team recognizes that foraging for relevant information in the trenches of open data might not be for everyone. So they tried to “spice it up” by creating a visually appealing, user-friendly portal for people looking for a new home to buy or rent. The creators hope PlaceILive becomes a one-stop platform where people find ratings on every quality-of-life metric important to them before their housing hunt begins.
In its beta form, the site features five cities—New York, Chicago, San Francisco, London and Berlin. Once you click on the New York portal, for instance, you can search for the place you want to know about by borough, zip code, or address. I pulled up Brooklyn….The index is calculated using a variety of public information sources (from transit agencies, police departments, and the Census, for instance) as well as other available data (from the likes of Google, Socrata, and Foursquare)….(More)”

The Epidemic of Facelessness


Stephen Marche in the New York Times: “….Every month brings fresh figuration to the sprawling, shifting Hieronymus Bosch canvas of faceless 21st-century contempt. Faceless contempt is not merely topical. It is increasingly the defining trait of topicality itself. Every day online provides its measure of empty outrage.

When the police come to the doors of the young men and women who send notes telling strangers that they want to rape them, they and their parents are almost always shocked, genuinely surprised that anyone would take what they said seriously, that anyone would take anything said online seriously. There is a vast dissonance between virtual communication and an actual police officer at the door. It is a dissonance we are all running up against more and more, the dissonance between the world of faces and the world without faces. And the world without faces is coming to dominate…..

The Gyges effect, the well-noted disinhibition created by communications over the distances of the Internet, in which all speech and image are muted and at arm’s reach, produces an inevitable reaction — the desire for impact at any cost, the desire to reach through the screen, to make somebody feel something, anything. A simple comment can so easily be ignored. Rape threat? Not so much. Or, as Mr. Nunn so succinctly put it on Twitter: “If you can’t threaten to rape a celebrity, what is the point in having them?”

The challenge of our moment is that the face has been at the root of justice and ethics for 2,000 years. The right to face an accuser is one of the very first principles of the law, described in the “confrontation clause” of the Sixth Amendment of the United States Constitution, but reaching back through English common law to ancient Rome. In Roman courts no man could be sentenced to death without first seeing his accuser. The precondition of any trial, of any attempt to reconcile competing claims, is that the victim and the accused look each other in the face.

For the great French-Jewish philosopher Emmanuel Levinas, the encounter with another’s face was the origin of identity — the reality of the other preceding the formation of the self. The face is the substance, not just the reflection, of the infinity of another person. And from the infinity of the face comes the sense of inevitable obligation, the possibility of discourse, the origin of the ethical impulse.

The connection between the face and ethical behavior is one of the exceedingly rare instances in which French phenomenology and contemporary neuroscience coincide in their conclusions. A 2009 study by Marco Iacoboni, a neuroscientist at the Ahmanson-Lovelace Brain Mapping Center at the University of California, Los Angeles, explained the connection: “Through imitation and mimicry, we are able to feel what other people feel. By being able to feel what other people feel, we are also able to respond compassionately to other people’s emotional states.” The face is the key to the sense of intersubjectivity, linking mimicry and empathy through mirror neurons — the brain mechanism that creates imitation even in nonhuman primates.

The connection goes the other way, too. Inability to see a face is, in the most direct way, inability to recognize shared humanity with another. In a metastudy of antisocial populations, the inability to sense the emotions on other people’s faces was a key correlation. There is “a consistent, robust link between antisocial behavior and impaired recognition of fearful facial affect. Relative to comparison groups, antisocial populations showed significant impairments in recognizing fearful, sad and surprised expressions.” A recent study in the Journal of Vision showed that babies between the ages of 4 months and 6 months recognized human faces at the same level as grown adults, an ability which they did not possess for other objects. …

The neurological research demonstrates that empathy, far from being an artificial construct of civilization, is integral to our biology. And when biological intersubjectivity disappears, when the face is removed from life, empathy and compassion can no longer be taken for granted.

The new facelessness hides the humanity of monsters and of victims both. Behind the angry tangles of wires, the question is, how do we see their faces again?

(More)”

A lot of private-sector data is also used for public good


Josh New in Computerworld: “As the private sector continues to invest in data-driven innovation, the capacity for society to benefit from this data collection grows as well. Much has been said about how the private sector is using the data it collects to improve corporate bottom lines, but positive stories about how that data contributes to the greater public good are largely unknown.
This is unfortunate, because data collected by the private sector is being used in a variety of important ways, including to advance medical research, to help students make better academic decisions and to provide government agencies and nonprofits with actionable insights. However, overzealous actions by government to restrict the collection and use of data by the private sector are likely to have a chilling effect on such data-driven innovation.
Companies are working to advance medical research with data sharing. Personal genetics company 23andMe, which offers its customers inexpensive DNA test kits, has obtained consent from three-fourths of its 800,000 customers to donate their genetic information for research purposes. 23andMe has partnered with pharmaceutical companies, such as Genentech and Pfizer, to advance genomics research by providing scientists with the data needed to develop new treatments for diseases like Crohn’s and Parkinson’s. The company has also worked with researchers to leverage its network of customers to recruit patients for clinical trials more effectively than through previous protocols.
Private-sector data is also helping students make more informed decisions about education. With the cost of attending college rising, data that helps make this investment worthwhile is incredibly valuable. The social networking company LinkedIn has built tools that provide prospective college students with valuable information about their potential career path, field of study and choice of school. By analyzing the education tracks and careers of its users, LinkedIn can offer students critical data-driven insights into how to make the best out of the enormous and costly decision to go to college. Through LinkedIn’s higher-education tools, students now have an unprecedented resource to develop data-supported education and career plans….(More)”

Data Driven: Creating a Data Culture


New book by Hilary Mason and DJ Patil: “Succeeding with data isn’t just a matter of putting Hadoop in your machine room, or hiring some physicists with crazy math skills. It requires you to develop a data culture that involves people throughout the organization. In this O’Reilly report, DJ Patil and Hilary Mason outline the steps you need to take if your company is to be truly data-driven—including the questions you should ask and the methods you should adopt.
You’ll not only learn examples of how Google, LinkedIn, and Facebook use their data, but also how Walmart, UPS, and other organizations took advantage of this resource long before the advent of Big Data. No matter how you approach it, building a data culture is the key to success in the 21st century.
You’ll explore:

  • Data scientist skills—and why every company needs a Spock
  • How the benefits of giving company-wide access to data outweigh the costs
  • Why data-driven organizations use the scientific method to explore and solve data problems
  • Key questions to help you develop a research-specific process for tackling important issues
  • What to consider when assembling your data team
  • Developing processes to keep your data team (and company) engaged
  • Choosing technologies that are powerful, support teamwork, and are easy to use and learn …(More)”

R U There?


In the New Yorker, on a new counselling service that harnesses the power of the text message: “…. a person can contact Crisis Text Line without even looking at her phone. The number—741741—traces a simple, muscle-memory-friendly path down the left column of the keypad. Anyone who texts in receives an automatic response welcoming her to the service. Another provides a link to the organization’s privacy policy and explains that she can text “STOP” to end a conversation at any time. Meanwhile, the incoming message appears on the screen of Crisis Text Line’s proprietary computer system. The interface looks remarkably like a Facebook feed—pale background, blue banner at the top, pop-up messages in the lower right corner—a design that is intended to feel familiar and frictionless. The system, which receives an average of fifteen thousand texts a day, highlights messages containing words that might indicate imminent danger, such as “suicide,” “kill,” and “hopeless.”

Within five minutes, one of the counsellors on duty will write back. (Up to fifty people, most of them in their late twenties, are available at any given time, depending upon demand, and they can work wherever there’s an Internet connection.) An introductory message from a counsellor includes a casual greeting and a question about why the texter is writing in….(More)”
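
The keyword highlighting described in this excerpt could be sketched roughly as follows. This is a minimal illustration, not Crisis Text Line's actual system: the three words are the only ones the article names, and the function names `flag_message` and `prioritize` are hypothetical.

```python
import re

# The three example words quoted in the article; a real list would be
# longer and is not given in the source.
RISK_WORDS = ("suicide", "kill", "hopeless")

# Match whole words only, ignoring case, so "skills" does not match "kill".
_RISK_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in RISK_WORDS) + r")\b",
    re.IGNORECASE,
)


def flag_message(text):
    """Return the sorted, de-duplicated risk words found in a message."""
    return sorted({match.group(1).lower() for match in _RISK_PATTERN.finditer(text)})


def prioritize(messages):
    """Put flagged messages first; arrival order is kept within each group."""
    # sorted() is stable, and False (flagged) sorts before True (not flagged).
    return sorted(messages, key=lambda message: not flag_message(message))
```

Plain keyword matching like this is only a first-pass filter: it ignores context and phrasing, which is presumably why the article describes the flagged messages as being highlighted for human counsellors rather than handled automatically.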

Participation in Public and Social Media Interactions


New Book edited by Marta Dynel and Jan Chovanec: “This book deals with participation frameworks in modern social and public media. It brings together several cutting-edge research studies that offer exciting new insights into the nature and formats of interpersonal communication in diverse technology-mediated contexts. Some papers introduce new theoretical extensions to participation formats, while others present case studies in various discourse domains spanning public and private genres. Adopting the perspective of the pragmatics of interaction, these contributions discuss data ranging from public, mass-mediated and quasi-authentic texts, fully staged and scripted textual productions, to authentic, non-scripted private messages and comments, both of a permanent and ephemeral nature. The analyses include news interviews, online sports reporting, sitcoms, comedy shows, stand-up comedies, drama series, institutional and personal blogs, tweets, follow-up YouTube video commentaries, and Facebook status updates. All the authors emphasize the role of context and pay attention to how meaning is constructed by participants in interactions in increasingly complex participation frameworks existing in traditional as well as novel technologically mediated interactions….(Table of Contents)”.

Have a smartphone? This start-up will turn you into a lobbyist.


In the Washington Post: “…We needed a solution; some way to turn that passion into action right there in the moment.”
So she built one – but not on her own. In late 2012, Hartsock lured two entrepreneurs, Jeb Ory and Patrick Stoddart, away from tech start-ups they were already running to launch a new venture called Phone2Action. The company now provides an online and mobile platform that helps private companies, nonprofits and trade associations connect their customers and supporters with local and federal policymakers.
In short, the company’s clients – which include the Consumer Electronics Association, the American Heart Association, Ford and ridesharing company Lyft – pay a subscription for access to Phone2Action’s software tools, which allow them to build digital campaign pages featuring stories or information that illustrate the importance of, say, patent reform legislation (for CEA) or eased transportation regulations (for Lyft). On the side of the page, visitors can input their name, Zip code and e-mail address, and the site will automatically populate an e-mail, tweet and Facebook post (authored by, in this case, CEA or Lyft) expressing support and urging policymakers to vote in favor of the cause or legislation.
One more click, and those messages are automatically sent to the inboxes and social media feeds of the proper elected officials, based on the individual’s Zip code.
The idea, Ory said, is two-fold: One, to give individuals an easier way to connect with elected officials, and two, to give Phone2Action’s clients a more effective way to harness the lobbying power of their supporters. What made that possible, he explained, was really the proliferation of smartphones….While Phone2Action’s campaign model has proven viable, several hurdles still stand in the company’s way – not the least of which is the sense of powerlessness felt by many Americans when it comes to public policy and today’s legislative process…(More).”
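
The Zip-code routing step described in this excerpt could look something like the sketch below. It is an assumption-laden illustration, not Phone2Action's software: the lookup table, the placeholder Zip codes, the `Official` type and the `build_messages` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Official:
    name: str
    email: str


# Hypothetical lookup table with placeholder Zip codes and addresses;
# a real service would use a maintained district database.
OFFICIALS_BY_ZIP = {
    "00001": [Official("Rep. Example One", "one@example.org")],
    "00002": [
        Official("Rep. Example Two", "two@example.org"),
        Official("Sen. Example Three", "three@example.org"),
    ],
}


def build_messages(supporter_name, zip_code, supporter_email, template):
    """Fill the client's template once per official mapped to the Zip code."""
    officials = OFFICIALS_BY_ZIP.get(zip_code, [])  # unknown Zip: no messages
    return [
        {
            "to": official.email,
            "reply_to": supporter_email,
            "body": template.format(official=official.name, supporter=supporter_name),
        }
        for official in officials
    ]
```

For example, `build_messages("Pat", "00002", "pat@example.org", "Dear {official}, please support this bill. {supporter}")` would produce two pre-filled messages, one per official in the table, ready to send on the visitor's final click.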