Mashable: “Could a sophisticated algorithm be the future of science? One innovative economist thinks so.
Phil Parker, who holds a doctorate in business economics from the Wharton School, has built an algorithm that auto-writes books. Now he’s taking that model and applying it to loftier goals than simply penning periodicals: namely, medicine and forensics. Working with professors and researchers at NYU, Parker is trying to decode complex genetic structures and find cures for diseases. And he’s doing it with the help of man’s real best friend: technology.
Parker’s recipe is a complex computer program that mimics formulaic writing….
Parker’s been at this for years. His formula, originally used for printing, is able to churn out entire books in minutes. It’s similar to the work being done by Narrative Science and StatSheet, except those companies are known for short form auto-writing for newspapers. Parker’s work is much longer, focusing on obscure non-fiction and even poetry.
It’s not creative writing, though, and Parker isn’t interested in introspection, exploring emotion or storytelling. He’s interested in exploiting reproducible patterns — that’s how his algorithm can find, collect and “write” so quickly. And how he can apply that model to other disciplines, like science.
Parker’s method seems to be a success; indeed, his ICON Group International, Inc., has auto-written so many books that Parker has lost count. But this isn’t the holy grail of literature, he insists. Instead, he says, his work is a play on mechanizing processes to create a simple formula. And he thinks that “finding new knowledge structures within data” stretches far beyond print.”
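Parker has not published his code, but the general technique the article describes, exploiting reproducible patterns rather than writing creatively, can be illustrated with a minimal template-filling sketch. The template, field names, and records below are invented for illustration and are not Parker's actual system:

```python
# Minimal sketch of formulaic text generation: fill a fixed template
# with facts pulled from structured records. This illustrates the
# general technique only, not Parker's actual system.

TEMPLATE = (
    "The {year} outlook for {product} in {country} projects demand of "
    "{demand:,} units, driven primarily by the {sector} sector."
)

# Hypothetical structured records standing in for a facts database.
records = [
    {"year": 2013, "product": "industrial fasteners", "country": "Brazil",
     "demand": 125000, "sector": "construction"},
    {"year": 2013, "product": "industrial fasteners", "country": "Poland",
     "demand": 84000, "sector": "automotive"},
]

def write_report(record: dict) -> str:
    """Produce one formulaic sentence from one structured record."""
    return TEMPLATE.format(**record)

if __name__ == "__main__":
    for record in records:
        print(write_report(record))
```

Scaled up across thousands of records and many templates, this kind of mechanized pattern-filling is what lets a single program produce book-length output in minutes.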
Techs and the City
Alec Appelbaum, who teaches at Pratt Institute in The New York Times: “THIS spring New York City is rolling out its much-ballyhooed bike-sharing program, which relies on a sophisticated set of smartphone apps and other digital tools to manage it. The city isn’t alone: across the country, municipalities are buying ever more complicated technological “solutions” for urban life.
But higher tech is not always essential tech. Cities could instead be making savvier investments in cheaper technology that may work better to stoke civic involvement than the more complicated, expensive products being peddled by information-technology developers….
To be sure, big tech can zap some city weaknesses. According to I.B.M., its predictive-analysis technology, which examines historical data to estimate the next crime hot spots, has helped Memphis lower its violent crime rate by 30 percent.
But many problems require a decidedly different approach. Take the seven-acre site in Lower Manhattan called the Seward Park Urban Renewal Area, where 1,000 mixed-income apartments are set to rise. A working-class neighborhood that fell to bulldozers in 1969, it stayed bare as co-ops nearby filled with affluent families, including my own.
In 2010, with the city ready to invite developers to bid for the site, long-simmering tensions between nearby public-housing tenants and wealthier dwellers like me turned suddenly — well, civil.
What changed? Was it some multimillion-dollar “open democracy” platform from Cisco, or a Big Data program to suss out the community’s real priorities? Nope. According to Dominic Pisciotta Berg, then the chairman of the local community board, it was plain old e-mail, and the dialogue it facilitated. “We simply set up an e-mail box dedicated to receiving e-mail comments” on the renewal project, and organizers would then “pull them together by comment type and then consolidate them for display during the meetings,” he said. “So those who couldn’t be there had their voices considered and those who were there could see them up on a screen and adopted, modified or rejected.”
Through e-mail conversations, neighbors articulated priorities — permanently affordable homes, a movie theater, protections for small merchants — that even a supercomputer wouldn’t necessarily have identified in the data.
The point is not that software is useless. But like anything else in a city, it’s only as useful as its ability to facilitate the messy clash of real human beings and their myriad interests and opinions. And often, it’s the simpler software, the technology that merely puts people in contact and steps out of the way, that works best.”
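The workflow Berg describes, collecting e-mailed comments, grouping them by type, and consolidating them for display at a meeting, needs nothing more elaborate than a few lines of code. A minimal sketch, with invented comment data and category labels:

```python
# Minimal sketch: consolidate e-mailed comments by type for display
# at a community meeting. The comments and categories are invented.
from collections import defaultdict

comments = [
    ("affordability", "Keep a share of the apartments permanently affordable."),
    ("retail", "Protect the small merchants already on the block."),
    ("amenities", "A movie theater would serve the whole neighborhood."),
    ("affordability", "Set aside units for existing public-housing tenants."),
]

def consolidate(items):
    """Group comments by type so each topic can be shown on screen together."""
    grouped = defaultdict(list)
    for category, text in items:
        grouped[category].append(text)
    return grouped

if __name__ == "__main__":
    for category, texts in sorted(consolidate(comments).items()):
        print(f"{category.upper()} ({len(texts)} comments)")
        for text in texts:
            print(f"  - {text}")
```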
The Dictatorship of Data
Kenneth Cukier and Viktor Mayer-Schönberger in MIT Technology Review: “Big data is poised to transform society, from how we diagnose illness to how we educate children, even making it possible for a car to drive itself. Information is emerging as a new economic input, a vital resource. Companies, governments, and even individuals will be measuring and optimizing everything possible.
But there is a dark side. Big data erodes privacy. And when it is used to make predictions about what we are likely to do but haven’t yet done, it threatens freedom as well. Yet big data also exacerbates a very old problem: relying on the numbers when they are far more fallible than we think. Nothing underscores the consequences of data analysis gone awry more than the story of Robert McNamara.”
Empowering Consumers through the Smart Disclosure of Data
OSTP: “Today, the Administration’s interagency National Science and Technology Council released Smart Disclosure and Consumer Decision Making: Report of the Task Force on Smart Disclosure—the first comprehensive description of the Federal Government’s efforts to promote the smart disclosure of information that can help consumers make wise decisions in the marketplace.
Whether they are searching for colleges, health insurance, credit cards, airline flights, or energy providers, consumers can find it difficult to identify the specific product or service that best suits their particular needs. In some cases, the effort required to sift through all of the available information is so large that consumers make decisions using inadequate information. As a result, they may overpay, miss out on a product that would better meet their needs, or be surprised by fees.
The report released today outlines ways in which Federal agencies and other governmental and non-governmental organizations can use—and in many cases are already using—smart disclosure approaches that increase market transparency and empower consumers facing complex choices in domains such as health, education, energy and personal finance.”
A Data-Powered Revolution in Health Care
Todd Park @ White House Blog: “Thomas Friedman’s New York Times column, Obamacare’s Other Surprise, highlights a rising tide of innovation that has been unleashed by the Affordable Care Act and the Administration’s health IT and data initiatives. Supported by digital data, new data-driven tools, and payment policies that reward improving the quality and value of care, doctors, hospitals, patients, and entrepreneurs across the nation are demonstrating that smarter, better, more accessible, and more proactive care is the best way to improve quality and control health care costs.
We are witnessing the emergence of a data-powered revolution in health care. Catalyzed by the Recovery Act, adoption of electronic health records is increasing dramatically. More than half of all doctors and other eligible providers and nearly 80 percent of hospitals are using electronic health records to improve care, an increase of more than 200 percent since 2008. In addition, the Administration’s Health Data Initiative is making a growing supply of key government data on everything from hospital charges and quality to regional health care system performance statistics freely available in computer-readable, downloadable form, as fuel for innovation, entrepreneurship, and discovery.
"A bite of me"
I spend hours every day surfing the internet. Meanwhile, companies like Facebook and Google have been using my online information (the websites I visit, the friends I have, the videos I watch) for their own benefit.
In 2012, advertising revenue in the United States was around $30 billion. That same year, I made exactly $0 from my own data. But what if I tracked everything myself? Could I at least make a couple bucks back?
I started looking at the terms of service for the websites I often use. In their privacy policies, I have found sentences like this: “You grant a worldwide, non-exclusive, royalty-free license to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such content in any and all media or distribution methods (now known or later developed).” I’ve basically agreed to give away a lifelong, international, sub-licensable right to use my personal data….
Check out myprivacy.info to see some of the visualizations I’ve made.
http://myprivacy.info”
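Tracking your own browsing in the spirit of the project can start very simply. Here is a hypothetical sketch that tallies time spent per domain from a self-recorded log; the CSV format and file name are assumptions for illustration, not the format used by myprivacy.info:

```python
# Hypothetical sketch: tally time spent per domain from a self-recorded
# browsing log. The CSV columns (timestamp, url, seconds) and file name
# are assumptions, not part of the myprivacy.info project.
import csv
from collections import Counter
from urllib.parse import urlparse

def time_per_domain(log_path: str) -> Counter:
    """Sum the seconds spent on each domain recorded in the log."""
    totals = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            domain = urlparse(row["url"]).netloc
            totals[domain] += int(row["seconds"])
    return totals

if __name__ == "__main__":
    for domain, seconds in time_per_domain("browsing_log.csv").most_common(10):
        print(f"{domain}: {seconds / 3600:.1f} hours")
```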
Principles and Practices for a Federal Statistical Agency
New National Academies Publication: “Publicly available statistics from government agencies that are credible, relevant, accurate, and timely are essential for policy makers, individuals, households, businesses, academic institutions, and other organizations to make informed decisions. Even more, the effective operation of a democratic system of government depends on the unhindered flow of statistical information to its citizens.
In the United States, federal statistical agencies in cabinet departments and independent agencies are the governmental units whose principal function is to compile, analyze, and disseminate information for such statistical purposes as describing population characteristics and trends, planning and monitoring programs, and conducting research and evaluation. The work of these agencies is coordinated by the U.S. Office of Management and Budget. Statistical agencies may acquire information not only from surveys or censuses of people and organizations, but also from such sources as government administrative records, private-sector datasets, and Internet sources that are judged of suitable quality and relevance for statistical use. They may conduct analyses, but they do not advocate policies or take partisan positions. Statistical purposes for which they provide information relate to descriptions of groups and exclude any interest in or identification of an individual person, institution, or economic unit.
Four principles are fundamental for a federal statistical agency: relevance to policy issues, credibility among data users, trust among data providers, and independence from political and other undue external influence. Principles and Practices for a Federal Statistical Agency: Fifth Edition explains these four principles in detail.”
Life and Death of Tweets Not so Random After All
MIT Technology Review: “MIT assistant professor Tauhid Zaman and two other researchers (Emily Fox at the University of Washington and Eric Bradlow at the University of Pennsylvania’s Wharton School) have come up with a model that can predict how many times a tweet will ultimately be retweeted, minutes after it is posted. The model was created by collecting retweets on a slew of topics and looking at the time when the original tweet was posted and how fast it spread. That knowledge is then used to predict how popular a new tweet will be, based on how many times it is retweeted shortly after it is first posted.
The researchers’ findings were explained in a paper submitted to the Annals of Applied Statistics. In the paper, the authors note that “understanding retweet behavior could lead to a better understanding of how broader ideas spread in Twitter and in other social networks,” and such data may be helpful in a number of areas, like marketing and political campaigning.
You can check out the model here.”
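The paper describes a statistical model of retweet dynamics; its basic intuition, that the retweets a tweet attracts in its first minutes are informative about its eventual total, can be shown with a much cruder sketch. The extrapolation below, fitted on invented numbers, is only an illustration of that intuition and is not the authors' model:

```python
# Crude sketch of the intuition behind early retweet-count prediction:
# fit the average ratio of final retweets to retweets seen in the first
# 10 minutes, then extrapolate for a new tweet. This is NOT the model
# from the Zaman et al. paper; the data below are invented.

# (retweets in first 10 minutes, final retweet count) for past tweets
history = [(5, 40), (12, 90), (3, 20), (30, 260), (8, 55)]

def fit_ratio(history):
    """Average multiplier from early retweet count to final count."""
    ratios = [final / early for early, final in history if early > 0]
    return sum(ratios) / len(ratios)

def predict_final(early_count, ratio):
    """Extrapolate the eventual retweet total from the early count."""
    return round(early_count * ratio)

if __name__ == "__main__":
    ratio = fit_ratio(history)
    print(f"Fitted multiplier: {ratio:.2f}")
    print(f"A tweet with 10 retweets in its first 10 minutes is predicted "
          f"to end near {predict_final(10, ratio)} retweets.")
```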
Cities must do more with data than ‘crowdsource pothole locations’
Technically: “Using data from citizen-powered mobile and web apps has become such a clear best practice for city governments that a new question was the focus at the 7th annual Mayors’ Innovation Summit held in Philadelphia last week. What’s next?…
When it comes to moving the civic technology movement forward, the consensus was twofold: we need to continue reaching out to new user bases and seeking better ways to make sense of the data we’re collecting. (A similar need for deeper goals also came out of a civic innovation panel.)
“We’re going to have to get better as cities at processing all of this info,” said Mesa, Ariz., Mayor Scott Smith, whose iMesa application invites ideas from citizens for how to make Mesa a better place to live. He reported that the app is already becoming overloaded with data, in his words, “a great problem to have.”
If My Data Is an Open Book, Why Can’t I Read It?
Natasha Singer in the New York Times: “Never mind all the hoopla about the presumed benefits of an “open data” society. In our day-to-day lives, many of us are being kept in the data dark.
“The fact that I am producing data and companies are collecting it to monetize it, if I can’t get a copy myself, I do consider it unfair,” says Latanya Sweeney, the director of the Data Privacy Lab at Harvard, where she is a professor of government and technology….
In fact, a few companies are challenging the norm of corporate data hoarding by actually sharing some information with the customers who generate it — and offering tools to put it to use. It’s a small but provocative trend in the United States, where only a handful of industries, like health care and credit, are required by federal law to provide people with access to their records.
Last year, San Diego Gas and Electric, a utility, introduced an online energy management program in which customers can view their electricity use in monthly, daily or hourly increments. There is even a practical benefit: customers can earn credits by reducing energy consumption during peak hours….
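The kind of view SDG&E describes, hourly readings rolled up into daily or monthly totals with peak hours broken out, is straightforward once the data is in the customer's hands. A minimal sketch with invented readings and an assumed 4 p.m. to 9 p.m. peak window (the utility's actual data format and peak definition may differ):

```python
# Minimal sketch: roll hourly meter readings up into a daily total and a
# peak-hour subtotal. The readings and the 16:00-21:00 peak window are
# invented for illustration; the utility's actual definitions may differ.

# 24 hourly readings in kWh for one day (invented data)
hourly_kwh = [0.4] * 7 + [0.9] * 9 + [1.6] * 5 + [0.7] * 3
PEAK_HOURS = range(16, 21)  # assumed 4 p.m. to 9 p.m. peak window

daily_total = sum(hourly_kwh)
peak_total = sum(hourly_kwh[h] for h in PEAK_HOURS)

print(f"Daily use:     {daily_total:.1f} kWh")
print(f"Peak-hour use: {peak_total:.1f} kWh "
      f"({peak_total / daily_total:.0%} of the day)")
```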