What Cars Did for Today’s World, Data May Do for Tomorrow’s


Quentin Hardy in the New York Times: “New technology products head at us constantly. There’s the latest smartphone, the shiny new app, the hot social network, even the smarter thermostat.

As great (or not) as all these may be, each thing is a small part of a much bigger process that’s rarely admired. They all belong inside a world-changing ecosystem of digital hardware and software, spreading into every area of our lives.

Thinking about what is going on behind the scenes is easier if we consider the automobile, also known as “the machine that changed the world.” Cars succeeded through the widespread construction of highways and gas stations. Those things created a global supply chain of steel plants and refineries. Seemingly unrelated things, including suburbs, fast food and drive-time talk radio, arose in the wake of that success.

Today’s dominant industrial ecosystem is relentlessly acquiring and processing digital information. It demands newer and better ways of collecting, shipping, and processing data, much the way cars needed better road building. And it’s spinning out its own unseen businesses.

A few recent developments illustrate the new ecosystem. General Electric plans to announce Monday that it has created a “data lake” method of analyzing sensor information from industrial machinery in places like railroads, airlines, hospitals and utilities. G.E. has been putting sensors on everything it can for a couple of years, and now it is out to read all that information quickly.

The company, working with an outfit called Pivotal, said that in the last three months it has looked at information from 3.4 million miles of flights by 24 airlines using G.E. jet engines. G.E. said it figured out things like possible defects 2,000 times as fast as it could before.

The company has to, since it’s getting so much more data. “In 10 years, 17 billion pieces of equipment will have sensors,” said William Ruh, vice president of G.E. software. “We’re only one-tenth of the way there.”

It hardly matters if Mr. Ruh is off by five billion or so. Billions of humans are already augmenting that number with their own packages of sensors, called smartphones, fitness bands and wearable computers. Almost all of that will get uploaded someplace too.

Shipping that data creates challenges. In June, researchers at the University of California, San Diego announced a method of engineering fiber optic cable that could make digital networks run 10 times faster. The idea is to get more parts of the system working closer to the speed of light, without involving the “slow” processing of electronic semiconductors.

“We’re going from millions of personal computers and billions of smartphones to tens of billions of devices, with and without people, and that is the early phase of all this,” said Larry Smarr, director of the California Institute for Telecommunications and Information Technology, located inside U.C.S.D. “A gigabit a second was fast in commercial networks; now we’re at 100 gigabits a second. A terabit a second will come and go. A petabit a second will come and go.”

In other words, Mr. Smarr thinks commercial networks will eventually be 10,000 times as fast as today’s best systems. “It will have to grow, if we’re going to continue what has become our primary basis of wealth creation,” he said.
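Smarr’s numbers are internally consistent, as a quick back-of-the-envelope check shows (the rates are from his quote; the decimal SI framing and the code are mine):

```python
# Back-of-the-envelope check of Smarr's projection (decimal SI units).
gigabit = 1e9                      # bits per second
current_best = 100 * gigabit       # today's 100 Gb/s commercial networks
petabit = 1e15                     # the projected 1 Pb/s rate

speedup = petabit / current_best
print(f"{speedup:,.0f}x")          # 10,000x, matching the article's figure
```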

Add computation to collection and transport. Last month, U.C. Berkeley’s AMP Lab, created two years ago for research into new kinds of large-scale computing, spun out a company called Databricks, which uses new kinds of software for fast data analysis on a rental basis. Databricks plugs into the one million-plus computer servers inside the global system of Amazon Web Services, and will soon work inside similar-size megacomputing systems from Google and Microsoft.

It was the second company out of the AMP Lab this year. The first, called Mesosphere, enables a kind of pooling of computing services, building the efficiency of even million-computer systems….”

Monitoring Arms Control Compliance With Web Intelligence


Chris Holden and Maynard Holliday at Commons Lab: “Traditional monitoring of arms control treaties, agreements, and commitments has required the use of National Technical Means (NTM)—large satellites, phased array radars, and other technological solutions. NTM was a good solution when the treaties focused on large items for observation, such as missile silos or nuclear test facilities. As the targets of interest have shrunk by orders of magnitude, the need for other, more ubiquitous, sensor capabilities has increased. The rise in web-based, or cloud-based, analytic capabilities will have a significant influence on the future of arms control monitoring and the role of citizen involvement.
Since 1999, the U.S. Department of State has had at its disposal the Key Verification Assets Fund (V Fund), which was established by Congress. The Fund helps preserve critical verification assets and promotes the development of new technologies that support the verification of and compliance with arms control, nonproliferation, and disarmament requirements.
Sponsored by the V Fund to advance web-based analytic capabilities, Sandia National Laboratories, in collaboration with Recorded Future (RF), synthesized open-source data streams from a wide variety of traditional and nontraditional web sources in multiple languages along with topical texts and articles on national security policy to determine the efficacy of monitoring chemical and biological arms control agreements and compliance. The team used novel technology involving linguistic algorithms to extract temporal signals from unstructured text and organize that unstructured text into a multidimensional structure for analysis. In doing so, the algorithm identifies the underlying associations between entities and events across documents and sources over time. Using this capability, the team analyzed several events that could serve as analogs to treaty noncompliance, technical breakout, or an intentional attack. These events included the H7N9 bird flu outbreak in China, the Shanghai pig die-off and the fungal meningitis outbreak in the United States last year.
For H7N9 we found that open source social media were the first to report the outbreak and give ongoing updates.  The Sandia RF system was able to roughly estimate lethality based on temporal hospitalization and fatality reporting.  For the Shanghai pig die-off the analysis tracked the rapid assessment by Chinese authorities that H7N9 was not the cause of the pig die-off as had been originally speculated. Open source reporting highlighted a reduced market for pork in China due to the very public dead pig display in Shanghai. Possible downstream health effects were predicted (e.g., contaminated water supply and other overall food ecosystem concerns). In addition, legitimate U.S. food security concerns were raised based on the Chinese purchase of the largest U.S. pork producer (Smithfield) because of a fear of potential import of tainted pork into the United States….
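The Sandia/Recorded Future pipeline itself is proprietary, but the core idea it describes — pulling dated events out of unstructured text and organizing them on a timeline for analysis — can be sketched in miniature. Everything below (the sample reports, the date format, the regex shortcut) is illustrative only; the real system uses linguistic algorithms, not a pattern match:

```python
import re
from collections import defaultdict

# Toy illustration only; the actual system uses linguistic models, not a regex.
reports = [
    "2013-04-02: Shanghai confirms two H7N9 hospitalizations.",
    "2013-04-02: One H7N9 fatality reported in Shanghai.",
    "2013-04-05: Four new H7N9 hospitalizations across Jiangsu.",
]

# Bucket each extracted event under its date to form a simple temporal signal.
timeline = defaultdict(list)
for text in reports:
    match = re.match(r"(\d{4}-\d{2}-\d{2}):\s*(.+)", text)
    if match:
        date, event = match.groups()
        timeline[date].append(event)

for date in sorted(timeline):
    print(date, "->", len(timeline[date]), "event(s)")
```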
To read the full paper, please click here.”

EU-funded tool to help our brain deal with big data


EU Press Release: “Every single minute, the world generates 1.7 million billion bytes of data, equal to 360,000 DVDs. How can our brain deal with increasingly big and complex datasets? EU researchers are developing an interactive system which not only presents data the way you like it, but also changes the presentation constantly in order to prevent brain overload. The project could enable students to study more efficiently or journalists to cross-check sources more quickly. Several museums in Germany, the Netherlands, the UK and the United States have already shown interest in the new technology.
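The press release’s DVD comparison checks out under the usual assumption of roughly 4.7 GB per single-layer disc (that capacity figure is mine, not the EU’s):

```python
bytes_per_minute = 1.7e15   # "1.7 million billion bytes" generated per minute
dvd_capacity = 4.7e9        # single-layer DVD, ~4.7 GB in decimal bytes (my assumption)

dvds = bytes_per_minute / dvd_capacity
print(f"{dvds:,.0f} DVDs per minute")  # ~361,702, close to the quoted 360,000
```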

Data is everywhere: it can either be created by people or generated by machines, such as sensors gathering climate information, satellite imagery, digital pictures and videos, purchase transaction records, GPS signals, etc. This information is a real gold mine. But it is also challenging: today’s datasets are so huge and complex to process that they require new ideas, tools and infrastructures.

Researchers within CEEDs (@ceedsproject) are transposing big data into an interactive environment to allow the human mind to generate new ideas more efficiently. They have built what they are calling an eXperience Induction Machine (XIM) that uses virtual reality to enable a user to ‘step inside’ large datasets. This immersive multi-modal environment – located at Pompeu Fabra University in Barcelona – also contains a panoply of sensors which allows the system to present the information in the right way to the user, constantly tailored according to their reactions as they examine the data. These reactions – such as gestures, eye movements or heart rate – are monitored by the system and used to adapt the way in which the data is presented.

Jonathan Freeman, Professor of Psychology at Goldsmiths, University of London and coordinator of CEEDs, explains: “The system acknowledges when participants are getting fatigued or overloaded with information. And it adapts accordingly. It either simplifies the visualisations so as to reduce the cognitive load, thus keeping the user less stressed and more able to focus. Or it will guide the person to areas of the data representation that are not as heavy in information.”
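CEEDs has not published its adaptation rules, but the feedback loop Freeman describes — monitor physiological signals, then either simplify the view or steer attention elsewhere — might be sketched like this (the signal names and thresholds are invented for illustration, not taken from the project):

```python
# Hypothetical signals and thresholds; CEEDs' actual adaptation logic is unpublished.
def adapt_presentation(heart_rate, gaze_fixation_ms, detail_level):
    """Return a new visualisation detail level and an optional guidance hint."""
    overloaded = heart_rate > 100 or gaze_fixation_ms < 150
    if overloaded and detail_level > 1:
        return detail_level - 1, "simplify"   # reduce cognitive load
    if overloaded:
        return detail_level, "redirect"       # steer user to sparser regions
    return detail_level, None                 # user is coping fine

level, hint = adapt_presentation(heart_rate=110, gaze_fixation_ms=120, detail_level=3)
print(level, hint)  # 2 simplify
```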

Neuroscientists were the first group on which the CEEDs researchers tried their machine (BrainX3). It took the typically huge datasets generated in this scientific discipline and animated them with visual and sound displays. By providing subliminal clues, such as flashing arrows, the machine guided the neuroscientists to areas of the data that were potentially more interesting to each person. The first pilots have already demonstrated the power of this approach in gaining new insights into the organisation of the brain….”

The infrastructure Africa really needs is better data reporting


Data reporting on the continent is sketchy. Just look at the recent GDP revisions of large countries. How is it that Nigeria’s April GDP recalculation catapulted it ahead of South Africa, making it the largest economy in Africa overnight? Or that Kenya’s economy is actually 20% larger (paywall) than previously thought?

Indeed, countries in Africa get noticeably bad scores on the World Bank’s Bulletin Board on Statistical Capacity, an index of data reporting integrity.

Bad data is not simply the result of inconsistencies or miscalculations: African governments have an incentive to produce statistics that overstate their economic development.

A recent working paper from the Center for Global Development (CGD) shows how politics influence the statistics released by many African countries…

But in the long run, dodgy statistics aren’t good for anyone. They “distort the way we understand the opportunities that are available,” says Amanda Glassman, one of the CGD report’s authors. US firms have pledged $14 billion in trade deals at the summit in Washington. No doubt they would like to know whether high school enrollment promises to create a more educated workforce in a given country, or whether its people have been immunized for viruses.

Overly optimistic indicators also distort how a government decides where to focus its efforts. If school enrollment appears to be high, why implement programs intended to increase it?

The CGD report suggests increased funding to national statistical agencies, and making sure that they are wholly independent from their governments. President Obama is talking up a $7 billion investment in African agriculture. But unless cash and attention are given to improving statistical integrity, he may never know whether that investment has borne fruit.”

An Infographic That Maps 2,000 Years of Cultural History in 5 Minutes


In Wired: “…Last week in the journal Science, the researchers (led by University of Texas art historian Maximilian Schich) published a study that looked at the cultural history of Europe and North America by mapping the births and deaths of more than 150,000 notable figures—including everyone from Leonardo da Vinci to Ernest Hemingway. That data was turned into an amazing animated infographic that looks strikingly similar to the illustrated flight paths you find in the back of your inflight magazine. Blue dots indicate a birth, red ones mean death.

The researchers used data from Freebase, which touts itself as a “community curated database of people, places and things.” This gives the data a strong Western bent. You’ll notice that many parts of Asia and the Middle East (not to mention pre-colonized North America) are almost wholly ignored in this video. But to be fair, the abstract did acknowledge that the study was focused mainly on Europe and North America.
Still, mapping the geography of cultural migration does give you some insight into how the kind of culture we value has shifted over the centuries. It’s also a novel lens through which to view our more general history, as those migration trends likely illuminate bigger historical happenings like wars and the building of cross-country infrastructure.”

Using technology, data and crowdsourcing to hack infrastructure problems


Courtney M. Fowler at CAFWD.ORG: “Technology has become a way of life for most Americans, not just for communication but also for many daily activities. However, there’s more that can be done than just booking a trip or crushing candy. With a majority of Americans now owning smartphones, it’s only becoming more obvious that there’s room for governments to engage the public and provide more bang for their buck via technology.
CA Fwd has been putting on an “Open Data roadshow” around the state to highlight ways the marriage of tech and info can make government more efficient and transparent.
Jurisdictions have also been discovering that using technology and smartphone apps can be beneficial in the pursuit of improving infrastructure. Saving any amount of money on such projects is especially important for California, where it’s been estimated the state will only have half of the $765 billion needed for infrastructure investments over the next decade.
One of the best examples of applying technology to infrastructure problems comes from South Carolina, where an innovative bridge-monitoring system is producing real savings, despite being in use on only eight bridges.
Girder sensors are placed on each bridge so that they can measure its carrying capacity and can be monitored 24/7. Although the monitors don’t eliminate the need for inspections, the technology does make them significantly less frequent. Data from the monitors also led the South Carolina Department of Transportation to correct one bridge’s problems with a $100,000 retrofit, rather than spending $800,000 to replace it…”
In total, having the monitors on just eight bridges, at a cost of about $50,000 per bridge, saved taxpayers $5 million.
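The article’s savings figures are easy to put side by side (the dollar amounts are from the piece; the return-on-investment framing is mine):

```python
# Figures quoted in the article; the ROI framing is my own.
bridges = 8
cost_per_bridge = 50_000
retrofit, replacement = 100_000, 800_000
reported_savings = 5_000_000            # taxpayer savings claimed

total_cost = bridges * cost_per_bridge  # $400,000 to instrument all eight bridges
print(reported_savings / total_cost)    # 12.5x return on the monitoring spend
print(replacement - retrofit)           # $700,000 saved on the one retrofitted bridge
```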
That kind of innovation and savings is exactly what California needs to ensure that infrastructure projects happen in a more timely and efficient fashion in the future. It’s also what is driving civic innovators to bring technology and crowdsourcing together and make sure infrastructure projects are also results-oriented.

Opportunities for strengthening open meetings with open data


At the Sunlight Foundation Blog: “Governments aren’t alone in thinking about how open data can help improve the open meetings process. There are an increasing number of tools governments can use to help bolster open meetings with open data. From making public records generated by meetings more easily accessible and reusable online to inviting the public to participate in the decision-making process from wherever they may be, these tools allow governments to upgrade open meetings for the opportunities and demands of the 21st century.
Improving open meetings with open data may involve taking advantage of simple solutions already freely available online, developing new tools within government, using open-source tools, or investing in new software, but it can all help serve the same goal: bringing more information online where it’s easily accessible to the public….
It’s not just about making open meetings more accessible, either. More communities are thinking about how they can bring government to the people. Open meetings are typically held in government-designated buildings at specified times, but are those locations and times truly accessible for most of the public or for those who may be most directly impacted by what’s being discussed?
Technology presents opportunities for governments to engage with the public outside of regularly scheduled meetings. Tools like Speakup and Textizen, for example, are being used to increase public participation in the general decision-making process. A continually increasing array of tools provides new ways for government and the public to identify issues, share ideas, and work toward solutions, even outside of open meetings. Boston, for example, took an innovative approach to this issue with its City Hall To Go truck and other efforts, bringing government services to locations around the city rather than requiring people to come to a government building…”

App enables citizens to report water waste in drought regions


Springwise: “Rallying citizens to take part in looking after the community they live in has become easier thanks to smartphones. In the past, the Creek Watch app enabled anyone to help monitor their local water quality by sending data back to the state water board. Now Everydrop LA wants to use similar techniques to avoid drought in California, encouraging residents to report incidents of water wastage.
According to the team behind the app — which also created the CitySourced platform for engaging users in civic issues — even the smallest amount of water wastage can lead to meaningful losses over time. A faucet that drips just once a minute will lose over 2,000 gallons of drinkable water each year. Using Everydrop LA, citizens can report the location of leaking faucets and fire hydrants as well as occurrences of blatant water wastage. They can also see how much water is being wasted in their local area and learn about what they can do to cut their own water usage. In times when drought is a risk, the app notifies users to conserve. Cities and counties can use the data in their reports and learn more about how water wastage is affecting their jurisdiction.”

Using the Wisdom of the Crowd to Democratize Markets


David Weidner at the Wall Street Journal: “For years investors have largely depended on three sources to distill the relentless onslaught of information about public companies: the companies themselves, Wall Street analysts and the media.
Each of these has its strengths, but they may have even bigger weaknesses. Companies spin. Analysts have conflicts of interest. The financial media is under deadline pressure and ill-equipped to act as a catch-all watchdog.
But in recent years, the tech whizzes out of Silicon Valley have been trying to democratize the markets. In 2010 I wrote about an effort called Moxy Vote, an online system for shareholders to cast ballots in proxy contests. Moxy Vote had some initial success but ran into regulatory trouble and failed to gain traction.
Some newer efforts are more promising, mostly because they depend on users, or some form of crowdsourcing, for their content. Crowdsourcing is when a need is turned over to a large group, usually an online community, rather than traditional paid employees or outside providers….
Estimize.com is one. It was founded in 2011 by former trader Leigh Drogan, but recently has undergone some significant expansion, adding a crowd-sourced prediction for mergers and acquisitions. Estimize also boasts a track record. It claims it beats Wall Street analysts 65.9% of the time during earnings season. Like SeekingAlpha, Estimize does, however, lean heavily on pros or semi-pros. Nearly 5,000 of its contributors are analysts.
Closer to the social networking world there’s scutify.com, a website and mobile app that aggregates what’s being said about individual stocks on social networks, blogs and other sources. It highlights trending stocks and links to chatter on social networks. (The site is owned by Cody Willard, a contributor to MarketWatch, which is owned by Dow Jones, the publisher of The Wall Street Journal.)
Perhaps the most intriguing startup is TwoMargins.com. The site allows investors, analysts, average Joes — anyone, really — to annotate company releases. In that way, Two Margins potentially can tap the power of the crowd to provide a fourth source for the marketplace.
Two Margins, a startup funded by Bloomberg L.P.’s venture capital fund, borrows annotation technology that’s already in use on other sites such as genius.com and scrible.com. Participants can sign in with their Twitter or Facebook accounts and post to those networks from the site. (Dow Jones competes with Bloomberg in the provision of news and financial data.)
At this moment, Two Margins isn’t a game changer. Founders Gniewko Lubecki and Akash Kapur said the site is in a pre-beta phase, which is to say it’s sort of up and running and being constantly tweaked.
Right now there’s nothing close to the critical mass needed for an exhaustive look at company filings. There’s just a handful of users and less than a dozen company releases and filings available.
Still, in the first moments after Twitter Inc.’s earnings were released Tuesday, Two Margins’ most loyal users began to scour the release. “Looks like Twitter is getting significantly better at monetizing users,” wrote a user named “George” who had annotated the revenue line from the company’s financial statement. Another user, “Scott Paster,” noted Twitter’s stock option grants to executives were nearly as high as its reported loss.
“The sum is greater than its parts when you pull together a community of users,” Mr. Kapur said. “Widening access to these documents is one goal. The other goal is broadening the pool of knowledge that’s brought to bear on these documents.”
In the end, this new wave of tech-driven services may never capture enough users to make it into the investing mainstream. They all struggle with uninformed and inaccurate content, especially if they gain critical mass. Vetting is a problem.
For that reason, it’s hard to predict whether these new entries will flourish or even survive. That’s not a bad thing. The march of technology will either improve on the idea or come up with a new one.
Ultimately, technology is making possible what hasn’t been. That is, free discussion, access and analysis of information. Some may see it as a threat to Wall Street, which has always charged for expert analysis. Really, though, these efforts are good for markets, which pride themselves on being fair and transparent.
It’s not just companies that should compete, but ideas too.”

The Responsive City: Engaging Communities Through Data-Smart Governance


New book by Stephen Goldsmith and Susan P. Crawford: “The Responsive City: Engaging Communities Through Data-Smart Governance. The Responsive City is a guide to civic engagement and governance in the digital age that will help leaders link important breakthroughs in technology and big data analytics with age-old lessons of small-group community input to create more agile, competitive, and economically resilient cities. Featuring vivid case studies highlighting the work of individuals in New York, Boston, Rio de Janeiro, Stockholm, Indiana, and Chicago, the book provides a compelling model for the future of cities and states. The authors demonstrate how digital innovations will drive a virtuous cycle of responsiveness centered on “empowerment”: 1) empowering public employees with tools to both power their performance and help them connect more personally to those they serve, 2) empowering constituents to see and understand the problems and opportunities faced by cities so that they can better engage in the life of their communities, and 3) empowering leaders to drive toward their missions and address the grand challenges confronting cities by harnessing the predictive power of cross-government Big Data. The book will help mayors, chief technology officers, city administrators, agency directors, civic groups, and nonprofit leaders break out of current paradigms in order to collectively address civic problems. Co-authored by Stephen Goldsmith, former Mayor of Indianapolis and current Director of the Innovations in Government Program at the Harvard Kennedy School, and Susan Crawford, co-director of Harvard’s Berkman Center for Internet and Society.

The Responsive City highlights the ways in which leadership, empowered government employees, thoughtful citizens, and 21st century technology can combine to improve government operations and strengthen civic trust. It provides actionable advice while exploring topics like:

  • Visualizing service delivery and predicting improvement
  • Making the work of government employees more meaningful
  • Amplification and coordination of focused citizen engagement
  • Big Data in big cities – stories of surprising successes and enormous potential”