The Strength of the Strongest Ties in Collaborative Problem Solving


Yves-Alexandre de Montjoye, Arkadiusz Stopczynski, Erez Shmueli, Alex Pentland & Sune Lehmann in Scientific Reports: “Complex problem solving in science, engineering, and business has become a highly collaborative endeavor. Teams of scientists or engineers collaborate on projects using their social networks to gather new ideas and feedback. Here we bridge the literature on team performance and information networks by studying teams’ problem solving abilities as a function of both their within-team networks and their members’ extended networks. We show that, while an assigned team’s performance is strongly correlated with its networks of expressive and instrumental ties, only the strongest ties in both networks have an effect on performance. Both networks of strong ties explain more of the variance than other factors, such as measured or self-evaluated technical competencies, or the personalities of the team members. In fact, the inclusion of the network of strong ties renders these factors non-significant in the statistical analysis. Our results have consequences for the organization of teams of scientists, engineers, and other knowledge workers tackling today’s most complex problems.”

We Need a Citizen Maker Movement


Lorelei Kelly at the Huffington Post: “It was hard to miss the giant mechanical giraffe grazing on the White House lawn last week. For the first time ever, the President organized a Maker Faire – inviting entrepreneurs and inventors from across the USA to celebrate American ingenuity in the service of economic progress.
The maker movement is a California original. Think R2D2 serving margaritas to a jester with an LED news scroll. The #nationofmakers Twitter feed has dozens of examples of collaborative production, of making, sharing and learning.
But since this was the White House, I still had to ask myself, what would the maker movement be if the economy was not the starting point? What if it was about civics? What if makers decided to create a modern, hands-on democracy?
What is democracy anyway but a never-ending remix of new prototypes? Last week’s White House Maker Faire heralded a new economic bonanza. This revolution’s poster child is 3-D printing – decentralized fabrication that is customized to meet local needs. On the government front, new design rules for democracy are already happening in communities, where civics and technology have generated a front line of maker cities.
But the distance between California’s tech capacity and DC does seem 3,000 miles wide. The NSA’s over-collection/surveillance problem and Healthcare.gov’s doomed rollout are part of the same system-wide capacity deficit. How do we close the gap between California’s revolution and our institutions?

  • In California, disruption is a business plan. In DC, it’s a national security threat.
  • In California, hackers are artists. In DC, they are often viewed as criminals.
  • In California, “cyber” is a dystopian science fiction word. In DC, cyber security is in a dozen oversight plans for Congress.
  • In California, individuals are encouraged to “fail forward.” In DC, risk-aversion is bipartisan.

Scaling big problems with local solutions is a maker specialty. Government policymaking needs this kind of help.
Here’s the issue our nation is facing: The inability of the non-military side of our public institutions to process complex problems. Today, this competence and especially the capacity to solve technical challenges often exist only in the private sector. If something is urgent and can’t be monetized, it becomes a national security problem. Which increasingly means that critical decision making that should be in the civilian remit instead migrates to the military. Look at our foreign policy. Good government is a counter terrorism strategy in Afghanistan. Decades of civilian inaction on climate change means that now Miami is referred to as a battle space in policy conversations.
This rhetoric reflects an understandable but unacceptable disconnect for any democracy.
To make matters more confusing, much of the technology in civics (like list-building petitions) is suited for elections, not for governing. It is often antagonistic. The result? Policy making looks like campaigning. We need some civic tinkering to generate governing technology that comes with relationships. Specifically, this means technology that includes many voices, but has identifiable channels for expertise that can sort complexity and that is not compromised by financial self-interest.
Today, sorting and filtering information is a huge challenge for participation systems around the world. Information now ranks up there with money and people as a lever of power. On the people front, the loud and often destructive individuals are showing up effectively. On the money front, our public institutions are at risk of becoming purely pay to play (wonks call this “transactional”).
Makers, ask yourselves, how can we turn big data into a political constituency for using real evidence–one that can compete with all the negative noise and money in the system? For starters, technologists out West must stop treating government like it’s a bad signal that can be automated out of existence. We are at a moment where our society requires an engineering mindset to develop modern, tech-savvy rules for democracy. We need civic makers….”

How Crowdsourced Astrophotographs on the Web Are Revolutionizing Astronomy


Emerging Technology From the arXiv: “Astrophotography is currently undergoing a revolution thanks to the increased availability of high quality digital cameras and the software available to process the pictures after they have been taken.
Since photographs of the night sky are almost always better with long exposures that capture more light, this processing usually involves combining several images of the same part of the sky to produce one with a much longer effective exposure.
That’s all straightforward if you’ve taken the pictures yourself with the same gear under the same circumstances. But astronomers want to do better.
“The astrophotography group on Flickr alone has over 68,000 images,” say Dustin Lang at Carnegie Mellon University in Pittsburgh and a couple of pals. These and other images represent a vast source of untapped data for astronomers.
The problem is that it’s hard to combine images accurately when little is known about how they were taken. Astronomers take great care to use imaging equipment in which the pixels produce a signal that is proportional to the number of photons that hit.
But the same cannot be said of the digital cameras widely used by amateurs. All kinds of processes can end up influencing the final image.
So any algorithm that combines them has to cope with these variations. “We want to do this without having to infer the (possibly highly nonlinear) processing that has been applied to each individual image, each of which has been wrecked in its own loving way by its creator,” say Lang and co.
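To see why robustness to each image's unknown processing matters, here is a minimal, hypothetical stacking sketch. This is not the authors' actual Enhance algorithm (that is described in the paper below); it simply shows the basic idea of combining pre-aligned images so that no single "lovingly wrecked" image dominates the result, using a per-pixel median:

```python
import numpy as np

def median_stack(images):
    """Combine pre-aligned images of the same sky field by taking
    the per-pixel median. The median is robust to the unknown,
    possibly nonlinear processing each amateur image has undergone:
    an outlier value contributed by one image is simply ignored.

    `images` is a list of 2-D arrays of identical shape.
    Illustrative sketch only, not the Enhance algorithm itself.
    """
    stack = np.stack([np.asarray(img, dtype=float) for img in images])
    return np.median(stack, axis=0)
```

For example, stacking two well-exposed frames with one badly over-processed frame yields pixel values close to the well-exposed ones, since the median discards the outlier at every pixel.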
Now, these guys say they’ve cracked it. They’ve developed a system that automatically combines images from the same part of the sky to increase the effective exposure time of the resulting picture. And they say the combined images can rival those from much larger professional telescopes.
They’ve tested this approach by downloading images of two well-known astrophysical objects: the NGC 5907 Galaxy and the colliding pair of galaxies—Messier 51a and 51b.
For NGC 5907, they ended up with 4,000 images from Flickr, 1,000 from Bing and 100 from Google. They used an online system called astrometry.net that automatically aligns and registers images of the night sky and then combined the images using their new algorithm, which they call Enhance.
The results are impressive. They say that the combined images of NGC 5907 (bottom three images) show some of the same faint features revealed by a single 11-hour exposure using a 50 cm telescope (the top left image). All the images reveal the same kind of fine detail, such as a faint stellar stream around the galaxy.
The combined image for the M51 galaxies is just as impressive, taking only 40 minutes to produce on a single processor. It reveals extended structures around both galaxies, which astronomers know to be debris from their gravitational interaction as they collide.
Lang and co say these faint features are hugely important because they allow astronomers to measure the age, mass ratios, and orbital configurations of the galaxies involved. Interestingly, many of these faint features are not visible in any of the input images taken from the Web. They emerge only once images have been combined.
One potential problem with algorithms like this is that they need to perform well as the number of images they combine increases. It’s no good if they grind to a halt as soon as a substantial amount of data becomes available.
On this score, Lang and co say astronomers can rest easy. The performance of their new Enhance algorithm scales linearly with the number of images it has to combine. That means it should perform well on large datasets.
The bottom line is that this kind of crowd-sourced astronomy has the potential to make a big impact, given that the resulting images rival those from large telescopes.
And it could also be used for historical images, say Lang and co. The Harvard Plate Archives, for example, contain half a million images dating back to the 1880s. These were all taken using different emulsions, with different exposures and developed using different processes. So the plates all have different responses to light, making them hard to compare.
That’s exactly the problem that Lang and co have solved for digital images on the Web. So it’s not hard to imagine how they could easily combine the data from the Harvard archives as well….”
Ref: arxiv.org/abs/1406.1528 : Towards building a Crowd-Sourced Sky Map

Towards a comparative science of cities: using mobile traffic records in New York, London and Hong Kong


Book chapter by S. Grauwin, S. Sobolevsky, S. Moritz, I. Gódor, C. Ratti, to be published in “Computational Approaches for Urban Environments” (Springer Ed.), October 2014: “This chapter examines the possibility of analyzing and comparing human activities in an urban environment based on the detection of mobile phone usage patterns. Thanks to an unprecedented collection of counter data recording the number of calls, SMS, and data transfers resolved both in time and space, we confirm the connection between temporal activity profile and land usage in three global cities: New York, London and Hong Kong. By comparing whole cities’ typical patterns, we provide insights on how cultural, technological and economic factors shape human dynamics. At a more local scale, we use clustering analysis to identify locations with similar patterns within a city. Our research reveals a universal structure of cities, with core financial centers all sharing similar activity patterns and commercial or residential areas with more city-specific patterns. These findings hint that as the economy becomes more global, common patterns emerge in business areas of different cities across the globe, while the impact of local conditions still remains recognizable at the level of routine human activity.”
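The chapter's clustering of locations by temporal activity pattern can be sketched roughly as follows. This is an illustrative k-means toy, not the authors' actual method; the profile width, normalization, and initialization are all assumptions. Each location is represented by its hourly activity counts, and rows are normalized so clusters reflect the shape of activity over the day rather than its volume:

```python
import numpy as np

def cluster_profiles(profiles, k=3, iters=50):
    """Cluster locations by the shape of their temporal activity.

    `profiles` is an (n_locations, n_hours) array of counts (calls,
    SMS, data transfers). Rows are normalized to sum to 1, then
    grouped with a simple k-means. Illustrative sketch only.
    """
    X = profiles / profiles.sum(axis=1, keepdims=True)
    # deterministic farthest-point initialization
    centers = [X[0]]
    for _ in range(k - 1):
        dists = np.min([np.square(X - c).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(np.square(X[:, None] - centers).sum(axis=-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels
```

With profiles like these, locations that peak during business hours end up in one cluster (the "financial center" pattern the chapter describes) while those that peak in the evening end up in another, regardless of how busy each location is overall.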

Every citizen a scientist? An EU project tries to change the face of research


Project News from the European Commission:  “SOCIENTIZE builds on the concept of ‘Citizen Science’, which sees thousands of volunteers, teachers, researchers and developers put together their skills, time and resources to advance scientific research. Thanks to open source tools developed under the project, participants can help scientists collect data – which will then be analysed by professional researchers – or even perform tasks that require human cognition or intelligence like image classification or analysis.

Every citizen can be a scientist
The project helps usher in new advances in everything from astronomy to social science.
‘One breakthrough is our increased capacity to reproduce, analyse and understand complex issues thanks to the engagement of large groups of volunteers,’ says Mr Fermin Serrano Sanz, researcher at the University of Zaragoza and Project Coordinator of SOCIENTIZE. ‘And everyone can be a neuron in our digitally-enabled brain.’
But how can ordinary citizens help with such extraordinary science? The key, says Mr Serrano Sanz, is in harnessing the efforts of thousands of volunteers to collect and classify data. ‘We are already gathering huge amounts of user-generated data from the participants using their mobile phones and surrounding knowledge,’ he says.
For example, the experiment ‘SavingEnergy@Home’ asks users to submit data about the temperatures in their homes and neighbourhoods in order to build up a clearer picture of temperatures in cities across the EU, while in Spain, GripeNet.es asks citizens to report when they catch the flu in order to monitor outbreaks and predict possible epidemics.
Many Hands Make Light Work
But citizens can also help analyse data. Even the most advanced computers are not very good at recognising things like sun spots or cells, whereas people can tell the difference between living and dying cells very easily after only brief training.
The SOCIENTIZE projects ‘Sun4All’ and ‘Cell Spotting’ ask volunteers to label images of solar activity and cancer cells from an application on their phone or computer. With Cell Spotting, for instance, participants can observe cell cultures being studied with a microscope in order to determine their state and the effectiveness of medicines. Analysing this data would take years and cost hundreds of thousands of euros if left to a small team of scientists – but with thousands of volunteers helping the effort, researchers can make important breakthroughs quickly and more cheaply than ever before.
But in addition to bringing citizens closer to science, SOCIENTIZE also brings science closer to citizens. On 12-14 June, the project participated in the SONAR festival with ‘A Collective Music Experiment’ (CME). ‘Two hundred people joined professional DJs and created musical patterns using a web tool; participants shared their creations and re-used other parts in real time. The activity at the festival also included a live show of RdeRumba and Mercadal playing amateur rhythms,’ Mr Serrano Sanz explains.
The experiment – which will be presented in a mini-documentary to raise awareness about citizen science – is expected to help understand other innovation processes observed in emergent social, technological, economic or political transformations. ‘This kind of event brings together a really diverse set of participants. The diversity does not only enrich the data; it improves the dialogue between professionals and volunteers. As a result, we see some new and innovative approaches to research.’
The EUR 0.7 million project brings together 6 partners from 4 countries: Spain (University of Zaragoza and TECNARA), Portugal (Museu da Ciência-Coimbra, MUSC; Universidade de Coimbra), Austria (Zentrum für Soziale Innovation) and Brazil (Universidade Federal de Campina Grande, UFCG).
SOCIENTIZE will end in October 2014 after bringing together 12,000 citizens in different phases of research activities for 24 months.”

A Big Day for Big Data: The Beginning of Our Data Transformation


Mark Doms, Under Secretary for Economic Affairs at the US Department of Commerce: “Wednesday, June 18, 2014, was a big day for big data.  The Commerce Department participated in the inaugural Open Data Roundtable at the White House, with GovLab at NYU and the White House Office of Science and Technology Policy. The event brought businesses and non-profit organizations that rely on Commerce data together with Commerce Department officials to discuss how to make the data we collect and release easier to find, understand and use.  This initiative has significant potential to fuel new businesses; create jobs; and help federal, state and local governments make better decisions.

Under Secretary Mark Doms presented and participated in the first Open Data Roundtable at the White House, organized by Commerce, GovLab at NYU and the White House Office of Science and Technology Policy 
Data innovation is revolutionizing every aspect of our society and government data is playing a major role in the revolution. From the National Oceanic and Atmospheric Administration’s (NOAA’s) climate data to the U.S. Census Bureau’s American Community Survey, the U.S. Patent and Trademark Office (USPTO) patent and trademark records, and National Institute of Standards and Technology (NIST) research, companies, organizations and people are using this information to innovate, grow our economy and better plan for the future.
At this week’s Open Data 500, the key insights I came away with include:

  • There is a strong desire for data consistency across the Commerce Department, and indeed the federal government. 
  • Data should be catalogued in a common, machine-readable format. 
  • Data should be accessible in bulk, allowing the private sector greater flexibility to harness the information. 
  • The use of a single platform for access to government data would create efficiencies and help coordination across agencies.

Furthermore, business leaders stand ready to help us achieve these goals.
Secretary Pritzker is the first Secretary of Commerce to make data a departmental priority in the Commerce Department’s Strategic Plan, and has branded Commerce as “America’s Data Agency.” In keeping with that mantra, over the next several months, my team at the Economics and Statistics Administration (ESA), which includes the Bureau of Economic Analysis and the U.S. Census Bureau, will be involved in similar forums.  We will be engaging our users – businesses, academia, advocacy organizations, and state and local governments – to drive this open data conversation forward. 
Today was a big first step in that process. The insight gained will help inform our efforts ahead. Thanks again to the team at GovLab and the White House for their hard work in making it possible!”

Transparency, legitimacy and trust


John Kamensky at Federal Times: “The Open Government movement has captured the imagination of many around the world as a way of increasing transparency, participation, and accountability. In the US, many of the federal, state, and local Open Government initiatives have been demonstrated to achieve positive results for citizens here and abroad. In fact, the White House’s science advisors released a refreshed Open Government plan in early June.
However, a recent study in Sweden says the benefits of transparency may vary, and may have little impact on citizens’ perception of legitimacy and trust in government. This research suggests important lessons on how public managers should approach the design of transparency strategies, and how they work in various conditions.
Jenny de Fine Licht, a scholar at the University of Gothenburg in Sweden, offers a more nuanced view of the influence of transparency in political decision making on public legitimacy and trust, in a paper that appears in the current issue of “Public Administration Review.” Her research challenges the assumption of many in the Open Government movement that greater transparency necessarily leads to greater citizen trust in government.
Her conclusion, based on an experiment involving over 1,000 participants, was that the type and degree of transparency “has different effects in different policy areas.” She found that “transparency is less effective in policy decisions that involve trade-offs related to questions of human life and death or well-being.”

The background

Licht says there are some policy decisions that involve what are called “taboo tradeoffs.” A taboo tradeoff, for example, would be making budget tradeoffs in policy areas such as health care and environmental quality, where human life or well-being is at stake. In cases where more money is an implicit solution, the author notes, “increased transparency in these policy areas might provoke feelings of taboo, and, accordingly, decreased perceived legitimacy.”
Other scholars, such as Harvard’s Jane Mansbridge, contend that “full transparency may not always be the best practice in policy making.” Full transparency in decision-making processes would include, for example, open appropriation committee meetings. Instead, she recommends “transparency in rationale – in procedures, information, reasons, and the facts on which the reasons are based.” That is, provide a full explanation after the fact.
Licht tested the hypothesis that full transparency of the decision-making process vs. partial transparency via providing after-the-fact rationales for decisions may create different results, depending on the policy arena involved…
Open Government advocates have generally assumed that full and open transparency is always better. Licht’s conclusion is that “greater transparency” does not necessarily increase citizen legitimacy and trust. Instead, a strategy of encouraging a high degree of transparency requires more nuanced application. While she cautions against generalizing from her experiment, the potential implications for government decision-makers could be significant.
To date, many of the various Open Government initiatives across the country have assumed a “one size fits all” approach, across the board. Licht’s conclusions, however, help explain why the results of various initiatives have been divergent in terms of citizen acceptance of open decision processes.
Her experiment seems to suggest that citizen engagement is more likely to create a greater citizen sense of legitimacy and trust in areas involving “routine” decisions, such as parks, recreation, and library services. But that “taboo” decisions in policy areas involving tradeoffs of human life, safety, and well-being may not necessarily result in greater trust as a result of the use of full and open transparency of decision-making processes.
While she says that transparency – whether full or partial – is always better than no transparency, her experiment at least shows that policy makers will, at a minimum, know that the end result may not be greater legitimacy and trust. In any case, her research should engender a more nuanced conversation among Open Government advocates at all levels of government. In order to increase citizens’ perceptions of legitimacy and trust in government, it will take more than just advocating for Open Data!”

Big Data, My Data


Jane Sarasohn-Kahn  at iHealthBeat: “The routine operation of modern health care systems produces an abundance of electronically stored data on an ongoing basis,” Sebastian Schneeweis writes in a recent New England Journal of Medicine Perspective.
Is this abundance of data a treasure trove for improving patient care and growing knowledge about effective treatments? Is that data trove a Pandora’s black box that can be mined by obscure third parties to benefit for-profit companies without rewarding those whose data are said to be the new currency of the economy? That is, patients themselves?
In this emerging world of data analytics in health care, there’s Big Data and there’s My Data (“small data”). Who most benefits from the use of My Data may not actually be the consumer.
Big focus on Big Data. Several reports published in the first half of 2014 talk about the promise and perils of Big Data in health care. The Federal Trade Commission’s study, titled “Data Brokers: A Call for Transparency and Accountability,” analyzed the business practices of nine “data brokers,” companies that buy and sell consumers’ personal information from a broad array of sources. Data brokers sell consumers’ information to buyers looking to use those data for marketing, managing financial risk or identifying people. There are health implications in all of these activities, and the use of such data generally is not covered by HIPAA. The report discusses the example of a data segment called “Smoker in Household,” which a company selling a new air filter for the home could use to target-market to an individual who might seek such a product. On the downside, without the consumers’ knowledge, the information could be used by a financial services company to identify the consumer as a bad health insurance risk.
“Big Data and Privacy: A Technological Perspective,” a report from the President’s Office of Science and Technology Policy, considers the growth of Big Data’s role in helping inform new ways to treat diseases and presents two scenarios of the “near future” of health care. The first, on personalized medicine, recognizes that not all patients are alike or respond identically to treatments. Data collected from a large number of similar patients (such as digital images, genomic information and granular responses to clinical trials) can be mined to develop a treatment with an optimal outcome for the patients. In this case, patients may have provided their data based on the promise of anonymity but would like to be informed if a useful treatment has been found. In the second scenario, detecting symptoms via mobile devices, people wishing to detect early signs of Alzheimer’s Disease in themselves use a mobile device connecting to a personal coach in the Internet cloud that supports and records activities of daily living: say, gait when walking, notes on conversations and physical navigation instructions. For both of these scenarios, the authors ask, “Can the information about individuals’ health be sold, without additional consent, to third parties? What if this is a stated condition of use of the app? Should information go to the individual’s personal physicians with their initial consent but not a subsequent confirmation?”
The World Privacy Foundation’s report, titled “The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your Future,” describes the growing market for developing indices on consumer behavior, identifying over a dozen health-related scores. Health scores include the Affordable Care Act Individual Health Risk Score, the FICO Medication Adherence Score, various frailty scores, personal health scores (from WebMD and OneHealth, whose default sharing setting is based on the user’s sharing setting with the RunKeeper mobile health app), Medicaid Resource Utilization Group Scores, the SF-36 survey on physical and mental health and complexity scores (such as the Aristotle score for congenital heart surgery). WPF presents a history of consumer scoring beginning with the FICO score for personal creditworthiness and recommends regulatory scrutiny on the new consumer scores for fairness, transparency and accessibility to consumers.
At the same time these three reports went to press, scores of news stories emerged discussing the Big Opportunities Big Data present. The June issue of CFO Magazine published a piece called “Big Data: Where the Money Is.” InformationWeek published “Health Care Dives Into Big Data,” Motley Fool wrote about “Big Data’s Big Future in Health Care” and WIRED called “Cloud Computing, Big Data and Health Care” the “trifecta.”
Well-timed on June 5, the Office of the National Coordinator for Health IT’s Roadmap for Interoperability was detailed in a white paper, titled “Connecting Health and Care for the Nation: A 10-Year Vision to Achieve an Interoperable Health IT Infrastructure.” The document envisions the long view for the U.S. health IT ecosystem enabling people to share and access health information, ensuring quality and safety in care delivery, managing population health, and leveraging Big Data and analytics. Notably, “Building Block #3” in this vision is ensuring privacy and security protections for health information. ONC will “support developers creating health tools for consumers to encourage responsible privacy and security practices and greater transparency about how they use personal health information.” Looking forward, ONC notes the need for “scaling trust across communities.”
Consumer trust: going, going, gone? In the stakeholder community of U.S. consumers, there is declining trust between people and the companies and government agencies with whom people deal. Only 47% of U.S. adults trust companies with whom they regularly do business to keep their personal information secure, according to a June 6 Gallup poll. Furthermore, 37% of people say this trust has decreased in the past year. Who’s most trusted to keep information secure? Banks and credit card companies come in first place, trusted by 39% of people, and health insurance companies come in second, trusted by 26% of people.
Trust is a basic requirement for health engagement. Health researchers need patients to share personal data to drive insights, knowledge and treatments back to the people who need them. PatientsLikeMe, the online social network, launched the Data for Good project to inspire people to share personal health information imploring people to “Donate your data for You. For Others. For Good.” For 10 years, patients have been sharing personal health information on the PatientsLikeMe site, which has developed trusted relationships with more than 250,000 community members…”

The Art and Science of Data-driven Journalism


Alex Howard for the Tow Center for Digital Journalism: “Journalists have been using data in their stories for as long as the profession has existed. A revolution in computing in the 20th century created opportunities for data integration into investigations, as journalists began to bring technology into their work. In the 21st century, a revolution in connectivity is leading the media toward new horizons. The Internet, cloud computing, agile development, mobile devices, and open source software have transformed the practice of journalism, leading to the emergence of a new term: data journalism. Although journalists have been using data in their stories for as long as they have been engaged in reporting, data journalism is more than traditional journalism with more data. Decades after early pioneers successfully applied computer-assisted reporting and social science to investigative journalism, journalists are creating news apps and interactive features that help people understand data, explore it, and act upon the insights derived from it. New business models are emerging in which data is a raw material for profit, impact, and insight, co-created with an audience that was formerly reduced to passive consumption. Journalists around the world are grappling with the excitement and the challenge of telling compelling stories by harnessing the vast quantity of data that our increasingly networked lives, devices, businesses, and governments produce every day. While the potential of data journalism is immense, the pitfalls and challenges to its adoption throughout the media are similarly significant, from digital literacy to competition for scarce resources in newsrooms. Global threats to press freedom, digital security, and limited access to data create difficult working conditions for journalists in many countries.
A combination of peer-to-peer learning, mentorship, online training, open data initiatives, and new programs at journalism schools rising to the challenge, however, offer reasons to be optimistic about more journalists learning to treat data as a source. (Download the report)”

Reflections on How Designers Design With Data


Alex Bigelow, Steven Drucker, Danyel Fisher, and Miriah Meyer at Microsoft Research: “In recent years many popular data visualizations have emerged that are created largely by designers whose main area of expertise is not computer science. Designers generate these visualizations using a handful of design tools and environments. To better inform the development of tools intended for designers working with data, we set out to understand designers’ challenges and perspectives. We interviewed professional designers, conducted observations of designers working with data in the lab, and observed designers working with data in team settings in the wild. A set of patterns emerged from these observations from which we extract a number of themes that provide a new perspective on design considerations for visualization tool creators, as well as on known engineering problems.”