For Crowdsourcing to Work, Everyone Needs an Equal Voice


Joshua Becker and Edward “Ned” Smith in Harvard Business Review: “How useful is the wisdom of crowds? For years, it has been recognized as producing incredibly accurate predictions by aggregating the opinions of many people, allowing even amateur forecasters to beat the experts. The belief is that when large numbers of people make forecasts independently, their errors are uncorrelated and ultimately cancel each other out, which leads to more accurate final answers.

However, researchers and pundits have argued that the wisdom of crowds is extremely fragile, especially in two specific circumstances: when people are influenced by the opinions of others (because they lose their independence) and when opinions are distorted by cognitive biases (for example, strong political views held by a group).

In new research, we and our colleagues zeroed in on these assumptions and found that the wisdom of crowds is more robust than previously thought — it can even withstand the groupthink of like-minded people. But there’s one important caveat: In order for the wisdom of crowds to retain its accuracy for making predictions, every member of the group must be given an equal voice, without any one person dominating. As we discovered, the pattern of social influence within groups — that is, who talks to whom and when — is the key determinant of the crowd’s accuracy in making predictions….(More)”.
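The error-cancellation logic described above is easy to see in a simulation. The sketch below uses entirely synthetic numbers (the “true value” and the noise level are arbitrary assumptions): each forecaster guesses the truth plus independent, zero-mean noise, and the averaged crowd lands far closer to the truth than a typical individual does.

```python
import random
import statistics

random.seed(7)
true_value = 100.0

# Hypothetical crowd: each forecaster's guess is the truth plus
# independent, zero-mean noise (sigma = 20).
guesses = [true_value + random.gauss(0, 20) for _ in range(1000)]

# Average error of a lone forecaster vs. error of the averaged crowd.
individual_error = statistics.mean(abs(g - true_value) for g in guesses)
crowd_error = abs(statistics.mean(guesses) - true_value)

print(f"typical individual error: {individual_error:.1f}")
print(f"crowd (average) error:    {crowd_error:.1f}")
```

With independent errors, the crowd’s error shrinks roughly as sigma divided by the square root of the crowd size — which is exactly why dependence between forecasters, the fragility the research examines, matters so much.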

Bringing machine learning to the masses


Matthew Hutson at Science: “Artificial intelligence (AI) used to be the specialized domain of data scientists and computer programmers. But companies such as Wolfram Research, which makes Mathematica, are trying to democratize the field, so scientists without AI skills can harness the technology for recognizing patterns in big data. In some cases, they don’t need to code at all. Insights are just a drag-and-drop away. One of the latest systems is software called Ludwig, first made open-source by Uber in February and updated last week. Uber used Ludwig for projects such as predicting food delivery times before releasing it publicly. At least a dozen startups are using it, plus big companies such as Apple, IBM, and Nvidia. And scientists: Tobias Boothe, a biologist at the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden, Germany, uses it to visually distinguish thousands of species of flatworms, a difficult task even for experts. To train Ludwig, he just uploads images and labels….(More)”.
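Ludwig’s “no coding” appeal comes from its declarative design: the user describes input and output columns and the framework assembles the model pipeline. A rough sketch of the shape of such a declarative spec, written here as a plain Python dict (the column names are hypothetical, and the exact schema is defined in Ludwig’s own documentation):

```python
# Hypothetical declarative spec for a flatworm-classification task:
# the user names the data columns; the framework chooses the model internals.
config = {
    "input_features": [
        {"name": "image_path", "type": "image"},  # photos of specimens
    ],
    "output_features": [
        {"name": "species", "type": "category"},  # label to predict
    ],
}

# A tool like Ludwig would consume this spec plus a table of
# (image_path, species) rows -- no training loop written by the user.
print(sorted(config))
```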

What can the labor flow of 500 million people on LinkedIn tell us about the structure of the global economy?


Paper by Jaehyuk Park et al: “…One of the most popular concepts for policy makers and business economists to understand the structure of the global economy is the “cluster”: the geographical agglomeration of interconnected firms such as Silicon Valley, Wall Street, and Hollywood. By studying those well-known clusters, we come to understand the advantage firms gain from participating in a geo-industrial cluster and how it relates to the economic growth of a region. 

However, the existing definition of the geo-industrial cluster is not systematic enough to reveal the whole picture of the global economy. Often, after being defined as a group of firms in a certain area, geo-industrial clusters are treated as independent of one another. But just as we must consider the interaction between the accounting team and the marketing team to understand the organizational structure of a firm, the relationships among geo-industrial clusters are an essential part of the whole picture….

In this new study, my colleagues and I at Indiana University — with support from LinkedIn — have finally overcome these limitations by defining geo-industrial clusters through labor flow and constructing a global labor flow network from LinkedIn’s individual-level job history dataset. Our access to this data was made possible by our selection as one of 11 teams to participate in the LinkedIn Economic Graph Challenge.

The transitioning of workers between jobs and firms — also known as labor flow — is considered central in driving firms towards geo-industrial clusters due to knowledge spillover and labor market pooling. In response, we mapped the cluster structure of the world economy based on labor mobility between firms during the last 25 years, constructing a “labor flow network.” 

To do this, we leverage LinkedIn’s data on professional demographics and employment histories from more than 500 million people between 1990 and 2015. The network, which captures approximately 130 million job transitions between more than 4 million firms, is the first-ever flow network of global labor.

The resulting “map” allows us to:

  • identify geo-industrial clusters systematically and organically using network community detection
  • verify the importance of region and industry in labor mobility
  • compare the relative importance of these two constraints at different hierarchical levels
  • reveal the practical advantage of the geo-industrial cluster as a unit of future economic analyses
  • show more clearly which industry in which region drives the economic growth of both that industry and that region, and
  • identify emerging and declining skills based on how strongly they are represented in growing and declining geo-industrial clusters…(More)”.
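The community detection mentioned in the first bullet can be illustrated on a toy labor-flow graph. This is an illustrative stand-in, not the authors’ actual algorithm: a simple weighted label-propagation pass over hypothetical job-transition counts between six firms, which recovers the two densely connected groups.

```python
from collections import Counter, defaultdict

# Hypothetical job-transition counts between six firms (undirected weights):
# two dense groups (A-C and D-F) joined by a single weak cross-group move.
transitions = {
    ("A", "B"): 10, ("A", "C"): 8, ("B", "C"): 9,
    ("D", "E"): 11, ("D", "F"): 7, ("E", "F"): 10,
    ("C", "D"): 1,
}

graph = defaultdict(dict)
for (u, v), w in transitions.items():
    graph[u][v] = w
    graph[v][u] = w

# Asynchronous label propagation: each firm repeatedly adopts the label
# carrying the most transition weight among its neighbours.
labels = {node: node for node in graph}
for _ in range(20):
    changed = False
    for node in sorted(graph):
        weights = Counter()
        for nbr, w in graph[node].items():
            weights[labels[nbr]] += w
        top = max(weights.values())
        best = min(l for l, w in weights.items() if w == top)  # deterministic ties
        if best != labels[node]:
            labels[node] = best
            changed = True
    if not changed:
        break

clusters = defaultdict(list)
for node, lab in sorted(labels.items()):
    clusters[lab].append(node)
print(sorted(clusters.values()))  # → [['A', 'B', 'C'], ['D', 'E', 'F']]
```

At the scale of the study — millions of firms — more sophisticated community-detection methods are required, but the principle is the same: firms bound by heavy labor flow end up in the same cluster.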

For academics, what matters more: journal prestige or readership?


Katie Langin at Science: “With more than 30,000 academic journals now in circulation, academics can have a hard time figuring out where to submit their work for publication. The decision is made all the more difficult by the sky-high pressure of today’s academic environment—including working toward tenure and trying to secure funding, which can depend on a researcher’s publication record. So, what does a researcher prioritize?

According to a new study posted on the bioRxiv preprint server, faculty members say they care most about whether the journal is read by the people they most want to reach—but they think their colleagues care most about journal prestige. Perhaps unsurprisingly, prestige also held more sway for untenured faculty members than for their tenured colleagues.

“I think that it is about the security that comes with being later in your career,” says study co-author Juan Pablo Alperin, an assistant professor in the publishing program at Simon Fraser University in Vancouver, Canada. “It means you can stop worrying so much about the specifics of what is being valued; there’s a lot less at stake.”

According to a different preprint that Alperin and his colleagues posted on PeerJ in April, 40% of research-intensive universities in the United States and Canada explicitly mention that journal impact factors can be considered in promotion and tenure decisions. Many more likely do so unofficially, with faculty members using journal names on a CV as a kind of shorthand for how “good” a candidate’s publication record is. “You can’t ignore the fact that journal impact factor is a reality that gets looked at,” Alperin says. But some argue that journal prestige and impact factor are overemphasized and harm science, and that academics should focus on the quality of individual work rather than journal-wide metrics. 

In the new study, only 31% of the 338 faculty members who were surveyed—all from U.S. and Canadian institutions and from a variety of disciplines, including 38% in the life and physical sciences and math—said that journal prestige was “very important” to them when deciding where to submit a manuscript. The highest priority was journal readership, which half said was very important. Fewer respondents felt that publication costs (24%) and open access (10%) deserved the highest importance rating.

But, when those same faculty members were asked to assess how their colleagues make the same decision, journal prestige shot to the top of the list, with 43% of faculty members saying that it was very important to their peers when deciding where to submit a manuscript. Only 30% of faculty members thought the same thing about journal readership—a drop of 20 percentage points compared with how faculty members assessed their own motivations….(More)”.

Hacking for Housing: How open data and civic hacking creates wins for housing advocates


Krista Chan at Sunlight: “…Housing advocates have an essential role to play in protecting residents from the consequences of real estate speculation. But they’re often at a significant disadvantage; the real estate lobby has access to a wealth of data and technological expertise. Civic hackers and open data could play an essential role in leveling the playing field.

Civic hackers have facilitated wins for housing advocates by scraping data or submitting FOIA requests where data is not open and creating apps to help advocates gain insights that they can turn into action. 

Hackers at New York City’s Housing Data Coalition created a host of civic apps that identify problematic landlords by exposing owners behind shell companies, or flagging buildings where tenants are at risk of displacement. In a similar vein, Washington DC’s Housing Insights tool aggregates a wide variety of data to help advocates make decisions about affordable housing.

Barriers and opportunities

Today, the degree to which housing data exists, is openly available, and is consistently reliable varies widely, even within cities. Cities with robust communities of affordable housing advocacy groups may not be connected to people who can help open up data and build usable tools. Even in cities with robust advocacy and civic tech communities, these groups may not know how to work together, because significant institutional knowledge is required to understand how best to support housing advocacy efforts.

In cities where civic hackers have tried to create useful open housing data repositories, similar data cleaning processes have been replicated, such as record linkage of building owners or identification of rent-controlled units. Civic hackers need to take on these data cleaning and “extract, transform, load” (ETL) processes in order to work with the data itself, even if it’s openly available. The Housing Data Coalition has assembled NYC-DB, a tool that builds a PostgreSQL database containing a variety of housing-related data pertaining to New York City, and Washington DC’s Housing Insights similarly ingests housing data into a PostgreSQL database and API for front-end access.
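Record linkage of building owners, one of the data-cleaning steps mentioned above, can be sketched in miniature. The owner names below are hypothetical and real pipelines are far more involved, but the pattern is the same: normalize common abbreviations, then fuzzy-match the cleaned strings to decide whether two records refer to the same owner.

```python
import re
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Canonicalize an owner name before comparison."""
    name = name.upper()
    name = re.sub(r"[.,]", "", name)
    # Collapse common suffix/abbreviation variants (a small illustrative set).
    replacements = {"STREET": "ST", "AVENUE": "AVE",
                    "LIMITED LIABILITY COMPANY": "LLC"}
    for long_form, short_form in replacements.items():
        name = name.replace(long_form, short_form)
    return re.sub(r"\s+", " ", name).strip()

def linked(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two records as the same owner if cleaned names are near-identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

print(linked("123 Main Street LLC", "123 MAIN ST, LLC"))   # → True
print(linked("123 Main Street LLC", "88 Oak Avenue LLC"))  # → False
```

Chaining steps like this over every ownership record — and storing the result in a shared database — is exactly the kind of repeated ETL work the coalition tools save each new city from redoing.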

Since these tools are open source, civic hackers in a multitude of cities can use existing work to develop their own, locally relevant tools to support local housing advocates….(More)”.

Concerns About Online Data Privacy Span Generations


Internet Innovation Alliance: “Are Millennials okay with the collection and use of their data online because they grew up with the internet?

In an effort to help inform policymakers about the views of Americans across generations on internet privacy, the Internet Innovation Alliance, in partnership with Icon Talks, the Hispanic Technology & Telecommunications Partnership (HTTP), and the Millennial Action Project, commissioned a national study of U.S. consumers, who have witnessed a steady stream of online privacy abuses, data misuses, and security breaches in recent years. The survey examined the concerns of U.S. adults—overall and broken out by age group and other demographics—about the collection and use of personal data and location information by tech and social media companies, including the tailoring of the online experience, the potential for personal financial information to be hacked from those companies, and the need for a single, national policy addressing consumer data privacy.

Download: “Concerns About Online Data Privacy Span Generations” IIA white paper pdf.

Download: “Consumer Data Privacy Concerns” Civic Science report pdf….(More)”

Value in the Age of AI


Project Syndicate: “Much has been written about Big Data, artificial intelligence, and automation. The Fourth Industrial Revolution will have far-reaching implications for jobs, ethics, privacy, and equality. But more than that, it will also transform how we think about value – where it comes from, how it is captured, and by whom.

In “Value in the Age of AI,” Project Syndicate, with support from the Dubai Future Foundation, GovLab (New York University), and the Centre for Data & Society (Brussels), will host an ongoing debate about the changing nature of value in the twenty-first century. In the commentaries below, leading thinkers at the intersection of technology, economics, culture, and politics discuss how new technologies are changing our societies, businesses, and individual lived experiences, and what that might mean for our collective future….(More)”.

Where next for open government?


Blog Post by Natalia Domagala: “…We can all agree that open government is a necessary and valuable concept. 

Nevertheless, eight years after the founding of the Open Government Partnership (OGP) — the leading intergovernmental forum moving the open government agenda forward — the challenge is now how to adapt its processes to reflect the dynamic and often unstable realm of global politics. 

For open government to be truly impactful, policies should account for the reality of government work. If we get this wrong, there is a risk of open government becoming a token of participation without any meaning. 

The collective goal of the open government community should be to make open government the new normal — an aim that requires looking at the cracks in the current process and thinking about what can be done to address them. 

As an example, the OGP has sent an increasing number of letters in the past few years, either as a reaction to national action plans being published too late or as notifications of late self-assessment returns. 

If a large number of countries across the geographical spectrum continuously miss these deadlines, it would indicate that a change of approach may be needed. Perhaps it’s time to move away from the two-year cycles of national action plans, which seemingly haven’t been working for an increasing number of countries, and experiment with the length and format of open government plans. 

Changing the policy rhythm

Longer, four- or six-year strategic commitments could lead to structural changes in how governments approach open data, participatory policymaking, and other principles of open government. 

Two years is a short time in the cycle of government, and often insufficient to deliver desirable results. The pressure to start thinking about the next plan halfway through implementing the current one can negatively affect the quality of commitments and their impact. 

Having a rolling NAP that is updated with very specific actions every two years could be another alternative. Open government is a vibrant and fast-growing movement, and action plans should reflect that by being living, interactive documents. Perhaps after two or three national action plans, countries should be allowed to adjust the cycle to their needs and domestic government planning timescales. 

There is an opportunity for open government as a movement in going beyond the national action plan commitments. Open government teams within governments should scrutinise existing policies and advise their colleagues on how to align their policymaking process with the principles of participation, accountability, and inclusion, to eventually embed the open government approach across all policy projects. 

Appetite for new strategies 

The rise of “open”, “agile”, and “participatory” attitudes to policy indicates that there is an appetite for more responsive and better-tailored strategies, an appetite that the global open government movement could look to satisfy. 

The next steps could be focused on raising awareness of open ways of working within governments, and developing the policymaker’s capacity to deploy them through workshops and guidance….(More)”.

The Psychology of Prediction


Blog post by Morgan Housel: “During the Vietnam War Secretary of Defense Robert McNamara tracked every combat statistic he could, creating a mountain of analytics and predictions to guide the war’s strategy.

Edward Lansdale, head of special operations at the Pentagon, once looked at McNamara’s statistics and told him something was missing.

“What?” McNamara asked.

“The feelings of the Vietnamese people,” Lansdale said.

That’s not the kind of thing a statistician pays attention to. But, boy, did it matter.

I believe in prediction. I think you have to in order to get out of bed in the morning.

But prediction is hard. Either you know that or you’re in denial about it.

A lot of the reason it’s hard is because the visible stuff that happens in the world is a small fraction of the hidden stuff that goes on inside people’s heads. The former is easy to overanalyze; the latter is easy to ignore.

This report describes 12 common flaws, errors, and misadventures that occur in people’s heads when predictions are made….(More)”.

What Restaurant Reviews Reveal About Cities


Linda Poon at CityLab: “Online review sites can tell you a lot about a city’s restaurant scene, and they can reveal a lot about the city itself, too.

Researchers at MIT recently found that information about restaurants gathered from popular review sites can be used to uncover a number of socioeconomic factors of a neighborhood, including its employment rates and demographic profiles of the people who live, work, and travel there.

A report published last week in the Proceedings of the National Academy of Sciences explains how the researchers used information found on Dianping—a Yelp-like site in China—to find information that might usually be gleaned from an official government census. The model could prove especially useful for gathering information about cities that don’t have that kind of reliable or up-to-date government data, especially in developing countries with limited resources to conduct regular surveys….

Zheng and her colleagues tested out their machine-learning model using restaurant data from nine Chinese cities of various sizes—from crowded ones like Beijing, with a population of more than 10 million, to smaller ones like Baoding, a city of fewer than 3 million people.

They pulled data from 630,000 restaurants listed on Dianping, including each business’s location, menu prices, opening day, and customer ratings. Then they ran it through a machine-learning model with official census data and with anonymous location and spending data gathered from cell phones and bank cards. By comparing the information, they were able to determine where the restaurant data reflected the other data they had about neighborhoods’ characteristics.

They found that the local restaurant scene can predict, with 95 percent accuracy, variations in a neighborhood’s daytime and nighttime populations, which are measured using mobile phone data. They can also predict, with 90 and 93 percent accuracy, respectively, the number of businesses and the volume of consumer spending. The types of cuisine offered and the kinds of eateries available (coffee shops vs. traditional teahouses, for example) can also predict the proportion of immigrants or the age and income breakdown of residents. The predictions are more accurate for neighborhoods near urban centers than for those near suburbs, and for smaller cities, where neighborhoods don’t vary as widely as they do in bigger metropolises….(More)”.
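The flavor of this kind of prediction can be sketched with a toy regression. The numbers below are entirely synthetic, not the study’s data, and the paper’s model uses far richer features; the sketch just shows how a single review-site signal (restaurant counts per neighborhood) can explain most of the variance in a census-style target (nighttime population).

```python
import statistics

# Hypothetical neighborhood-level data: restaurant counts from a review
# site, and nighttime population (thousands) from mobile-phone records.
restaurants = [12, 45, 30, 80, 22, 60, 15, 95, 40, 70]
population  = [ 8, 25, 18, 44, 13, 33, 10, 52, 23, 39]

x_bar = statistics.mean(restaurants)
y_bar = statistics.mean(population)

# Ordinary least squares by hand: slope = cov(x, y) / var(x).
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(restaurants, population))
         / sum((x - x_bar) ** 2 for x in restaurants))
intercept = y_bar - slope * x_bar
predicted = [slope * x + intercept for x in restaurants]

# R^2: share of population variance explained by restaurant counts alone.
ss_res = sum((y - p) ** 2 for y, p in zip(population, predicted))
ss_tot = sum((y - y_bar) ** 2 for y in population)
r2 = 1 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")
```

In the actual study, machine-learning models combine many such restaurant features (prices, ratings, cuisine mix, opening dates) to reach the accuracy figures reported above.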