Bringing machine learning to the masses


Matthew Hutson at Science: “Artificial intelligence (AI) used to be the specialized domain of data scientists and computer programmers. But companies such as Wolfram Research, which makes Mathematica, are trying to democratize the field, so scientists without AI skills can harness the technology for recognizing patterns in big data. In some cases, they don’t need to code at all. Insights are just a drag-and-drop away. One of the latest systems is software called Ludwig, first made open-source by Uber in February and updated last week. Uber used Ludwig for projects such as predicting food delivery times before releasing it publicly. At least a dozen startups are using it, plus big companies such as Apple, IBM, and Nvidia. And scientists: Tobias Boothe, a biologist at the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden, Germany, uses it to visually distinguish thousands of species of flatworms, a difficult task even for experts. To train Ludwig, he just uploads images and labels….(More)”.

What can the labor flow of 500 million people on LinkedIn tell us about the structure of the global economy?


Paper by Jaehyuk Park et al: “…One of the most popular concepts for policy makers and business economists to understand the structure of the global economy is “cluster”, the geographical agglomeration of interconnected firms such as Silicon Valley, Wall Street, and Hollywood. By studying those well-known clusters, we come to understand the advantage of participating in a geo-industrial cluster for firms and how it is related to the economic growth of a region.

However, the existing definition of geo-industrial cluster is not systematic enough to reveal the whole picture of the global economy. Often, once defined as a group of firms in a certain area, geo-industrial clusters are treated as independent of each other. Just as we should consider the interaction between the accounting team and the marketing team to understand the organizational structure of a firm, the relationships among those geo-industrial clusters are an essential part of the whole picture….

In this new study, my colleagues and I at Indiana University, with support from LinkedIn, have finally overcome these limitations by defining geo-industrial clusters through labor flow and constructing a global labor flow network from LinkedIn’s individual-level job history dataset. Our access to this data was made possible by our selection as one of 11 teams to participate in the LinkedIn Economic Graph Challenge.

The transitioning of workers between jobs and firms — also known as labor flow — is considered central in driving firms towards geo-industrial clusters due to knowledge spillover and labor market pooling. In response, we mapped the cluster structure of the world economy based on labor mobility between firms during the last 25 years, constructing a “labor flow network.” 

To do this, we leverage LinkedIn’s data on professional demographics and employment histories from more than 500 million people between 1990 and 2015. The network, which captures approximately 130 million job transitions between more than 4 million firms, is the first-ever flow network of global labor.

The resulting “map” allows us to:

  • identify geo-industrial clusters systematically and organically using network community detection
  • verify the importance of region and industry in labor mobility
  • compare the relative importance of the two constraints at different hierarchical levels
  • reveal the practical advantage of the geo-industrial cluster as a unit of future economic analyses
  • show a better picture of which industry in which region leads the economic growth of the industry or the region, at the same time
  • find out emerging and declining skills based on the representativeness of them in growing and declining geo-industrial clusters…(More)”.
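
The idea of deriving clusters from labor flow can be sketched in a few lines of Python. This is a toy illustration only: the excerpt does not say which community detection algorithm the authors used, so the sketch swaps in a simple deterministic label-propagation heuristic, and the worker ids, firm names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_flow_network(histories):
    """Count job transitions between consecutive employers.

    `histories` maps a hypothetical worker id to that worker's
    employers in chronological order.
    """
    flows = Counter()
    for employers in histories.values():
        for src, dst in zip(employers, employers[1:]):
            if src != dst:
                flows[(src, dst)] += 1
    return flows


def label_propagation(flows, max_rounds=20):
    """Group firms by repeatedly adopting the heaviest neighbouring label.

    This is a stand-in heuristic, not the method used in the paper.
    """
    # Ignore direction when grouping: a move in either direction links two firms.
    weights = defaultdict(Counter)
    for (src, dst), n in flows.items():
        weights[src][dst] += n
        weights[dst][src] += n

    labels = {firm: firm for firm in weights}
    for _ in range(max_rounds):
        changed = False
        for firm in sorted(weights):
            score = Counter()
            for other, n in weights[firm].items():
                score[labels[other]] += n
            # Deterministic tie-break: highest weight first, then smallest label.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[firm]:
                labels[firm] = best
                changed = True
        if not changed:
            break
    return labels


# Made-up job histories: two tight groups of firms joined by a single move.
histories = {
    "w1": ["A", "B", "C"],
    "w2": ["B", "A"],
    "w3": ["C", "A"],
    "w4": ["X", "Y", "Z"],
    "w5": ["Y", "X"],
    "w6": ["Z", "X"],
    "w7": ["C", "X"],
}
flows = build_flow_network(histories)
labels = label_propagation(flows)
print(labels)
```

On this made-up data, firms A, B and C end up sharing one label and X, Y and Z another, because the single move from C to X is outweighed by the flows inside each group.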

Blockchain and Democracy


Literature Review by Jörn Erbguth: “Democratic states are entities where issues are decided by a large group – the people. There is a democratic process that builds upon elections, a legislative procedure, judicial review and separation of powers by checks and balances. Blockchains rely on decentralization, meaning they rely on a large group of participants as well. Blockchains are therefore confronted with similar problems. Even further, blockchains try to avoid central coordinating authorities.

Consensus methods ensure that the systems align with the majority of their participants. Above the layer of the consensus method, blockchain governance coordinates decisions about software updates, bugfixes and possibly other interventions. What are the strengths and weaknesses of this blockchain governance?
Should we use blockchain to secure e-voting? Blockchain governance has two central aspects. First, it is decentralized governance based on a large group of people, which resembles democratic decision-making. Second, it is algorithmic decision-making and limits unwanted human intervention.

Cornerstones
Blockchain and democracy can be split into three areas:

First, the use of democratic principles in order to make blockchain work. This ranges from the basic consensus algorithm to the (self-)governance of a blockchain.

Second, blockchain is seen as providing a reliable tool for democracy. This ranges from the use of blockchain for electronic voting to the use in administration.

Third, to study possible impacts of blockchain technology on a democratic society. This focusses on regulatory and legal aspects as well as ethical aspects….(More)”

Hacking for Housing: How open data and civic hacking creates wins for housing advocates


Krista Chan at Sunlight: “…Housing advocates have an essential role to play in protecting residents from the consequences of real estate speculation. But they’re often at a significant disadvantage; the real estate lobby has access to a wealth of data and technological expertise. Civic hackers and open data could play an essential role in leveling the playing field.

Civic hackers have facilitated wins for housing advocates by scraping data or submitting FOIA requests where data is not open and creating apps to help advocates gain insights that they can turn into action. 

Hackers at New York City’s Housing Data Coalition created a host of civic apps that identify problematic landlords by exposing owners behind shell companies, or flagging buildings where tenants are at risk of displacement. In a similar vein, Washington DC’s Housing Insights tool aggregates a wide variety of data to help advocates make decisions about affordable housing.

Barriers and opportunities

Today, the degree to which housing data exists, is openly available, and is consistently reliable varies widely, even within cities themselves. Cities with robust communities of affordable housing advocacy groups may not be connected to people who can help open data and build usable tools. Even in cities with robust advocacy and civic tech communities, these groups may not know how to work together because of the significant institutional knowledge that’s required to understand how to best support housing advocacy efforts.

In cities where civic hackers have tried to create useful open housing data repositories, similar data cleaning processes have been replicated, such as record linkage of building owners or identification of rent-controlled units. Civic hackers need to take on these data cleaning and “extract, transform, load” (ETL) processes in order to work with the data itself, even if it’s openly available. The Housing Data Coalition has assembled NYC-DB, a tool which builds a postgres database containing a variety of housing-related data pertaining to New York City, and Washington DC’s Housing Insights similarly ingests housing data into a postgres database and API for front-end access.

Since these tools are open source, civic hackers in a multitude of cities can use existing work to develop their own, locally relevant tools to support local housing advocates….(More)”.

Concerns About Online Data Privacy Span Generations


Internet Innovation Alliance: “Are Millennials okay with the collection and use of their data online because they grew up with the internet?

In an effort to help inform policymakers about the views of Americans across generations on internet privacy, the Internet Innovation Alliance, in partnership with Icon Talks, the Hispanic Technology & Telecommunications Partnership (HTTP), and the Millennial Action Project, commissioned a national study of U.S. consumers who have witnessed a steady stream of online privacy abuses, data misuses, and security breaches in recent years. The survey examined the concerns of U.S. adults—overall and separated by age group, as well as other demographics—regarding the collection and use of personal data and location information by tech and social media companies, including tailoring the online experience, the potential for their personal financial information to be hacked from online tech and social media companies, and the need for a single, national policy addressing consumer data privacy.

Download: “Concerns About Online Data Privacy Span Generations” IIA white paper pdf.

Download: “Consumer Data Privacy Concerns” Civic Science report pdf….(More)”

Value in the Age of AI


Project Syndicate: “Much has been written about Big Data, artificial intelligence, and automation. The Fourth Industrial Revolution will have far-reaching implications for jobs, ethics, privacy, and equality. But more than that, it will also transform how we think about value – where it comes from, how it is captured, and by whom.

In “Value in the Age of AI,” Project Syndicate, with support from the Dubai Future Foundation, GovLab (New York University), and the Centre for Data & Society (Brussels), will host an ongoing debate about the changing nature of value in the twenty-first century. In the commentaries below, leading thinkers at the intersection of technology, economics, culture, and politics discuss how new technologies are changing our societies, businesses, and individual lived experiences, and what that might mean for our collective future….(More)”.

Strategies and limitations in app usage and human mobility


Paper by Marco De Nadai, Angelo Cardoso, Antonio Lima, Bruno Lepri, and Nuria Oliver: “Cognition has been found to constrain several aspects of human behaviour, such as the number of friends and the number of favourite places a person keeps stable over time. This limitation has been empirically defined in the physical and social spaces. But do people exhibit similar constraints in the digital space? We address this question through the analysis of pseudonymised mobility and mobile application (app) usage data of 400,000 individuals in a European country for six months. Despite the enormous heterogeneity of app usage, we find that individuals exhibit a conserved capacity that limits the number of applications they regularly use. Moreover, we find that this capacity steadily decreases with age, as does the capacity in the physical space but with more complex dynamics. Even though people might have the same capacity, applications get added and removed over time.

In this respect, we identify two profiles of individuals: app keepers and explorers, which differ in their stable (keepers) vs exploratory (explorers) behaviour regarding their use of mobile applications. Finally, we show that the capacity of applications predicts mobility capacity and vice-versa. By contrast, the behaviour of keepers and explorers may considerably vary across the two domains. Our empirical findings provide an intriguing picture linking human behaviour in the physical and digital worlds which bridges research studies from Computer Science, Social Physics and Computational Social Sciences…(More)”.
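
A small Python sketch can make the notion of a conserved app capacity concrete. It is illustrative only: the excerpt does not give the paper's definition of "regularly used", so the weekly windows, the `min_uses` threshold, the function name, and the sample events are all assumptions.

```python
from collections import Counter, defaultdict


def regular_apps(events, min_uses=2):
    """Return {window: set of apps used at least `min_uses` times}.

    `events` is an iterable of (window, app) pairs for one person.
    Both the windowing and the threshold are illustrative assumptions.
    """
    counts = defaultdict(Counter)
    for window, app in events:
        counts[window][app] += 1
    return {
        window: {app for app, n in per_app.items() if n >= min_uses}
        for window, per_app in counts.items()
    }


# Made-up usage log for one person over two weeks.
events = [
    (1, "mail"), (1, "mail"), (1, "maps"), (1, "maps"), (1, "game"),
    (2, "mail"), (2, "mail"), (2, "chat"), (2, "chat"), (2, "maps"),
]
regular = regular_apps(events)
capacity = {window: len(apps) for window, apps in regular.items()}
print(regular)
print(capacity)
```

In this made-up log the person regularly uses two apps in each week, but "maps" drops out of the regular set in week 2 and "chat" takes its place: the capacity stays the same while the apps filling it change.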

How Can We Use Administrative Data to Prevent Homelessness among Youth Leaving Care?


Article by Naomi Nichols: “In 2017, I was part of a team of people at the Canadian Observatory on Homelessness and A Way Home Canada who wrote a policy brief titled, Child Welfare and Youth Homelessness in Canada: A proposal for action. Drawing on the results of the first pan-Canadian survey on youth homelessness, Without a Home: The National Youth Homelessness Survey, the brief focused on the disproportionate number of young people who had been involved with child protection services and then later became homeless. Indeed, 57.8% of homeless youth surveyed reported some type of involvement with child protection services over their lifetime. By comparison, in the general population, only 0.3% of young people receive child welfare service. This means that youth experiencing homelessness are far more likely to report interactions with the child welfare system than young people in the general population.

Where research reveals systematic patterns of exclusion and neglect – that is, where findings reveal that one group is experiencing disproportionately negative outcomes (relative to the general population) in a particular public sector context – this suggests the need for changes in public policy, programming and practice. Since producing this brief, I have been working with an incredibly talented and passionate McGill undergraduate student (who also happens to be the Vice President of Youth in Care Canada), Arisha Khan. Together, we have been exploring just uses of data to better serve the interests of those young people who depend on the state for their access to basic services (e.g., housing, healthcare and food) as well as their self-efficacy and status as citizens. 

One component of this work revolved around a grant application that has just been funded by the Social Sciences and Humanities Research Council of Canada (Data Justice: Fostering equitable data-led strategies to prevent, reduce and end youth homelessness). Another aspect of our work revolved around a policy brief, which we co-wrote and published with the Montreal data-for-good organization, Powered by Data. The brief outlines how a rights-based and custodial approach to administrative data could a) effectively support young people in and leaving care to participate more actively in their transition planning and engage in institutional self-advocacy; and b) enable systemic oversight of intervention implementation and outcomes for young people in and leaving the provincial care system. We produced this brief with the hope that it would be useful to government decision-makers, service providers, researchers, and advocates interested in understanding how institutional data could be used to improve outcomes for youth in and leaving care. In particular, we wanted to explore whether a different orientation to data collection and use in child protection systems could prevent young people from graduating from provincial child welfare systems into homelessness. In addition to this practical concern, we also undertook to think through the ethical and human rights implications of more recent moves towards data-driven service delivery in Canada, focusing on how we might make this move with the best interests of young people in mind. 

As data collection, management and use practices have become more popular, research is beginning to illuminate how these new monitoring, evaluative and predictive technologies are changing governance processes within and across the public sector, as well as in civil society….(More)”.

Data Is a Development Issue


Paper by Susan Ariel Aaronson: “Many wealthy states are transitioning to a new economy built on data. Individuals and firms in these states have expertise in using data to create new goods and services as well as in how to use data to solve complex problems. Other states may be rich in data but do not yet see their citizens’ personal data or their public data as an asset. Most states are learning how to govern and maintain trust in the data-driven economy; however, many developing countries are not well positioned to govern data in a way that encourages development. Meanwhile, some 76 countries are developing rules and exceptions to the rules governing cross-border data flows as part of new negotiations on e-commerce. This paper uses a wide range of metrics to show that most developing and middle-income countries are not ready or able to provide an environment where their citizens’ personal data is protected and where public data is open and readily accessible. Not surprisingly, greater wealth is associated with better scores on all the metrics. Yet, many industrialized countries are also struggling to govern the many different types and uses of data. The paper argues that data governance will be essential to development, and that donor nations have a responsibility to work with developing countries to improve their data governance….(More)”.

The New York Times thinks a blockchain could help stamp out fake news


MIT Technology Review: “Blockchain technology is at the core of a new research project the New York Times has launched, aimed at making “the origins of journalistic content clearer to [its] audience.”

The news: The Times has launched what it calls The News Provenance Project, which will experiment with ways to combat misinformation in the news media. The first project will focus on using a blockchain—specifically a platform designed by IBM—to prove that photos are authentic.

Blockchain? Really? Rumors and speculation swirled in March, after CoinDesk reported that the New York Times was looking for someone to help it develop a “blockchain-based proof-of-concept for news publishers.” Though the newspaper removed the job posting after the article came out, apparently it was serious. In a new blog post, project lead Sasha Koren explains that by using a blockchain, “we might in theory provide audiences with a way to determine the source of a photo, or whether it had been edited after it was published.”

Unfulfilled promise: Using a blockchain to prove the authenticity of journalistic content has long been considered a potential application of the technology, but attempts to do it so far haven’t gotten much traction. If the New York Times can develop a compelling application, it has enough influence to change that….(More)”.
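
The provenance idea quoted above (checking a photo's source, or whether it was edited after publication) can be illustrated with a toy hash chain in Python. This is a minimal sketch under stated assumptions, not the design of The News Provenance Project or of IBM's platform: it is a single in-memory list with no network or consensus, and every function and field name is hypothetical.

```python
import hashlib
import json


def record_hash(record, prev_hash):
    """Hash a record together with the hash of the previous entry."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_photo(chain, photo_bytes, source):
    """Append a photo fingerprint and its claimed source to the chain."""
    record = {
        "source": source,
        "photo_sha256": hashlib.sha256(photo_bytes).hexdigest(),
    }
    prev_hash = chain[-1]["hash"] if chain else ""
    chain.append(
        {"record": record, "prev": prev_hash, "hash": record_hash(record, prev_hash)}
    )


def chain_is_intact(chain):
    """Recompute every link; any altered entry breaks the chain."""
    prev_hash = ""
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


def lookup_source(chain, photo_bytes):
    """Return the recorded source for these exact bytes, or None."""
    digest = hashlib.sha256(photo_bytes).hexdigest()
    for entry in chain:
        if entry["record"]["photo_sha256"] == digest:
            return entry["record"]["source"]
    return None


chain = []
append_photo(chain, b"original pixels", "Staff photographer")

found = lookup_source(chain, b"original pixels")   # recorded source
edited = lookup_source(chain, b"edited pixels")    # None: bytes changed
intact = chain_is_intact(chain)

# Rewriting a stored record invalidates its hash.
tampered = [dict(entry) for entry in chain]
tampered[0]["record"] = {**tampered[0]["record"], "source": "Someone else"}
tampered_intact = chain_is_intact(tampered)
```

Because the lookup is keyed on a hash of the exact bytes, an edited copy of a photo no longer matches any recorded entry. A real system would also need a way to handle legitimate re-encoding and cropping, which this sketch does not attempt.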