Open data for electricity modeling: Legal aspects


Paper by Lion Hirth: “Power system modeling is data intensive. In Europe, electricity system data is often available from sources such as statistical offices or system operators. However, it is often unclear if these data can be legally used for modeling, and in particular if such use infringes intellectual property rights. This article reviews the legal status of power system data, both as a guide for data users and for data publishers.

It is based on interpretation of the law, a review of the secondary literature, an analysis of the licenses used by major data distributors, expert interviews, and a series of workshops. A core finding is that in many cases the legality of current practices is doubtful: in fact, it seems likely that modelers infringe intellectual property rights quite regularly. This is true for industry analysts as well as academic researchers. A straightforward solution is open data – the idea that data can be freely used, modified, and shared by anyone for any purpose. To be open, it is not sufficient for data to be accessible free of cost; it must also come with an open data license, the most common types of which are also reviewed in this paper….(More)”.

Federal Sources of Entrepreneurship Data: A Compendium


Compendium developed by Andrew Reamer: “The Ewing Marion Kauffman Foundation has asked the George Washington Institute of Public Policy (GWIPP) to prepare a compendium of federal sources of data on self-employment, entrepreneurship, and small business development. The Foundation believes that the availability of useful, reliable federal data on these topics would enable robust descriptions and explanations of entrepreneurship trends in the United States and so help guide the development of effective entrepreneurship policies.


Achieving these ends first requires the identification and detailed description of available federal datasets, as provided in this compendium. Its contents include:

  • An overview and discussion of 18 datasets from four federal agencies, organized by two categories and five subcategories.
  • Tables providing information on each dataset, including:
    • scope of coverage of self-employed, entrepreneurs, and businesses;
    • data collection methods (nature of data source, periodicity, sampling frame, sample size);
    • dataset variables (owner characteristics, business characteristics and operations, geographic areas);
    • data release schedule; and
    • data access by format (including fixed tables, interactive tools, API, FTP download, public use microdata samples [PUMS], and confidential microdata).

  • For each dataset, examples of studies, if any, that use the data source to describe and explain trends in entrepreneurship.
The author’s aim is for the compendium to facilitate an assessment of the strengths and weaknesses of currently available federal datasets, discussion about how data availability and value can be improved, and implementation of desired improvements…(More)”

Why the Global South should nationalise its data


Ulises Ali Mejias at Al Jazeera: “The recent coup in Bolivia reminds us that poor countries rich in resources continue to be plagued by the legacy of colonialism. Anything that stands in the way of a foreign corporation’s ability to extract cheap resources must be removed.

Today, apart from minerals and fossil fuels, corporations are after another precious resource: personal data. As with natural resources, data too has become the target of extractive corporate practices.

As sociologist Nick Couldry and I argue in our book, The Costs of Connection: How Data is Colonizing Human Life and Appropriating It for Capitalism, there is a new form of colonialism emerging in the world: data colonialism. By this, we mean a new resource-grab whereby human life itself has become a direct input into economic production in the form of extracted data.

We acknowledge that this term is controversial, given the extreme physical violence and structures of racism that historical colonialism employed. However, our point is not to say that data colonialism is the same as historical colonialism, but rather to suggest that it shares the same core function: extraction, exploitation, and dispossession.

Like classical colonialism, data colonialism violently reconfigures human relations to economic production. Things like land, water, and other natural resources were valued by native people in the precolonial era, but not in the same way that colonisers (and later, capitalists) came to value them: as private property. Likewise, we are experiencing a situation in which things that were once primarily outside the economic realm – things like our most intimate social interactions with friends and family, or our medical records – have now been commodified and made part of an economic cycle of data extraction that benefits a few corporations.

So what could countries in the Global South do to avoid the dangers of data colonialism?…(More)”.

Industry and Public Sector Leaders Partner to Launch the Mobility Data Collaborative


Press Release: “The Mobility Data Collaborative (the Collaborative), a multi-sector forum with the goal of creating a framework to improve mobility through data, launches today…

New mobility services, such as shared cars, bikes, and scooters, are emerging and integrating into the urban transportation landscape across the globe. Data generated by these new mobility services offers an exciting opportunity to inform local policies and infrastructure planning. The Collaborative brings together key members from the public and private sectors to develop best practices to harness the potential of this valuable data to support safe, equitable, and livable streets.

The Collaborative will leverage the knowledge of its current and future members to solve the complex challenges facing shared mobility operators and the public agencies that manage access to the infrastructure these new services require. A critical component of this collaboration is providing an open and impartial forum for sharing information and developing best practices.

Membership is open to public agencies, nonprofits, academic institutions and private companies….(More)”.

Imagery: A better “picture” of the city


Daniel Arribas-Bel at Catapult: “When trying to understand something as complex as the city, every bit of data helps create a better picture. Researchers, practitioners and policymakers gather as much information as they can to represent every aspect of their city – from noise levels captured by open-source sensors and the study of social isolation using tweets to where the latest hipster coffee shop has opened. Exploration and creativity seem to have no limits.

But what about imagery?

You might well ask, what type of images? How do you analyse them? What’s the point anyway?

Let’s start with the why. Images contain visual cues that encode a host of socio-economic information. Imagine a picture of a street with potholes outside a derelict house next to a burnt-out car. It may be easy to make some fairly sweeping assumptions about the average income of its resident population. Or picture a street with a trendy barber shop next door to a coffee shop with bare concrete feature walls on one side, and an independent record shop on the other. Again, it may be possible to describe the character of this area.

These are just some of the many kinds of signals embedded in image data. In fact, there is an entire literature in geography and sociology that documents these associations (see, for example, Cityscapes by Daniel Aaron Silver and Terry Nichols Clark for a sociology approach and The Predictive Postcode by Richard Webber and Roger Burrows for a geography perspective). Imagine if we could figure out ways to condense such information into formal descriptors of cities that help us measure aspects that traditional datasets can’t, or to update them more frequently than standard sources currently allow…(More)”.
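The idea of "condensing" imagery into a formal descriptor can be sketched with a deliberately simple toy: a coarse color histogram that turns any image into a fixed-length vector, so that two places can be compared numerically. (Real urban-imagery pipelines typically use learned features such as CNN embeddings; the pixel values below are made up purely for illustration.)

```python
def color_descriptor(pixels, bins=4):
    """Condense an image, given as a list of (r, g, b) tuples with values
    0-255, into a fixed-length vector: a coarse joint color histogram."""
    hist = [0] * (bins ** 3)
    step = 256 // bins  # width of each color bin (64 for bins=4)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = len(pixels)
    return [count / total for count in hist]

# Two toy 100-pixel "street scenes" (hypothetical pixel values):
# mostly grey concrete versus a leafier, greener street.
concrete = [(120, 120, 120)] * 90 + [(60, 60, 60)] * 10
leafy = [(40, 140, 50)] * 80 + [(120, 120, 120)] * 20

d1, d2 = color_descriptor(concrete), color_descriptor(leafy)
# Euclidean distance between the descriptors separates the two scenes.
distance = sum((a - b) ** 2 for a, b in zip(d1, d2)) ** 0.5
print(round(distance, 2))
```

The same comparison works for any pair of places once each image is reduced to the same fixed-length vector, which is what makes such descriptors usable alongside traditional datasets.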

Assessing employer intent when AI hiring tools are biased


Report by Caitlin Chin at Brookings: “When it comes to gender stereotypes in occupational roles, artificial intelligence (AI) has the potential to either mitigate historical bias or heighten it. In the case of the Word2vec model, AI appears to do both.

Word2vec is a publicly available algorithmic model built on millions of words scraped from online Google News articles, which computer scientists commonly use to analyze word associations. In 2016, Microsoft and Boston University researchers revealed that the model picked up gender stereotypes existing in online news sources—and furthermore, that these biased word associations were overwhelmingly job related. Upon discovering this problem, the researchers neutralized the biased word correlations in their specific algorithm, writing that “in a small way debiased word embeddings can hopefully contribute to reducing gender bias in society.”
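The geometry behind that neutralization step can be sketched in a few lines. This is not the researchers' code: it uses made-up 3-dimensional toy vectors rather than real Word2vec embeddings, but it shows the core idea of the 2016 debiasing work, namely removing the component of a word vector that lies along a "gender direction" (here, he − she).

```python
import math

def dot(u, v): return sum(a * b for a, b in zip(u, v))
def norm(u): return math.sqrt(dot(u, u))
def cosine(u, v): return dot(u, v) / (norm(u) * norm(v))

# Toy embeddings -- hypothetical numbers, for illustration only.
emb = {
    "he":         [0.9, 0.1, 0.0],
    "she":        [-0.9, 0.1, 0.0],
    "programmer": [0.4, 0.8, 0.1],  # leans toward "he" in this toy space
}

# Gender direction, in the spirit of the 2016 debiasing paper.
g = [a - b for a, b in zip(emb["he"], emb["she"])]

def debias(v, g):
    """Remove the component of v that lies along direction g."""
    scale = dot(v, g) / dot(g, g)
    return [a - scale * b for a, b in zip(v, g)]

before = cosine(emb["programmer"], g)            # nonzero: gendered association
after = cosine(debias(emb["programmer"], g), g)  # ~0: association neutralized
print(round(before, 3), round(after, 3))
```

After the projection is subtracted, the occupation word is equidistant from "he" and "she" along the gender axis, which is exactly the property the researchers enforced for job-related words.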

Their study draws attention to a broader issue with artificial intelligence: Because algorithms often emulate the training datasets that they are built upon, biased input datasets could generate flawed outputs. Because many contemporary employers utilize predictive algorithms to scan resumes, direct targeted advertising, or even conduct face- or voice-recognition-based interviews, it is crucial to consider whether popular hiring tools might be susceptible to the same cultural biases that the researchers discovered in Word2vec.

In this paper, I discuss how hiring is a multi-layered and opaque process and how it will become more difficult to assess employer intent as recruitment processes move online. Because intent is a critical aspect of employment discrimination law, I ultimately suggest four ways to include it in the discussion surrounding algorithmic bias….(More)”

This report from The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative is part of “AI and Bias,” a series that explores ways to mitigate possible biases and create a pathway toward greater fairness in AI and emerging technologies.

Platform Urbanism: Negotiating Platform Ecosystems in Connected Cities


Book by Sarah Barns: “This book reflects on what it means to live as urban citizens in a world increasingly shaped by the business and organisational logics of digital platforms. Where smart city strategies promote the roll-out of internet of things (IoT) technologies and big data analytics by city governments worldwide, platform urbanism responds to the deep and pervasive entanglements that exist between urban citizens, city services and platform ecosystems today.    

Recent years have witnessed a backlash against major global platforms, evidenced by burgeoning literatures on platform capitalism, the platform society, platform surveillance and platform governance, as well as regulatory attention towards the market power of platforms in their dominance of global data infrastructure.  

This book responds to these developments and asks: How do platform ecosystems reshape connected cities? How do urban researchers and policy makers respond to the logics of platform ecosystems and platform intermediation? What sorts of multisensory urban engagements are rendered through platform interfaces and modalities? And what sorts of governance challenges and responses are needed to cultivate and champion the digital public spaces of our connected lives….(More)”.

Data Protection in the Humanitarian Sector – A Blockchain Approach


Report by Andrej Verity and Irene Solaiman: “Data collection and storage are becoming increasingly digital. In the humanitarian sector, data motivates action, informing organizations who then determine priorities and resource allocation in crises.

“Humanitarians are dependent on technology and on the Internet. When life-saving aid isn’t delivered on time and to the right beneficiaries, people can die.” -Brookings

In the age of information and cyber warfare, humanitarian organizations must take measures to protect civilians, especially those in critical and vulnerable positions.

“Data privacy and ensuring protection from harm, including the provision of data security, are therefore fundamentally linked—and neither can be realized without the other.” -The Signal Code

Information in the wrong hands can risk lives or even force aid organizations to shut down. For example, in 2009, Sudan expelled over a dozen international nongovernmental organizations (NGOs) that were deemed key to maintaining a lifeline to 4.7 million people in western Darfur. The expulsion occurred after the Sudanese Government collected Internet-accessible information that made leadership fear international criminal charges. Responsible data protection is a crucial component of cybersecurity. As technology develops, so do threats and data vulnerabilities. Emerging technologies such as blockchain provide further security to sensitive information and overall data storage. Still, with new technologies come considerations for implementation…(More)”.
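The core security property blockchains contribute here, tamper evidence, can be illustrated with a minimal hash chain. This is a toy sketch, not how any humanitarian organization actually stores beneficiary data (the record fields below are invented): each record's hash covers the previous record's hash, so any retroactive edit to stored data becomes detectable.

```python
import hashlib
import json

def add_record(chain, payload):
    """Append a record whose hash covers both the payload and the
    previous record's hash, making retroactive edits detectable."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"payload": payload, "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps({"payload": payload, "prev_hash": prev_hash},
                   sort_keys=True).encode()
    ).hexdigest()
    chain.append(block)

def verify(chain):
    """Recompute every hash; any altered payload breaks the chain."""
    prev_hash = "0" * 64
    for block in chain:
        expected = hashlib.sha256(
            json.dumps({"payload": block["payload"], "prev_hash": prev_hash},
                       sort_keys=True).encode()
        ).hexdigest()
        if block["hash"] != expected or block["prev_hash"] != prev_hash:
            return False
        prev_hash = block["hash"]
    return True

chain = []
add_record(chain, {"site": "camp-A", "households": 120})
add_record(chain, {"site": "camp-B", "households": 85})
print(verify(chain))                       # prints True: chain intact
chain[0]["payload"]["households"] = 999    # tamper with an early record
print(verify(chain))                       # prints False: tampering detected
```

Production blockchains add distribution and consensus on top of this chaining, but the integrity guarantee that matters for sensitive humanitarian records is already visible in the toy.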

What are hidden data treasuries and how can they help development outcomes?


Blogpost by Damien Jacques et al: “Cashew nuts in Burkina Faso can be seen growing from space. Such is the power of satellite technology that it’s now possible to observe the changing colors of fields as crops slowly ripen.

This matters because it can be used as an early warning of crop failure and food crisis – giving governments and aid agencies more time to organize a response.

Our team built an exhaustive crop type and yield estimation map in Burkina Faso, using artificial intelligence and satellite images from the European Space Agency. 

But building the map would not have been possible without a data set that GIZ, the German government’s international development agency, had collected for one purpose on the ground some years before – and never looked at again.

At Dalberg, we call this a “hidden data treasury” and it has huge potential to be used for good. 

Unlocking data potential

In the records of the GIZ Data Lab, the GPS coordinates and crop yield measurements of just a few hundred cashew fields were sitting dormant.

They’d been collected in 2015 to assess the impact of a program to train farmers. But through the power of machine learning, that data set has been given a new purpose.

Using Dalberg Data Insights’ AIDA platform, our team trained algorithms to analyze satellite images for cashew crops, track the crops’ color as they ripen, and from there, estimate yields for the area covered by the data.

From this, it’s now possible to predict crop failures for thousands of fields.
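The calibration step described above can be sketched in miniature. The actual AIDA pipeline is far more sophisticated, but the principle is the same: use the small set of ground-truth fields (like the dormant GIZ measurements) to fit a model from a satellite-derived signal to yield, then apply it to fields observed only from space. All numbers below are hypothetical.

```python
# Toy ground-truth data: (peak "greenness" from satellite imagery,
# measured yield in kg/ha from field surveys) -- hypothetical values.
fields = [
    (0.55, 410.0),
    (0.61, 480.0),
    (0.48, 350.0),
    (0.70, 560.0),
    (0.52, 390.0),
]

# Ordinary least squares fit: yield ≈ a * peak_greenness + b
n = len(fields)
sx = sum(x for x, _ in fields)
sy = sum(y for _, y in fields)
sxx = sum(x * x for x, _ in fields)
sxy = sum(x * y for x, y in fields)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

def predict_yield(peak_greenness):
    """Estimate yield for a field never visited on the ground."""
    return a * peak_greenness + b

print(round(predict_yield(0.65), 1))
```

Once fitted, the model scales to every field the satellite can see, which is how a few hundred ground measurements can underpin yield estimates, and failure warnings, for thousands of fields.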

We believe this “recycling” of old data, when paired with artificial intelligence, can help to bridge the data gaps in low-income countries and meet the UN’s Sustainable Development Goals….(More)”.

The Politics of Open Government Data: Understanding Organizational Responses to Pressure for More Transparency


Paper by Erna Ruijer et al: “This article contributes to the growing body of literature within public management on open government data by taking a political perspective. We argue that open government data are a strategic resource of organizations and therefore organizations are not likely to share them. We develop an analytical framework for studying the politics of open government data, based on theories of strategic responses to institutional processes, government transparency, and open government data. The framework shows that there can be different organizational strategic responses to open data—varying from conformity to active resistance—and that different institutional antecedents influence these responses. The value of the framework is explored in two cases: a province in the Netherlands and a municipality in France. The cases provide insights into why governments might release datasets in certain policy domains but not in others thereby producing “strategically opaque transparency.” The article concludes that the politics of open government data framework helps us understand open data practices in relation to broader institutional pressures that influence government transparency….(More)”.