The Magazine of Early American Datasets


Mark Boonshoft at The Junto: “Data. Before postmodernism, or environmental history, or the cultural turn, or the geographic turn, and even before the character on the old Star Trek series, historians began to gather and analyze quantitative evidence to understand the past. As computers became common during the 1970s and 1980s, scholars responded by painstakingly compiling and analyzing datasets, using that evidence to propose powerful new historical interpretations. Today, much of that information (as well as data compiled since) is in danger of disappearing. For that and other reasons, we have developed a website designed to preserve and share the datasets permanently (or at least until aliens destroy our planet). We appeal to all early American historians (not only the mature ones from earlier decades) to take the time both to preserve and to share their statistical evidence with present and future scholars. It will not only be a legacy to the profession but also will encourage historians to share their data more openly and to provide a foundation on which scholars can build.

In coordination with the McNeil Center for Early American Studies and specialists at the University of Pennsylvania Libraries, in addition to bepress, we have established the Magazine of Early American Datasets (MEAD), available at http://repository.upenn.edu/mead/. We’d love to have your datasets, your huddled 1’s and 0’s (and other numbers and letters) yearning to be free. The best format would be either .csv or, if you have commas in your data, .txt, because both are non-proprietary and somewhat close to universal. However, if the data is in other forms, like Access, Excel, or SPSS, that will do fine as well. Ultimately, we should be able to convert files to a more permanent database and to preserve those files in perpetuity. In addition, we are asking scholars, out of the goodness of their hearts and commitment to the profession, to upload a separate document as a codebook explaining the meaning of the variables. The files will all be available to any scholar regardless of academic affiliation.

How will a free, open, centralized data center benefit early American historians, and why should you participate in using and sharing data? Let us count just a few ways. In our experience, most historians of early America are extremely generous in sharing not only their expertise but also their evidence with other scholars. However, that generally occurs on an individual, case-by-case basis in a somewhat serendipitous fashion. A centralized website would permit scholars quickly to investigate whether quantitative evidence is available on which they might begin to construct their own research. Ideally, scholars setting out on a new topic might be guided somewhat by the existence and availability of data. Moreover, it would set a precedent that future historians might follow—routinely sharing their evidence, either before or after their publications analyzing the data have appeared in print or online….(More)”
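The deposit the MEAD authors describe, a non-proprietary CSV file plus a plain-text codebook, can be sketched as follows. All file names, variables, and figures below are invented for illustration; note that standard CSV writers quote any field that itself contains a comma, so text values are safe in .csv.

```python
import csv

# Invented toy dataset; in practice these rows would be exported from
# Excel, Access, or SPSS before deposit.
rows = [
    {"town": "Boston", "year": 1765, "taxables": 2069},
    {"town": "Salem", "year": 1765, "taxables": 1023},
]

# Write the data as non-proprietary CSV; csv.DictWriter quotes any
# field containing a comma automatically.
with open("dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["town", "year", "taxables"])
    writer.writeheader()
    writer.writerows(rows)

# A separate plain-text codebook explains the meaning of each variable.
codebook = {
    "town": "Name of the town",
    "year": "Year of the tax valuation",
    "taxables": "Number of taxable polls recorded",
}
with open("codebook.txt", "w") as f:
    for variable, meaning in codebook.items():
        f.write(f"{variable}: {meaning}\n")
```

A pair of files like this keeps the evidence readable without any proprietary software, which is the point of the authors’ format advice.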

Open Data Intermediaries: Their Crucial Role


Web Foundation: “….data intermediaries are undertaking a wide range of functions.  As well as connecting data providers (for example, governments) with those who can benefit by using data or data-driven products, intermediaries are helping to articulate demand for data, creating and repackaging data, and creating novel applications. In Nepal for instance, intermediaries play a range of roles, from running the government open data portal, to translating complex data sets into formats that are easily understood by a population that is largely offline and suffers from low literacy levels.

How are intermediaries connecting data providers to end-users?

To answer this question, French sociologist Pierre Bourdieu’s social model, in particular his concept of species of capital, was used as a lens. According to Bourdieu, capital is not only economic or material. We use other symbolic forms of capital like social capital (e.g. friends and memberships) or cultural capital (e.g. competencies and qualifications) in our social interactions.

Unsurprisingly, the study found that most open data intermediaries use their technical capital to connect to data providers (97%). However, to fulfil their role of not only connecting to data providers but of facilitating the use of open data, open data intermediaries require multiple forms of capital. And because no single intermediary necessarily has all the types of capital to link effectively to users, multiple intermediaries with complementary configurations of capital are more likely to connect data providers and users.

[Figure: A model of layers of intermediaries connecting a data source with users]

 

….Of the 32 intermediaries studied, 72% can be described as not-for-profit and, as a consequence, rely on donor funding to sustain their operations. This has significant implications for future sustainability.

What are some of the key conclusions and implications of the study?

  • Intermediaries are playing a critical role in making data truly useful.
  • The presence of multiple intermediaries in an ecosystem may increase the probability of use (and impact) because no single intermediary is likely to possess all the types of capital required to unlock the full value of the transaction between the provider and the end user.
  • Working either alone or in collaboration with others, intermediaries must go beyond technical capital to unlock the benefits of open data – using social, political or economic capital too.
  • Governments would do well to engage with a broad spectrum of intermediaries, and not simply focus on intermediaries who possess only the technical capital required to interpret and repackage open government data.
  • Given that intermediaries are presently largely donor funded, in the short term, funders should ask whether possible grantees possess all the types of capital required not only to re-use open data but to connect open data to specific user groups in order to ensure the use and impact of open data.
  • In the medium term, different funding models for intermediaries may need to be explored, or the sustainability of civically-minded open data initiatives could be at risk….

ACCESS THE FULL REPORT:

Open Data Intermediaries in Developing Countries

By François van Schalkwyk, Michael Cañares, Sumandro Chattapadhyay & Alexander Andrason

How Startups Are Transforming the Smart City Movement


Jason Shueh at GovTech: “Remember the 1990s visions of the future? Those first incarnations of the sweeping “smart city,” so technologically utopian and Tomorrowland-ish in design? The concept and solutions were pitched by tech titans like IBM and Cisco, cost obscene amounts of money, and promised equally outlandish levels of innovation.

It was a drive — as idealistic as it was expedient — to spark a new industry that infused cities with data, analytics, sensors and clean energy. Two-and-a-half decades later, the smart city market has evolved. Its solutions are more pragmatic and its benefits more potent. Evidence brims in Singapore, where officials boast that they can predict traffic congestion an hour in advance with 90 percent accuracy. Similarly, in Chicago, the city has embraced analytics to estimate rodent infestations and prioritize restaurant inspections. These, of course, are a few standouts, but as many know, the movement is highly diverse and runs its fingers through cities and across continents.

And yet what’s not as well-known is what’s happened in the last few years. The industry appears to be undergoing another metamorphosis, one that takes the ingenuity inspired by its beginnings and reimagines it with the help of do-it-yourself entrepreneurs….

Asked for a definition, Abrahamson centered his interpretation on tech that enhances quality of life. With the possible exception of health care, finance and education (systems large enough to merit their own categories), Abrahamson explains smart cities by highlighting investment areas at Urban.us. Specific areas are packaged as follows:

Mobility and Logistics: How cities move people and things to, from and within cities.

Built Environment: The public and private spaces in which citizens work and live.

Utilities: Critical resources including water, waste and energy.

Service Delivery: How local governments provide services ranging from public works to law enforcement….

Who’s Investing?

….Here is a sampling of a few types, with examples of their startup investments.

General Venture Capitalists

a16z (Andreessen Horowitz) – Mapillary and Moovit

Specialty Venture Capitalists

Fontinalis – Lyft, ParkMe, LocoMobi

Black Coral Capital – Digital Lumens, Clean Energy Collective, newterra

Govtech Fund – AmigoCloud, Mark43, MindMixer

Corporate Venture Capitalists

Google Ventures – Uber, Skycatch, Nest

Motorola Solutions Venture Capital – CyPhy Works and SceneDoc

BMW i Ventures – Life360 and ChargePoint

Impact/Social Investors

Omidyar Network – SeeClickFix and Nationbuilder

Knight Foundation – Public Stuff, Captricity

Kapor Capital – Uber, Via, Blocpower

1776 – Radiator Labs, Water Lens… (More)

The Spectrum of Control: A Social Theory of the Smart City


Sadowski, Jathan and Pasquale, Frank at First Monday: “There is a certain allure to the idea that cities allow a person to both feel at home and like a stranger in the same place. That one can know the streets and shops, avenues and alleys, while also going days without being recognized. But as elites fill cities with “smart” technologies — turning them into platforms for the “Internet of Things” (IoT): sensors and computation embedded within physical objects that then connect, communicate, and/or transmit information with or between each other through the Internet — there is little escape from a seamless web of surveillance and power. This paper will outline a social theory of the “smart city” by developing our Deleuzian concept of the “spectrum of control.” We present two illustrative examples: biometric surveillance as a form of monitoring, and automated policing as a particularly brutal and exacting form of manipulation. We conclude by offering normative guidelines for governance of the pervasive surveillance and control mechanisms that constitute an emerging critical infrastructure of the “smart city.”…(More)”

 

A systematic review of open government data initiatives


Paper by J. Attard, F. Orlandi, S. Scerri, and S. Auer in Government Information Quarterly: “We conduct a systematic survey with the aim of assessing open government data initiatives; that is, any attempt, by a government or otherwise, to open data that is produced by a governmental entity. We describe the open government data life-cycle and we focus our discussion on publishing and consuming processes required within open government data initiatives. We cover current approaches undertaken for such initiatives, and classify them. A number of evaluations found within related literature are discussed, and from them we extract challenges and issues that hinder open government initiatives from reaching their full potential. In a bid to overcome these challenges, we also extract guidelines for publishing data and provide an integrated overview. This will enable stakeholders to start with a firm foot in a new open government data initiative. We also identify the impacts on the stakeholders involved in such initiatives….(More)”

On the Farm: Startups Put Data in Farmers’ Hands


Jacob Bunge at the Wall Street Journal: “Farmers and entrepreneurs are starting to compete with agribusiness giants over the newest commodity being harvested on U.S. farms—one measured in bytes, not bushels.

Startups including Farmobile LLC, Granular Inc. and Grower Information Services Cooperative are developing computer systems that will enable farmers to capture data streaming from their tractors and combines, store it in digital silos and market it to agriculture companies or futures traders. Such platforms could allow farmers to reap larger profits from a technology revolution sweeping the U.S. Farm Belt and give them more control over the information generated on their fields.

The efforts in some cases would challenge a wave of data-analysis tools from big agricultural companies such as Monsanto Co., DuPont Co., Deere & Co. and Cargill Inc. Those systems harness modern planters, combines and other machinery outfitted with sensors to track planting, spraying and harvesting, then crunch that data to provide farm-management guidance that these firms say can help farmers curb costs and grow larger crops. The companies say farmers own their data, and it won’t be sold to third parties.

Some farmers and entrepreneurs say crop producers can get the most from their data by compiling and analyzing it themselves—for instance, to determine the best time to apply fertilizer to their soil and how much. Then, farmers could profit further by selling data to seed, pesticide and equipment makers seeking a glimpse into how and when farmers use machinery and crop supplies.

The new ventures come as farmers weigh the potential benefits of sharing their data with large agricultural firms against privacy concerns and fears that agribusinesses could leverage farm-level information to charge higher rates for seeds, pesticides and other supplies.

“We need to get farmers involved in this because it’s their information,” said Dewey Hukill, board president of Grower Information Services Cooperative, or GISC, a farmer-owned cooperative that is building a platform to collect its members’ data. The cooperative has signed up about 1,500 members across 37 states….

Companies developing markets for farm data say it’s not their intention to displace big seed and machinery suppliers but to give farmers a platform that would enable them to manage their own information. Storing and selling their own data wouldn’t necessarily bar a farmer from sharing information with a seed company to get a planting recommendation, they say….(More)”

 

The World of Indicators: The Making of Governmental Knowledge through Quantification


New Book by Richard Rottenburg et al: “The twenty-first century has seen a further dramatic increase in the use of quantitative knowledge for governing social life after its explosion in the 1980s. Indicators and rankings play an increasing role in the way governmental and non-governmental organizations distribute attention, make decisions, and allocate scarce resources. Quantitative knowledge promises to be more objective and straightforward as well as more transparent and open for public debate than qualitative knowledge, thus producing more democratic decision-making. However, we know little about the social processes through which this knowledge is constituted, or about its effects. Understanding how such numeric knowledge is produced and used is increasingly important as proliferating technologies of quantification alter modes of knowing in subtle and often unrecognized ways. This book explores the implications of the global multiplication of indicators as a specific technology of numeric knowledge production used in governance. (More)”

Give me location data, and I shall move the world


Marta Poblet at the Conversation: “Behind the success of the new wave of location-based mobile apps taking hold around the world is digital mapping. Location data is core to popular ride-sharing services such as Uber and Lyft, but also to companies such as Amazon or Domino’s Pizza, which are testing drones for faster deliveries.

Last year, German delivery firm DHL launched its first “parcelcopter” to send medication to the island of Juist in the North Sea. In the humanitarian domain, drones are also being tested for disaster relief operations.

Better maps can help app-led companies gain a competitive edge, but it’s hard to produce them at a global scale. …

A flagship base map for the past ten years has been OpenStreetMap (OSM), also known as the “Wikipedia of mapping”. With more than two million registered users, OpenStreetMap aims to create a free map of the world. OSM volunteers have been particularly active in mapping disaster-affected areas such as Haiti, the Philippines or Nepal. A recent study reports how humanitarian response has been a driver of OSM’s evolution, “in part because open data and participatory ideals align with humanitarian work, but also because disasters are catalysts for organizational innovation”….

Intense competition for digital maps also flags the start of the self-driving car race. Google is already testing its prototypes outside Silicon Valley and Apple has been rumoured to be working on a secret car project code-named Titan.

Uber has partnered with Carnegie Mellon and Arizona Universities to work on vehicle safety and cheaper laser mapping systems. Tesla is also planning to make its electric cars self-driving.

Legal and ethical challenges are not to be underestimated either. Most countries impose strict limits on testing self-driving cars on public roads. Similar limitations apply to the use of civilian drones. And the ethics of fully autonomous cars is still in its infancy. Autonomous cars probably won’t be caught texting, but they will still be confronted with tough decisions when trying to avoid potential accidents. Current research engages engineers and philosophers to work on how to assist cars when making split-second decisions that can raise ethical dilemmas….(More)”

When Big Data Becomes Bad Data


Lauren Kirchner at ProPublica: “A recent ProPublica analysis of The Princeton Review’s prices for online SAT tutoring shows that customers in areas with a high density of Asian residents are often charged more. When presented with this finding, The Princeton Review called it an “incidental” result of its geographic pricing scheme. The case illustrates how even a seemingly neutral price model could potentially lead to inadvertent bias — bias that’s hard for consumers to detect and even harder to challenge or prove.

Over the past several decades, an important tool for assessing and addressing discrimination has been the “disparate impact” theory. Attorneys have used this idea to successfully challenge policies that have a discriminatory effect on certain groups of people, whether or not the entity that crafted the policy was motivated by an intent to discriminate. It’s been deployed in lawsuits involving employment decisions, housing and credit. Going forward, the question is whether the theory can be applied to bias that results from new technologies that use algorithms….(More)”
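The mechanism ProPublica describes can be illustrated with a toy computation (ZIP codes, demographic shares, and prices below are all invented): a pricing rule keyed only to geography can still yield systematically different average prices across demographic groups, which is exactly the pattern a disparate-impact analysis looks for.

```python
# Toy illustration of how a geographically "neutral" pricing rule can
# still produce a disparate impact. All figures below are invented.
quotes = [
    # (zip_code, share_of_asian_residents, quoted_price_usd)
    ("94016", 0.35, 8400),
    ("10002", 0.30, 8400),
    ("73301", 0.04, 6600),
    ("50309", 0.03, 6600),
]

HIGH_DENSITY = 0.20  # hypothetical cutoff for a "high-density" area

def mean_price(high_density):
    """Average quoted price over areas on one side of the cutoff."""
    prices = [price for _, share, price in quotes
              if (share >= HIGH_DENSITY) == high_density]
    return sum(prices) / len(prices)

high, low = mean_price(True), mean_price(False)
print(f"high-density areas: ${high:.0f}, other areas: ${low:.0f}, "
      f"ratio: {high / low:.2f}")
```

In this invented sample the high-density areas pay more on average even though no demographic variable appears anywhere in the pricing rule, only geography does.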

A data revolution is underway. Will NGOs miss the boat?


Opinion by Sophia Ayele at Oxfam: “The data revolution has arrived. ….The UN has even launched a Data Revolution Group (to ensure that the revolution penetrates into international development). The Group’s 2014 report suggests that harnessing the power of newly available data could ultimately lead to, “more empowered people, better policies, better decisions and greater participation and accountability, leading to better outcomes for people and the planet.”

But where do NGOs fit in?

Over the last two decades, NGOs have been collecting increasing amounts of research and evaluation data, largely driven by donor demands for more rigorous evaluations of programs. The quality and efficiency of data collection have also been enhanced by mobile data collection. However, a quick scan of UK development NGOs reveals that few, if any, are sharing the data that they collect. This means that NGOs are generating dozens (if not hundreds) of datasets every year that aren’t being fully exploited and analysed. Working on tight budgets, with limited capacity, it’s not surprising that NGOs often shy away from sharing data without a clear mandate.

But change is in the air. Several donors have begun requiring NGOs to publicise data, and others appear to be moving in that direction. Last year, USAID launched its Open Data Policy, which requires that grantees “submit any dataset created or collected with USAID funding…” Not only does USAID stipulate this requirement, it also hosts this data on its Development Data Library (DDL) and provides guidance on anonymisation to depositors. Similarly, the Gates Foundation’s 2015 Open Access Policy stipulates that “Data underlying published research results will be accessible and open immediately.” However, they are allowing a two-year transition period…. Here at Oxfam, we have been exploring ways to begin sharing research and evaluation data. We aren’t being required to do this – yet – but we realise that the data that we collect is a public good with the potential to improve lives through more effective development programmes and to raise the voices of those with whom we work. Moreover, organizations like Oxfam can play a crucial role in highlighting issues facing women and other marginalized communities that aren’t always captured in national statistics. Sharing data is also good practice and would increase our transparency and accountability as an organization.

However, Oxfam also bears a huge responsibility to protect the rights of the communities that we work with. This involves ensuring informed consent when gathering data, so that communities are fully aware that their data may be shared, and de-identifying data to a level where individuals and households cannot be easily identified.

As Oxfam has outlined in our recently adopted Responsible Data Policy, “Using data responsibly is not just an issue of technical security and encryption but also of safeguarding the rights of people to be counted and heard, ensuring their dignity, respect and privacy, enabling them to make an informed decision and protecting their right to not be put at risk… (More)”
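The de-identification step described above can be sketched in a few lines (the record fields and values here are hypothetical, not from any Oxfam dataset): direct identifiers are dropped outright, and quasi-identifiers such as age and village are generalized so that individual households cannot easily be singled out.

```python
def deidentify(record):
    """Drop direct identifiers and generalize quasi-identifiers (a sketch)."""
    decade = (record["age"] // 10) * 10
    return {
        # name and phone are direct identifiers: dropped entirely
        "district": record["district"],        # keep district, drop village
        "age_band": f"{decade}-{decade + 9}",  # e.g. 34 -> "30-39"
        "household_size": record["household_size"],
    }

# Hypothetical survey record before sharing
raw = {"name": "A. Farmer", "phone": "0800-000-000", "village": "Kigarama",
       "district": "Kirehe", "age": 34, "household_size": 5}
safe = deidentify(raw)
print(safe)  # {'district': 'Kirehe', 'age_band': '30-39', 'household_size': 5}
```

Real de-identification goes well beyond this sketch (re-identification risk depends on how rare each combination of remaining attributes is), but the shape of the transformation is the same: remove, then coarsen.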