Surveying the citizen science landscape


Paper by Andrea Wiggins and Kevin Crowston in First Monday: “Citizen science has seen enormous growth in recent years, in part due to the influence of the Internet, and a corresponding growth in interest. However, the few stand-out examples that have received attention from media and researchers are not representative of the diversity of the field as a whole, and therefore may not be the best models for those seeking to study or start a citizen science project. In this work, we present the results of a survey of citizen science project leaders, identifying sub-groups of project types according to a variety of features related to project design and management, including funding sources, goals, participant activities, data quality processes, and social interaction. These combined features highlight the diversity of citizen science, providing an overview of the breadth of the phenomenon and laying a foundation for comparisons among citizen science projects and with other online communities….(More).”

U.S. Public Participation Playbook


“The U.S. Public Participation Playbook is a resource for government managers to effectively evaluate and build better services through public participation using best practices and performance metrics.
Public participation—where citizens help shape and implement government programs—is a foundation of open, transparent, and engaging government services. From emergency management, town hall discussions and regulatory development to science and education, better engagement with those who use public services can measurably improve those services for everyone.
Developing a U.S. Public Participation Playbook is an open government priority included in both the first and second U.S. Open Government National Action Plans as part of the United States’ effort to increase public integrity in government programs. This resource reflects the commitment of the government and civic partners to measurably improve participation programs, and is designed using the same inclusive principles that it champions.
How is the playbook structured?

We needed to create a resource that combines best practices and suggested performance metrics that public servants can use to evaluate and build better services. To meet this need, based on discussions with federal managers and stakeholders, we identified five main categories that should be addressed in all programs, whether digital or offline. Across these categories we identified 12 unifying plays to start with, each including a checklist to consider, resources, and training. We then provide suggested performance metrics for each main category.
This is only the beginning, however, and we hope the plays will quickly expand and enrich. The U.S. Public Participation Playbook was not just designed for a more open government — it was designed collaboratively through a more open government…(More)”

Cultures of Code


Brian Hayes in the American Scientist: “Kim studies parallel algorithms, designed for computers with thousands of processors. Chris builds computer simulations of fluids in motion, such as ocean currents. Dana creates software for visualizing geographic data. These three people have much in common. Computing is an essential part of their professional lives; they all spend time writing, testing, and debugging computer programs. They probably rely on many of the same tools, such as software for editing program text. If you were to look over their shoulders as they worked on their code, you might not be able to tell who was who.
Despite the similarities, however, Kim, Chris, and Dana were trained in different disciplines, and they belong to different intellectual traditions and communities. Kim, the parallel algorithms specialist, is a professor in a university department of computer science. Chris, the fluids modeler, also lives in the academic world, but she is a physicist by training; sometimes she describes herself as a computational scientist (which is not the same thing as a computer scientist). Dana has been programming since junior high school but didn’t study computing in college; at the startup company where he works, his title is software developer.
These factional divisions run deeper than mere specializations. Kim, Chris, and Dana belong to different professional societies, go to different conferences, read different publications; their paths seldom cross. They represent different cultures. The resulting Balkanization of computing seems unwise and unhealthy, a recipe for reinventing wheels and making the same mistake three times over. Calls for unification go back at least 45 years, but the estrangement continues. As a student and admirer of all three fields, I find the standoff deeply frustrating.
Certain areas of computation are going through a period of extraordinary vigor and innovation. Machine learning, data analysis, and programming for the web have all made huge strides. Problems that stumped earlier generations, such as image recognition, finally seem to be yielding to new efforts. The successes have drawn more young people into the field; suddenly, everyone is “learning to code.” I am cheered by (and I cheer for) all these events, but I also want to whisper a question: Will the wave of excitement ever reach other corners of the computing universe?…
What’s the difference between computer science, computational science, and software development?…(More)”

The Royal Statistical Society Data Manifesto


ePSiplatform: “A Data Manifesto released by the Royal Statistical Society describes ten recommendations that focus on how the next UK government can improve data for policymaking, democracy and prosperity….the Society calls for official statistics to be at the heart of policy debate and recommends that the Office for National Statistics and the wider Government Statistical Service be given adequate resources, as well as calling for greater investment in research, science and innovation.
The document shows that the Society is broadly supportive of the open data agenda; in particular the opening up of government data and giving citizens greater access to quality local data.
It calls for greater data sharing between government departments for statistics and research purposes and believes the private sector should be encouraged to share data with researchers for the same purpose. It also calls for an end to pre-release access to official statistics….Download the Data Manifesto in PDF format.”

Digital Enlightenment Yearbook 2014


Book edited by O’Hara, K., Nguyen, M-H.C., and Haynes, P.: “Tracking the evolution of digital technology is no easy task; changes happen so fast that keeping pace presents quite a challenge. This is, nevertheless, the aim of the Digital Enlightenment Yearbook.
This book is the third in the series, which began in 2012 under the auspices of the Digital Enlightenment Forum. This year the focus is on the relationship of individuals with their networks, exploring “Social networks and social machines, surveillance and empowerment”. In what is now the well-established tradition of the yearbook, different stakeholders in society and various disciplinary communities (technology, law, philosophy, sociology, economics, policymaking) bring their very different opinions and perspectives to bear on this topic.
The book is divided into four parts: the individual as data manager; the individual, society and the market; big data and open data; and new approaches. These are bookended by a Prologue and an Epilogue, which provide illuminating perspectives on the discussions in between. The division of the book is not definitive; it suggests one narrative, but others are clearly possible.
The 2014 Digital Enlightenment Yearbook gathers together the science, social science, law and politics of the digital environment in order to help us reformulate and address the timely and pressing questions which this new environment raises. We are all of us affected by digital technology, and the subjects covered here are consequently of importance to us all. (Contents)”

Access to Scientific Data in the 21st Century: Rationale and Illustrative Usage Rights Review


Paper by James Campbell in Data Science Journal: “Making scientific data openly accessible and available for re-use is desirable to encourage validation of research results and/or economic development. Understanding what users may, or may not, do with data in online data repositories is key to maximizing the benefits of scientific data re-use. Many online repositories that allow access to scientific data indicate that data is “open,” yet specific usage conditions reviewed on 40 “open” sites suggest that there is no agreed upon understanding of what “open” means with respect to data. This inconsistency can be an impediment to data re-use by researchers and the public. (More)”

Open Government: Origin, Development, and Conceptual Perspectives


Paper by Bernd W. Wirtz & Steven Birkmeyer in the International Journal of Public Administration: “The term “open government” is frequently used in practice and science. Since President Obama’s Memorandum for the Heads of Executive Departments and Agencies in March 2009, open government has attracted an enormous amount of public attention. It is applied by authors from diverse areas, leading to a very heterogeneous comprehension of the concept. Against this background, this article screens the current open government literature to deduce an integrative definition of open government. Furthermore, this article analyzes the empirical and conceptual literature of open government to deduce an open government framework. In general, this article provides a clear understanding of the open government concept. (More)”

With a Few Bits of Data, Researchers Identify ‘Anonymous’ People


In the New York Times: “Even when real names and other personal information are stripped from big data sets, it is often possible to use just a few pieces of the information to identify a specific person, according to a study to be published Friday in the journal Science.

In the study, titled “Unique in the Shopping Mall: On the Reidentifiability of Credit Card Metadata,” a group of data scientists analyzed credit card transactions made by 1.1 million people in 10,000 stores over a three-month period. The data set contained details including the date of each transaction, amount charged and name of the store.

Although the information had been “anonymized” by removing personal details like names and account numbers, the uniqueness of people’s behavior made it easy to single them out.

In fact, knowing just four random pieces of information was enough to reidentify 90 percent of the shoppers as unique individuals and to uncover their records, researchers calculated. And that uniqueness of behavior — or “unicity,” as the researchers termed it — combined with publicly available information, like Instagram or Twitter posts, could make it possible to reidentify people’s records by name.
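The “unicity” idea can be illustrated with a small, entirely hypothetical sketch on synthetic traces (this is not the paper’s actual credit-card data or method): treat each shopper as a set of (store, day) points and ask how often a handful of known points matches exactly one person.

```python
import random

# Hypothetical sketch: synthetic shoppers, each a set of (store, day)
# points, standing in for the study's anonymized transaction traces.
random.seed(0)

N_PEOPLE, N_STORES, N_DAYS, TRACE_LEN = 1000, 50, 90, 20
traces = {
    person: {(random.randrange(N_STORES), random.randrange(N_DAYS))
             for _ in range(TRACE_LEN)}
    for person in range(N_PEOPLE)
}

def unicity(traces, k, trials=500):
    """Fraction of sampled people whose k known points match
    exactly one trace in the data set."""
    people = list(traces)
    unique = 0
    for _ in range(trials):
        target = random.choice(people)
        known = set(random.sample(sorted(traces[target]), k))
        matches = [p for p, t in traces.items() if known <= t]
        if matches == [target]:
            unique += 1
    return unique / trials

print(f"unicity with 4 known points: {unicity(traces, k=4):.2f}")
```

Because the space of possible (store, day) points is large and each trace is sparse, even four points are almost always unique to one person, which is the intuition behind the 90 percent figure reported in the study.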

“The message is that we ought to rethink and reformulate the way we think about data protection,” said Yves-Alexandre de Montjoye, a graduate student in computational privacy at the M.I.T. Media Lab who was the lead author of the study. “The old model of anonymity doesn’t seem to be the right model when we are talking about large-scale metadata.”

The analysis of large data sets containing details on people’s behavior holds great potential to improve public health, city planning and education.

But the study calls into question the standard methods many companies, hospitals and government agencies currently use to anonymize their records. It may also give ammunition to some technologists and privacy advocates who have challenged the consumer-tracking processes used by advertising software and analytics companies to tailor ads to so-called anonymous users online….(More).”

How Network Science Is Changing Our Understanding of Law


Emerging Technology From the arXiv: “One of the more fascinating areas of science that has emerged in recent years is the study of networks and their application to everyday life. It turns out that many important properties of our world are governed by networks with very specific properties.
These networks are not random by any means. Instead, they are often connected in the now famous small world pattern in which any part of the network can be reached in a relatively small number of steps. These kinds of networks lie behind many natural phenomena such as earthquakes, epidemics and forest fires and are equally ubiquitous in social phenomena such as the spread of fashions, languages, and even wars.
So it should come as no surprise that the same kind of network should exist in the legal world. Today, Marios Koniaris and pals at the National Technical University of Athens in Greece show that the network of links between laws follows exactly the same pattern. They say their network approach provides a unique insight into the nature of the law, the way it has emerged and how changes may influence it in the future.
The work of Koniaris and co focuses entirely on the law associated with the European Union. They begin by pointing out that this legal network is different from many other types of networks in two important ways….
The network can also be used for visualizing the nature of the legal world. It reveals clusters and related connections and can help legislators determine the effect of proposed changes….It also shows how network science is spreading to every corner of scientific and social research.
Ref: arxiv.org/abs/1501.05237 : Network Analysis In The Legal Domain: A Complex Model For European Union Legal Sources”
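The “small world” pattern the article describes can be sketched with a toy Watts-Strogatz-style model (an illustration with assumed parameters, not the paper’s EU legal citation data): start from a ring lattice and rewire a few links at random, and the average shortest path between nodes collapses.

```python
import random
from collections import deque

random.seed(1)

def ring_lattice(n, k):
    """Undirected graph: each node linked to its k nearest
    neighbours on either side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, p):
    """Watts-Strogatz-style rewiring: move each edge with probability p."""
    n = len(adj)
    for i in range(n):
        for j in list(adj[i]):
            if j > i and random.random() < p:
                new = random.randrange(n)
                if new != i and new not in adj[i]:
                    adj[i].discard(j); adj[j].discard(i)
                    adj[i].add(new); adj[new].add(i)
    return adj

def avg_path_length(adj):
    """Mean shortest-path length over all reachable pairs (BFS)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(200, 4)
print(f"pure lattice: {avg_path_length(lattice):.1f}")
print(f"rewired:      {avg_path_length(rewire(lattice, 0.1)):.1f}")
```

A handful of random shortcuts is enough to make any node reachable from any other in a few steps, which is the property the legal citation network is claimed to share with earthquakes, epidemics, and social fads.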

The new scientific revolution: Reproducibility at last


In the Washington Post: “…Reproducibility is a core scientific principle. A result that can’t be reproduced is not necessarily erroneous: Perhaps there were simply variables in the experiment that no one detected or accounted for. Still, science sets high standards for itself, and if experimental results can’t be reproduced, it’s hard to know what to make of them.
“The whole point of science, the way we know something, is not that I trust Isaac Newton because I think he was a great guy. The whole point is that I can do it myself,” said Brian Nosek, the founder of a start-up in Charlottesville, Va., called the Center for Open Science. “Show me the data, show me the process, show me the method, and then if I want to, I can reproduce it.”
The reproducibility issue is closely associated with a Greek researcher, John Ioannidis, who published a paper in 2005 with the startling title “Why Most Published Research Findings Are False.”
Ioannidis, now at Stanford, has started a program to help researchers improve the reliability of their experiments. He said the surge of interest in reproducibility was in part a reflection of the explosive growth of science around the world. The Internet is a factor, too: It’s easier for researchers to see what everyone else is doing….
Errors can potentially emerge from a practice called “data dredging”: When an initial hypothesis doesn’t pan out, the researcher will scan the data for something that looks like a story. The researcher will see a bump in the data and think it’s significant, but the next researcher to come along won’t see it — because the bump was a statistical fluke….
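The data-dredging hazard is easy to demonstrate with a short simulation (a hedged illustration, not a reconstruction of any particular study): test many hypotheses on pure noise and roughly five percent will clear the conventional p < 0.05 bar by chance alone.

```python
import random
import statistics

# Scan pure noise for many possible "effects"; some will look
# significant by chance, and a replication attempt won't see them.
random.seed(2)

def looks_significant(n=50):
    """Compare two samples drawn from the SAME distribution; report
    True when the difference in means exceeds ~1.96 standard errors."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return abs(z) > 1.96

hypotheses = 100
false_hits = sum(looks_significant() for _ in range(hypotheses))
print(f"{false_hits} of {hypotheses} null effects looked significant")
```

This is exactly the bump-in-the-noise problem described above, and it is why the preregistration initiative mentioned below, declaring the questions before seeing the data, removes the temptation to dredge.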
So far about 7,000 people are using that service, and the center has received commitments for $14 million in grants, with partners that include the National Science Foundation and the National Institutes of Health, Nosek said.
Another COS initiative will help researchers register their experiments in advance, telling the world exactly what they plan to do, what questions they will ask. This would avoid the data-dredging maneuver in which researchers who are disappointed go on a deep dive for something publishable.
Nosek and other reformers talk about “publication bias.” Positive results get reported, negative results ignored. Someone reading a journal article may never know about all the similar experiments that came to naught….(More).”