The Open Data/Environmental Justice Connection


Jeffrey Warren for Wilson’s Commons Lab: “… Open data initiatives seem to assume that all data is born in the hallowed halls of government, industry and academia, and that open data is primarily about convincing such institutions to share it with the public.
It is laudable when institutions with important datasets — such as campaign finance, pollution or scientific data — see the benefit of opening it to the public. But why do we assume unilateral control over data production?
The revolution in user-generated content shows the public has a great deal to contribute – and to gain – from the open data movement. Likewise, citizen science projects that solicit submissions or “task completion” from the public rarely invite higher-level participation in research – let alone true collaboration.
This has to change. Data isn’t just something you’re given if you ask nicely, or a kind of community service we perform to support experts. Increasingly, new technologies make it possible for local groups to generate and control data themselves — especially in environmental health. Communities on the front line of pollution’s effects have the best opportunities to monitor it and the most to gain by taking an active role in the research process.
DIY Data
Luckily, an emerging alliance between the maker/Do-It-Yourself (DIY) movement and watchdog groups is starting to challenge the conventional model.
The Smart Citizen project, the Air Quality Egg and a variety of projects in the Public Lab network are recasting members of the general public as actors in the framing of new research questions and designers of a new generation of data tools.
The Riffle, a <$100 water quality sensor built inside of hardware-store pipe, can be left in a creek near an industrial site to collect data around the clock for weeks or months. In the near future, when pollution happens – like the ash spill in North Carolina or the chemical spill in West Virginia – the public will be alerted and able to track its effects without depending on expensive equipment or distant labs.
This emerging movement is recasting environmental issues not as intractably large problems, but as up-close-and-personal health issues — just what environmental justice (EJ) groups have been arguing for years. The difference is that these new initiatives bring together EJ community organizers and the technology hackers of the open hardware movement. Just as the Homebrew Computer Club’s tinkering with early prototypes led to the personal computer, a new generation of tinkerers sees that their affordable, accessible techniques can make an immediate difference in investigating lead in their backyard soil, nitrates in their tap water and particulate pollution in the air they breathe.
These practitioners see that environmental data collection is not a distant problem in a developing country, but an issue that anyone in a major metropolitan area, or an area affected by oil and gas extraction, faces on a daily basis. Though underserved communities are often disproportionately affected, these threats often transcend socioeconomic boundaries…”

The Next Frontier in Crowdsourcing: Your Smartphone


Rachel Metz in MIT Technology Review: “Rather than swiping the screen or entering a passcode to unlock the smartphone in my hand, I have to tell it how energetic the people around me are feeling by tapping one of four icons. I’m the only one here, and the one that best fits my actual energy level, to be honest, is a figure lying down and emitting a trail of z’s.
I’m trying out an Android app called Twitch. Created by Stanford researchers, it asks you to complete a few simple tasks—contributing information, as with the reported energy levels, ranking images, or structuring data extracted from Wikipedia pages—each time you unlock your phone. The information collected by apps like Twitch could be useful to academics, market researchers, or local businesses. Such software could also provide a low-cost way to perform useful work that can easily be broken up into pieces and fed to millions of devices.
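(The Twitch backend itself isn’t described in the article; as a rough, hedged sketch of the “break work into pieces and feed it to many devices” idea, the Python snippet below splits a labeling job into lock-screen-sized micro-tasks, hands each one to several phones for redundancy, and aggregates the answers by majority vote. The record format and names are illustrative, not Twitch’s.)

    from collections import Counter, defaultdict

    # A "micro-task" is a prompt small enough to answer in the moment it takes
    # to unlock a phone (e.g. "how energetic does this crowd look?").
    def make_tasks(image_urls, copies=3):
        """Assign each image to `copies` different devices for redundancy."""
        return [{"task_id": f"img-{i}", "image": url}
                for i, url in enumerate(image_urls)
                for _ in range(copies)]

    def aggregate(responses):
        """responses: (task_id, answer) pairs sent back by phones.
        A majority vote per task smooths over careless or conflicting answers."""
        by_task = defaultdict(list)
        for task_id, answer in responses:
            by_task[task_id].append(answer)
        return {tid: Counter(answers).most_common(1)[0][0]
                for tid, answers in by_task.items()}

    if __name__ == "__main__":
        tasks = make_tasks(["http://example.org/a.jpg", "http://example.org/b.jpg"])
        fake_responses = [("img-0", "busy"), ("img-0", "busy"), ("img-0", "calm"),
                          ("img-1", "calm"), ("img-1", "calm")]
        print(aggregate(fake_responses))  # -> {'img-0': 'busy', 'img-1': 'calm'}

Redundant assignment plus majority voting is the standard way crowdsourcing systems cope with careless or conflicting contributions.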

Twitch is one of several projects exploring crowdsourcing via the lock screen. Plenty of people already contribute freely to crowdsourcing websites like Wikipedia and Quora or paid services like Amazon’s Mechanical Turk, and the sustained popularity of traffic app Waze shows that people are willing to contribute to a common cause from their handsets if it provides a timely, helpful result.
There are certainly enough smartphones with lock screens ready to be harnessed. According to data from market researcher comScore, 160 million people in the U.S.—or 67 percent of cell phone users—have smartphones, and nearly 52 percent of these run Google’s Android OS, which allows apps like Twitch to replace the standard lock screen….”

The Challenges of Challenge.Gov: Adopting Private Sector Business Innovations in the Federal Government


I Mergel, SI Bretschneider, C Louis, J Smith at the HICSS ’14 Proceedings of the 2014 47th Hawaii International Conference on System Sciences: “As part of the Open Government Initiative in the U.S. federal government, the White House has introduced a new policy instrument called “Challenges and Prizes”, implemented as Challenge.gov, that allows federal departments to run Open Innovation (OI) contests. This initiative was motivated by similar OI initiatives in the private sector and by the desire to enhance innovativeness and performance among federal agencies. Here we first define the underlying theoretical concepts of OI, crowdsourcing and contests and apply them to the existing theory of publicness and the creation of public goods. We then analyze over 200 crowdsourcing contests on CHALLENGE.GOV and conclude that federal departments and agencies use this policy instrument for four different purposes: awareness, service, knowledge and technical solutions. We conclude that Challenge.gov is currently used as an innovative format to inform and educate the public about public management problems and less frequently to solicit complex technological solutions from problem solvers.”

Putting Crowdsourcing on the Map


MIT Technology Review: “Even in San Francisco, where Google’s roving Street View cars have mapped nearly every paved surface, there are still places that have remained untouched, such as the flights of stairs that serve as pathways between streets in some of the city’s hilliest neighborhoods.
It’s these places that a startup called Mapillary is focusing on. Cofounders Jan Erik Solem and Johan Gyllenspetz are attempting to build an open, crowdsourced, photographic map that lets smartphone users log all sorts of places, creating a richer view of the world than what is offered by Street View and other street-level mapping services. If contributors provide images often, that view could be more representative of how things look right now.
Google itself is no stranger to the benefits of crowdsourced map content: it paid $966 million last year for traffic and navigation app Waze, whose users contribute data. Google also lets people augment Street View content with their own images. But Solem and Gyllenspetz think there’s still plenty of room for Mapillary, which they say can be used for everything from tracking a nature hike to offering more up-to-date images to house hunters and Airbnb users.
Solem and Gyllenspetz have only been working on the project for four months; they released an iPhone app in November, and an Android app in January. So far, there are just a few hundred users who have shared about 100,000 photos on the service. While it’s free for anyone to use, the startup plans to eventually make money by licensing the data its users generate to companies.
With the app, a user can choose to collect images by walking, biking, or driving. Once you press a virtual shutter button within the app, it takes a photo every two seconds, until you press the button again. You can then upload the images to Mapillary’s service via Wi-Fi, where each photo’s location is noted through its GPS tag. Computer-vision software compares each photo with others that are within a radius of about 100 meters, searching for matching image features so it can find the geometric relationship between the photos. It then places those images properly on the map, and stitches them all together. When new images come in of an area that has already been mapped, Mapillary will add them to its database, too.
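(Mapillary’s actual pipeline isn’t published beyond the description above; the following is a hedged Python sketch of the two steps it outlines: restrict comparisons to photos taken within roughly 100 metres of each other, then look for matching image features between nearby pairs. OpenCV’s ORB detector stands in for whatever feature matcher the company actually uses, and the "lat"/"lon" field names are illustrative.)

    import math
    import cv2  # pip install opencv-python

    def haversine_m(lat1, lon1, lat2, lon2):
        """Great-circle distance in metres between two GPS fixes."""
        r = 6371000.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def candidate_neighbours(photo, photos, radius_m=100):
        """Only photos within ~100 m (per the article) are worth comparing."""
        return [p for p in photos
                if p is not photo and haversine_m(photo["lat"], photo["lon"],
                                                  p["lat"], p["lon"]) <= radius_m]

    def match_features(path_a, path_b, max_matches=50):
        """Find corresponding image features between two nearby photos."""
        orb = cv2.ORB_create()
        img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
        img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
        kp_a, des_a = orb.detectAndCompute(img_a, None)
        kp_b, des_b = orb.detectAndCompute(img_b, None)
        if des_a is None or des_b is None:
            return []
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        return sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)[:max_matches]

From matched features like these, the geometric relationship between the two camera positions can be estimated (for instance with cv2.findEssentialMat), which is what lets a service place and stitch the images on a map.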
It can take less than 30 seconds for the images to show up on the Web-based map, but several minutes for the images to be fully processed. As with Google’s Street View photos, image-recognition software blurs out faces and license plate numbers.
Users can edit Mapillary’s map by moving around the icons that correspond to images—to fix a misplaced image, for instance. Eventually, users will also be able to add comments and tags.
So far, Mapillary’s map is quite sparse. But the few hundred users trying out Mapillary include some map providers in Europe, and the 100,000 or so images contributed to the service range from a bike path on Venice Beach in California to a snow-covered ski slope in Sweden.
Street-level images can be viewed on the Web or through Mapillary’s smartphone apps (though the apps just pull up the Web page within the app). Blue lines and colored tags indicate where users have added photos to the map; you can zoom in to see them at the street level.

Navigating through photos is still quite rudimentary; you can tap or click to move from one image to the next with onscreen arrows, depending on the direction you want to explore.
Beyond technical and design challenges, the biggest issue Mapillary faces is convincing a large enough number of users to build up its store of images so that others will start using it and contributing as well, and then ensuring that these users keep coming back.”

Coordinating the Commons: Diversity & Dynamics in Open Collaborations


Dissertation by Jonathan T. Morgan: “The success of Wikipedia demonstrates that open collaboration can be an effective model for organizing geographically-distributed volunteers to perform complex, sustained work at a massive scale. However, Wikipedia’s history also demonstrates some of the challenges that large, long-term open collaborations face: the core community of Wikipedia editors — the volunteers who contribute most of the encyclopedia’s content and ensure that articles are correct and consistent — has been gradually shrinking since 2007, in part because Wikipedia’s social climate has become increasingly inhospitable for newcomers, female editors, and editors from other underrepresented demographics. Previous studies of change over time within other work contexts, such as corporations, suggest that incremental processes such as bureaucratic formalization can make organizations more rule-bound and less adaptable — in effect, less open — as they grow and age. There has been little research on how open collaborations like Wikipedia change over time, and on the impact of those changes on the social dynamics of the collaborating community and the way community members prioritize and perform work. Learning from Wikipedia’s successes and failures can help researchers and designers understand how to support open collaborations in other domains — such as Free/Libre Open Source Software, Citizen Science, and Citizen Journalism.

In this dissertation, I examine the role of openness, and the potential antecedents and consequences of formalization, within Wikipedia through an analysis of three distinct but interrelated social structures: community-created rules within the Wikipedia policy environment, coordination work and group dynamics within self-organized open teams called WikiProjects, and the socialization mechanisms that Wikipedia editors use to teach new community members how to participate. To inquire further, I have designed a new editor peer support space, the Wikipedia Teahouse, based on the findings from my empirical studies. The Teahouse is a volunteer-driven project that provides a welcoming and engaging environment in which new editors can learn how to be productive members of the Wikipedia community, with the goal of increasing the number and diversity of newcomers who go on to make substantial contributions to Wikipedia …”

True Collective Intelligence? A Sketch of a Possible New Field


Paper by Geoff Mulgan in Philosophy & Technology: “Collective intelligence is much talked about but remains very underdeveloped as a field. There are small pockets in computer science and psychology and fragments in other fields, ranging from economics to biology. New networks and social media also provide a rich source of emerging evidence. However, there are surprisingly few usable theories, and many of the fashionable claims have not stood up to scrutiny. The field of analysis should be how intelligence is organised at large scale—in organisations, cities, nations and networks. The paper sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to the possible intellectual barriers to progress.”

The Problem With Serious Games–Solved


Emerging Technology From the arXiv: “Serious games are becoming increasingly popular but the inability to generate realistic new content has hampered their progress. Until now.

Here’s an imaginary scenario: you’re a law enforcement officer confronted with John, a 21-year-old male suspect who is accused of breaking into a private house on Sunday evening and stealing a laptop, jewellery and some cash. Your job is to find out whether John has an alibi and if so whether it is coherent and believable.
That’s exactly the kind of scenario that police officers the world over face on a regular basis. But how do you train for such a situation? How do you learn the skills necessary to gather the right kind of information?
An increasingly common way of doing this is with serious games, those designed primarily for purposes other than entertainment. In the last 10 years or so, medical, military and commercial organisations all over the world began to experiment with game-based scenarios that are designed to teach people how to perform their jobs and tasks in realistic situations.
But there is a problem with serious games that require realistic interaction with another person. It’s relatively straightforward to design one or two scenarios that are coherent, lifelike and believable, but it’s much harder to generate them on an ongoing basis.
Imagine, in the example above, that John is a computer-generated character. What kind of activities could he describe that would serve as a believable, coherent alibi for Sunday evening? And how could he do it a thousand times, each time describing a different realistic alibi? Therein lies the problem.
Today, Sigal Sina at Bar-Ilan University in Israel, and a couple of pals, say they’ve solved this problem. These guys have come up with a novel way of generating ordinary, realistic scenarios that can be cut and pasted into a serious game to serve exactly this purpose. The secret sauce in their new approach is to crowdsource the new scenarios from real people using Amazon’s Mechanical Turk service.
The approach is straightforward. Sina and co simply ask Turkers to answer a set of questions asking what they did during each one-hour period throughout various days, offering bonuses to those who provide the most varied detail.
They then analyse the answers, categorising activities by factors such as the times they are performed, the age and sex of the person doing it, the number of people involved and so on.
This then allows a computer game to cut and paste activities into the action at appropriate times. So, for example, the computer can select an appropriate alibi for John on a Sunday evening by choosing an activity described by a male Turker for the same time, while avoiding activities that a woman might describe for a Friday morning, which might otherwise seem unbelievable. The computer also changes certain details in the narrative, such as names, locations and so on, to make the narrative coherent with John’s profile….
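(The paper’s own selection logic isn’t reproduced in the article, so the Python sketch below is only a hedged illustration of the filtering it describes: crowdsourced activities are filtered by day, hour and reporter profile, one is chosen, and character-specific details are substituted in. The record format and field names are invented for the example.)

    import random

    # Illustrative record format; the article only loosely describes the real
    # dataset's fields (time, age, sex, number of people involved, ...).
    ACTIVITIES = [
        {"day": "Sunday", "hour": 20, "sex": "male", "age": 23,
         "text": "I watched the game at {friend}'s place in {place}."},
        {"day": "Sunday", "hour": 20, "sex": "male", "age": 25,
         "text": "I worked a closing shift at the {place} diner."},
        {"day": "Friday", "hour": 9, "sex": "female", "age": 30,
         "text": "I took a yoga class near {place}."},
    ]

    def pick_alibi(profile, day, hour, pool=ACTIVITIES, max_age_gap=5):
        """Choose a crowdsourced activity matching the character's profile and
        the time slot to be covered, then fill in character-specific details."""
        candidates = [a for a in pool
                      if a["day"] == day and a["hour"] == hour
                      and a["sex"] == profile["sex"]
                      and abs(a["age"] - profile["age"]) <= max_age_gap]
        if not candidates:
            return None
        chosen = random.choice(candidates)
        return chosen["text"].format(friend=profile["friend"], place=profile["place"])

    john = {"sex": "male", "age": 21, "friend": "Mike", "place": "Riverside"}
    print(pick_alibi(john, "Sunday", 20))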
That solves a significant problem with serious games. Until now, developers have had to spend an awful lot of time producing realistic content, a process known as procedural content generation. That’s always been straightforward for things like textures, models and terrain in game settings. Now, thanks to this new crowdsourcing technique, it can be just as easy for human interactions in serious games too.
Ref: arxiv.org/abs/1402.5034: Using the Crowd to Generate Content for Scenario-Based Serious-Games”

Crowdsourcing voices to study Parkinson’s disease


TedMed: “Mathematician Max Little is launching a project that aims to literally give Parkinson’s disease (PD) patients a voice in their own diagnosis and help them monitor their disease progression.
Patients Voice Analysis (PVA) is an open science project that uses phone-based voice recordings and self-reported symptoms, along with software Little designed, to track disease progression. Little, a TEDMED 2013 speaker and TED Fellow, is partnering with the online community PatientsLikeMe, co-founded by TEDMED 2009 speaker James Heywood, and Sage Bionetworks, a non-profit research organization, to conduct the research.
The new project is an extension of Little’s Parkinson’s Voice Initiative, which used speech analysis algorithms to diagnose Parkinson’s from voice recordings with the help of 17,000 volunteers. This time, he seeks to not only detect markers of PD, but also to add information reported by patients using PatientsLikeMe’s Parkinson’s Disease Rating Scale (PDRS), a tool that documents patients’ answers to questions that measure treatment effectiveness and disease progression….
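(Little’s actual dysphonia measures aren’t spelled out in the post; as a rough, hedged illustration of what “speech analysis” on such recordings can look like, the Python sketch below tracks the fundamental frequency of a recording with librosa and computes a crude jitter-like statistic, i.e. how much the pitch period wobbles from cycle to cycle. The file name is hypothetical and the feature is a simplified stand-in, not the project’s algorithm.)

    import numpy as np
    import librosa  # pip install librosa

    def rough_voice_features(wav_path):
        """Crude stand-in for the kinds of vocal features PD voice research
        relies on: track the pitch, then measure how unsteady it is."""
        y, sr = librosa.load(wav_path, sr=None)
        f0, voiced_flag, voiced_prob = librosa.pyin(
            y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr)
        f0 = f0[~np.isnan(f0)]             # keep frames where a pitch was found
        periods = 1.0 / f0                 # pitch period per frame, in seconds
        jitter_like = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
        return {"mean_f0_hz": float(np.mean(f0)),
                "f0_std_hz": float(np.std(f0)),
                "jitter_like": float(jitter_like)}

    # Hypothetical sustained-vowel recording:
    # print(rough_voice_features("sustained_ahh.wav"))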
As openly shared information, the collected data has potential to help vast numbers of individuals by tapping into collective ingenuity. Little has long argued that for science to progress, researchers need to democratize research and move past jostling for credit. Sage Bionetworks has designed a platform called Synapse to allow data sharing with collaborative version control, an effort led by open data advocate John Wilbanks.
“If you can’t share your data, how can you reproduce your science? One of the big problems we’re facing with this kind of medical research is the data is not open and getting access to it is a nightmare,” Little says.
With the PVA project, “Basically anyone can log on, download the anonymized data and play around with data mining techniques. We don’t really care what people are able to come up with. We just want the most accurate prediction we can get.
“In research, you’re almost always constrained by what you think is the best way to do things. Unless you open it to the community at large, you’ll never know,” he says.”

The Intelligence of a City: Its Citizens


Michel Dumais: “Tick tock, we said. Soon the hundredth, and with the hundred-and-first, new challenges. A smart city, you say? I suspect the usual nod to the three-letter acronym and to an archaic administrative logic. What if we instead called on the intelligence of the people who know their city best, its citizens?

To solve a problem (and occasionally even a non-problem), administrations look to those mammoth software packages that, on paper, are supposed to do everything, that swallow hundreds of millions of dollars, and that end up making media headlines because even more money has to be poured into them. And that allow IT departments to tighten their grip on an administration even further.

In short, when people talk about smart cities, many see a jackpot. Yet what was “acceptable” yesterday no longer is today. And building a smart city is, above all, not a technological challenge, far from it.

THE WIRELESS QUESTION
Years ago, simple logic would have had the City stop thinking “big telcos” and quickly form an alliance with the community organization Île sans fil, thereby encouraging the rapid deployment of wireless technology across the island.

Such an alliance, a model of its kind, does exist.

But not in Montreal. Rather in Quebec City, where the City and the community organization Zap Québec work hand in hand for the greater benefit of Quebec City’s residents and tourists. And in Montreal? We talk, and we talk.

So, a smart city: it is a city that knows how to use technology to harness its infrastructure and put it at the service of its citizens, while saving money and promoting sustainable development.

It is also a city that knows how to listen to and mobilize its citizens, its activists and its entrepreneurs, while giving them tools (such as usable data) so that they too can create services for their organizations and for all of the city’s residents. Not to mention that all these tools make decision-making easier for borough mayors and the executive committee.

In short, a smart city according to Professor Rudolf Giffinger is this: “a smart economy, smart mobility, a smart environment, smart inhabitants, a smart way of life and, finally, a smart administration.”

I invite readers to watch LifeApps, an extraordinary TV series available on the Al Jazeera website. Its subject: activists and tinkerers, young and not so young, who get involved and create services for their community.”

Are bots taking over Wikipedia?


Kurzweil News: “As crowdsourced Wikipedia has grown too large — with more than 30 million articles in 287 languages — to be entirely edited and managed by volunteers, 12 Wikipedia bots have emerged to pick up the slack.

The bots use Wikidata — a free knowledge base that can be read and edited by both humans and bots — to exchange information between entries and between the 287 languages.

Which raises an interesting question: what portion of Wikipedia edits are generated by humans versus bots?

To find out (and keep track of other bot activity), Thomas Steiner of Google Germany has created an open-source application (and API): Wikipedia and Wikidata Realtime Edit Stats, described in an arXiv paper.
The percentages of bot vs. human edits shown in the application are constantly changing. A KurzweilAI snapshot on Feb. 20 at 5:19 AM EST showed an astonishing 42% of Wikipedia edits being made by bots. (The application lists the 12 bots.)


Anonymous vs. logged-in humans (credit: Thomas Steiner)
The percentages also vary by language. Only 5% of English edits were by bots; but for Serbian pages, in which few Wikipedians apparently participate, 96% of edits were by bots.

The application also tracks what percentage of edits are by anonymous users. Globally, it was 25 percent in our snapshot and a surprising 34 percent for English — raising interesting questions about corporate and other interests covertly manipulating Wikipedia information.”
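(Steiner’s tool and its API aren’t reproduced here; as a hedged, present-day sketch of how such a statistic can be computed, the Python snippet below samples Wikimedia’s public recent-changes stream at stream.wikimedia.org, whose events carry a bot flag and a wiki identifier, and tallies the share of bot vs. human edits. Sample sizes and the example output are illustrative.)

    import json
    from collections import Counter
    from sseclient import SSEClient  # pip install sseclient

    STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"

    def sample_edit_stats(n_events=500, wiki=None):
        """Tally bot vs. human edits over a small live sample of recent changes.
        Pass wiki="enwiki" (for example) to restrict to one language edition."""
        counts = Counter()
        for event in SSEClient(STREAM_URL):
            try:
                change = json.loads(event.data)
            except ValueError:
                continue                      # skip keep-alive / partial messages
            if change.get("type") not in ("edit", "new"):
                continue
            if wiki and change.get("wiki") != wiki:
                continue
            counts["bot" if change.get("bot") else "human"] += 1
            if sum(counts.values()) >= n_events:
                break
        total = sum(counts.values())
        return {kind: round(100.0 * n / total, 1) for kind, n in counts.items()}

    if __name__ == "__main__":
        print(sample_edit_stats(200))         # e.g. {'human': 78.5, 'bot': 21.5}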