Smithsonian turns to crowdsourcing for massive digitization project


PandoDaily: “There are 5 million plant specimens in the US Herbarium at the Natural History Museum’s Botany Department, one of the most extensive collections of plant life in the world. They all have labels. But only 1.3 million of those labels can be read by computers. That’s where you come in.

Jason Shen and Sarah Allen, a pair of Presidential Innovation Fellows working with the Smithsonian Institution to improve its open data initiatives, have gone all Mechanical Turk on the esteemed knowledge network.

In a pilot project that is serving as a test run for other large Smithsonian scientific collections – accounting for a total of about 126 million specimens – the innovation fellows are crowdsourcing the transcription of scanned images of the labels.

To get involved, you don’t need to commit to a certain number of hours, or make yourself available at specific times. You just log into the Smithsonian’s recently established transcription site, select a project to work on, and start transcribing. Different volunteers can work on the same project at different times. When you’ve done your bit, you submit it for review, at which point a different volunteer checks that you’ve done the transcription correctly.
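As a rough illustration of that transcribe-then-review flow, here is a minimal sketch in Python. The class, status names, and rules are hypothetical stand-ins, not the Transcription Center’s actual implementation; they just capture the handoff in which the reviewer must be a different volunteer from the transcriber.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"                # label image awaiting transcription
    PENDING_REVIEW = "pending"   # transcribed, awaiting a second volunteer
    COMPLETE = "complete"        # reviewed and accepted


@dataclass
class LabelPage:
    image_url: str
    status: Status = Status.OPEN
    transcription: str = ""
    transcriber: str = ""

    def transcribe(self, volunteer: str, text: str) -> None:
        """A volunteer submits a transcription, which then awaits review."""
        self.transcription = text
        self.transcriber = volunteer
        self.status = Status.PENDING_REVIEW

    def review(self, reviewer: str, accept: bool) -> None:
        """A different volunteer accepts the work or reopens it."""
        if reviewer == self.transcriber:
            raise ValueError("reviewer must differ from the transcriber")
        self.status = Status.COMPLETE if accept else Status.OPEN
```

A real review queue would add versioning, per-field validation, and so on; the point here is just the two-volunteer handoff.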

So, for instance, you might get to look at specimens collected by Martin W. Gorman on his 1902 expedition to Alaska’s Lake Iliamna Region, and read his thoughts on his curious findings. If you’re the type to get excited by a bit of vintage Potentilla fruticosa, then this is your Disneyland.

It’s the sort of crowdsourcing initiative that has been going on for years in other corners of the Internet, but the Smithsonian is only just getting going. It has long thought of itself as a passer-on of knowledge – its mission is “the increase and diffusion of knowledge” – with the public as inherent recipients rather than contributors, so the “let’s get everyone to help us with this gargantuan task” mentality has not been its default position. It does rely on a lot of volunteers to lead tours and maintain back rooms, and the like, but organizing knowledge is another thing…

Shen and Allen quietly launched the Smithsonian Transcription Center in August as part of a wider effort to digitize all of the Institution’s collections. The Herbarium effort is one of the most significant to date, but other projects have ranged from field notes of bird observations to letters written between 20th-century American artists. More than 1,400 volunteers have contributed to the projects to date, accounting for more than 18,000 transcriptions.”

Twitter and Society


New book by Weller, Katrin / Bruns, Axel / Burgess, Jean / Mahrt, Merja / Puschmann, Cornelius (eds.): “Since its launch in 2006, Twitter has evolved from a niche service to a mass phenomenon; it has become instrumental for everyday communication as well as for political debates, crisis communication, marketing, and cultural participation. But the basic idea behind it has stayed the same: users may post short messages (tweets) of up to 140 characters and follow the updates posted by other users. Drawing on the experience of leading international Twitter researchers from a variety of disciplines and contexts, this is the first book to document the various notions and concepts of Twitter communication, providing a detailed and comprehensive overview of current research into the uses of Twitter. It also presents methods for analyzing Twitter data and outlines their practical application in different research contexts.”
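To give a flavor of the kind of analysis methods the book covers, here is a minimal sketch that counts the most frequent hashtags in a collection of tweets. The filename is an invented assumption; the `entities.hashtags` structure is part of Twitter’s standard tweet JSON of that era.

```python
import json
from collections import Counter

# tweets.jsonl is a hypothetical file with one tweet JSON object per line,
# as returned by Twitter's APIs.
hashtag_counts = Counter()
with open("tweets.jsonl", encoding="utf-8") as f:
    for line in f:
        tweet = json.loads(line)
        # each hashtag entity carries the tag text without the leading '#'
        for tag in tweet.get("entities", {}).get("hashtags", []):
            hashtag_counts[tag["text"].lower()] += 1

print(hashtag_counts.most_common(10))  # the ten most frequent hashtags
```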

You Are Your Data


In Slate: “We are becoming data. Every day, our smartphones, browsers, cars, and even refrigerators generate information about our habits. When we click “I agree” on terms of service, we opt in to systems in which we are known only by our data. So we need to be able to understand ourselves as data, too.
To understand what that might mean for the average person in the future, we should look to the Quantified Self community, which is at the frontier of understanding what our role as individuals in a data-driven society might look like. Quantified Self began as a Meetup community sharing personal stories of self-tracking techniques, and is now a catchall adjective to describe the emerging set of apps and sensors available to consumers to facilitate self-tracking, such as the Fitbit or Nike Fuelband. Some of the self-tracking practices of this group come across as extreme (experimenting with the correlation between butter consumption and brain function). But what is a niche interest today could be widely marketed tomorrow—and accordingly, their frustrations may soon be yours…
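For a concrete sense of what such a self-tracking experiment boils down to, here is a minimal worked example, with invented numbers, computing the Pearson correlation between a daily butter-intake log and a brain-game score.

```python
# Illustrative only: made-up numbers standing in for a self-tracker's log of
# daily butter intake (grams) and a daily brain-game score, as in the
# Quantified Self experiment mentioned above.
butter_grams = [0, 10, 20, 30, 40, 50, 60]
brain_scores = [71, 70, 74, 73, 77, 76, 80]

n = len(butter_grams)
mean_x = sum(butter_grams) / n
mean_y = sum(brain_scores) / n

cov = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(butter_grams, brain_scores))
var_x = sum((x - mean_x) ** 2 for x in butter_grams)
var_y = sum((y - mean_y) ** 2 for y in brain_scores)

pearson_r = cov / (var_x ** 0.5 * var_y ** 0.5)
print(f"Pearson r = {pearson_r:.2f}")  # near +1 means a strong positive association
```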

Instead, I propose that we should have a “right to use” our personal data: I should be able to access and make use of data that refers to me. At best, a right to use would reconcile my personal interest in small-scale insights with the firms’ large-scale interest in big data insights from the larger population. These interests are not in conflict with each other.
Of course, to translate this concept into practice, we need to work out matters of both technology and policy.
What data are we asking for? Are we asking for data that individuals have opted into creating, like self-tracking fitness applications? Should we broaden that definition to describe any data that refers to our person, such as behavioral data collected by cookies and gathered by third-party data brokers? These definitions will be hard to pin down.
Also, what kind of data? Just that which we’ve actively opted in to creating, or does it expand to the more hidden, passive, transactional data? Will firms exercise control over the line between where “raw” data becomes processed and therefore proprietary? If we can’t begin to define the data representation of a “step” in an activity tracker, how will we standardize access to that information?
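To make the “step” problem concrete, here are two hypothetical vendor exports for the same day of walking. Neither schema is real; the mismatch in field names, units, and granularity is exactly what makes standardized access hard.

```python
# Two hypothetical vendor exports for the same day of walking. Neither schema
# is real; the point is that names, units, and granularity all differ, so
# there is no shared definition of a "step" to standardize access around.
vendor_a_day = {"date": "2013-11-12", "steps": 8412}   # one daily total

vendor_b_day = [                                       # five-minute buckets
    {"ts": 1384300800, "interval_min": 5, "step_count": 31},
    {"ts": 1384301100, "interval_min": 5, "step_count": 0},
]

def daily_steps_a(record):
    return record["steps"]

def daily_steps_b(buckets):
    # sub-daily buckets must be aggregated into a comparable daily total
    return sum(b["step_count"] for b in buckets)

print(daily_steps_a(vendor_a_day), daily_steps_b(vendor_b_day))
```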
Access to personal data also suffers from a chicken-and-egg problem right now. We don’t see greater consumer demand for this because we don’t yet have robust enough tools to make use of disparate sets of data as individuals, and yet such tools are not gaining traction without proven demand.”

Transparency 2.0: The Fundamentals of Online Open Government


White Paper by Granicus: “Open government is about building transparency, trust, and engagement with the public. Today, with 80% of the North American public on the Internet, it is becoming increasingly clear that building open government starts online. Transparency 2.0 not only provides public information, but also develops civic engagement, opens the decision-making process online, and takes advantage of today’s technology trends.
Citizen ideation & feedback. While open data once made up much of what online transparency meant, today government agencies have expanded openness to include public records, legislative data, decision-making workflow, and citizen ideation and feedback.
This paper outlines the principles of Transparency 2.0 – the fundamentals and best practices for creating the most advanced and comprehensive online open government – which over a thousand state, federal, and local government agencies are now using to reduce information requests, create engagement, and improve efficiency.”

Crisis response needs to be a science, not an art


Jimmy Whitworth in the Financial Times: “…It is an imperative to offer shelter, nutrition, sanitation and medical care to those suddenly bereft of it. Without aid, humanitarian crises would cause still greater suffering. Yet admiration for the agencies that deliver relief should not blind us to the need to ensure that it is well delivered. Humanitarian responses must be founded on good evidence.
The evidence base, unfortunately, is weak. We know that storms, earthquakes and conflicts have devastating consequences for health and wellbeing, and that not responding is not an option, but we know surprisingly little about how best to go about it. Not only is evidence-based practice rare in humanitarian relief operations, it is often impossible.
Questions about how best to deliver clean water or adequate shelter, and even about which health needs should be prioritised as the most pressing, have often been barely researched. Indeed, the evidence gap is so great that the Humanitarian Practice Network has highlighted a “dire lack of credible data to help us understand just how much populations in crisis suffer, and to what extent relief operations are able to relieve that suffering”. No wonder aid responses are often characterised as messy.
Good practice often rests on past practice rather than research. The Bible of humanitarian relief is a document called the Sphere handbook, an important initiative to set minimum standards for provision of health, nutrition, sanitation and shelter. Yet analysis of the 2004 handbook has revealed that just 13 per cent of its 346 standards were supported by good evidence of relevance to health. The handbook, for example, recommended that refugee camps should prioritise measles vaccination – a worthwhile goal, but not one that should clearly be favoured over control of other infectious diseases.

Also under-researched is the question of how best to provide types of relief that everybody agrees meet essential needs. Access to clean water is a clear priority for almost all populations in crisis but little is understood about how this is most efficiently delivered. Is it best to ship bottled water to stricken areas? Are tankers of clean water more effective? Or can water purification tablets do the job? The summer floods in northern India made it clear that there is little good evidence one way or another.

Adequate shelter, too, is a human essential in all but the most benign environments but, once again, the evidence base about how best to provide it is limited. There is a school of thought that building transitional shelter from locally available materials is better in the long run than housing people under tents, tarpaulins and plastic, which if accurate would have far-reaching consequences for standard practice. But too little research has been done…
Researchers also face significant challenges to building a better evidence base. They can struggle to secure access to disaster zones when getting relief in is the priority. The timescales involved in applying for funding and ethical approval, too, make it difficult for them to move quickly enough to set up a study in the critical post-disaster period.
It is to address this that Enhancing Learning and Research for Humanitarian Assistance, with the support of the Wellcome Trust and the UK Department for International Development, recently launched an £8m research programme that investigates these issues.”

MakerBot Launches Mission To Put 3-D Printers In Every U.S. Public School


FastCompany: “There was a time when learning-by-doing meant shop class or playing Oregon Trail. Now it means designing on a 3-D printer.
Brooklyn-based MakerBot Industries has announced a new crowdsourcing initiative with DonorsChoose.org, Autodesk, and America Makes to put 3-D printers in each of America’s public schools. MakerBot Academy could put as many as 5,000 printers in public schools by the end of this school year, says MakerBot CEO Bre Pettis.
The initiative is a response to President Obama’s call for more home-grown manufacturing in his recent State of the Union address. Each 3-D printing bundle comes with a MakerBot Replicator 2 printer, three spools of PLA filament (in red, white, and blue, of course), and a year of MakerBot Makercare for about $2,250, plus a $98 threshold that must be raised by someone with ties to the school.
Individuals and corporations can visit DonorsChoose.org to donate to the pot for the project, and teachers register on the site to receive a bundle. Teachers have until Nov. 18 to enter the Thingiverse Math Manipulatives Challenge, where they can upload designs for teachers to use in the classroom. First-place winners get to send a 3-D printer bundle to the classroom of their choice.
“Hands-on learning and applied learning is the way to engage students, and there’s nothing more hands on and applied than 3-D printing,” says Charles Best, founder and CEO of DonorsChoose. “The impulse to construct is deeper than a teaching strategy. It’s a human need.”

Wicked Problems: Problems Worth Solving – A Handbook and a Call to Action


This book was started with the intent of changing design and social entrepreneurship education. As these disciplines converge, it becomes evident that existing pedagogy doesn’t support either students or practitioners attempting to design for impact. This text is a reaction to that convergence, and will ideally be used by various students, educators, and practitioners:
One audience is professors and educators of design, who are challenged with reinventing their educational curriculum in the face of a changing world. For them, this book should act as both a starting point for curriculum development and a justification for why this development is necessary—it should answer the question “what should design and social entrepreneurship education look like?”
Another audience is made up of fresh-out-of-school designers, who are bored and uninspired by their jobs. For them, this book should answer the question “how can I redirect my design efforts to something meaningful?”
Finally, a last audience is made up of practicing designers and entrepreneurs, who are looking to achieve social impact in their work. For them, the book should answer the question “what tools and techniques can I use in my work to drive impact through design?”
The entire text of the book is available online for free as HTML, and provided for reuse and adaptation under a Creative Commons license. We hope you find this a useful resource in your practice, your education, and your day-to-day life.
Read Wicked Problems: Problems Worth Solving

White House Unveils Big Data Projects, Round Two


Information Week: “The White House Office of Science and Technology Policy (OSTP) and Networking and Information Technology R&D program (NITRD) on Tuesday introduced a slew of new big-data collaboration projects aimed at stimulating private-sector interest in federal data. The initiatives, announced at the White House-sponsored “Data to Knowledge to Action” event, are targeted at fields as varied as medical research, geointelligence, economics, and linguistics.
The new projects are a continuation of the Obama Administration’s Big Data Initiative, announced in March 2012, when the first round of big-data projects was presented.
Thomas Kalil, OSTP’s deputy director for technology and innovation, said that “dozens of new partnerships — more than 90 organizations” are pursuing these new collaborative projects, including many of the best-known American technology, pharmaceutical, and research companies.
Among the initiatives, Amazon Web Services (AWS) and NASA have set up the NASA Earth eXchange, or NEX, a collaborative network to provide space-based data about our planet to researchers in Earth science. AWS will host much of NASA’s Earth-observation data as an AWS Public Data Set, making it possible, for instance, to crowdsource research projects.
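As a sketch of what hosting data as an AWS Public Data Set means in practice for a researcher, the snippet below lists a few objects from a public bucket anonymously with boto3. The bucket name and prefix are illustrative assumptions, not an official NEX endpoint.

```python
# A minimal sketch of anonymous access to an AWS Public Data Set with boto3.
# The bucket name and prefix are illustrative, not an official NEX endpoint.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Public data sets can be read without AWS credentials (unsigned requests).
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
response = s3.list_objects_v2(Bucket="nasanex", Prefix="NEX-DCP30/", MaxKeys=5)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])  # a few of the hosted data files
```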
An estimated 4.4 million jobs are being created between now and 2015 to support big-data projects. Employers, educational institutions, and government agencies are working to build the educational infrastructure to provide students with the skills they need to fill those jobs.
To help train new workers, IBM, for instance, has created a new assessment tool that gives university students feedback on their readiness for number-crunching careers in both the public and private sector. Eight universities that have a big data and analytics curriculum — Fordham, George Washington, Illinois Institute of Technology, University of Massachusetts-Boston, Northwestern, Ohio State, Southern Methodist, and the University of Virginia — will receive the assessment tool.
OSTP is organizing an initiative to create a “weather service” for pandemics, Kalil said, a way to use big data to identify and predict pandemics as early as possible in order to plan and prepare for — and hopefully mitigate — their effects.
The National Institutes of Health (NIH), meanwhile, is undertaking its “Big Data to Knowledge” (BD2K) initiative to develop a range of standards, tools, software, and other approaches to make use of massive amounts of data being generated by the health and medical research community….”
See also:
November 12, 2013 – Fact Sheet: Progress by Federal Agencies: Data to Knowledge to Action
November 12, 2013 – Fact Sheet: New Announcements: Data to Knowledge to Action
November 12, 2013 – Press Release: Data to Knowledge to Action Event

Transparency in Politics and the Media: Accountability and Open Government


New report from The Reuters Institute for the Study of Journalism by Nigel Bowles, James T. Hamilton, and David A. L. Levy: “Increasingly governments around the world are experimenting with initiatives in transparency or ‘open government’.
These involve a variety of measures including the announcement of more user-friendly government websites, greater access to government data, the extension of freedom of information legislation and broader attempts to involve the public in government decision making.
However, the role of the media in these initiatives has not hitherto been examined. This new RISJ edited volume analyses the challenges and opportunities presented to journalists as they attempt to hold governments accountable in an era of professed transparency.
In examining how transparency and open government initiatives have affected the accountability role of the press in the US and the UK, it also explores how policies in these two countries could change in the future to help journalists hold governments more accountable.
This volume will be essential reading for all practising journalists, for students of journalism or politics, and for policymakers. This publication can be bought from I. B. Tauris.
Download the Executive Summary and First Chapter”

Now there’s a bug bounty program for the whole Internet


Ars Technica: “Microsoft and Facebook are sponsoring a new program that pays big cash rewards to whitehat hackers who uncover security bugs threatening the stability of the Internet at large.
The Internet Bug Bounty program, which in some cases will pay $5,000 or more per vulnerability, is sponsored by Microsoft and Facebook. It will be jointly controlled by researchers from those companies along with their counterparts at Google, security firm iSec Partners, and e-commerce website Etsy. To qualify, the bugs must affect software implementations from a variety of companies, potentially result in severely negative consequences for the general public, and manifest themselves across a wide base of users. In addition to rewarding researchers for privately reporting the vulnerabilities, program managers will assist with coordinating disclosure and bug fixes involving large numbers of companies when necessary.
The program was unveiled Wednesday, and it builds off a growing number of similar initiatives. Last month, Google announced rewards as high as $3,133.70 for software updates that improve the security of OpenSSL, OpenSSH, BIND, and several other open-source packages. Additionally, Google, Facebook, Microsoft, eBay, Mozilla, and several other software or service providers pay cash in return for private reports of security vulnerabilities that threaten their users.”