How Crowdsourced Astrophotographs on the Web Are Revolutionizing Astronomy


Emerging Technology From the arXiv: “Astrophotography is currently undergoing a revolution thanks to the increased availability of high quality digital cameras and the software available to process the pictures after they have been taken.
Since photographs of the night sky are almost always better with long exposures that capture more light, this processing usually involves combining several images of the same part of the sky to produce one with a much longer effective exposure.
That’s all straightforward if you’ve taken the pictures yourself with the same gear under the same circumstances. But astronomers want to do better.
“The astrophotography group on Flickr alone has over 68,000 images,” say Dustin Lang at Carnegie Mellon University in Pittsburgh and a couple of pals. These and other images represent a vast source of untapped data for astronomers.
The problem is that it’s hard to combine images accurately when little is known about how they were taken. Astronomers take great care to use imaging equipment in which the pixels produce a signal that is proportional to the number of photons that hit.
But the same cannot be said of the digital cameras widely used by amateurs. All kinds of processes can end up influencing the final image.
So any algorithm that combines them has to cope with these variations. “We want to do this without having to infer the (possibly highly nonlinear) processing that has been applied to each individual image, each of which has been wrecked in its own loving way by its creator,” say Lang and co.
Now, these guys say they’ve cracked it. They’ve developed a system that automatically combines images from the same part of the sky to increase the effective exposure time of the resulting picture. And they say the combined images can rival those from much larger telescopes.
They’ve tested this approach by downloading images of two well-known astrophysical objects: the galaxy NGC 5907 and the colliding galaxy pair Messier 51a and 51b.
For NGC 5907, they ended up with 4,000 images from Flickr, 1,000 from Bing and 100 from Google. They used an online system called astrometry.net that automatically aligns and registers images of the night sky and then combined the images using their new algorithm, which they call Enhance.
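The Enhance algorithm itself is built to cope with the unknown, often nonlinear processing baked into each amateur photo, so the sketch below is not the authors' method; it is only a minimal illustration of the basic stacking step, assuming the frames have already been registered onto a common pixel grid (which is what astrometry.net provides). The directory name and the percentile normalization are illustrative choices, not details from the paper.

    # Minimal stacking sketch -- NOT the Enhance algorithm from the paper.
    # Assumes the frames in aligned_frames/ are already registered to a common
    # pixel grid (e.g. via astrometry.net) and saved as grayscale images.
    import glob
    import numpy as np
    from PIL import Image

    paths = sorted(glob.glob("aligned_frames/*.png"))  # hypothetical input folder

    frames = []
    for p in paths:
        img = np.asarray(Image.open(p).convert("L"), dtype=np.float64)
        # Rescale each frame so differently processed photos do not dominate.
        lo, hi = np.percentile(img, [1, 99])
        frames.append(np.clip((img - lo) / (hi - lo + 1e-9), 0.0, 1.0))

    # A per-pixel median acts as a crude consensus, suppressing satellites,
    # watermarks and other artefacts that appear in only a few of the photos.
    stacked = np.median(np.stack(frames), axis=0)
    Image.fromarray((stacked * 255).astype(np.uint8)).save("stacked.png")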
The results are impressive. They say that the combined images of NGC 5907 show some of the same faint features revealed by a single 11-hour exposure taken with a 50 cm telescope. All the images reveal the same kind of fine detail, such as a faint stellar stream around the galaxy.
The combined image for the M51 galaxies is just as impressive, taking only 40 minutes to produce on a single processor. It reveals extended structures around both galaxies, which astronomers know to be debris from their gravitational interaction as they collide.
Lang and co say these faint features are hugely important because they allow astronomers to measure the age, mass ratios, and orbital configurations of the galaxies involved. Interestingly, many of these faint features are not visible in any of the input images taken from the Web. They emerge only once images have been combined.
One potential problem with algorithms like this is that they need to perform well as the number of images they combine increases. It’s no good if they grind to a halt as soon as a substantial amount of data becomes available.
On this score, Lang and co say astronomers can rest easy. The performance of their new Enhance algorithm scales linearly with the number of images it has to combine. That means it should perform well on large datasets.
The bottom line is that this kind of crowd-sourced astronomy has the potential to make a big impact, given that the resulting images rival those from large telescopes.
And it could also be used for historical images, say Lang and co. The Harvard Plate Archives, for example, contain half a million images dating back to the 1880s. These were all taken using different emulsions, with different exposures and developed using different processes. So the plates all have different responses to light, making them hard to compare.
That’s exactly the problem that Lang and co have solved for digital images on the Web. So it’s not hard to imagine how they could easily combine the data from the Harvard archives as well….”
Ref: arxiv.org/abs/1406.1528 : Towards building a Crowd-Sourced Sky Map

Opening Public Transportation Data in Germany


Thesis by Kaufmann, Stefan: “Open data has been recognized as a valuable resource, and public institutions have taken to publishing their data under open licenses, including in Germany. However, German public transit agencies are still reluctant to publish their schedules as open data, and two data exchange formats widely used in German transit planning are proprietary, with no publicly available documentation. Through this work, one of the proprietary formats was reverse-engineered, and a transformation process into the open GTFS schedule format was developed. This process allowed a partnering transit operator to publish its schedule as open data. In addition, a survey of German transit authorities and operators evaluated the prevalence of transit data exchange formats and the reservations held about open transit data. The survey brought to light a series of issues that act as obstacles to opening up transit data. Addressing these issues, and partnering with open-minded transit authorities to further develop transit data publishing processes, can serve as a foundation for wider adoption of open transit data in Germany.”
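GTFS itself is just a bundle of plain CSV text files (agency.txt, stops.txt, routes.txt, trips.txt, stop_times.txt and so on) packaged as a zip archive, which is part of what makes it an attractive open target format. As a rough illustration of the output side of such a conversion, with the parsed input records invented here because the proprietary source format is not documented in this summary, writing a GTFS stops.txt might look like this:

    # Sketch of the GTFS end of a schedule conversion. The parsed input records
    # are hypothetical; the column names are those GTFS expects in stops.txt.
    import csv

    parsed_stops = [  # hypothetical output of a reverse-engineered parser
        {"id": "de:08421:1200", "name": "Ulm Hauptbahnhof",
         "lat": 48.39946, "lon": 9.98247},
    ]

    with open("stops.txt", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=["stop_id", "stop_name", "stop_lat", "stop_lon"])
        writer.writeheader()
        for s in parsed_stops:
            writer.writerow({"stop_id": s["id"], "stop_name": s["name"],
                             "stop_lat": s["lat"], "stop_lon": s["lon"]})

A complete feed would add the remaining required files (agency.txt, routes.txt, trips.txt, stop_times.txt, calendar information) and zip them together.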

Big Data, Big Questions


Special Issue by the International Journal of Communication on Big Data, Big Questions:

Critiquing Big Data: Politics, Ethics, Epistemology (Special Section Introduction)
Kate Crawford, Mary L. Gray, Kate Miltner (10 pgs.)
The Big Data Divide
Mark Andrejevic (17 pgs.)
Metaphors of Big Data
Cornelius Puschmann, Jean Burgess (20 pgs.)
Advertising, Big Data and the Clearance of the Public Realm: Marketers’ New Approaches to the Content Subsidy
Nick Couldry, Joseph Turow (17 pgs.)
A Dozen Ways to Get Lost in Translation: Inherent Challenges in Large Scale Data Sets
Lawrence Busch (18 pgs.)
Working Within a Black Box: Transparency in the Collection and Production of Big Twitter Data
Kevin Driscoll, Shawn Walker (20 pgs.)
Living on Fumes: Digital Footprints, Data Fumes, and the Limitations of Spatial Big Data
Jim Thatcher (19 pgs.)
This One Does Not Go Up To 11: The Quantified Self Movement as an Alternative Big Data Practice
Dawn Nafus, Jamie Sherman (11 pgs.)
The Theory/Data Thing
Geoffrey C. Bowker (5 pgs.)

Towards a comparative science of cities: using mobile traffic records in New York, London and Hong Kong


Book chapter by S. Grauwin, S. Sobolevsky, S. Moritz, I. Gódor, C. Ratti, to be published in “Computational Approaches for Urban Environments” (Springer Ed.), October 2014: “This chapter examines the possibility of analyzing and comparing human activities in an urban environment based on the detection of mobile phone usage patterns. Thanks to an unprecedented collection of counter data recording the number of calls, SMS, and data transfers resolved both in time and space, we confirm the connection between temporal activity profile and land usage in three global cities: New York, London and Hong Kong. By comparing whole cities’ typical patterns, we provide insights on how cultural, technological and economic factors shape human dynamics. At a more local scale, we use clustering analysis to identify locations with similar patterns within a city. Our research reveals a universal structure of cities, with core financial centers all sharing similar activity patterns and commercial or residential areas with more city-specific patterns. These findings hint that, as the economy becomes more global, common patterns emerge in the business areas of different cities across the globe, while the impact of local conditions remains recognizable at the level of people’s routine activity.”
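The chapter's pipeline is only summarized at a high level above, so the following is merely a sketch of the general technique it names, clustering locations by the shape of their temporal activity profiles, using random stand-in data rather than the mobile network counters the authors analyze:

    # Illustrative clustering of locations by the shape of their daily activity;
    # the data below are random stand-ins, not the chapter's counter records.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    profiles = rng.poisson(lam=50, size=(500, 24)).astype(float)  # locations x hours

    # Normalize each profile so clusters reflect the shape of daily activity
    # (office-like, residential-like, nightlife-like) rather than raw volume.
    profiles /= profiles.sum(axis=1, keepdims=True)

    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(profiles)
    print(np.bincount(labels))  # locations assigned to each activity type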

Every citizen a scientist? An EU project tries to change the face of research


Project News from the European Commission:  “SOCIENTIZE builds on the concept of ‘Citizen Science’, which sees thousands of volunteers, teachers, researchers and developers put together their skills, time and resources to advance scientific research. Thanks to open source tools developed under the project, participants can help scientists collect data – which will then be analysed by professional researchers – or even perform tasks that require human cognition or intelligence like image classification or analysis.

Every citizen can be a scientist
The project helps usher in new advances in everything from astronomy to social science.
‘One breakthrough is our increased capacity to reproduce, analyse and understand complex issues thanks to the engagement of large groups of volunteers,’ says Mr Fermin Serrano Sanz, researcher at the University of Zaragoza and Project Coordinator of SOCIENTIZE. ‘And everyone can be a neuron in our digitally-enabled brain.’
But how can ordinary citizens help with such extraordinary science? The key, says Mr Serrano Sanz, is in harnessing the efforts of thousands of volunteers to collect and classify data. ‘We are already gathering huge amounts of user-generated data from the participants using their mobile phones and surrounding knowledge,’ he says.
For example, the experiment ‘SavingEnergy@Home’ asks users to submit data about the temperatures in their homes and neighbourhoods in order to build up a clearer picture of temperatures in cities across the EU, while in Spain, GripeNet.es asks citizens to report when they catch the flu in order to monitor outbreaks and predict possible epidemics.
Many Hands Make Light Work
But citizens can also help analyse data. Even the most advanced computers are not very good at recognising things like sunspots or cells, whereas people can tell the difference between living and dying cells very easily after only brief training.
The SOCIENTIZE projects ‘Sun4All’ and ‘Cell Spotting’ ask volunteers to label images of solar activity and cancer cells from an application on their phone or computer. With Cell Spotting, for instance, participants can observe cell cultures being studied with a microscope in order to determine their state and the effectiveness of medicines. Analysing this data would take years and cost hundreds of thousands of euros if left to a small team of scientists – but with thousands of volunteers helping the effort, researchers can make important breakthroughs quickly and more cheaply than ever before.
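The article does not say how Cell Spotting combines the classifications it collects; a common pattern in citizen-science projects, shown here purely as a hypothetical sketch, is to give each image to several volunteers and keep a label only once enough of them agree:

    # Hypothetical aggregation of redundant volunteer labels by majority vote;
    # image IDs, labels and the agreement threshold are invented for illustration.
    from collections import Counter, defaultdict

    classifications = [  # (image_id, volunteer_label) pairs
        ("cell_001", "alive"), ("cell_001", "alive"), ("cell_001", "dying"),
        ("cell_002", "dying"), ("cell_002", "dying"), ("cell_002", "dying"),
    ]

    votes = defaultdict(Counter)
    for image_id, label in classifications:
        votes[image_id][label] += 1

    for image_id, counter in votes.items():
        label, count = counter.most_common(1)[0]
        agreement = count / sum(counter.values())
        print(image_id, label if agreement >= 0.6 else "needs more votes")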
But in addition to bringing citizens closer to science, SOCIENTIZE also brings science closer to citizens. On 12-14 June, the project participated in the SONAR festival with ‘A Collective Music Experiment’ (CME). ‘Two hundred people joined professional DJs and created musical patterns using a web tool; participants shared their creations and re-used other parts in real time. The activity in the festival also included a live show of RdeRumba and Mercadal playing amateur rhythms,’ Mr Serrano Sanz explains.
The experiment – which will be presented in a mini-documentary to raise awareness about citizen science – is expected to help understand other innovation processes observed in emergent social, technological, economic or political transformations. ‘This kind of event brings together a really diverse set of participants. The diversity does not only enrich the data; it improves the dialogue between professionals and volunteers. As a result, we see some new and innovative approaches to research.’
The EUR 0.7 million project brings together 6 partners from 4 countries: Spain (University of Zaragoza and TECNARA), Portugal (Museu da Ciência-Coimbra, MUSC; Universidade de Coimbra), Austria (Zentrum für Soziale Innovation) and Brazil (Universidade Federal de Campina Grande, UFCG).
SOCIENTIZE will end in October 2014 after bringing together 12,000 citizens in different phases of research activities over 24 months.”

Giving is a question of time: Response times and contributions to a real world public good


Discussion Paper (University of Heidelberg) by Lohse, Johannes and Goeschl, Timo and Diederich, Johannes: “Recent experimental research has examined whether contributions to public goods can be traced back to intuitive or deliberative decision-making, using response times in public good games in order to identify the specific decision process at work. In light of conflicting results, this paper reports on an analysis of response time data from an online experiment in which over 3,400 subjects from the general population decided whether to contribute to a real world public good. The between-subjects evidence confirms a strong positive link between contributing and deliberation and between free-riding and intuition. The average response time of contributors is 40 percent higher than that of free-riders. A within-subject analysis reveals that for a given individual, contributing significantly increases and free-riding significantly decreases the amount of deliberation required.”
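As a toy illustration of the between-subjects comparison the abstract describes, the mean response time of contributors versus free-riders, here is a minimal sketch with invented numbers rather than the experiment's data:

    # Invented example data; the paper's dataset and estimation strategy are not
    # reproduced here, only the shape of the between-subjects comparison.
    from statistics import mean

    data = [  # (contributed?, response time in seconds)
        (True, 42.0), (False, 30.0), (True, 55.0),
        (False, 28.0), (True, 61.0), (False, 35.0),
    ]

    contributors = mean(t for c, t in data if c)
    free_riders = mean(t for c, t in data if not c)
    print(f"contributors: {contributors:.1f}s, free-riders: {free_riders:.1f}s, "
          f"difference: {100 * (contributors / free_riders - 1):.0f}%")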

Finding Mr. Smith or why anti-corruption needs open data


Martin Tisne: “Anti-corruption groups have been rightly advocating for the release of information on the beneficial or real owners of companies and trusts. The idea is to crack down on tax evasion and corruption by identifying the actual individuals hiding behind several layers of shell companies.
But knowing that “Mr. Smith” is the owner of company X is of no interest, unless you know who Mr. Smith is.
The real interest lies in figuring out that Mr. Smith is linked to company Y, that has been illegally exporting timber from country Z, and that Mr. Smith is the son-in-law of the mining minister of yet another country, who has been accused of embezzling mining industry revenues.
For that, investigative journalists, prosecution authorities, and civil society groups like Global Witness and Transparency International will need access not just to public registries of beneficial ownership but also to contract data, politically exposed persons databases (“PEPs” databases), project-by-project extractive industry data, and trade export/import data.
Unless those datasets are accessible, comparable and linked, none of that will be possible. We are talking about millions of datasets – no problem for computers to crunch, but impossible to go through manually.
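A toy sketch of the kind of cross-dataset linking Tisne has in mind, with every name, identifier and table structure invented for illustration, might join an ownership register, an export licence dataset and a politically exposed persons register on shared keys:

    # All datasets, names and identifiers below are invented for illustration.
    import pandas as pd

    ownership = pd.DataFrame({
        "person_id": ["P1"], "person_name": ["Mr. Smith"], "company": ["Company Y"],
    })
    export_licences = pd.DataFrame({
        "company": ["Company Y", "Company Z"], "commodity": ["timber", "ore"],
    })
    pep_register = pd.DataFrame({
        "person_id": ["P1"], "note": ["son-in-law of a mining minister"],
    })

    linked = (ownership.merge(export_licences, on="company")
                       .merge(pep_register, on="person_id"))
    print(linked)  # one row linking Mr. Smith, Company Y and the PEP note

The point is precisely that such joins are trivial for software once the underlying registers are open, comparable and consistently keyed.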
This is what is different in the anti-corruption landscape today, compared to 10 years ago. Technology makes it possible. Don’t get me wrong – there are still huge, thorny political obstacles to getting the data even publicly available in the first place. But unless it is open data, I fear those battles will have been in vain.
That’s why we need open data as a topic on the G20 anti-corruption working group.”

A Big Day for Big Data: The Beginning of Our Data Transformation


Mark Doms, Under Secretary for Economic Affairs at the US Department of Commerce: “Wednesday, June 18, 2014, was a big day for big data.  The Commerce Department participated in the inaugural Open Data Roundtable at the White House, with GovLab at NYU and the White House Office of Science and Technology Policy. The event brought businesses and non-profit organizations that rely on Commerce data together with Commerce Department officials to discuss how to make the data we collect and release easier to find, understand and use.  This initiative has significant potential to fuel new businesses; create jobs; and help federal, state and local governments make better decisions.
Data innovation is revolutionizing every aspect of our society and government data is playing a major role in the revolution. From the National Oceanic and Atmospheric Administration’s (NOAA’s) climate data to the U.S. Census Bureau’s American Community Survey, the U.S. Patent and Trademark Office (USPTO) patent and trademark records, and National Institute of Standards and Technology (NIST) research, companies, organizations and people are using this information to innovate, grow our economy and better plan for the future.
At this week’s Open Data 500 event, some of the key insights I came away with include: 

  • There is a strong desire for data consistency across the Commerce Department, and indeed the federal government. 
  • Data should be catalogued in a common, machine-readable format. 
  • Data should be accessible in bulk, allowing the private sector greater flexibility to harness the information. 
  • The use of a single platform for access to government data would create efficiencies and help coordination across agencies.

Furthermore, business leaders stand ready to help us achieve these goals.
Secretary Pritzker is the first Secretary of Commerce to make data a departmental priority in the Commerce Department’s Strategic Plan, and has branded Commerce as “America’s Data Agency.” In keeping with that mantra, over the next several months, my team at the Economics and Statistics Administration (ESA), which includes the Bureau of Economic Analysis and the U.S. Census Bureau, will be involved in similar forums.  We will be engaging our users – businesses, academia, advocacy organizations, and state and local governments – to drive this open data conversation forward. 
Today was a big first step in that process. The insight gained will help inform our efforts ahead. Thanks again to the team at GovLab and the White House for their hard work in making it possible!”

How a Sensor-Filled World Will Change Human Consciousness


Scientific American: “Here’s a fun experiment: Try counting the electronic sensors surrounding you right now. There are cameras and microphones in your computer. GPS sensors and gyroscopes in your smartphone. Accelerometers in your fitness tracker. If you work in a modern office building or live in a newly renovated house, you are constantly in the presence of sensors that measure motion, temperature and humidity.
Sensors have become abundant because they have, for the most part, followed Moore’s law: they just keep getting smaller, cheaper and more powerful. A few decades ago the gyroscopes and accelerometers that are now in every smartphone were bulky and expensive, limited to applications such as spacecraft and missile guidance. Meanwhile, as you might have heard, network connectivity has exploded. Thanks to progress in microelectronics design as well as management of energy and the electromagnetic spectrum, a microchip that costs less than a dollar can now link an array of sensors to a low-power wireless communications network….”

Government, Foundations Turn to Cash Prizes to Generate Solutions


Megan O’Neil at the Chronicle of Philanthropy: “Government agencies and philanthropic organizations are increasingly staging competitions as a way to generate interest in solving difficult technological, social, and environmental problems, according to a new report.
“The Craft of Prize Design: Lessons From the Public Sector” found that well-designed competitions backed by cash incentives can help organizations attract new ideas, mobilize action, and stimulate markets.
“Incentive prizes have transformed from an exotic open innovation tool to a proven innovation strategy for the public, private, and philanthropic sectors,” the report says.
Produced by Deloitte Consulting’s innovation practice, the report was financially supported by Bloomberg Philanthropies and the Case; Joyce; John S. and James L. Knight; Kresge; and Rockefeller foundations.
The federal government has staged more than 350 prize competitions during the past five years to stimulate innovation and crowdsource solutions, according to the report. And philanthropic organizations are also fronting prizes for competitions promoting innovative responses to questions such as how to strengthen communities and encourage sustainable energy consumption.
One example cited by the report is the Talent Dividend Prize, sponsored by CEOs for Cities and the Kresge Foundation, which awards $1 million to the city that most increases its college graduation rate during a four-year period. A second example is the MIT Clean Energy Prize, co-sponsored by the U.S. Department of Energy, which offered a total of $1 million in prize money. Submissions generated $85 million in capital and research grants, according to the report.
A prize-based project should not be adopted when an established approach to solve a problem already exists or if potential participants don’t have the interest or time to work on solving a problem, the report concludes. Instead, prize designers must gauge the capacity of potential participants before announcing a prize, and make sure that it will spur the discovery of new solutions.”