The Crowd & the Cloud


The Crowd & the Cloud (TV series): “Are you interested in birds, fish, the oceans or streams in your community? Are you concerned about fracking, air quality, extreme weather, asthma, Alzheimer’s disease, Zika or other epidemics? Now you can do more than read about these issues. You can be part of the solution.

Smartphones, computers and mobile technology are enabling regular citizens to become part of a 21st century way of doing science. By observing their environments, monitoring neighborhoods, collecting information about the world and the things they care about, so-called “citizen scientists” are helping professional scientists to advance knowledge while speeding up new discoveries and innovations.

The results are improving health and welfare, assisting in wildlife conservation, and giving communities the power to create needed change and help themselves.

Citizen science has amazing promise, but also raises questions about data quality and privacy. Its potential and challenges are explored in THE CROWD & THE CLOUD, a 4-part public television series premiering in April 2017. Hosted by former NASA Chief Scientist Waleed Abdalati, each episode takes viewers on a global tour of the projects and people on the front lines of this disruptive transformation in how science is done, and shows how anyone, anywhere can participate….(More)”

 

Does digital democracy improve democracy?


Thamy Pogrebinschi at Open Democracy: “The advancement of tools of information and communications technology (ICT) has the potential to impact democracy nearly as much as any other area, such as science or education. The effects of the digital world on politics and society are still difficult to measure, and the speed with which these new technological tools evolve is often faster than a scholar’s ability to assess them, or a policymaker’s capacity to make them fit into existing institutional designs.

Since their early inception, digital tools and widespread access to the internet have been changing the traditional means of participation in politics, making them more effective. Electoral processes have become more transparent and effective in several countries where the paper ballot has been replaced by electronic voting machines. Petition-signing has become a widespread and powerful tool: individual citizens no longer need to be stopped in the streets to sign a sheet of paper, but can instead be reached by the millions via e-mail and have their names added to virtual petition lists in seconds. Protests and demonstrations have also been immensely revitalized in the internet era. In the last few years, social networks like Facebook and WhatsApp have proved to be a driving force behind democratic uprisings, mobilizing the masses, convening large gatherings, and raising awareness, as was the case with the Arab Spring.

While traditional means of political participation can become more effective as ICT tools reduce the costs of participating, one cannot yet be sure that they will become less subject to distortion and manipulation. In the most recent United States elections, computer scientists claimed that electronic voting machines may have been hacked, altering the results in the counties that relied on them. E-petitions can also be easily manipulated if safe identification procedures are not put in place. And in these post-truth times, protests and demonstrations can result from strategic partisan manipulation of social media, leading to democratic instability, as has recently occurred in Brazil. Nevertheless, distortion and manipulation of these traditional forms of participation were also present before the rise of ICT tools; even if the latter do not solve these pre-existing problems, they may still make political processes more effective.

The game-changer for democracy, however, is not the revitalization of the traditional means of political participation like elections, petition-signing and protests through digital tools. Rather, the real change on how democracy works, governments rule, and representation is delivered comes from entirely new means of e-participation, or the so-called digital democratic innovations. While the internet may boost traditional forms of political participation by increasing the quantity of citizens engaged, democratic innovations that rely on ICT tools may change the very quality of participation, thus in the long-run changing the nature of democracy and its institutions….(More)”

Bit By Bit: Social Research in the Digital Age


Open Review of Book by Matthew J. Salganik: “In the summer of 2009, mobile phones were ringing all across Rwanda. In addition to the millions of calls between family, friends, and business associates, about 1,000 Rwandans received a call from Joshua Blumenstock and his colleagues. The researchers were studying wealth and poverty by conducting a survey of people who had been randomly sampled from a database of 1.5 million customers from Rwanda’s largest mobile phone provider. Blumenstock and colleagues asked the participants if they wanted to participate in a survey, explained the nature of the research to them, and then asked a series of questions about their demographic, social, and economic characteristics.

Everything I have said up until now makes this sound like a traditional social science survey. But, what comes next is not traditional, at least not yet. They used the survey data to train a machine learning model to predict someone’s wealth from their call data, and then they used this model to estimate the wealth of all 1.5 million customers. Next, they estimated the place of residence of all 1.5 million customers by using the geographic information embedded in the call logs. Putting these two estimates together—the estimated wealth and the estimated place of residence—Blumenstock and colleagues were able to produce high-resolution estimates of the geographic distribution of wealth across Rwanda. In particular, they could produce an estimated wealth for each of Rwanda’s 2,148 cells, the smallest administrative unit in the country.
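
The two-step pipeline described above — fit a supervised model on the surveyed subsample, then apply it to the full subscriber base and aggregate geographically — can be sketched as follows. Everything here (feature names, coefficients, counts, the 30-district aggregation) is a synthetic stand-in for illustration, not the study's actual variables, model, or data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_surveyed, n_all = 1_000, 50_000

def call_features(n):
    # hypothetical call-record features:
    # mean calls/day, fraction of international calls, average top-up amount
    return np.column_stack([
        rng.poisson(5, n),
        rng.uniform(0, 0.3, n),
        rng.gamma(2.0, 10.0, n),
    ])

# Simulate the surveyed subsample: features plus a (noisy) wealth measure.
X_survey = call_features(n_surveyed)
true_w = np.array([0.5, 20.0, 0.1])          # unknown in reality
wealth_survey = X_survey @ true_w + rng.normal(0, 1, n_surveyed)

# Step 1: fit a predictive model on the ~1,000 surveyed customers.
coef, *_ = np.linalg.lstsq(X_survey, wealth_survey, rcond=None)

# Step 2: apply the model to every subscriber in the call-record database.
X_all = call_features(n_all)
wealth_all = X_all @ coef

# Step 3: aggregate by (estimated) home region to build a wealth map,
# here 30 districts standing in for the geographic units.
region = rng.integers(0, 30, n_all)
district_wealth = {d: float(wealth_all[region == d].mean()) for d in range(30)}
```

A linear least-squares fit stands in for the machine learning model; the structure — train on the small labelled sample, predict for the full database, then aggregate to administrative units — mirrors the pipeline the excerpt describes.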

It was impossible to validate these estimates because no one had ever produced estimates for such small geographic areas in Rwanda. But, when Blumenstock and colleagues aggregated their estimates to Rwanda’s 30 districts, they found that their estimates were similar to estimates from the Demographic and Health Survey, the gold standard of surveys in developing countries. Although these two approaches produced similar estimates in this case, the approach of Blumenstock and colleagues was about 10 times faster and 50 times cheaper than the traditional Demographic and Health Surveys. These dramatically faster and lower cost estimates create new possibilities for researchers, governments, and companies (Blumenstock, Cadamuro, and On 2015).

In addition to developing a new methodology, this study is kind of like a Rorschach inkblot test; what people see depends on their background. Many social scientists see a new measurement tool that can be used to test theories about economic development. Many data scientists see a cool new machine learning problem. Many business people see a powerful approach for unlocking value in the digital trace data that they have already collected. Many privacy advocates see a scary reminder that we live in a time of mass surveillance. Many policy makers see a way that new technology can help create a better world. In fact, this study is all of those things, and that is why it is a window into the future of social research….(More)”

UK’s Digital Strategy


Executive Summary: “This government’s Plan for Britain is a plan to build a stronger, fairer country that works for everyone, not just the privileged few. …Our digital strategy now develops this further, applying the principles outlined in the Industrial Strategy green paper to the digital economy. The UK has a proud history of digital innovation: from the earliest days of computing to the development of the World Wide Web, the UK has been a cradle for inventions which have changed the world. And from Ada Lovelace – widely recognised as the first computer programmer – to the pioneers of today’s revolution in artificial intelligence, the UK has always been at the forefront of invention. …

Maintaining the UK government as a world leader in serving its citizens online

From personalised services in health, to safer care for the elderly at home, to tailored learning in education and access to culture – digital tools, techniques and technologies give us more opportunities than ever before to improve the vital public services on which we all rely.

The UK is already a world leader in digital government,7 but we want to go further and faster. The new Government Transformation Strategy published on 9 February 2017 sets out our intention to serve the citizens and businesses of the UK with a better, more coherent experience when using government services online – one that meets the raised expectations set by the many other digital services and tools they use every day. So, we will continue to develop single cross-government platform services, including by working towards 25 million GOV.UK Verify users by 2020 and adopting new services onto the government’s GOV.UK Pay and GOV.UK Notify platforms.

We will build on the ‘Government as a Platform’ concept, ensuring we make greater reuse of platforms and components across government. We will also continue to move towards common technology, ensuring that where it is right we are consuming commodity hardware or cloud-based software instead of building something that is needlessly government specific.

We will also continue to work, across government and the public sector, to harness the potential of digital to radically improve the efficiency of our public services – enabling us to provide a better service to citizens and service users at a lower cost. In education, for example, we will address the barriers faced by schools in regions not connected to appropriate digital infrastructure and we will invest in the Network of Teaching Excellence in Computer Science to help teachers and school leaders build their knowledge and understanding of technology. In transport, we will make our infrastructure smarter, more accessible and more convenient for passengers. At Autumn Statement 2016 we announced that the National Productivity Investment Fund would allocate £450 million from 2018-19 to 2020-21 to trial digital signalling technology on the rail network. And in policing, we will enable police officers to use biometric applications to match fingerprints and DNA from crime scenes and return results, including records and alerts, to officers over mobile devices at the scene.

Read more about digital government.

Unlocking the power of data in the UK economy and improving public confidence in its use

As part of creating the conditions for sustainable growth, we will take the actions needed to make the UK a world-leading data-driven economy, where data fuels economic and social opportunities for everyone, and where people can trust that their data is being used appropriately.

Data is a global commodity and we need to ensure that our businesses can continue to compete and communicate effectively around the world. To maintain our position at the forefront of the data revolution, we will implement the General Data Protection Regulation by May 2018. This will ensure a shared and higher standard of protection for consumers and their data.

Read more about data….(More)”

AI, machine learning and personal data


Jo Pedder at the Information Commissioner’s Office Blog: “Today sees the publication of the ICO’s updated paper on big data and data protection.

But why now? What’s changed in the two and a half years since we first visited this topic? Well, quite a lot actually:

  • big data is becoming the norm for many organisations, using it to profile people and inform their decision-making processes, whether that’s to determine your car insurance premium or to accept/reject your job application;
  • artificial intelligence (AI) is stepping out of the world of science-fiction and into real life, providing the ‘thinking’ power behind virtual personal assistants and smart cars; and
  • machine learning algorithms are discovering patterns in data that traditional data analysis couldn’t hope to find, helping to detect fraud and diagnose diseases.
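
The pattern-finding described in that last point can be illustrated with a deliberately minimal sketch: a robust outlier check over an account's transaction history, a toy stand-in for fraud detection. This is for illustration only and reflects neither the ICO's nor any vendor's actual methodology:

```python
# Toy fraud check: flag transactions whose amount deviates sharply from an
# account's history, using the median absolute deviation (MAD) so that a
# single huge charge cannot distort the baseline it is compared against.
from statistics import median

def flag_outliers(amounts, threshold=3.0):
    """Return indices of amounts more than `threshold` MADs from the median."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts) or 1.0  # guard: zero spread
    return [i for i, a in enumerate(amounts)
            if abs(a - med) / mad > threshold]

history = [12.5, 9.9, 11.0, 10.4, 980.0, 10.8]  # one suspicious charge
print(flag_outliers(history))                    # -> [4]
```

Real systems learn far richer patterns (merchant, timing, location, device), but the principle is the same: a statistical model of "normal" behaviour makes the abnormal stand out.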

The complexity and opacity of these types of processing operations mean that it’s often hard to know what’s going on behind the scenes. This can be problematic when personal data is involved, especially when decisions are made that have significant effects on people’s lives. The combination of these factors has led some to call for new regulation of big data, AI and machine learning, to increase transparency and ensure accountability.

In our view though, whilst the means of processing personal data are changing, the underlying issues remain the same. Are people being treated fairly? Are decisions accurate and free from bias? Is there a legal basis for the processing? These are issues that the ICO has been addressing for many years, through oversight of existing European data protection legislation….(More)”

When the Big Lie Meets Big Data


Peter Bruce in Scientific American: “…The science of predictive modeling has come a long way since 2004. Statisticians now build “personality” models and tie them into other predictor variables. … One such model bears the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism. Using Big Data at the individual level, machine learning methods might classify a person as, for example, “closed, introverted, neurotic, not agreeable, and conscientious.”
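
As an illustration of how such a classification might be produced, here is a deliberately naive sketch that maps behavioural features to the five OCEAN trait labels. Every feature name, weight, and threshold below is invented for the example — real profiling models are trained on labelled personality data, not hand-written rules:

```python
# Hypothetical trait weights over made-up behavioural features; positive
# weighted sums map to the "high" label for each trait, negative to "low".
WEIGHTS = {
    "openness":          {"distinct_pages_visited": 0.8, "arts_likes": 0.6},
    "conscientiousness": {"profile_completeness": 0.9, "posts_per_day": -0.2},
    "extroversion":      {"friends_count": 0.7, "posts_per_day": 0.5},
    "agreeableness":     {"positive_comments": 0.9},
    "neuroticism":       {"late_night_activity": 0.6, "positive_comments": -0.4},
}

LABELS = {  # (low-score label, high-score label)
    "openness": ("closed", "open"),
    "conscientiousness": ("careless", "conscientious"),
    "extroversion": ("introverted", "extroverted"),
    "agreeableness": ("not agreeable", "agreeable"),
    "neuroticism": ("stable", "neurotic"),
}

def profile(features):
    """Map normalized behavioural features (roughly -1..1) to trait labels."""
    result = {}
    for trait, weights in WEIGHTS.items():
        score = sum(w * features.get(name, 0.0) for name, w in weights.items())
        result[trait] = LABELS[trait][1] if score > 0 else LABELS[trait][0]
    return result

# A heavy late-night poster with few friends and mostly negative comments
# comes out as the article's example profile: closed, introverted, neurotic,
# not agreeable, and conscientious.
print(profile({"late_night_activity": 0.9, "friends_count": -0.8,
               "positive_comments": -0.7, "profile_completeness": 0.5}))
```

The point of the sketch is the shape of the computation — weighted behavioural signals collapsed into coarse trait labels — which is what makes such profiles cheap to compute at voter-file scale.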

Alexander Nix, CEO of Cambridge Analytica (owned by Trump’s chief donor, Rebekah Mercer), says he has thousands of data points on you, and every other voter: what you buy or borrow, where you live, what you subscribe to, what you post on social media, etc. At a recent Concordia Summit, using the example of gun rights, Nix described how messages will be crafted to appeal specifically to you, based on your personality profile. Are you highly neurotic and conscientious? Nix suggests the image of a sinister gloved hand reaching through a broken window.

In his presentation, Nix noted that the goal is to induce behavior, not communicate ideas. So where does truth fit in? Johan Ugander, Assistant Professor of Management Science at Stanford, suggests that, for Nix and Cambridge Analytica, it doesn’t. In counseling the hypothetical owner of a private beach how to keep people off his property, Nix eschews the merely factual “Private Beach” sign, advocating instead a lie: “Sharks sighted.” Ugander, in his critique, cautions all data scientists against “building tools for unscrupulous targeting.”

The warning is needed, but may be too late. What Nix described in his presentation involved carefully crafted messages aimed at his target personalities. His messages pulled subtly on various psychological strings to manipulate us, and they obeyed no boundary of truth, but they required humans to create them.  The next phase will be the gradual replacement of human “craftsmanship” with machine learning algorithms that can supply targeted voters with a steady stream of content (from whatever source, true or false) designed to elicit desired behavior. Cognizant of the Pandora’s box that data scientists have opened, the scholarly journal Big Data has issued a call for papers for a future issue devoted to “Computational Propaganda.”…(More)”

Democracy at Work: Moving Beyond Elections to Improve Well-Being


Michael Touchton, Natasha Borges Sugiyama and Brian Wampler in the American Political Science Review: “How does democracy work to improve well-being? In this article, we disentangle the component parts of democratic practice—elections, civic participation, expansion of social provisioning, local administrative capacity—to identify their relationship with well-being. We draw from the citizenship debates to argue that democratic practices allow citizens to gain access to a wide range of rights, which then serve as the foundation for improving social well-being. Our analysis of an original dataset covering over 5,550 Brazilian municipalities from 2006 to 2013 demonstrates that competitive elections alone do not explain variation in infant mortality rates, one outcome associated with well-being. We move beyond elections to show how participatory institutions, social programs, and local state capacity can interact to buttress one another and reduce infant mortality rates. It is important to note that these relationships are independent of local economic growth, which also influences infant mortality. The result of our thorough analysis offers a new understanding of how different aspects of democracy work together to improve a key feature of human development….(More)”.

Handbook of Big Data Technologies


Handbook by Albert Y. Zomaya and Sherif Sakr: “…offers comprehensive coverage of recent advancements in Big Data technologies and related paradigms. Chapters are authored by international leading experts in the field, and have been reviewed and revised for maximum reader value. The volume consists of twenty-five chapters organized into four main parts. Part One covers the fundamental concepts of Big Data technologies, including data curation mechanisms, data models, storage models, programming models and programming platforms. It also dives into the details of implementing Big SQL query engines and big stream processing systems. Part Two focuses on the semantic aspects of Big Data management, including data integration and exploratory ad hoc analysis, in addition to structured querying and pattern matching techniques. Part Three presents a comprehensive overview of large-scale graph processing. It covers the most recent research in large-scale graph processing platforms, introducing several scalable graph querying and mining mechanisms in domains such as social networks. Part Four details novel applications that have been made possible by the rapid emergence of Big Data technologies, such as the Internet of Things (IoT), Cognitive Computing and SCADA systems. All parts of the book discuss open research problems, including potential opportunities, that have arisen from the rapid progress of Big Data technologies and the associated increasing requirements of application domains.
Designed for researchers, IT professionals and graduate students, this book is a timely contribution to the growing Big Data field. Big Data has been recognized as one of the leading emerging technologies that will have a major impact on the various fields of science and on many aspects of human society over the coming decades. Therefore, the content in this book will be an essential tool to help readers understand the development and future of the field….(More)”

Crowdsourcing Expertise


Simons Foundation: “Ever wish there was a quick, easy way to connect your research to the public?

By hosting a Wikipedia ‘edit-a-thon’ at a science conference, you can instantly share your research knowledge with millions while improving the science content on the most heavily trafficked and broadly accessible resource in the world. In 2016, in partnership with the Wiki Education Foundation, we helped launch the Wikipedia Year of Science, an ambitious initiative designed to better connect the work of scientists and students to the public. Here, we share some of what we learned.

The Simons Foundation — through its Science Sandbox initiative, dedicated to public engagement — co-hosted a series of Wikipedia edit-a-thons throughout 2016 at almost every major science conference, in collaboration with the world’s leading scientific societies and associations.

At our edit-a-thons, we leveraged the collective brainpower of scientists, giving them basic training on Wikipedia guidelines and facilitating marathon editing sessions — powered by free pizza, coffee and sometimes beer — during which they made copious contributions within their respective areas of expertise.

These efforts, combined with the Wiki Education Foundation’s powerful classroom model, have had a clear impact. To date, we’ve reached over 150 universities including more than 6,000 students and scientists. As for output, 6,306 articles have been created or edited, garnering more than 304 million views; over 2,000 scientific images have been donated; and countless new scientist-editors have been minted, many of whom will likely continue to update Wikipedia content. The most common response we got from scientists and conference organizers about the edit-a-thons was: “Can we do that again next year?”

That’s where this guide comes in.

Through collaboration, input from Wikipedians and scientists, and more than a little trial and error, we arrived at a model that can help you organize your own edit-a-thons. This informal guide captures our main takeaways and lessons learned….Our hope is that edit-a-thons will become another integral part of science conferences, just like tweetups, communication workshops and other recent outreach initiatives. This would ensure that the content of the public’s most common gateway to science research will continually improve in quality and scope.

Download: “Crowdsourcing Expertise: A working guide for organizing Wikipedia edit-a-thons at science conferences”

From big data to smart data: FDA’s INFORMED initiative


Sean Khozin, Geoffrey Kim & Richard Pazdur in Nature: “….Recent advances in our understanding of disease mechanisms have led to the development of new drugs that are enabling precision medicine. For example, the co-development of kinase inhibitors that target ‘driver mutations’ in metastatic non-small-cell lung cancer (NSCLC) with companion diagnostics has led to substantial improvements in the treatment of some patients. However, growing evidence suggests that most patients with metastatic NSCLC and other advanced cancers may not have tumours with single driver mutations. Furthermore, the generation of clinical evidence in genomically diverse and geographically dispersed groups of patients using traditional trial designs and multiple competing therapies is becoming more costly and challenging.

Strategies aimed at creating new efficiencies in clinical evidence generation and extending the benefits of precision medicine to larger groups of patients are driving a transformation from a reductionist approach to drug development (for example, a single drug targeting a driver mutation and traditional clinical trials) to a holistic approach (for example, combination therapies targeting complex multiomic signatures and real-world evidence). This transition is largely fuelled by the rapid expansion in the four dimensions of biomedical big data, which has created a need for greater organizational and technical capabilities (Fig. 1). Appropriate management and analysis of such data requires specialized tools and expertise in health information technology, data science and high-performance computing. For example, efforts to generate clinical evidence using real-world data are being limited by challenges such as capturing clinically relevant variables from vast volumes of unstructured content (such as physician notes) in electronic health records and organizing various structured data elements that are primarily designed to support billing rather than clinical research. So, new standards and quality-control mechanisms are needed to ensure the validity of the design and analysis of studies based on electronic health records.

Figure 1: Conceptual map of technical and organizational capacity for biomedical big data.

Big data can be defined as having four dimensions: volume (data size), variety (data type), veracity (data noise and uncertainty) and velocity (data flow and processing). Currently, FDA approval decisions are generally based on data of limited variety, mainly from clinical trials and preclinical studies (1) that are mostly structured (2), in data sets usually no more than a few gigabytes in size (3), that are processed intermittently as part of regulatory submissions (4). The expansion of big data in the four dimensions (grey lines) calls for increasing organizational and technical capacity. This could transform big data into smart data by enabling a holistic approach to personalization of therapies that takes patient, disease and environmental characteristics into account. (Full size image (309 KB);Download PowerPoint slide (492 KB)More)”