Human migration: the big data perspective


Alina Sîrbu et al. at the International Journal of Data Science and Analytics: “How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants….(More)”.

COVID-19 is creating a democratic deficit – here’s how to reduce it


Article by Matt Ryan: “As parliaments around the country move to scale down operations and defer sittings as part of containing COVID-19, people are beginning to ring the accountability alarm bells….

The good news is that we can learn from those parliaments and politicians around the world who have already been trialling new ways of working that go beyond traditional sittings. Leveraging simple and widely available technologies, they are involving more people with more diverse backgrounds in their processes with less reliance on those people being physically present.

Select Committees in the UK Parliament, for example, have used online “evidence checks” to scrutinise the basis for policy. These one-month exercises use targeted outreach and social media strategies to invite comments from knowledgeable stakeholders and members of the public about the rigour of evidence on which a government department’s policy is based. Evidence for departmental policy is summarised in a two-page document and comments publicly displayed in a web forum that resembles a readers’ comments section in an online news article.

In Taiwan, a participatory governance process pioneered by civic rights activists at the behest of a government minister combines large-scale online participation with smaller in-person gatherings to build a “rough consensus” on legislative proposals related to the digital economy before they are introduced. Known as vTaiwan, the process has led to 26 pieces of national legislation dealing with issues such as Uber, telemedicine and online alcohol sales, and has involved 200,000 people.

The government of Mexico City has raised the stakes even higher, involving more than 400,000 people in a process to draft a new constitution. It included a novel partnership between Change.org and the city mayor that enabled residents to create petition-backed proposals which, once they reached a certain threshold of support, bound the mayor to include them in the draft he submitted to a special constitutional assembly.

Processes like these can also offer relief for politicians and parliamentary officials managing the strain of examining an ever-increasing number of issues of greater complexity with limited personnel and budget. Evidence checks provide access to a wider pool of experts who can bolster existing research capacity. vTaiwan helps to find workable ways forward in industries being rapidly transformed by digital technologies. By “crowdsourcing” the city’s constitution, Mexico City’s mayor retained the trust of residents while undertaking reform at a grand scale….(More)”.

Why we need responsible data for children


Andrew Young and Stefaan Verhulst at The Conversation: “…Without question, the increased use of data poses unique risks for and responsibilities to children. While practitioners may have well-intended purposes to leverage data for and about children, the data systems used are often designed with (consenting) adults in mind without a focus on the unique needs and vulnerabilities of children. This can lead to the collection of inaccurate and unreliable data as well as the inappropriate and potentially harmful use of data for and about children….

Research undertaken in the context of the RD4C initiative uncovered the following trends and realities. These issues make clear why we need a dedicated data responsibility approach for children.

  • Today’s children are the first generation growing up at a time of rapid datafication where almost all aspects of their lives, both on and off-line, are turned into data points. An entire generation of young people is being datafied – often starting even before birth. Every year the average child will have more data collected about them in their lifetime than would a similar child born any year prior. The potential uses of such large volumes of data and their impact on children’s lives are unpredictable, and the data could potentially be used against them.
  • Children typically do not have full agency to make decisions about their participation in programs or services which may generate and record personal data. Children may also lack the understanding to assess a decision’s purported risks and benefits. Privacy terms and conditions are often barely understood by educated adults, let alone children. As a result, there is a higher duty of care for children’s data.
  • Disaggregating data according to socio-demographic characteristics can improve service delivery and assist with policy development. However, it also creates risks for group privacy. Children can be identified, exposing them to possible harms. Disaggregated data for groups such as child-headed households and children experiencing gender-based violence can put vulnerable communities and children at risk. Data about children’s location itself can be risky, especially if they have some additional vulnerability that could expose them to harm.
  • Mishandling data can cause children to lose trust in institutions that deliver essential services including vaccines, medicine, and nutrition supplies. For organizations dealing with child well-being, these retreats can have severe consequences. Distrust can cause families and children to refuse health, education, child protection and other public services. Such privacy protective behavior can impact children throughout the course of their lifetime, and potentially exacerbate existing inequities and vulnerabilities.
  • As volumes of collected and stored data increase, obligations and protections traditionally put in place for children may be difficult or impossible to uphold. The interests of children are not always prioritized when organizations define their legitimate interest to access or share personal information of children. The immediate benefit of a service provided does not always justify the risk or harm that might be caused by it in the future. Data analysis may be undertaken by people who do not have expertise in the area of child rights, as opposed to traditional research where practitioners are specifically educated in child subject research. Similarly, service providers collecting children’s data are not always specially trained to handle it, as international standards recommend.
  • Recent events around the world reveal the promise and pitfalls of algorithmic decision-making. While it can expedite certain processes, algorithms and their inferences can possess biases that can have adverse effects on people, for example those seeking medical care and attempting to secure jobs. The danger posed by algorithmic bias is especially pronounced for children and other vulnerable populations. These groups often lack the awareness or resources necessary to respond to instances of bias or to rectify any misconceptions or inaccuracies in their data.
  • Many of the children served by child welfare organizations have suffered trauma. Whether that trauma is physical, social, or emotional in nature, repeatedly making children register for services or provide confidential personal information can amount to revictimization – re-exposing them to traumas or instigating unwarranted feelings of shame and guilt.

These trends and realities make clear the need for new approaches for maximizing the value of data to improve children’s lives, while mitigating the risks posed by our increasingly datafied society….(More)”.

Data Collaboratives in Response to COVID19


Living Repository: “This document is part of a call for action to build a responsible infrastructure for data-driven pandemic response. 

It serves as a living repository for data collaboratives seeking to address the spread of COVID-19 and its secondary effects. 

> You can find ongoing data collaborative projects here

> Requests for data and expertise that might lead to data collaboratives can be found here.

> Data competitions, challenges, and calls for proposals, which can lead to useful tools to combat COVID-19, can be found here.

The repository aims to include projects that show a commitment to privacy protection, data responsibility, and overall user well-being. 

It will be updated regularly as we receive projects and proposals or otherwise become aware of them. 

HELP US MAKE THIS REPOSITORY BETTER:  Individuals are encouraged to edit the repo and/or suggest additions to this document if a project is not currently listed.

See full Living Repository here.

The world after coronavirus


Yuval Noah Harari at the Financial Times: “Humankind is now facing a global crisis. Perhaps the biggest crisis of our generation. The decisions people and governments take in the next few weeks will probably shape the world for years to come. They will shape not just our healthcare systems but also our economy, politics and culture. We must act quickly and decisively. We should also take into account the long-term consequences of our actions.

When choosing between alternatives, we should ask ourselves not only how to overcome the immediate threat, but also what kind of world we will inhabit once the storm passes. Yes, the storm will pass, humankind will survive, most of us will still be alive — but we will inhabit a different world.  Many short-term emergency measures will become a fixture of life. That is the nature of emergencies. They fast-forward historical processes.

Decisions that in normal times could take years of deliberation are passed in a matter of hours. Immature and even dangerous technologies are pressed into service, because the risks of doing nothing are bigger. Entire countries serve as guinea-pigs in large-scale social experiments. What happens when everybody works from home and communicates only at a distance? What happens when entire schools and universities go online? In normal times, governments, businesses and educational boards would never agree to conduct such experiments. But these aren’t normal times. 

In this time of crisis, we face two particularly important choices. The first is between totalitarian surveillance and citizen empowerment. The second is between nationalist isolation and global solidarity. 

Under-the-skin surveillance

In order to stop the epidemic, entire populations need to comply with certain guidelines. There are two main ways of achieving this. One method is for the government to monitor people, and punish those who break the rules. Today, for the first time in human history, technology makes it possible to monitor everyone all the time. Fifty years ago, the KGB couldn’t follow 240m Soviet citizens 24 hours a day, nor could the KGB hope to effectively process all the information gathered. The KGB relied on human agents and analysts, and it just couldn’t place a human agent to follow every citizen. But now governments can rely on ubiquitous sensors and powerful algorithms instead of flesh-and-blood spooks. 

In their battle against the coronavirus epidemic several governments have already deployed the new surveillance tools. The most notable case is China. By closely monitoring people’s smartphones, making use of hundreds of millions of face-recognising cameras, and obliging people to check and report their body temperature and medical condition, the Chinese authorities can not only quickly identify suspected coronavirus carriers, but also track their movements and identify anyone they came into contact with. A range of mobile apps warn citizens about their proximity to infected patients…

If I could track my own medical condition 24 hours a day, I would learn not only whether I have become a health hazard to other people, but also which habits contribute to my health. And if I could access and analyse reliable statistics on the spread of coronavirus, I would be able to judge whether the government is telling me the truth and whether it is adopting the right policies to combat the epidemic. Whenever people talk about surveillance, remember that the same surveillance technology can usually be used not only by governments to monitor individuals — but also by individuals to monitor governments. 

The coronavirus epidemic is thus a major test of citizenship….(More)”.

Toward Building The Data Infrastructure And Ecosystem We Need To Tackle Pandemics And Other Dynamic Societal And Environmental Threats


CALL FOR ACTION: “The spread of COVID-19 is a human tragedy and a worldwide crisis. The social and economic costs are huge, and they are contributing to a global slowdown. Despite the amount of data collected daily, we have not been able to leverage them to accelerate our understanding and action to counter COVID-19. As a result we have entered a global state of profound uncertainty and anxiety.

The current pandemic has not only shown vulnerabilities in our public health systems but has also made visible our failure to re-use data between the public and private sectors — what we call data collaboratives — to inform decision makers how to fight dynamic threats like the novel Coronavirus.

We have known for years that the re-use of aggregated and anonymized data — including from telecommunications, social media, and satellite feeds — can improve traditional models for tracking disease propagation. Telecommunications data has, for instance, been re-used to support the response to Ebola in Africa (Orange) and swine flu in Mexico (Telefónica). Social media data has been re-used to understand public perceptions around Zika in Brazil (Facebook). Satellite data has been used to track seasonal measles in Niger using nighttime lights. Geospatial data has similarly supported malaria surveillance and eradication efforts in Sub-Saharan Africa. In general, many infectious diseases have been monitored using mobile phones and mobility.

The potential and realized contributions of these and other data collaboratives reveal that the supply of and demand for data and data expertise are widely dispersed. They are spread across government, the private sector, and civil society and often poorly matched.

Much data needed by researchers is never made accessible to those who could productively put it to use while much data that is released is never used in a systematic and sustainable way during and post crisis.

This failure results in tremendous inefficiencies and costly delays in how we respond. It means lost opportunities to save lives and a persistent lack of preparation for future threats….(More)”. SIGN AND JOIN HERE.

See also Living Repository of Data4COVID19 Collaboratives.

The Coronavirus Tech Handbook


About: “The Coronavirus Tech Handbook provides a space for technologists, specialists, civic organisations and public & private institutions to collaborate on a rapid and sophisticated response to the coronavirus outbreak. It is an active and evolving resource with thousands of expert contributors.

In less than two weeks it has grown to cover areas including:

  • Detailed guidance for doctors and nurses,
  • Advice and tools for educators adjusting to remote teaching,
  • A community of open-source ventilator designers,
  • Comprehensive data and models for forecasting the spread of the virus.

Coronavirus Tech Handbook’s goal is to create a rapidly evolving open source technical knowledge base that will help all institutions across civil society and the public sector collaborate to fight the outbreak. 

Coronavirus Tech Handbook is not a place for the public to get advice, but a place for specialists to collaborate and make sure the best solutions are quickly shared and deployed….(More)”.

How scientists are crowdsourcing a coronavirus treatment


Article by Evan Nicole Brown: “… There’s currently no cure for COVID-19, but scientists are working on drugs that could help slow its spread. Fortunately, citizens can get involved in the process.

Foldit is an online video game that challenges players to fold various proteins into shapes where they are stable. Generally, folding proteins allows scientists (and citizens) to design new proteins from scratch, but in the case of coronavirus, Foldit players are trying to design the drugs to combat it. “Coronavirus has a ‘spike’ protein that it uses to recognize human cells,” says Brian Koepnick, a biochemist and researcher with the University of Washington’s Institute for Protein Design who has been using Foldit for protein research for six years. “Foldit players are designing new protein drugs that can bind to the COVID spike and block this recognition, [which could] potentially stop the virus from infecting more cells in an individual who has already been exposed to the virus.”

“In Foldit, you change the shape of a protein model to optimize your score. This score is actually a sophisticated calculation of the fold’s potential energy,” says Koepnick, adding that professional researchers use an identical score function in their work. “The coronavirus puzzles are set up such that high-scoring models have a better chance of actually binding to the target spike protein.” Ultimately, high-scoring solutions are analyzed by researchers and considered for real-world use….(More)”.
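The scoring idea Koepnick describes (reshape a model, watch an energy-based score improve) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `toy_energy` function, the 2D bead chain, and the `hill_climb` search are hypothetical stand-ins, not Foldit's actual score function or the way players or researchers search for folds.

```python
import math
import random


def toy_energy(points):
    """Hypothetical potential energy of a 2D bead chain (not Foldit's function).

    Neighboring beads are joined by springs that prefer a distance of 1.0;
    all other pairs get a Lennard-Jones-like term with its minimum at 1.0.
    """
    energy = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = max(math.dist(points[i], points[j]), 1e-6)  # avoid division by zero
            if j == i + 1:
                energy += (d - 1.0) ** 2  # bond spring
            else:
                energy += (1.0 / d) ** 12 - 2.0 * (1.0 / d) ** 6
    return energy


def score(points):
    """Higher score means lower energy, mirroring the 'optimize your score' idea."""
    return -toy_energy(points)


def hill_climb(points, steps=2000, step_size=0.1, seed=0):
    """Nudge one random bead at a time and keep only moves that raise the score."""
    rng = random.Random(seed)
    best = [tuple(p) for p in points]
    best_score = score(best)
    for _ in range(steps):
        i = rng.randrange(len(best))
        x, y = best[i]
        candidate = list(best)
        candidate[i] = (
            x + rng.uniform(-step_size, step_size),
            y + rng.uniform(-step_size, step_size),
        )
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Start from a straight chain of six beads and let the search reshape it.
start = [(float(i), 0.0) for i in range(6)]
folded, final_score = hill_climb(start)
print(f"start score: {score(start):.3f}, final score: {final_score:.3f}")
```

Because the search only accepts improving moves, the final score can never be lower than the starting one; a real protein energy function has far more terms and a far larger search space.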

Like Zika, The Public Is Heading To Wikipedia During The COVID-19 Coronavirus Pandemic


Farah Qaiser at Forbes: “A new study out in the PLOS Computational Biology journal shows that public attention in the midst of the Zika virus epidemic was largely driven by media coverage, rather than the epidemic’s magnitude or extent, highlighting the importance of mass media coverage when it comes to public health. This is reflected in the ongoing COVID-19 situation, where to date, the main 2019–20 coronavirus pandemic Wikipedia page has over ten million page views.

The 2015-2016 Zika virus epidemic began in Northeastern Brazil, and spread across South and North America. The Zika virus was largely spread by infected Aedes mosquitoes, and its symptoms included fever, headache, itching, and muscle pain. It could also be transmitted from pregnant women to their fetuses, causing microcephaly, a condition in which a baby’s head is much smaller than expected.

Similar to the ongoing COVID-19 situation, the media coverage around the Zika virus epidemic shaped public opinion and awareness.

“We knew that it was relevant, and very important, for public health to understand how the media and news shapes the attention of [the] public during epidemic outbreaks,” says Michele Tizzoni, a principal investigator based at the Institute for Scientific Interchange (ISI) Foundation. …

Today, the 2019–20 coronavirus pandemic Wikipedia page has around ten million page views. As per Toby Negrin, the Wikimedia Foundation’s Chief Product Officer, this page has been edited over 12K times by nearly 1,900 different editors. The page is currently semi-protected – a common practice for Wikipedia pages that are relevant to current news stories.

In an email, Negrin shared that “the day after the World Health Organization classified COVID-19 as a pandemic on March 11th, the main English Wikipedia article about the pandemic had nearly 1.1 million views, an increase of nearly 30% from the day before the WHO’s announcement (on March 10th, it had just over 809,000 views).” This is similar to the peaks in Wikipedia attention observed when official announcements took place during the Zika virus epidemic.

In addition, initial data from Tizzoni’s research group shows that the lockdown in Italy has resulted in a 50% or more decrease in movement between provinces. Similarly, Negrin notes that since the national lockdown in Italy, “total pageviews from Italy to all Wikimedia projects increased by nearly 30% over where they were at the same time last year.”

With increased public awareness during epidemics, tackling misinformation is critical. This remains important at Wikipedia.

“When it comes to documenting current events on Wikipedia, volunteers take even greater care to get the facts right,” stated Negrin, and pointed out that there is a page dedicated to misinformation during this pandemic, which has received over half a million views….(More)”.
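The day-over-day figures Negrin cites can be sanity-checked with simple percentage arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the only inputs taken from the text are the rounded figures in the quote (just over 809,000 views on March 10th and an increase of nearly 30%). The exact view counts are not given, so the result is an approximation.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percentage change from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


def value_after_increase(before: float, percent: float) -> float:
    """Return the value reached after growing `before` by `percent` percent."""
    return before * (1.0 + percent / 100.0)


# Rounded figures from the quote; the exact counts are not given in the text.
views_march_10 = 809_000   # "just over 809,000 views"
quoted_increase = 30.0     # "nearly 30%"

# Views on the day after the announcement implied by a 30% increase.
implied_views = value_after_increase(views_march_10, quoted_increase)
print(f"implied views after a 30% increase: {implied_views:,.0f}")
```

A 30% increase on roughly 809,000 views works out to a little over 1.05 million, which is in line with the "nearly 1.1 million" figure in the quote once rounding is taken into account.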

World Justice Project (WJP) Rule of Law Index®


Interactive Overview: “The World Justice Project (WJP) Rule of Law Index® is the world’s leading source for original, independent data on the rule of law. Now covering 128 countries and jurisdictions, the Index relies on national surveys of more than 130,000 households and 4,000 legal practitioners and experts to measure how the rule of law is experienced and perceived around the world.

Effective rule of law reduces corruption, combats poverty and disease, and protects people from injustices large and small. It is the foundation for communities of justice, opportunity, and peace—underpinning development, accountable government, and respect for fundamental rights.

Learn more about the rule of law and explore the full WJP Rule of Law Index 2020 report, including PDF report download, data insights, methodology, and more at the Index report resources page….(More)”.