Book edited by Btihaj Ajana: “…provides an empirical and philosophical investigation of self-tracking practices. In recent years, there has been an explosion of apps and devices that enable the data capturing and monitoring of everyday activities, behaviours and habits. Encouraged by movements such as the Quantified Self, a growing number of people are embracing this culture of quantification and tracking in the spirit of improving their health and wellbeing.
The aim of this book is to enhance understanding of this fast-growing trend, bringing together scholars who are working at the forefront of the critical study of self-tracking practices. Each chapter provides a different conceptual lens through which one can examine these practices, while grounding the discussion in relevant empirical examples.
From phenomenology to discourse analysis, from questions of identity, privacy and agency to issues of surveillance and tracking at the workplace, this edited collection takes on a wide, and yet focused, approach to the timely topic of self-tracking. It constitutes a useful companion for scholars, students and everyday users interested in the Quantified Self phenomenon…(More)”.
A Really Bad Blockchain Idea: Digital Identity Cards for Rohingya Refugees
Wayan Vota at ICTworks: “The Rohingya Project claims to be a grassroots initiative that will empower Rohingya refugees with a blockchain-leveraged financial ecosystem tied to digital identity cards….
What Could Possibly Go Wrong?
Concerns about Rohingya data collection are not new, so Linda Raftree's Facebook post about blockchain for biometrics started a spirited discussion on this escalation of techno-utopianism. Several people put forth great points about the Rohingya Project’s potential failings. For me, four key questions emerged from the discussion that we should all be debating:
1. Who Determines Ethnicity?
Ethnicity isn’t a scientific way to categorize humans. Ethnic groups are based on human constructs such as common ancestry, language, society, culture, or nationality. Who are the Rohingya Project to be the ones determining who is Rohingya or not? And what is this rigorous assessment they have that will do what science cannot?
Might it be better not to perpetuate the very divisions that cause these issues? Or at the very least, let people self-determine their own ethnicity.
2. Why Digitally Identify Refugees?
Let’s say that we could group a people based on objective metrics. Should we? Especially if that group is persecuted where it currently lives and in many of its surrounding countries? Wouldn’t making a list of who is persecuted be a handy reference for those who seek to persecute more?
Instead, shouldn’t we focus on changing the mindset of the persecutors and stop the persecution?
3. Why Blockchain for Biometrics?
How could linking a highly persecuted people’s biometric information, such as fingerprints, iris scans, and photographs, to a public, universal, and immutable distributed ledger be a good thing?
Might it be highly irresponsible to digitize all that information? Couldn’t that data be used by nefarious actors to perpetuate new and worse exploitation of Rohingya? India has already lost Aadhaar data, and Equifax lost Americans’ data. How will the small, lightly funded Rohingya Project do better?
Could it be possible that old-fashioned paper forms are a better solution than digital identity cards? Maybe laminate them for greater durability, but paper identity cards can be hidden, even destroyed if needed, to conceal information that could be used against the owner.
4. Why Experiment on the Powerless?
Rohingya refugees already suffer from massive power imbalances, and now they’ll be asked to give up their digital privacy and use experimental technology, as part of an NGO’s experiment, in order to get needed services.
It’s not like they’ll have the agency to say no. They are homeless, often penniless refugees who will probably have no realistic way to opt out of digital identity cards, even if they don’t want to be experimented on while they flee persecution….(More)”
Artificial intelligence and privacy
Report by the Norwegian Data Protection Authority (DPA): “…If people cannot trust that information about them is being handled properly, it may limit their willingness to share information – for example with their doctor, or on social media. If we find ourselves in a situation in which sections of the population refuse to share information because they feel that their personal integrity is being violated, we will be faced with major challenges to our freedom of speech and to people’s trust in the authorities.
A refusal to share personal information will also represent a considerable challenge with regard to the commercial use of such data in sectors such as the media, retail trade and finance services.
About the report
This report elaborates on the legal opinions and the technologies described in the 2014 report «Big Data – privacy principles under pressure». In this report we will provide greater technical detail in describing artificial intelligence (AI), while also taking a closer look at four relevant AI challenges associated with the data protection principles embodied in the GDPR:
- Fairness and discrimination
- Purpose limitation
- Data minimisation
- Transparency and the right to information
This represents a selection of the data protection concerns that are, in our opinion, most relevant to the use of AI today.
The target group for this report consists of people who work with, or who for other reasons are interested in, artificial intelligence. We hope that engineers, social scientists, lawyers and other specialists will find this report useful….(More) (Download Report)”.
A Roadmap to a Nationwide Data Infrastructure for Evidence-Based Policymaking
Introduction by Julia Lane and Andrew Reamer of a Special Issue of the Annals of the American Academy of Political and Social Science: “Throughout the United States, there is broad interest in expanding the nation’s capacity to design and implement public policy based on solid evidence. That interest has been stimulated by the new types of data that are available that can transform the way in which policy is designed and implemented. Yet progress in making use of sensitive data has been hindered by the legal, technical, and operational obstacles to access for research and evaluation. Progress has also been hindered by an almost exclusive focus on the interest and needs of the data users, rather than the interest and needs of the data providers. In addition, data stewardship is largely artisanal in nature.
There are very real consequences that result from lack of action. State and local governments are often hampered in their capacity to effectively mount and learn from innovative efforts. Although jurisdictions often have treasure troves of data from existing programs, the data are stove-piped, underused, and poorly maintained. The experience reported by one large city public health commissioner is too common: “We commissioners meet periodically to discuss specific childhood deaths in the city. In most cases, we each have a thick file on the child or family. But the only time we compare notes is after the child is dead.”1 In reality, most localities lack the technical, analytical, staffing, and legal capacity to make effective use of existing and emerging resources.
It is our sense that fundamental changes are necessary and a new approach must be taken to building data infrastructures. In particular,
- Privacy and confidentiality issues must be addressed at the beginning—not added as an afterthought.
- Data providers must be involved as key stakeholders throughout the design process.
- Workforce capacity must be developed at all levels.
- The scholarly community must be engaged to identify the value to research and policy….
To develop a roadmap for the creation of such an infrastructure, the Bill and Melinda Gates Foundation, together with the Laura and John Arnold Foundation, hosted a day-long workshop of more than sixty experts to discuss the findings of twelve commissioned papers and their implications for action. This volume of The ANNALS showcases those twelve articles. The workshop papers were grouped into three thematic areas: privacy and confidentiality, the views of data producers, and comprehensive strategies that have been used to build data infrastructures in other contexts. The authors and the attendees included computer scientists, social scientists, practitioners, and data producers.
This introductory article places the research in both an historical and a current context. It also provides a framework for understanding the contribution of the twelve articles….(More)”.
How the Data That Internet Companies Collect Can Be Used for the Public Good
Stefaan G. Verhulst and Andrew Young at Harvard Business Review: “…In particular, the vast streams of data generated through social media platforms, when analyzed responsibly, can offer insights into societal patterns and behaviors. Such insights are hard to generate with existing social science methods. All this information poses its own problems, of complexity and noise, of risks to privacy and security, but it also represents tremendous potential for mobilizing new forms of intelligence.
In a recent report, we examine ways to harness this potential while limiting and addressing the challenges. Developed in collaboration with Facebook, the report seeks to understand how public and private organizations can join forces to use social media data — through data collaboratives — to mitigate and perhaps solve some of our most intractable policy dilemmas.
Data Collaboratives: Public-Private Partnerships for Our Data Age
For all of data’s potential to address public challenges, most data generated today is collected by the private sector. Typically ensconced in corporate databases, and tightly held in order to maintain competitive advantage, this data contains tremendous possible insights and avenues for policy innovation. But because the analytical expertise brought to bear on it is narrow, and limited by private ownership and access restrictions, its vast potential often goes untapped.
Data collaboratives offer a way around this limitation. They represent an emerging public-private partnership model, in which participants from different areas, including the private sector, government, and civil society, can come together to exchange data and pool analytical expertise in order to create new public value. While still an emerging practice, examples of such partnerships now exist around the world, across sectors and public policy domains….
Professionalizing the Responsible Use of Private Data for Public Good
For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized concept of data stewardship within corporations. Today, each attempt to establish a cross-sector partnership built on the analysis of social media data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.
As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.
By establishing data stewardship as a corporate function, recognized within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked.
If early efforts toward this end — from initiatives such as Facebook’s Data for Good efforts in the social media space and MasterCard’s Data Philanthropy approach around finance data — are meaningfully scaled and expanded, data stewards across the private sector can act as change agents responsible for determining what data to share and when, how to protect data, and how to act on insights gathered from the data.
Still, many companies (and others) continue to balk at the prospect of sharing “their” data, which is an understandable response given the reflex to guard corporate interests. But our research has indicated that many benefits can accrue not only to data recipients but also to those who share it. Data collaboration is not a zero-sum game.
With support from the Hewlett Foundation, we are embarking on a two-year project toward professionalizing data stewardship (and the use of data collaboratives) and establishing well-defined data responsibility approaches. We invite others to join us in working to transform this practice into a widespread, impactful means of leveraging private-sector assets, including social media data, to create positive public-sector outcomes around the world….(More)”.
Open Data Risk Assessment
Report by the Future of Privacy Forum: “The transparency goals of the open data movement serve important social, economic, and democratic functions in cities like Seattle. At the same time, some municipal datasets about the city and its citizens’ activities carry inherent risks to individual privacy when shared publicly. In 2016, the City of Seattle declared in its Open Data Policy that the city’s data would be “open by preference,” except when doing so may affect individual privacy. To ensure its Open Data Program effectively protects individuals, Seattle committed to performing an annual risk assessment and tasked the Future of Privacy Forum (FPF) with creating and deploying an initial privacy risk assessment methodology for open data.
This Report provides tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs. Although there is a growing body of research regarding open data privacy, open data managers and departmental data owners need to be able to employ a standardized methodology for assessing the privacy risks and benefits of particular datasets internally, without access to a bevy of expert statisticians, privacy lawyers, or philosophers. By optimizing its internal processes and procedures, developing and investing in advanced statistical disclosure control strategies, and following a flexible, risk-based assessment process, the City of Seattle – and other municipalities – can build mature open data programs that maximize the utility and openness of civic data while minimizing privacy risks to individuals and addressing community concerns about ethical challenges, fairness, and equity.
This Report first describes inherent privacy risks in an open data landscape, with an emphasis on potential harms related to re-identification, data quality, and fairness. To address these risks, the Report includes a Model Open Data Benefit-Risk Analysis (“Model Analysis”). The Model Analysis evaluates the types of data contained in a proposed open dataset, the potential benefits – and concomitant risks – of releasing the dataset publicly, and strategies for effective de-identification and risk mitigation. This holistic assessment guides city officials to determine whether to release the dataset openly, in a limited access environment, or to withhold it from publication (absent countervailing public policy considerations). …(More)”.
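The tiered outcome the Model Analysis guides officials toward (open release, limited access, or withholding) can be illustrated with a toy triage function. Note: the scoring weights, thresholds, and function name below are invented for illustration and are not taken from the FPF Model Analysis itself; only the general framing (benefit versus re-identification risk, tiered release) comes from the report.

```python
# Toy sketch of a benefit-risk triage for a candidate open dataset.
# Weights and thresholds are illustrative assumptions, NOT the FPF methodology.

def triage_dataset(direct_identifiers: bool,
                   quasi_identifier_count: int,
                   record_level: bool,
                   public_benefit: int) -> str:
    """Return a release tier: 'open', 'limited', or 'withhold'.

    public_benefit is a 0-10 judgment supplied by the data owner.
    """
    if direct_identifiers:
        # Names, addresses, SSNs: never release untreated.
        return "withhold"
    # Crude risk score: quasi-identifiers matter more in record-level data,
    # where individual rows can be linked to individual people.
    risk = quasi_identifier_count * (3 if record_level else 1)
    if risk <= public_benefit:
        return "open"
    if risk <= 2 * public_benefit:
        return "limited"  # e.g. release only to vetted researchers
    return "withhold"

if __name__ == "__main__":
    # High-benefit dataset with few quasi-identifiers vs. one with direct IDs.
    print(triage_dataset(False, 2, True, 8))
    print(triage_dataset(True, 0, False, 10))
```

A real assessment would of course weigh many more factors (data quality, fairness, community concerns), but the shape of the decision — score, compare, route to a tier — is the same.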
They Are Watching You—and Everything Else on the Planet
Cover article by Robert Draper for Special Issue of the National Geographic: “Technology and our increasing demand for security have put us all under surveillance. Is privacy becoming just a memory?…
In 1949, amid the specter of European authoritarianism, the British novelist George Orwell published his dystopian masterpiece 1984, with its grim admonition: “Big Brother is watching you.” As unsettling as this notion may have been, “watching” was a quaintly circumscribed undertaking back then. That very year, 1949, an American company released the first commercially available CCTV system. Two years later, in 1951, Kodak introduced its Brownie portable movie camera to an awestruck public.
Today more than 2.5 trillion images are shared or stored on the Internet annually—to say nothing of the billions more photographs and videos people keep to themselves. By 2020, one telecommunications company estimates, 6.1 billion people will have phones with picture-taking capabilities. Meanwhile, in a single year an estimated 106 million new surveillance cameras are sold. More than three million ATMs around the planet stare back at their customers. Tens of thousands of cameras known as automatic number plate recognition devices, or ANPRs, hover over roadways—to catch speeding motorists or parking violators but also, in the case of the United Kingdom, to track the comings and goings of suspected criminals. The untallied but growing number of people wearing body cameras now includes not just police but also hospital workers and others who aren’t law enforcement officers. Proliferating as well are personal monitoring devices—dash cams, cyclist helmet cameras to record collisions, doorbells equipped with lenses to catch package thieves—that are fast becoming a part of many a city dweller’s everyday arsenal. Even less quantifiable, but far more vexing, are the billions of images of unsuspecting citizens captured by facial-recognition technology and stored in law enforcement and private-sector databases over which our control is practically nonexistent.
Those are merely the “watching” devices that we’re capable of seeing. Presently the skies are cluttered with drones—2.5 million of which were purchased in 2016 by American hobbyists and businesses. That figure doesn’t include the fleet of unmanned aerial vehicles used by the U.S. government not only to bomb terrorists in Yemen but also to help stop illegal immigrants entering from Mexico, monitor hurricane flooding in Texas, and catch cattle thieves in North Dakota. Nor does it include the many thousands of airborne spying devices employed by other countries—among them Russia, China, Iran, and North Korea.
We’re being watched from the heavens as well. More than 1,700 satellites monitor our planet. From a distance of about 300 miles, some of them can discern a herd of buffalo or the stages of a forest fire. From outer space, a camera clicks and a detailed image of the block where we work can be acquired by a total stranger….
This is—to lift the title from another British futurist, Aldous Huxley—our brave new world. That we can see it coming is cold comfort since, as Carnegie Mellon University professor of information technology Alessandro Acquisti says, “in the cat-and-mouse game of privacy protection, the data subject is always the weaker side of the game.” Simply submitting to the game is a dispiriting proposition. But to actively seek to protect one’s privacy can be even more demoralizing. University of Texas American studies professor Randolph Lewis writes in his new book, Under Surveillance: Being Watched in Modern America, “Surveillance is often exhausting to those who really feel its undertow: it overwhelms with its constant badgering, its omnipresent mysteries, its endless tabulations of movements, purchases, potentialities.”
The desire for privacy, Acquisti says, “is a universal trait among humans, across cultures and across time. You find evidence of it in ancient Rome, ancient Greece, in the Bible, in the Quran. What’s worrisome is that if all of us at an individual level suffer from the loss of privacy, society as a whole may realize its value only after we’ve lost it for good.”…(More)”.
Extracting crowd intelligence from pervasive and social big data
Introduction by
With the prevalence of ubiquitous computing devices (smartphones, wearable devices, etc.) and social network services (Facebook, Twitter, etc.), humans are generating massive digital traces continuously in their daily life. Considering the invaluable crowd intelligence residing in these pervasive and social big data, a spectrum of opportunities is emerging to enable promising smart applications for easing individual life, increasing company profit, as well as facilitating city development. However, the nature of big data also poses fundamental challenges for the techniques and applications relying on pervasive and social big data, from multiple perspectives such as algorithm effectiveness, computation speed, energy efficiency, user privacy, server security, data heterogeneity and system scalability. This special issue presents the state-of-the-art research achievements in addressing these challenges. After the rigorous review process of reviewers and guest editors, eight papers were accepted as follows.
The first paper “Automated recognition of hypertension through overnight continuous HRV monitoring” by Ni et al. proposes a non-invasive way to differentiate hypertension patients from healthy people with pervasive sensors such as a waist belt. To this end, the authors train a machine learning model on heart rate data sensed from waist belts worn by a crowd of people, and the experiments show that the detection accuracy is around 93%.
The second paper “The workforce analyzer: group discovery among LinkedIn public profiles” by Dai et al. describes two methods for discovering user groups among LinkedIn public profiles, one based on K-means and the other on SVM. The authors contrast the results of both methods and provide insights about the trending professional orientations of the workforce from an online perspective.
The third paper “Tweet and followee personalized recommendations based on knowledge graphs” by Pla Karidi et al. presents an efficient semantic recommendation method that helps users filter the Twitter stream for interesting content. The foundation of this method is a knowledge graph that can represent all user topics of interest as a variety of concepts, objects, events, persons, entities and locations, together with the relations between them. An important advantage of the method is that it reduces the effects of problems such as over-recommendation and over-specialization.
The fourth paper “CrowdTravel: scenic spot profiling by using heterogeneous crowdsourced data” by Guo et al. proposes CrowdTravel, a multi-source social media data fusion approach for multi-aspect tourism information perception, which can provide travelling assistance for tourists by crowd intelligence mining. Experiments over a dataset of several popular scenic spots in Beijing and Xi’an, China, indicate that the authors’ approach attains fine-grained characterization for the scenic spots and delivers excellent performance.
The fifth paper “Internet of Things based activity surveillance of defence personnel” by Bhatia et al. presents a comprehensive IoT-based framework for analyzing the integrity of defence personnel in light of their daily activities. Specifically, an Integrity Index Value is defined for each member of defence personnel, based on their social engagements and activities, to detect vulnerabilities to national security. In addition, a probabilistic decision-tree-based automated decision-making scheme is presented to aid defence officials in assessing a person’s integrity from their various activities.
The sixth paper “Recommending property with short days-on-market for estate agency” by Mou et al. proposes an appraisal framework that automatically recommends estates with short days-on-market, using transaction data and profile information crawled from websites. Both the spatial and temporal characteristics of an estate are integrated into the framework. The results show that the proposed framework can accurately identify about 78% of such estates.
The seventh paper “An anonymous data reporting strategy with ensuring incentives for mobile crowd-sensing” by Li et al. proposes a system and a strategy that ensure anonymous data reporting while preserving incentives. The proposed protocol is arranged in five stages that mainly leverage three concepts: (1) slot reservation based on shuffling, (2) data submission based on bulk transfer and multi-player DC-nets, and (3) an incentive mechanism based on blind signatures.
The last paper “Semantic place prediction from crowd-sensed mobile phone data” by Celik et al. semantically classifies places visited by smartphone users, applying machine learning algorithms to data collected from the sensors and wireless interfaces available on the phones, as well as phone usage patterns such as battery level and time-related information. For this study, the authors collected data from 15 participants at Galatasaray University over one month and evaluated different classification algorithms such as decision tree, random forest, k-nearest neighbour, naive Bayes, and multi-layer perceptron….(More)”.
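Several of the papers above share one supervised-learning pattern: featurize sensor traces, train a tree-based classifier, and report held-out accuracy. The sketch below illustrates that pattern on synthetic stand-in data; the feature names and distributions are invented for illustration and are not the datasets or models used by any of the authors.

```python
# Minimal sketch of the featurize -> train -> evaluate pattern described in
# the special issue (e.g. HRV-based hypertension detection, semantic place
# prediction). All data here is synthetic; features are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 600
# Pretend per-person features: mean heart rate, HRV spread, activity level.
healthy = np.column_stack([rng.normal(65, 5, n),
                           rng.normal(50, 8, n),
                           rng.normal(0.6, 0.1, n)])
hypertensive = np.column_stack([rng.normal(78, 5, n),
                                rng.normal(32, 8, n),
                                rng.normal(0.4, 0.1, n)])
X = np.vstack([healthy, hypertensive])
y = np.array([0] * n + [1] * n)  # 0 = healthy, 1 = hypertensive

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"held-out accuracy: {acc:.2f}")
```

On well-separated synthetic classes like these the accuracy is unrealistically high; the reported 93% and 78% figures in the papers reflect far noisier real-world sensor data.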
The Future Computed: Artificial Intelligence and its role in society
Brad Smith at the Microsoft Blog: “Today Microsoft is releasing a new book, The Future Computed: Artificial Intelligence and its role in society. The two of us have written the foreword for the book, and our teams collaborated to write its contents. As the title suggests, the book provides our perspective on where AI technology is going and the new societal issues it has raised.
On a personal level, our work on the foreword provided an opportunity to step back and think about how much technology has changed our lives over the past two decades and to consider the changes that are likely to come over the next 20 years. In 1998, we both worked at Microsoft, but on opposite sides of the globe. While we lived on separate continents and in quite different cultures, we shared similar experiences and daily routines which were managed by manual planning and movement. Twenty years later, we take for granted the digital world that was once the stuff of science fiction.
Technology – including mobile devices and cloud computing – has fundamentally changed the way we consume news, plan our day, communicate, shop and interact with our family, friends and colleagues. Two decades from now, what will our world look like? At Microsoft, we imagine that artificial intelligence will help us do more with one of our most precious commodities: time. By 2038, personal digital assistants will be trained to anticipate our needs, help manage our schedule, prepare us for meetings, assist as we plan our social lives, reply to and route communications, and drive cars.
Beyond our personal lives, AI will enable breakthrough advances in areas like healthcare, agriculture, education and transportation. It’s already happening in impressive ways.
But as we’ve witnessed over the past 20 years, new technology also inevitably raises complex questions and broad societal concerns. As we look to a future powered by a partnership between computers and humans, it’s important that we address these challenges head on.
How do we ensure that AI is designed and used responsibly? How do we establish ethical principles to protect people? How should we govern its use? And how will AI impact employment and jobs?
To answer these tough questions, technologists will need to work closely with government, academia, business, civil society and other stakeholders. At Microsoft, we’ve identified six ethical principles – fairness, reliability and safety, privacy and security, inclusivity, transparency, and accountability – to guide the cross-disciplinary development and use of artificial intelligence. The better we understand these or similar issues — and the more technology developers and users can share best practices to address them — the better served the world will be as we contemplate societal rules to govern AI.
We must also pay attention to AI’s impact on workers. What jobs will AI eliminate? What jobs will it create? If there has been one constant over 250 years of technological change, it has been the ongoing impact of technology on jobs — the creation of new jobs, the elimination of existing jobs and the evolution of job tasks and content. This too is certain to continue.
Some key conclusions are emerging….
The Future Computed is available here and additional content related to the book can be found here.”
Big Data and medicine: a big deal?
V. Mayer-Schönberger and E. Ingelsson in the Journal of Internal Medicine: “Big Data promises huge benefits for medical research. Looking beyond superficial increases in the amount of data collected, we identify three key areas where Big Data differs from conventional analyses of data samples: (i) data are captured more comprehensively relative to the phenomenon under study; this reduces some bias but surfaces important trade-offs, such as between data quantity and data quality; (ii) data are often analysed using machine learning tools, such as neural networks rather than conventional statistical methods resulting in systems that over time capture insights implicit in data, but remain black boxes, rarely revealing causal connections; and (iii) the purpose of the analyses of data is no longer simply answering existing questions, but hinting at novel ones and generating promising new hypotheses. As a consequence, when performed right, Big Data analyses can accelerate research.
Because Big Data approaches differ so fundamentally from small data ones, research structures, processes and mindsets need to adjust. The latent value of data is being reaped through repeated reuse of data, which runs counter to existing practices not only regarding data privacy, but data management more generally. Consequently, we suggest a number of adjustments such as boards reviewing responsible data use, and incentives to facilitate comprehensive data sharing. As data’s role changes to a resource of insight, we also need to acknowledge the importance of collecting and making data available as a crucial part of our research endeavours, and reassess our formal processes from career advancement to treatment approval….(More)”.