Bilingual


baɪˈlɪŋgwəl

Practitioners across disciplines who possess both domain knowledge and data science expertise.

The Governance Lab (GovLab) at the NYU Tandon School of Engineering just launched the 100 Questions Initiative, “an effort to identify the most important societal questions whose answers can be found in data and data science if the power of data collaboratives is harnessed.”

The initiative will seek to identify questions that could help unlock the potential of data and data science in solving various global and domestic issues, including, but not limited to, climate change, economic inequality, and migration. These questions will be sourced from individuals who have expertise in both a public issue and data science, or what the GovLab calls “bilinguals.”

Tom Kalil, the Chief Innovation Officer at Schmidt Futures, argues that the emergent use of data science and machine learning in the public sector will increase the demand for individuals “who speak data science and social sector.”

Similarly within the business context, David Meer wrote that “being bilingual isn’t just a matter of native English speakers learning how to conjugate verbs in French or Spanish. Rather, it’s important that businesses cultivate talent that can simultaneously speak the language of advanced data analysis and nuts-and-bolts business operations. As data analysis becomes a more prevalent and powerful lever for strategy and growth, organizations increasingly need bilinguals to form the bridge between the work of advanced data scientists and business decision makers.”

For more info, visit www.the100questions.org

Digital Serfdom


ˈdɪʤətəl ˈsɜrfdəm

A condition where consumers give up their personal and private information in order to be able to use a particular product or service.

Serfdom is a system of forced labor that exists in a feudal society. It was very common in Europe during the Middle Ages. In this system, serfs or peasants perform a variety of labor for their lords in exchange for protection from bandits and a small piece of land that they can cultivate for themselves. Serfs are also required to pay some form of tax, often in the form of chickens or crops yielded from their piece of land.

Hassan Khan, writing in The Next Web, points out that the decline of property ownership indicates that we are living in digital serfdom. In an article, he says:

“The percentage of households without a car is increasing. Ride-hailing services have multiplied. Netflix boasts over 188 million subscribers. Spotify gains ten million paid members every five to six months.

“The model of “impermanence” has become the new normal. But there’s still one place where permanence finds its home, with over two billion active monthly users, Facebook has become a platform of record for the connected world. If it’s not on social media, it may as well have never happened.”

Joshua A. T. Fairfield elaborates on this phenomenon in his book “Owned: Property, Privacy, and the New Digital Serfdom.” Fairfield discusses his book in an article in The Conversation, stating that:

“The issue of who gets to control property has a long history. In the feudal system of medieval Europe, the king owned almost everything, and everyone else’s property rights depended on their relationship with the king. Peasants lived on land granted by the king to a local lord, and workers didn’t always even own the tools they used for farming or other trades like carpentry and blacksmithing.

[…]

“Yet the expansion of the internet of things seems to be bringing us back to something like that old feudal model, where people didn’t own the items they used every day. In this 21st-century version, companies are using intellectual property law – intended to protect ideas – to control physical objects consumers think they own.”

In other words, Fairfield is suggesting that the devices and services we use (iPhones, Fitbits, Roombas, digital door locks, Spotify, Uber, and many more) are constantly capturing data about our behaviors. By using these products, consumers have no choice but to trade their personal data in order to access the full functionality of these devices or services. This data is used by private corporations for targeted advertising, among other purposes. This system of digital serfdom binds consumers to private corporations that dictate the terms of use for their products or services.

Janet Burns wrote about Alex Rosenblat’s “UBERLAND: How Algorithms Are Rewriting The Rules Of Work” and gave some examples of how algorithms use personal data to manipulate consumers’ behaviors:

“For example, algorithms in control of assigning and pricing rides have often surprised drivers and riders, quietly taking into account other traffic in the area, regionally adjusted rates, and data on riders and drivers themselves.

“In recent years, we’ve seen similar adjustments happen behind the scenes in online shopping, as UBERLAND points out: major retailers have tweaked what price different customers see for the same item based on where they live, and how feasibly they could visit a brick-and-mortar store for it.”

To conclude, an excerpt from Fairfield’s book cautions: 

“In the coming decade, if we do not take back our ownership rights, the same will be said of our self-driving cars and software-enabled homes. We risk becoming digital peasants, owned by software and advertising companies, not to mention overreaching governments.”

Sources and Further readings:

Fairfield, Joshua A. T. “Owned: Property, Privacy, and the New Digital Serfdom.” Cambridge University Press. https://www.cambridge.org/gb/academic/subjects/law/property-law/owned-property-privacy-and-new-digital-serfdom#JiVMgvsMOg6Zer5x.97

Fairfield, Joshua A.T. “The ‘internet of things’ is sending us back to the Middle Ages.” The Conversation. https://theconversation.com/the-internet-of-things-is-sending-us-back-to-the-middle-ages-81435 

Burns, Janet. “Algorithms And ‘Uberland’ Are Driving Us Into Digital Serfdom.” Forbes. https://www.forbes.com/sites/janetwburns/2018/10/28/algorithms-and-uberland-are-driving-us-into-technocratic-serfdom/#7887dccc6705

Khan, Hassan. “We’re living in digital serfdom — trading privacy for convenience.” The Next Web. https://thenextweb.com/contributors/2018/11/10/were-living-in-a-digital-serfdom-trading-privacy-for-convenience/

Self-Sovereign Identity


sɛlf-ˈsɑvrən aɪˈdɛntəti

A decentralized identification mechanism that gives individuals control over what, when, and to whom their personal information is shared.

An identification document (ID) is a crucial part of every individual’s life. It is often a prerequisite for accessing a variety of services, ranging from creating a bank account to enrolling children in school to buying alcoholic beverages to signing up for an email account to voting in an election, and it also serves as proof of simply being. The current identification system poses fundamental problems, which a field report by the GovLab on Blockchain and Identity frames as follows:

“One of the central challenges of modern identity is its fragmentation and variation across platform and individuals. There are also issues related to interoperability between different forms of identity, and the fact that different identities confer very different privileges, rights, services or forms of access. The universe of identities is vast and manifold. Every identity in effect poses its own set of challenges and difficulties—and, of course, opportunities.”

A report published by New America echoed this point by arguing that:

“Societally, we lack a coherent approach to regulating the handling of personal data. Users share and generate far too much data—both personally identifiable information (PII) and metadata, or “data exhaust”—without a way to manage it. Private companies, by storing an increasing amount of PII, are taking on an increasing level of risk. Solution architects are recreating the wheel, instead of flying over the treacherous terrain we have just described.”

Self-sovereign identity (SSI) is often presented as a solution to the identity problems described above. Kaliya, known as “Identity Woman,” a researcher and advocate for SSI, goes even further by arguing that generating “a digital identity that is not under the control of a corporation, an organization or a government” is essential “in pursuit of social justice, deep democracy, and the development of new economies that share wealth and protect the environment.”

To inform the analysis of blockchain-based Self-Sovereign Identity (SSI), the GovLab report argues that identity is “a process, not a thing” and breaks it into a five-stage lifecycle: provisioning, administration, authentication, authorization, and auditing/monitoring. At each stage, identification serves a unique function and poses different challenges.

With SSI, individuals have full control over how their personal information is shared, who gets access to it, and when. The New America report summarizes the potential of SSI in the following paragraphs:

“We believe that the great potential of SSI is that it can make identity in the digital world function more like identity in the physical world, in which every person has a unique and persistent identity which is represented to others by means of both their physical attributes and a collection of credentials attested to by various external sources of authority.”

[…]

“SSI, in contrast, gives the user a portable, digital credential (like a driver’s license or some other document that proves your age), the authenticity of which can be securely validated via cryptography without the recipient having to check with the authority that issued it. This means that while the credential can be used to access many different sites and services, there is no third-party broker to track the services to which the user is authenticating. Furthermore, cryptographic techniques called “zero-knowledge proofs” (ZKPs) can be used to prove possession of a credential without revealing the credential itself. This makes it possible, for example, for users to prove that they are over the age of 21 without having to share their actual birth dates, which are both sensitive information and irrelevant to a binary, yes-or-no ID transaction.”

Case studies of real-world SSI applications presented on the GovLab Blockchange website include a government-issued self-sovereign ID using blockchain technology in the city of Zug, Switzerland; a mobile election voting platform piloted in West Virginia, secured via smart biometrics, real-time ID verification, and the blockchain for irrefutability; and blockchain-based land and property transaction registration in Sweden.

Nevertheless, regarding the hype surrounding this new and emerging technology, the authors write:

“At their core, blockchain technologies offer new capacity for increasing the immutability, integrity, and resilience of information capture and disclosure mechanisms, fostering the potential to address some of the information asymmetries described above. By leveraging a shared and verified database of ledgers stored in a distributed manner, blockchain seeks to redesign information ecosystems in a more transparent, immutable, and trusted manner. Solving information asymmetries may turn out to be the real contribution of blockchain, and this—much more than the current enthusiasm over virtual currencies—is the real reason to assess its potential.

“It is important to emphasize, of course, that blockchain’s potential remains just that for the moment—only potential. Considerable hype surrounds the emerging technology, and much remains to be done and many obstacles to overcome if blockchain is to achieve the enthusiasts’ vision of “radical transparency.””

Further readings:

Allen, Christopher (2016). The Path to Self-Sovereign Identity. Coindesk. https://www.coindesk.com/path-self-sovereign-identity

Apostle, Julia (2018). Lessons from Cambridge Analytica: one way to protect your data. Financial Times. https://www.ft.com/content/43bc6d18-2b6f-11e8-97ec-4bd3494d5f14

Graglia, Michael, Christopher Mellon, and Tim Robustelli (2018). The Nail Finds a Hammer: Self-Sovereign Identity, Design Principles, and Property Rights in the Developing World. New America. https://www.newamerica.org/future-property-rights/reports/nail-finds-hammer/

Identity Woman, Kaliya (2017). Humanizing Technology. Open Democracy. https://www.opendemocracy.net/en/transformation/humanizing-technology/

Verhulst, Stefaan G. and Andrew Young (2018). On the Emergent Use of Distributed Ledger Technologies for Identity Management. The GovLab. https://blockchan.ge/fieldreport.html

Grey Data


greɪ ˈdeɪtə

A term for data that is accumulated by an institution for operational purposes and that does not fall under any traditional data protection policies.

Organizations across all sectors accumulate a massive amount of data simply by virtue of operating, and universities are among them. In a paper, Christine L. Borgman categorizes these as grey data and further suggests that universities should take the lead in demonstrating stewardship of these data, which include student applications, faculty dossiers, registrar records, ID card data, security cameras, and many others.

“Some of these data are collected for mandatory reporting obligations such as enrollments, diversity, budgets, grants, and library collections. Many types of data about individuals are collected for operational and design purposes, whether for instruction, libraries, travel, health, or student services.” (Borgman, p. 380)

Grey data typically does not fall under traditional data protection policies such as the Health Insurance Portability and Accountability Act (HIPAA), the Family Educational Rights and Privacy Act (FERPA), or Institutional Review Boards. Consequently, there is considerable debate about how to use (or misuse) them. Borgman points out that universities have been “exploiting these data for research, learning analytics, faculty evaluation, strategic decisions, and other sensitive matters.” On top of this, for-profit companies “are besieging universities with requests for access to data or for partnerships to mine them.”

Recognizing both the value of data and the risks arising from the accumulation of grey data, Borgman proposes a model of Data Stewardship by drawing on data protection practices at the University of California, which concern information security, data governance, and cyber risk.

This model is an example of a good Data Stewardship practice that the GovLab is advocating amidst the rise of public-private collaboration in leveraging data for public good.

The GovLab’s Data Stewards website presents the need for such practice as follows:

“With these new practices of data collaborations come the need to reimagine roles and responsibilities to steer the process of using private data, and the insights it can generate, to address some of society’s biggest questions and challenges: Data Stewards.

“Today, establishing and sustaining these new collaborative and accountable approaches requires significant and time-consuming effort and investment of resources for both data holders on the supply side, and institutions that represent the demand. By establishing Data Stewardship as a function, recognized within the private sector as a valued responsibility, the practice of Data Collaboratives can become more predictable, scaleable, sustainable and de-risked.”

Resources:

Borgman, C. L. (2018). Open Data, Grey Data, and Stewardship: Universities at the Privacy Frontier. ArXiv. https://doi.org/10.15779/Z38B56D489

Young, A. (2018, November 26). About the Data Stewards Network. Retrieved March 6, 2019, from https://medium.com/data-stewards-network/about-the-data-stewards-network-1cb9db0c0792

Rawification


rɑwəfɪˈkeɪʃən

A process of making datasets raw in three steps: reformatting, cleaning, and ungrounding (Denis and Goeta).

Hundreds of thousands of datasets are now made available via numerous channels from both public and private domains. Based on the stage of processing, these datasets can be categorized as either raw data or processed data. According to an Open Government Data principle, raw data (or primary data) “are published as collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.” Processed data, by contrast, is data that has been through some form of modification, categorization, codification, aggregation, or other similar process.

A large amount of the data that is made publicly available comes in processed form. For example, population, trade, and budget data are often presented in aggregated forms, preventing researchers from understanding the underlying stories behind these data, such as the differences in patterns or trends when gender, location, or other variables are factored in. Therefore, a rawification process is often needed for a dataset to be useful for more detailed and valuable secondary analysis.

Jérôme Denis and Samuel Goëta define ‘rawification’ as a process of reformatting, cleaning, and ungrounding data in order to obtain truly ‘raw’ datasets.

According to Denis and Goëta, reformatting data means making sure that data that has been opened can also be easily read by users. This is usually achieved by reformatting the data so that it can be read and manipulated by most processing programs. One of the most commonly used formats is CSV (Comma Separated Values).

The next step in a rawification process is cleaning. In this stage, cleaning means correcting mistakes within the datasets, including, but not limited to, redundancies and inconsistencies. In many cases, a dataset has multiple entries for the same item. For example, ‘New York University’ and ‘NYU’ might be interpreted as two different entities, as might ‘the GovLab’ and ‘the Governance Lab’. Cleaning helps address issues like this.

The final step in a rawification process is ungrounding, which means taking out any ties or links from previous data use. Such ties include color coding, comments, and subcategories. This way the datasets can be purely raw and free of all associations and bias.
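The three steps above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not Denis and Goëta's procedure: the column names (`institution`, `year`, `value`, `comment`, `color_code`), the alias table, and the sample rows are all hypothetical, and the CSV output is written last simply because the example starts from rows already held in memory.

```python
import csv
import io

# Hypothetical alias table: variant spellings mapped to one canonical name.
ALIASES = {
    "nyu": "New York University",
    "new york university": "New York University",
    "the govlab": "The Governance Lab",
    "the governance lab": "The Governance Lab",
}

# Hypothetical columns that carry traces of earlier use of the data.
GROUNDING_COLUMNS = {"comment", "color_code", "subcategory"}


def clean(rows, key_fields=("institution", "year")):
    """Cleaning: canonicalize names, then keep one row per key."""
    seen, cleaned = set(), []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        name = row["institution"].strip()
        row["institution"] = ALIASES.get(name.lower(), name)
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


def unground(rows):
    """Ungrounding: drop annotation columns left over from previous use."""
    return [
        {k: v for k, v in row.items() if k not in GROUNDING_COLUMNS}
        for row in rows
    ]


def to_csv(rows):
    """Reformatting: serialize the rows as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rawify(rows):
    return to_csv(unground(clean(rows)))


# Made-up sample rows, for illustration only.
sample = [
    {"institution": "NYU", "year": "2018", "value": "10",
     "comment": "verify", "color_code": "red"},
    {"institution": "New York University", "year": "2018", "value": "10",
     "comment": "", "color_code": ""},
    {"institution": "the GovLab", "year": "2018", "value": "7",
     "comment": "", "color_code": "green"},
]

print(rawify(sample))
```

In this sketch the two spellings of New York University collapse into a single row, and the comment and color-coding columns are dropped before the CSV is written. A real dataset would need its own alias table and its own judgment about which columns count as traces of earlier use.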

Opening up data is a clear step for increasing public access to information held within institutions. However, in order to ensure the utility of that data for those accessing it, a rawification process will likely be necessary.

Additional resources:

  • Denis, J., & Goëta, S. (2017). Rawification and the careful generation of open government data. Social Studies of Science, 47(5), 604–629. https://doi.org/10.1177/0306312717712473
  • Denis, J., & Goëta, S. (2014). Exploration, Extraction and ‘Rawification’. The Shaping of Transparency in the Back Rooms of Open Data (SSRN Scholarly Paper No. ID 2403069). Rochester, NY: Social Science Research Network. Retrieved from https://papers.ssrn.com/abstract=2403069

Data Fiduciary


ˈdeɪtə fəˈduʃiˌɛri

A person or a business that manages individual data in a trustworthy manner. Also ‘information fiduciary’, ‘data trust’, or ‘data steward’.

‘Fiduciary’ is an old concept in the legal world. Its Latin origin is fidere, which means to trust. In the legal context, a fiduciary is usually a person who is trusted to make decisions on how to manage an asset or information, within constraints given by another person who owns that asset or information. Examples of a fiduciary relationship include homeowner and property manager, patient and doctor, or client and attorney. In each pair, the latter has the ability to make decisions about the entrusted asset within the conditions agreed to by the former.

Jack M. Balkin and Jonathan Zittrain made a case for the “information fiduciary”, in which they pointed out the urgency of adopting fiduciary practice in the data space. In The Atlantic, they wrote:

“The information age has created new kinds of entities that have many of the trappings of fiduciaries—huge online businesses, like Facebook, Google, and Uber, that collect, analyze, and use our personal information—sometimes in our interests and sometimes not. Like older fiduciaries, these businesses have become virtually indispensable. Like older fiduciaries, these companies collect a lot of personal information that could be used to our detriment. And like older fiduciaries, these businesses enjoy a much greater ability to monitor our activities than we have to monitor theirs. As a result, many people who need these services often shrug their shoulders and decide to trust them. But the important question is whether these businesses, like older fiduciaries, have legal obligations to be trustworthy. The answer is that they should.”

The recent controversy involving Facebook data and Cambridge Analytica provides another reason why companies collecting data from users need to act as fiduciaries. Within this framework, individuals would have a say over how and where their data can be used.

Another call for a form of data fiduciary comes from Google’s Sidewalk Labs project in Canada. After collecting data to inform urban planning in the Quayside area in Toronto, Sidewalk Labs announced that they won’t be claiming ownership over the data that they collected and that the data should be “under the control of an independent Civic Data Trust.”

In a blog post, Sidewalk Labs wrote that:

“Sidewalk Labs believes an independent Civic Data Trust should become the steward of urban data collected in the physical environment. This Trust would approve and control the collection of, and manage access to, urban data originating in Quayside. The Civic Data Trust would be guided by a charter ensuring that urban data is collected and used in a way that is beneficial to the community, protects privacy, and spurs innovation and investment.”

Realizing the potential of creating new public value through an exchange of data, or data collaboratives, the GovLab “is advancing the concept and practice of Data Stewardship to promote responsible data leadership that can address the challenges of the 21st century.” A Data Steward mirrors some of the responsibilities of a data fiduciary, in that she/he is “responsible for determining what, when, how and with whom to share private data for public good.”

Balkin and Zittrain suggest that there is a power asymmetry between companies that collect user-generated data and the users themselves, in that these companies are becoming indispensable and gaining more control over individuals’ data. However, these companies are currently not legally obligated to be trustworthy, meaning that there is no legal consequence when they use this data in ways that breach privacy or run against the interests of their customers.

Under a data fiduciary framework, those who are entrusted with data carry legal rights and responsibilities regarding the use of that data. If a breach of trust happens, the trustee will have to face legal consequences.

More information:

Index: Trust in Institutions 2019


By Michelle Winowatan, Andrew J. Zahuranec, Andrew Young, Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on trust in institutions.

Please share any additional, illustrative statistics on open data, or other issues at the nexus of technology and governance, with us at info@thelivinglib.org

Global Trust in Public Institutions

Trust in Government

United States

  • Americans who say their democracy is working at least “somewhat well”: 58% – 2018
  • Number who believe sweeping changes to their government are needed: 61% – 2018
  • Percentage of Americans expressing faith in election system security: 45% – 2018
  • Percentage of Americans expressing an overarching trust in government: 40% – 2019
  • How Americans would rate the trustworthiness of Congress: 4.1 out of 10 – 2017
  • Number who have confidence elected officials act in the best interests of the public: 25% – 2018
  • Amount who trust the federal government to do what is right “just about always or most of the time”: 18% – 2017
  • Americans with trust and confidence in the federal government to handle domestic problems: 2 in 5 – 2018
    • International problems: 1 in 2 – 2018
  • US institution with highest amount of confidence to act in the best interests of the public: The Military (80%) – 2018
  • Most favorably viewed level of government: Local (67%) – 2018
  • Most favorably viewed federal agency: National Park Service (83% favorable) – 2018
  • Least favorable federal agency: Immigration and Customs Enforcement (47% unfavorable) – 2018

United Kingdom

  • Overall trust in government: 42% – 2019
    • Number who think the country is headed in the “wrong direction”: 7 in 10 – 2018
    • Those who have trust in politicians: 17% – 2018
    • Amount who feel unrepresented in politics: 61% – 2019
    • Amount who feel that their standard of living will get worse over the next year: Nearly 4 in 10 – 2019
  • Trust the national government handling of personal data:

European Union

Africa

Latin America

Other

Trust in Media

  • Percentage of people around the world who trust the media: 47% – 2019
    • In the United Kingdom: 37% – 2019
    • In the United States: 48% – 2019
    • In China: 76% – 2019
  • Rating of news trustworthiness in the United States: 4.5 out of 10 – 2017
  • Number of citizens who trust the press across the European Union: Almost 1 in 2 – 2019
  • France: 3.9 out of 10 – 2019
  • Germany: 4.8 out of 10 – 2019
  • Italy: 3.8 out of 10 – 2019
  • Slovenia: 3.9 out of 10 – 2019
  • Percentage of European Union citizens who trust the radio: 59% – 2017
    • Television: 51% – 2017
    • The internet: 34% – 2017
    • Online social networks: 20% – 2017
  • EU citizens who do not actively participate in political discussions on social networks because they don’t trust online social networks: 3 in 10 – 2018
  • Those who are confident that the average person in the United Kingdom can tell real news from ‘fake news’: 3 in 10 – 2018

Trust in Business

Sources