Google and the University of Chicago Are Sued Over Data Sharing


Daisuke Wakabayashi in The New York Times: “When the University of Chicago Medical Center announced a partnership to share patient data with Google in 2017, the alliance was promoted as a way to unlock information trapped in electronic health records and improve predictive analysis in medicine.

On Wednesday, the University of Chicago, the medical center and Google were sued in a potential class-action lawsuit accusing the hospital of sharing hundreds of thousands of patients’ records with the technology giant without stripping identifiable date stamps or doctor’s notes.

The suit, filed in United States District Court for the Northern District of Illinois, demonstrates the difficulties technology companies face in handling health data as they forge ahead into one of the most promising — and potentially lucrative — areas of artificial intelligence: diagnosing medical problems.

Google is at the forefront of an effort to build technology that can read electronic health records and help physicians identify medical conditions. But the effort requires machines to learn this skill by analyzing a vast array of old health records collected by hospitals and other medical institutions.

That raises privacy concerns, especially when the data is used by a company like Google, which already knows what you search for, where you are and what interests you hold.

In 2016, DeepMind, a London-based A.I. lab owned by Google’s parent company, Alphabet, was accused of violating patient privacy after it struck a deal with Britain’s National Health Service to process medical data for research….(More)”.

Open Mobility Foundation


Press Release: “The Open Mobility Foundation (OMF) – a global coalition led by cities committed to using well-designed, open-source technology to evolve how cities manage transportation in the modern era – launched today with the mission to promote safety, equity and quality of life. The announcement comes as a response to the growing number of vehicles and emerging mobility options on city streets. A new city-governed non-profit, the OMF brings together academic, commercial, advocacy and municipal stakeholders to help cities develop and deploy new digital mobility tools, and provide the governance needed to efficiently manage them.

“Cities are always working to harness the power of technology for the public good. The Open Mobility Foundation will help us manage emerging transportation infrastructures, and make mobility more accessible and affordable for people in all of our communities,” said Los Angeles Mayor Eric Garcetti, who also serves as Advisory Council Chair of Accelerator for America, which showcased the MDS platform early on.

The OMF convenes a new kind of public-private forum to seed innovative ideas and govern an evolving software platform. Serving as a forum for discussions about pedestrian safety, privacy, equity, open-source governance and other related topics, the OMF has engaged a broad range of city and municipal organizations, private companies and non-profit groups, and experts and advocates to ensure comprehensive engagement and expertise on vital issues….

The OMF governs a platform called “Mobility Data Specification” (MDS) that the Los Angeles Department of Transportation developed to help manage dockless micro-mobility programs (including shared dockless e-scooters). MDS is comprised of a set of Application Programming Interfaces (APIs) that create standard communications between cities and private companies to improve their operations. The APIs allow cities to collect data that can inform real-time traffic management and public policy decisions to enhance safety, equity and quality of life. More than 50 cities across the United States – and dozens across the globe – already use MDS to manage micro-mobility services.

Making this software open and free offers a safe and efficient environment for stakeholders, including municipalities, companies, experts and the public, to solve problems together. And because private companies scale best when cities can offer a consistent playbook for innovation, the OMF aims to nurture those services that provide the highest benefit to the largest number of people, from sustainability to safety outcomes….(More)”
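
To make the “standard communications” point concrete, below is a minimal sketch of how a city system might pull trip data from an MDS-style provider endpoint. The endpoint path, query parameters, and response fields follow our reading of the MDS Provider specification and should be treated as assumptions to verify against the spec version your providers implement; the base URL and token are hypothetical placeholders.

```python
# Minimal sketch: querying an MDS-style provider "trips" endpoint and
# counting trips per vehicle type. Endpoint path, query parameters, and
# response fields are assumptions based on the MDS Provider spec; verify
# them against the version deployed in your city.
import collections
import requests


def fetch_trip_counts(base_url: str, token: str, start_ms: int, end_ms: int) -> dict:
    """Return trip counts per vehicle type for a time window (epoch milliseconds)."""
    resp = requests.get(
        f"{base_url}/trips",
        headers={"Authorization": f"Bearer {token}"},
        params={"min_end_time": start_ms, "max_end_time": end_ms},
        timeout=30,
    )
    resp.raise_for_status()
    trips = resp.json().get("data", {}).get("trips", [])
    return dict(collections.Counter(t.get("vehicle_type", "unknown") for t in trips))


if __name__ == "__main__":
    # Hypothetical provider URL and token; replace with real credentials.
    counts = fetch_trip_counts(
        "https://mds.example-provider.com", "API_TOKEN",
        1561939200000, 1562025600000,
    )
    print(counts)
```

Because every permitted operator exposes the same endpoints, a city can run the same aggregation across all of them, which is the consistency benefit the press release describes.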

Seize the Data: Using Evidence to Transform How Federal Agencies Do Business


Report by the Partnership for Public Service: “The use of data analysis, rigorous evaluation and a range of other credible strategies to inform decision-making is becoming more common across government. Even so, the movement is nascent, with leading practices implemented at some agencies, but not yet widely adopted. Much more progress is necessary. In fact, the recently enacted Foundations for Evidence-Based Policymaking Act, as well as the recently released draft Federal Data Strategy Action Plan, both prioritize broader adoption of leading practices.

To support that effort, this report highlights practical steps that agencies can take to become more data-driven and evidence-based. The findings emerged from a series of workshops and interviews conducted between April 2018 and May 2019 by the Partnership for Public Service and Grant Thornton. From these sessions, we learned that the most forward-thinking agencies rely on multiple approaches, including:
• Using top-down and bottom-up approaches to build evidence-based organizations.
• Driving longer-term and shorter-term learning.
• Using existing data and new data.
• Strengthening internal capacity and creating external research practitioner partnerships.
This report describes what these strategies look like in practice, and shares real-world and replicable examples of how leading agencies have become more data-driven and evidence-based….(More)”.

Challenges in using data across government


National Audit Office (UK): “Data is crucial to the way government delivers services for citizens, improves its own systems and processes, and makes decisions. Our work has repeatedly highlighted the importance of evidence-based decision-making at all levels of government activity, and the problems that arise when data is inadequate.

Government recognises the value of using data more effectively, and the importance of ensuring security and public trust in how it is used. It plans to produce a new national data strategy in 2020 to position “the UK as a global leader on data, working collaboratively and openly across government”.

To achieve its ambitions government will need to resolve fundamental challenges around how to use and share data safely and appropriately, and how to balance competing demands on public resources in a way that allows for sustained but proportionate investment in data. The future national data strategy provides the government with an opportunity to do this, building on the renewed interest and focus on the use of data within government and beyond.

Content and scope of the report

This report sets out the National Audit Office’s experience of data across government, including initial efforts to address the issues. From our past work we have identified three areas where government needs to establish the pre-conditions for success: clear strategy and leadership; a coherent infrastructure for managing data; and broader enablers to safeguard and support the better use of data. In this report we consider:

  • the current data landscape across government (Part One);
  • how government needs a clear plan and leadership to improve its use of data (Part Two);
  • the quality, standards and systems needed to use data effectively (Part Three); and
  • wider conditions and enablers for success (Part Four).

Concluding remarks

Past examples such as Windrush and Carer’s Allowance show how important good-quality data is, and the consequences if it is not used well. Without accurate, timely and proportionate data, government will not be able to get the best use out of public money or take the next step towards more sophisticated approaches to using data that can reap real rewards.

But despite years of effort and many well-documented failures, government has lacked clear and sustained strategic leadership on data. This has led to departments under-prioritising their own efforts to manage and improve data. There are some early signs that the situation is improving, but unless government uses the data strategy to push a sea change in strategy and leadership, it will not get the right processes, systems and conditions in place to succeed, and this strategy will be yet another missed opportunity….(More)”.

We Need a Data-Rich Picture of What’s Killing the Planet


Clive Thompson at Wired: “…Marine litter isn’t the only hazard whose contours we can’t fully see. The United Nations has 93 indicators to measure the environmental dimensions of “sustainable development,” and amazingly, the UN found that we have little to no data on 68 percent of them—like how rapidly land is being degraded, the rate of ocean acidification, or the trade in poached wildlife. Sometimes this is because we haven’t collected it; in other cases some data exists but hasn’t been shared globally, or it’s in a myriad of incompatible formats. No matter what, we’re flying blind. “And you can’t manage something if you can’t measure it,” says David Jensen, the UN’s head of environmental peacebuilding.

In other words, if we’re going to help the planet heal and adapt, we need a data revolution. We need to build a “digital ecosystem for the environment,” as Jensen puts it.

The good news is that we’ve got the tools. If there’s one thing tech excels at (for good and ill), it’s surveillance, right? We live in a world filled with cameras and pocket computers, titanic cloud computing, and the eerily sharp insights of machine learning. And this stuff can be used for something truly worthwhile: studying the planet.

There are already some remarkable cases of tech helping to break through the fog. Consider Global Fishing Watch, a nonprofit that tracks the world’s fishing vessels, looking for overfishing. They use everything from GPS-like signals emitted by ships to satellite infrared imaging of ship lighting, plugged into neural networks. (It’s massive, cloud-scale data: over 60 million data points per day, making the AI more than 90 percent accurate at classifying what type of fishing activity a boat is engaged in.)

“If a vessel is spending its time in an area that has little tuna and a lot of sharks, that’s questionable,” says Brian Sullivan, cofounder of the project and a senior program manager at Google Earth Outreach. Crucially, Global Fishing Watch makes its data open to anyone­­­—so now the National Geographic Society is using it to lobby for new marine preserves, and governments and nonprofits use it to target illicit fishing.

If we want better environmental data, we’ll need for-profit companies with the expertise and high-end sensors to pitch in too. Planet, a firm with an array of 140 satellites, takes daily snapshots of the entire Earth. Customers like insurance and financial firms love that sort of data. (It helps them understand weather and climate risk.) But Planet also offers it to services like Global Forest Watch, which maps deforestation and makes the information available to anyone (like activists who help bust illegal loggers). Meanwhile, Google’s skill in cloud-based data crunching helps illuminate the state of surface water: Google digitized 30 years of measurements from around the globe—extracting some from ancient magnetic tapes—then created an easy-to-use online tool that lets resource-poor countries figure out where their water needs protecting….(More)”.

Postsecondary Data Infrastructure: What is Possible Today


Report by Amy O’Hara: “Data sharing across government agencies allows consumers, policymakers, practitioners, and researchers to answer pressing questions. Creating a data infrastructure to enable this data sharing for higher education data is challenging, however, due to legal, privacy, technical, and perception issues. To overcome these challenges, postsecondary education can learn from other domains to permit secure, responsible data access and use. Working models from both the public sector and academia show how sensitive data from multiple sources can be linked and accessed for authorized uses.

This brief describes best practices in use today and the emerging technology that could further protect future data systems and creates a new framework, the “Five Safes”, for controlling data access and use. To support decisions facing students, administrators, evaluators, and policymakers, a postsecondary infrastructure must support cycles of data discovery, request, access, analysis, review, and release. It must be cost-effective, secure, and efficient and, ideally, it will be highly automated, transparent, and adaptable. Other industries have successfully developed such infrastructures, and postsecondary education can learn from their experiences.

A functional data infrastructure relies on trust and control between the data providers, intermediaries, and users. The system should support equitable access for approved users and offer the ability to conduct independent analyses with scientific integrity for reasonable financial costs. Policymakers and developers should ensure the creation of expedient, convenient data access modes that allow for policy analyses. …

The “Five Safes” framework describes an approach for controlling data access and use. The five safes are: safe projects, safe people, safe settings, safe data, and safe outputs….(More)”.

Open Urban Data and the Sustainable Development Goals


Conference Paper by Christine Meschede and Tobias Siebenlist: “Since the adoption of the United Nations’ Sustainable Development Goals (SDGs) in 2015 – an ambitious agenda to end poverty, combat environmental threats and ensure prosperity for everyone – some effort has been made regarding the adequate measuring of the progress on its targets. As the crucial point is the availability of sufficient, comparable information, open data can play a key role. The coverage of open data, i.e., data that is machine-readable, freely available and reusable for everyone, is assessed by several measurement tools. We propose the use of open governmental data to make the achievement of SDGs easy and transparent to measure. For this purpose, a mapping of the open data categories to the SDGs is presented. Further, we argue that the SDGs need to be tackled in particular at the city level. For analyzing the current applicability of open data for measuring progress on the SDGs, we provide a small-scale case study on German open data portals and the embedded data categories and datasets. The results suggest that further standardization is needed in order to be able to use open data for comparing cities and their progress towards the SDGs….(More)”.

Access to Data in Connected Cars and the Recent Reform of the Motor Vehicle Type Approval Regulation


Paper by Wolfgang Kerber and Daniel Moeller: “The need for regulatory solutions for access to in-vehicle data and resources of connected cars is one of the big controversial and unsolved policy issues. Last year the EU revised the Motor Vehicle Type Approval Regulation which already entailed a FRAND-like solution for the access to repair and maintenance information (RMI) to protect competition on the automotive aftermarkets. However, the transition to connected cars changes the technological conditions for this regulatory solution significantly. This paper analyzes the reform of the type approval regulation and shows that the regulatory solutions for access to RMI are so far insufficient to deal with the challenges that come with increased connectivity, e.g. with regard to the new remote diagnostic, repair and maintenance services. Therefore, an important result of the paper is that the transition to connected cars will require a further reform of the rules for the regulated access to RMI (esp. with regard to data access, interoperability, and safety/security issues). However, our analysis also suggests that the basic approach of the current regulated access regime for RMI in the type approval regulation can also be a model for developing general solutions for the currently unsolved problems of access to in-vehicle data and resources in the ecosystem of connected driving….(More)”.

Measuring and Protecting Privacy in the Always-On Era


Paper by Dan Feldman and Eldar Haber: “Data mining practices have become greatly enhanced in the interconnected era. What began with the internet now continues through the Internet of Things (IoT), whereby users can constantly be connected to the internet through various means like televisions, smartphones, wearables and computerized personal assistants, among other “things.” As many of these devices operate in a so-called “always-on” mode, constantly receiving and transmitting data, the increased use of IoT devices might lead society into an “always-on” era, where individuals are constantly datafied. As the current regulatory approach to privacy is sectoral in nature, i.e., protects privacy only within a specific context of information gathering or use, and is directed only to specific pre-defined industries or a specific cohort, the individual’s privacy is at great risk. On the other hand, strict privacy regulation might negatively impact data utility, which serves many purposes and, perhaps mainly, is crucial for technological development and innovation. The tradeoff between data utility and privacy protection is most unlikely to be resolved under the sectoral approach to privacy, but a technological solution that relies mostly on a method called differential privacy might be of great help. It essentially suggests adding “noise” to data deemed sensitive ex-ante, depending on various parameters further suggested in this Article. In other words, using computational solutions combined with formulas that measure the probability of data sensitivity, privacy could be better protected in the always-on era.

This Article introduces legal and computational methods that could be used by IoT service providers and will optimally balance the tradeoff between data utility and privacy. It comprises several stages. The first Part discusses the protection of privacy under the sectoral approach, and estimates what values are embedded in it. The second Part discusses privacy protection in the “always-on” era. First it assesses how technological changes have shaped the sectoral regulation, then discusses why privacy is negatively impacted by IoT devices and the potential applicability of new regulatory mechanisms to meet the challenges of the “always-on” era. After concluding that the current regulatory framework is severely limited in protecting individuals’ privacy, the third Part discusses technology as a panacea, while offering a new computational model that relies on differential privacy and a modern technique called private coreset. The proposed model seeks to introduce “noise” to data on the user’s side to preserve individual’s privacy — depending on the probability of data sensitivity of the IoT device — while enabling service providers to utilize the data….(More)”.
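
The noise-adding idea at the heart of this proposal is easiest to see in the Laplace mechanism of differential privacy. The sketch below is a generic illustration of that mechanism applied on the user’s side; it is not the authors’ private-coreset construction, and the example device, sensitivity, and epsilon values are assumptions chosen for illustration.

```python
# Minimal sketch of local noise addition via the Laplace mechanism of
# differential privacy. This illustrates the general technique only; it is
# not the private-coreset model proposed in the Article.
import random


def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def privatize(value: float, sensitivity: float, epsilon: float) -> float:
    """Release a value with epsilon-differential privacy (Laplace mechanism)."""
    return value + laplace_noise(sensitivity / epsilon)


if __name__ == "__main__":
    # Hypothetical example: a smart thermostat reports daily hours of home
    # occupancy (range 0-24, so sensitivity 24). A smaller epsilon means more
    # noise and stronger privacy, at some cost to data utility.
    true_hours = 6.5
    print(privatize(true_hours, sensitivity=24.0, epsilon=0.5))
```

Adding the noise before a reading ever leaves the device mirrors the Article’s suggestion of protecting data on the user’s side, with epsilon acting as the tunable dial between utility and privacy.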

The language we use to describe data can also help us fix its problems


Luke Stark & Anna Lauren Hoffmann at Quartz: “Data is, apparently, everything.

It’s the “new oil” that fuels online business. It comes in floods or tsunamis. We access it via “streams” or “fire hoses.” We scrape it, mine it, bank it, and clean it. (Or, if you prefer your buzzphrases with a dash of ageism and implicit misogyny, big data is like “teenage sex,” while working with it is “the sexiest job” of the century.)

These data metaphors can seem like empty cliches, but at their core they’re efforts to come to grips with the continuing onslaught of connected devices and the huge amounts of data they generate.

In a recent article, we—an algorithmic-fairness researcher at Microsoft and a data-ethics scholar at the University of Washington—push this connection one step further. More than simply helping us wrap our collective heads around data-fueled technological change, we set out to learn what these metaphors can teach us about the real-life ethics of collecting and handling data today.

Instead of only drawing from the norms and commitments of computer science, information science, and statistics, what if we looked at the ethics of the professions evoked by our data metaphors instead?…(More)”.