Open Data, Grey Data, and Stewardship: Universities at the Privacy Frontier.


Paper by Christine L. Borgman: “As universities recognize the inherent value in the data they collect and hold, they encounter unforeseen challenges in stewarding those data in ways that balance accountability, transparency, and protection of privacy, academic freedom, and intellectual property. Two parallel developments in academic data collection are converging: (1) open access requirements, whereby researchers must provide access to their data as a condition of obtaining grant funding or publishing results in journals; and (2) the vast accumulation of “grey data” about individuals in their daily activities of research, teaching, learning, services, and administration.

The boundaries between research and grey data are blurring, making it more difficult to assess the risks and responsibilities associated with any data collection. Many sets of data, both research and grey, fall outside privacy regulations such as HIPAA, FERPA, and PII. Universities are exploiting these data for research, learning analytics, faculty evaluation, strategic decisions, and other sensitive matters. Commercial entities are besieging universities with requests for access to data or for partnerships to mine them. The privacy frontier facing research universities spans open access practices, uses and misuses of data, public records requests, cyber risk, and curating data for privacy protection. This Article explores the competing values inherent in data stewardship and makes recommendations for practice by drawing on the pioneering work of the University of California in privacy and information security, data governance, and cyber risk….(More)”.

The latest tools for sexual assault victims: Smartphone apps and software


 
Peter Holley at the Washington Post:  “…For much of the past decade, dozens of apps and websites have been created to help survivors of sexual assault electronically record and report such crimes. They are designed to assist an enormous pool of potential victims. The Rape, Abuse & Incest National Network reports that more than 11 percent of all college students — both graduate and undergraduate — experience rape or sexual assault through physical force, violence or incapacitation. Despite the prevalence of such incidents, less than 10 percent of victims on college campuses report their assaults, according to the National Sexual Violence Resource Center.

The apps range from electronic reporting tools such as JDoe to legal guides that provide victims with access to law enforcement and crisis counseling. Others help victims save and share relevant medical information in case of an assault. The app Uask includes a “panic button” that connects users with 911 or allows them to send emergency messages to people with their location.

 

Since its debut in 2015, Callisto’s software has been adopted by 12 college campuses — including Stanford, the University of Oregon and St. John’s University — and made available to more than 160,000 students, according to the company. Sexual assault survivors who visit Callisto are six times as likely to report, and 15 percent of those survivors have matched with another victim of the same assailant, the company claims.

Peter Cappelli, a professor of management at the Wharton School and director of Wharton’s Center for Human Resources, told NPR that he sees potential problems with survivors “crowdsourcing” their decision to report assaults.

“I don’t think we want to have a standard where the decisions are crowdsourced,” he said. “I think what you want is to tell people [that] the criteria [for whether or not to report] are policy related, not personally related, and you should bring forward anything that fits the criteria, not [based on] whether you feel enough other people have made the complaint or not. We want to sometimes encourage people to do things they might feel uncomfortable about.”…(More)”.

The secret data collected by dockless bikes is helping cities map your movement


Lime is able to collect this information because its bikes, like all those in dockless bike-share programs, are built to operate without fixed stations or corrals. …In the 18 months or so since dockless bike-share arrived in the US, the service has spread to at least 88 American cities. (On the provider side, at least 10 companies have jumped into the business; Lime is one of the largest.) Some of those cities now have more than a year of data related to the programs, and they’ve started gleaning insights and catering to the increased number of cyclists on their streets.

South Bend is one of those leaders. It asked Lime to share data when operations kicked off in June 2017. At first, Lime provided the information in spreadsheets, but in early 2018 the startup launched a browser-based dashboard where cities could see aggregate statistics for their residents, such as how many of them rented bikes, how many trips they took, and how far and long they rode. Lime also added heat maps that reveal where most rides occur within a city and a tool for downloading data that shows individual trips without identifying the riders. Corcoran can glance at his dashboard and see, for example, that people in South Bend have taken 340,000 rides, traveled 158,000 miles, and spent more than 7 million minutes on Lime bikes since the company started service. He can also see there are 700 Lime bikes active in the city, down from an all-time high of 1,200 during the University of Notre Dame’s 2017 football season….(More)”.
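
The dashboard figures described above (rides, miles, minutes, and heat maps) are aggregates over per-trip records. Below is a minimal sketch of that kind of aggregation, assuming a hypothetical anonymized trip export; the `Trip` fields, the `summarize` and `heat_map` helpers, and the sample values are illustrative assumptions, not Lime's actual schema or dashboard code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    # Hypothetical anonymized trip record: no rider identifier is stored.
    start_lat: float
    start_lon: float
    distance_miles: float
    duration_minutes: float


def summarize(trips):
    """Return the ride count, total miles, and total minutes for a list of trips."""
    return {
        "rides": len(trips),
        "miles": sum(t.distance_miles for t in trips),
        "minutes": sum(t.duration_minutes for t in trips),
    }


def heat_map(trips, decimals=2):
    """Count trip starts per grid cell by rounding coordinates (assumed cell size)."""
    return Counter(
        (round(t.start_lat, decimals), round(t.start_lon, decimals)) for t in trips
    )


# Made-up sample data for illustration only.
sample = [
    Trip(41.6764, -86.2520, 0.5, 10.0),
    Trip(41.6801, -86.2489, 1.5, 25.0),
    Trip(41.7012, -86.2388, 1.0, 15.0),
]
summary = summarize(sample)
cells = heat_map(sample)
print(summary)
print(cells.most_common(1))
```

Rounding coordinates to a coarse grid is one simple way to show where rides cluster without publishing exact start points; a real deployment would need a more careful privacy review than this sketch implies.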

Direct Democracy and Political Engagement of the Marginalized


Dissertation by Jeong Hyun Kim: “…examines direct democracy’s implications for political equality by focusing on how it influences and modifies political attitudes and behaviors of marginalized groups. Using cases and data from Sweden, Switzerland, and the United States, I provide a comprehensive, global examination of how direct democratic institutions affect political participation, especially of political minority or marginalized groups.

In the first paper, I examine whether the practice of direct democracy supports women’s political participation. I theorize that the use of direct democracy enhances women’s sense of political efficacy, thereby promoting their participation in the political process. I test this argument by leveraging a quasi-experiment in Sweden from 1921 to 1944, wherein the use of direct democratic institutions was determined by a population threshold. Findings from a regression discontinuity analysis lend strong support for the positive effect of direct democracy on women’s political participation. Using web documents of minutes from direct democratic meetings, I further show that women’s participation in direct democracy is positively associated with their subsequent participation in parliamentary elections.

The second paper expands on the first paper by examining an individual-level mechanism linking experience with direct democracy and feelings of political efficacy. Using panel survey data from Switzerland, I examine the relationship between individuals’ exposure to direct democracy and the gender gap in political efficacy. I find that direct democracy increases women’s sense of political efficacy, while it has no significant effect on men. This finding confirms that the opportunity for direct legislation leads women to feel more efficacious in politics, suggesting its further implications for the gender gap in political engagement.

In the third and final paper, I examine how direct democratic votes targeting ethnic minorities influence political mobilization of minority groups. I theorize that targeted popular votes intensify the general public’s hostility towards minority groups, thereby enhancing group members’ perceptions of being stigmatized. Consequently, this creates a greater incentive for minorities to actively engage in politics. Using survey data from the United States, combined with information about state-level direct democracy, I find that direct democratic votes targeting the rights of immigrants lead to greater political activism among ethnic minorities with immigrant background. These studies contribute to the extant study of women and minority politics by illuminating new mechanisms underlying mobilization of women and minorities and clarifying the causal effect of the type of government on political equality….(More)”.
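
The first paper's design, in which a population threshold determines the use of direct democratic institutions, is a regression discontinuity setup. A minimal sketch of a local-linear estimate on synthetic data follows. The cutoff of 1,500, the bandwidth, the side of the cutoff that is treated, and the simulated outcome are all assumptions for illustration; none of them comes from the dissertation.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(running, outcome, cutoff, bandwidth):
    """Fit a line on each side of the cutoff and return the jump at the cutoff."""
    # Center the running variable so each intercept is the fitted value at the cutoff.
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left


# Synthetic example: units at or above a hypothetical cutoff receive the
# treatment, which shifts the outcome by TRUE_JUMP (an assumed value).
random.seed(0)
CUTOFF, TRUE_JUMP = 1500, 5.0
population = [random.uniform(500, 2500) for _ in range(2000)]
turnout = [
    40 + 0.005 * (p - CUTOFF) + (TRUE_JUMP if p >= CUTOFF else 0.0) + random.gauss(0, 2)
    for p in population
]
estimate = rd_estimate(population, turnout, CUTOFF, bandwidth=500)
print(f"estimated jump at cutoff: {estimate:.2f}")
```

The estimate compares units just on either side of the threshold, which is what lets a design like this support a causal reading; a real analysis would also report standard errors and test sensitivity to the bandwidth.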

What Can Satellite Imagery Tell Us About Obesity in Cities?


Emily Matchar at Smithsonian: “About 40 percent of American adults are obese, defined as having a body mass index (BMI) over 30. But obesity is not evenly distributed around the country. Some cities and states have far more obese residents than others. Why? Genetics, stress, income levels and access to healthy foods all play a role. But increasingly researchers are looking at the built environment—our cities—to understand why people are fatter in some places than in others.

New research from the University of Washington attempts to take this approach one step further by using satellite data to examine cityscapes. By using the satellite images in conjunction with obesity data, they hope to uncover which urban features might influence a city’s obesity rate.

The researchers used a deep learning network to analyze about 150,000 high-resolution satellite images of four cities: Los Angeles, Memphis, San Antonio and Seattle. The cities were selected for being from states with both high obesity rates (Texas and Tennessee) and low obesity rates (California and Washington). The network extracted features of the built environment: crosswalks, parks, gyms, bus stops, fast food restaurants—anything that might be relevant to health.

“If there’s no sidewalk you’re less likely to go out walking,” says Elaine Nsoesie, a professor of global health at the University of Washington who led the research.

The team’s algorithm could then see what features were more or less common in areas with greater and lesser rates of obesity. Some findings were predictable: more parks, gyms and green spaces were correlated with lower obesity rates. Others were surprising: more pet stores equaled thinner residents (“a high density of pet stores could indicate high pet ownership, which could influence how often people go to parks and take walks around the neighborhood,” the team hypothesized).

A paper on the results was recently published in the journal JAMA Network Open….(More)”.

The New York City Business Atlas: Leveling the Playing Field for Small Businesses with Open Data


Chapter by Stefaan Verhulst and Andrew Young in Smarter New York City: How City Agencies Innovate. Edited by André Corrêa d’Almeida: “While retail entrepreneurs, particularly those operating in the small-business space, are experts in their respective trades, they often lack access to high-quality information about social, environmental, and economic conditions in the neighborhoods where they operate or are considering operating.

The New York City Business Atlas, conceived by the Mayor’s Office of Data Analytics (MODA) and the Department of Small Business Services, is designed to alleviate that information gap by providing a public web-based tool that gives small businesses access to high-quality data to help them decide where to establish a new business or expand an existing one. The tool brings together a diversity of data, including business-filing data from the Department of Consumer Affairs, sales-tax data from the Department of Finance, demographic data from the census, and traffic data from Placemeter, a New York City startup focusing on real-time traffic information.

The initial iteration of the Business Atlas made useful and previously inaccessible data available to small-business owners and entrepreneurs in an innovative manner. After a few years, however, it became clear that the tool was not experiencing the level of use or creating the level of demonstrable impact anticipated. Rather than continuing down the same path or abandoning the effort entirely, MODA pivoted to a new approach, moving from the Business Atlas as a single information-providing tool to the Business Atlas as a suite of capabilities aimed at bolstering New York’s small-business community.

Through problem- and user-centered efforts, the Business Atlas is now making important insights available to stakeholders who can put them to meaningful use—from how long it takes to open a restaurant in the city to which areas are most in need of education and outreach to improve their code compliance. This chapter considers the open data environment from which the Business Atlas was launched, details the initial version of the Business Atlas and the lessons it generated, and describes the pivot to this new approach….(More)”.

Ethics & Algorithms Toolkit


Toolkit: “Government leaders and staff who leverage algorithms are facing increasing pressure from the public, the media, and academic institutions to be more transparent and accountable about their use. Every day, stories come out describing the unintended or undesirable consequences of algorithms. Governments have not had the tools they need to understand and manage this new class of risk.

GovEx, the City and County of San Francisco, Harvard DataSmart, and Data Community DC have collaborated on a practical toolkit for cities to use to help them understand the implications of using an algorithm, clearly articulate the potential risks, and identify ways to mitigate them….We developed this because:

  • We saw a gap. There are many calls to arms and lots of policy papers, one of which was a DataSF research paper, but nothing practitioner-facing with a repeatable, manageable process.
  • We wanted an approach which governments are already familiar with: risk management. By identifying and quantifying levels of risk, we can recommend specific mitigations.…(More)”.
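
The risk-management idea in the last bullet, quantifying a level of risk and mapping it to mitigations, can be sketched with a generic likelihood-by-impact matrix. This is an illustrative assumption only: the levels, score thresholds, and mitigation text below are hypothetical and are not taken from the toolkit's own worksheets.

```python
# Ordinal levels for a generic likelihood x impact matrix (assumed scale).
LEVELS = {"low": 1, "medium": 2, "high": 3}

# Hypothetical mitigations per risk band, for illustration only.
MITIGATIONS = {
    "low": "Document the algorithm and review it periodically.",
    "medium": "Add human review of outputs and publish a plain-language summary.",
    "high": "Require an independent audit before deployment.",
}


def risk_score(likelihood, impact):
    """Multiply ordinal likelihood and impact levels into a score from 1 to 9."""
    return LEVELS[likelihood] * LEVELS[impact]


def risk_band(score):
    """Map a score to a band using assumed thresholds."""
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


def recommend(likelihood, impact):
    """Return the band and the mitigation assumed for that band."""
    band = risk_band(risk_score(likelihood, impact))
    return band, MITIGATIONS[band]


print(recommend("medium", "high"))
```

The value of a process like this is repeatability: two staff members scoring the same algorithm against the same scale should land in the same band, which makes the recommended mitigation easier to defend.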

United Nations accidentally exposed passwords and sensitive information to the whole internet


Micah Lee at The Intercept: “The United Nations accidentally published passwords, internal documents, and technical details about websites when it misconfigured popular project management service Trello, issue tracking app Jira, and office suite Google Docs.

The mistakes made sensitive material available online to anyone with the proper link, rather than only to specific users who should have access. Affected data included credentials for a U.N. file server, the video conferencing system at the U.N.’s language school, and a web development environment for the U.N.’s Office for the Coordination of Humanitarian Affairs. Security researcher Kushagra Pathak discovered the accidental leak and notified the U.N. about what he found a little over a month ago. As of today, much of the material appears to have been taken down.

In an online chat, Pathak said he found the sensitive information by running searches on Google. The searches, in turn, produced public Trello pages, some of which contained links to the public Google Docs and Jira pages.

Trello projects are organized into “boards” that contain lists of tasks called “cards.” Boards can be public or private. After finding one public Trello board run by the U.N., Pathak found additional public U.N. boards by using “tricks like by checking if the users of one Trello board are also active on some other boards and so on.” One U.N. Trello board contained links to an issue tracker hosted on Jira, which itself contained even more sensitive information. Pathak also discovered links to documents hosted on Google Docs and Google Drive that were configured to be accessible to anyone who knew their web addresses. Some of these documents contained passwords….Here is just some of the sensitive information that the U.N. accidentally made accessible to anyone who Googled for it:

  • A social media team promoting the U.N.’s “peace and security” efforts published credentials to access a U.N. remote file access, or FTP, server in a Trello card coordinating promotion of the International Day of United Nations Peacekeepers. It is not clear what information was on the server; Pathak said he did not connect to it.
  • The U.N.’s Language and Communication Programme, which offers language courses at U.N. Headquarters in New York City, published credentials for a Google account and a Vimeo account. The program also exposed, on a publicly visible Trello board, credentials for a test environment for a human resources web app. It also made public a Google Docs spreadsheet, linked from a public Trello board, that included a detailed meeting schedule for 2018, along with passwords to remotely access the program’s video conference system to join these meetings.
  • One public Trello board used by the developers of Humanitarian Response and ReliefWeb, both websites run by the U.N.’s Office for the Coordination of Humanitarian Affairs, included sensitive information like internal task lists and meeting notes. One public card from the board had a PDF, marked “for internal use only,” that contained a map of all U.N. buildings in New York City. …(More)”.

Computers Can Solve Your Problem. You May Not Like The Answer


David Scharfenberg at the Boston Globe: “Years of research have shown that teenagers need their sleep. Yet high schools often start very early in the morning. Starting them later in Boston would require tinkering with elementary and middle school schedules, too — a Gordian knot of logistics, pulled tight by the weight of inertia, that proved impossible to untangle.

Until the computers came along.

Last year, the Boston Public Schools asked MIT graduate students Sébastien Martin and Arthur Delarue to build an algorithm that could do the enormously complicated work of changing start times at dozens of schools — and rerouting the hundreds of buses that serve them….

The algorithm was poised to put Boston on the leading edge of a digital transformation of government. In New York, officials were using a regression analysis tool to focus fire inspections on the most vulnerable buildings. And in Allegheny County, Pa., computers were churning through thousands of health, welfare, and criminal justice records to help identify children at risk of abuse….

While elected officials tend to legislate by anecdote and oversimplify the choices that voters face, algorithms can chew through huge amounts of complicated information. The hope is that they’ll offer solutions we’ve never imagined — much as Google Maps, when you’re stuck in traffic, puts you on an alternate route, down streets you’ve never traveled.

Dataphiles say algorithms may even allow us to filter out the human biases that run through our criminal justice, social service, and education systems. And the MIT algorithm offered a small window into that possibility. The data showed that schools in whiter, better-off sections of Boston were more likely to have the school start times that parents prize most — between 8 and 9 a.m. The mere act of redistributing start times, if aimed at solving the sleep deprivation problem and saving money, could bring some racial equity to the system, too.

Or, the whole thing could turn into a political disaster.

District officials expected some pushback when they released the new school schedule on a Thursday night in December, with plans to implement in the fall of 2018. After all, they’d be messing with the schedules of families all over the city.

But no one anticipated the crush of opposition that followed. Angry parents signed an online petition and filled the school committee chamber, turning the plan into one of the biggest crises of Mayor Marty Walsh’s tenure. The city summarily dropped it. The failure would eventually play a role in the superintendent’s resignation.

It was a sobering moment for a public sector increasingly turning to computer scientists for help in solving nagging policy problems. What had gone wrong? Was it a problem with the machine? Or was it a problem with the people — both the bureaucrats charged with introducing the algorithm to the public, and the public itself?…(More)”

How Insurance Companies Used Bad Science to Discriminate


Jessie Wright-Mendoza at JSTOR Daily: “After the Civil War, the United States searched for ways to redefine itself. But by the 1880s, the hopes of Reconstruction had dimmed. Across the United States there was instead a push to formalize and legalize discrimination against African-Americans. The effort to marginalize the first generation of free black Americans infiltrated nearly every aspect of daily life, including the cost of insurance.

Initially, African-Americans could purchase life insurance policies on equal footing with whites. That all changed in 1881. In March of that year Prudential, one of the country’s largest insurers, announced that policies held by black adults would be worth one-third less than the same plans held by whites. Their weekly premiums would remain the same. Benefits for black children didn’t change, but weekly premiums for their policies would rise by five cents.

Prudential defended the decision by pointing out that the black mortality rate was higher than the white mortality rate. Therefore, they explained, claims paid out for black policyholders were a disproportionate amount of all payouts. Most of the major life insurance companies followed suit, making it nearly impossible for African-Americans to gain coverage. Across the industry, companies blocked agents from soliciting African-American customers and denied commission for any policies issued to blacks.

The public largely accepted the statistical explanation for unequal coverage. The insurer’s job was to calculate risk. Race was merely another variable like occupation or geographic location. As one trade publication put it in 1891: “Life insurance companies are not negro-maniacs, they are business institutions…there is no sentiment and there are no politics in it.”

Companies considered race-based risk the same for all African-Americans, whether they were strong or sickly, educated or uneducated, from the country or the city. The “science” behind the risk formula is credited to Prudential statistician Frederick L. Hoffman, whose efforts to prove the genetic inferiority of the black race were used to justify the company’s discriminatory policies….(More)”.
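
The 1881 change described above, a benefit worth one-third less for the same weekly premium, amounts to a price increase per dollar of coverage. A short worked calculation makes the size of that increase explicit. It uses only the one-third figure from the excerpt; the variable names are illustrative, and the children's policies are left out because the excerpt gives the five-cent rise but not the base premium.

```python
from fractions import Fraction

# From the excerpt: adult policies were worth one-third less, premiums unchanged.
benefit_cut = Fraction(1, 3)

# Benefit received per premium dollar, relative to an otherwise identical policy.
relative_benefit = 1 - benefit_cut

# Premium paid per dollar of benefit, relative to the same baseline.
relative_cost = 1 / relative_benefit

# Effective price increase per dollar of coverage.
increase = relative_cost - 1

print(f"benefit per premium dollar: {relative_benefit}")
print(f"cost per dollar of coverage: {relative_cost}")
print(f"effective price increase: {float(increase):.0%}")
```

In other words, cutting the payout by a third while holding premiums fixed is equivalent to charging half again as much for the same coverage.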