Access to Government Information in the United States: A Primer


Wendy Ginsberg and Michael Greene at Congressional Research Service: “No provision in the U.S. Constitution expressly establishes a procedure for public access to executive branch records or meetings. Congress, however, has legislated various public access laws. Among these laws are two records access statutes,

  • the Freedom of Information Act (FOIA; 5 U.S.C. §552), and
  • the Privacy Act (5 U.S.C. §552a),

and two meetings access statutes,

  • the Federal Advisory Committee Act (FACA; 5 U.S.C. App.), and
  • the Government in the Sunshine Act (5 U.S.C. §552b).

These four laws provide the foundation for access to executive branch information in the American federal government. The records-access statutes provide the public with a variety of methods to examine how executive branch departments and agencies execute their missions. The meeting-access statutes provide the public the opportunity to participate in and inform the policy process. These four laws are also among the most used and most litigated federal access laws.

While the four statutes provide the public with access to executive branch federal records and meetings, they do not apply to the legislative or judicial branches of the U.S. government. The American separation of powers model of government provides a collection of formal and informal methods that the branches can use to provide information to one another. Moreover, the separation of powers anticipates conflicts over the accessibility of information. These conflicts are neither unexpected nor necessarily destructive. Although there is considerable interbranch cooperation in the sharing of information and records, such conflicts over access may continue on occasion.

This report offers an introduction to the four access laws and provides citations to additional resources related to these statutes. This report includes statistics on the use of FOIA and FACA and on litigation related to FOIA. The 114th Congress may have an interest in overseeing the implementation of these laws or may consider amending the laws. In addition, this report provides some examples of the methods Congress, the President, and the courts have employed to provide or require the provision of information to one another. This report is a primer on information access in the U.S. federal government and provides a list of resources related to transparency, secrecy, access, and nondisclosure….(More)”

It’s not big data that discriminates – it’s the people that use it


In The Conversation: “Data can’t be racist or sexist, but the way it is used can help reinforce discrimination. The internet means more data is collected about us than ever before and it is used to make automatic decisions that can hugely affect our lives, from our credit scores to our employment opportunities.

If that data reflects unfair social biases against sensitive attributes, such as our race or gender, the conclusions drawn from that data might also be based on those biases.

But this era of “big data” doesn’t need to entrench inequality in this way. If we build smarter algorithms to analyse our information and ensure we’re aware of how discrimination and injustice may be at work, we can actually use big data to counter our human prejudices.

This kind of problem can arise when computer models are used to make predictions in areas such as insurance, financial loans and policing. If members of a certain racial group have historically been more likely to default on their loans, or been more likely to be convicted of a crime, then the model can deem these people more risky. That doesn’t necessarily mean that these people actually engage in more criminal behaviour or are worse at managing their money. They may just be disproportionately targeted by police and sub-prime mortgage salesmen.

Excluding sensitive attributes

Data scientist Cathy O’Neil has written about her experience of developing models for homeless services in New York City. The models were used to predict how long homeless clients would be in the system and to match them with appropriate services. She argues that including race in the analysis would have been unethical.

If the data showed white clients were more likely to find a job than black ones, the argument goes, then staff might focus their limited resources on those white clients that would more likely have a positive outcome. While sociological research has unveiled the ways that racial disparities in homelessness and unemployment are the result of unjust discrimination, algorithms can’t tell the difference between just and unjust patterns. And so datasets should exclude characteristics that may be used to reinforce the bias, such as race.

But this simple response isn’t necessarily the answer. For one thing, machine learning algorithms can often infer sensitive attributes from a combination of other, non-sensitive facts. People of a particular race may be more likely to live in a certain area, for example. So excluding those attributes may not be enough to remove the bias….
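To see how easily this can happen, here is a minimal sketch in Python (synthetic data and invented feature names, purely illustrative): even with the sensitive column withheld from the model, a standard classifier recovers it from a correlated proxy such as residential area.

```python
# A minimal sketch on synthetic data: the sensitive attribute is never a
# feature, yet a proxy (residential area) lets a model recover it anyway.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

sensitive = rng.integers(0, 2, size=n)              # protected group label (hidden)
area = np.where(rng.random(n) < 0.8, sensitive, 1 - sensitive)  # proxy: 80% aligned
income = rng.normal(50 + 10 * (1 - sensitive), 15, size=n)      # weakly correlated

X = np.column_stack([area, income])                 # "non-sensitive" features only
X_train, X_test, y_train, y_test = train_test_split(X, sensitive, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print(f"Sensitive attribute recovered with {clf.score(X_test, y_test):.0%} accuracy")
```

Any downstream model trained on these features can therefore pick up the same group distinctions that dropping the column was meant to remove.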

An enlightened service provider might, upon seeing the results of the analysis, investigate whether and how racism is a barrier to their black clients getting hired. Equipped with this knowledge they could begin to do something about it. For instance, they could ensure that local employers’ hiring practices are fair and provide additional help to those applicants more likely to face discrimination. The moral responsibility lies with those responsible for interpreting and acting on the model, not the model itself.

So the argument that sensitive attributes should be stripped from the datasets we use to train predictive models is too simple. Of course, collecting sensitive data should be carefully regulated because it can easily be misused. But misuse is not inevitable, and in some cases, collecting sensitive attributes could prove absolutely essential in uncovering, predicting, and correcting unjust discrimination. For example, in the case of homeless services discussed above, the city would need to collect data on ethnicity in order to discover potential biases in employment practices….(More)

A new data viz tool shows what stories are being undercovered in countries around the world


Joseph Lichterman at NiemanLab: “It’s a common lament: Though the Internet provides us access to a nearly unlimited number of sources for news, most of us rarely venture beyond the same few sources or topics. And as news consumption shifts to our phones, people are using even fewer sources: On average, consumers access 1.52 trusted news sources on their phones, according to the 2015 Reuters Digital News Report, which studied news consumption across several countries.

To try and diversify people’s perspectives on the news, Jigsaw — the tech incubator, formerly known as Google Ideas, that’s run by Google’s parent company Alphabet — this week launched Unfiltered.News, an experimental site that uses Google News data to show users what topics are being underreported or are popular in regions around the world.


Unfiltered.News’ main data visualization shows which topics are most reported in countries around the world. A column on the right side of the page highlights stories that are being reported widely elsewhere in the world, but aren’t in the top 100 stories on Google News in the selected country. In the United States yesterday, five of the top 10 underreported topics, unsurprisingly, dealt with soccer. In China, Barack Obama was the most undercovered topic….(More)”
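The underreported-topics idea itself is simple set arithmetic. The toy sketch below (invented topic lists and threshold, not Jigsaw’s actual method) flags topics that appear in several other countries’ top stories but are missing from the selected country’s:

```python
# Toy version of the "underreported" logic: topics common in other countries'
# top stories but absent from the selected country's own top list.
from collections import Counter

top_topics = {  # hypothetical per-country top-story topics
    "US": {"election", "soccer", "trade"},
    "UK": {"election", "soccer", "brexit"},
    "CN": {"trade", "economy", "soccer"},
    "BR": {"soccer", "economy", "election"},
}

def underreported(country, min_countries=2):
    elsewhere = Counter(
        topic
        for c, topics in top_topics.items() if c != country
        for topic in topics
    )
    return {t for t, n in elsewhere.items()
            if n >= min_countries and t not in top_topics[country]}

print(underreported("CN"))  # -> {'election'}
```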

Can Big Data Help Measure Inflation?


Bourree Lam in The Atlantic: “…As more and more people are shopping online, calculating this index has gotten more difficult, because there haven’t been any great ways of recording prices from the sites of disparate retailers. Data shared by retailers and compiled by the technology firm Adobe might help close this gap. The company is perhaps best known for its visual software, including Photoshop, but it has also become a provider of software and analytics for online retailers. Adobe is now aggregating the sales data that flows through its software for its Digital Price Index (DPI) project, an initiative that’s meant to answer some of the questions that have been dogging researchers now that e-commerce is such a big part of the economy.

The project, which tracks billions of online transactions and the prices of over a million products, was developed with the help of the economists Austan Goolsbee, the former chairman of Obama’s Council of Economic Advisers and a professor at the University of Chicago’s Booth School of Business, and Peter Klenow, a professor at Stanford University. “We’ve been excited to help them set up various measures of the digital economy, and of prices, and also to see what the Adobe data can teach us about some of the questions that everybody’s had about the CPI,” says Goolsbee. “People are asking questions like ‘How price sensitive is online commerce?’ ‘How much is it growing?’ ‘How substitutable is it for non-electronic commerce?’ A lot of issues you can address with this in a way that we haven’t really been able to do before.” These are some questions that the DPI has the potential to answer.
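For a sense of what building a price index from transaction data involves, the sketch below computes a simple Laspeyres-style matched-model index over two months of invented product prices; Adobe’s actual DPI methodology is considerably more sophisticated.

```python
# Illustrative only: a matched-model Laspeyres index over invented prices.
prices_jan = {"laptop": 900.0, "headphones": 60.0, "keyboard": 40.0}
prices_feb = {"laptop": 880.0, "headphones": 63.0, "keyboard": 40.0}
qty_jan = {"laptop": 10, "headphones": 50, "keyboard": 30}   # base-period sales

# Cost of the January basket at February prices, relative to January prices.
matched = prices_jan.keys() & prices_feb.keys()              # same products only
base = sum(prices_jan[p] * qty_jan[p] for p in matched)
current = sum(prices_feb[p] * qty_jan[p] for p in matched)
print(f"Price index: {100 * current / base:.1f}")            # 100.0 = no change
```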

…While this new trove of data will certainly be helpful to economists and analysts looking at inflation, it surely won’t replace the CPI. Currently, the government sends out hundreds of BLS employees to stores around the country to collect price data. Online pricing is a small part of the BLS calculation, which is incorporated into its methodology as people increasingly report shopping from retailers online, but there’s a significant time lag. While it’s unlikely that the BLS would incorporate private sources of data into its inflation calculations, as e-commerce grows they might look to improve the way they include online prices. Still, economists are optimistic about the potential of Adobe’s DPI. “I don’t think we know the digital economy as well as we should,” says Klenow, “and this data can help us eventually nail that better.”…(More)

Crowdlaw and open data policy: A perfect match?


At Sunlight: “The open government community has long envisioned a future where all public policy is collaboratively drafted online and in the open — a future in which we (the people) don’t just have a say in who writes and votes on the rules that govern our society, but are empowered in a substantive way to participate, annotating or even crafting those rules ourselves. If that future seems far away, it’s because we’ve seen few successful instances of this approach in the United States. But an increasing number of open and collaborative online approaches to drafting legislation — a set of practices the NYU GovLab and others have called “crowdlaw” — seem to have found their niche in open data policy.

This trend has taken hold at the local level, where multiple cities have employed crowdlaw techniques to draft or revise the regulations which establish and govern open data initiatives. But what explains this trend and the apparent connection between crowdlaw and the proactive release of government information online? Is it simply that both are “open government” practices? Or is there something more fundamental at play here?…

Since 2012, several high-profile U.S. cities have utilized collaborative tools such as Google Docs, GitHub, and Madison to open up the process of open data policymaking. The below chronology of notable instances of open data policy drafted using crowdlaw techniques gives the distinct impression of a good idea spreading in American cities:….

While many cities may not be ready to take their hands off the wheel and trust the public to help engage in meaningful decisions about public policy, it’s encouraging to see some giving it a try when it comes to open data policy. Even for cities still feeling skeptical, this approach can be applied internally; it allows other departments impacted by changes that come about through an open data policy to weigh in, too. Cities can open up varying degrees of the process, retaining as much autonomy as they feel comfortable with. In the end, utilizing the crowdlaw process with open data legislation can increase its effectiveness and accountability by engaging the public directly — a win-win for governments and their citizens alike….(More)”

Social Media for Government: Theory and Practice


Book edited by Staci M. Zavattaro and Thomas A. Bryer: “Social media is playing a growing role within public administration, and with it, there is an increasing need to understand the connection between social media research and what actually takes place in government agencies. Most of the existing books on the topic are scholarly in nature, often leaving out the vital theory-practice connection. This book joins theory with practice within the public sector, and explains how the effectiveness of social media can be maximized. The chapters are written by leading practitioners and span topics like how to manage employee use of social media sites, how emergency managers reach the public during a crisis situation, applying public record management methods to social media efforts, how to create a social media brand, how social media can help meet government objectives such as transparency while juggling privacy laws, and much more. For each topic, a collection of practitioner insights regarding the best practices and tools they have discovered are included. Social Media for Government responds to calls within the overall public administration discipline to enhance the theory-practice connection, giving practitioners space to tell academics what is happening in the field in order to encourage further meaningful research into social media use within government….(More)

Cities, Data, and Digital Innovation


Paper by Mark Kleinman: “Developments in digital innovation and the availability of large-scale data sets create opportunities for new economic activities and new ways of delivering city services while raising concerns about privacy. This paper defines the terms Big Data, Open Data, Open Government, and Smart Cities and uses two case studies – London (U.K.) and Toronto – to examine questions about using data to drive economic growth, improve the accountability of government to citizens, and offer more digitally enabled services. The paper notes that London has been one of a handful of cities at the forefront of the Open Data movement and has been successful in developing its high-tech sector, although it has so far been less innovative in the use of “smart city” technology to improve services and lower costs. Toronto has also made efforts to harness data, although it is behind London in promoting Open Data. Moreover, although Toronto has many assets that could contribute to innovation and economic growth, including a growing high-technology sector, world-class universities and research base, and its role as a leading financial centre, it lacks a clear narrative about how these assets could be used to promote the city. The paper draws some general conclusions about the links between data innovation and economic growth, and between open data and open government, as well as ways to use big data and technological innovation to ensure greater efficiency in the provision of city services…(More)

App turns smartphones into seismic monitors


Springwise: “MyShake is an app that enables anyone to contribute to a worldwide seismic network and help people prepare for earthquakes.

The sheer number of smartphones on the planet makes them excellent tools for collecting scientific data. We have already seen citizen scientists use their devices to help crowdsource big data about jellyfish and pollution. Now, MyShake is an Android app from UC Berkeley, which enables anyone to contribute to a worldwide seismic network and help reduce the effects of earthquakes.

To begin, users download the app and enable it to run silently in the background of their smartphone. The app monitors for movement that fits the vibrational profile of an earthquake and sends anonymous information to a central system whenever relevant. The crowdsourced data enables the system to confirm an impending quake and estimate its origin time, location and magnitude. Then, the app can send warnings to those in the network who are likely to be affected by the earthquake. MyShake makes use of the fact that the average smartphone can record earthquakes larger than magnitude five and within 10 km.
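As a rough illustration of that on-device step, the sketch below (hypothetical thresholds and sampling rate, not MyShake’s actual classifier) flags a window of accelerometer readings as quake-like when the shaking is sustained rather than the brief sharp spike of, say, a dropped phone.

```python
# Simplified shake detector: sustained moderate motion, not a single jolt.
# Thresholds and sampling rate are invented for illustration.
import numpy as np

SAMPLE_HZ = 50                      # assumed accelerometer sampling rate
WINDOW_SEC = 2

def looks_like_quake(accel_g: np.ndarray) -> bool:
    """accel_g: acceleration magnitudes (in g) for one analysis window."""
    amplitude = np.abs(accel_g - accel_g.mean())   # remove the gravity offset
    sustained = np.mean(amplitude > 0.02) > 0.5    # shaking for most of the window
    not_a_jolt = amplitude.max() < 1.0             # a dropped phone spikes higher
    return sustained and not_a_jolt

# Synthetic quake-like window: 3 Hz oscillation of 0.05 g around gravity.
t = np.linspace(0, WINDOW_SEC, SAMPLE_HZ * WINDOW_SEC)
window = 1.0 + 0.05 * np.sin(2 * np.pi * 3 * t)
if looks_like_quake(window):
    print("Candidate quake: send anonymized trigger to the central system")
```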


MyShake is free to download and the team hopes to launch an iPhone version in the future….(More)”

Responsible Data reflection stories


Responsible Data Forum: “Through the various Responsible Data Forum events over the past couple of years, we’ve heard many anecdotes of responsible data challenges faced by people or organizations. These include potentially harmful data management practices, situations where people have experienced gut feelings that there is potential for harm, or workarounds that people have created to avoid those situations.

But we feel that trading in these “war stories” isn’t the most useful way for us to learn from these experiences as a community. Instead, we have worked with our communities to build a set of Reflection Stories: a structured, well-researched knowledge base on the unforeseen challenges and (sometimes) negative consequences of using technology and data for social change.

We hope that this can offer opportunities for reflection and learning, as well as helping to develop innovative strategies for engaging with technology and data in new and responsible ways….

What we learned from the stories

New spaces, new challenges

Moving into new digital spaces is bringing new challenges, and social media is one such space where these challenges are proving very difficult to navigate. This seems to stem from a number of key points:

  • organisations with low levels of technical literacy and experience in tech- or data-driven projects, deciding to engage suddenly with a certain tool or technology without realising what this entails. For some, this seems to stem from funders being more willing to support ‘innovative’ tech projects.
  • organisations wishing to engage more with social media while not being aware of more nuanced understandings of public/private spaces online, and how different communities engage with social media. (see story #2)
  • unpredictability and different levels of visibility: due to how privacy settings on Twitter are currently set, visibility of users can be increased hugely by the actions of others – and once that happens, a user actually has very little agency to change or reverse that. Sadly, being more visible on, for example, Twitter disproportionately affects women and minority groups in a negative way – so while ‘signal boosting’ to raise someone’s profile might be well-meant, the consequences are hard to predict, and almost impossible to reverse manually. (see story #4)
  • consent: related to the above point, “giving consent” can mean many different things when it comes to digital spaces, especially if the person in question has little experience or understanding of using the technology in question (see stories #4 and #5).

Grey areas of responsible data

In almost all of the cases we looked at, very few decisions were concretely “right” or “wrong”. There are many, many grey areas here, which need to be addressed on a case by case basis. In some cases, people involved really did think through their actions, and approached their problems thoughtfully and responsibly – but consequences that they had not imagined happened anyway (see story #8).

Additionally, given the quickly moving nature of the space, challenges can arise that simply would not have been possible at the start.

….Despite the widely varying settings of the stories collected, the shared mitigation strategies indicate that there are indeed a few key principles that can be kept in mind throughout the development of a new tech- or data-driven project.

The most stark of these – and one key aspect underlying many of these challenges – is a fundamental lack of technical literacy among advocacy organisations. This affects the way they interact with technical partners, the decisions they make around the project, the level to which they can have meaningful input, and more. Perhaps more crucially, it also affects the ability to know what to ask for help about – i.e., to ‘know the unknowns’.

Building an organisation’s technical literacy might not mean being able to answer all technical questions in-house, but rather knowing what to ask and what to expect in an answer, from others. For advocacy organisations who don’t (yet) have this, it becomes all too easy to outsource not just the actual technical work but the contextual decisions too, which should be a collaborative process, benefiting from both sets of expertise.

There seems to be a lot of scope to expand this set of stories both in terms of collecting more from other advocacy organisations, and into other sectors, too. Ultimately, we hope that sharing our collective intelligence around lessons learned from responsible data challenges faced in projects will contribute to a greater understanding for all of us….Read all the stories here

Political Behavior and Big Data


Special issue of the International Journal of Sociology: “Interest in the use of “big data” in the social sciences is growing dramatically. Yet, adequate methodological research on what constitutes such data, and about their validity, is lacking. Scholars face both opportunities and challenges inherent in this new era of unprecedented quantification of information, including that related to political actions and attitudes. This special issue of the International Journal of Sociology addresses recent uses of “big data,” its multiple meanings, and the potential that this may have in building a stronger understanding of political behavior. We present a working definition of “big data” and summarize the major issues involved in their use. While the papers in this volume deal with various problems – how to integrate “big data” sources with cross-national survey research, the methodological challenges involved in building cross-national longitudinal network data of country memberships in international nongovernmental organizations, methods of detecting and correcting for source selection bias in event data derived from news and other online sources, the challenges and solutions to ex post harmonization of international social survey data – they share a common viewpoint. To make good on the substantive promise of “big data,” scholars need to engage with their inherent methodological problems. At this date, scholars are only beginning to identify and solve them….(More)”