11 ways to rethink open data and make it relevant to the public


Miguel Paz at IJNET: “It’s time to transform open data from a trendy concept among policy wonks and news nerds into something tangible to everyday life for citizens, businesses and grassroots organizations. Here are some ideas to help us get there:
1. Improve access to data
Craig Hammer from the World Bank has tackled this issue, stating that “Open Data could be the game changer when it comes to eradicating global poverty”, but only if governments make available online data that become actionable intelligence: a launch pad for investigation, analysis, triangulation, and improved decision making at all levels.
2. Create open data for the end user
As Hammer wrote in a blog post for the Harvard Business Review, while the “opening” has generated excitement from development experts, donors, several government champions, and the increasingly mighty geek community, the hard reality is that much of the public has been left behind, or tacked on as an afterthought. Let’s get out of the building and start working for the end user.
3. Show, don’t tell
Regular folks don’t know what “open data” means. Actually, they probably don’t care what we call it and don’t know if they need it. Apple’s Steve Jobs said that a lot of times, people don’t know what they want until you show it to them. We need to stop telling them they need it and start showing them why they need it, through actionable user experience.
4. Make it relevant to people’s daily lives, not just to NGOs and policymakers’ priorities
A study of the use of open data and transparency in Chile showed the top 10 uses were for things that affect people’s lives directly, for better or for worse: data on government subsidies and support, legal certificates, information services, paperwork. If the data doesn’t speak to priorities at the household or individual level, we’ve lost the value of both the “opening” of data, and the data itself.
5. Invite the public into the sandbox
We need to give people “better tools to not only consume, but to create and manipulate data,” says my colleague Alvaro Graves, Poderopedia’s semantic web developer and researcher. This is what Code for America does, and it’s also what happened with the advent of Web 2.0, when the availability of better tools, such as blogging platforms, helped people create and share content.
6. Realize that open data are like QR codes
Everyone talks about open data the way they used to talk about QR codes–as something groundbreaking. But as with QR codes, open data only succeeds with the proper context to satisfy the needs of citizens. Context is the most important factor in driving the use and success of open data as a tool for global change.
7. Make open data sexy and pop, like Jess3.com
Geeks became popular because they made useful and cool things that could be embraced by end users. Open data geeks need to stick with that program.
8. Help journalists embrace open data
Jorge Lanata, a famous Argentinian journalist who is now being targeted by the Cristina Fernández administration for uncovering government corruption scandals, once said that 50 percent of the success of a story or newspaper is assured if journalists like it.
That’s true of open data as well. If journalists understand its value for the public interest and learn how to use it, so will the public. And if they do, the winds of change will blow. Governments and the private sector will be forced to provide better, more up-to-date and standardized data. Open data will be understood not as a concept but as a public information source as relevant as any other. We need to teach Latin American journalists to be part of this.
9. News nerds can help you put your open data to good use
In order to boost the use of open data by journalists, we need news nerds: teams of lightweight and tech-heavy armored journalist-programmers who can teach colleagues how open data brings us high-impact storytelling that can change public policies and hold authorities accountable.
News nerds can also help us with “institutionalizing data literacy across societies” as Hammer puts it. ICFJ Knight International Journalism Fellow and digital strategist Justin Arenstein calls these folks “mass mobilizers” of information. Alex Howard “points to these groups because they can help demystify data, to make it understandable by populations and not just statisticians.”
I call them News Ninja Nerds, accelerator taskforces that can foster innovations in news, data and transparency in a speedy way, saving governments and organizations time and a lot of money. Projects like ProPublica’s Dollars For Docs are great examples of what can be achieved if you mix FOIA, open data and the will to provide news in the public interest.
10. Rename open data
Part of the reason people don’t embrace concepts such as open data is that they are part of a lingo that has nothing to do with them. No empathy involved. Let’s start talking about people’s right to know and use the data generated by governments. As Tim O’Reilly puts it: “Government as a Platform for Greatness,” with examples we can relate to, instead of dead PDFs and dirty databases.
11. Don’t expect open data to substitute for thinking or reporting
Investigative reporting can benefit from it, but “there is no substitute for the kind of street-level digging, personal interviews, and detective work” that great journalism projects entail, says David Kaplan in a great post entitled “Why Open Data is Not Enough.”

Three ways digital leaders can operate successfully in local government


In The Guardian: “The landscape of digital is constantly changing and being redefined with every new development, technology breakthrough, success and failure. We need digital public sector leaders who can properly navigate this environment, and follow these three guidelines.
1. Champion open data
We need leaders who can ensure that information and data is open by default, and secure when absolutely required. Too often councils commission digital programmes only to find the data generated does not easily integrate with other systems, or that data is not council-owned and can only be accessed at further cost.
2. Don’t get distracted by flashy products
Leaders must adopt an agnostic approach to technology, and not get seduced by the ever-increasing number of digital technologies and lose sight of real user and business needs.
3. Learn from research
Tales of misplaced IT investments plague the public sector, and senior leaders are understandably hesitant when considering future investments. To avoid causing even more disruption, we should learn from research findings such as those of the New Local Government Network’s recent digital roundtables on what works.
Making the decision to properly invest in digital leadership will not just improve decision making about digital solutions and strategies. It will also bring in the knowledge needed to navigate the complex security requirements that surround public-sector IT. And it will ensure that practices honed in the digital environment become embedded in the council more generally.
In Devon, for example, we are making sure all the services we offer online are based on the experience and behaviour of users. This has led service teams to refocus on the needs of citizens rather than those of the organisation. And our experiences of future proofing, agility and responsiveness are informing service design throughout the council.
What’s holding us back?
Across local government there is still a fragmented approach to collaboration. In central government, the Government Digital Service is charged with providing the right environment for change across all government departments. However, in local government, digital leaders often work alone without a unifying strategy across the sector. It is important to understand and recognise that the Government Digital Service is more than just a team pushing and promoting digital in central government: they are the future of central government, attempting to transform everything.
Initiatives such as LocalGov Digital, O2’s Local Government Digital Fund, the DCLG’s local digital alliance, and the Guardian’s many public sector forums and networks are all helping to push forward debate, spread good practice and build a sense of urgent optimism around the local government digital agenda. But at present there is no equivalent to the unified force of the Government Digital Service.”

Canadian Organizations Join Forces to Launch Open Data Institute to Foster Open Government


Press Release: “The Canadian Digital Media Network, the University of Waterloo, Communitech, OpenText and Desire2Learn today announced the creation of the Open Data Institute.

The Open Data Institute, which received support from the Government of Canada in this week’s budget, will work with governments, academic institutions and the private sector to solve challenges facing “open government” efforts and realize the full potential of “open data.”
According to a statement, partners will work on development of common standards, the integration of data from different levels of government and the commercialization of data, “allowing Canadians to derive greater economic benefit from datasets that are made available by all levels of government.”
The Open Data Institute is a public-private partnership. Founding partners will contribute $3 million in cash and in-kind contributions over three years to establish the institute, a figure that has been matched by the Government of Canada.
“This is a strategic investment in Canada’s ability to lead the digital economy,” said Kevin Tuer, Managing Director of CDMN. “Similar to how a common system of telephone exchanges allowed world-wide communication, the Open Data Institute will help create a common platform to share and access datasets.”
“This will allow the development of new applications and products, creating new business opportunities and jobs across the country,” he added.
“The Institute will serve as a common forum for government, academia and the private sector to collaborate on Open Government initiatives with the goal of fueling Canadian tech innovation,” noted OpenText President and CEO Mark J. Barrenechea.
“The Open Data Institute has the potential to strengthen the regional economy and increase our innovative capacity,” added Feridun Hamdullahpur, president and vice-chancellor of the University of Waterloo.

The newsonomics of measuring the real impact of news


Ken Doctor at Nieman Journalism Lab: “Hello there! It’s me, your friendly neighborhood Tweet Button. What if you could tap me and unlock a brand new source of funding for startup news sources of all kinds? What if, even better, you the reader could tap that money loose with a single click?
That’s the delightfully simple conceit behind a little widget, Impaq.me, you may have seen popping up as you traverse the news web. It’s social. It’s viral. It uses OPM (Other People’s Money) — and maybe a little bit of your own. It makes a new case to funders and maybe commercial sponsors. And it spits out metrics around the clock. It aims to be a convergence widget, acting on that now-aging idea that our attention is as important as our wallet. Consider it a new digital Swiss Army knife for the attention economy.
It’s impossible to tell how much of an impact Impaq.me may have. It’s still in its second round of testing at six of the U.S.’s most successful independent nonprofit startups — MinnPost, Center for Investigative Reporting, The Texas Tribune, Voice of San Diego, ProPublica, and the Center for Public Integrity — but as in all things digital, timing is everything. And that timing seems right.
First, let’s consider that spate of new news sites that have sprouted with the winter rains — Bill Keller’s and Neil Barsky’s Marshall Project being only the latest. It’s been quite a run — from Ezra Klein’s Project X to Pierre Omidyar’s First Look (and its just-launched The Intercept) to the reimagining of FiveThirtyEight. While they encompass a broad range of business models and goals (“The newsonomics of why everyone seems to be starting a news site”), they all need two things: money and engagement. Or, maybe better ordered, engagement and money. The dance between the two is still in the early stages of Internet choreography. Get the sequences right and you win.
Second, and related, is the big question of “social” and how our sharing of news is changing the old publishing dynamic of editors deciding what we’re going to read. Just this week, two pieces here at the Lab — one on Upworthy’s influence and one on the social/search tango — highlighted the still-being-understood role of social in our news-reading lives.
Third, funders of news sites, especially Knight and other lead foundations, are looking for harder evidence of the value generated by their early grants. Millions have been poured into creating new news sites. Now they’re asking: What has our funding really done? Within that big question, Impaq.me is only one of several new attempts to demonstrably measure real impact in new ways. We’ll take a brief look at those impact initiatives below….
If Impaq.me is all about impact and money, then it’s got good company. There are at least two other noteworthy impact-measuring projects going on.

  • The Center for Investigative Reporting’s impact-tracking initiative, Impact Tracker, launched last fall. The big idea: getting beyond the traditional metrics like unique visitors and pageviews to track the value of investigative and enterprise work. To that end, CIR has hired Lindsay Green-Barber, a CUNY-trained social scientist, and given her a perhaps first-ever title: media impact analyst. We can see the fruits of the work around CIR’s impressive Returning Home to Battle veterans series. On that series, CIR is tracking such impacts as the change and rise in public discourse around veterans’ issues and the related allocation of government resources. The notion of good journalism intended to shine a light in dark places has been embedded in the CIR DNA for a long time; this new effort is intended to provide data — and words — to describe progress toward solutions. CIR is working with The Seattle Times on the impact of that paper’s education reporting, and CIR may soon look at more partnerships as well. Related: CIR is holding two “Dissection” events in New York and Washington in April, bringing together journalists, funders, and social scientists to widen the media impact movement.
  • Chalkbeat, a growing national education news site, is also moving on impact analysis. Its tool is called MORI (Measures of our Reporting’s Influence), and it’s a WordPress plugin. Says Chalkbeat cofounder Elizabeth Green: “We built MORI to solve for a problem that I guess you could call ‘impact loss.’ We knew that our stories were having all kinds of impacts, but we had no way of keeping track of these impacts or making sense of them. That meant that we couldn’t easily compile what we had done in the last year to share with the outside world (board, donors, foundations, readers, our moms) but also — just as important — we couldn’t look back on what we’d done and learn from it.” Sound familiar?
    After much inquiry, Chalkbeat settled on technology. “Within each story’s back end,” Green said, “we can enter inputs — qualitative data about the type of story, topic, and target audience — as well as outcomes — impacts on policy and practice (what we call ‘informed action’) as well as impacts on what we call ‘civic deliberation.’”
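
The inputs-and-outcomes structure Green describes maps naturally onto a simple per-story record. Purely as an illustrative sketch in Python (the field names below are assumptions, not MORI's actual schema):

    # Illustrative sketch only: a per-story impact record in the spirit of
    # Chalkbeat's MORI plugin. Field names are assumptions, not MORI's schema.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class StoryImpactRecord:
        # "Inputs": qualitative data about the story itself
        story_type: str
        topic: str
        target_audience: str
        # "Outcomes": impacts logged after publication
        informed_action: List[str] = field(default_factory=list)     # changes in policy or practice
        civic_deliberation: List[str] = field(default_factory=list)  # public debate the story fed

    record = StoryImpactRecord(
        story_type="investigation",
        topic="school funding",
        target_audience="district officials",
    )
    record.informed_action.append("District revises its funding formula")
    record.civic_deliberation.append("Story cited in a school board hearing")
    print(record)

Keeping outcomes attached to the story that produced them is what makes it possible to compile a year's worth of impacts for boards, donors and foundations, which is exactly the "impact loss" problem Green describes.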

The City as a Platform – Stripping out complexity and Making Things Happen


Emer Coleman: “The concept of data platforms has garnered a lot of coverage over the past few years and the City as a Platform is one that has wide traction in the “Smart City” space. It’s an idea that has been widely promulgated by service integrators and large consultancy firms. This idea has been adopted into the thinking of many cities in the UK, increasingly by local authorities who have both been forced by central government diktat to open their data and who are also engaging with many of the large private companies who sell infrastructure and capabilities and with whom they may have existing contractual arrangements.
Standard interpretations of city as platform usually involve the idea that the city authority will create the platform into which it will release its data. It then seeks the integration of APIs (both external and internal) into the platform so that, theoretically, the user can access that data via a unified City API on which developers can then create products and services.
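
In practice, a "unified City API" typically means a single HTTP endpoint fronting many departmental datasets behind one access scheme. A minimal, hypothetical sketch of the developer's side of that arrangement (the base URL, dataset name and parameters are invented for illustration and do not describe any particular council's platform):

    # Hypothetical sketch of consuming a unified "City API".
    # The endpoint, dataset name, and query parameters are invented for
    # illustration; real city platforms each define their own schemes.
    import requests

    BASE_URL = "https://api.example-city.gov/v1"  # hypothetical endpoint

    def get_dataset(name: str, **params) -> list:
        """Fetch records from one dataset exposed through the unified API."""
        resp = requests.get(f"{BASE_URL}/datasets/{name}/records",
                            params=params, timeout=10)
        resp.raise_for_status()
        return resp.json().get("records", [])

    # e.g. pull recent air-quality readings for one district and build on them
    readings = get_dataset("air-quality", district="north", limit=100)
    print(len(readings), "records")

The appeal for developers is obvious: one authentication scheme, one data format, one place to look. Whether a local authority can actually deliver and sustain that single facade is what the rest of this piece calls into question.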

Some local authorities seek to monetise access to this API, while others see it as a mechanism for encouraging the development of new products and services that are of value to the state but which have been developed without direct additional investment by the state, thereby generating public good from the public task of collecting and storing data.
This concept of city as platform integrated by local authorities appears at first glance to be a logical, linear and achievable goal, but in my view it completely misunderstands a number of key factors:
1. The evolution of the open data/big data market
2. Commercial and Technical realities
3. Governance and bureaucracy
I’ll explore these below…”

Algorithmic Accountability Reporting: On the Investigation of Black Boxes


New report by Nicholas Diakopoulos: “The past three years have seen a small profusion of websites, perhaps as many as 80, spring up to capitalize on the high interest that mug shot photos generate online. Mug shots are public record, artifacts of an arrest, and these websites collect, organize, and optimize the photos so that they’re found more easily online. Proponents of such sites argue that the public has a right to know if their neighbor, romantic date, or colleague has an arrest record. Still, mug shots are not proof of conviction; they don’t signal guilt.
Having one online is likely to result in a reputational blemish; having that photo ranked as the first result when someone searches for your name on Google turns that blemish into a garish reputational wound, festering in facile accessibility. Some of these websites are exploiting this, charging people to remove their photo from the site so that it doesn’t appear in online searches. It’s reputational blackmail. And remember, these people aren’t necessarily guilty of anything.
To crack down on the practice, states like Oregon, Georgia, and Utah have passed laws requiring these sites to take down the photos if the person’s record has been cleared. Some credit card companies have stopped processing payments for the seediest of the sites. Clearly both legal and market forces can help curtail this activity, but there’s another way to deal with the issue too: algorithms. Indeed, Google recently launched updates to its ranking algorithm that down-weight results from mug shot websites, basically treating them more as spam than as legitimate information sources. With a single knock of the algorithmic gavel, Google declared such sites illegitimate.
At the turn of the millennium, 14 years ago, Lawrence Lessig taught us that “code is law”—that the architecture of systems, and the code and algorithms that run them, can be powerful influences on liberty. We’re living in a world now where algorithms adjudicate more and more consequential decisions in our lives. It’s not just search engines either; it’s everything from online review systems to educational evaluations, the operation of markets to how political campaigns are run, and even how social services like welfare and public safety are managed. Algorithms, driven by vast troves of data, are the new power brokers in society.
As the mug shots example suggests, algorithmic power isn’t necessarily detrimental to people; it can also act as a positive force. The intent here is not to demonize algorithms, but to recognize that they operate with biases like the rest of us. And they can make mistakes. What we generally lack as a public is clarity about how algorithms exercise their power over us. With that clarity comes an increased ability to publicly debate and dialogue the merits of any particular algorithmic power. While legal codes are available for us to read, algorithmic codes are more opaque, hidden behind layers of technical complexity. How can we characterize the power that various algorithms may exert on us? And how can we better understand when algorithms might be wronging us? What should be the role of journalists in holding that power to account?
In the next section I discuss what algorithms are and how they encode power. I then describe the idea of algorithmic accountability, first examining how algorithms problematize and sometimes stand in tension with transparency. Next, I describe how reverse engineering can provide an alternative way to characterize algorithmic power by delineating a conceptual model that captures different investigative scenarios based on reverse engineering algorithms’ input-output relationships. I then provide a number of illustrative cases and methodological details on how algorithmic accountability reporting might be realized in practice. I conclude with a discussion about broader issues of human resources, legality, ethics, and transparency.”
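
To see what reverse engineering by input-output probing can look like in the simplest possible terms, here is a toy sketch: treat the algorithm as a black box, vary one input at a time, and compare the outputs. The scoring function below is invented purely to stand in for an opaque system such as a search ranker; it is not drawn from the report.

    # Toy illustration of probing a black box's input-output relationship.
    # `black_box` stands in for an opaque scoring algorithm whose code we
    # cannot read; we only observe what it returns for inputs we control.
    import random

    def black_box(query: str, is_mugshot_site: bool) -> float:
        # Hidden behaviour (unknown to the investigator): mugshot sites are down-weighted.
        base = random.uniform(0.4, 0.6)
        return base * (0.3 if is_mugshot_site else 1.0)

    def probe(n_trials: int = 500) -> None:
        random.seed(0)
        mugshot = [black_box("jane doe", True) for _ in range(n_trials)]
        ordinary = [black_box("jane doe", False) for _ in range(n_trials)]
        print("mean score, mugshot site: ", round(sum(mugshot) / n_trials, 3))
        print("mean score, ordinary site:", round(sum(ordinary) / n_trials, 3))

    probe()  # a large, consistent gap between the means is evidence of down-weighting

Real investigations are of course harder: inputs are high-dimensional, outputs are noisy, and the system can change at any time, but the basic logic of controlled inputs and observed outputs is the same.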

House Bill Raises Questions about Crowdsourcing


Anne Bowser for Commons Lab (Wilson Center): “A new bill in the House is raising some key questions about how crowdsourcing is understood by scientists, government agencies, policymakers and the public at large.
Robin Bravender’s recent article in Environment & Energy Daily, “House Republicans Push Crowdsourcing on Agency Science” (subscription required), neatly summarizes the debate around H.R. 4012, a bill introduced to the House of Representatives earlier this month. The House Science, Space and Technology Committee earlier this week held a hearing on the bill, which could see a committee vote as early as next month.
Dubbed the “Secret Science Reform Act of 2014,” the bill prohibits the Environmental Protection Agency (EPA) from “proposing, finalizing, or disseminating regulations or assessments based upon science that is not transparent or reproducible.” If the bill is passed, EPA would be unable to base assessments or regulations on any information not “publicly available in a manner that is sufficient for independent analysis.” This would include all information published in scholarly journals based on data that is not available as open source.
The bill is based on the premise that forcing EPA to use public data will inspire greater transparency by allowing “the crowd” to conduct independent analysis and interpretation. While the premise of involving the public in scientific research is sound, this characterization of crowdsourcing as a process separate from traditional scientific research is deeply problematic.
This division contrasts with the current practices of many researchers, who use crowdsourcing to directly involve the public in scientific processes. Galaxy Zoo, for example, enlists digital volunteers (called “citizen scientists”) to help classify more than 40 million photographs of galaxies taken by the Hubble Telescope. These crowdsourced morphological classifications are a powerful form of data analysis, a key aspect of the scientific process. Galaxy Zoo then publishes a catalogue of these classifications as an open-source data set. And the data reduction techniques and measures of confidence and bias for the data catalogue are documented in MNRAS, a peer-reviewed journal. A recent Google Scholar search shows that the data set published in MNRAS has been cited a remarkable 121 times.
As this example illustrates, crowdsourcing is often embedded in the process of formal scientific research. But prior to being published in a scientific journal, the crowdsourced contributions of non-professional volunteers are subject to the scrutiny of professional scientists through the rigorous process of peer review. Because peer review was designed as an institution to ensure objective and unbiased research, peer-reviewed scientific work is widely accepted as the best source of information for any science-based decision.
Separating crowdsourcing from the peer review process, as this legislation intends, means that there will be no formal filters in place to ensure that open data will not be abused by special interests. Ellen Silbergeld, a professor at Johns Hopkins University who testified at the hearing this week, made exactly this point when she pointed to data manipulation commonly practiced by tobacco lobbyists in the United States.
Contributing to scientific research is one goal of crowdsourcing for science. Involving the public in scientific research also increases volunteer understanding of research topics and the scientific process and inspires heightened community engagement. These goals are supported by President Obama’s Second Open Government National Action Plan, which calls for “increased crowdsourcing and citizen science programs” to support “an informed and active citizenry.” But H.R. 4012 does not support these goals. Rather, this legislation could further degrade the public’s understanding of science by encouraging the public to distrust professional scientists rather than collaborate with them.
Crowdsourcing benefits organizations by bringing in the unique expertise held by external volunteers, which can augment and enhance the traditional scientific process. In return, these volunteers benefit from exposure to new and exciting processes, such as scientific research. This mutually beneficial relationship depends on collaboration, not opposition. Supporting an antagonistic relationship between science-based organizations like the EPA and members of “the crowd” will benefit neither institutions, nor volunteers, nor the country as a whole.
 

The GovLab Index: Designing for Behavior Change


Please find below the latest installment in The GovLab Index series, inspired by the Harper’s Index. “The GovLab Index: Designing for Behavior Change” explores the recent application of psychology and behavioral economics towards solving social issues and shaping public policy and programs. Previous installments include The Networked Public, Measuring Impact with Evidence, Open Data, The Data Universe, Participation and Civic Engagement and Trust in Institutions.

  • Year the Behavioural Insights or “Nudge” Team was established by David Cameron in the U.K.: 2010
  • Amount saved per year by the U.K. Courts Service since the creation of the Nudge unit, by sending personalized text messages to people owing fines to persuade them to pay promptly: £30m
    • Entire budget for the Behavioural Insights Team: less than £1 million
    • Estimated reduction in bailiff interventions through the use of personalized text reminders: 150,000 fewer interventions annually
  • Percentage increase among British residents who paid their taxes on time when they received a letter saying that most citizens in their neighborhood pay their taxes on time: 15%
  • Estimated increase in organ-donor registrations in the U.K. if people are asked “If you needed an organ transplant, would you take one?”: 96,000
  • Proportion of employees who now have a workplace pension since the U.K. government switched from opt-in to opt-out (illustrating the power of defaults): 83%, up from 63% before the switch
  • Increase in 401(k) enrollment rates within the U.S. by changing the default from ‘opt in’ to ‘opt out’: from 13% to 80%
  • Behavioral studies have shown that consumers overestimate savings from credit cards with no annual fees. Reduction in overall borrowing costs to consumers by requiring card issuers to tell consumers how much it would cost them in fees and interest, under the 2009 CARD Act in the U.S.: 1.7% of average daily balances 
  • Many high school students and their families in the U.S. find financial aid forms for college complex and thus delay filling them out. Increase in college enrollment as a result of being helped to complete the FAFSA financial aid form by an H&R tax professional, who then provided immediate estimates of the amount of aid the student was eligible for, and the net tuition cost of four nearby public colleges: 26%
  • How much more likely people are to keep accounting records, calculate monthly revenues, and separate their home and business books if given “rules of thumb”-based training with regards to managing their finances, according to a randomized control trial conducted in a bank in the Dominican Republic: 10%
  • Elderly Americans are asked to choose from over 40 options when enrolling in Medicare Part D private drug plans. The share who switched plans to save money when they received a letter providing information about three plans that would be cheaper for them: almost double
    • The amount saved on average per person by switching plans due to this intervention: $150 per year
  • Increase in prescriptions to manage cardiac disease when Medicaid enrollees are sent a suite of behavioral nudges such as more salient description of the consequences of remaining untreated and post-it note reminders during an experiment in the U.S.: 78%
  • Reduction in street-litter when a trail of green footprints leading to nearby garbage cans is stenciled on the ground during an experiment in Copenhagen, Denmark: 46%
  • Reduction in missed National Health Service appointments in the U.K. when patients are asked to fill out their own appointment cards: 18%
    • Reduction in missed appointments when patients are also made aware of the number of people who attend their appointments on time: 31%
    • The cost of non-attendance per year for the National Health Service: £700m 
  • How many people in a U.S. experiment chose to ‘downsize’ their meals when asked, regardless of whether they received a discount for the smaller portion: 14-33%
    • Average reduction in calories as a result of downsizing: 200
  • Proportion of households in the U.K. without properly insulated attics, leading to high energy consumption and bills: 40%
    • Result of offering group discounts to motivate households to insulate their attics: no effect
    • Increase in households that agreed to insulate their attics when offered loft-clearing services, even though they had to pay for the service: 4.8-fold

Full list and sources at http://thegovlab.org/the-govlab-index-designing-for-behavior-change/
 

Big Data for Law


legislation.gov.uk: “The National Archives has received ‘big data’ funding from the Arts and Humanities Research Council (AHRC) to deliver the ‘Big Data for Law’ project. Just over £550,000 will enable the project to transform how we understand and use current legislation, delivering a new service – legislation.gov.uk Research – by March 2015. There are an estimated 50 million words in the statute book, with 100,000 words added or changed every month. Search engines and services like legislation.gov.uk have transformed access to legislation. Law is accessed by a much wider group of people, the majority of whom are typically not legally trained or qualified.
All users of legislation are confronted by the volume of legislation, its piecemeal structure, frequent amendments, and the interaction of the statute book with common law and European law. Not surprisingly, many find the law difficult to understand and comply with. There has never been a more relevant time for research into the architecture and content of law, the language used in legislation and how, through interpretation by the courts, it is given effect. Research that will underpin the drive to deliver good, clear and effective law.
Researchers typically lack the raw data, the tools, and the methods to undertake research across the whole statute book. Meanwhile, the combination of low cost cloud computing, open source software and new methods of data analysis – the enablers of the big data revolution – are transforming research in other fields. Big data research is perfectly possible with legislation if only the basic ingredients – the data, the tools and some tried and trusted methods – were as readily available as the computing power and the storage. The vision for this project is to address that gap by providing a new Legislation Data Research Infrastructure at research.legislation.gov.uk. Specifically tailored to researchers’ needs, it will consist of downloadable data, online tools for end-users, and open source tools for researchers to download, adapt and use….
There are three main areas for research:

  • Understanding researchers’ needs: to ensure the service is based on evidenced need, capabilities and limitations, putting big data technologies in the hands of non-technical researchers for the first time.
  • Deriving new open data from closed data: no one has all the data that researchers might find useful. For example, the potentially personally identifiable data about users and usage of legislation.gov.uk cannot be made available as open data but is perfect for processing using existing big data tools; e.g. to identify clusters in legislation or “recommendations” datasets of “people who read Act A or B also looked at Act Y or Z”. The project will look at whether it is possible to create new open data sets from this type of closed data. An N-Grams dataset and an appropriate user interface for legislation or related case law, for example, would contain sequences of words/phrases and statistics about their frequency of occurrence per document. N-Grams are useful for research in linguistics or history, and could be used to provide a predictive text feature in a drafting tool for legislation (a minimal sketch of such n-gram counts appears after this list).
  • Pattern language for legislation: We need new ways of codifying and modelling the architecture of the statute book to make it easier to research its entirety using big data technologies. The project will seek to learn from other disciplines, applying the concept of a ‘pattern language’ to legislation. Pattern languages have revolutionised software engineering over the last twenty years and have the potential to do the same for our understanding of the statute book. A pattern language is simply a structured method of describing good design practices, providing a common vocabulary between users and specialists, structured around problems or issues, with a solution. Patterns are not created or invented – they are identified as ‘good design’ based on evidence about how useful and effective they are. Applied to legislation, this might lead to a common vocabulary between the users of legislation and legislative drafters, to identifying useful and effective drafting practices and solutions that deliver good law. This could enable a radically different approach to structuring teaching materials or guidance for legislators.”
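
To make the N-Grams idea from the second bullet concrete, here is a minimal sketch of counting n-gram frequencies in a passage of legislative text. The sample sentences and the choice of trigrams are illustrative only; they say nothing about how the project's actual dataset will be built.

    # Minimal sketch: n-gram frequency counts of the kind an N-Grams dataset
    # for legislation might contain. The sample text is illustrative only.
    from collections import Counter
    import re

    def ngrams(text: str, n: int = 3) -> Counter:
        """Count word n-grams in a document."""
        words = re.findall(r"[a-z]+", text.lower())
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

    sample = ("The Secretary of State may by regulations make provision for the "
              "purposes of this Act. The Secretary of State must consult such "
              "persons as the Secretary of State considers appropriate.")

    for gram, count in ngrams(sample).most_common(3):
        print(" ".join(gram), count)

Aggregated per document across the statute book, counts like these are what would let researchers trace drafting patterns over time or feed the predictive-text drafting feature mentioned above.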

Open Data is an Essential Ingredient for Better Development Research


AidData blog post: “UNICEF is making data a priority by re-launching the “UNICEF Child Info” department as “UNICEF Data” and actively promoting the use and collection of data to guide development. While their data is not subnational, it is comprehensive and expansive in its indicators. UNICEF’s mission calls for the use of the power of statistics and data to tell a story about the quality of life for children around the world. The connection between improving data and improving lives is a critical one that, while sometimes overshadowed by technical discussions on providing better data, is at the core of open data and data transparency initiatives. By using evidence to anchor their decision-making, the UNICEF Data initiative hopes to craft and inspire better ways of caring for and empowering children across the globe.”