Estonian plan for 'data embassies' overseas to back up government databases


Graeme Burton in Computing: “Estonia is planning to open “data embassies” overseas to back up government databases and to operate government “in the cloud”.
The aim is partly to improve efficiency, but the plan is driven largely by fear of invasion and occupation, Jaan Priisalu, the director general of the Estonian Information System Authority, told Sky News.
He said: “We are planning to actually operate our government in the cloud. It’s clear also how it helps to protect the country, the territory. Usually when you are the military planner and you are planning the occupation of the territory, then one of the rules is suppress the existing institutions.
“And if you are not able to do it, it means that this political price of occupying the country will simply rise for planners.”
Part of the rationale for the plan, he continued, was fear of attack from Russia in particular, which has been heightened following the occupation of Crimea, formerly in Ukraine.
“It’s quite clear that you can have problems with your neighbours. And our biggest neighbour is Russia, and nowadays it’s quite aggressive. This is clear.”
The plan is to back up critical government databases outside of Estonia so that affairs of state can be conducted in the cloud, even if the country is invaded. It would also have the benefit of keeping government information out of invaders’ hands – provided it can keep its government cloud secure.
According to Sky News, the UK is already in advanced talks about hosting the Estonian government databases, which could make it the first of Estonia’s data embassies.
Having wrested independence from the Soviet Union in 1991, Estonia has experienced frequent tension with its much bigger neighbour. In April 2007, for example, after the relocation of the “Bronze Soldier of Tallinn” to a military cemetery and the exhumation of the soldiers buried in a square in the centre of the capital, the country was subjected to a prolonged cyber-attack traced to Russia.
Russian hacker “Sp0Raw” said that the most efficient of the online attacks on Estonia could not have been carried out without the approval of the Russian authorities, and added that the hackers seemed to act under “recommendations” from parties in government. However, he dismissed Estonia’s claims that the Russian government was directly involved in the attacks as “empty words, not supported by technical data”.
Mike Witt, deputy director of the US Computer Emergency Readiness Team (US-CERT), suggested that the distributed denial-of-service (DDoS) attacks, while crippling to the Estonian government at the time, were not significant in scale from a technical standpoint. The Estonian government was nevertheless forced to shut down many of its online operations in response.
At the same time, the Estonian government has been accused of implementing anti-Russian laws and discriminating against its large ethnic Russian population.
Last week, the Estonian government unveiled a plan to allow anyone in the world to apply for “digital citizenship” of the country, enabling them to use Estonian online services, open bank accounts, and start companies without having to physically reside in the country.”

How open data can help shape the way we analyse electoral behaviour


Harvey Lewis (Deloitte), Ulrich Atz, Gianfranco Cecconi, Tom Heath (ODI) in The Guardian: Even after the local council elections in England and Northern Ireland on 22 May, which coincided with polling for the European Parliament, the next 12 months remain a busy time for the democratic process in the UK.
In September, the people of Scotland make their choice in a referendum on the future of the Union. Finally, the first fixed-term parliament in Westminster comes to an end with a general election in all areas of Great Britain and Northern Ireland in May 2015.
To ensure that as many people as possible are eligible and able to vote, the government is launching an ambitious programme of Individual Electoral Registration (IER) this summer. This will mean that the traditional, paper-based approach to household registration will shift to a tailored and largely digital process more in keeping with the data-driven demands of the twenty-first century.
Under IER, citizens will need to provide ‘identifying information’, such as date of birth or national insurance number, when applying to register.

Ballots: stuck in the past?

However, despite the government’s attempts through IER to improve the veracity of information captured prior to ballots being posted, little has changed in terms of the vision for capturing, distributing and analysing digital data from election day itself.

Indeed, paper is still the chosen medium for data collection.
Digitising elections is fraught with difficulty, though. In the US, for example, the introduction of new voting machines created much controversy even though they are capable of providing ‘near-perfect’ ballot data.
The UK’s democratic process is not completely blind, though. Numerous opinion surveys are conducted both before and after polling, including the long-running British Election Study, to understand the shifting attitudes of a representative cross-section of the electorate.
But if the government does not retain digital information, in sufficient geographic detail, on the number of people who vote, how can it learn what is necessary to reverse the long-running decline in turnout?

The effects of a lack of data

To add to the debate around democratic engagement, a joint research team of data scientists from Deloitte and the Open Data Institute (ODI) has been attempting to understand what makes voters tick.
Our research has been hampered by a significant lack of relevant data describing voter behaviour at electoral ward level, as well as difficulties in matching what little data is available to other open data sources, such as demographic data from the 2011 Census.
Even though individual ballot papers are collected and verified for counting the number of votes per candidate – the primary aim of elections, after all – the only recent elections for which aggregate turnout statistics have been published at ward level are the 2012 local council elections in England and Wales. In these elections, approximately 3,000 wards from a total of over 8,000 voted.
Data published by the Electoral Commission for the 2013 local council elections in England and Wales purports to be at ward level but is, in fact, for ‘county electoral divisions’, as explained by the Office for National Statistics.
Moreover, important factors related to the accessibility of polling stations – such as the distance from main population centres – could not be assessed because the location of polling stations remains the responsibility of individual local authorities – and only eight of these have so far published their data as open data.
Given these fundamental limitations, drawing any robust conclusions is difficult. Nevertheless, our research shows the potential for forecasting electoral turnout with relatively few census variables, the most significant of which are age and the size of the electorate in each ward.
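As a rough illustration of the kind of model this points to (a sketch only, not the Deloitte/ODI team’s actual code or data), a ward-level turnout forecast from a couple of census variables might look like this in Python with scikit-learn; every figure below is invented:

```python
# Illustrative sketch only: a simple turnout model of the kind described
# above, fitted on made-up ward-level figures (not the ODI/Deloitte data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical wards: [median age, electorate size]
X = np.array([
    [34, 5200],
    [41, 7800],
    [52, 3100],
    [47, 6400],
    [29, 9000],
])
# Hypothetical turnout percentages for those wards
y = np.array([28.5, 36.2, 44.9, 40.1, 24.3])

model = LinearRegression().fit(X, y)
print("Coefficients (age, electorate):", model.coef_)
print("Predicted turnout for a ward [38, 6000]:",
      model.predict(np.array([[38, 6000]]))[0])
```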

What role can open data play?

The limited results described above provide a tantalising glimpse into a possible future scenario: where open data provides a deeper and more granular understanding of electoral behaviour.
On the back of more sophisticated analyses, policies for improving democratic engagement – particularly among young people – have the potential to become focused and evidence-driven.
And, although the data captured on election day will always serve primarily to elect the public’s preferred candidate, an important secondary consideration is aggregating and publishing data that can be used more widely.
This may have been prohibitively expensive or too complex in the past, but as storage and processing costs continue to fall, and the appetite for such knowledge grows, there is a compelling business case.
The benefits of this future scenario potentially include:

  • tailoring awareness and marketing campaigns to wards and other segments of the electorate most likely to respond positively and subsequently turn out to vote
  • increasing the efficiency with which European, general and local elections are held in the UK
  • improving transparency around the electoral process and stimulating increased democratic engagement
  • enhancing links to the Government’s other significant data collection activities, including the Census.

Achieving these benefits requires commitment to electoral data being collected and published in a systematic fashion at least at ward level. This would link work currently undertaken by the Electoral Commission, the ONS, Plymouth University’s Election Centre, the British Election Study and the more than 400 local authorities across the UK.”

How to treat government like an open source project


Ben Balter in OpenSource.com: “Open government is great. At least, it was a few election cycles ago. FOIA requests, open data, seeing how your government works—it’s arguably brought light to a lot of not-so-great practices, and in many cases, has spurred citizen-centric innovation not otherwise imagined before the information’s release.
It used to be that sharing information was really, really hard. Open government wasn’t even a possibility a few hundred years ago. Throughout the history of communication tools—be it the printing press, fax machine, or floppy disk—new tools have generally done three things: lowered the cost to transmit information, increased who that information could be made available to, and increased how quickly that information could be distributed. But printing presses and fax machines have two limitations: they are one-way and asynchronous. They let you more easily request, and eventually see, how the sausage was made, but they don’t let you actually take part in the sausage-making. You may be able to see what’s wrong, but you don’t have the chance to make it better. By the time you find out, it’s already too late.
As technology allows us to communicate with greater frequency and greater fidelity, we have the chance to make our government not only transparent, but truly collaborative.

So, how do we encourage policy makers and bureaucrats to move from open government to collaborative government, to learn open source’s lessons about openness and collaboration at scale?
For one, we geeks can help to create a culture of transparency and openness within government by driving up the demand side of the equation. Be vocal, demand data, expect to see process, and once released, help build lightweight apps. Show potential change agents in government that their efforts will be rewarded.
Second, it’s a matter of tooling. We’ve got great tools out there—things like Git that can track who made what change when, and open standards like CSV or JSON that don’t require proprietary software—but by and large they’re a foreign concept in government, at least among those empowered to make change. Command-line interfaces with black backgrounds and green text can be intimidating to government bureaucrats used to desktop publishing tools. Make it easier for government to do the right thing and choose open standards over proprietary tooling.
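That point about open standards is easy to make concrete. A minimal sketch, using only the Python standard library, that publishes the same (invented) records as both CSV and JSON, so that no proprietary software is needed to read them:

```python
# Publish the same (invented) records in two open, non-proprietary formats
# using only the Python standard library.
import csv
import json

records = [
    {"permit_id": 101, "issued": "2014-05-01", "status": "approved"},
    {"permit_id": 102, "issued": "2014-05-03", "status": "pending"},
]

# CSV: readable in any spreadsheet or text editor
with open("permits.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["permit_id", "issued", "status"])
    writer.writeheader()
    writer.writerows(records)

# JSON: readable by virtually any programming language
with open("permits.json", "w") as f:
    json.dump(records, f, indent=2)
```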
Last, be a good open source ambassador. Help your home city or state get involved with open source. Encourage them to take their first step (be it consuming open source, publishing, or collaborating with the public), and teach them what it means to do things in the open. And when they do push code outside the firewall, above all, be supportive. We’re in this together.
As technology makes it easier to work together, geeks can help make our government not just open, but in fact collaborative. Government is the world’s largest and longest-running open source project (bugs, trolls, and all). It’s time we start treating it like one.”

Open government: getting beyond impenetrable online data


Jed Miller in The Guardian: “Mathematician Blaise Pascal famously closed a long letter by apologising that he hadn’t had time to make it shorter. Unfortunately, his pithy point about “download time” is regularly attributed to Mark Twain and Henry David Thoreau, probably because the public loves writers more than it loves statisticians. Scientists may make things provable, but writers make them memorable.
The World Bank confronted a similar reality of data journalism earlier this month when it revealed that, of the 1,600 bank reports posted online from 2008 to 2012, 32% had never been downloaded at all and another 40% were downloaded fewer than 100 times each.
Taken together, these cobwebbed documents represent millions of dollars in World Bank funds and hundreds of thousands of person-hours, spent by professionals who themselves represent millions of dollars in university degrees. It’s difficult to see the return on investment in producing expert research and organising it into searchable web libraries when almost three quarters of the output goes largely unseen.
The World Bank works at a scale unheard of by most organisations, but expert groups everywhere face the same challenges. Too much knowledge gets trapped in multi-page pdf files that are slow to download (especially in low-bandwidth areas), costly to print, and unavailable for computer analysis until someone manually or automatically extracts the raw data.
Even those who brave the progress bar too often find that urgent, incisive findings about poverty, health, discrimination, conflict or social change are presented in prose written by and for high-level experts, rendering them impenetrable to almost everyone else. Information isn’t just trapped in pdfs; it’s trapped in PhDs.
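The extraction step mentioned above is mundane but worth seeing. A minimal sketch using the pypdf library, with a placeholder filename, that frees the text of a report for indexing or analysis:

```python
# Sketch of freeing text from a report PDF with the pypdf library;
# "report.pdf" is a placeholder for any multi-page document.
from pypdf import PdfReader

reader = PdfReader("report.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

print(f"{len(reader.pages)} pages, {len(text.split())} words extracted")
```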
Governments and NGOs are beginning to realise that digital strategy means more than posting a document online, but what will it take for these groups to change not just their tools, but their thinking? It won’t be enough to partner with WhatsApp or hire GrumpyCat.
I asked strategists from the development, communications and social media fields to offer simple, “Tweetable” suggestions for how the policy community can become better communicators.

For nonprofits and governments that still publish 100-page pdfs on their websites and do not optimise the content to share in other channels such as social: it is a huge waste of time and ineffective. Stop it now.

– Beth Kanter, author and speaker. Beth’s Blog: How Nonprofits Can Use Social Media

Treat text as #opendata so infomediaries can mash it up and make it more accessible (see, for example federalregister.gov) and don’t just post and blast: distribute information in a targeted way to those most likely to be interested.

– Beth Noveck, director at the Governance Lab and former director at White House Open Government Initiative

Don’t be boring. Sounds easy, actually quite hard, super-important.

– Eli Pariser, CEO of Upworthy

Surprise me. Uncover the key finding that inspired you, rather than trying to tell it all at once and show me how the world could change because of it.

– Jay Golden, co-founder of Wakingstar Storyworks

For the Bank or anyone who is generating policy information they actually want people to use, they must actually write it for the user, not for themselves. As Steve Jobs said, ‘Simple can be harder than complex’.

– Kristen Grimm, founder and president at Spitfire Strategies

The way to reach the widest audience is to think beyond content format and focus on content strategy.

– Laura Silber, director of public affairs at Open Society Foundations

Open the door to policy work with short, accessible pieces – a blog post, a video take, infographics – that deliver the ‘so what’ succinctly.

– Robert McMahon, editor at Council on Foreign Relations

Policy information is more usable if it’s linked to corresponding actions one can take, or if it helps stir debate.  Also, whichever way you slice it, there will always be a narrow market for raw policy reports … that’s why explainer sites, listicles and talking heads exist.

– Ory Okolloh, director of investments at Omidyar Network and former public policy and government relations manager at Google Africa
Ms Okolloh, who helped found the citizen reporting platform Ushahidi, also offered a simple reminder about policy reports: “‘Never gets downloaded’ doesn’t mean ‘never gets read’.” Just as we shouldn’t mistake posting for dissemination, we shouldn’t confuse popularity with influence….”

How The Right People Analyzing The Best Data Are Transforming Government


NextGov: “Analytics is often touted as a new weapon in the technology arsenal of bleeding-edge organizations willing to spend lots of money to combat problems.
In reality, that’s not the case at all. Certainly, there are complex big data analytics tools that will analyze massive data sets to look for the proverbial needle in a haystack, but analytics 101 also includes smarter ways to look at existing data sets.
In this arena, government is making serious strides, according to Kathryn Stack, advisor for evidence-based innovation at the Office of Management and Budget. Speaking in Washington on Thursday at an analytics conference hosted by IBM, Stack provided an outline for agencies to spur innovation and improve mission by making smarter use of the data they already produce.
Interestingly, the first step has nothing to do with technology and everything to do with people. Get “the right people in the room,” Stack said, and make sure they value learning.
“One thing I have learned in my career is that if you really want transformative change, it’s important to bring the right players together across organizations – from your own department and different parts of government,” Stack said. “Too often, we lose a lot of money when siloed organizations lose sight of what the problem really is and spend a bunch of money, and at the end of the day we have invested in the wrong thing that doesn’t address the problem.”
The Department of Labor provides a great example of how to change a static organizational culture into one that integrates performance management with evaluation- and innovation-based processes. The department, she said, created a chief evaluation office and set up evaluation offices for each of its bureaus. These offices were tasked with focusing on important questions to improve performance, going inside programs to learn what is and isn’t working, and identifying barriers that impede experimentation and learning. At the same time, they helped develop partnerships across the agency – of major importance for any organization looking to make drastic changes.
Don’t overlook experimentation either, Stack said. Citing innovation leaders in the private sector such as Google, which runs 12,000 randomized experiments per year, Stack said agencies should not be afraid to get out and run with ideas. Not all of them will be good – only about 10 percent of Google’s experiments usher in new business changes – but even failures can bring meaningful value to the mission.
Stack used an experiment conducted by the United Kingdom’s Behavioural Insights Team as evidence.
The team continually tweaked the language of tax compliance letters sent to individuals delinquent on their taxes. Extensive experimentation generated a wealth of data, and the team analyzed it to find that one phrase, “Nine out of ten Britons pay their taxes on time,” improved collected revenue by five percent. That case shows how failures can bring about important successes.
“If you want to succeed, you’ve got to be willing to fail and test things out,” Stack said.
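The statistics behind such letter trials are straightforward: compare payment rates between a control letter and a variant, and test whether the difference is larger than chance. A sketch of a two-proportion z-test with scipy, on invented counts rather than the Behavioural Insights Team’s actual figures:

```python
# Two-proportion z-test comparing payment rates for two letter variants.
# The counts are invented for illustration, not the BIT's actual results.
from math import sqrt
from scipy.stats import norm

paid_a, sent_a = 3540, 10000   # control letter
paid_b, sent_b = 3720, 10000   # "nine out of ten" social-norm letter

p_a, p_b = paid_a / sent_a, paid_b / sent_b
p_pool = (paid_a + paid_b) / (sent_a + sent_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / sent_a + 1 / sent_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - norm.cdf(abs(z)))

print(f"uplift: {p_b - p_a:.1%}, z = {z:.2f}, p = {p_value:.4f}")
```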
Any successful analytics effort in government is going to employ the right people, the best data – Stack said it’s not a secret that the government collects both useful and not-so-useful, “crappy” data – as well as the right technology and processes, too. For instance, there are numerous ways to measure return on investment, including dollars per customer served or costs per successful outcome.
“What is the total investment you have to make in a certain strategy in order to get a successful outcome?” Stack said. “Think about cost per outcome and how you do those calculations.”…”
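Stack’s closing question comes down to simple arithmetic, which a few lines make explicit; the figures here are invented for illustration:

```python
# Cost-per-outcome arithmetic with invented figures: total programme
# investment divided by the number of successful outcomes it produced.
total_investment = 2_400_000   # dollars spent on the strategy
customers_served = 60_000
successful_outcomes = 9_000    # e.g. participants placed in jobs

print(f"dollars per customer served: ${total_investment / customers_served:,.2f}")
print(f"cost per successful outcome: ${total_investment / successful_outcomes:,.2f}")
```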

Citizen participation and technology


ICTlogy: “The recent, rapid rise in the use of digital technology is changing relationships between citizens, organizations and public institutions, and expanding political participation. But while technology has the potential to amplify citizens’ voices, it must be accompanied by clear political goals and other factors to increase their clout.
Those are among the conclusions of a new NDI study, “Citizen Participation and Technology,” that examines the role digital technologies – such as social media, interactive websites and SMS systems – play in increasing citizen participation and fostering accountability in government. The study was driven by the recognition that better insights are needed into the relationship between new technologies, citizen participation programs and the outcomes they aim to achieve.
Using case studies from countries such as Burma, Mexico and Uganda, the study explores whether the use of technology in citizen participation programs amplifies citizen voices and increases government responsiveness and accountability, and whether the use of digital technology increases the political clout of citizens.
The research shows that while more people are using technology—such as social media for mobile organizing, interactive websites and text messaging systems that enable direct communication between constituents and elected officials, and the crowdsourcing of election day experiences—the type and quality of their political participation, and therefore its impact on democratization, varies. It also suggests that, in order to leverage technology’s potential, there is a need to focus on non-technological areas such as political organizing, leadership skills and political analysis.
For example, the “2% and More Women in Politics” coalition led by Mexico’s National Institute for Women (INMUJERES) used a social media campaign and an online petition to call successfully for reforms that would allocate two percent of political party funding for women’s leadership training. Technology helped the activists reach a wider audience, but women from the different political parties who made up the coalition might not have come together without NDI’s role as a neutral convener.
The study, which was conducted with support from the National Endowment for Democracy, provides an overview of NDI’s approach to citizen participation, and examines how the integration of technologies affects its programs in order to inform the work of NDI, other democracy assistance practitioners, donors, and civic groups.

Key findings:

  1. Technology can be used to readily create spaces and opportunities for citizens to express their voices, but making these voices politically stronger and the spaces more meaningful is a harder challenge that is political and not technological in nature.
  2. Technology that was used to purposefully connect citizens’ groups and amplify their voices had more political impact.
  3. There is a scarcity of data on specific demographic groups’ use of, and barriers to, technology for political participation. Programs seeking to close the digital divide as a means of narrowing the political divide should be informed by more research into barriers to access to both politics and technology.
  4. There is a blurring of the meaning between the technologies of open government data and the politics of open government that clouds program strategies and implementation.
  5. Attempts to simply crowdsource public inputs will not result in users self-organizing into politically influential groups, since citizens lack the opportunities to develop leadership, unity, and commitment around a shared vision necessary for meaningful collective action.
  6. Political will and the technical capacity to engage citizens in policy making, or to provide accurate data on government performance, are lacking in many emerging democracies. Technology may have changed institutions’ ability to respond to citizen demands, but its mere presence has not fundamentally changed actual government responsiveness.”

Crowdsourcing for public safety


Paper presented by A Goncalves, C Silva, P Morreale, J Bonafide at Systems Conference (SysCon), 2014: “With advances in mobile technology, the ability to get real-time geographically accurate data, including photos and videos, becomes integrated into daily activities. Businesses use this technology edge to stay ahead of their competitors. Social media has made photo and video sharing a widely accepted and adopted behavior. This real-time data and information exchange, crowdsourcing, can be used to help first responders and personnel in emergency situations caused by extreme weather such as earthquakes, hurricanes, floods, and snow storms. Using smartphones, civilians can contribute data and images to the recovery process and make it more efficient, which can ultimately save lives and decrease the economic impact caused by extreme weather conditions.”

Linking Social, Open, and Enterprise Data


Paper by T Omitola, J Davies, A Duke, H Glaser, N Shadbolt for Proceeding WIMS ’14 (Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics): “The new world of big data, of the LOD cloud, of the app economy, and of social media means that organisations no longer own, much less control, all the data they need to make the best informed business decisions. In this paper, we describe how we built a system using Linked Data principles to bring in data from Web 2.0 sites (LinkedIn, Salesforce), and other external business sites such as OpenCorporates, linking these together with pertinent internal British Telecommunications enterprise data into that enterprise data space. We describe the challenges faced during the implementation, which include sourcing the datasets, finding the appropriate “join points” from the individual datasets, as well as developing the client application used for data publication. We describe our solutions to these challenges and discuss the design decisions made. We conclude by drawing some general principles from this work.”
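The paper’s “join point” idea can be sketched in a few RDF triples: a shared identifier lets two datasets be linked with an owl:sameAs assertion. A minimal illustration using the rdflib Python library, with invented URIs rather than BT’s actual data:

```python
# Linking a record from an external dataset (e.g. OpenCorporates) to an
# internal enterprise record via a shared "join point". URIs are invented.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://enterprise.example/")
OC = Namespace("http://opencorporates.example/companies/")

g = Graph()
internal = EX["customer/4711"]
external = OC["gb/01234567"]

g.add((internal, RDF.type, EX.Customer))
g.add((internal, RDFS.label, Literal("Acme Widgets Ltd")))
# The join point: assert that both URIs denote the same company.
g.add((internal, OWL.sameAs, external))

print(g.serialize(format="turtle"))
```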

Online and social media data as a flawed continuous panel survey


Fernando Diaz, Michael Gamon, Jake Hofman, Emre Kıcıman, and David Rothschild from Microsoft Research: “There is a large body of research on utilizing online activity to predict various real world outcomes, ranging from outbreaks of influenza to outcomes of elections. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with comprehensive search history of a large panel of internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation is non-stationary and unpredictable over time. In addition, the nature of user contributions varies wildly around important events. Finally, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. These issues must be addressed before meaningful insight about public interest and opinion can be reliably extracted from online and social media data…”
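One standard correction for the demographic skew the authors describe is post-stratification: reweight each group’s responses by its census share. The paper suggests the problems run deeper (the skew is non-stationary and participation varies around events), but a toy sketch shows the basic mechanics; all numbers below are invented:

```python
# Post-stratification sketch: reweight online-sample opinion shares by
# census population shares. All numbers are invented for illustration.
population_share  = {"18-29": 0.21, "30-49": 0.34, "50+": 0.45}
sample_share      = {"18-29": 0.48, "30-49": 0.37, "50+": 0.15}
support_in_sample = {"18-29": 0.62, "30-49": 0.51, "50+": 0.40}

raw = sum(sample_share[g] * support_in_sample[g] for g in sample_share)
weighted = sum(population_share[g] * support_in_sample[g]
               for g in population_share)

print(f"raw online estimate: {raw:.1%}, post-stratified: {weighted:.1%}")
```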

 

Data Mining Reddit Posts Reveals How to Ask For a Favor–And Get it


Emerging Technology From the arXiv: “There’s a secret to asking strangers for something and getting it. Now data scientists say they’ve discovered it by studying successful requests on the web.

One of the more extraordinary phenomena on the internet is the rise of altruism and of websites designed to enable it. The Random Acts of Pizza section of the Reddit website is a good example.

People leave messages asking for pizza which others fulfil if they find the story compelling. As the site says: “because… who doesn’t like helping out a stranger? The purpose is to have fun, eat pizza and help each other out. Together, we aim to restore faith in humanity, one slice at a time.”

A request might go something like this: “It’s been a long time since my mother and I have had proper food. I’ve been struggling to find any kind of work so I can supplement my mom’s social security… A real pizza would certainly lift our spirits”. Anybody can then fulfil the order, which is marked on the site with a badge saying “got pizza’d”, often with notes of thanks.

That raises an interesting question. What kinds of requests are most successful in getting a response? Today, we get an answer thanks to the work of Tim Althoff at Stanford University and a couple of pals who lift the veil on the previously murky question of how to ask for a favour—and receive it.

They analysed how various features might be responsible for the success of a post: its politeness; its sentiment, whether positive or negative; and its length. The team also looked at the similarity of the requester to the benefactor, and the status of the requester.

Finally, they examined whether the post contained evidence of need in the form of a narrative that described why the requester needed free pizza.

Althoff and co used a standard machine learning algorithm to comb through all the possible correlations in 70 per cent of the data, which they used for training. Having found various correlations, they then tested whether these had predictive power on the remaining 30 per cent of the data. In other words, can their algorithm predict whether a previously unseen request will be successful or not?

It turns out that their algorithm makes a successful prediction about 70 per cent of the time. That’s far from perfect, but much better than random guessing, which is right only half the time.
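A toy version of that pipeline can be put together with scikit-learn: the same 70/30 split, bag-of-words features, and a logistic regression classifier. This is a sketch only; the requests and labels below are invented, and the authors’ actual feature set (narrative type, karma, reciprocation and so on) was far richer:

```python
# Toy version of the prediction task: 70/30 split, bag-of-words features,
# logistic regression. The requests and labels are invented; the paper's
# feature set (narrative type, karma, reciprocation, ...) was far richer.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

requests = [
    "lost my job this month, rent took everything, any food would help",
    "friends visiting, would be fun if someone fed us before the game",
    "single mom, paycheck late, kids have had only rice for two days",
    "celebrating the end of exams, craving pizza so badly right now",
    "medical bills wiped us out, happy to pay it forward next month",
    "bored and hungry tonight, surprise me with a pizza please",
]
success = [1, 0, 1, 0, 1, 0]  # 1 = request was fulfilled

X_train, X_test, y_train, y_test = train_test_split(
    requests, success, test_size=0.3, random_state=0, stratify=success)

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X_train), y_train)
preds = clf.predict(vec.transform(X_test))
print("held-out accuracy:", accuracy_score(y_test, preds))
```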

So what kinds of factors are important? Narrative is a key part of many of the posts, so Althoff and co spent some time categorising the types of stories people use.

They divided the narratives into five types, those that mention: money; a job; being a student; family; and a final group that includes mentions of friends, being drunk, celebrating and so on, which Althoff and co call ‘craving’.

Of these, narratives about jobs, family and money increase the probability of success. Student narratives have no effect while craving narratives significantly reduce the chances of success. In other words, narratives that communicate a need are more successful than those that do not.

“We find that clearly communicating need through the narrative is essential,” say Althoff and co. And evidence of reciprocation helps too.

(Given these narrative requirements, it is not surprising that longer requests tend to be more successful than short ones.)

So for example, the following request was successful because it clearly demonstrates both need and evidence of reciprocation.

“My gf and I have hit some hard times with her losing her job and then unemployment as well for being physically unable to perform her job due to various hand injuries as a server in a restaurant. She is currently petitioning to have unemployment reinstated due to medical reasons for being unable to perform her job, but until then things are really tight and ANYTHING would help us out right now.

I’ve been both a giver and receiver in RAOP before and would certainly return the favor again when I am able to reciprocate. It took everything we have to pay rent today and some food would go a long ways towards making our next couple of days go by much better with some food.”

By contrast, the ‘craving’ narrative below demonstrates neither and was not successful.

“My friend is coming in town for the weekend and my friends and i are so excited because we haven’t seen him since junior high. we are going to a high school football game then to the dollar theater after and it would be so nice if someone fed us before we embarked :)”

Althoff and co also say that the status of the requester is an important factor too. “We find that Reddit users with higher status overall (higher karma) or higher status within the subcommunity (previous posts) are significantly more likely to receive help,” they say.

But surprisingly, being polite does not help (except by offering thanks).

That’s interesting work. Until now, psychologists have never understood the factors that make requests successful, largely because it has always been difficult to separate the influence of the request from what is being requested.

The key here is that everybody making requests in this study wants the same thing—pizza. In one swoop, this makes the data significantly easier to tease apart.

An important line of future work will be in using this approach to understand altruistic behaviour in other communities too…

Ref: http://arxiv.org/abs/1405.3282 : How to Ask for a Favor: A Case Study on the Success of Altruistic Requests”