Open for Business: How Open Data Can Help Achieve the G20 Growth Target


New report commissioned by Omidyar Network on the Business Case for Open Data: “Economic analysis has confirmed the significant contribution to economic growth and productivity achievable through an open data agenda. Governments, the private sector, individuals and communities all stand to benefit from the innovation and information that will inform investment, drive the creation of new industries, and inform decision making and research. To mark a step change in the way valuable information is created and reused, the G20 should release information as open data.
In May 2014, Omidyar Network commissioned Lateral Economics to undertake economic analysis of the potential of open data to support the G20’s 2% growth target and to illustrate how an open data agenda can make a significant contribution to economic growth and productivity. Across all G20 economies combined, output could increase by USD 13 trillion cumulatively over the next five years. Implementation of open data policies would thus boost cumulative G20 GDP by around 1.1 percentage points, almost 55% of the G20’s 2% growth target over five years.
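The headline share is simple arithmetic on the report’s own two numbers:

$$\frac{1.1\ \text{percentage points}}{2.0\ \text{percentage points}} = 0.55 \approx 55\%$$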
Recommendations
Importantly, open data cuts across a number of this year’s G20 priorities: attracting private infrastructure investment, creating jobs and lifting participation, strengthening tax systems and fighting corruption. This memo suggests an open data thread that runs across all G20 priorities. The more data is opened, the more it can be used, reused, repurposed and built on—in combination with other data—for everyone’s benefit.
We call on G20 economies to sign up to the Open Data Charter.
The G20 should ensure that data released by G20 working groups and themes is in line with agreed open data standards. This will lead to more accountable, efficient and effective governments that go further to expose inadequacy, fight corruption and spur innovation.
Data is a national resource, and open data is a ‘win-win’ policy: it is about making more of existing resources. We know that the cost of opening data is smaller than the economic returns, which could be significant. Methods to address privacy concerns must be built in. If this is done, then as the sharing of information by the public and private sectors grows, there will be increasing positive returns.
The G20 opportunity
This November, leaders of the G20 Member States will meet in Australia to drive forward commitments made in the St Petersburg G20 Leaders Declaration last September and to make firm progress on stimulating growth. Actions across the G20 will include increasing investment, lifting employment and participation, enhancing trade and promoting competition.
The resulting ‘Brisbane Action Plan’ will encapsulate all of these commitments with the aim of raising the level of G20 output by at least 2% above the currently projected level over the next five years. There are major opportunities for cooperative and collective action by G20 governments.
Governments should intensify the release of existing public sector data – both government and publicly funded research data. But much more can be done to promote open data than simply releasing more government data. In appropriate circumstances, governments can mandate public disclosure of private sector data (e.g. in corporate financial reporting).
Recommendations for action

  • G20 governments should adopt the principles of the Open Data Charter to encourage the building of stronger, more interconnected societies that better meet the needs of our citizens and allow innovation and prosperity to flourish.
  • G20 governments should adopt specific open data targets under each G20 theme, as illustrated below, such as releasing open data related to beneficial owners of companies as well as revenues from extractive industries.
  • G20 governments should consider harmonizing licensing regimes across the G20.
  • G20 governments should adopt metrics for measuring the quantity and quality of open data publication, e.g. using the Open Data Institute’s Open Data Certificates as a bottom-up mechanism for driving the adoption of common standards.

Illustrative G20 examples
Fiscal and monetary policy
Governments possess rich real-time data that is neither open nor accessed by government macro-economic managers. G20 governments should:

  • Open up models that lie behind economic forecasts and help assess alternative policy settings;
  • Publish spending and contractual data to enable governments to comparison-shop among suppliers.

Anti-corruption
Open data may directly contribute to reduced corruption by increasing the likelihood that corruption will be detected. G20 governments should:

  • Release open data related to beneficial owners of companies as well as revenues from extractive industries;
  • Collaborate on harmonised technical standards that permit the tracing of international money flows – including the tracing of beneficial owners of commercial entities, and the comparison and reconciliation of transactions across borders.

Trade
Obtaining and using trade data from multiple jurisdictions is difficult. Access fees, specific licenses, and non-machine-readable formats all involve large transaction costs. G20 governments should:

  • Harmonise open data policies related to trade data.
  • Use standard trade schema and formats.

Employment
Higher quality information on employment conditions would facilitate better matching of employees to organizations, producing greater job satisfaction and improved productivity. G20 governments should:

  • Open up centralised job vacancy registers to provide new mechanisms for people to find jobs.
  • Provide open statistical information about the demand for skills in particular areas to help those supporting training and education to hone their offerings.

Energy
Open data will help reduce the cost of energy supply and improve energy efficiency. G20 governments should:

  • Provide incentives for energy companies to publish open data from consumers and suppliers to enable cost savings through optimizing energy plans.
  • Release energy performance certifications for buildings.
  • Publish real-time energy consumption for government buildings.

Infrastructure
Information about current infrastructure assets is fragmented and inefficient to access. Exposing current asset data would be a significant first step in understanding gaps and providing new insights. G20 governments should:

  • Publish open data on governments’ infrastructure assets and plans to better understand infrastructure gaps, enable greater efficiency and insight in infrastructure development and use, and analyse costs and benefits.
  • Publish open infrastructure data, including contracts via Open Contracting Partnership, in a consistent and harmonised way across G20 countries…”

Big Data, My Data


Jane Sarasohn-Kahn at iHealthBeat: “The routine operation of modern health care systems produces an abundance of electronically stored data on an ongoing basis,” Sebastian Schneeweiss writes in a recent New England Journal of Medicine Perspective.
Is this abundance of data a treasure trove for improving patient care and growing knowledge about effective treatments? Or is that data trove a Pandora’s black box that can be mined by obscure third parties to benefit for-profit companies without rewarding those whose data are said to be the new currency of the economy: the patients themselves?
In this emerging world of data analytics in health care, there’s Big Data and there’s My Data (“small data”). Who most benefits from the use of My Data may not actually be the consumer.
Big focus on Big Data. Several reports published in the first half of 2014 talk about the promise and perils of Big Data in health care. The Federal Trade Commission’s study, titled “Data Brokers: A Call for Transparency and Accountability,” analyzed the business practices of nine “data brokers,” companies that buy and sell consumers’ personal information from a broad array of sources. Data brokers sell consumers’ information to buyers looking to use those data for marketing, managing financial risk or identifying people. There are health implications in all of these activities, and the use of such data generally is not covered by HIPAA. The report discusses the example of a data segment called “Smoker in Household,” which a company selling a new air filter for the home could use to target-market to an individual who might seek such a product. On the downside, without the consumers’ knowledge, the information could be used by a financial services company to identify the consumer as a bad health insurance risk.
“Big Data and Privacy: A Technological Perspective,” a report from the President’s Council of Advisors on Science and Technology, considers the growth of Big Data’s role in helping inform new ways to treat diseases and presents two scenarios of the “near future” of health care. The first, on personalized medicine, recognizes that not all patients are alike or respond identically to treatments. Data collected from a large number of similar patients (such as digital images, genomic information and granular responses to clinical trials) can be mined to develop a treatment with an optimal outcome for the patients. In this case, patients may have provided their data based on the promise of anonymity but would like to be informed if a useful treatment has been found. In the second scenario, detecting symptoms via mobile devices, people wishing to detect early signs of Alzheimer’s Disease in themselves use a mobile device connecting to a personal coach in the Internet cloud that supports and records activities of daily living: say, gait when walking, notes on conversations and physical navigation instructions. For both of these scenarios, the authors ask, “Can the information about individuals’ health be sold, without additional consent, to third parties? What if this is a stated condition of use of the app? Should information go to the individual’s personal physicians with their initial consent but not a subsequent confirmation?”
The World Privacy Forum’s report, titled “The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your Future,” describes the growing market for developing indices on consumer behavior, identifying over a dozen health-related scores. Health scores include the Affordable Care Act Individual Health Risk Score, the FICO Medication Adherence Score, various frailty scores, personal health scores (from WebMD and OneHealth, whose default sharing setting is based on the user’s sharing setting with the RunKeeper mobile health app), Medicaid Resource Utilization Group Scores, the SF-36 survey on physical and mental health and complexity scores (such as the Aristotle score for congenital heart surgery). WPF presents a history of consumer scoring beginning with the FICO score for personal creditworthiness and recommends regulatory scrutiny on the new consumer scores for fairness, transparency and accessibility to consumers.
At the same time these three reports went to press, scores of news stories emerged discussing the Big Opportunities Big Data present. The June issue of CFO Magazine published a piece called “Big Data: Where the Money Is.” InformationWeek published “Health Care Dives Into Big Data,” Motley Fool wrote about “Big Data’s Big Future in Health Care” and WIRED called “Cloud Computing, Big Data and Health Care” the “trifecta.”
In a well-timed release on June 5, the Office of the National Coordinator for Health IT detailed its Roadmap for Interoperability in a white paper titled “Connecting Health and Care for the Nation: A 10-Year Vision to Achieve an Interoperable Health IT Infrastructure.” The document envisions the long view for the U.S. health IT ecosystem enabling people to share and access health information, ensuring quality and safety in care delivery, managing population health, and leveraging Big Data and analytics. Notably, “Building Block #3” in this vision is ensuring privacy and security protections for health information. ONC will “support developers creating health tools for consumers to encourage responsible privacy and security practices and greater transparency about how they use personal health information.” Looking forward, ONC notes the need for “scaling trust across communities.”
Consumer trust: going, going, gone? In the stakeholder community of U.S. consumers, there is declining trust between people and the companies and government agencies with whom people deal. Only 47% of U.S. adults trust companies with whom they regularly do business to keep their personal information secure, according to a June 6 Gallup poll. Furthermore, 37% of people say this trust has decreased in the past year. Who’s most trusted to keep information secure? Banks and credit card companies come in first place, trusted by 39% of people, and health insurance companies come in second, trusted by 26% of people.
Trust is a basic requirement for health engagement. Health researchers need patients to share personal data to drive insights, knowledge and treatments back to the people who need them. PatientsLikeMe, the online social network, launched the Data for Good project to inspire people to share personal health information, imploring them to “Donate your data for You. For Others. For Good.” For 10 years, patients have been sharing personal health information on the PatientsLikeMe site, which has developed trusted relationships with more than 250,000 community members…”

How to Make Government Data Sites Better


Flowing Data: “Accessing government data from the source is frustrating. If you’ve done it, or at least tried to, you know the pain that is oddly formatted files, search that doesn’t work, and annotation that tells you nothing about the data in front of you.
The most frustrating part of the process is knowing how useful the data could be if only it were shared more simply. Unfortunately, ease-of-use is rarely the case, and we spend more time formatting and inspecting the data than we do actually putting it to use. Shouldn’t it be the other way around?
It’s this painstaking process that draws so much ire. It’s hard not to complain.
Maybe the people in charge of these sites just don’t know what’s going on. Or maybe they’re so overwhelmed by suck that they don’t know where to start. Or they’re unknowingly infected by the that-is-how-we’ve-always-done-it bug.
Whatever it may be, I need to think out loud about how to improve these sites. Empty complaints don’t help.
I use the Centers for Disease Control and Prevention as the test subject, but most of the things covered should easily generalize to other government sites (and non-government ones too). And I choose CDC not because they’re the worst but because they publish a lot of data that is of immediate and direct use to the general public.
I approach this from the point of view of someone who uses government data, beyond pulling a single data point from a spreadsheet. I’m also going to put on my Captain Obvious hat, because what seems obvious to some is apparently a black box to others.
Provide a useable data format
Sometimes it feels like government data is available in every format except the one that data users want. The worst was when I downloaded a 2GB file and, upon unzipping it, discovered it was an EXE file.
Data in PDF format is a kick in the face for people looking for CSV files. There might be ways to get the data out from PDFs, but it’s still a pain when you have more than a handful of files….
Useable data format is the most important, and if there’s just one thing you change, make it this.
(Raw data is fine too)
It’s rare to find raw government data, so it’s like striking gold when it actually happens. I realize you run into issues with data privacy, quality, missing data, etc. For these data sources, I appreciate the estimates with standard errors. However, the less aggregated (the more raw) you can provide, the better.
CSV for that too, please.
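To make the contrast concrete, here is a minimal Python sketch (the file names are hypothetical, standing in for any published table): a well-published CSV loads in one line, while the same table trapped in a PDF needs a scraping library such as pdfplumber and page-by-page table extraction. That extra loop is exactly the overhead the post is complaining about.

```python
# A minimal sketch, assuming hypothetical files "mortality.csv" and
# "mortality.pdf" that contain the same table.
import pandas as pd
import pdfplumber  # pip install pdfplumber

# The CSV case: one line, typed columns, ready for analysis.
df = pd.read_csv("mortality.csv")

# The PDF case: open the file, walk every page, hope the table-detection
# heuristics match the layout, then rebuild a data frame by hand.
rows = []
with pdfplumber.open("mortality.pdf") as pdf:
    for page in pdf.pages:
        table = page.extract_table()  # may return None on layout-heavy pages
        if table:
            rows.extend(table)
df_from_pdf = pd.DataFrame(rows[1:], columns=rows[0])  # first row as header
```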
Never mind the fancy sharing tools
Not all government data is wedged into PDF files, and some of it is accessible via export tools that let you subset and layout your data exactly how you want it. The problem is that in an effort to please everyone, you end up with a tool shown on the left….
Tell people where to get the data
Get the things above done, and your government data site is exponentially better than it was before, but let’s keep going.
The navigation process to get to a dataset is incredibly convoluted, which makes it hard to find data and difficult to return to it….
Show visual previews
I’m all for visualization integrated with the data search tools. It always sucks when I spend time formatting data only to find that it wasn’t worth my time. Census Reporter is a fine example of how this might work.
That said, visual tools plus an upgrade to the previously mentioned things is a big undertaking, especially if you’re going to do it right. So I’m perfectly fine if you skip this step to focus your resources on data that’s easier to use and download. Leave the visualizing and analysis to us.
Decide what’s important, archive the rest
So much cruft. So many old documents. Broken links. Create an archive and highlight what people come to your site for.
Wrapping up
There’s plenty more stuff to update, especially once you start to work with the details, but this should be a good place to start. It’s a lot easier to point out what you can do to improve government data sharing than it is to actually do it, of course. There are so many people, policies, and oh yes, politics, that it can be hard to change.”

How Long Is Too Long? The 4th Amendment and the Mosaic Theory


Law and Liberty Blog: “Volume 8.2 of the NYU Journal of Law and Liberty has been sent to the printer and physical copies will be available soon, but the articles in the issue are already available online here. One article that has gotten a lot of attention so far is by Steven Bellovin, Renee Hutchins, Tony Jebara, and Sebastian Zimmeck titled “When Enough is Enough: Location Tracking, Mosaic Theory, and Machine Learning.” A direct link to the article is here.
The mosaic theory is a modern corollary accepted by some academics – and by the D.C. Circuit Court of Appeals in United States v. Maynard – as a twenty-first-century extension of the Fourth Amendment’s prohibition on unreasonable searches and seizures. Proponents of the mosaic theory argue that at some point enough individual data collections, compiled and analyzed together, become a Fourth Amendment search. Thirty years ago the Supreme Court upheld the use of a tracking device for three days without a warrant; since then, however, the proliferation of GPS tracking in cars and smartphones has made it significantly easier for the police to access a treasure trove of information about our location at any given time.
It is easy to see why this theory has attracted some support. Humans are creatures of habit: if our public locations are tracked for a few days, weeks, or a month, it is pretty easy for machines to learn our ways and assemble a fairly detailed report for the government about our lives. Once they know your habits, machines could predict when you will leave your house for work, what route you will take, and when and where you go grocery shopping, all before you even do it. A policeman could observe you moving about in public without a warrant, of course, but limited manpower will always reduce the probability of continuous mass surveillance. With current technology, a handful of trained experts could easily monitor hundreds of people at a time from behind a computer screen, and gather even more information than most searches requiring a warrant. The Supreme Court indicated a willingness to consider the mosaic theory in U.S. v. Jones, but has yet to embrace it…”

The article in Law & Liberty details the need to determine at which point machine learning creates an intrusion into our reasonable expectations of privacy, and even discusses an experiment that could be run to determine how long data collection can proceed before it becomes an intrusion. If there is a line at which individual data collection becomes a search, we need to discover where that line is. One of the article’s authors, Steven Bellovin, has argued that the line is probably at one week: at that point your weekday and weekend habits would be known. The nation’s leading legal expert on criminal law, Professor Orin Kerr, fired back on the Volokh Conspiracy that Bellovin’s one-week argument is not in line with previous iterations of the mosaic theory.
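Bellovin’s one-week intuition is easy to illustrate. The toy Python sketch below (synthetic data, not the paper’s machine-learning method) shows how even a naive group-by over a week of timestamped location pings surfaces a weekday commute and a weekend errand.

```python
# Toy illustration: given a week of (timestamp, place) pings, a simple
# group-by already reveals routine. A real tracker would log raw GPS
# coordinates; the named places here are a stand-in.
from collections import Counter, defaultdict
from datetime import datetime

pings = [
    ("2014-06-02 08:10", "home"), ("2014-06-02 09:05", "office"),
    ("2014-06-03 08:12", "home"), ("2014-06-03 09:02", "office"),
    ("2014-06-04 08:09", "home"), ("2014-06-04 09:07", "office"),
    ("2014-06-07 10:30", "grocery store"),  # Saturday
]

# Bucket by (weekday, hour) and keep the most common place per bucket.
buckets = defaultdict(Counter)
for ts, place in pings:
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M")
    buckets[(t.strftime("%A"), t.hour)][place] += 1

for (day, hour), counts in sorted(buckets.items(), key=lambda kv: kv[0]):
    place, _ = counts.most_common(1)[0]
    print(f"{day} {hour:02d}:00 -> usually at {place}")
```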

Humanitarians in the sky


Patrick Meier in the Guardian: “Unmanned aerial vehicles (UAVs) capture images faster, cheaper, and at a far higher resolution than satellite imagery. And as John DeRiggi speculates in “Drones for Development?” these attributes will likely lead to a host of applications in development work. In the humanitarian field that future is already upon us — so we need to take a rights-based approach to advance the discussion, improve coordination of UAV flights, and to promote regulation that will ensure safety while supporting innovation.
It was the unprecedentedly widespread use of civilian UAVs following Typhoon Haiyan in the Philippines that opened my eyes to UAV use in post-disaster settings. I was in Manila to support the United Nations’ digital humanitarian efforts and came across new UAV projects every other day.
One team was flying rotary-wing UAVs to search for survivors among vast fields of debris that were otherwise inaccessible. Another flew fixed-wing UAVs around Tacloban to assess damage and produce high-quality digital maps. Months later, UAVs are still being used to support recovery and preparedness efforts. One group is working with local mayors to identify which communities are being overlooked in the reconstruction.
Humanitarian UAVs are hardly new. As far back as 2007, the World Food Program teamed up with the University of Torino to build humanitarian UAVs. But today UAVs are much cheaper, safer, and easier to fly, which means more people own personal UAVs. What distinguishes these small UAVs from traditional remote-control airplanes or helicopters is that they are intelligent. Most can be programmed to fly and land autonomously at designated locations. Newer UAVs also have on-board flight-stabilization features that automatically adapt to changing winds, automated collision-avoidance systems, and standard fail-safe mechanisms.
While I was surprised by the surge in UAV projects in the Philippines, I was troubled that none of these teams were aware of each other and that most were apparently not sharing their imagery with local communities. What happens when even more UAV teams show up following future disasters? Will they be accompanied by droves of drone journalists and “disaster tourists” equipped with personal UAVs? Will we see thousands of aerial disaster pictures and videos uploaded to social media rather than in the hands of local communities? What are the privacy implications? And what about empowering local communities to deploy their own UAVs?
There were many questions but few answers. So I launched the humanitarian UAV network (UAViators) to bridge the worlds of humanitarian professionals and UAV experts to address these questions. Our first priority was to draft a code of conduct for the use of UAVs in humanitarian settings to hold ourselves accountable while educating new UAV pilots before serious mistakes are made…”

HHS releases new data and tools to increase transparency on hospital utilization and other trends


Press release: “With more than 2,000 entrepreneurs, investors, data scientists, researchers, policy experts, government employees and more in attendance, the Department of Health and Human Services (HHS) is releasing new data and launching new initiatives at the annual Health Datapalooza conference in Washington, D.C.
Today, the Centers for Medicare & Medicaid Services (CMS) is releasing its first annual update to the Medicare hospital charge data, or information comparing the average amount a hospital bills for services that may be provided in connection with a similar inpatient stay or outpatient visit. CMS is also releasing a suite of other data products and tools aimed at increasing transparency about Medicare payments. The data trove on CMS’s website now includes inpatient and outpatient hospital charge data for 2012, and new interactive dashboards for the CMS Chronic Conditions Data Warehouse and geographic variation data. Also today, the Food and Drug Administration (FDA) will launch a new open data initiative. And before the end of the conference, the Office of the National Coordinator for Health Information Technology (ONC) will announce the winners of two data challenges.
“The release of these data sets furthers the administration’s efforts to increase transparency and support data-driven decision making which is essential for health care transformation,” said HHS Secretary Kathleen Sebelius.
“These public data resources provide a better understanding of Medicare utilization, the burden of chronic conditions among beneficiaries and the implications for our health care system and how this varies by where beneficiaries are located,” said Bryan Sivak, HHS chief technology officer. “This information can be used to improve care coordination and health outcomes for Medicare beneficiaries nationwide, and we are looking forward to seeing what the community will do with these releases. Additionally, the openFDA initiative being launched today will for the first time enable a new generation of consumer facing and research applications to embed relevant and timely data in machine-readable, API-based formats.”
2012 Inpatient and Outpatient Hospital Charge Data
The data posted today on the CMS website provide the first annual update of the hospital inpatient and outpatient data released by the agency last spring. The data include information comparing the average charges for services that may be provided in connection with the 100 most common Medicare inpatient stays at over 3,000 hospitals in all 50 states and Washington, D.C. Hospitals determine what they will charge for items and services provided to patients, and these “charges” are the amount the hospital generally bills for those items or services.
With two years of data now available, researchers can begin to look at trends in hospital charges. For example, average charges for medical back problems increased nine percent from $23,000 to $25,000, but the total number of discharges decreased by nearly 7,000 from 2011 to 2012.
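For readers checking the math, the quoted “nine percent” is the rounded growth rate implied by the two average-charge figures:

$$\frac{\$25{,}000 - \$23{,}000}{\$23{,}000} \approx 0.087 \approx 9\%$$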
In April, ONC launched a challenge – the Code-a-Palooza challenge – calling on developers to create tools that will help patients use the Medicare data to make health care choices. Fifty-six innovators submitted proposals and 10 finalists are presenting their applications during Datapalooza. The winning products will be announced before the end of the conference.
Chronic Conditions Warehouse and Dashboard
CMS recently released new and updated information on chronic conditions among Medicare fee-for-service beneficiaries, including:

  • Geographic data summarized at the national, state, county, and hospital referral region levels for the years 2008-2012;
  • Data for examining disparities among specific Medicare populations, such as beneficiaries with disabilities, dual-eligible beneficiaries, and race/ethnic groups;
  • Data on prevalence, utilization of select Medicare services, and Medicare spending;
  • Interactive dashboards that provide customizable information about Medicare beneficiaries with chronic conditions at the state, county, and hospital referral region levels for 2012; and
  • Chartbooks and maps.

These public data resources support the HHS Initiative on Multiple Chronic Conditions by providing researchers and policymakers a better understanding of the burden of chronic conditions among beneficiaries and the implications for our health care system.
Geographic Variation Dashboard
The Geographic Variation Dashboards present Medicare fee-for-service per capita spending at the state and county levels in interactive formats. CMS calculated the spending figures in these dashboards using standardized dollars that remove the effects of the geographic adjustments that Medicare makes for many of its payment rates. The dashboards include total standardized per capita spending, as well as standardized per capita spending by type of service. Users can select the indicator and year they want to display. Users can also compare data for a given state or county to the national average. All of the information presented in the dashboards is also available for download from the Geographic Variation Public Use File.
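For researchers who skip the dashboards and go straight to the downloadable file, the county-versus-national comparison described above is a few lines of pandas. The sketch below is illustrative only; the file name and column names are hypothetical, so check the public use file’s data dictionary for the real ones.

```python
# A sketch of the county-vs-national comparison the dashboards support.
# "geo_variation_puf.csv" and the column names are hypothetical.
import pandas as pd

gv = pd.read_csv("geo_variation_puf.csv")

national_avg = gv.loc[gv["level"] == "National",
                      "std_per_capita_spending"].iloc[0]
counties = gv[gv["level"] == "County"].copy()

# Express each county's standardized per capita spending as a ratio
# to the national average, then list the highest-spending counties.
counties["ratio_to_national"] = (
    counties["std_per_capita_spending"] / national_avg
)
print(counties.sort_values("ratio_to_national", ascending=False)
              .head(10)[["county_name", "ratio_to_national"]])
```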
Research Cohort Estimate Tool
CMS also released a new tool that will help researchers and other stakeholders estimate the number of Medicare beneficiaries with certain demographic profiles or health conditions. This tool can assist a variety of stakeholders interested in specific figures on Medicare enrollment. Researchers can also use this tool to estimate the size of their proposed research cohort and the cost of requesting CMS data to support their study.
Digital Privacy Notice Challenge
ONC, with the HHS Office for Civil Rights, will be awarding the winner of the Digital Privacy Notice Challenge during the conference. The winning products will help consumers get notices of privacy practices from their health care providers or health plans directly in their personal health records or from their providers’ patient portals.
OpenFDA
The FDA’s new initiative, openFDA, is designed to facilitate easier access to large, important public health datasets collected by the agency. OpenFDA will make FDA’s publicly available data accessible in a structured, computer-readable format that will make it possible for technology specialists, such as mobile application creators, web developers, data visualization artists and researchers, to quickly search, query, or pull massive amounts of information on an as-needed basis. The initiative is the result of extensive research to identify the FDA’s publicly available datasets that are often in demand but traditionally difficult to use. Based on this research, openFDA is beginning with a pilot program involving millions of reports of drug adverse events and medication errors submitted to the FDA from 2004 to 2013. The pilot will later be expanded to include the FDA’s databases on product recalls and product labeling.
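In practice, “API-based” means a developer can pull these reports with a single HTTP request. Below is a minimal Python sketch against the drug adverse event endpoint that the pilot describes; the response field names follow the openFDA documentation at launch and should be treated as assumptions to verify at open.fda.gov.

```python
# A minimal sketch of an openFDA query: fetch one anonymized drug
# adverse event report. The search syntax also supports ranges, e.g.
# search=receivedate:[20040101+TO+20131231].
import json
import urllib.request

url = "https://api.fda.gov/drug/event.json?limit=1"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

report = data["results"][0]
# Field names assumed from the openFDA docs: when the report was
# received, and the reported reactions (MedDRA preferred terms).
print(report["receivedate"])
print([r["reactionmeddrapt"] for r in report["patient"]["reaction"]])
```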
For more information about CMS data products, please visit http://www.cms.gov/Research-Statistics-Data-and-Systems/Research-Statistics-Data-and-Systems.html.
For more information about today’s FDA announcement, visit http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/UCM399335 or http://open.fda.gov/

Estonian plan for 'data embassies' overseas to back up government databases


Graeme Burton in Computing: “Estonia is planning to open “data embassies” overseas to back up government databases and to operate government “in the cloud”.
The aim is partly to improve efficiency, but driven largely by fear of invasion and occupation, Jaan Priisalu, the director general of Estonian Information System Authority, told Sky News.
He said: “We are planning to actually operate our government in the cloud. It’s clear also how it helps to protect the country, the territory. Usually when you are the military planner and you are planning the occupation of the territory, then one of the rules is suppress the existing institutions.
“And if you are not able to do it, it means that this political price of occupying the country will simply rise for planners.”
Part of the rationale for the plan, he continued, was fear of attack from Russia in particular, which has been heightened following the occupation of Crimea, formerly in Ukraine.
“It’s quite clear that you can have problems with your neighbours. And our biggest neighbour is Russia, and nowadays it’s quite aggressive. This is clear.”
The plan is to back up critical government databases outside of Estonia so that affairs of state can be conducted in the cloud, even if the country is invaded. It would also have the benefit of keeping government information out of invaders’ hands – provided it can keep its government cloud secure.
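That caveat about keeping the government cloud secure is the crux of the design. One standard mitigation, sketched below purely as an illustration (this is not Estonia’s published architecture), is client-side encryption: the database dump is encrypted at home before it is shipped, so the hosting country only ever stores ciphertext.

```python
# Illustrative only: a "data embassy" backup where the host never sees
# plaintext. Uses the Python "cryptography" package; key management is
# the hard part and is waved away here.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in reality: held at home, never shipped with the data
fernet = Fernet(key)

with open("registry_dump.sql", "rb") as f:      # hypothetical database dump
    ciphertext = fernet.encrypt(f.read())

with open("registry_dump.sql.enc", "wb") as f:  # this is what leaves the country
    f.write(ciphertext)

# Restoring at home (or abroad, if the key is brought in separately):
plaintext = fernet.decrypt(ciphertext)          # round-trips to the original bytes
```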
According to Sky News, the UK is already in advanced talks about hosting the Estonian government databases and may make the UK the first of Estonia’s data embassies.
Having wrested independence from the Soviet Union in 1991, Estonia has experienced frequent tension with its much bigger neighbour. In April 2007, for example, after the relocation of the “Bronze Soldier of Tallinn” war memorial and the reburial of the soldiers’ remains from a square in the centre of the capital to a military cemetery, the country was subjected to a prolonged cyber-attack sourced to Russia.
Russian hacker “Sp0Raw” said that the most efficient of the online attacks on Estonia could not have been carried out without the approval of Russian authorities and added that the hackers seemed to act under “recommendations” from parties in government. However, claims by Estonia that the Russian government was directly involved in the attacks were “empty words, not supported by technical data”.
Mike Witt, deputy director of the US Computer Emergency Readiness Team (US-CERT), suggested that the distributed denial-of-service (DDoS) attacks, while crippling to the Estonian government at the time, were not significant in scale from a technical standpoint. However, the Estonian government was forced to shut down many of its online operations in response.
At the same time, the Estonian government has been accused of implementing anti-Russian laws and discriminating against its large ethnic Russian population.
Last week, the Estonian government unveiled a plan to allow anyone in the world to apply for “digital citizenship of the country, enabling them to use Estonian online services, open bank accounts, and start companies without having to physically reside in the country.”

How Big Data Could Undo Our Civil-Rights Laws


Virginia Eubanks in the American Prospect: “From “reverse redlining” to selling out a pregnant teenager to her parents, the advance of technology could render obsolete our landmark civil-rights and anti-discrimination laws.
Big Data will eradicate extreme world poverty by 2028, according to Bono, front man for the band U2. But it also allows unscrupulous marketers and financial institutions to prey on the poor. Big Data, collected from the neonatal monitors of premature babies, can detect subtle warning signs of infection, allowing doctors to intervene earlier and save lives. But it can also help a big-box store identify a pregnant teenager—and carelessly inform her parents by sending coupons for baby items to her home. News-mining algorithms might have been able to predict the Arab Spring. But Big Data was certainly used to spy on American Muslims when the New York City Police Department collected license plate numbers of cars parked near mosques, and aimed surveillance cameras at Arab-American community and religious institutions.
Until recently, debate about the role of metadata and algorithms in American politics focused narrowly on consumer privacy protections and Edward Snowden’s revelations about the National Security Agency (NSA). That Big Data might have disproportionate impacts on the poor, women, or racial and religious minorities was rarely raised. But, as Wade Henderson, president and CEO of the Leadership Conference on Civil and Human Rights, and Rashad Robinson, executive director of ColorOfChange, a civil rights organization that seeks to empower black Americans and their allies, point out in a commentary at TPM Cafe, while big data can change business and government for the better, “it is also supercharging the potential for discrimination.”
In his January 17 speech on signals intelligence, President Barack Obama acknowledged as much, seeking to strike a balance between defending “legitimate” intelligence gathering on American citizens and admitting that our country has a history of spying on dissidents and activists, including, famously, Dr. Martin Luther King, Jr. If this balance seems precarious, it’s because the links between historical surveillance of social movements and today’s uses of Big Data are not lost on the new generation of activists.
“Surveillance, big data and privacy have a historical legacy,” says Amalia Deloney, policy director at the Center for Media Justice, an Oakland-based organization dedicated to strengthening the communication effectiveness of grassroots racial justice groups. “In the early 1960s, in-depth, comprehensive, orchestrated, purposeful spying was used to disrupt political movements in communities of color—the Yellow Peril, the American Indian Movement, the Brown Berets, or the Black Panthers—to create fear and chaos, and to spread bias and stereotypes.”
In the era of Big Data, the danger of reviving that legacy is real, especially as metadata collection renders legal protection of civil rights and liberties less enforceable….
Big Data and surveillance are unevenly distributed. In response, a coalition of 14 progressive organizations, including the ACLU, ColorOfChange, the Leadership Conference on Civil and Human Rights, the NAACP, National Council of La Raza, and the NOW Foundation, recently released five “Civil Rights Principles for the Era of Big Data.” In their statement, they demand:

  • An end to high-tech profiling;
  • Fairness in automated decisions;
  • The preservation of constitutional principles;
  • Individual control of personal information; and
  • Protection of people from inaccurate data.

This historic coalition aims to start a national conversation about the role of big data in social and political inequality. “We’re beginning to ask the right questions,” says O’Neill. “It’s not just about what can we do with this data. How are communities of color impacted? How are women within those communities impacted? We need to fold these concerns into the national conversation.”

Rethinking Personal Data: A New Lens for Strengthening Trust


New report from the World Economic Forum: “As we look at the dynamic change shaping today’s data-driven world, one thing is becoming increasingly clear. We really do not know that much about it. Polarized along competing but fundamental principles, the global dialogue on personal data is inchoate and pulled in a variety of directions. It is complicated, conflated and often fueled by emotional reactions more than informed understandings.
The World Economic Forum’s global dialogue on personal data seeks to cut through this complexity. A multi-year initiative with global insights from the highest levels of leadership from industry, governments, civil society and academia, this work aims to articulate an ascendant vision of the value a balanced and human-centred personal data ecosystem can create.
Yet despite these aspirations, there is a crisis in trust. Concerns are voiced from a variety of viewpoints at a variety of scales. Industry, government and civil society are all uncertain on how to create a personal data ecosystem that is adaptive, reliable, trustworthy and fair.
The shared anxieties stem from the overwhelming challenge of transitioning into a hyperconnected world. The growth of data, the sophistication of ubiquitous computing and the borderless flow of data are all outstripping the ability to effectively govern on a global basis. We need the means to effectively uphold fundamental principles in ways fit for today’s world.
Yet despite the size and scope of the complexity, it cannot become a reason for inaction. The need for pragmatic and scalable approaches which strengthen transparency, accountability and the empowerment of individuals has become a global priority.
Tools are needed to answer fundamental questions: Who has the data? Where is the data? What is being done with it? All of these uncertainties need to be addressed for meaningful progress to occur.
Objectives need to be set. The benefits and harms of using personal data need to be more precisely defined. The ambiguity surrounding privacy needs to be demystified and placed into a real-world context.
Individuals need to be meaningfully empowered. Better engagement over how data is used by third parties is one opportunity for strengthening trust. Supporting the ability of individuals to use personal data for their own purposes is another area for innovation and growth. For now, though, the overall lack of engagement is undermining trust.
Collaboration is essential. The need for interdisciplinary collaboration between technologists, business leaders, social scientists, economists and policy-makers is vital. The complexities for delivering a sustainable and balanced personal data ecosystem require that these multifaceted perspectives are all taken into consideration.
With a new lens for using personal data, progress can occur.

Figure 1: A new lens for strengthening trust

Source: World Economic Forum

Continued Progress and Plans for Open Government Data


Steve VanRoekel and Todd Park at the White House: “One year ago today, President Obama signed an executive order that made open and machine-readable data the new default for government information. This historic step is helping to make government-held data more accessible to the public and to entrepreneurs while appropriately safeguarding sensitive information and rigorously protecting privacy.
Freely available data from the U.S. government is an important national resource, serving as fuel for entrepreneurship, innovation, scientific discovery, and economic growth. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government. This initiative is a key component of the President’s Management Agenda and our efforts to ensure the government is acting as an engine to expand economic growth and opportunity for all Americans. The Administration is committed to driving further progress in this area, including by designating Open Data as one of our key Cross-Agency Priority Goals.
Over the past few years, the Administration has launched a number of Open Data Initiatives aimed at scaling up open data efforts across the Health, Energy, Climate, Education, Finance, Public Safety, and Global Development sectors. The White House has also launched Project Open Data, designed to share best practices, examples, and software code to assist federal agencies with opening data. These efforts have helped unlock troves of valuable data—that taxpayers have already paid for—and are making these resources more open and accessible to innovators and the public.
Other countries are also opening up their data. In June 2013, President Obama and other G7 leaders endorsed the Open Data Charter, in which the United States committed to publish a roadmap for our nation’s approach to releasing and improving government data for the public.
Building upon the Administration’s Open Data progress, and in fulfillment of the Open Data Charter, today we are excited to release the U.S. Open Data Action Plan. The plan includes a number of exciting enhancements and new data releases planned in 2014 and 2015, including:

  • Small Business Data: The Small Business Administration’s (SBA) database of small business suppliers will be enhanced so that software developers can create tools to help manufacturers more easily find qualified U.S. suppliers, ultimately reducing the transaction costs to source products and manufacture domestically.
  • Smithsonian American Art Museum Collection: The Smithsonian American Art Museum’s entire digitized collection will be opened to software developers to make educational apps and tools. Today, even museum curators do not have easily accessible information about their art collections. This information will soon be available to everyone.
  • FDA Adverse Drug Event Data: Each year, healthcare professionals and consumers submit millions of individual reports on drug safety to the Food and Drug Administration (FDA). These anonymous reports are a critical tool to support drug safety surveillance. Today, this data is only available through limited quarterly reports. But the Administration will soon be making these reports available in their entirety so that software developers can build tools to help pull potentially dangerous drugs off shelves faster than ever before.

We look forward to implementing the U.S. Open Data Action Plan, and to continuing to work with our partner countries in the G7 to take the open data movement global”.