How Big Data Could Undo Our Civil-Rights Laws


Virginia Eubanks in the American Prospect: “From ‘reverse redlining’ to selling out a pregnant teenager to her parents, the advance of technology could render obsolete our landmark civil-rights and anti-discrimination laws.
Big Data will eradicate extreme world poverty by 2028, according to Bono, front man for the band U2. But it also allows unscrupulous marketers and financial institutions to prey on the poor. Big Data, collected from the neonatal monitors of premature babies, can detect subtle warning signs of infection, allowing doctors to intervene earlier and save lives. But it can also help a big-box store identify a pregnant teenager—and carelessly inform her parents by sending coupons for baby items to her home. News-mining algorithms might have been able to predict the Arab Spring. But Big Data was certainly used to spy on American Muslims when the New York City Police Department collected license plate numbers of cars parked near mosques, and aimed surveillance cameras at Arab-American community and religious institutions.
Until recently, debate about the role of metadata and algorithms in American politics focused narrowly on consumer privacy protections and Edward Snowden’s revelations about the National Security Agency (NSA). That Big Data might have disproportionate impacts on the poor, women, or racial and religious minorities was rarely raised. But, as Wade Henderson, president and CEO of the Leadership Conference on Civil and Human Rights, and Rashad Robinson, executive director of ColorOfChange, a civil rights organization that seeks to empower black Americans and their allies, point out in a commentary at TPM Cafe, while big data can change business and government for the better, “it is also supercharging the potential for discrimination.”
In his January 17 speech on signals intelligence, President Barack Obama acknowledged as much, seeking to strike a balance between defending “legitimate” intelligence gathering on American citizens and admitting that our country has a history of spying on dissidents and activists, including, famously, Dr. Martin Luther King, Jr. If this balance seems precarious, it’s because the links between historical surveillance of social movements and today’s uses of Big Data are not lost on the new generation of activists.
“Surveillance, big data and privacy have a historical legacy,” says Amalia Deloney, policy director at the Center for Media Justice, an Oakland-based organization dedicated to strengthening the communication effectiveness of grassroots racial justice groups. “In the early 1960s, in-depth, comprehensive, orchestrated, purposeful spying was used to disrupt political movements in communities of color—the Yellow Peril, the American Indian Movement, the Brown Berets, or the Black Panthers—to create fear and chaos, and to spread bias and stereotypes.”
In the era of Big Data, the danger of reviving that legacy is real, especially as metadata collection renders legal protection of civil rights and liberties less enforceable….
Big Data and surveillance are unevenly distributed. In response, a coalition of 14 progressive organizations, including the ACLU, ColorOfChange, the Leadership Conference on Civil and Human Rights, the NAACP, National Council of La Raza, and the NOW Foundation, recently released five “Civil Rights Principles for the Era of Big Data.” In their statement, they demand:

  • An end to high-tech profiling;
  • Fairness in automated decisions;
  • The preservation of constitutional principles;
  • Individual control of personal information; and
  • Protection of people from inaccurate data.

This historic coalition aims to start a national conversation about the role of big data in social and political inequality. “We’re beginning to ask the right questions,” says O’Neill. “It’s not just about what can we do with this data. How are communities of color impacted? How are women within those communities impacted? We need to fold these concerns into the national conversation.”

Rethinking Personal Data: A New Lens for Strengthening Trust


New report from the World Economic Forum: “As we look at the dynamic change shaping today’s data-driven world, one thing is becoming increasingly clear. We really do not know that much about it. Polarized along competing but fundamental principles, the global dialogue on personal data is inchoate and pulled in a variety of directions. It is complicated, conflated and often fueled by emotional reactions more than informed understandings.
The World Economic Forum’s global dialogue on personal data seeks to cut through this complexity. A multi-year initiative with global insights from the highest levels of leadership from industry, governments, civil society and academia, this work aims to articulate an ascendant vision of the value a balanced and human-centred personal data ecosystem can create.
Yet despite these aspirations, there is a crisis in trust. Concerns are voiced from a variety of viewpoints at a variety of scales. Industry, government and civil society are all uncertain on how to create a personal data ecosystem that is adaptive, reliable, trustworthy and fair.
The shared anxieties stem from the overwhelming challenge of transitioning into a hyperconnected world. The growth of data, the sophistication of ubiquitous computing and the borderless flow of data are all outstripping the ability to effectively govern on a global basis. We need the means to effectively uphold fundamental principles in ways fit for today’s world.
Yet despite the size and scope of the complexity, it cannot become a reason for inaction. The need for pragmatic and scalable approaches which strengthen transparency, accountability and the empowerment of individuals has become a global priority.
Tools are needed to answer fundamental questions: Who has the data? Where is the data? What is being done with it? All of these uncertainties need to be addressed for meaningful progress to occur.
Objectives need to be set. The benefits and harms of using personal data need to be more precisely defined. The ambiguity surrounding privacy needs to be demystified and placed into a real-world context.
Individuals need to be meaningfully empowered. Better engagement over how data is used by third parties is one opportunity for strengthening trust. Supporting the ability for individuals to use personal data for their own purposes is another area for innovation and growth. But combined, the overall lack of engagement is undermining trust.
Collaboration is essential. The need for interdisciplinary collaboration between technologists, business leaders, social scientists, economists and policy-makers is vital. The complexities for delivering a sustainable and balanced personal data ecosystem require that these multifaceted perspectives are all taken into consideration.
With a new lens for using personal data, progress can occur.

Figure 1: A new lens for strengthening trust

Source: World Economic Forum

Continued Progress and Plans for Open Government Data


Steve VanRoekel and Todd Park at the White House: “One year ago today, President Obama signed an executive order that made open and machine-readable data the new default for government information. This historic step is helping to make government-held data more accessible to the public and to entrepreneurs while appropriately safeguarding sensitive information and rigorously protecting privacy.
Freely available data from the U.S. government is an important national resource, serving as fuel for entrepreneurship, innovation, scientific discovery, and economic growth. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government. This initiative is a key component of the President’s Management Agenda and our efforts to ensure the government is acting as an engine to expand economic growth and opportunity for all Americans. The Administration is committed to driving further progress in this area, including by designating Open Data as one of our key Cross-Agency Priority Goals.
Over the past few years, the Administration has launched a number of Open Data Initiatives aimed at scaling up open data efforts across the Health, Energy, Climate, Education, Finance, Public Safety, and Global Development sectors. The White House has also launched Project Open Data, designed to share best practices, examples, and software code to assist federal agencies with opening data. These efforts have helped unlock troves of valuable data—that taxpayers have already paid for—and are making these resources more open and accessible to innovators and the public.
Other countries are also opening up their data. In June 2013, President Obama and other G7 leaders endorsed the Open Data Charter, in which the United States committed to publish a roadmap for our nation’s approach to releasing and improving government data for the public.
Building upon the Administration’s Open Data progress, and in fulfillment of the Open Data Charter, today we are excited to release the U.S. Open Data Action Plan. The plan includes a number of exciting enhancements and new data releases planned in 2014 and 2015, including:

  • Small Business Data: The Small Business Administration’s (SBA) database of small business suppliers will be enhanced so that software developers can create tools to help manufacturers more easily find qualified U.S. suppliers, ultimately reducing the transaction costs to source products and manufacture domestically.
  • Smithsonian American Art Museum Collection: The Smithsonian American Art Museum’s entire digitized collection will be opened to software developers to make educational apps and tools. Today, even museum curators do not have easily accessible information about their art collections. This information will soon be available to everyone.
  • FDA Adverse Drug Event Data: Each year, healthcare professionals and consumers submit millions of individual reports on drug safety to the Food and Drug Administration (FDA). These anonymous reports are a critical tool to support drug safety surveillance. Today, this data is only available through limited quarterly reports. But the Administration will soon be making these reports available in their entirety so that software developers can build tools to help pull potentially dangerous drugs off shelves faster than ever before.

We look forward to implementing the U.S. Open Data Action Plan, and to continuing to work with our partner countries in the G7 to take the open data movement global.”

To the Cloud: Big Data in a Turbulent World


Book by Vincent Mosco: “In the wake of revelations about National Security Agency activities—many of which occur “in the cloud”—this book offers both enlightenment and a critical view. Cloud computing and big data are arguably the most significant forces in information technology today. In clear prose, To the Cloud explores where the cloud originated, what it means, and how important it is for business, government, and citizens. It describes the intense competition among cloud companies like Amazon and Google, the spread of the cloud to government agencies like the controversial NSA, and the astounding growth of entire cloud cities in China. From advertising to trade shows, the cloud and big data are furiously marketed to the world, even as dark clouds loom over environmental, privacy, and employment issues that arise from the cloud. Is the cloud the long-promised information utility that will solve many of the world’s economic and social problems? Or is it just marketing hype? To the Cloud provides the first thorough analysis of the potential and the problems of a technology that may very well disrupt the world.”

Findings of the Big Data and Privacy Working Group Review


John Podesta at the White House Blog: “Over the past several days, severe storms have battered Arkansas, Oklahoma, Mississippi and other states. Dozens of people have been killed and entire neighborhoods turned to rubble and debris as tornadoes have touched down across the region. Natural disasters like these present a host of challenges for first responders. How many people are affected, injured, or dead? Where can they find food, shelter, and medical attention? What critical infrastructure might have been damaged?
Drawing on open government data sources, including Census demographics and NOAA weather data, along with their own demographic databases, Esri, a geospatial technology company, has created a real-time map showing where the twisters have been spotted and how the storm systems are moving. They have also used these data to show how many people live in the affected area, and summarize potential impacts from the storms. It’s a powerful tool for emergency services and communities. And it’s driven by big data technology.
In January, President Obama asked me to lead a wide-ranging review of “big data” and privacy—to explore how these technologies are changing our economy, our government, and our society, and to consider their implications for our personal privacy. Together with Secretary of Commerce Penny Pritzker, Secretary of Energy Ernest Moniz, the President’s Science Advisor John Holdren, the President’s Economic Advisor Jeff Zients, and other senior officials, our review sought to understand what is genuinely new and different about big data and to consider how best to encourage the potential of these technologies while minimizing risks to privacy and core American values.
Over the course of 90 days, we met with academic researchers and privacy advocates, with regulators and the technology industry, with advertisers and civil rights groups. The President’s Council of Advisors on Science and Technology conducted a parallel study of the technological trends underpinning big data. The White House Office of Science and Technology Policy jointly organized three university conferences at MIT, NYU, and U.C. Berkeley. We issued a formal Request for Information seeking public comment, and hosted a survey to generate even more public input.
Today, we presented our findings to the President. We knew better than to try to answer every question about big data in three months. But we are able to draw important conclusions and make concrete recommendations for Administration attention and policy development in a few key areas.
There are a few technological trends that bear drawing out. The declining cost of collection, storage, and processing of data, combined with new sources of data like sensors, cameras, and geospatial technologies, mean that we live in a world of near-ubiquitous data collection. All this data is being crunched at a speed that is increasingly approaching real-time, meaning that big data algorithms could soon have immediate effects on decisions being made about our lives.
The big data revolution presents incredible opportunities in virtually every sector of the economy and every corner of society.
Big data is saving lives. Infections are dangerous—even deadly—for many babies born prematurely. By collecting and analyzing millions of data points from a NICU, one study was able to identify factors, like slight increases in body temperature and heart rate, that serve as early warning signs an infection may be taking root—subtle changes that even the most experienced doctors wouldn’t have noticed on their own.
Big data is making the economy work better. Jet engines and delivery trucks now come outfitted with sensors that continuously monitor hundreds of data points and send automatic alerts when maintenance is needed. Utility companies are starting to use big data to predict periods of peak electric demand, adjusting the grid to be more efficient and potentially averting brown-outs.
Big data is making government work better and saving taxpayer dollars. The Centers for Medicare and Medicaid Services have begun using predictive analytics—a big data technique—to flag likely instances of reimbursement fraud before claims are paid. The Fraud Prevention System helps identify the highest-risk health care providers for waste, fraud, and abuse in real time and has already stopped, prevented, or identified $115 million in fraudulent payments.
But big data raises serious questions, too, about how we protect our privacy and other values in a world where data collection is increasingly ubiquitous and where analysis is conducted at speeds approaching real time. In particular, our review raised the question of whether the “notice and consent” framework, in which a user grants permission for a service to collect and use information about them, still allows us to meaningfully control our privacy as data about us is increasingly used and reused in ways that could not have been anticipated when it was collected.
Big data raises other concerns, as well. One significant finding of our review was the potential for big data analytics to lead to discriminatory outcomes and to circumvent longstanding civil rights protections in housing, employment, credit, and the consumer marketplace.
No matter how quickly technology advances, it remains within our power to ensure that we both encourage innovation and protect our values through law, policy, and the practices we encourage in the public and private sector. To that end, we make six actionable policy recommendations in our report to the President:
Advance the Consumer Privacy Bill of Rights. Consumers deserve clear, understandable, reasonable standards for how their personal information is used in the big data era. We recommend the Department of Commerce take appropriate consultative steps to seek stakeholder and public comment on what changes, if any, are needed to the Consumer Privacy Bill of Rights, first proposed by the President in 2012, and to prepare draft legislative text for consideration by stakeholders and submission by the President to Congress.
Pass National Data Breach Legislation. Big data technologies make it possible to store significantly more data, and further derive intimate insights into a person’s character, habits, preferences, and activities. That makes the potential impacts of data breaches at businesses or other organizations even more serious. A patchwork of state laws currently governs requirements for reporting data breaches. Congress should pass legislation that provides for a single national data breach standard, along the lines of the Administration’s 2011 Cybersecurity legislative proposal.
Extend Privacy Protections to non-U.S. Persons. Privacy is a worldwide value that should be reflected in how the federal government handles personally identifiable information about non-U.S. citizens. The Office of Management and Budget should work with departments and agencies to apply the Privacy Act of 1974 to non-U.S. persons where practicable, or to establish alternative privacy policies that apply appropriate and meaningful protections to personal information regardless of a person’s nationality.
Ensure Data Collected on Students in School is used for Educational Purposes. Big data and other technological innovations, including new online course platforms that provide students real time feedback, promise to transform education by personalizing learning. At the same time, the federal government must ensure educational data linked to individual students gathered in school is used for educational purposes, and protect students against their data being shared or used inappropriately.
Expand Technical Expertise to Stop Discrimination. The detailed personal profiles held about many consumers, combined with automated, algorithm-driven decision-making, could lead—intentionally or inadvertently—to discriminatory outcomes, or what some are already calling “digital redlining.” The federal government’s lead civil rights and consumer protection agencies should expand their technical expertise to be able to identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations of law.
Amend the Electronic Communications Privacy Act. The laws that govern protections afforded to our communications were written before email, the internet, and cloud computing came into wide use. Congress should amend ECPA to ensure the standard of protection for online, digital content is consistent with that afforded in the physical world—including by removing archaic distinctions between email left unread or over a certain age.
We also identify several broader areas ripe for further study, debate, and public engagement that, collectively, we hope will spark a national conversation about how to harness big data for the public good. We conclude that we must find a way to preserve our privacy values in both the domestic and international marketplace. We urgently need to build capacity in the federal government to identify and prevent new modes of discrimination that could be enabled by big data. We must ensure that law enforcement agencies using big data technologies do so responsibly, and that our fundamental privacy rights remain protected. Finally, we recognize that data is a valuable public resource, and call for continuing the Administration’s efforts to open more government data sources and make investments in research and technology.
While big data presents new challenges, it also presents immense opportunities to improve lives, and the United States is perhaps better suited to lead this conversation than any other nation on earth. Our innovative spirit, technological know-how, and deep commitment to values of privacy, fairness, non-discrimination, and self-determination will help us harness the benefits of the big data revolution and encourage the free flow of information while working with our international partners to protect personal privacy. This review is but one piece of that effort, and we hope it spurs a conversation about big data across the country and around the world.
Read the Big Data Report.
See the fact sheet from today’s announcement.

This is what happens when you give social networking to doctors


in PandoDaily: “Dr. Gregory Kurio will never forget the time he was called to the ER because an epileptic girl was brought in suffering a cardiac arrest of sorts (HIPAA mandates that he not give out the specific details of the situation). In the briefing, he learned the name of her cardiac physician, whom he happened to know through the industry. He subsequently called the other doctor and asked him to send over any available information on the patient — latest meds, EKGs, recent checkups, etc.

The scene in the ER was, as expected, one of chaos, with trainees and respiratory nurses running around grabbing machinery and meds. Crucial seconds were ticking past, and Dr. Kurio quickly realized the fax machine was not the best approach for receiving the records he needed. ER fax machines are often on the opposite side of the emergency room, take a while to print lengthy records, frequently run out of paper, and aren’t always reliable – not exactly the sort of technology you want when a patient’s life hangs in the balance.

Email wasn’t an option either, because HIPAA mandates that sensitive patient files are only sent through secure channels. With precious little time to waste, Dr. Kurio decided to take a chance on a new technology service he had just signed up for — Doximity.

Doximity is a LinkedIn for Doctors of sorts. It has, as one feature, a secure e-fax system that turns faxes into digital messages and sends them to a user’s mobile device. Dr. Kurio gave the other physician his e-fax number, and a little bit of techno-magic happened.

….

With a third of the nation’s doctors on the platform, today Doximity announced a $54 million Series C from DFJ,  T. Rowe Price Associates, Morgan Stanley, and existing investors. The funding news isn’t particularly important, in and of itself, aside from the fact that the company is attracting the attention of private market investors very early in its growth trajectory. But it’s a good opportunity to take a look at Doximity’s business model, how it mirrors the upwards growth of other vertical professional social networks (say that five times fast), and the way it’s transforming our healthcare providers’ jobs.

Doximity works, in many ways, just like LinkedIn. Doctors have profiles with pictures and their resume, and recruiters pay the company to message medical professionals. “If you think it’s hard to find a Ruby developer in San Francisco, try to find an emergency room physician in Indiana,” Doximity CEO Jeff Tangney says. One recruiter’s pain is a smart entrepreneur’s pleasure — a simple, straightforward monetization strategy.

But unlike LinkedIn, Doximity can dive much deeper on meeting doctors’ needs through specialized features like the e-fax system. It’s part of the reason Konstantin Guericke, one of LinkedIn’s “forgotten” co-founders, was attracted to the company and decided to join the board as an advisor. “In some ways, it’s a lot like LinkedIn,” Guericke says, when asked why he decided to help out. “But for me it’s the pleasure of focusing on a more narrow audience and making more of an impact on their life.”

In another such high-impact, specialized feature, doctors can access Doximity’s Google Alerts-like system for academic articles. They can sign up to receive notifications when stories are published about their obscure specialties. That means time-strapped physicians gain a more efficient way to stay up to date on all the latest research and information in their field. You can imagine that might impact the quality of the care they provide.

Lastly, Doximity offers a secure messaging system, allowing doctors to email one another regarding a shared patient. Such communication is a thorny issue for doctors given HIPAA-related privacy requirements. There are limited ways to legally update, say, a primary care physician when a specialist learns one of their patients has colon cancer. It turns into a big game of phone tag to relay what should be relatively straightforward information. Furthermore, leaving voicemails and sending faxes can result in details getting lost in what is an unsearchable system.

The platform is free for doctors, and they have quickly joined in droves. Doximity co-founder and CEO Jeff Tangney estimates that last year the platform had added 15 to 16 percent of US doctors. But this year, the company claims it’s “on track to have half of US physicians as members by this summer.” That’s a fairly impressive growth rate and market penetration.

With great market penetration comes great power. And dollars. Although the company is only monetizing through recruitment at the moment, the real money to be made with this service is through targeted advertising. Think about how much big pharma and medtech companies would be willing to cough up to communicate at scale with the doctors who make purchasing decisions. Plus, this is an easy way for them to target industry thought leaders or professionals with certain specialties.

Doximity’s founders’ and investors’ eyes might be seeing dollar signs, but they haven’t rolled anything out yet on the advertising front. They’re wary and want to do so in a way that adds value to all parties while avoiding pissing off medical professionals. When they finally pull the trigger, however, it has the potential to be a Gold Rush.

Doximity isn’t the only company to have discovered there’s big money to be made in vertical professional social networks. As Pando has written, there’s a big trend in this regard. Spiceworks, the social network for IT professionals which claims to have a third of the world’s IT professionals on the site, just raised $57 million in a round led by none other than Goldman Sachs. Why does the firm have such faith in a free social network for IT pros — seemingly the most mundane and unprofitable of endeavors? Well, just as pharma companies pay to reach doctors, IT companies are willing to shell out big to market their wares directly to such IT pros.

Although the monetization strategies differ from business to business, ResearchGate is building a similar community with a social network of scientists around the world, Edmodo is doing it with educators, GitHub with developers, GrabCAD for mechanical engineers. I’ve argued that such vertical professional social networks are a threat to LinkedIn, stealing business out from under it in large industry swaths. LinkedIn cofounder Konstantin Guericke disagrees.

“I don’t think it’s stealing revenue from them. Would it make sense for LinkedIn to add a profile subset about what insurance someone takes? That would just be clutter,” Guericke says. “It’s more going after an opportunity LinkedIn isn’t well positioned to capitalize on. They could do everything Doximity does, but they’d have to give up something else.”

All businesses come with their own challenges, and Doximity will certainly face its share of them as it scales. It has overcome the initial hurdle of achieving the network effects that come with penetrating a large segment of the market. Next will come monetizing sensitively and continuing to protect users’ — and patients’ — privacy.

There are plenty of data minefields in a sector as closely regulated as healthcare, as fellow medical startup Practice Fusion recently found out. Doximity has to make sure its system for onboarding and verifying new doctors is airtight. The company has already encountered some instances of individuals trying to pose as medical professionals to get access to another’s records — specifically, a former lover trying to chase down their ex-spouse’s STI tests. One blowup where the company approves someone it shouldn’t, or hackers break into the system, and doctors could lose trust in the safety of the technology….”

Twitter Can Now Predict Crime, and This Raises Serious Questions


Motherboard: “Police departments in New York City may soon be using geo-tagged tweets to predict crime. It sounds like a far-fetched sci-fi scenario a la Minority Report, but when I contacted Dr. Matthew Gerber, the University of Virginia researcher behind the technology, he explained that the system is far more mathematical than metaphysical.
The system Gerber has devised is an amalgam of both old and new techniques. Currently, many police departments target hot spots for criminal activity based on actual occurrences of crime. This approach, called kernel density estimation (KDE), involves pairing a historical crime record with a geographic location and using a probability function to calculate the likelihood of future crimes occurring in that area. While KDE is a serviceable approach to anticipating crime, it pales in comparison to the dynamism of Twitter’s real-time data stream, according to Dr. Gerber’s research paper, “Predicting Crime Using Twitter and Kernel Density Estimation.”
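The KDE step described above can be sketched in a few lines. This is a toy illustration, not the paper’s implementation: the crime coordinates, grid, and bandwidth are invented, and the Gaussian kernel is written out by hand to keep the example self-contained.

```python
import math

def kde_density(point, events, bandwidth=1.0):
    """Gaussian kernel density estimate at `point` from past crime `events`."""
    x, y = point
    total = 0.0
    for ex, ey in events:
        d2 = (x - ex) ** 2 + (y - ey) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    # Normalize by event count and the 2-D Gaussian kernel constant
    return total / (len(events) * 2 * math.pi * bandwidth ** 2)

# Hypothetical historical crime coordinates (projected map units)
past_crimes = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (5.0, 5.2)]

# Rank candidate grid cells by estimated density: the top-ranked cells
# become the predicted hot spots (here, near the cluster around (1, 1)).
grid = [(x * 0.5, y * 0.5) for x in range(12) for y in range(12)]
hot_spots = sorted(grid, key=lambda p: kde_density(p, past_crimes), reverse=True)[:3]
```

In practice a department would use real incident records and a tuned bandwidth; the structure of the computation is the same.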
Dr. Gerber’s approach is similar to KDE, but deals in the ethereal realm of data and language, not paperwork. The system involves mapping the Twitter environment, much like how police currently map the physical environment with KDE. The big difference is that Gerber is looking at what people are talking about in real time, as well as what they do after the fact, and seeing how well they match up. The algorithms look for certain language that is likely to indicate the imminent occurrence of a crime in the area, Gerber says. “We might observe people talking about going out, getting drunk, going to bars, sporting events, and so on—we know that these sort of events correlate with crime, and that’s what the models are picking up on.”
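As a toy illustration of that tweet-mapping idea (the paper itself uses topic models over tweet text, not a hand-built word list), one could count crime-correlated vocabulary in geotagged tweets near each map cell. The tweets, coordinates, and term list below are all invented:

```python
from collections import Counter

# Hypothetical vocabulary correlated with "going out" activity
NIGHTLIFE_TERMS = {"bar", "drunk", "club", "game", "party"}

# Invented geotagged tweets: ((x, y) map coordinates, text)
tweets = [
    ((1.0, 1.0), "heading to the bar then the club tonight"),
    ((1.1, 0.9), "so drunk after the game lol"),
    ((5.0, 5.0), "quiet night reading at home"),
]

def cell_features(tweets, cell, radius=0.5):
    """Count nightlife-term mentions in tweets within `radius` of `cell`."""
    counts = Counter()
    cx, cy = cell
    for (tx, ty), text in tweets:
        if (tx - cx) ** 2 + (ty - cy) ** 2 <= radius ** 2:
            for word in text.lower().split():
                if word in NIGHTLIFE_TERMS:
                    counts[word] += 1
    return counts

features = cell_features(tweets, (1.0, 1.0))
```

A model can then combine per-cell counts like these with the historical KDE surface to score each cell’s risk.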
Once this data is collected, the GPS tags in tweets allow Gerber and his team to pin them to a virtual map and outline hot spots for potential crime. Of course, not everyone who tweets about hitting the club later is going to commit a crime. Gerber tests the accuracy of his approach by comparing Twitter-based KDE predictions with traditional KDE predictions based on police data alone. The big question is, does it work? For Gerber, the answer is a firm “sometimes.” “It helps for some, and it hurts for others,” he says.
According to the study’s results, Twitter-based KDE analysis yielded improvements in predictive accuracy over traditional KDE for stalking, criminal damage, and gambling. Arson, kidnapping, and intimidation, on the other hand, showed a decrease in accuracy relative to traditional KDE analysis. It’s not clear why these crimes are harder to predict using Twitter, but the study suggests the issue may lie in the language typical of Twitter: shorthand and informal phrasing that can be difficult for algorithms to parse.
This kind of approach to high-tech crime prevention revives the familiar debate over privacy and the use of users’ data for purposes they didn’t explicitly agree to. The case becomes especially sensitive when data will be used by police to track down criminals. On this point, though he acknowledges post-Snowden societal skepticism about data harvesting for state purposes, Gerber is unconcerned. “People sign up to have their tweets GPS tagged. It’s an opt-in thing, and if you don’t do it, your tweets won’t be collected in this way,” he says. “Twitter is a public service, and I think people are pretty aware of that.”…

How Can the Department of Education Increase Innovation, Transparency and Access to Data?


David Soo at the Department of Education: “Despite the growing amount of information about higher education, many students and families still need access to clear, helpful resources to make informed decisions about going to – and paying for – college.  President Obama has called for innovation in college access, including by making sure all students have easy-to-understand information.
Now, the U.S. Department of Education needs your input on specific ways that we can increase innovation, transparency, and access to data.  In particular, we are interested in how APIs (application programming interfaces) could make our data and processes more open and efficient.
APIs are sets of software instructions and standards that allow machine-to-machine communication. APIs could allow developers from inside and outside government to build apps, widgets, websites, and other tools based on government information and services to let consumers access government-owned data and participate in government-run processes from more places on the Web, even beyond .gov websites. Well-designed government APIs help make data and processes freely available for use within agencies, between agencies, in the private sector, or by citizens, including students and families.
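As a concrete, entirely hypothetical illustration of the machine-to-machine exchange described above, a college-data API might return JSON that a third-party tool then filters for a family. The payload, field names, and figures below are invented; no such endpoint is described in the notice:

```python
import json

# Hypothetical JSON response from an imagined college-data API.
sample_response = """
{
  "colleges": [
    {"name": "Example State University", "net_price": 12500, "grad_rate": 0.61},
    {"name": "Sample Private College", "net_price": 28900, "grad_rate": 0.74}
  ]
}
"""

data = json.loads(sample_response)

# A third-party widget might surface only schools under a price ceiling.
affordable = [c["name"] for c in data["colleges"] if c["net_price"] <= 15000]
```

The point is the division of labor: the agency publishes structured data once, and any number of outside tools can build consumer-facing views on top of it.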
So, today, we are asking you – student advocates, designers, developers, and others – to share your ideas on how APIs could spark innovation and enable processes that can serve students better. We need you to weigh in on a Request for Information (RFI) – a formal way the government asks for feedback – on how the Department could use APIs to increase access to higher education data or financial aid programs. There may be ways that Department forms – like the Free Application for Federal Student Aid (FAFSA) – or information-gathering processes could be made easier for students by incorporating the use of APIs. We invite the best and most creative thinking on specific ways that Department of Education APIs could be used to improve outcomes for students.
To weigh in, you can email APIRFI@ed.gov by June 2, or send your input via other addresses as detailed in the online notice.
The Department wants to make sure to do this right. It must ensure the security and privacy of the data it collects or maintains, especially when the information of students and families is involved.  Openness only works if privacy and security issues are fully considered and addressed.  We encourage the field to provide comments that identify concerns and offer suggestions on ways to ensure privacy, safeguard student information, and maintain access to federal resources at no cost to the student.
Through this request, we hope to gather ideas on how APIs could be used to fuel greater innovation and, ultimately, affordability in higher education.  For further information, see the Federal Register notice.”

Historic release of data delivers unprecedented transparency on the medical services physicians provide and how much they are paid


Jonathan Blum, Principal Deputy Administrator, Centers for Medicare & Medicaid Services: “Today the Centers for Medicare & Medicaid Services (CMS) took a major step forward in making Medicare data more transparent and accessible, while maintaining the privacy of beneficiaries, by announcing the release of new data on medical services and procedures furnished to Medicare fee-for-service beneficiaries by physicians and other healthcare professionals (http://www.cms.gov/newsroom/newsroom-center.html). For too long, the only information on physicians readily available to consumers was physician name, address and phone number. This data will, for the first time, provide a better picture of how physicians practice in the Medicare program.
This new data set includes over nine million rows of data on more than 880,000 physicians and other healthcare professionals in all 50 states, DC and Puerto Rico providing care to Medicare beneficiaries in 2012. The data set presents key information on the provision of services by physicians and how much they are paid for those services, and is organized by provider (National Provider Identifier or NPI), type of service (Healthcare Common Procedure Coding System, or HCPCS) code, and whether the service was performed in a facility or office setting. This public data set includes the number of services, average submitted charges, average allowed amount, average Medicare payment, and a count of unique beneficiaries treated. CMS takes beneficiary privacy very seriously and we will protect patient-identifiable information by redacting any data in cases where it includes fewer than 11 beneficiaries.
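The fewer-than-11-beneficiaries redaction rule is easy to picture in code. The rows and column names below are made up, though they mirror the fields (NPI, HCPCS code, service counts, average payments) the data set is described as containing:

```python
REDACTION_THRESHOLD = 11  # CMS redacts rows covering fewer than 11 beneficiaries

# Made-up example rows mirroring the described fields.
rows = [
    {"npi": "1234567890", "hcpcs": "99213", "services": 240,
     "beneficiaries": 180, "avg_payment": 52.31},
    {"npi": "9876543210", "hcpcs": "99215", "services": 9,
     "beneficiaries": 7, "avg_payment": 101.12},
]

def redact(row):
    """Suppress counts and payment figures when a row covers
    fewer than REDACTION_THRESHOLD unique beneficiaries."""
    if row["beneficiaries"] < REDACTION_THRESHOLD:
        return {**row, "services": None,
                "beneficiaries": None, "avg_payment": None}
    return row

published = [redact(r) for r in rows]
```

Suppressing small cells like this is a standard disclosure-limitation technique: it prevents anyone from inferring details about an individual patient from a provider's low-volume billing lines.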
Previously, CMS could not release this information due to a permanent injunction issued by a court in 1979. However, in May 2013, the court vacated this injunction, setting in motion a series of events that has enabled CMS to make this information available for the first time.
Data to Fuel Research and Innovation
In addition to the public data release, CMS is making slight modifications to the process to request CMS data for research purposes. This will allow researchers to conduct important research at the physician level. As with the public release of information described above, CMS will continue to prohibit the release of patient-identifiable information. For more information about CMS’s disclosures to researchers, please contact the Research Data Assistance Center (ResDAC) at http://www.resdac.org/.
Unprecedented Data Access
This data release follows other CMS efforts to make more data available to the public. Since 2010, the agency has released an unprecedented amount of aggregated data in machine-readable form, with much of it available at http://www.healthdata.gov. These data range from previously unpublished statistics on Medicare spending, utilization, and quality at the state, hospital referral region, and county level, to detailed information on the quality performance of hospitals, nursing homes, and other providers.
In May 2013, CMS released information on the average charges for the 100 most common inpatient services at more than 3,000 hospitals nationwide http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Inpatient.html.
In June 2013, CMS released average charges for 30 selected outpatient procedures http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Outpatient.html.
We will continue to work toward harnessing the power of data to promote quality and value, and improve the health of our seniors and persons with disabilities.”

Medicare to Publish Trove of Data on Doctors


Louise Radnofsky in the Wall Street Journal: “The Obama administration said it would publish as early as next week data on what Medicare paid individual doctors in 2012, aiming to boost transparency and help root out fraud.
The move, which faced fierce resistance from doctors’ groups, would end a decades-long block on making the information public.
Federal officials said they planned to release reimbursement information on April 9 or soon after that would show billing data for 880,000 health-care providers treating patients in the government-run insurance program for elderly and disabled people. It will include how many times the providers carried out a particular service or procedure, whether they carried it out in a medical facility or an office setting, the average amount they charged Medicare for it, the average amount they were paid for it, and the total number of people they treated.
The data set would show the names and addresses of the providers in connection with their reimbursement information, officials at the Centers for Medicare and Medicaid Services said. The agency hasn’t previously released such data.
Physicians’ organizations had sought to prevent the release of the data, citing concerns about physician privacy. But a federal judge last year lifted a long-standing injunction placed on the publication of the information by a federal court in Florida, in response to a challenge from Dow Jones & Co., The Wall Street Journal’s parent company.
Jonathan Blum, principal deputy administrator at CMS, informed the American Medical Association and Florida Medical Association in letters dated Wednesday that the agency would move to publish the data soon.
Ardis Dee Hoven, president of the American Medical Association, said the group remained concerned that CMS was taking a “broad approach” that could result in “unwarranted bias against physicians that can destroy careers.” Dr. Hoven said the AMA wanted doctors to be able to review and correct their information before the data set was published. The Florida Medical Association couldn’t immediately be reached.
Mr. Blum said that for privacy reasons, data related to subsets of fewer than 11 Medicare patients would be redacted.
In the letters, Mr. Blum said the agency believed that news organizations seeking the information—which include the Journal—would be able to use it to shed light on problems in the Medicare program. He also specifically cited earlier reporting by the Journal that had drawn on similar data.
“The Department concluded that the data to be released would assist the public’s understanding of Medicare fraud, waste, and abuse, as well as shed light on payments to physicians for services furnished to Medicare beneficiaries,” Mr. Blum wrote. “As an example, using similar payment information, The Wall Street Journal was able to identify and report on a number of instances of Medicare fraud, waste, and abuse, using Medicare payment data in its Secrets of the System series,” Mr. Blum wrote. That series was a finalist for a Pulitzer Prize in 2011.”