Identifying Healthcare Fraud with Open Data


Paper by Xuan Zhang et al: “Health care fraud is a serious problem that impacts every patient and consumer. This fraudulent behavior causes excessive financial losses every year and causes significant patient harm. Healthcare fraud includes health insurance fraud, fraudulent billing of insurers for services not provided, and exaggeration of medical services, etc. To identify healthcare fraud thus becomes an urgent task to avoid the abuse and waste of public funds. Existing methods in this research field usually use classified data from governments, which greatly compromises the generalizability and scope of application. This paper introduces a methodology to use publicly available data sources to identify potentially fraudulent behavior among physicians. The research involved data pairing of multiple datasets, selection of useful features, comparisons of classification models, and analysis of useful predictors. Our performance evaluation results clearly demonstrate the efficacy of the proposed method….(More)”.
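
As a rough illustration of the steps the abstract names (pairing datasets, then comparing classifiers), the following minimal sketch joins a made-up billing table to a made-up exclusion list and scores two simple rules. It is not the authors' method: every record, field name (`physician_id`, `claims`, `avg_charge`) and threshold here is a hypothetical placeholder.

```python
# Toy illustration only: all records, field names, and thresholds are made up.
billing = [
    {"physician_id": "A1", "claims": 120, "avg_charge": 80.0},
    {"physician_id": "B2", "claims": 950, "avg_charge": 310.0},
    {"physician_id": "C3", "claims": 200, "avg_charge": 95.0},
    {"physician_id": "D4", "claims": 870, "avg_charge": 120.0},
]
excluded_ids = {"B2"}  # hypothetical list of sanctioned physicians


def pair_datasets(billing_rows, excluded):
    """Join billing rows to the exclusion list on physician_id to get labels."""
    return [dict(row, label=row["physician_id"] in excluded) for row in billing_rows]


def flag_by_volume(row):
    """Rule 1: flag physicians with an unusually high claim count."""
    return row["claims"] > 500


def flag_by_volume_and_charge(row):
    """Rule 2: additionally require a high average charge."""
    return row["claims"] > 500 and row["avg_charge"] > 200


def precision_recall(rows, classifier):
    """Score a rule against the labels obtained from the paired data."""
    tp = sum(1 for r in rows if classifier(r) and r["label"])
    fp = sum(1 for r in rows if classifier(r) and not r["label"])
    fn = sum(1 for r in rows if not classifier(r) and r["label"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


paired = pair_datasets(billing, excluded_ids)
for name, rule in [("volume", flag_by_volume),
                   ("volume+charge", flag_by_volume_and_charge)]:
    print(name, precision_recall(paired, rule))
```

On this toy data the second rule avoids the one false positive that the first rule produces. A real study would use far larger datasets and learned models rather than hand-written thresholds.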

Information Asymmetries, Blockchain Technologies, and Social Change


Reflections by Stefaan Verhulst on “the potential (and challenges) of Distributed Ledgers for ‘Market for Lemons’ Conditions”: “We live in a data age, and it has become common to extol the transformative power of data and information. It is now conventional to assume that many of our most pressing public problems—everything from climate change to terrorism to mass migration—are amenable to a “data fix.”

The truth, though, is a little more complicated. While there is no doubt that data—when analyzed and used responsibly—holds tremendous potential, many factors affect whether, and to what extent, that potential will ultimately be fulfilled.

Our ability to address complex public problems using data depends vitally on how our respective data ecosystems are designed (as well as ongoing questions of representation in, power over, and stewardship of these ecosystems).

Flaws in our data ecosystem that prevent us from addressing problems, and that may also be responsible for many societal failures and inequalities, result from the fact that:

  • some actors have better access to data than others;
  • data is of poor quality (or even “fake”); contains implicit bias; and/or is not validated and thus not trusted;
  • only easily accessible data are shared and integrated (“open washing”) while important data remain carefully hidden or without resources for relevant research and analysis; and more generally that
  • even in an era of big and open data, information too often remains stove-piped, siloed, and generally difficult to access.

Several observers have pointed to the relationship between these information asymmetries and, for example, corruption, financial exclusion, global pandemics, forced mass migration, human rights abuses, and electoral fraud.

Consider the transaction costs, power inequities and other obstacles that result from such information asymmetries, namely:

–     At the individual level: too often someone who is trying to open a bank account (or sign up for new cell phone service) is unable to provide all the requisite information, such as credit history, proof of address or other confirmatory and trusted attributes of identity. As such, information asymmetries are in effect limiting this individual’s access to financial and communications services.

–     At the corporate level, a vast body of literature in economics has shown how uncertainty over the quality and trustworthiness of data can impose transaction costs, limit the development of markets for goods and services, or shut them down altogether. This is the well-known “market for lemons” problem made famous in a 1970 paper of the same name by George Akerlof.

–     At the societal or governance level, information asymmetries don’t just affect the efficiency of markets or social inequality. They can also incentivize unwanted behaviors that cause substantial public harm. Tyrants and corrupt politicians thrive on limiting their citizens’ access to information (e.g., information related to bank accounts, investment patterns or disbursement of public funds). Likewise, criminals operate and succeed in the information-scarce corners of the underground economy.

Blockchain technologies and Information Asymmetries

This is where blockchain comes in. At their core, blockchain technologies are a new type of disclosure mechanism that has the potential to address some of the information asymmetries listed above. There are many types of blockchain technologies, and while I use the blanket term ‘blockchain’ below for simplicity’s sake, the nuances between different types of blockchain technologies can greatly impact the character and likelihood of success of a given initiative.

By leveraging a shared and verified database of ledgers stored in a distributed manner, blockchain seeks to redesign information ecosystems in a more transparent, immutable, and trusted manner. Solving information asymmetries may be the real potential of blockchain, and this—much more than the current hype over virtual currencies—is the real reason to assess its potential….(More)”.
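
To make the "immutable" property concrete, here is a minimal sketch of a hash-linked ledger in Python. It is only an illustration under simplifying assumptions: there is no distribution, consensus, or signing, and the function names (`block_hash`, `append_block`, `is_valid`) and sample entries are hypothetical. It shows one idea only: each entry commits to the one before it, so a later edit to an earlier entry becomes detectable.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new entry that commits to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "public funds disbursed: 100")  # made-up entries
append_block(ledger, "public funds disbursed: 250")
print(is_valid(ledger))   # True: the chain is intact

ledger[0]["data"] = "public funds disbursed: 1"      # tamper with history
print(is_valid(ledger))   # False: the stored hash no longer matches
```

In an actual distributed ledger, many parties hold copies of the chain and must agree on new entries. That agreement, not the hashing alone, is what makes quietly rewriting history impractical.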

Evaluating Civic Open Data Standards


Renee Sieber and Rachel Bloom at SocArXiv Papers: “In many ways, a precondition to realizing the promise of open government data is the standardization of that data. Open data standards ensure interoperability, establish benchmarks in assessing whether governments achieve their goals in publishing open data, and can better ensure accuracy of the data. Interoperability enables the use of off-the-shelf software and can ease third party development of products that serve multiple locales.

Our project aims to determine which standards for civic data are “best” to open up government data. We began by disambiguating the multiple meanings of what constitutes a data standard by creating a standards stack.

The empirical research started by identifying twelve “high value” open datasets for which we found 22 data standards. A qualitative systematic review of the gray literature and standards documentation generated 18 evaluation metrics, which we grouped into four categories. We evaluated the metrics with civic data standards. Our goal is to identify and characterize types of standards and provide a systematic way to assess their quality…(More)”.

Is Open Data Working for Women in Africa?


Web Foundation: “Open data has the potential to change politics, economies and societies for the better by giving people more opportunities to engage in the decisions that affect their lives. But to reach the full potential of open data, it must be available to and used by all. Yet, across the globe — and in Africa in particular — there is a significant data gap.

This report — Is open data working for women in Africa — maps the current state of open data for women across Africa, with insights from country-specific research in Nigeria, Cameroon, Uganda and South Africa with additional data from a survey of experts in 12 countries across the continent.

Our findings show that, despite the potential for open data to empower people, it has so far changed little for women living in Africa.

Key findings

  • There is a closed data culture in Africa — Most countries lack an open culture and have legislation and processes that are not gender-responsive. Institutional resistance to disclosing data means few countries have open data policies and initiatives at the national level. In addition, gender equality legislation and policies are incomplete and failing to reduce gender inequalities. And overall, Africa lacks the cross-organisational collaboration needed to strengthen the open data movement.
  • There are barriers preventing women from using the data that is available — Cultural and social realities create additional challenges for women to engage with data and participate in the technology sector. 1GB of mobile data in Africa costs, on average, 10% of average monthly income. This high cost keeps women, who generally earn less than men, offline. Moreover, time poverty, the gender pay gap and unpaid labour create economic obstacles for women to engage with digital technology.
  • Key datasets to support the advocacy objectives of women’s groups are missing — Data on budget, health and crime are largely absent as open data. Nearly all datasets in sub-Saharan Africa (373 out of 375) are closed, and sex-disaggregated data, when available online, is often not published as open data. There are few open data policies to support opening up of key datasets and even when they do exist, they largely remain in draft form. With little investment in open data initiatives, good data management practices or for implementing Right To Information (RTI) reforms, improvement is unlikely.
  • There is no strong base of research on women’s access and use of open data — There is a lack of funding, little collaboration and few open data champions. Women’s groups, digital rights groups and gender experts rarely collaborate on open data and gender issues. To overcome this barrier, multi-stakeholder collaborations are essential to develop effective solutions….(More)”.

Data infrastructure literacy


Paper by Jonathan Gray, Carolin Gerlitz and Liliana Bounegru at Big Data & Society: “A recent report from the UN makes the case for “global data literacy” in order to realise the opportunities afforded by the “data revolution”. Here and in many other contexts, data literacy is characterised in terms of a combination of numerical, statistical and technical capacities. In this article, we argue for an expansion of the concept to include not just competencies in reading and working with datasets but also the ability to account for, intervene around and participate in the wider socio-technical infrastructures through which data is created, stored and analysed – which we call “data infrastructure literacy”. We illustrate this notion with examples of “inventive data practice” from previous and ongoing research on open data, online platforms, data journalism and data activism. Drawing on these perspectives, we argue that data literacy initiatives might cultivate sensibilities not only for data science but also for data sociology, data politics as well as wider public engagement with digital data infrastructures. The proposed notion of data infrastructure literacy is intended to make space for collective inquiry, experimentation, imagination and intervention around data in educational programmes and beyond, including how data infrastructures can be challenged, contested, reshaped and repurposed to align with interests and publics other than those originally intended….(More)”

Open Data in Tourism


European Data Portal: “New technologies are rapidly changing the tourism industry. Data are central assets in management and marketing of tourism destinations and businesses. Data-driven services have become a prominent tool for tourists to plan their trips. The study “Utilizing open data in tourism” predicts great potential for Open Data to increase innovations and destination management. Several actors already use Open Data to provide services in the tourism industry, e.g. the open service called Helsinki Region Infoshare from the city of Helsinki. Malta and Montenegro, for example, are providing data sets on tourist expenditure, hotels, accommodation, restaurants, events, bicycle stations, heritage sites, or beaches.

But not only government organisations and companies use Open Data in tourism. User-generated content, such as reviews and comments spread via social networking services, supports tourists’ decision making. The study “You will like it!” analyses user-generated Open Data to predict tourists’ perception of sights or attractions. Thereby they are contributing to the process of predicting tourists’ future preferences, which has potential implications and benefits for the tourism industry.

Engage in the discourse of Open Data in tourism, for example on 18 July: the meeting “Linked Open Data im Tourismus” for destination marketing organizations (DMOs) takes place in Innsbruck to discuss possibilities and prerequisites for using Open Data in tourism. If you would rather try out using Open Data to plan your next weekend trip, visit the European Data Portal featured data article on “Use Open Data to prepare your holiday trip”….(More)”.

Microsoft Research Open Data


Microsoft Research Open Data: “… is a data repository that makes available datasets that researchers at Microsoft have created and published in conjunction with their research. You can browse available datasets and either download them or directly copy them to an Azure-based Virtual Machine or Data Science Virtual Machine. To the extent possible, we follow FAIR (findable, accessible, interoperable and reusable) data principles and will continue to push towards the highest standards for data sharing. We recognize that there are dozens of data repositories already in use by researchers and expect that the capabilities of this repository will augment existing efforts. Datasets are categorized by their primary research area. You can find links to research projects or publications with the dataset.

What is our goal?

Our goal is to provide a simple platform to Microsoft’s researchers and collaborators to share datasets and related research technologies and tools. The site has been designed to simplify access to these data sets, facilitate collaboration between researchers using cloud-based resources, and enable the reproducibility of research. We will continue to evolve and grow this repository and add features to it based on feedback from the community.

How did this project come to be?

Over the past few years, our team, based at Microsoft Research, has worked extensively with the research community to create cloud-based research infrastructure. We started this project as a prototype about a year ago and are excited to finally share it with the research community to support data-intensive research in the cloud. Because almost all research projects have a data component, there is real need for curated and meaningful datasets in the research community, not only in computer science but in interdisciplinary and domain sciences. We have now made several such datasets available for download or use directly on cloud infrastructure….(More)”.

My City Forecast: Urban planning communication tool for citizen with national open data


Paper by Y. Hasegawa, Y. Sekimoto, T. Seto, Y. Fukushima et al in Computers, Environment and Urban Systems: “In urban management, the importance of citizen participation is being emphasized more than ever before. This is especially true in countries where depopulation has become a major concern for urban managers and many local authorities are working on revising city master plans, often incorporating the concept of the “compact city.” In Japan, for example, the implementation of compact city plans means that each local government decides on how to designate residential areas and promotes citizens moving to these areas in order to improve budget effectiveness and the vitality of the city. However, implementing a compact city is possible in various ways. Given that there can be some designated withdrawal areas for budget savings, compact city policies can include disadvantages for citizens. At this turning point for urban structures, citizen–government mutual understanding and cooperation is necessary for every step of urban management, including planning.

Concurrently, along with the recent rapid growth of big data utilization and computer technologies, a new conception of cooperation between citizens and government has emerged. With emerging technologies based on civic knowledge, citizens have started to obtain the power to engage directly in urban management by obtaining information, thinking about their city’s problems, and taking action to help shape the future of their city themselves (Knight Foundation, 2013). This development is also supported by the open government data movement, which promotes the availability of government information online (Kingston, Carver, Evans, & Turton, 2000). CityDashboard is one well-known example of real-time visualization and distribution of urban information. CityDashboard, a web tool launched in 2012 by University College London, aggregates spatial data for cities around the UK and displays the data on a dashboard and a map. These new technologies are expected to enable both citizens and government to see their urban situation in an interface presenting an overhead view based on statistical information.

However, usage of statistics and governmental data is as yet limited in the actual process of urban planning…

To help improve this situation and increase citizen participation in urban management, we have developed a web-based urban planning communication tool using open government data for enhanced citizen–government cooperation. The main aim of the present research is to evaluate the effect of our system on users’ awareness of and attitude toward the urban situation. We have designed and developed an urban simulation system, My City Forecast (http://mycityforecast.net), that enables citizens to understand how their environment and region are likely to change by urban management in the future (up to 2040)….(More)”.

Balancing Act: Innovation vs. Privacy in the Age of Data Portability


Thursday, July 12, 2018 @ 2 MetroTech Center, Brooklyn, NY 11201

The ability of people to move or copy data about themselves from one service to another — data portability — has been hailed as a way of increasing competition and driving innovation. In many areas, such as through the Open Banking initiative in the United Kingdom, the practice of data portability is fully underway and propagating. The launch of GDPR in Europe has also elevated the issue among companies and individuals alike. But recent online security breaches and other experiences of personal data being transferred surreptitiously from private companies (e.g., Cambridge Analytica’s appropriation of Facebook data) highlight how data portability can also undermine people’s privacy.

The GovLab at the NYU Tandon School of Engineering is pleased to present Jeni Tennison, CEO of the Open Data Institute, for its next Ideas Lunch, where she will discuss how data portability has been regulated in the UK and Europe, and what governments, businesses and people need to do to strike the balance between its risks and benefits.

Jeni Tennison is the CEO of the Open Data Institute. She gained her PhD from the University of Nottingham, then worked as an independent consultant, specialising in open data publishing and consumption, before joining the ODI in 2012. Jeni was awarded an OBE for services to technology and open data in the 2014 New Year Honours.

Before joining the ODI, Jeni was the technical architect and lead developer for legislation.gov.uk. She worked on the early linked data work on data.gov.uk, including helping to engineer new standards for publishing statistics as linked data. She continues her work within the UK’s public sector as a member of the Open Standards Board.

Jeni also works on international web standards. She was appointed to serve on the W3C’s Technical Architecture Group from 2011 to 2015 and in 2014 she started to co-chair the W3C’s CSV on the Web Working Group. She also sits on the Advisory Boards for Open Contracting Partnership and the Data Transparency Lab.

Twitter handle: @JeniT

Digital Government Review of Colombia


OECD Report: “This review analyses the shift from e-government to digital government in Colombia. It looks at the governance framework for digital government, the use of digital platforms and open data to engage and collaborate with citizens, conditions for a data-driven public sector, and policy coherence in a context of significant regional disparities. It provides concrete policy recommendations on how digital technologies and data can be harnessed for citizen-driven policy making and public service delivery…(More)”.