Wittgenstein, #TheDress and Google’s search for a bigger truth


Robert Shrimsley at the Financial Times: “As the world burnt with a BuzzFeed-prompted debate over whether a dress was black and blue or white and gold, the BBC published a short article posing the question everyone was surely asking: ‘What would Wittgenstein say about that dress?’

Wittgenstein died in 1951, so we cannot know if the philosopher of language, truth and context would have been a devotee of BuzzFeed. (I guess it depends on whether we are talking of the early or the late Ludwig. The early Wittgenstein, it is well known, was something of an enthusiast for LOLs, whereas the later was more into WTFs and OMGs.)

The dress will now join the pantheon of web phenomena such as “Diet Coke and Mentos” and “Charlie bit my finger”. But this trivial debate on perceived truth captured in miniature a wider issue for the web: how to distil fact from noise when opinion drowns out information and value is determined by popularity.

At about the same time as the dress was turning the air blue — or was it white? — the New Scientist published a report on how one web giant might tackle this problem, a development in which Wittgenstein might have been very interested. The magazine reported on a Google research paper about how the company might reorder its search rankings to promote sites that could be trusted to tell the truth. (Google produces many such papers a year so this is a long way short of official policy.) It posits a formula for finding and promoting sites with a record of reliability.

This raises an interesting question over how troubled we should be by the notion that a private company with its own commercial interests and a huge concentration of power could be the arbiter of truth. There is no current reason to see sinister motives in Google’s search for a better web: it is both honourable and good business. But one might ask how, for example, Google Truth might determine established truths on net neutrality….

The paper suggests using fidelity to proven facts as a proxy for trust. This is easiest with single facts, such as a date or place of birth. For example, it suggests that claiming Barack Obama was born in Kenya would push a site down the rankings. This would be good for politics, but facts are not always neutral. Google would risk being depicted as part of “the mainstream media”. Fox Search here we come….(More)”
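The underlying idea — score a source by how often its checkable claims agree with a knowledge base of established facts — can be illustrated with a toy sketch. The snippet below is not the model from the Google paper, which is considerably more sophisticated; the knowledge base, the claim triples and the simple ratio scoring are all hypothetical simplifications for illustration only.

```python
# Hedged sketch: score a site's trustworthiness by how well its extracted
# factual claims agree with a reference knowledge base. A toy illustration
# of the idea, not the model described in the Google research paper.

KNOWLEDGE_BASE = {
    ("barack obama", "born_in"): "united states",
    ("ludwig wittgenstein", "died_in"): "1951",
}

def trust_score(site_claims):
    """Fraction of a site's checkable claims that match the knowledge base.

    site_claims: list of (subject, predicate, value) triples extracted from
    the site's pages. Claims we cannot verify are simply ignored.
    """
    checked = correct = 0
    for subject, predicate, value in site_claims:
        reference = KNOWLEDGE_BASE.get((subject.lower(), predicate))
        if reference is None:
            continue  # not a proven fact we can check against
        checked += 1
        if value.lower() == reference:
            correct += 1
    return correct / checked if checked else None  # None = no evidence either way

# A site claiming Obama was born in Kenya scores 0 on that claim and,
# all else equal, would sink in a ranking that weights this signal.
example = [("Barack Obama", "born_in", "Kenya"),
           ("Ludwig Wittgenstein", "died_in", "1951")]
print(trust_score(example))  # 0.5
```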

Models and Patterns of Trust


Paper presented by Bran Knowles et al at the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing: “As in all collaborative work, trust is a vital ingredient of successful computer supported cooperative work, yet there is little in the way of design principles to help practitioners develop systems that foster trust. To address this gap, we present a set of design patterns, based on our experience designing systems with the explicit intention of increasing trust between stakeholders. We contextualize these patterns by describing our own learning process, from the development, testing and refinement of a trust model, to our realization that the insights we gained along the way were most usefully expressed through design patterns. In addition to a set of patterns for trust, this paper seeks to demonstrate the value of patterns as a means of communicating the nuances revealed through ethnographic investigation….(More)”

‘Data.gov-in-a-box’: Delimiting transparency


New paper by Clare Birchall in the European Journal of Social Theory: “Given that the Obama administration still relies on many strategies we would think of as sitting on the side of secrecy, it seems that the only lasting transparency legacy of the Obama administration will be data-driven or e-transparency as exemplified by the web interface ‘data.gov’. As the data-driven transparency model is exported and assumes an ascendant position around the globe, it is imperative that we ask what kind of publics, subjects, and indeed, politics it will produce. Open government data is not just a matter concerning accountability but is seen as a necessary component of the new ‘data economy’. To participate and benefit from this info-capitalist-democracy, the data subject is called upon to be both auditor and entrepreneur. This article explores the implications of responsibilization, outsourcing, and commodification on the contract of representational democracy and asks if there are other forms of transparency that might better resist neoliberal formations and re-politicize the public sphere….(More)”

“Data on the Web” Best Practices


W3C First Public Working Draft: “…The best practices described below have been developed to encourage and enable the continued expansion of the Web as a medium for the exchange of data. The growth of open data by governments across the world [OKFN-INDEX], the increasing publication of research data encouraged by organizations like the Research Data Alliance [RDA], the harvesting and analysis of social media, crowd-sourcing of information, the provision of important cultural heritage collections such as at the Bibliothèque nationale de France [BNF] and the sustained growth in the Linked Open Data Cloud [LODC], provide some examples of this phenomenon.

In broad terms, data publishers aim to share data either openly or with controlled access. Data consumers (who may also be producers themselves) want to be able to find and use data, especially if it is accurate, regularly updated and guaranteed to be available at all times. This creates a fundamental need for a common understanding between data publishers and data consumers. Without this agreement, data publishers’ efforts may be incompatible with data consumers’ desires.

Publishing data on the Web creates new challenges, such as how to represent, describe and make data available in a way that makes it easy to find and to understand. In this context, it becomes crucial to provide guidance to publishers that will improve consistency in the way data is managed, thus promoting the re-use of data and fostering trust in it among developers, whatever technology they choose to use, increasing the potential for genuine innovation.

This document sets out a series of best practices that will help publishers and consumers face the new challenges and opportunities posed by data on the Web.

Best practices cover different aspects related to data publishing and consumption, like data formats, data access, data identification and metadata. In order to delimit the scope and elicit the required features for Data on the Web Best Practices, the DWBP working group compiled a set of use cases [UCR] that represent scenarios of how data is commonly published on the Web and how it is used. The set of requirements derived from these use cases was used to guide the development of the best practices.

The Best Practices proposed in this document are intended to serve a more general purpose than the practices suggested in Best Practices for Publishing Linked Data [LD-BP]: they are domain-independent and, whilst they recommend the use of Linked Data, they also promote best practices for data on the web in formats such as CSV and JSON. The Best Practices related to the use of vocabularies incorporate practices that stem from Best Practices for Publishing Linked Data where appropriate….(More)”
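As a loose illustration of the pattern the draft encourages — data in an open format accompanied by machine-readable metadata — the sketch below writes a CSV file alongside a small JSON metadata record. The file names, field names and values are illustrative assumptions, not terms prescribed by the W3C document.

```python
# Hedged sketch of the general publishing pattern: data in an open format
# (CSV) plus a machine-readable metadata record (JSON). Field names and
# values are invented for illustration, not taken from the W3C draft.
import csv
import json

rows = [
    {"year": 2013, "open_datasets": 120},
    {"year": 2014, "open_datasets": 310},
]

with open("datasets-by-year.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["year", "open_datasets"])
    writer.writeheader()
    writer.writerows(rows)

metadata = {
    "title": "Open datasets published per year (example)",
    "description": "Toy dataset illustrating CSV plus metadata publication.",
    "license": "CC-BY-4.0",
    "modified": "2015-03-01",
    "distribution": [{"format": "text/csv", "downloadURL": "datasets-by-year.csv"}],
}

with open("datasets-by-year.metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

A consumer who finds the metadata record can then judge the data's format, licence and freshness before downloading the CSV itself, which is the kind of publisher–consumer agreement the draft is aiming at.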

States Use Big Data to Nab Tax Fraudsters


at Governing: “It’s tax season again. For most of us, that means undergoing the laborious and thankless task of assembling financial records and calculating taxes for state and federal returns. But for a small group of us, tax season is profit season. It’s the time of year when fraudsters busy themselves with stealing identities and electronically submitting fraudulent tax returns for refunds.
Nobody knows for sure just how much tax return fraud is committed, but the amount is rising fast. According to the U.S. Treasury, the number of identified fraudulent federal returns increased by 40 percent from 2011 to 2012, an increase of more than $4 billion. Ten years ago, New York state stopped refunds on 50,000 fraudulently filed tax returns. Last year, the number of stopped refunds was 250,000, according to Nonie Manion, executive deputy commissioner for the state’s Department of Taxation and Finance….
To combat the problem, state revenue and tax agencies are using software programs to sift through mounds of data and detect patterns that would indicate when a return is not valid. Just about every state with a tax fraud detection program already compares tax return data with information from other state agencies and private firms to spot incorrect mailing addresses and stolen identities. Because so many returns are filed electronically, fraud-spotting systems look for suspicious Internet protocol (IP) addresses. For example, tax auditors in New York noticed that similar IP addresses in Fort Lauderdale, Fla., were submitting a series of returns for refunds. When the state couldn’t match the returns with any employer data, they were flagged for further scrutiny and ultimately found to be fraudulent.
High-tech analytics is one way states keep up in the war on fraud. Another is accurate data. The third component is well-trained staff. But it takes time and money to put together the technology and the expertise to combat the growing sophistication of fraudsters….(More)”
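The New York example above hints at one simple screening rule: flag refund returns when many filings share an originating IP address and the reported employer cannot be matched to known records. The sketch below is a hypothetical simplification of that rule, not any state's actual detection system; the employer list, threshold and sample returns are invented.

```python
# Hedged sketch of one screening rule described in the article: flag refund
# returns that share an originating IP address and have no matching employer
# record. Real state systems combine many more signals; this data is invented.
from collections import Counter

KNOWN_EMPLOYERS = {"ACME Corp", "Springfield School District"}
IP_THRESHOLD = 3  # returns from a single IP before we get suspicious

def flag_suspicious(returns):
    """returns: list of dicts with 'return_id', 'ip' and 'employer' keys."""
    per_ip = Counter(r["ip"] for r in returns)
    flagged = []
    for r in returns:
        clustered_ip = per_ip[r["ip"]] >= IP_THRESHOLD
        unknown_employer = r["employer"] not in KNOWN_EMPLOYERS
        if clustered_ip and unknown_employer:
            flagged.append(r["return_id"])
    return flagged

sample = [
    {"return_id": "R1", "ip": "203.0.113.7", "employer": "Totally Real LLC"},
    {"return_id": "R2", "ip": "203.0.113.7", "employer": "Totally Real LLC"},
    {"return_id": "R3", "ip": "203.0.113.7", "employer": "Totally Real LLC"},
    {"return_id": "R4", "ip": "198.51.100.2", "employer": "ACME Corp"},
]
print(flag_suspicious(sample))  # ['R1', 'R2', 'R3']
```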

Evaluating Complex Social Initiatives


Srik Gopal at Stanford Social Innovation Review: “…the California Endowment (TCE) … and … The Grand Rapids Community Foundation (GRCF) … are just two funders who are actively “shifting the lens” in terms of how they are looking at evaluation in the light of complexity. They are building on the recognition that a traditional approach to evaluation—assessing specific effects of a defined program according to a set of pre-determined outcomes, often in a way that connects those outcomes back to the initiative—is increasingly falling short. There is a clear message emerging that evaluation needs to accommodate complexity, not “assume it away.”

My colleagues at FSG and I have, based on our work with TCE, GRCF, and numerous other clients, articulated a set of nine “propositions” in a recent practice brief that are helpful in guiding how we conceptualize, design, and implement evaluations of complex initiatives. We derived these propositions from what we now know (based on the emerging field of complexity science) to be distinctive characteristics of complex systems. We know, for example, that complex systems are always changing, often in unpredictable ways; they are never static. Hence, we need to design evaluations so that they are adaptive, flexible, and iterative, not rigid and cookie-cutter.

Below are three of the propositions in more detail, along with tools and methods that can help apply the proposition in practice.

It is important to note that many of the traditional tools and methods that form the backbone of sound evaluations—such as interviews, focus groups, and surveys—are still relevant. We would, however, suggest that organizations adapt those methods to reflect a complexity orientation. For example, interviews should explore the role of context; we should not confine them to the initiative’s boundaries. Focus groups should seek to understand local adaptation, not just adherence. And surveys should probe for relationships and interdependencies, not just perceived outcomes. In addition to traditional methods, we suggest incorporating newer, innovative techniques that provide a richer narrative, including:

  • Systems mapping—an iterative, often participatory process of graphically representing a system, including its components and connections
  • Appreciative inquiry—a group process that inquires into, identifies, and further develops the best of “what is” in organizations
  • Design thinking—a user-centered approach to developing new solutions to abstract, ill-defined, or complex problems… (More)”

New million dollar fund for participatory budgeting in South Australia


Medha Basu at Future Gov: “A new programme in South Australia is allowing citizens to determine which community projects should get funding.

The Fund My Community programme has a pool of AU$1 million (US$782,130) to fund projects by non-profit organisations aimed at supporting disadvantaged South Australians.

Organisations can nominate their projects for funding from this pool and anyone in the state can vote for the projects on the YourSAy web site.

All information about the projects submitted by the organisations will be available online to make the process transparent. “We hope that by providing the community with the right information about grant applications, people will support projects that will have the biggest impact in addressing disadvantage across South Australia,” the Fund My Community web site says.

The window to nominate community projects for funding is open until 2 April. Eligible applications will be opened for community assessment from 23 April to 4 May. The outcome will be announced and grants will be given out in June. See the full timeline here:

[Timeline graphic: Fund My Community South Australia]

There is a catch here, though. The projects that receive the most support from the community are suggested for funding, but due to “a legal requirement” the final decision and grant approval come from the Board of the Charitable and Social Welfare Fund, according to the YourSAy web site….(More)”

New research project to map the impact of open budget data


Jonathan Gray at Open Knowledge: “…a new research project to examine the impact of open budget data, undertaken as a collaboration between Open Knowledge and the Digital Methods Initiative at the University of Amsterdam, supported by the Global Initiative for Financial Transparency (GIFT).

The project will include an empirical mapping of who is active around open budget data around the world, and what the main issues, opportunities and challenges are according to different actors. On the basis of this mapping, it will provide a review of the various definitions and conceptions of open budget data, arguments for why it matters, best practices for publication and engagement, as well as applications and outcomes in different countries around the world.

As well as drawing on Open Knowledge’s extensive experience and expertise around open budget data (through projects such as Open Spending), it will utilise innovative tools and methods developed at the University of Amsterdam to harness evidence from the web, social media and collections of documents to inform and enrich our analysis.

As part of this project we’re launching a collaborative bibliography of existing research and literature on open budget data and associated topics which we hope will become a useful resource for other organisations, advocates, policy-makers, and researchers working in this area. If you have suggestions for items to add, please do get in touch.

This project follows on from other research projects we’ve conducted around this area – including on data standards for fiscal transparency, on technology for transparent and accountable public finance, and on mapping the open spending community….(More)”

CrowdFlower Launches Open Data Project


Anthony Ha at TechCrunch: “Crowdsourcing company CrowdFlower allows businesses to tap into a distributed workforce of 5 million contributors for basic tasks like sentiment analysis. Today it’s releasing some of that data to the public through its new Data for Everyone initiative…. The hope is to turn CrowdFlower into a central repository where open data can be found by researchers and entrepreneurs. (Factual was another startup trying to become a hub for open data, though in recent years it’s become more focused on gathering location data to power mobile ads.)…

As for the data that’s available now, …there’s a lot of Twitter sentiment analysis covering things like attitudes towards brands and products, yogurt (?), and climate change. Among the more recent data sets, I was particularly taken by the gender breakdown of who’s been on the cover of Time magazine and, yes, the analysis of who thought the dress (you know the one) was gold and white versus blue and black…. (More)”
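To give a sense of how a released dataset like these might be explored, here is a brief sketch using pandas. The file name and column names are assumptions for illustration, not CrowdFlower's actual schema; check the columns of whichever dataset you download.

```python
# Hedged sketch: tallying crowd judgments from an open dataset such as those
# in Data for Everyone. The file and column names here are hypothetical.
import pandas as pd

df = pd.read_csv("dress-color-judgments.csv")  # hypothetical export

# Share of contributors who reported each colour combination
print(df["dress_color"].value_counts(normalize=True))

# Cross-tab of perceived colour by self-reported viewing device, if present
if "device" in df.columns:
    print(pd.crosstab(df["device"], df["dress_color"], normalize="index"))
```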

Crowdsourcing America’s cybersecurity is an idea so crazy it might just work


at the Washington Post: “One idea that’s starting to bubble up from Silicon Valley is the concept of crowdsourcing cybersecurity. As Silicon Valley venture capitalist Robert R. Ackerman, Jr. has pointed out, due to “the interconnectedness of our society in cyberspace,” cyber networks are best viewed as an asset that we all have a shared responsibility to protect. Push on that concept hard enough and you can see how many of the core ideas from Silicon Valley – crowdsourcing, open source software, social networking, and the creative commons – can all be applied to cybersecurity.

Silicon Valley venture capitalists are already starting to fund companies that describe themselves as crowdsourcing cybersecurity. For example, take Synack, a “crowd security intelligence” company that received $7.5 million in funding from Kleiner Perkins (one of Silicon Valley’s heavyweight venture capital firms), Allegis Ventures, and Google Ventures in 2014. Synack’s two founders are ex-NSA employees, and they are using that experience to inform an entirely new type of business model. Synack recruits and vets a global network of “white hat hackers,” and then offers their services to companies worried about their cyber networks. For a fee, these hackers are able to find and repair any security risks.

So how would crowdsourced national cybersecurity work in practice?

For one, there would be free and transparent sharing between the government and private sector of computer code used to detect cyber threats. In December, the U.S. Army Research Lab added a bit of free source code, a “network forensic analysis framework” known as Dshell, to the mega-popular code sharing site GitHub. Already, there have been 100 downloads and more than 2,000 unique visitors. The goal, says William Glodek of the U.S. Army Research Laboratory, is for this shared code to “help facilitate the transition of knowledge and understanding to our partners in academia and industry who face the same problems.”

This open sourcing of cyber defense would be enhanced with a scaled-up program of recruiting “white hat hackers” to become officially part of the government’s cybersecurity efforts. Popular annual events such as the DEF CON hacking conference could be used to recruit talented cyber sleuths to work alongside the government.

There have already been examples of communities where people facing a common cyber threat gather together to share intelligence. Perhaps the best-known example is the Conficker Working Group, a security coalition that was formed in late 2008 to share intelligence about malicious Conficker malware. Another example is the Financial Services Information Sharing and Analysis Center, which was created by presidential mandate in 1998 to share intelligence about cyber threats to the nation’s financial system.

Of course, there are some drawbacks to this crowdsourcing idea. For one, such a collaborative approach to cybersecurity might open the door to government cyber defenses being infiltrated by the enemy. Ackerman makes the point that you never really know who’s contributing to any community. Even on a site such as GitHub, it’s theoretically possible that an ISIS hacker or someone like Edward Snowden could download the code, reverse-engineer it, and then use it to insert “Trojan Horses” intended for military targets into the code…. (More)”