Paper by Hector J. Levesque: “The science of AI is concerned with the study of intelligent forms of behaviour in computational terms. But what does it tell us when a good semblance of a behaviour can be achieved using cheap tricks that seem to have little to do with what we intuitively imagine intelligence to be? Are these intuitions wrong, and is intelligence really just a bag of tricks? Or are the philosophers right, and is a behavioural understanding of intelligence simply too weak? I think both of these are wrong. I suggest in the context of question-answering that what matters when it comes to the science of AI is not a good semblance of intelligent behaviour at all, but the behaviour itself, what it depends on, and how it can be achieved. I go on to discuss two major hurdles that I believe will need to be cleared.”
Manipulation Among the Arbiters of Collective Intelligence: How Wikipedia Administrators Mold Public Opinion
New paper by Sanmay Das, Allen Lavoie, and Malik Magdon-Ismail: “Our reliance on networked, collectively built information is a vulnerability when the quality or reliability of this information is poor. Wikipedia, one such collectively built information source, is often our first stop for information on all kinds of topics; its quality has stood up to many tests, and it prides itself on having a “Neutral Point of View”. Enforcement of neutrality is in the hands of comparatively few, powerful administrators. We find a surprisingly large number of editors who change their behavior and begin focusing more on a particular controversial topic once they are promoted to administrator status. The conscious and unconscious biases of these few, but powerful, administrators may be shaping the information on many of the most sensitive topics on Wikipedia; some may even be explicitly infiltrating the ranks of administrators in order to promote their own points of view. Neither prior history nor vote counts during an administrator’s election can identify those editors most likely to change their behavior in this suspicious manner. We find that an alternative measure, which gives more weight to influential voters, can successfully reject these suspicious candidates. This has important implications for how we harness collective intelligence: even if wisdom exists in a collective opinion (like a vote), that signal can be lost unless we carefully distinguish the true expert voter from the noisy or manipulative voter.”
The Participatory Turn: Participatory Budgeting Comes to America
Thesis by Hollie Russon Gilman: “Participatory Budgeting (PB) has expanded to over 1,500 municipalities worldwide since its inception in Porto Alegre, Brazil in 1989 by the leftist Partido dos Trabalhadores (Workers’ Party). While PB has been adopted throughout the world, it has yet to take hold in the United States. This dissertation examines the introduction of PB to the United States with the first project in Chicago in 2009, and proceeds with an in-depth case study of the largest implementation of PB in the United States: Participatory Budgeting in New York City. I assess the outputs of PB in the United States including deliberations, governance, and participation. I argue that PB produces better outcomes than the status quo budget process in New York City, while also transforming how those who participate understand themselves as citizens, constituents, Council members, civil society leaders and community stakeholders. However, there are serious challenges to participation, including high costs of engagement, process exhaustion, and perils of scalability. I devise a framework for assessment called “citizenly politics,” focusing on: 1) designing participation, 2) deliberation, 3) participation, and 4) potential for institutionalization. I argue that while the material results PB produces are relatively modest, including more innovative projects, PB delivers more substantial non-material or existential results. Existential citizenly rewards include: greater civic knowledge, strengthened relationships with elected officials, and greater community inclusion. Overall, PB provides a viable and informative democratic innovation for strengthening civic engagement within the United States that can be streamlined and adopted to scale.”
Crowd-Sourcing the Nation: Now a National Effort
Release from the U.S. Department of the Interior, U.S. Geological Survey: “The mapping crowd-sourcing program, known as The National Map Corps (TNMCorps), encourages citizens to collect structures data by adding new features, removing obsolete points, and correcting existing data for The National Map database. Structures being mapped in the project include schools, hospitals, post offices, police stations and other important public buildings.
Since the start of the project in 2012, more than 780 volunteers have made in excess of 13,000 contributions. In addition to basic editing, a second volunteer peer review process greatly enhances the quality of data provided back to The National Map. A few months ago, volunteers in 35 states were actively involved. This final release of states opens up the entire country for volunteer structures enhancement.
To show appreciation of our volunteers’ efforts, The National Map Corps has instituted a recognition program that awards “virtual” badges to volunteers. The badges consist of a series of antique surveying instruments ranging from the Order of the Surveyor’s Chain (25 – 50 points) to the Theodolite Assemblage (2000+ points). Additionally, volunteers are publicly acclaimed (with permission) via Twitter, Facebook and Google+….
Tools on the TNMCorps website explain how a volunteer can edit any area, regardless of familiarity with the selected structures. Becoming a volunteer for TNMCorps is easy: go to The National Map Corps website to learn more and to sign up as a volunteer. If you have access to the Internet and are willing to dedicate some time to editing map data, we hope you will consider participating!”
Five myths about big data
Samuel Arbesman, senior scholar at the Ewing Marion Kauffman Foundation and the author of “The Half-Life of Facts” in the Washington Post: “Big data holds the promise of harnessing huge amounts of information to help us better understand the world. But when talking about big data, there’s a tendency to fall into hyperbole. It is what compels contrarians to write such tweets as “Big Data, n.: the belief that any sufficiently large pile of s— contains a pony.” Let’s deflate the hype.
1. “Big data” has a clear definition.
The term “big data” has been in circulation since at least the 1990s, when it is believed to have originated in Silicon Valley. IBM offers a seemingly simple definition: Big data is characterized by the four V’s of volume, variety, velocity and veracity. But the term is thrown around so often, in so many contexts — science, marketing, politics, sports — that its meaning has become vague and ambiguous….
2. Big data is new.
By many accounts, big data exploded onto the scene quite recently. “If wonks were fashionistas, big data would be this season’s hot new color,” a Reuters report quipped last year. In a May 2011 report, the McKinsey Global Institute declared big data “the next frontier for innovation, competition, and productivity.”
It’s true that today we can mine massive amounts of data — textual, social, scientific and otherwise — using complex algorithms and computer power. But big data has been around for a long time. It’s just that exhaustive datasets were more exhausting to compile and study in the days when “computer” meant a person who performed calculations….
3. Big data is revolutionary.
In their new book, “Big Data: A Revolution That Will Transform How We Live, Work, and Think,” Viktor Mayer-Schonberger and Kenneth Cukier compare “the current data deluge” to the transformation brought about by the Gutenberg printing press.
If you want more precise advertising directed toward you, then yes, big data is revolutionary. Generally, though, it’s likely to have a modest and gradual impact on our lives….
4. Bigger data is better.
In science, some admittedly mind-blowing big-data analyses are being done. In business, companies are being told to “embrace big data before your competitors do.” But big data is not automatically better.
Really big datasets can be a mess. Unless researchers and analysts can reduce the number of variables and make the data more manageable, they get quantity without a whole lot of quality. Give me some quality medium data over bad big data any day…
5. Big data means the end of scientific theories.
Chris Anderson argued in a 2008 Wired essay that big data renders the scientific method obsolete: Throw enough data at an advanced machine-learning technique, and all the correlations and relationships will simply jump out. We’ll understand everything.
But you can’t just go fishing for correlations and hope they will explain the world. If you’re not careful, you’ll end up with spurious correlations. Even more important, to contend with the “why” of things, we still need ideas, hypotheses and theories. If you don’t have good questions, your results can be silly and meaningless.
Having more data won’t substitute for thinking hard, recognizing anomalies and exploring deep truths.”
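The “fishing for correlations” problem above is worth making concrete: test enough hypotheses on pure noise and some of them will look significant. The sketch below (an editorial illustration, not from the article; all data is randomly generated) computes the family-wise false-positive probability 1 − (1 − α)^k and then simulates pairwise correlations among pure-noise variables, counting how many clear a naive significance cutoff.

```python
import math
import random

# Run k independent significance tests at level alpha on pure noise;
# the chance that at least one looks "significant" is 1 - (1 - alpha)^k.
def familywise_false_positive(alpha, k):
    return 1 - (1 - alpha) ** k

# With 100 tests at the usual 5% level, a spurious "finding" is
# almost guaranteed (probability above 99%).
p = familywise_false_positive(0.05, 100)

# Plain Pearson correlation coefficient.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Simulate 30 unrelated noise variables (50 samples each) and count pairs
# that exceed |r| > 0.28, roughly the 5% critical value for n = 50.
rng = random.Random(42)
noise = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(30)]
spurious = sum(
    1
    for i in range(30)
    for j in range(i + 1, 30)
    if abs(pearson(noise[i], noise[j])) > 0.28
)
print(p, spurious)
```

With 435 variable pairs and no real relationships at all, several pairs typically clear the cutoff anyway, which is exactly why big datasets demand hypotheses rather than replacing them.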
Announcing Project Open Data from Cloudant Labs
Yuriy Dybskiy from Cloudant: “There has been an emerging pattern over the last few years of more and more government datasets becoming available for public access. Earlier this year, the White House announced official policy on such data – Project Open Data.
Available resources
Here are four resources on the topic:
- Tim Berners-Lee: Open, Linked Data for a Global Community – [10 min video]
- Rufus Pollock: Open Data – How We Got Here and Where We’re Going – [24 min video]
- Open Knowledge Foundation Datasets – http://data.okfn.org/data
- Max Ogden: Project dat – collaborative data – [github repo]
One of the main challenges is access to the datasets. If only there were a database with easy access to its data baked right in.
Luckily, there are CouchDB and Cloudant, which share the same API for accessing data over HTTP. This makes them a really great option for storing interesting datasets.
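Because the API is plain HTTP and JSON, reading a dataset is just a GET request against a documented endpoint such as `_all_docs` with `include_docs=true`. The sketch below parses a canned response in that format so it runs without a network connection; the account and database names are hypothetical placeholders, not real Cloudant resources.

```python
import json

# A CouchDB/Cloudant database lists its documents at the `_all_docs`
# endpoint; with `include_docs=true` each row carries the full document.
# Hypothetical URL for illustration only:
url = "https://example-account.cloudant.com/businesses_sf/_all_docs?include_docs=true"

# Canned response in the shape `_all_docs` returns, so this sketch
# runs offline:
sample_response = json.loads("""
{
  "total_rows": 2,
  "offset": 0,
  "rows": [
    {"id": "b1", "key": "b1", "value": {"rev": "1-a"},
     "doc": {"_id": "b1", "name": "Example Cafe", "zip": "94110"}},
    {"id": "b2", "key": "b2", "value": {"rev": "1-b"},
     "doc": {"_id": "b2", "name": "Example Books", "zip": "94103"}}
  ]
}
""")

# Pull the documents out of the row wrappers.
docs = [row["doc"] for row in sample_response["rows"]]
names = [d["name"] for d in docs]
print(names)  # ['Example Cafe', 'Example Books']
```

Any HTTP client can consume this, which is what makes a database with its data "baked right in" attractive for publishing open datasets.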
Cloudant Open Data
Today we are happy to announce a Cloudant Labs project – Cloudant Open Data!
Several datasets are available at the moment, for example, businesses_sf – data regarding businesses registered in San Francisco and sf_pd_incidents – a collection of incident reports (criminal and non-criminal) made by the San Francisco Police Department.
We’ll add more, but if you have one you’d like us to add faster – drop us a line at [email protected]
Create an account and play with these datasets yourself”
Improved Governance? Exploring the Results of Peru's Participatory Budgeting Process
Paper by Stephanie McNulty for the 2013 Annual Meeting of the American Political Science Association (Aug. 29-Sept. 1, 2013): “Can a nationally mandated participatory budget process change the nature of local governance? Passed in 2003 to mandate participatory budgeting in all districts and regions of Peru, Peru’s National PB Law has garnered international attention from proponents of participatory governance. However, to date, the results of the process have not been widely documented. Presenting data that have been gathered through fieldwork, online databases, and primary documents, this paper explores the results of Peru’s PB after ten years of implementation. The paper finds that results are limited. While there are a significant number of actors engaged in the process, the PB is still dominated by elite actors who do not represent the diversity of the civil society sector in Peru. Participants approve important “pro-poor” projects, but they are not always executed. Finally, two important indicators of governance, sub-national conflict and trust in local institutions, have not improved over time. Until Peruvian politicians make a concerted effort to move beyond politics as usual, results will continue to be limited.”
OpenCounter
Code for America: “OpenCounter’s mission is to empower entrepreneurs and foster local economic development by simplifying the process of registering a business.
Economic development happens in many forms, from projects like the revitalization of the Brooklyn Navy Yard or Hudson Rail Yards in New York City, to campaigns to encourage residents to shop at local merchants. While the majority of headlines will focus on a city’s effort to secure a major new employer (think Apple’s 1,000,000 square foot expansion in Austin, Texas), most economic development and job creation happens on a much smaller scale, as individuals stake their financial futures on creating a new product, store, service or firm.
But these new businesses aren’t in a position to accept tax breaks on capital equipment or enter into complex development and disposition agreements to build new offices or stores. Many new businesses can’t even meet the underwriting criteria of SBA backed revolving-loan programs. Competition for local grants for facade improvements or signage assistance can be fierce….
Despite many cities’ genuine efforts to be “business-friendly,” their default user interface consists of fluorescent-lit Formica, waiting lines, and stacks of forms. Online resources often remind one of a phone book, with little interactivity or specialization based on either the business’s function or location within a jurisdiction.
That’s why we built OpenCounter….See what we’re up to at opencounter.us or visit a live version of our software at http://opencounter.cityofsantacruz.com.”
From Machinery to Mobility: Government and Democracy in a Participative Age
New book by Jeffrey Roy: “The Westminster-stylized model of Parliamentary democratic politics and public service accountability is increasingly out of step with the realities of today’s digitally and socially networked era. This book explores the reconfiguration of democratic and managerial governance within democratic societies due to the advent of technological mobility. More specifically, the traditional public sector prism of organization and accountability – denoted as ‘machinery of government’ – is increasingly strained in an era characterized by smart devices, social media, and cloud computing. This book examines the roots and implications of the tensions between machinery and mobility and the sorts of investments and initiatives that have been undertaken by governments around the world, as well as their appropriateness and relative impacts. It also examines the prospects for holistic adaptation of democratic and managerial systems going forward, identifying the most crucial directions and determinants for improving public sector performance in terms of outcomes, accountability, and agility. Accordingly, the ultimate aim of this initiative is to contribute to the formation of intellectual foundations for more systemic reforms of public sector governance in Canada and elsewhere, and to offer forward-looking trajectories for government adaptation in shifting from a traditional prism of ‘machinery’ to new organizational and institutional arrangements better suited for an era of ‘mobility’.”
Defense Against National Vulnerabilities in Public Data
DOD/DARPA Notice (See also Foreign Policy article): “OBJECTIVE: Investigate the national security threat posed by public data available either for purchase or through open sources. Based on principles of data science, develop tools to characterize and assess the nature, persistence, and quality of the data. Develop tools for the rapid anonymization and de-anonymization of data sources. Develop framework and tools to measure the national security impact of public data and to defend against the malicious use of public data against national interests.
DESCRIPTION: The vulnerabilities to individuals from a data compromise are well known and documented now as “identity theft.” These include regular stories published in the news and research journals documenting the loss of personally identifiable information by corporations and governments around the world. Current trends in social media and commerce, with voluntary disclosure of personal information, create other potential vulnerabilities for individuals participating heavily in the digital world. The Netflix Challenge in 2009 was launched with the goal of creating better customer pick prediction algorithms for the movie service [1]. An unintended consequence of the Netflix Challenge was the discovery that it was possible to de-anonymize the entire contest data set with very little additional data. This de-anonymization led to a federal lawsuit and the cancellation of the sequel challenge [2]. The purpose of this topic is to understand the national level vulnerabilities that may be exploited through the use of public data available in the open or for purchase.
Could a modestly funded group deliver nation-state type effects using only public data?…”
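The Netflix episode the notice cites was at heart a linkage attack: an “anonymized” dataset keeps behavioral detail (which movies a user rated), and an attacker who learns a few of one person’s ratings from a public source can often narrow the records down to a unique match. The toy sketch below (entirely made-up data, added for illustration and not part of the DARPA notice) shows the mechanic.

```python
# "Anonymized" release: names dropped, but each record keeps the set of
# movies that user rated. All records and titles here are invented.
anonymized = {
    "user_017": {"Heat", "Alien", "Clue", "Big"},
    "user_042": {"Heat", "Jaws", "Big", "Up"},
    "user_108": {"Clue", "Jaws", "Up", "Elf"},
}

# Auxiliary knowledge: three titles the target is publicly known to have
# rated (e.g. from reviews posted under their real name elsewhere).
aux = {"Heat", "Jaws", "Up"}

# Keep only records consistent with the auxiliary ratings; a unique
# survivor re-identifies the "anonymous" record.
matches = [uid for uid, rated in anonymized.items() if aux <= rated]
print(matches)  # ['user_042']
```

Even three data points single out one record here; in high-dimensional real datasets, a handful of approximate observations is typically enough, which is why the notice treats public and purchasable data as a national-scale vulnerability.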
The official link for this solicitation is: www.acq.osd.mil/osbp/sbir/solicitations/sbir20133.