Opinion Mining in Social Big Data


New Paper by Wlodarczak, Peter and Ally, Mustafa and Soar, Jeffrey: “Opinion mining has rapidly gained importance due to the unprecedented amount of opinionated data on the Internet. People share their opinions on products and services, and they rate movies, restaurants or vacation destinations. Social media platforms such as Facebook and Twitter have made it easier than ever for users to share their views and make them accessible to anybody on the Web. The economic potential has been recognized by companies that want to improve their products and services, detect new trends and business opportunities, or find out how effective their online marketing efforts are. However, opinion mining using social media faces many challenges due to the amount and the heterogeneity of the available data. Also, spam and fake opinions have become a serious issue. There are also language-related challenges, such as the use of slang and jargon on social media or special characters like smileys, which are widely adopted on social media sites.
These challenges create many interesting research problems, such as determining the influence of social media on people’s actions, understanding opinion dissemination or determining the online reputation of a company. Not surprisingly, opinion mining using social media has become a very active area of research, and a lot of progress has been made over the last few years. This article describes the current state of research and the technologies that have been used in recent studies….(More)”
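
To make the language challenges concrete, below is a minimal, purely illustrative lexicon-based sentiment scorer that also handles smileys and a couple of slang spellings; the lexicon, emoticon table and example posts are assumptions made for this sketch, not material from the paper.

```python
# A minimal, illustrative sentiment scorer for social media text.
# The lexicon, emoticon table, slang map and example posts are hypothetical;
# a real system would rely on trained models and far larger resources.
import re

LEXICON = {"love": 1, "great": 1, "awesome": 1, "meh": -1, "awful": -1, "hate": -1}
EMOTICONS = {":)": 1, ":-)": 1, ":d": 1, ":(": -1, ":-(": -1}
SLANG = {"gr8": "great", "luv": "love"}  # normalise common slang spellings

def score(text: str) -> int:
    """Return a crude polarity score: >0 positive, <0 negative, 0 neutral."""
    total = 0
    for token in re.findall(r"[:;][-']?[)(dp]|\w+", text.lower()):
        token = SLANG.get(token, token)          # map slang to dictionary words
        total += EMOTICONS.get(token, 0) + LEXICON.get(token, 0)
    return total

if __name__ == "__main__":
    for post in ["Luv this phone :)", "Service was awful :("]:
        print(post, "->", score(post))
```

The point of the sketch is only that slang normalisation and emoticon handling have to be explicit steps in the pipeline; production systems would use trained classifiers and curated resources instead.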
 

The Internet’s hidden science factory


Jenny Marder at PBS Newshour: “….Marshall is a worker for Amazon’s Mechanical Turk, an online job forum where “requesters” post jobs, and an army of crowdsourced workers complete them, earning fantastically small fees for each task. The work has been called microlabor, and the jobs, known as Human Intelligence Tasks, or HITs, range wildly. Some are tedious: transcribing interviews or cropping photos. Some are funny: prank calling someone’s buddy (that’s worth $1) or writing the title to a pornographic movie based on a collection of dirty screen grabs (6 cents). And others are downright bizarre. One task, for example, asked workers to strap live fish to their chests and upload the photos. That paid $5 — a lot by Mechanical Turk standards….
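
The posting mechanics Marder describes can be scripted against Amazon’s public requester API. The snippet below is a hedged sketch using the boto3 MTurk client against the sandbox endpoint; the title, reward, durations and question HTML are hypothetical stand-ins, not details from the article.

```python
# Hedged sketch: creating a HIT with boto3's MTurk client against the
# requester *sandbox* endpoint, so no real payment is made. The title,
# reward, durations and question HTML are hypothetical stand-ins.
import boto3

client = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

question_xml = """<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <form action="https://www.mturk.com/mturk/externalSubmit" method="post">
        <p>Transcribe the 30-second audio clip linked in the instructions.</p>
        <textarea name="transcript"></textarea>
        <!-- a real HIT must also echo back the assignmentId parameter -->
        <input type="submit" value="Submit">
      </form>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>"""

hit = client.create_hit(
    Title="Transcribe a 30-second audio clip (example)",
    Description="Illustrative microtask for a hypothetical study",
    Keywords="transcription, audio, example",
    Reward="0.06",                       # fees are typically a few cents
    MaxAssignments=3,                    # number of workers per task
    LifetimeInSeconds=24 * 60 * 60,      # how long the HIT stays listed
    AssignmentDurationInSeconds=600,     # time a worker has to finish
    Question=question_xml,
)
print("HIT created:", hit["HIT"]["HITId"])
```
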
These aren’t obscure studies that Turkers are feeding. They span dozens of fields of research, including social, cognitive and clinical psychology, economics, political science and medicine. They teach us about human behavior. They deal in subjects like energy conservation, adolescent alcohol use, managing money and developing effective teaching methods.


….In 2010, the researcher Joseph Henrich and his team published a paper showing that an American undergraduate was about 4,000 times more likely than an average American to be the subject of a research study.
But that output pales in comparison to Mechanical Turk workers. The typical “Turker” completes more studies in a week than the typical undergraduate completes in a lifetime. That’s according to research by Rand, who surveyed both groups. Among those he surveyed, he found that the median traditional lab subject had completed 15 total academic studies — an average of one per week. The median Turker, on the other hand, had completed 300 total academic studies — an average of 20 per week….(More)”

Scenario Planning Case Studies Using Open Government Data


New Paper by Robert Power, Bella Robinson, Lachlan Rudd, and Andrew Reeson: “The opportunity for improved decision making has been enhanced in recent years through the public availability of a wide variety of information. In Australia, government data is routinely made available and maintained in the http://data.gov.au repository. This is a single point of reference for data that can be reused for purposes beyond those originally considered by the data custodians. Similarly, a wealth of citizen information is available from the Australian Bureau of Statistics. Combining this data allows informed decisions to be made through planning scenarios.

We present two case studies that demonstrate the utility of data integration and web mapping. As a simple proof of concept, the user can explore different scenarios in each case study by indicating the relative weightings to be used for the decision-making process. Both case studies are presented as publicly available, interactive map-based websites….(More)”
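
The weighting idea is straightforward to sketch in code. The toy example below assumes a handful of made-up indicators per region (the field names and figures are illustrative, not drawn from the paper or from data.gov.au) and ranks regions by a user-supplied weighted sum, which is essentially the interaction the map-based websites expose.

```python
# Illustrative sketch of weighted scenario scoring: each region gets a score
# from user-chosen weights over indicators. All names and numbers are
# hypothetical examples, not figures from the paper.
regions = [
    {"name": "Region A", "population_density": 0.8, "service_gap": 0.3, "travel_time": 0.5},
    {"name": "Region B", "population_density": 0.4, "service_gap": 0.9, "travel_time": 0.2},
]

def scenario_scores(regions, weights):
    """Rank regions by a weighted sum of normalised indicators."""
    scored = [(sum(weights[k] * r[k] for k in weights), r["name"]) for r in regions]
    return sorted(scored, reverse=True)

# A user exploring a different scenario simply shifts the weights around:
print(scenario_scores(regions, {"population_density": 0.2, "service_gap": 0.6, "travel_time": 0.2}))
```
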

Small Pieces Loosely Joined: How smarter use of technology and data can deliver real reform of local government


Policy Exchange (UK): “Local authorities could save up to £10 billion by 2020 through smarter and more collaborative use of technology and data.
Small Pieces Loosely Joined highlights how every year councils lose more than £1 billion by failing to identify where fraud has taken place. The paper also sheds light on how a lack of data sharing and collaboration between many local authorities, as well as the use of bespoke IT systems, keeps the cost of providing public services unsustainably high.
The report sets out three ways in which local authorities could not only save billions of pounds, but also provide better, more coordinated public services:

  1. Using data to predict and prevent fraud. Each year councils lose in excess of £1.3 billion through Council Tax fraud, benefit fraud and housing tenancy fraud (such as illegal subletting). By collecting and analysing data from numerous different sources, it is possible to predict where future violations are most likely to occur and direct investigative teams to respond to them first (a sketch of this approach follows the list).
  2. Sharing data between neighbouring councils. Sharing data would reveal where it might be beneficial for two or more neighbouring LAs to merge one or more services. For example, if one council spends £5m each year on combating a particular issue, such as investigating food safety violations, fly-tipping or pest control, it may be more cost-effective to hire the services of a neighbouring council that has a far greater incidence of that same issue.
  3. Phasing out costly bespoke IT systems. Rather than each LA independently designing or commissioning its own apps and online services (such as paying for council tax or reporting noisy neighbours), an ‘app store’ should be created where individuals, businesses or other organisations can bid to provide them. The services created could then be used by dozens – or even hundreds – of LAs, creating economies of scale that bring down prices for all.
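
To make point 1 concrete, here is a hedged sketch of one plausible shape such a model could take: merge records from several sources on a common identifier, train a simple classifier on past investigation outcomes, and rank open cases by predicted risk. The file names, fields and the choice of logistic regression are assumptions for illustration, not details from the report.

```python
# Hypothetical sketch of a fraud-risk model over merged council data.
# File names, fields and the logistic-regression choice are illustrative
# assumptions, not taken from the Policy Exchange report.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Join records from several (hypothetical) sources on a property identifier.
council_tax = pd.read_csv("council_tax_discounts.csv")   # e.g. single-person discounts
electoral_roll = pd.read_csv("electoral_roll.csv")       # registered occupants
tenancy = pd.read_csv("housing_tenancy.csv")             # tenancy records and past outcomes

cases = (
    council_tax
    .merge(electoral_roll, on="property_id")
    .merge(tenancy, on="property_id")
)

features = cases[["claims_single_occupancy", "registered_occupants", "years_at_address"]]
labels = cases["confirmed_fraud"]        # outcomes of past investigations

model = LogisticRegression().fit(features, labels)

# Rank open cases so investigators visit the highest-risk properties first.
cases["risk"] = model.predict_proba(features)[:, 1]
print(cases.sort_values("risk", ascending=False)[["property_id", "risk"]].head())
```
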

Since 2008, councils have shouldered the largest spending cuts of any part of the public sector – despite providing 80% of local public services – and face a funding shortfall of £12.4 billion by 2020. Some are doing admirably well under this extreme financial pressure, developing innovative schemes using data to ensure that they scale back spending but continue to provide vital public services. For example, Leeds, Yorkshire and Humber are developing a shared platform for digital services needed by all three councils. Similarly, a collaboration of public sector organisations in and around Hampshire and the Isle of Wight is developing ways of sharing data and helping neighbouring councils to share content and data through the Hampshire Hub.
Full Report

The story of the sixth myth of open data and open government


Paper by Ann-Sofie Hellberg and Karin Hedström: “The aim of this paper is to describe a local government effort to realise an open government agenda. This is done using a storytelling approach….The empirical data is based on a case study. We participated in, as well as followed, the process of realising an open government agenda on a local level, where citizens were invited to use open public data as the basis for developing apps and external web solutions. Based on an interpretative tradition, we chose storytelling as a way to scrutinize the competition process. In this paper, we present a story about the competition process using the story elements put forward by Kendall and Kendall (2012).

….Our research builds on existing research by proposing a sixth myth: that the “public” wants to make use of open data. We provide empirical insights into the challenge of gaining benefits from open public data. In particular, we illustrate the difficulties in getting citizens interested in using open public data. Our case shows that people seem to like the idea of open public data, but do not necessarily participate actively in the data re-use process…..This study illustrates the difficulties of promoting the re-use of open public data. Public organisations that want to pursue an open government agenda can use our findings as empirical insights… (More)”

 

Innovation Labs: Leveraging Openness for Radical Innovation?


Paper by Gryszkiewicz, Lidia and Lykourentzou, Ioanna and Toivonen, Tuukka: “A growing range of public, private and civic organisations, from Unicef through Nesta to Tesco, now run units known as ‘innovation labs’. The hopeful assumption they share is that labs, by building on openness among other features, can generate promising solutions to grand challenges of the future. Despite their seeming proliferation and popularisation, the underlying innovation paradigm embodied by labs has so far received scant academic attention. This is a missed opportunity, because innovation labs are potentially fruitful vehicles for leveraging openness for radical innovation. Indeed, they not only strive to span organisational, sectoral and geographical boundaries by bringing a variety of uncommon actors together to embrace radical ideas and out-of-the-box thinking, but they also aim to apply the concept of openness throughout the innovation process, including the experimentation and development phases. While the phenomenon of labs clearly forms part of a broader trend towards openness, it seems to transcend traditional conceptualisations of open innovation (Chesbrough, 2006), open strategy (Whittington et al., 2011), open science (David, 1998) or open government (Janssen et al., 2012). What are innovation labs about, how do they differ from other innovation efforts and how do they embrace openness to create breakthrough innovations? This short exploratory paper is an introduction to a larger empirical study aiming to answer these questions….(More).”

Surveying the citizen science landscape


Paper by Andrea Wiggins and Kevin Crowston in First Monday: “Citizen science has seen enormous growth in recent years, in part due to the influence of the Internet, and a corresponding growth in interest. However, the few stand-out examples that have received attention from media and researchers are not representative of the diversity of the field as a whole, and therefore may not be the best models for those seeking to study or start a citizen science project. In this work, we present the results of a survey of citizen science project leaders, identifying sub-groups of project types according to a variety of features related to project design and management, including funding sources, goals, participant activities, data quality processes, and social interaction. These combined features highlight the diversity of citizen science, providing an overview of the breadth of the phenomenon and laying a foundation for comparisons among citizen science projects and with other online communities….(More).”

Access to Scientific Data in the 21st Century: Rationale and Illustrative Usage Rights Review


Paper by James Campbell in Data Science Journal: “Making scientific data openly accessible and available for re-use is desirable to encourage validation of research results and/or economic development. Understanding what users may, or may not, do with data in online data repositories is key to maximizing the benefits of scientific data re-use. Many online repositories that allow access to scientific data indicate that data is “open,” yet specific usage conditions reviewed on 40 “open” sites suggest that there is no agreed-upon understanding of what “open” means with respect to data. This inconsistency can be an impediment to data re-use by researchers and the public. (More)”

Open Government: Origin, Development, and Conceptual Perspectives


Paper by Bernd W. Wirtz & Steven Birkmeyer in the International Journal of Public Administration: “The term “open government” is frequently used in practice and science. Since President Obama’s Memorandum for the Heads of Executive Departments and Agencies in March 2009, open government has attracted an enormous amount of public attention. It is applied by authors from diverse areas, leading to a very heterogeneous comprehension of the concept. Against this background, this article screens the current open government literature to deduce an integrative definition of open government. Furthermore, this article analyzes the empirical and conceptual literature of open government to deduce an open government framework. In general, this article provides a clear understanding of the open government concept. (More)”

The new scientific revolution: Reproducibility at last


In the Washington Post: “…Reproducibility is a core scientific principle. A result that can’t be reproduced is not necessarily erroneous: Perhaps there were simply variables in the experiment that no one detected or accounted for. Still, science sets high standards for itself, and if experimental results can’t be reproduced, it’s hard to know what to make of them.
“The whole point of science, the way we know something, is not that I trust Isaac Newton because I think he was a great guy. The whole point is that I can do it myself,” said Brian Nosek, the founder of a start-up in Charlottesville, Va., called the Center for Open Science. “Show me the data, show me the process, show me the method, and then if I want to, I can reproduce it.”
The reproducibility issue is closely associated with a Greek researcher, John Ioannidis, who published a paper in 2005 with the startling title “Why Most Published Research Findings Are False.”
Ioannidis, now at Stanford, has started a program to help researchers improve the reliability of their experiments. He said the surge of interest in reproducibility was in part a reflection of the explosive growth of science around the world. The Internet is a factor, too: It’s easier for researchers to see what everyone else is doing….
Errors can potentially emerge from a practice called “data dredging”: When an initial hypothesis doesn’t pan out, the researcher will scan the data for something that looks like a story. The researcher will see a bump in the data and think it’s significant, but the next researcher to come along won’t see it — because the bump was a statistical fluke….
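
A few lines of simulation show why such bumps appear. The sketch below (with arbitrary, made-up sample sizes) correlates a single outcome against 40 variables of pure noise; at the conventional 0.05 threshold, at least one of them will usually look “significant” by chance alone.

```python
# A small simulation of "data dredging": test one outcome against many
# unrelated variables and, by chance alone, something will usually look
# "significant". Sample sizes and counts below are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_variables = 50, 40

outcome = rng.normal(size=n_subjects)                 # the measured result
noise = rng.normal(size=(n_variables, n_subjects))    # 40 variables of pure noise

p_values = [stats.pearsonr(v, outcome)[1] for v in noise]
print(f"smallest p-value over {n_variables} unrelated variables: {min(p_values):.3f}")
# With 40 tries at the 0.05 threshold, the chance of at least one spurious
# "hit" is about 1 - 0.95**40, roughly 87%: the bump the next study won't see.
```
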
So far about 7,000 people are using that service, and the center has received commitments for $14 million in grants, with partners that include the National Science Foundation and the National Institutes of Health, Nosek said.
Another COS initiative will help researchers register their experiments in advance, telling the world exactly what they plan to do, what questions they will ask. This would avoid the data-dredging maneuver in which researchers who are disappointed go on a deep dive for something publishable.
Nosek and other reformers talk about “publication bias.” Positive results get reported, negative results ignored. Someone reading a journal article may never know about all the similar experiments that came to naught….(More).”