Who Retweets Whom: How Digital And Legacy Journalists Interact on Twitter


Paper by Michael L. Barthel, Ruth Moon, and William Mari published by the Tow Center: “When bloggers and citizen journalists became fixtures of the U.S. media environment, traditional print journalists responded with a critique. According to mainstream reporters, the interlopers were ‘unprofessional, unethical, and overly dependent on the very mainstream media they criticized.’ In a 2013 poll of journalists, 51 percent agreed that citizen journalism is not real journalism.

However, the digital media environment, a space for easy interaction, has provided opportunities for journalists of all stripes to vault the barriers between the legacy and digital sectors; if not collaborating, then at least communicating.

This brief by three PhD candidates at the University of Washington, Michael L. Barthel, Ruth Moon, and William Mari, takes a snapshot of how fifteen political journalists from BuzzFeed, Politico, and The New York Times (representing digital, hybrid, and legacy outlets, respectively) interact on Twitter. The researchers place those interactions in the context of reporters’ longstanding traditions of gossip, goading, collaboration, and competition.

They found tribalism, pronounced most strongly in the legacy outlet but present across each grouping. They found hierarchy and status-boosting. But those phenomena were not absolute; there were also instances of cooperation, sharing, and mutual benefit. Nonetheless, by these indicators at least, there was a clear pecking order: digital and hybrid organizations’ journalists paid ‘more attention to traditional than digital publications.’

You can download your copy here (pdf).”

Study to examine Australian businesses’ use of government data


ComputerWorld: “New York University’s GovLab and the federal Department of Communications have embarked on a study of how Australian organisations are employing government data sets.

The ‘Open Data 500’ study was launched today at the Locate15 conference. It aims to provide a basis for assessing the value of open data and encourage the development of new businesses based on open data, as well as encourage discussion about how to make government data more useful to businesses and not-for-profit organisations.

The study is part of a series of studies taking place under the auspices of the OD500 Global Network.

“This study will help ensure the focus of Government is on the publication of high value datasets, with an emphasis on quality rather than quantity,” a statement issued by the Department of Communications said.

“Open Data 500 advances the government’s policy of increasing the number of high value public datasets in Australia in an effort to drive productivity and innovation, as well as its commitment to greater consultation with private sector stakeholders on open data,” Communications Minister Malcolm Turnbull said in remarks prepared for the Locate15 conference….(More)”

The Algorithmic Self


Frank Pasquale in The Hedgehog Review: “…For many technology enthusiasts, the answer to the obesity epidemic—and many other problems—lies in computational countermeasures to the wiles of the food scientists. App developers are pioneering behavioristic interventions to make calorie counting and exercise prompts automatic. For example, users of a new gadget, the Pavlok wristband, can program it to give them an electronic shock if they miss exercise targets. But can such stimuli break through the blooming, buzzing distractions of instant gratification on offer in so many rival games and apps? Moreover, is there another way of conceptualizing our relationship to our surroundings than as a suboptimal system of stimulus and response?
Some of our subtlest, most incisive cultural critics have offered alternatives. Rather than acquiesce to our manipulability, they urge us to become more conscious of its sources—be they intrusive advertisements or computers that we (think we) control. For example, Sherry Turkle, founder and director of the MIT Initiative on Technology and Self, sees excessive engagement with gadgets as a substitution of the “machinic” for the human—the “cheap date” of robotized interaction standing in for the more unpredictable but ultimately challenging and rewarding negotiation of friendship, love, and collegiality. In The Glass Cage, Nicholas Carr critiques the replacement of human skill with computer mediation that, while initially liberating, threatens to sap the reserves of ingenuity and creativity that enabled the computation in the first place.
Beyond the psychological, there is a political dimension, too. Legal theorist and Georgetown University law professor Julie Cohen warns of the dangers of “modulation,” which enables advertisers, media executives, political consultants, and intelligence operatives to deploy opaque algorithms to monitor and manipulate behavior. Cultural critic Rob Horning ups the ante on the concerns of Cohen and Turkle with a series of essays dissecting feedback loops among surveillance entities, the capture of important information, and self-readjusting computational interventions designed to channel behavior and thought into ever-narrower channels. Horning also criticizes Carr for failing to emphasize the almost irresistible economic logic behind algorithmic self-making—at first for competitive advantage, then, ultimately, for survival.
To negotiate contemporary algorithms of reputation and search—ranging from resumé optimization on LinkedIn to strategic Facebook status updates to OkCupid profile grooming—we are increasingly called on to adopt an algorithmic self, one well practiced in strategic self-promotion. This algorithmic selfhood may be critical to finding job opportunities (or even maintaining a reliable circle of friends and family) in an era of accelerating social change. But it can also become self-defeating. Consider, for instance, the self-promoter whose status updates on Facebook or LinkedIn gradually tip from informative to annoying. Or the search-engine-optimizing website whose tactics become a bit too aggressive, thereby causing it to run afoul of Google’s web spam team and consequently sink into obscurity. The algorithms remain stubbornly opaque amid rapidly changing social norms. A cyber-vertigo results, as we are pressed to promote our algorithmic selves but puzzled over the best way to do so….(More)”

The Data Disclosure Decision


“The CIO Council Innovation Committee has released its first Open Data case study, The Data Disclosure Decision, showcasing the Department of Education (Education) Disclosure Review Board.
The Department of Education is a national warehouse for open data across a decentralized educational system, managing and exchanging education-related data from across the country. Education collects large amounts of aggregate data at the state, district, and school level, disaggregated by a number of demographic variables. A majority of the data Education collects is considered personally identifiable information (PII), making data disclosure avoidance plans a mandatory component of Education’s data releases. With its expansive data sets and a need to protect sensitive information, Education quickly recognized the need to organize and standardize its data disclosure protocol.
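To make “disclosure avoidance” concrete, here is a minimal sketch of one common technique: suppressing table cells with small counts so individuals cannot be re-identified from published aggregates. The threshold, table, and column names are invented for illustration; the case study does not specify which methods Education’s board applies.

```python
import pandas as pd

# Hypothetical aggregate table: student counts by school and demographic group.
data = pd.DataFrame({
    "school": ["A", "A", "B", "B"],
    "group":  ["X", "Y", "X", "Y"],
    "count":  [120, 7, 95, 2],
})

THRESHOLD = 10  # assumed minimum publishable cell size

def suppress_small_cells(df: pd.DataFrame, threshold: int = THRESHOLD) -> pd.DataFrame:
    """Blank out counts below the threshold before release."""
    out = df.copy()
    # Cells failing the threshold become NaN, i.e. withheld from publication.
    out["count"] = out["count"].mask(out["count"] < threshold)
    return out

print(suppress_small_cells(data))  # the rows with counts 7 and 2 are suppressed
```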
Education formally established the Disclosure Review Board when Secretary of Education Arne Duncan signed its charter in August 2013. Since its inception, the Disclosure Review Board has achieved substantial successes and has greatly increased the volume and quality of data being released. Education’s Disclosure Review Board is continually learning through its open data journey and improving its approach through cultural change and leadership buy-in.
Learn more about Education’s Disclosure Review Board by reading The Data Disclosure Decision, the full account of its experience and what it learned along the way.”

New portal to crowdsource captions, transcripts of old photos and documents in the national archives


Irene Tham at The Straits Times: “Wanted: history enthusiasts to caption old photographs and transcribe handwritten manuscripts that contain a piece of Singapore’s history.

They are invited to contribute to an upcoming portal that will carry some 3,000 unidentified photographs dating back to the late 1800s, and 3,000 pages of Straits Settlements records including letters written during Sir Stamford Raffles’ administration of Singapore.

These are collections from the Government and individuals waiting to be “tagged” on the new portal – The Citizen Archivist Project at www.nas.gov.sg/citizenarchivist….

Without tagging – such as by photo captioning and digital transcription – these records cannot be searched. There are over 140,000 photos and about one million pages of Straits Settlements Records in total that cannot be searched today.


“The key challenge is that they were written in elaborate cursive penmanship which is not machine-readable,” said Dr Yaacob Ibrahim, Minister for Communications and Information, adding that the knowledge and wisdom of the public can be tapped on to make these documents more accessible.

Mr Arthur Fong (West Coast GRC) had asked how the Government could get young people interested in history, and Dr Yaacob said this initiative was something they would enjoy.

Portal users must first log in using their existing Facebook, Google or National Library Board accounts. Contributions will be saved in users’ profiles, automatically created upon signing in.

Transcript contributions on the portal work much like Wikipedia: contributed text is uploaded to the portal immediately.

However, the National Archives will take up to three days to review photo caption contributions. Approved captions will be uploaded on its website at www.nas.gov.sg/archivesonline….(More)”

How Open Is University Data?


Daniel Castro at GovTech: “Many states now support open data, or data that’s made freely available without restriction in a nonproprietary, machine-readable format, to increase government transparency, improve public accountability and participation, and unlock opportunities for civic innovation. To date, 10 states have adopted open data policies, via executive order or legislation, and 24 states have built open data portals. But while many agencies have joined the open data movement, state colleges and universities have largely ignored this opportunity. To remedy this, policymakers should consider how to extend open data policies to state colleges and universities.

There are many potential benefits of open data for higher education. First, it can help prospective students and their parents better understand the value of different degree programs. One way to control rising higher ed costs is to create more informed consumers. The feds are already pushing for such changes. President Obama and Education Secretary Arne Duncan called for schools to make more information publicly available about the costs of obtaining a college degree, and the White House launched the College Scorecard, an online tool to compare data about the average tuition cost, size of loan payments and loan default rate for different schools.

But students deserve more detailed information. Prospective students should be able to decide where to attend and what to study based on historical data like program costs, percentage of students completing the program and how long they take to do so, and what kind of earning power they have after graduating.

Second, open data can aid better fiscal oversight and accountability of university operations. In 2014, states provided about $76 billion in support for higher ed, yet few colleges and universities have adopted open data policies to increase the transparency of their budgets. Contrast this with California cities like Oakland, Palo Alto and Los Angeles, which created online tools to let others explore and visualize their budgets. Additional oversight, including from the public, could help reduce fraud, waste and abuse in higher education, save taxpayers money and create more opportunities for public participation in state budgeting.

Third, open data can be a valuable resource for producing innovations that make universities a better place to work and study. Large campuses are basically small cities, and many cities have found open data useful for improving public safety and optimizing transportation services. Universities hold much untapped data: course catalogs, syllabi, bus schedules, campus menus, campus directories, faculty evaluations, etc. Creating portals to release these data sets and building application programming interfaces to access this information would give developers direct access to data that students, faculty, alumni and other stakeholders could use to build apps and services to improve the college experience….(More)”
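As a sketch of what such an application programming interface might look like, the snippet below serves an invented course catalog as JSON; the route, fields, and data are all hypothetical, not drawn from any actual university portal, and Flask is assumed to be installed.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Invented sample of the kind of campus data a university might publish.
COURSES = [
    {"code": "CS101", "title": "Intro to Programming", "seats": 120},
    {"code": "HIST210", "title": "U.S. History Since 1865", "seats": 45},
]

@app.route("/api/v1/courses")
def list_courses():
    """Return the course catalog as JSON for student-built apps to consume."""
    return jsonify(COURSES)

if __name__ == "__main__":
    app.run()  # serves http://localhost:5000/api/v1/courses
```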

Tweets Can Predict Health Insurance Exchange Enrollment


PennMedicine: “An increase in Twitter sentiment (the positivity or negativity of tweets) is associated with an increase in state-level enrollment in the Affordable Care Act’s (ACA) health insurance marketplaces — a phenomenon that points to use of the social media platform as a real-time gauge of public opinion and provides a way for marketplaces to quickly identify enrollment changes and emerging issues. Although Twitter has been previously used to measure public perception on a range of health topics, this study, led by researchers at the Perelman School of Medicine at the University of Pennsylvania and published online in the Journal of Medical Internet Research, is the first to look at its relationship with the new national health insurance marketplace enrollment.

The study examined 977,303 ACA and “Obamacare”-related tweets — along with those directed toward the Twitter handle for HealthCare.gov and the 17 state-based marketplace Twitter accounts — in March 2014, then tested the correlation of Twitter sentiment with marketplace enrollment by state. Tweet sentiment was determined using the National Research Council (NRC) sentiment lexicon, which contains more than 54,000 words with corresponding sentiment weights ranging from positive to negative. For example, the word “excellent” has a positive sentiment weight, and is more positive than the word “good,” but the word “awful” is negative. Using this lexicon, researchers found that a 0.10 increase in the sentiment of tweets was associated with a nine percent increase in health insurance marketplace enrollment at the state level. While a 0.10 increase may seem small, these numbers indicate a significant correlation between Twitter sentiment and enrollment across the continuum of sentiment scores observed in nearly a million tweets.
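To make the lexicon-based approach concrete, here is a minimal sketch of how tweets could be scored against a sentiment lexicon and averaged by state. The tiny lexicon, weights, and tweets are invented stand-ins for the NRC lexicon and the study’s corpus, and this is only the scoring step, not the full statistical analysis.

```python
from statistics import mean

# Toy stand-in for the NRC lexicon: word -> sentiment weight
# (positive values indicate positive sentiment, negative values negative).
LEXICON = {"excellent": 0.9, "good": 0.5, "awful": -0.8, "confusing": -0.4}

def tweet_sentiment(text: str) -> float:
    """Average the weights of lexicon words found in the tweet."""
    weights = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return mean(weights) if weights else 0.0

# Hypothetical tweets tagged by state.
tweets = [
    ("PA", "signing up was excellent"),
    ("PA", "the website is confusing"),
    ("TX", "awful experience with enrollment"),
]

# State-level sentiment: the mean score over each state's tweets.
by_state: dict[str, list[float]] = {}
for state, text in tweets:
    by_state.setdefault(state, []).append(tweet_sentiment(text))
scores = {state: mean(vals) for state, vals in by_state.items()}
print(scores)  # {'PA': 0.25, 'TX': -0.8}
```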

“The correlation between Twitter sentiment and the number of eligible individuals who enrolled in a marketplace plan highlights the potential for Twitter to be a real-time monitoring strategy for future enrollment periods,” said first author Charlene A. Wong, MD, a Robert Wood Johnson Foundation Clinical Scholar and Fellow in Penn’s Leonard Davis Institute of Health Economics. “This would be especially valuable for quickly identifying emerging issues and making adjustments, instead of having to wait weeks or months for that information to be released in enrollment reports, for example.”…(More)”

Encyclopedia of Social Network Analysis and Mining


“The Encyclopedia of Social Network Analysis and Mining (ESNAM) is the first major reference work to integrate fundamental concepts and research directions in the areas of social networks and their applications to data mining. While ESNAM reflects the state of the art in social network research, the field had its start in the 1930s, when fundamental issues in social network research were broadly defined. The networks studied then were limited to relatively small numbers of nodes (actors) and links. More recently, the advent of electronic communication, and in particular of online communities, has created social networks of hitherto unimaginable sizes. People around the world are directly or indirectly connected by popular social networks established using web-based platforms rather than by physical proximity.

Reflecting the interdisciplinary nature of this unique field, the essential contributions of diverse disciplines, from computer science, mathematics, and statistics to sociology and behavioral science, are described among the 300 authoritative yet highly readable entries. Students will find a world of information and insight behind the familiar façade of the social networks in which they participate. Researchers and practitioners will benefit from a comprehensive perspective on the methodologies for analysis of constructed networks, and the data mining and machine learning techniques that have proved attractive for sophisticated knowledge discovery in complex applications. Also addressed is the application of social network methodologies to other domains, such as web networks and biological networks….(More)”
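To give a flavor of the analyses the encyclopedia surveys, here is a small, self-contained illustration (not taken from the book) computing degree centrality — one of the field’s oldest measures — on Zachary’s karate club, a classic small social network of the size the field studied before online communities.

```python
import networkx as nx

# Zachary's karate club (1977): 34 actors, 78 social ties.
G = nx.karate_club_graph()

# Degree centrality: an actor's number of ties, normalized by n - 1.
centrality = nx.degree_centrality(G)

# The three most-connected actors.
top = sorted(centrality, key=centrality.get, reverse=True)[:3]
print([(node, round(centrality[node], 2)) for node in top])
# [(33, 0.52), (0, 0.48), (32, 0.36)]
```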

Philadelphia’s Newly Upgraded Open Data Portal


Michael Grass at Government Executive: “If you’re looking for streets where vending is prohibited in the city of Philadelphia, the city’s newly upgraded open data portal has that information. If you’re looking for information on reported bicycle thefts, the city’s open data portal has that information, too. Same goes for the city’s budget.

Philadelphia’s recently relaunched open data portal, Open Data Philly, has 264 data sets, applications and APIs available for the public to access and use. Much of that information comes from municipal sources.

“The redesign of OpenDataPhilly will increase access to available data, thereby enabling our citizens to become more engaged and knowledgeable and our government more accountable,” Mayor Michael Nutter said in a statement last month.

But Philadelphia’s open data portal isn’t just designed to unlock datasets at City Hall.

The city’s universities, cultural and non-profit organizations and commercial entities are part of the portal as well. Portal users interested in historic maps of the city can access the Philadelphia GeoHistory Network, a project of the Athenaeum of Philadelphia, which maintains a tool where layers of historic maps can be overlaid on an interactive Google map.

You can even find a list of current happy hour specials, courtesy of DrinkPhilly….(More)”

“Data on the Web” Best Practices


W3C First Public Working Draft: “…The best practices described below have been developed to encourage and enable the continued expansion of the Web as a medium for the exchange of data. The growth of open data by governments across the world [OKFN-INDEX], the increasing publication of research data encouraged by organizations like the Research Data Alliance [RDA], the harvesting and analysis of social media, crowd-sourcing of information, the provision of important cultural heritage collections such as at the Bibliothèque nationale de France [BNF] and the sustained growth in the Linked Open Data Cloud [LODC], provide some examples of this phenomenon.

In broad terms, data publishers aim to share data either openly or with controlled access. Data consumers (who may also be producers themselves) want to be able to find and use data, especially if it is accurate, regularly updated and guaranteed to be available at all times. This creates a fundamental need for a common understanding between data publishers and data consumers. Without this agreement, data publishers’ efforts may be incompatible with data consumers’ desires.

Publishing data on the Web creates new challenges, such as how to represent, describe and make data available in a way that makes it easy to find and to understand. In this context, it becomes crucial to provide guidance to publishers that will improve consistency in the way data is managed, thus promoting the re-use of data, fostering trust in the data among developers, whatever technology they choose to use, and increasing the potential for genuine innovation.

This document sets out a series of best practices that will help publishers and consumers face the new challenges and opportunities posed by data on the Web.

Best practices cover different aspects related to data publishing and consumption, like data formats, data access, data identification and metadata. In order to delimit the scope and elicit the required features for Data on the Web Best Practices, the DWBP working group compiled a set of use cases [UCR] that represent scenarios of how data is commonly published on the Web and how it is used. The set of requirements derived from these use cases was used to guide the development of the best practices.

The Best Practices proposed in this document are intended to serve a more general purpose than the practices suggested in Best Practices for Publishing Linked Data [LD-BP]: they are domain-independent and, whilst they recommend the use of Linked Data, they also promote best practices for data on the web in formats such as CSV and JSON. The Best Practices related to the use of vocabularies incorporate practices that stem from Best Practices for Publishing Linked Data where appropriate….(More)
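As one concrete illustration of describing tabular data so that consumers can find and understand it, the sketch below writes a minimal metadata record in the style of the W3C’s companion “CSV on the Web” (CSVW) vocabulary; the dataset, file names and columns are invented for the example.

```python
import json

# A minimal, hypothetical CSVW-style metadata record describing an
# invented CSV file of city bus stops.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "bus-stops.csv",  # the data file being described
    "dc:title": "City bus stops",
    "dc:license": "https://creativecommons.org/licenses/by/4.0/",
    "tableSchema": {
        "columns": [
            {"name": "stop_id",   "datatype": "string"},
            {"name": "latitude",  "datatype": "number"},
            {"name": "longitude", "datatype": "number"},
        ]
    },
}

# By CSVW convention, metadata for <file> lives alongside it in
# <file>-metadata.json.
with open("bus-stops.csv-metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```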