The Modern Beauty of 19th-Century Data Visualizations


Laura Bliss at CityLab: “The Library of Congress’ online presence is a temple of American history, an unmatched, searchable collection of digitized photographs, maps, recordings, sheet music, and documents in the millions, dating back to the 15th century.
 
Sifting through these treasures isn’t so easy, though. When you do manage the clunky search interface and stumble across a gorgeous 1870s statistical atlas, it’s hard to zoom in closely on its pages and properly marvel at the antique gem.
Problem solved, thanks to the info-nerds at Vintage Visualizations, a project of the Brooklyn Brainery. They’ve reproduced a number of the LOC’s Civil War-era data visualizations in high-quality poster prints, and they are mouthwateringly cool. For example, I really wish we still ranked city populations like this chart does, which traces a century of census data in colorful Jenga towers (NYC, forever the biggest apple!):

Behold, the ratio of “church accommodation” by state, circa 1870, displayed like wallpaper swatches…(More)”

Study: Complaining on Twitter correlates with heart disease risks


at ArsTechnica: “Tweets prove better regional heart disease predictor than many classic factors. This week, a study was released by researchers at the University of Pennsylvania that found a surprising correlation when studying two kinds of maps: those that mapped the county-level frequency of cardiac disease, and those that mapped the emotional state of an area’s Twitter posts.
In all, researchers sifted through over 826 million tweets, made available by Twitter’s research-friendly “garden hose” server access, then narrowed those down to roughly 146 million tweets that had been posted with geolocation data from over 1,300 counties (each county needed to have at least 50,000 tweets to sift through to qualify). The team then measured an individual county’s expected “health” level based on frequency of certain phrases, using dictionaries that had been put through scrutiny over their application to emotional states. Negative statements about health, jobs, and attractiveness—along with a bump in curse words—would put a county in the “risk” camp, while words like “opportunities,” “overcome,” and “weekend” added more points to a county’s “protective” rating.
Not only did this measure correlate strongly with age-adjusted heart disease rate data, it turned out to be a more efficient predictor of higher or lower disease likelihood than “ten classical predictors” combined, including education, obesity, and smoking. Twitter beat that data by a rate of 42 percent to 36 percent….Psychological Science, 2014. DOI: 10.1177/0956797614557867….(More)”
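The dictionary-based scoring described in this excerpt can be illustrated with a minimal sketch. It is not the researchers' actual pipeline: the function names are hypothetical, the risk words are placeholders (the excerpt names no specific risk terms), and only the three protective words and the 50,000-tweet threshold come from the text above.

```python
import re
from collections import Counter

# The three protective words are quoted in the excerpt; the risk words
# are hypothetical placeholders, since the excerpt lists none.
PROTECTIVE_WORDS = {"opportunities", "overcome", "weekend"}
RISK_WORDS = {"hate", "tired", "bored"}

# Threshold taken from the excerpt: a county needs at least 50,000 tweets.
MIN_TWEETS_PER_COUNTY = 50_000


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def county_scores(tweets, min_tweets=MIN_TWEETS_PER_COUNTY):
    """Return per-county risk and protective word frequencies.

    `tweets` is an iterable of (county, text) pairs. Counties with fewer
    than `min_tweets` tweets are dropped, mirroring the qualifying rule.
    """
    tweet_counts = Counter()
    word_totals = Counter()
    risk_hits = Counter()
    protective_hits = Counter()

    for county, text in tweets:
        tokens = tokenize(text)
        tweet_counts[county] += 1
        word_totals[county] += len(tokens)
        risk_hits[county] += sum(t in RISK_WORDS for t in tokens)
        protective_hits[county] += sum(t in PROTECTIVE_WORDS for t in tokens)

    scores = {}
    for county, n_tweets in tweet_counts.items():
        total = word_totals[county]
        if n_tweets < min_tweets or total == 0:
            continue  # not enough data to score this county
        scores[county] = {
            "risk": risk_hits[county] / total,
            "protective": protective_hits[county] / total,
        }
    return scores
```

Scores of this kind would then be compared against county-level heart disease rates; that correlation step is left out here because the excerpt does not describe how it was done.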

At Universities, a Push for Data-Driven Career Services


at The New York Times: “Officials at the University of California, San Diego, had sparse information on the career success of their graduates until they set up a branded page for the university on LinkedIn a couple of years ago.

“Back then, we had records on 125,000 alumni, but we had good employment information on less than 10,000 of them,” recalled Armin Afsahi, who oversees alumni relations as the university’s associate vice chancellor for advancement. “Aside from Qualcomm, which is in our back yard, we didn’t know who employed our alumni.”

Within three months of setting up the university page, LinkedIn connections surfaced information on 92,000 alumni, Mr. Afsahi said.

The LinkedIn page of University of California, San Diego.

….

“The old models of alumni relations don’t work,” Mr. Afsahi said. “We have to be a data-driven, intelligence-oriented organization to create the engagement and value” that students and alumni expect.

In an article on Sunday, I profiled two analytics start-ups, EverTrue and Graduway, which aim to help colleges and universities identify their best prospective donors or student mentors by scanning their graduates’ social networking activities. Each start-up taps into LinkedIn profiles of alumni — albeit in different ways — to help institutions of higher education stay up-to-date with their graduates’ contact information and careers.

Since 2013, however, LinkedIn has offered its own proprietary service, called University Pages, where schools can create hubs for alumni outreach and networking. About 25,000 institutions of higher learning around the world now have official university pages on the site…(More).”

Big Data Now


at Radar – O’Reilly: “In the four years we’ve been producing Big Data Now, our wrap-up of important developments in the big data field, we’ve seen tools and applications mature, multiply, and coalesce into new categories. This year’s free wrap-up of Radar coverage is organized around eight themes:

  • Cognitive augmentation: As data processing and data analytics become more accessible, jobs that can be automated will go away. But to be clear, there are still many tasks where the combination of humans and machines produces superior results.
  • Intelligence matters: Artificial intelligence is now playing a bigger and bigger role in everyone’s lives, from sorting our email to rerouting our morning commutes, from detecting fraud in financial markets to predicting dangerous chemical spills. The computing power and algorithmic building blocks to put AI to work have never been more accessible.
  • The convergence of cheap sensors, fast networks, and distributed computation: The amount of quantified data available is increasing exponentially — and aside from tools for centrally handling huge volumes of time-series data as it arrives, devices and software are getting smarter about placing their own data accurately in context, extrapolating without needing to ‘check in’ constantly.
  • Reproducing, managing, and maintaining data pipelines: The coordination of processes and personnel within organizations to gather, store, analyze, and make use of data.
  • The evolving, maturing marketplace of big data components: Open-source components like Spark, Kafka, Cassandra, and ElasticSearch are reducing the need for companies to build in-house proprietary systems. On the other hand, vendors are developing industry-specific suites and applications optimized for the unique needs and data sources in a field.
  • The value of applying techniques from design and social science: While data science knows human behavior in the aggregate, design works in the particular, where A/B testing won’t apply — you only get one shot to communicate your proposal to a CEO, for example. Similarly, social science enables extrapolation from sparse data. Both sets of tools enable you to ask the right questions, and scope your problems and solutions realistically.
  • The importance of building a data culture: An organization that is comfortable with gathering data, curious about its significance, and willing to act on its results will perform demonstrably better than one that doesn’t. These priorities must be shared throughout the business.
  • The perils of big data: From poor analysis (driven by false correlation or lack of domain expertise) to intrusiveness (privacy invasion, price profiling, self-fulfilling predictions), big data has negative potential.

Download our free snapshot of big data in 2014, and follow the story this year on Radar.”

Charitable techies can now donate their skills to nonprofits in need


Springwise: “Technology is a mixed blessing for many nonprofits. While apps such as SnapDonate and Charitweet make it easier than ever for people to make donations, tech development and upkeep can cost NGOs a lot of money which is desperately needed elsewhere.
With this in mind, a new platform called #charity is encouraging IT professionals to donate their time and specialist tech skills to nonprofits in need, helping those organizations to reduce costs and put the money back into their missions.
Charities of all sizes can list their IT needs, one project at a time, on #charity for free. Volunteers can sign up to the network and #charity will smartmatch them with the project that best suits their skills and interests, using an algorithm which scans participants’ LinkedIn profiles. #charity assigns project managers to ensure the right people are in place, and it also backs up all the volunteers’ work on its platform in case they are not able to complete the task. Support is provided throughout for both parties, ensuring everyone’s experience is as smooth and efficient as possible.
#charity is set to launch in March. It currently has 350 participants on board and is accepting registration for early access. There are pilot programs underway involving Action Against Hunger, Moneythink and Syria Deeply….(More).”
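How such skill-based matching might work can be sketched as follows. This is an illustrative assumption, not #charity's actual algorithm, which the excerpt does not describe: the function names and sample data are hypothetical, and Jaccard overlap between skill sets is a stand-in scoring rule.

```python
def overlap_score(volunteer_skills, project_needs):
    """Jaccard overlap between two skill lists, compared case-insensitively."""
    volunteer = {s.lower() for s in volunteer_skills}
    needed = {s.lower() for s in project_needs}
    if not volunteer or not needed:
        return 0.0
    return len(volunteer & needed) / len(volunteer | needed)


def best_project(volunteer_skills, projects):
    """Return (project_name, score) for the best-matching project.

    `projects` maps a project name to the list of skills it needs.
    Returns (None, 0.0) when no project shares any skill with the volunteer.
    """
    best_name, best_score = None, 0.0
    for name, needs in projects.items():
        score = overlap_score(volunteer_skills, needs)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical example data.
projects = {
    "donor-db": ["SQL", "Python"],
    "website": ["HTML", "CSS", "JavaScript"],
}
match = best_project(["python", "sql", "excel"], projects)
```

A real matcher would presumably also weigh interests and availability, which the excerpt mentions but does not detail.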

Survive and Thrive: How Big Data Is Transforming Health Care


at Pacific Standard: “When you step on a scale, take your temperature, or check your blood pressure, you’re using data from your body to measure your health. Advances in fitness trackers have made health quantification more accessible to casual users. But for researchers, health care providers, and people with chronic conditions, advances in tracking technology, data analysis, and automation offer significant improvements in medical treatment and quality of life.

This three-part series explores health quantification through the eyes of Rutgers University Ph.D. student Maria Qadri, who has both professional and personal experience in the matter. Qadri’s research aims to help people with traumatic brain injury and Parkinson’s Disease better manage their illness, and, as a Type 1 diabetic, glucose monitoring is a major part of her own life. Below, we take a look at how number crunching and personal data factor into Qadri’s research and life….(More).”

Open Data Is Finally Making A Dent In Cities


Brooks Rainwater at Co-Exist: “As with a range of leading issues, cities are at the vanguard of this shifting environment. Through increased measurement, analysis, and engagement, open data will further solidify the centrality of cities.
In Chicago, the voice of the mayor counts for a lot. And Mayor Emanuel has been at the forefront in supporting and encouraging open data in the city, resulting in a strong open government community. The city has more than 600 datasets online, and has seen millions of page views on its data portal. The public benefits have accrued widely with civic initiatives like Chicagolobbyists.org, as well as with a myriad of other open-data-led endeavors.
Transparency is one of the great promises of open data. Petitioning the government is a fundamental tenet of democracy and many government relations professionals perform this task brilliantly. At the same time that transparency is good for the city, it’s good for citizens and democracy. Through the advent of Chicagolobbyists.org, anyone can now see how many lobbyists are in the city, how much they are spending, who they are talking to, and when it is happening.
Throughout the country, we are seeing data-driven sites and apps like this that engage citizens, enhance services, and provide a rich understanding of government operations. In Austin, a grassroots movement has formed with advocacy organization Open Austin. Through hackathons and other opportunities, citizens are getting involved, services are improving, and businesses are being built.
Data can even find your dog, reducing the number of stray animals being sheltered, with StrayMapper.com. The site has a simple map-based web portal where you can type in whether you are missing a dog or cat, when you lost them, and where. That information is then plugged into the data being collected by the city on stray animals. This project, developed by a Code for America brigade team, helps the city improve its rate of returning pets to owners.
It’s not only animals that get lost or at least can’t find the best way home. I’ve found myself in that situation too. Thanks to Ridescout, incubated in Washington, D.C., at 1776, I have been able to easily find the best way home. Through the use of open data available from both cities and the Department of Transportation, Ridescout created an app that is an intuitive mobility tool. By showing me all of the available options from transit to ridesharing to my own two feet, it frequently helps me get from place to place in the city. It looks like it wasn’t just me that found this app to be handy; Daimler recently acquired Ridescout as the auto giant continues its own expansion into the data-driven mobility space.”

Doing Social Network Research: Network-based Research Design for Social Scientists


New book by Garry Robins: “Are you struggling to design your social network research? Are you looking for a book that covers more than social network analysis? If so, this is the book for you! With straightforward guidance on research design and data collection, as well as social network analysis, this book takes you start to finish through the whole process of doing network research. Open the book and you’ll find practical, ‘how to’ advice and worked examples relevant to PhD students and researchers from across the social and behavioural sciences. The book covers:

  • Fundamental network concepts and theories
  • Research questions and study design
  • Social systems and data structures
  • Network observation and measurement
  • Methods for data collection
  • Ethical issues for social network research
  • Network visualization
  • Methods for social network analysis
  • Drawing conclusions from social network results

This is a perfect guide for all students and researchers looking to do empirical social network research…(More)”

The downside of Open Data


Joshua Chambers at FutureGov: “…Inaccurate public datasets can cause big problems, because apps that feed off of them could be giving out false information. I was struck by this when we reported on an app in Australia that was issuing alerts for forest fires that didn’t exist. The data was coming from public emergency calls, but wasn’t verified before being displayed. This meant that app users would be alerted of all possible fires, but also could be caused unnecessary panic. The government takes the view that more alerts are better than slower verified ones, but there is the potential for people to become less likely to trust all alerts on the app.
No one wants to publish inaccurate data, but accuracy takes time and costs money. So we come to a central tension in discussions about open data: is it better to publish more data, with the risk of inaccuracy, or limit publication to datasets which are accurate?
The United Kingdom takes the view that more data is best. I interviewed the UK’s lead official on open data, Paul Maltby, a couple of years ago, and he told me that: “There’s a misnomer here that everything has to be perfect before you can put it out,” adding that “what we’re finding is that, actually, some of the datasets are a bit messy. We try to keep them as high-quality as we can; but other organisations then clean up the data and sell it on”.
Indeed, he noted that some officials use data accuracy as an excuse to not publish information that could hold their departments to account. “There’s sometimes a reluctance to get data out from the civil service; and whilst we see many examples of people understanding the reasons why data has been put to use, I’d say the general default is still not pro-release”.
Other countries take a different view, however. Singapore, for example, publishes much less data than Britain, but has more of a push on making its data accurate to assist startups and app builders….(More)”

Social Sensing and Crowdsourcing: the future of connected sensors


Conference Paper by C. Geijer, M. Larsson, M. Stigelid: “Social sensing is becoming an alternative to static sensors. It is a way to crowdsource data collection where sensors can be placed on frequently used objects, such as mobile phones or cars, to gather important information. The increasing availability of technology, such as cheap sensors being added in cell phones, creates an opportunity to build bigger sensor networks that are capable of collecting larger quantities of more complex data. The purpose of this paper is to highlight problems in the field, as well as their solutions. The focus lies on the use of physical sensors and not on the use of social media to collect data. Research papers were reviewed based on implemented or suggested implementations of social sensing. The discovered problems are contrasted with possible solutions, and used to reflect upon the future of the field. We found issues such as privacy, noise and trustworthiness to be problems when using a distributed network of sensors. Furthermore, we discovered models for determining the accuracy as well as truthfulness of gathered data that can effectively combat these problems. The topic of privacy remains an open-ended problem, since it is based upon ethical considerations that may differ from person to person, but there exist methods for addressing this as well. The reviewed research suggests that social sensing will become more and more useful in the future….(More).”