Researcher uncovers inherent biases of big data collected from social media sites


Phys.org: “With every click, Facebook, Twitter and other social media users leave behind digital traces of themselves, information that can be used by businesses, government agencies and other groups that rely on “big data.”

But while the information derived from social network sites can shed light on social behavioral traits, some analyses based on this type of data collection are prone to bias from the get-go, according to new research by Northwestern University professor Eszter Hargittai, who heads the Web Use Project.

Since people don’t randomly join Facebook, Twitter or LinkedIn but deliberately choose to engage, the data are potentially biased in terms of demographics, socioeconomic background or Internet skills, according to the research. This has implications for businesses, municipalities and other groups who use such data, because it excludes certain segments of the population and could lead to unwarranted or faulty conclusions, Hargittai said.

The study, “Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites,” was published last month in the journal The Annals of the American Academy of Political and Social Science and is part of a larger, ongoing study.

The buzzword “big data” refers to automatically generated information about people’s behavior. It’s called “big” because it can easily include millions of observations if not more. In contrast to surveys, which require explicit responses to questions, big data is created when people do things using a service or system.

“The problem is that the only people whose behaviors and opinions are represented are those who decided to join the site in the first place,” said Hargittai, the April McClain-Delaney and John Delaney Professor in the School of Communication. “If people are analyzing big data to answer certain questions, they may be leaving out entire groups of people and their voices.”

For example, a city could use Twitter to collect local opinion regarding how to make the community more “age-friendly” or whether more bike lanes are needed. In those cases, “it’s really important to know that people aren’t on Twitter randomly, and you would only get a certain type of person’s response to the question,” said Hargittai.

“You could be missing half the population, if not more. The same holds true for companies who only use Twitter and Facebook and are looking for feedback about their products,” she said. “It really has implications for every kind of group.”…

More information: “Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites” The Annals of the American Academy of Political and Social Science May 2015 659: 63-76, DOI: 10.1177/0002716215570866
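
The selection effect Hargittai describes can be illustrated with a small simulation. This is only a sketch under invented assumptions: the two age groups, their support rates and their chances of being on a platform are hypothetical numbers chosen for illustration, not figures from the study.

```python
import random


def simulate(seed=0, n=100_000):
    """Compare a population-wide opinion rate with the rate seen on a platform."""
    rng = random.Random(seed)
    # Hypothetical population: every number below is invented for illustration.
    groups = {
        "under_40": {"share": 0.5, "supports": 0.7, "on_platform": 0.6},
        "40_plus": {"share": 0.5, "supports": 0.3, "on_platform": 0.1},
    }
    population = []  # opinions of everyone
    sample = []      # opinions of those who chose to join the platform
    for _ in range(n):
        name = "under_40" if rng.random() < groups["under_40"]["share"] else "40_plus"
        group = groups[name]
        supports = rng.random() < group["supports"]
        population.append(supports)
        if rng.random() < group["on_platform"]:
            sample.append(supports)
    true_rate = sum(population) / len(population)
    platform_rate = sum(sample) / len(sample)
    return true_rate, platform_rate


true_rate, platform_rate = simulate()
print(f"support in whole population: {true_rate:.2f}")
print(f"support among platform users: {platform_rate:.2f}")
```

Because the group that is more likely to join also holds a different view, the platform estimate drifts away from the population value no matter how many observations are collected.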

Rethinking Smart Cities From The Ground Up


New report by Tom Saunders and Peter Baeck (NESTA): “This report tells the stories of cities around the world – from Beijing to Amsterdam, and from London to Jakarta – that are addressing urban challenges by using digital technologies to engage and enable citizens.

Key findings

  • Many ‘top down’ smart city ideas have failed to deliver on their promise, combining high costs and low returns.
  • ‘Collaborative technologies’ offer cities another way to make smarter use of resources, smarter ways of collecting data and smarter ways to make decisions.
  • Collaborative technologies can also help citizens themselves shape the future of their cities.
  • We have created five recommendations for city governments that want to make their cities smarter.

As cities bring people together to live, work and play, they amplify their ability to create wealth and ideas. But scale and density also bring acute challenges: how to move around people and things; how to provide energy; how to keep people safe.

‘Smart cities’ offer sensors, ‘big data’ and advanced computing as answers to these challenges, but they have often faced criticism for being too concerned with hardware rather than with people.

In this report we argue that successful smart cities of the future will combine the best aspects of technology infrastructure while making the most of the growing potential of ‘collaborative technologies’, technologies that enable greater collaboration between urban communities and between citizens and city governments.

How will this work in practice? Drawing on examples from all around the world we investigate four emerging methods which are helping city governments engage and enable citizens: the collaborative economy, crowdsourcing data, collective intelligence and crowdfunding.

Policy recommendations

  1. Set up a civic innovation lab to drive innovation in collaborative technologies.
  2. Use open data and open platforms to mobilise collective knowledge.
  3. Take human behaviour as seriously as technology.
  4. Invest in smart people, not just smart technology.
  5. Spread the potential of collaborative technologies to all parts of society….(More)”

This App Lets You See The Tough Choices Needed To Balance Your City’s Budget


Jay Cassano at FastCoExist: “Ask the average person on the street how much money their city spends on education or health care or police. Even the most well-informed probably won’t be able to come up with a dollar amount. That’s because even if you are interested, municipal budgets aren’t presented in a way that makes sense to ordinary people.

Balancing Act is a web app that displays a straightforward pie chart of a city’s budget, broken down into categories like pensions, parks & recreations, police, and education. But it doesn’t just display the current budget breakdown. It invites users to tweak it, expressing their own priorities, all while keeping the city in the black. Do you want your libraries to be better funded? Fine—but you’re going to have to raise property taxes to do it.

“Balancing Act provides a way for people to both understand what public entities are doing and then to weigh that against the other possible things that government can do,” says Chris Adams, president of Engaged Public, a Colorado-based consulting firm that develops technology for government and non-profits. “Especially in this era of information, all of us have a responsibility to spend a bit of time understanding how our government is spending money on our behalf.”

Hartford, Connecticut is the first city in the country that is using Balancing Act. The city was facing a $49 million budget deficit this spring, and Mayor Pedro Segarra says he took input from citizens using Balancing Act. Meanwhile, in Engaged Public’s home state, residents can input their income to generate an itemized tax receipt and then tweak the Colorado state budget as they see fit.

Engaged Public hopes that by making budgets more interactive and accessible, more people will take an interest in them.

“Budget information almost universally exists, but it’s not in accessible formats; mostly they’re in PDF files,” says Adams. “So citizens are invited to pore through tens of thousands of pages of PDFs. But that really doesn’t give you a high-level understanding of what’s at stake in a reasonable amount of time.”

If widely used, Balancing Act could be a useful tool for politicians to check the pulse of their constituents. For example, decreasing funding to parks draws a negative public reaction. But if enough people on Balancing Act experimented with the budget, saw the necessity of it, and submitted their recommendations, then an elected official might be willing to make a decision that would otherwise seem politically risky….(More)”
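
The trade-off the article describes (more money for libraries means more revenue or a cut elsewhere) can be sketched in a few lines. This is not how Balancing Act itself is implemented; the function names and all dollar figures below are hypothetical and exist only to show the constraint.

```python
def budget_gap(revenues, spending):
    """Return total revenue minus total spending; a negative value is a deficit."""
    return sum(revenues.values()) - sum(spending.values())


def apply_change(budget, category, delta):
    """Return a copy of a budget dict with one line item adjusted by delta."""
    updated = dict(budget)
    updated[category] = updated[category] + delta
    return updated


# Hypothetical figures in millions of dollars, invented for illustration.
revenues = {"property_tax": 280, "state_aid": 250, "fees": 20}
spending = {"education": 290, "police": 90, "pensions": 80, "parks": 40, "libraries": 50}

print(budget_gap(revenues, spending))             # balanced starting point

more_books = apply_change(spending, "libraries", 5)
print(budget_gap(revenues, more_books))           # now in deficit

higher_tax = apply_change(revenues, "property_tax", 5)
print(budget_gap(higher_tax, more_books))         # balanced again
```

Each tweak returns a new dictionary rather than changing the original, so a user's proposal can always be compared against the current budget.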

Big Data’s Impact on Public Transportation


InnovationEnterprise: “Getting around any big city can be a real pain. Traffic jams seem to be a constant complaint, and simply getting to work can turn into a chore, even on the best of days. With more people than ever before flocking to the world’s major metropolitan areas, the issues of crowding and inefficient transportation only stand to get much worse. Luckily, the traditional methods of managing public transportation could be on the verge of changing thanks to advances in big data. While big data use cases have been a part of the business world for years now, city planners and transportation experts are quickly realizing how valuable it can be when making improvements to city transportation. That hour-long commute may no longer be something travelers will have to worry about in the future.

In much the same way that big data has transformed businesses around the world by offering greater insight into the behavior of their customers, it can also provide a deeper look at travellers. Like retail customers, commuters have certain patterns they like to keep to when on the road or riding the rails. Travellers also have their own motivations and desires, and getting to the heart of their actions is all part of what big data analytics is about. By analyzing these actions and the factors that go into them, transportation experts can gain a better understanding of why people choose certain routes or why they prefer one method of transportation over another. Based on these findings, planners can then figure out where to focus their efforts and respond to the needs of millions of commuters.

Gathering the accurate data needed to make knowledgeable decisions regarding city transportation can be a challenge in itself, especially considering how many people commute to work in a major city. New methods of data collection have made that effort easier and a lot less costly. One way that’s been implemented is through the gathering of call detail records (CDRs). From regular transactions made from mobile devices, information about location, time, and duration of an action (like a phone call) can give data scientists the necessary details on where people are traveling to, how long it takes them to get to their destination, and other useful statistics. The valuable part of this data is the sample size, which provides a much bigger picture of the transportation patterns of travellers.

That’s not the only way cities are using big data to improve public transportation though. Melbourne in Australia has long been considered one of the world’s best cities for public transit, and much of that is thanks to big data. With big data and ad hoc analysis, Melbourne’s acclaimed tram system can automatically reconfigure routes in response to sudden problems or challenges, such as a major city event or natural disaster. Data is also used in this system to fix problems before they turn serious. Sensors located in equipment like tram cars and tracks can detect when maintenance is needed on a specific part. Crews are quickly dispatched to repair what needs fixing, and the tram system continues to run smoothly. This is similar to the idea of the Internet of Things, wherein embedded sensors collect data that is then analyzed to identify problems and improve efficiency.

Sao Paulo, Brazil is another city that sees the value of using big data for its public transportation. The city’s efforts concentrate on improving the management of its bus fleet. With big data collected in real time, the city can get a more accurate picture of just how many people are riding the buses, which routes are on time, how drivers respond to changing conditions, and many other factors. Based off of this information, Sao Paulo can optimize its operations, providing added vehicles where demand is genuine whilst finding which routes are the most efficient. Without big data analytics, this process would have taken a very long time and would likely be hit-or-miss in terms of accuracy, but now, big data provides more certainty in a shorter amount of time….(More)”
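
The call-record approach mentioned above can be sketched as follows. The record layout (user, ISO timestamp, cell tower ID) and the sample values are assumptions made for illustration; real call records and the methods analysts apply to them are more involved. A "trip" here simply means two consecutive events by the same user at different towers.

```python
from collections import defaultdict
from datetime import datetime


def trips_from_records(records):
    """Infer (user, origin, destination, minutes) tuples from call records."""
    by_user = defaultdict(list)
    for user, timestamp, tower in records:
        by_user[user].append((datetime.fromisoformat(timestamp), tower))

    trips = []
    for user, events in by_user.items():
        events.sort()  # order each user's events by time
        for (t0, origin), (t1, destination) in zip(events, events[1:]):
            if origin != destination:
                minutes = (t1 - t0).total_seconds() / 60
                trips.append((user, origin, destination, minutes))
    return trips


# Hypothetical records: (user, timestamp, tower ID)
records = [
    ("u1", "2015-06-01T08:00:00", "A"),
    ("u1", "2015-06-01T08:45:00", "B"),
    ("u1", "2015-06-01T09:00:00", "B"),
    ("u2", "2015-06-01T07:30:00", "C"),
    ("u2", "2015-06-01T08:00:00", "A"),
]

for trip in trips_from_records(records):
    print(trip)
```

Because a record exists only when the phone is used, the elapsed time between two towers is an upper bound on the actual travel time rather than a direct measurement.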

Exploring Open Energy Data in Urban Areas


The World Bank: “…Energy efficiency – using less energy input to deliver the same level of service – has been described by many as the ‘first fuel’ of our societies. However, lack of adequate data to accurately predict and measure energy efficiency savings, particularly at the city level, has limited the realization of its promise over the past two decades.

Why Open Energy Data?

Open Data can be a powerful tool to reduce information asymmetry in markets, increase transparency and help achieve local economic development goals. Several sectors like transport, public sector management and agriculture have started to benefit from Open Data practices. Energy markets are often characterized by less-than-optimal conditions with high system inefficiencies, misaligned incentives and low levels of transparency. As such, the sector has a lot to potentially gain from embracing Open Data principles.

The United States is a leader in this field with its ‘Energy Data’ initiative. This initiative makes data easy to find, understand and apply, helping to fuel a clean energy economy. For example, the Energy Information Administration’s (EIA) open application programming interface (API) has more than 1.2 million time series of data and is frequently visited by users from the private sector, civil society and media. In addition, the Green Button initiative is empowering American citizens to have access to their own energy usage data, and OpenEI.org is an Open Energy Information platform to help people find energy information, share their knowledge and connect to other energy stakeholders.

Introducing the Open Energy Data Assessment

To address this data gap in emerging and developing countries, the World Bank is conducting a series of Open Energy Data Assessments in urban areas. The objective is to identify important energy-related data, raise awareness of the benefits of Open Data principles and improve the flow of data between traditional energy stakeholders and others interested in the sector.

The first cities we assessed were Accra, Ghana and Nairobi, Kenya. Both are among the fastest-growing cities in the world, with dynamic entrepreneurial and technology sectors, and both are capitals of countries with an ongoing National Open Data Initiative. The two cities have also been selected to be part of the Negawatt Challenge, a World Bank international competition supporting technology innovation to solve local energy challenges.

The ecosystem approach

The starting point for the exercise was to consider the urban energy sector as an ecosystem, comprised of data suppliers, data users, key datasets, a legal framework, funding mechanisms, and ICT infrastructure. The methodology that we used adapted the established World Bank Open Data Readiness Assessment (ODRA), which highlights valuable connections between data suppliers and data demand. The assessment showcases how to match pressing urban challenges with the opportunity to release and use data to address them, creating a longer-term commitment to the process. Mobilizing key stakeholders to provide quick, tangible results is also key to this approach….(More) …See also World Bank Open Government Data Toolkit.”

Waze and the Traffic Panopticon


 in the New Yorker: “In April, during his second annual State of the City address, Los Angeles Mayor Eric Garcetti announced a data-sharing agreement with Waze, the Google-owned, Israel-based navigation service. Waze is different from most navigation apps, including Google Maps, in that it relies heavily on real-time, user-generated data. Some of this data is produced actively—a driver or passenger sees a stalled vehicle, then uses a voice command or taps a stalled-vehicle icon on the app to alert others—while other data, such as the user’s location and average speed, is gathered passively, via smartphones. The agreement will see the city provide Waze with some of the active data it collects, alerting drivers to road closures, construction, and parades, among other things. From Waze, the city will get real-time data on traffic and road conditions. Garcetti said that the partnership would mean “less congestion, better routing, and a more livable L.A.” Di-Ann Eisnor, Waze’s head of growth, acknowledged to me that these kinds of deals can cause discomfort to the people working inside city government. “It’s exciting, but people inside are also fearful because it seems like too much work, or it seems so unknown,” she said.

Indeed, the deal promises to help the city improve some of its traffic and infrastructure systems (L.A. still uses paper to manage pothole patching, for example), but it also acknowledges Waze’s role in the complex new reality of urban traffic planning. Traditionally, traffic management has been a largely top-down process. In Los Angeles, it is coördinated in a bunker downtown, several stories below the sidewalk, where engineers stare at blinking lights representing traffic and live camera feeds of street intersections. L.A.’s sensor-and-algorithm-driven Automated Traffic Surveillance and Control System is already one of the world’s most sophisticated traffic-mitigation tools, but it can only do so much to manage the city’s eternally unsophisticated gridlock. Los Angeles appears to see its partnership with Waze as an important step toward improving the bridge between its subterranean panopticon and the rest of the city still further, much like other metropolises that have struck deals with Waze under the company’s Connected Cities program.
Among the early adopters is Rio de Janeiro, whose urban command center tracks everything from accidents to hyperlocal weather conditions, pulling data from thirty departments and private companies, including Waze. “In Rio,” Eisnor said, traffic managers “were able to change the garbage routes, figure out where to install cameras, and deploy traffic personnel” because of the program. She also pointed out that Connected Cities has helped municipal workers in Washington, D.C., patch potholes within forty-eight hours of their being identified on Waze. “We’re helping reframe city planning through not just space but space and time,” she said….(More)”

Citizen-Driven Innovation : A Guidebook for City Mayors and Public Administrators


Guidebook by Eskelinen, Jarmo; Garcia Robles, Ana; Lindy, Ilari; Marsh, Jesse; Muente-Kunigami, Arturo: “… aims to bring citizen-driven innovation to policy makers and change agents around the globe, by spreading good practice on open and participatory approaches as applied to digital service development in different nations, climates, cultures, and urban settings. The report explores the concept of smart cities through a lens that promotes citizens as the driving force of urban innovation. Different models of smart cities are presented, showing how citizen-centric methods have been used to mobilize resources to respond to urban innovation challenges in a variety of situations, objectives, and governance structures. The living lab approach strengthens these processes as one of the leading methods for agile development or the rapid prototyping of ideas, concepts, products, services, and processes in a highly decentralized and user-centric manner. By adopting these approaches and promoting citizen-driven innovation, cities around the world are aiming to alleviate the demand for services, increase the quality of delivery, and promote local entrepreneurship. This guidebook is structured into seven main sections: an introductory section describes the vision of a humanly smart city, in order to give an idea of the kind of result that can be attained from opening up and applying citizen-driven innovation methods. Chapter one, getting started, helps mayors launch co-design initiatives, exploring innovation processes founded on trust and verifying the benefits of opening up. Chapter two, building a strategy, identifies the key steps for building an innovation partnership and together defining a sustainable city vision and scenarios for getting there. Chapter three, co-designing solutions, looks at the process of unpacking concrete problems, working creatively to address them, and following up on implementation. Chapter four, ensuring sustainability, describes key elements for long-term viability: evaluation and impact assessment, appropriate institutional structuring, and funding and policy support. Chapter five, joining forces, suggests ways to identify a unique role for participation in international networks and how to best learn from cooperation. Finally, the report provides a starter pack with some of the more commonly used tools and methods to support the kinds of activities described in this guidebook….(More)”

The Tragedy of the Digital Commons


J. Nathan Matias in the Atlantic: “….Milland and other regular Turkers navigate this precariously free market with Turkopticon, a DIY technology for rating employers created in 2008. To use it, workers install a browser plugin that extends Amazon’s website with special rating features. Before accepting a new task, workers check how others have rated the employer. After finishing, they can also leave their own rating of how well they were treated. Collective rating on Turkopticon is an act of citizenship in the digital world. This digital citizenship acknowledges that online experiences are as much a part of our common life as our schools, sidewalks, and rivers, requiring as much stewardship, vigilance, and improvement as anything else we share.

“How do you fix a broken system that isn’t yours to repair?” That’s the question that motivated the researchers Lilly Irani and Six Silberman to create Turkopticon, and it’s one that comes up frequently in digital environments dominated by large platforms with hands-off policies. (On social networks like Twitter, for example, harassment is a problem for many users.) Irani and Silberman describe Turkopticon as a “mutual aid for accountability” technology, a system that coordinates peer support to hold others accountable when platforms choose not to step in.

Mutual aid accountability is a growing response to the complex social problems people face online. On Twitter, systems like The Block Bot and BlockTogether coordinate collective judgments about alleged online harassers. The systems then collectively block tweets from accounts that a group prefers not to hear from. Last month, the advocacy organization Hollaback raised over $20,000 on Kickstarter to create support networks for people experiencing harassment. In November, I worked with the advocacy organization Women, Action, and the Media, which took a role as “authorized reporter” with Twitter. For three weeks WAM! accepted reports, sorted evidence, and forwarded serious cases to Twitter. In response, the company warned, suspended, and deleted the accounts of many alleged harassers.
These mutual aid technologies operate in the shadow of larger systems with gaps in how people are supported—even when platforms do step in, says Stuart Geiger, a Berkeley Ph.D. student. In other words, sometimes a platform’s system-wide solutions to a problem can create their own problems. For several years, Geiger and his colleague Aaron Halfaker, now a researcher at Wikimedia, were concerned that Wikipedia’s semi-automated anti-vandalism systems might be making the site unfriendly. As a graduate student unable to change Wikipedia’s code, Halfaker created Snuggle, a mutual-aid mentorship technology that tracks the site’s spam responders. When Snuggle users think a newcomer’s edits were mistakenly flagged as spam, the software coordinates Wikipedians to help those users recover from the negative experience of getting revoked.

By organizing peer support at scale, the designers of Turkopticon and its cousins draw attention to common problems, hoping to influence longer-term change on a complex issue. In time, the idea goes, requesters on Mechanical Turk might change their treatment of workers, Amazon might change its policies and software, or regulators might set new rules for digital labor. This is an approach with a long history in an area that might seem unlikely: the conservation movement. (Silberman and Irani cite the movement as inspiration for Turkopticon.)

To better understand how this approach might influence digital citizenship, I followed the history of mutual-aid accountability in a precious common network that the city of Boston enjoys every day: the Charles River. Planned, re-routed, exploited and contested, it has inspired and supported human life since before written history….(More)”

Data Reinvents Libraries for the 21st Century


 in GovTech: “Libraries can evoke tired assumptions. It could be a stack of battered books and yesteryear movies; that odd odor of wilted pages and circa-1970s decor; or it could be a bout of stereotypes, like obsolete encyclopedias and ruler-snapping librarians.

Whatever the case, the truth is that today libraries are proving they’re more than mausoleums of old knowledge. They’re in a state of progressive reform, rethinking services and restructuring with data. It’s a national trend as libraries modernize, strategize and recast themselves as digital platforms. They’ve taken on the role of data curator for information coming in and citizen-generated data going out….

Nate Hill is among this band of progressives. As a data zealot who believes in data’s inclination for innovation, the former deputy director for Tennessee’s Chattanooga Public Library led a charge to transform the library into a data-centric community hub. The library boasts an open data portal that it manages for the city, a civic hacker lab, a makerspace for community projects, and expanded access to in-person and online tutorials for coding and other digital skill sets….

The draw in data sharing and creating, Hill said, comes from the realization that today’s data channels are no longer one-way systems.

“I push people to the idea that now it’s about being a producer rather than just a consumer,” Hill said, “because really that whole idea of a read-write Web comes from the notion that you and I, for example, are just as capable at editing Wikipedia articles on the fly and changing information as anybody else.”

For libraries, Hill sees this as an opportunity and asks what institution can better pioneer the new frontier of information exchange. He posits the idea that, as the original public content curator, adding open data to libraries is only natural. In fact, he says it’s a logical next step when considering that traditional media like books, research journals and other sources infuse data points with rich context — something most city and state open data portals typically don’t do.

“The dream here is to treat the library as a different kind of community infrastructure,” Hill said. “You can conceivably be feeding live data about a city into an open data portal, and at the same time, turning the library into a real live information source — rather than something just static.”

In Chattanooga, an ongoing effort is in the works to do just that. The library seeks to integrate open data into its library catalog searches. Visitors researching Chattanooga’s waterfront could do a quick search and pull up local books, articles and mapping documents, but also a collection of the latest data sets on water pollution and land use, for example.

Eyeing the library data movement at scale, Hill said he could easily envision a network of public libraries that act as local data hubs, retrieving and funneling data into larger state and national data portals….(More).

5 cool ways connected data is being used


 at Wareable: “The real news behind the rise of wearable tech isn’t so much the gadgetry as the gigantic amount of personal data that it harnesses.

Concerns have already been raised over what companies may choose to do with such valuable information, with one US life insurance company already using Fitbits to track customers’ exercise and offer them discounts when they hit their activity goals.

Despite a mildly worrying potential dystopia in which our own data could be used against us, there are plenty of positive ways in which companies are using vast amounts of connected data to make the world a better place…

Parkinson’s disease research

Apple Health ResearchKit was recently unveiled as a platform for collecting collaborative data for medical studies, but Apple isn’t the first company to rely on crowdsourced data for medical research.

The Michael J. Fox Foundation for Parkinson’s Research recently unveiled a partnership with Intel to improve research and treatment for the neurodegenerative brain disease. Wearables are being used to unobtrusively gather real-time data from sufferers, which is then analysed by medical experts….

Saving the rhino

Connected data and wearable tech aren’t just limited to humans. In South Africa, the Madikwe Conservation Project is using wearable-based data to protect endangered rhinos from callous poachers.

A combination of ultra-strong Kevlar ankle collars powered by an Intel Galileo chip, along with an RFID chip implanted in each rhino’s horn, allows the animals to be monitored. Any break in proximity between the anklet and horn results in anti-poaching teams being deployed to catch the bad guys….
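
The "break in proximity" rule described above can be sketched as a simple distance check. How the project actually measures separation is not stated in the article, so the coordinate representation, the threshold value and the function name below are all hypothetical.

```python
import math

# Hypothetical threshold in metres; the real system's tolerance is not given.
MAX_SEPARATION_M = 2.0


def proximity_broken(anklet_pos, horn_pos, max_separation=MAX_SEPARATION_M):
    """Return True when the two tags are farther apart than the threshold.

    Positions are assumed to be (x, y) coordinates in metres in a local frame.
    """
    return math.dist(anklet_pos, horn_pos) > max_separation


print(proximity_broken((0.0, 0.0), (1.0, 0.0)))  # tags close together
print(proximity_broken((0.0, 0.0), (3.0, 4.0)))  # tags 5 m apart
```

A deployed system would presumably also need to tolerate sensor noise and dropped readings before dispatching a team, which this sketch leaves out.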

Making public transport smart

A company called Snips is collecting huge amounts of urban data in order to improve infrastructure. In partnership with French national rail operator SNCF, Snips produced an app called Tranquilien to utilise location data from commuters’ phones and smartwatches to track which parts of the rail network were busy at which times.

Combining big data with crowdsourcing, the information helps passengers to pick a train where they can find a seat during peak times, while the data can also be useful to local businesses when serving the needs of commuters who are passing through.

Improving the sports fan experience

We’ve already written about how wearable tech is changing the NFL, but the collection of personal data is also set to benefit the fans.

Levi’s Stadium – the new home of the San Francisco 49ers – opened in 2014 and is one of the most technically advanced sports venues in the world. As well as a strong Wi-Fi signal throughout the stadium, fans also benefit from a dedicated app. This not only offers instant replays and real-time game information, but it also helps them find a parking space, order food and drinks directly to their seat and even check the lines at the toilets. As fans use the app, all of the data is collated to enhance the fan experience in future….

Creating interactive art

Don’t be put off by the words ‘interactive installation’. On Broadway is a cool work of art that “represents life in the 21st Century city through a compilation of images and data collected along the 13 miles of Broadway that span Manhattan”….(More)”