Paper by Annika Wolff, Gerd Kortuem, and Jose Cavero: “A bottom-up approach to smart cities places citizens in an active role of contributing, analysing and interpreting data in pursuit of tackling local urban challenges and building a more sustainable future city. This vision can only be realised if citizens have sufficient data literacy skills and experience of large, complex, messy, ever expanding data sets. Schools typically focus on teaching data handling skills using small, personally collected data sets obtained through scientific experimentation, leading to a gap between what is being taught and what will be needed as big data and analytics become more prevalent. This paper proposes an approach to teaching data literacy in the context of urban innovation tasks, using an idea of Urban Data Games. These are supported by a set of training data and resources that will be used in school trials for exploring the problems people have when dealing with large data and trialling novel approaches for teaching data literacy….(More)”
A sentiment analysis of U.S. local government tweets: The connection between tone and citizen involvement
Paper by Staci M. Zavattaro, P. Edward French, and Somya D. Mohanty: “As social media tools become more popular at all levels of government, more research is needed to determine how the platforms can be used to create meaningful citizen–government collaboration. Many entities use the tools in one-way, push manners. The aim of this research is to determine if sentiment (tone) can positively influence citizen participation with government via social media. Using a systematic random sample of 125 U.S. cities, we found that positive sentiment is more likely to engender digital participation but this was not a perfect one-to-one relationship. Some cities that had an overall positive sentiment score and displayed a participatory style of social media use did not have positive citizen sentiment scores. We argue that positive tone is only one part of a successful social media interaction plan, and encourage social media managers to actively manage platforms to use activities that spur participation….(More)”
A map for Big Data research in Digital Humanities
Article by Frederic Kaplan in Frontiers: “This article is an attempt to represent Big Data research in Digital Humanities as a structured research field. A division in three concentric areas of study is presented. Challenges in the first circle – focusing on the processing and interpretations of large cultural datasets – can be organized linearly following the data processing pipeline. Challenges in the second circle – concerning digital culture at large – can be structured around the different relations linking massive datasets, large communities, collective discourses, global actors and the software medium. Challenges in the third circle – dealing with the experience of big data – can be described within a continuous space of possible interfaces organized around three poles: immersion, abstraction and language. By identifying research challenges in all these domains, the article illustrates how this initial cartography could be helpful to organize the exploration of the various dimensions of Big Data Digital Humanities research….(More)”
How Data Mining could have prevented Tunisia’s Terror attack in Bardo Museum
Wassim Zoghlami at Medium: “…Data mining is the process of posing queries and extracting useful patterns or trends, often previously unknown, from large amounts of data using various techniques such as those from pattern recognition and machine learning. Lately there has been great interest in leveraging data mining for counter-terrorism applications.
Using the data on more than 50,000 ISIS-connected Twitter accounts, I was able to establish an understanding of some factors that determine how often ISIS attacks occur, what types of terror strikes are used in which geopolitical situations, and many other criteria, through graphs of the frequency of hashtag usage and the frequency of particular groups of words used in the tweets.
A simple data mining project on some of the repetitive hashtags and sequences of words typically used by ISIS militants in their tweets yielded surprising results. The results show a rise of some keywords in the tweets starting from March 15, three days before the Bardo Museum attack.
Some of the frequent keywords and hashtags that showed an unusual peak from March 15, three days before the attack:
#طواغيت تونس : Tyrants of Tunisia = a reference to the military
بشرى تونس : Good news for Tunisia.
قريبا تونس : Soon in Tunisia.
#إفريقية_للإعلام : The head of social media of Afriqiyah
#غزوة_تونس : The foray of Tunis…
Big Data and Data Mining should be used for national security intelligence
Tunisian national security has to leverage big data to predict such attacks and to achieve its objectives as the volume of digital data grows. One challenge facing data mining techniques is that, to carry out effective data mining and extract useful information for counterterrorism and national security, we need to gather all kinds of information about individuals. However, this information could be a threat to the individuals’ privacy and civil liberties…(More)”
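The keyword-frequency analysis described in the excerpt above can be illustrated with a minimal sketch. This is not the author's code: the function names, the seven-day baseline, the three-standard-deviation threshold, and the synthetic tweets are all assumptions made for illustration. It counts tweets per day that contain a keyword and flags days whose count rises well above the baseline.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev


def daily_counts(tweets, keyword):
    """Count, per day, the tweets whose text contains the keyword.

    `tweets` is an iterable of (day, text) pairs (a hypothetical input format).
    """
    counts = Counter()
    for day, text in tweets:
        if keyword in text:
            counts[day] += 1
    return counts


def spike_days(counts, days, baseline_len=7, threshold=3.0):
    """Return the days after the baseline window whose count exceeds
    baseline mean + threshold * baseline standard deviation."""
    series = [counts.get(day, 0) for day in days]
    baseline = series[:baseline_len]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1.0  # fall back to 1.0 when the baseline is flat
    cutoff = mu + threshold * sigma
    return [
        day
        for day, n in zip(days[baseline_len:], series[baseline_len:])
        if n > cutoff
    ]


# Synthetic example (made-up data): one matching tweet per day, then ten per day.
days = [date(2015, 3, 5) + timedelta(days=i) for i in range(13)]
tweets = []
for day in days:
    volume = 10 if day >= date(2015, 3, 15) else 1
    tweets += [(day, "example text #sample_tag")] * volume

spikes = spike_days(daily_counts(tweets, "#sample_tag"), days)
print(spikes)
```

A fixed baseline window keeps the sketch short; a rolling window would probably suit a live stream better, since the baseline would then track gradual changes in volume.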
Nowcasting Disaster Damage
Paper by Yury Kryvasheyeu et al: “Could social media data aid in disaster response and damage assessment? Countries face both an increasing frequency and intensity of natural disasters due to climate change. And during such events, citizens are turning to social media platforms for disaster-related communication and information. Social media improves situational awareness, facilitates dissemination of emergency information, enables early warning systems, and helps coordinate relief efforts. Additionally, spatiotemporal distribution of disaster-related messages helps with real-time monitoring and assessment of the disaster itself. Here we present a multiscale analysis of Twitter activity before, during, and after Hurricane Sandy. We examine the online response of 50 metropolitan areas of the United States and find a strong relationship between proximity to Sandy’s path and hurricane-related social media activity. We show that real and perceived threats — together with the physical disaster effects — are directly observable through the intensity and composition of Twitter’s message stream. We demonstrate that per-capita Twitter activity strongly correlates with the per-capita economic damage inflicted by the hurricane. Our findings suggest that massive online social networks can be used for rapid assessment (“nowcasting”) of damage caused by a large-scale disaster….(More)”
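The relationship the abstract reports, between per-capita Twitter activity and per-capita economic damage, can be sketched as a simple correlation. The snippet below is only an illustration under stated assumptions: it uses a plain Pearson coefficient, which may not be the statistic the authors computed, and the helper names and input layout are hypothetical.

```python
from math import sqrt


def per_capita(values, populations):
    """Divide each area's total by its population."""
    return [value / population for value, population in zip(values, populations)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def activity_damage_correlation(tweet_counts, damages, populations):
    """Correlate per-capita tweet counts with per-capita damage across areas."""
    return pearson(
        per_capita(tweet_counts, populations),
        per_capita(damages, populations),
    )
```

Normalizing both quantities by population matters here: raw counts would mostly reflect how large each area is, not how hard it was hit.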
How Google and Facebook are finding victims of the Nepal earthquake
Caitlin Dewey in the Washington Post: “As the death toll from Saturday’s 7.8-magnitude Nepalese earthquake inches higher, help in finding and identifying missing persons has come from an unusual source: Silicon Valley tech giants.
Both Google and Facebook deployed collaborative, cellphone-based tools over the weekend to help track victims of the earthquake. In the midst of both companies’ big push to bring Internet to the developing world, it’s an important illustration of exactly how powerful that connectivity could be. And yet, in a country like Nepal — where there are only 77 cellphone subscriptions per 100 people versus 96 in the U.S. and 125 in the U.K. — it’s also a reminder of how very far that effort still has to go.
Facebook Safety Check
Facebook’s Safety Check essentially lets users do two things, depending on where they are. Users in an area impacted by a natural disaster can log onto the site and mark themselves as “safe.” Meanwhile, users around the world can log into the site and check if any of their friends are in the impacted area. The tool was built by Japanese engineers in response to the 2011 earthquake and tsunami that devastated coastal Japan.
…
Facebook hasn’t publicized how many people have used the tool, though the network only has 4.4 million users in the country based on estimates by its ad platform. Notably, you must also have a smartphone running the Facebook app to use this feature — and smartphone penetration in Nepal is quite low.
Google Person Finder
Like Safety Check, Google Person Finder is intended to connect people in a disaster area with friends and family around the world. Google’s five-year-old project also operates on a larger scale, however: It basically provides a massive, open platform to collaboratively track missing persons’ reports. Previously, Google’s deployed the tool to help victims in the wake of Typhoon Haiyan and the Boston bombing.
Domestic Drones and Privacy: A Primer
Richard M. Thompson for the Congressional Research Service: “There are two overarching privacy issues implicated by domestic drone use. The first is defining what “privacy” means in the context of aerial surveillance. Privacy is an ambiguous term that can mean different things in different contexts. This becomes readily apparent when attempting to apply traditional privacy concepts such as personal control and secrecy to drone surveillance. Other, more nuanced privacy theories such as personal autonomy and anonymity must be explored to get a fuller understanding of the privacy risks posed by drone surveillance. Moreover, with ever-increasing advances in data storage and manipulation, the subsequent aggregation, use, and retention of drone-obtained data may warrant an additional privacy impact analysis.
The second predominant issue is which entity should be responsible for regulating drones and privacy. As the final arbiter of the Constitution, the courts are naturally looked upon to provide at least the floor of privacy protection from UAS surveillance, but as will be discussed in this report, under current law, this protection may be minimal….(More)”
Health Big Data in the Commercial Context
CDT Press Release: “This paper is the third in a series of three, each of which explores health big data in a different context. The first — on health big data in the government context — is available here, and the second — on health big data in the clinical context — is available here.
Consumers are increasingly using mobile phone apps and wearable devices to generate and share data on health and wellness. They are using personal health record tools to access and copy health records and move them to third party platforms. They are sharing health information on social networking sites. They leave digital health footprints when they conduct online searches for health information. The health data created, accessed, and shared by consumers using these and many other tools can range from detailed clinical information, such as downloads from an implantable device and details about medication regimens, to data about weight, caloric intake, and exercise logged with a smart phone app.
These developments offer a wealth of opportunities for health care and personal wellness. However, privacy questions arise due to the volume and sensitivity of health data generated by consumer-focused apps, devices, and platforms, including the potential analytics uses that can be made of such data.
Many of the privacy issues that face traditional health care entities in the big data era also apply to app developers, wearable device manufacturers, and other entities not part of the traditional health care ecosystem. These include questions of data minimization, retention, and secondary use. Notice and consent pose challenges, especially given the limits of presenting notices on mobile device screens, and the fact that consumer devices may be bought and used without consultation with a health care professional. Security is a critical issue as well.
However, the privacy and security provisions of the Health Insurance Portability and Accountability Act (HIPAA) do not apply to most app developers, device manufacturers or others in the consumer health space. This has benefits to innovation, as innovators would otherwise have to struggle with the complicated HIPAA rules. However, the current vacuum also leaves innovators without clear guidance on how to appropriately and effectively protect consumers’ health data. Given the promise of health apps, consumer devices, and consumer-facing services, and given the sensitivity of the data that they collect and share, it is important to provide such guidance….
As the source of privacy guidelines, we look to the framework provided by the Fair Information Practice Principles (FIPPs) and explore how it could be applied in an age of big data to patient-generated data. The FIPPs have influenced to varying degrees most modern data privacy regimes. While some have questioned the continued validity of the FIPPs in the current era of mass data collection and analysis, we consider here how the flexibility and rigor of the FIPPs provide an organizing framework for responsible data governance, promoting innovation, efficiency, and knowledge production while also protecting privacy. Rather than proposing an entirely new framework for big data, which could be years in the making at best, using the FIPPs would seem the best approach in promoting responsible big data practices. Applying the FIPPs could also help synchronize practices between the traditional health sector and emerging consumer products….(More)”
Does Twitter Increase Perceived Police Legitimacy?
Paper by Stephan G. Grimmelikhuijsen and Albert J. Meijer in Public Administration Review: “Social media use has become increasingly popular among police forces. The literature suggests that social media use can increase perceived police legitimacy by enabling transparency and participation. Employing data from a large and representative survey of Dutch citizens (N = 4,492), this article tests whether and how social media use affects perceived legitimacy for a major social media platform, Twitter. A negligible number of citizens engage online with the police, and thus the findings reveal no positive relationship between participation and perceived legitimacy. The article shows that by enhancing transparency, Twitter does increase perceived police legitimacy, albeit to a limited extent. Subsequent analysis of the mechanism shows both an affective and a cognitive path from social media use to legitimacy. Overall, the findings suggest that establishing a direct channel with citizens and using it to communicate successes does help the police strengthen their legitimacy, but only slightly and for a small group of interested citizens….(More)”
How Crowdsourcing And Machine Learning Will Change The Way We Design Cities
Now, that data is being used to predict what parts of cities feel the safest. StreetScore, a collaboration between the MIT Media Lab’s Macro Connections and Camera Culture groups, uses an algorithm to create a super high-resolution map of urban perceptions. The algorithmically generated data could one day be used to research the connection between urban perception and crime, as well as informing urban design decisions.

The algorithm, created by Nikhil Naik, a Ph.D. student in the Camera Culture lab, breaks an image down into its composite features—such as building texture, colors, and shapes. Based on how Place Pulse volunteers rated similar features, the algorithm assigns the streetscape a perceived safety score between 1 and 10. These scores are visualized as geographic points on a map, designed by MIT rising sophomore Jade Philipoom. Each image available from Google Maps in the two cities is represented by a colored dot: red for the locations that the algorithm tags as unsafe, and dark green for those that appear safest. The site, now limited to New York and Boston, will be expanded to feature Chicago and Detroit later this month, and eventually, with data collected from a new version of Place Pulse, will feature dozens of cities around the world….(More)”
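The map coloring described above, with red for the lowest perceived safety scores and dark green for the highest, could be sketched roughly as follows. This is an assumption-laden illustration, not StreetScore's implementation: the linear interpolation, the exact RGB endpoints, and the function names are hypothetical.

```python
RED = (255, 0, 0)         # assumed color for the lowest score
DARK_GREEN = (0, 100, 0)  # assumed color for the highest score


def score_to_rgb(score, low=1.0, high=10.0):
    """Linearly interpolate from red (low score) to dark green (high score)."""
    clamped = max(low, min(high, score))
    t = (clamped - low) / (high - low)
    return tuple(round(a + t * (b - a)) for a, b in zip(RED, DARK_GREEN))


def score_to_hex(score):
    """Format the interpolated color as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*score_to_rgb(score))
```

Scores outside the 1 to 10 range are clamped, so a stray value still maps to one of the two endpoint colors instead of producing an invalid channel value.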