Extracting crowd intelligence from pervasive and social big data


Introduction by Leye Wang, Vincent Gauthier, Guanling Chen and Luis Moreira-Matias to the Special Issue of the Journal of Ambient Intelligence and Humanized Computing: “With the prevalence of ubiquitous computing devices (smartphones, wearable devices, etc.) and social network services (Facebook, Twitter, etc.), humans are generating massive digital traces continuously in their daily life. Considering the invaluable crowd intelligence residing in these pervasive and social big data, a spectrum of opportunities is emerging to enable promising smart applications for easing individual life, increasing company profit, as well as facilitating city development. However, the nature of big data also poses fundamental challenges for the techniques and applications relying on pervasive and social big data, from multiple perspectives such as algorithm effectiveness, computation speed, energy efficiency, user privacy, server security, data heterogeneity and system scalability. This special issue presents state-of-the-art research achievements in addressing these challenges. After a rigorous review process by the reviewers and guest editors, the following eight papers were accepted.

The first paper “Automated recognition of hypertension through overnight continuous HRV monitoring” by Ni et al. proposes a non-invasive way to differentiate hypertension patients from healthy people using pervasive sensors such as a waist belt. To this end, the authors train a machine learning model on heart-rate data sensed from waist belts worn by a crowd of people, and the experiments show that the detection accuracy is around 93%.
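
To give a concrete flavour of such a pipeline, here is a minimal sketch in Python using synthetic data; the feature names are standard HRV measures chosen for illustration, and the model choice is an assumption, not the paper's actual feature set or method.

```python
# Hedged sketch: train and cross-validate a classifier on synthetic
# HRV-style features. Feature names and the model choice are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# One row per overnight recording; columns stand in for HRV features
# such as mean RR interval, SDNN, RMSSD and the LF/HF ratio.
X = rng.normal(size=(200, 4))
y = rng.integers(0, 2, size=200)           # 1 = hypertensive, 0 = healthy

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # accuracy per fold
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```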

The second paper “The workforce analyzer: group discovery among LinkedIn public profiles” by Dai et al. describes two methods for discovering user groups among LinkedIn public profiles: one based on K-means clustering and the other on SVM. The authors contrast the results of both methods and provide insights about the trending professional orientations of the workforce from an online perspective.
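
The contrast between the two routes can be sketched in a few lines of Python; the profile features, group count and labels below are synthetic stand-ins, not the paper's data.

```python
# Hedged sketch: unsupervised (K-means) versus supervised (SVM) group
# discovery on synthetic profile vectors (e.g. vectorized skills).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(1)
profiles = rng.normal(size=(500, 20))      # stand-in for vectorized profiles

# Route 1: discover k groups directly, no labels needed.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(profiles)
print("cluster assignments:", kmeans.labels_[:10])

# Route 2: learn group boundaries from (hypothetical) labeled examples.
labels = rng.integers(0, 5, size=500)
svm = SVC(kernel="rbf").fit(profiles, labels)
print("SVM predictions:   ", svm.predict(profiles[:10]))
```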

The third paper “Tweet and followee personalized recommendations based on knowledge graphs” by Pla Karidi et al. presents an efficient semantic recommendation method that helps users filter the Twitter stream for interesting content. The foundation of this method is a knowledge graph that can represent all user topics of interest as a variety of concepts, objects, events, persons, entities, locations and the relations between them. An important advantage of the authors’ method is that it reduces the effects of problems such as over-recommendation and over-specialization.

The fourth paper “CrowdTravel: scenic spot profiling by using heterogeneous crowdsourced data” by Guo et al. proposes CrowdTravel, a multi-source social media data fusion approach for multi-aspect tourism information perception, which can provide travelling assistance for tourists through crowd-intelligence mining. Experiments over a dataset covering several popular scenic spots in Beijing and Xi’an, China, indicate that the authors’ approach attains a fine-grained characterization of the scenic spots and delivers excellent performance.

The fifth paper “Internet of Things based activity surveillance of defence personnel” by Bhatia et al. presents a comprehensive IoT-based framework for analyzing the integrity of defence personnel with respect to national security, based on their daily activities. Specifically, an Integrity Index Value is defined for each member of the defence personnel, computed from their various social engagements and activities, to detect vulnerabilities to national security. In addition, a probabilistic decision-tree-based automated decision-making scheme is presented to aid defence officials in analyzing the activities of defence personnel for integrity assessment.

The sixth paper “Recommending property with short days-on-market for estate agency” by Mou et al. proposes an appraisal framework for identifying estates with short days-on-market, automatically recommending such properties using transaction data and profile information crawled from websites. Both the spatial and temporal characteristics of an estate are integrated into the framework. The results show that the proposed framework accurately identifies about 78% of such estates.

The seventh paper “An anonymous data reporting strategy with ensuring incentives for mobile crowd-sensing” by Li et al. proposes a system and strategy that ensure anonymous data reporting while preserving incentives. The proposed protocol is arranged in five stages that mainly leverage three concepts: (1) slot reservation based on shuffling, (2) data submission based on bulk transfer and multi-player DC-nets, and (3) an incentive mechanism based on blind signatures.
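
For readers unfamiliar with DC-nets, the toy Python round below illustrates the underlying XOR-based anonymity idea only; it is not the paper's five-stage protocol, and all names and values are illustrative.

```python
# Toy dining-cryptographers (DC-net) round: the XOR of all announcements
# recovers the message, but no single announcement reveals who sent it.
import secrets

n = 3
msg = 1                                     # participant 0 broadcasts one bit

# Each adjacent pair on a ring shares one random secret bit;
# participant i holds shared[i] and shared[(i - 1) % n].
shared = [secrets.randbelow(2) for _ in range(n)]

announcements = []
for i in range(n):
    bit = shared[i] ^ shared[(i - 1) % n]   # XOR of the two keys i holds
    if i == 0:
        bit ^= msg                          # the sender folds the message in
    announcements.append(bit)

recovered = 0
for b in announcements:
    recovered ^= b                          # shared keys cancel pairwise
print(recovered == msg)                     # True: anonymous broadcast worked
```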

The last paper “Semantic place prediction from crowd-sensed mobile phone data” by Celik et al. uses machine learning algorithms to semantically classify the places visited by smartphone users, drawing on data collected from the sensors and wireless interfaces available on the phones as well as phone-usage patterns such as battery level and time-related information. For this study, the authors collected data from 15 participants at Galatasaray University for one month, and tried different classification algorithms such as decision tree, random forest, k-nearest neighbour, naive Bayes, and multi-layer perceptron….(More)”.
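
A sketch of that comparison using scikit-learn, with synthetic features standing in for the study's sensor and usage data (the feature meanings and class labels are illustrative assumptions):

```python
# Hedged sketch: compare the classifier families named above on
# synthetic data standing in for phone sensor/usage features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 8))      # e.g. battery level, hour, Wi-Fi count
y = rng.integers(0, 3, size=300)   # e.g. home / work / other

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "k-nearest neighbour": KNeighborsClassifier(),
    "naive Bayes": GaussianNB(),
    "multi-layer perceptron": MLPClassifier(max_iter=1000, random_state=0),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy {acc:.2f}")
```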

Artificial intelligence and smart cities


Essay by Michael Batty at Urban Analytics and City Sciences: “…The notion of the smart city of course conjures up images of such an automated future. Much of our thinking about this future, certainly in the more popular press, is about everything from the latest app on our smartphones to driverless cars, while somewhat deeper concerns are about efficiency gains due to the automation of services ranging from transit to the delivery of energy. There is no doubt that routine and repetitive processes – algorithms if you like – are improving at an exponential rate in terms of the data they can process and the speed of execution, faithfully following Moore’s Law.

Pattern recognition techniques that lie at the basis of machine learning are highly routinized iterative schemes, in which the pattern in question – be it a signature, a face, the environment around a driverless car and so on – is computed through an elaborate averaging procedure that takes a series of elements of the pattern and weights them in such a way that the pattern can be reproduced by combining the elements of the original pattern with the weights. This is in essence the way neural networks work. When one says that they ‘learn’, and that the current focus is on ‘deep learning’, all that is meant is that with complex patterns and environments, many layers of neurons (elements of the pattern) are defined and the iterative procedures are run until they converge on the pattern to be explained. Such processes are iterative and additive, not much more than sophisticated averaging, but they run on machines that operate virtually at the speed of light and can thus process vast volumes of big data. When these kinds of algorithm can be run in real time, as many already can be, there is the prospect of many kinds of routine behaviour being displaced. It is in this sense that AI might usher in an era of truly disruptive processes. This, according to Brynjolfsson and McAfee, is beginning to happen as we reach the second half of the chess board.
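
Batty's description of learning as iterative re-weighting can be made concrete with a toy example: a single linear neuron fitted by gradient descent. This is a pedagogical sketch, not his formulation or a full deep network.

```python
# Toy illustration: iteratively adjust weights until the weighted
# combination of pattern elements reproduces the target pattern.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))            # elements of the pattern
true_w = np.array([0.5, -1.0, 2.0, 0.0, 1.5])
y = X @ true_w                           # the pattern to be reproduced

w = np.zeros(5)
for _ in range(1000):                    # iterate until convergence
    grad = X.T @ (X @ w - y) / len(y)    # direction that reduces the error
    w -= 0.1 * grad                      # re-weight the elements
print(np.round(w, 2))                    # recovers true_w almost exactly
```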

The real issue in terms of AI involves problems that are peculiarly human. Much of our work is highly routinized and many of our daily actions and decisions are based on relatively straightforward patterns of stimulus and response. The big questions involve the extent to which those of our behaviours which are not straightforward can be automated. In fact, although machines are able to beat human players in many board games and there is now the prospect of machines beating the very machines that were originally designed to play against humans, the real power of AI may well come from collaboratives of man and machine, working together, rather than ever more powerful machines working by themselves. In the last 10 years, some of my editorials have tracked what is happening in the real-time city – the smart city as it is popularly called – which has become key to many new initiatives in cities. In fact, cities – particularly big cities, world cities – have become the flavour of the month but the focus has not been on their long-term evolution but on how we use them on a minute by minute to week by week basis.

Many of the patterns that define the smart city on these short-term cycles can be predicted using AI largely because they are highly routinized but even for highly routine patterns, there are limits on the extent to which we can explain them and reproduce them. Much advancement in AI within the smart city will come from automation of the routine, such as the use of energy, the delivery of location-based services, transit using information being fed to operators and travellers in real time and so on. I think we will see some quite impressive advances in these areas in the next decade and beyond. But the key issue in urban planning is not just this short term but the long term and it is here that the prospects for AI are more problematic….(More)”.

Data-Intensive Approaches To Creating Innovation For Sustainable Smart Cities


Science Trends: “Located at the complex intersection of economic development and environmental change, cities play a central role in our efforts to move towards sustainability. Reducing air and water pollution, improving energy efficiency while securing energy supply, and minimizing vulnerabilities to disruptions and disturbances are interconnected and pose a formidable challenge, with their dynamic interactions changing in highly complex and unpredictable ways….

The Beijing City Lab demonstrates the usefulness of open urban data in mapping urbanization with a fine spatiotemporal scale and reflecting social and environmental dimensions of urbanization through visualization at multiple scales.

The basic principle of open data will generate significant opportunities for promoting inter-disciplinary and inter-organizational research, producing new data sets through the integration of different sources, avoiding duplication of research, facilitating the verification of previous results, and encouraging citizen scientists and crowdsourcing approaches. Open data is also expected to help governments promote transparency, citizen participation, and access to information in policy-making processes.

Despite this significant potential, however, numerous challenges remain in facilitating innovation for urban sustainability through open data. The scope and amount of data collected and shared are still limited, and the quality control, error monitoring, and cleaning of open data are indispensable for securing the reliability of analyses. Also, the organizational and legal frameworks of data-sharing platforms are often not well defined or established, and it is critical to address interoperability between various data standards, the balance between open and proprietary data, and normative and legal issues such as data ownership, personal privacy, confidentiality, law enforcement, and the maintenance of public safety and national security….

These findings are described in the article entitled Facilitating data-intensive approaches to innovation for sustainability: opportunities and challenges in building smart cities, published in the journal Sustainability Science. This work was led by Masaru Yarime from the City University of Hong Kong….(More)”.

Democratising the future: How do we build inclusive visions of the future?


Chun-Yin San at Nesta: “In 2011, Lord Martin Rees, the British Astronomer Royal, launched a scathing critique of the UK Government’s long-term thinking capabilities. “It is depressing,” he argued, “that long-term global issues of energy, food, health and climate get trumped on the political agenda by the short term”. We are facing more and more complex, intergenerational issues like climate change, or the impact of AI, which require long-term, joined-up thinking to solve.

But even when governments do invest in foresight and strategic planning, there is a bigger question around whose vision of the future it is. These strategic plans tend to be written in opaque and complex ways by ‘experts’, with little room for scrutiny, let alone input, by members of the public….

There have been some great examples of more democratic futures exercises in the past. Key amongst them was the Hawai’i 2000 project in the 1970s, which brought together Hawaiians from different walks of life to debate the sort of place that Hawai’i should become over the next 30 years. It generated some incredibly inspiring and creative collective visions of the future of the tropical American state, and also helped embed long-term strategic thinking into policy-making instruments – at least for a time.

A more recent example took place in 2008 in the Dutch Caribbean nation of Aruba, engaging some 50,000 people from all parts of Aruban society. The Nos Aruba 2025 project allowed the island nation to develop a more sustainable national strategic plan than ever before – one based on what Aruba and its people had to offer, responding to the potential and needs of a diverse community. Like Hawai’i 2000, what followed Nos Aruba 2025 was a fundamental change in the nature of participation in the country’s governance, with community engagement becoming a regular feature in the Aruban government’s work….

These examples demonstrate how futures work is at its best when it is participatory. …However, aside from some of the projects above, examples of genuine engagement in futures remain few and far between. Even when activities examining a community’s future take place in the public domain – such as the Museum of London’s ongoing City Now City Future series – the conversation can often seem one-sided. Expert-generated futures are presented to people with little room for them to challenge these ideas or contribute their own visions in a meaningful way. This has led some, like academics Denis Loveridge and Ozcan Saritas, to remark that futures and foresight can suffer from a serious case of ‘democratic deficit’.

There are three main reasons for this:

  1. Meaningful participation can be difficult to do, as it is expensive and time-consuming, especially when it comes to large-scale exercises meant to facilitate deep and meaningful dialogue about a community’s future.

  2. Participation is not always valued in the way it should be, and can be met with false sincerity from government sponsors. This is despite the wide-reaching social and economic benefits to building collective future visions, which we are currently exploring further in our work.

  3. Practitioners may not necessarily have the know-how or tools to do citizen engagement effectively. While there are plenty of guides to public engagement and a number of different futures toolkits, there are few openly available resources for participatory futures activities….(More)”

How Software is Eating the World and Reprogramming Democracy


Jaime Gómez Ramírez at Open Mind: “Democracy, the government of the majority typically through elected representatives, is undergoing a major crisis. Human societies have experimented with democracy since at least the fifth century BC in the polis of Athens. Whether democracy is scalable is an open question, and one that could help us understand the current mistrust in democratic institutions and the rise of populism. Majority rule is a powerful narrative that is fed every few years with elections. In Against Elections, the cultural historian Van Reybrouck claims that elections were never meant to make democracy possible; rather the opposite, they were a tool designed for those in power to prevent “the rule of the mob”. Elections created a new elite and power remained in the hands of a minority, but this time endowed with democratic legitimacy….

The 2008 financial crisis has changed the perception of the once taken-for-granted complementary nature of democracy and capitalism. The belief that capitalism and democracy go hand in hand is no longer credible. The concept of the nation is a fiction in need of a continuous stock of intergenerational believers. The nation state successfully assimilated heterogeneous groups of people under a common language and shared cultural values. But this seems today a rather fragile foundation to resist the centrifugal forces that financial capitalism impinges upon the social fabric.

Nation states will not collapse overnight, but they are an industrial-era device in a digital world. To avoid falling into obsolescence they will need to change their operating system. Since the venture capitalist Marc Andreessen coined the phrase “software is eating the world”, the logic of financial capitalism has accelerated this trend. Five software companies – Facebook, Apple, Amazon, Netflix and Google parent Alphabet (FANG) – account for more than 10 percent of the S&P 500’s market capitalization. Today’s dominant companies in entertainment, retail, telecom, marketing and other industries are software companies. Software is also taking a bigger share in industries that traditionally exist in the physical space, like automakers and energy. Education and health care have shown more resistance to software-based entrepreneurial change, but a very profound transformation is underway. This is already visible in the growing popularity of MOOCs and personalized health-monitoring systems.

Software-based businesses not only have a growing market share but, more importantly, software can reprogram the world. The internet of things will allow full connectivity of smart devices in an economy with massive deflationary costs in computing. Computing might even become free. This has profound consequences for business, industry and, most importantly, for how citizens want to organize society and governance.

The most promising technological innovation in years is blockchain technology, an encrypted and distributed ledger system. Blockchain is a universal and freely accessible repository of documents, including property and insurance contracts, that is publicly auditable and resistant to manipulation and corruption by special-interest groups. New kinds of governance models and services could be tested and implemented using the blockchain. The time is ripe for fundamental software-based transformation in governance. Democracy and free society will ignore this at their own peril…(More)”.

“Nudge units” – where they came from and what they can do


Zeina Afif at the World Bank: “You could say that the first one began in 2009, when the US government recruited Cass Sunstein to head the Office of Information and Regulatory Affairs (OIRA) to streamline regulations. In 2010, the UK established the first Behavioural Insights Team (BIT) on a trial basis, under the Cabinet Office. Other countries followed suit, including the US, Australia, Canada, Netherlands, and Germany. Shortly after, countries such as India, Indonesia, Peru, Singapore, and many others started exploring the application of behavioral insights to their policies and programs. International institutions such as the World Bank, UN agencies, the OECD, and the EU have also established behavioral insights units to support their programs. And just this month, the Sustainable Energy Authority of Ireland launched its own Behavioural Economics Unit.

The Future
As eMBeD, the behavioral science unit at the World Bank, continues to support governments across the globe in the implementation of their units, here are some common questions we often get asked.

What are the models for a Behavioral Insights Unit in Government?
As of today, over a dozen countries have integrated behavioral insights into their operations. While there is no single model to prescribe, the setup varies from centralized or decentralized to networked….

In some countries, the units were first established at the ministerial level. One example is MineduLab in Peru, which was set up with eMBeD’s help. The unit works as an innovation lab, testing rigorous and leading research in education and behavioral science to address issues such as teacher absenteeism and motivation, parents’ engagement, and student performance….

What should be the structure of the team?
Most units start with two to four full-time staff. Profiles include policy advisors, social psychologists, experimental economists, and behavioral scientists. Experience in the public sector is essential to navigate the government and build support. It is also important to have staff familiar with designing and running experiments. Other important skills include psychology, social psychology, anthropology, design thinking, and marketing. While these skills are not always readily available in the public sector, it is important to note that all behavioral insights units partnered with academics and experts in the field.

The U.S. team, originally called the Social and Behavioral Sciences Team, is staffed mostly by seconded academic faculty, researchers, and other departmental staff. MineduLab in Peru partnered with leading experts, including the Abdul Latif Jameel Poverty Action Lab (J-PAL), Fortalecimiento de la Gestión de la Educación (FORGE), Innovations for Poverty Action (IPA), and the World Bank….(More)”

Linux Foundation Debuts Community Data License Agreement


Press Release: “The Linux Foundation, the nonprofit advancing professional open source management for mass collaboration, today announced the Community Data License Agreement (CDLA) family of open data agreements. In an era of expansive and often underused data, the CDLA licenses are an effort to define a licensing framework to support collaborative communities built around curating and sharing “open” data.

Inspired by the collaborative software development models of open source software, the CDLA licenses are designed to enable individuals and organizations of all types to share data as easily as they currently share open source software code. Soundly drafted licensing models can help people form communities to assemble, curate and maintain vast amounts of data, measured in petabytes and exabytes, to bring new value to communities of all types, to build new business opportunities and to power new applications that promise to enhance safety and services.

The growth of big data analytics, machine learning and artificial intelligence (AI) technologies has allowed people to extract unprecedented levels of insight from data. Now the challenge is to assemble the critical mass of data for those tools to analyze. The CDLA licenses are designed to help governments, academic institutions, businesses and other organizations open up and share data, with the goal of creating communities that curate and share data openly.

For instance, if automakers, suppliers and civil infrastructure services can share data, they may be able to improve safety, decrease energy consumption and improve predictive maintenance. Self-driving cars are heavily dependent on AI systems for navigation, and need massive volumes of data to function properly. Once on the road, they can generate nearly a gigabyte of data every second. For the average car, that means two petabytes of sensor, audio, video and other data each year.
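
The arithmetic behind those figures checks out under a plausible assumption about daily driving time (the 1.5-hour figure below is our assumption, not the release's):

```python
# Back-of-the-envelope check, using decimal units (1 PB = 1e6 GB).
gb_per_second = 1                             # "nearly a gigabyte every second"
driving_seconds_per_year = 1.5 * 3600 * 365   # assumed ~1.5 h/day of driving
total_pb = gb_per_second * driving_seconds_per_year / 1e6
print(f"{total_pb:.1f} PB per year")          # ~2.0 PB, matching the release
```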

Similarly, climate modeling can integrate measurements captured by government agencies with simulation data from other organizations and then use machine learning systems to look for patterns in the information. It’s estimated that a single model can yield a petabyte of data, a volume that challenges standard computer algorithms, but is useful for machine learning systems. This knowledge may help improve agriculture or aid in studying extreme weather patterns.

And if government agencies share aggregated data on building permits, school enrollment figures, and sewer and water usage, their citizens benefit from the ability of commercial entities to anticipate future needs and respond with infrastructure and facilities ahead of demand.

“An open data license is essential for the frictionless sharing of the data that powers both critical technologies and societal benefits,” said Jim Zemlin, Executive Director of The Linux Foundation. “The success of open source software provides a powerful example of what can be accomplished when people come together around a resource and advance it for the common good. The CDLA licenses are a key step in that direction and will encourage the continued growth of applications and infrastructure.”…(More)”.

Humanizing technology


Kaliya Young at Open Democracy: “Can we use the internet to enhance deep human connection and support the emergence of thriving communities in which everyone’s needs are met and people’s lives are filled with joy and meaning?….

Our work on ‘technical’ technologies won’t generate broad human gains unless we invest an equal amount of time, energy and resources in the development of social and emotional technologies that drive how our whole society is organized and how we work together. I think we are actually on the cusp of having the tools, understanding and infrastructure to make that happen, without all our ideas and organizing being intermediated by giant corporations. But what does that mean in practice?

I think two things are absolutely vital.

First of all, how do we connect all the people and all the groups that want to align their goals in pursuit of social justice, deep democracy, and the development of new economies that share wealth and protect the environment? How are people supported to protect their own autonomy while also working with multiple other groups in processes of joint work and collective action?

One key element of the answer to that question is to generate a digital identity that is not under the control of a corporation, an organization or a government.

I have been co-leading the community surrounding the Internet Identity Workshop for the last 12 years. After many explorations of the techno-possibility landscape we have finally made some breakthroughs that will lay the foundations of a real internet-scale infrastructure to support what are called ‘user-centric’ or ‘self-sovereign’ identities.

This infrastructure consists of a network with two different types of nodes—people and organizations—with each individual being able to join lots of different groups. But regardless of how many groups they join, people will need a digital identity that is not owned by Twitter, Amazon, Apple, Google or Facebook. That’s the only way they will be able to control their own autonomous interactions on the internet. If open standards are not created for this critical piece of infrastructure then we will end up in a future where giant corporations control all of our identities. In many ways we are in this future now.

This is where something called ‘Shared Ledger Technology’ or SLT comes in—more commonly known as ‘blockchain’ or ‘distributed ledger technology.’ SLT represents a huge innovation in databases: ledgers that can be read by anyone and which are highly resistant to tampering—meaning that data cannot be erased or changed once entered. At the moment there’s a lot of work going on to design the encryption key management that’s necessary to support the creation and operation of these unique private channels of connection and communication between individuals and organizations. The Sovrin Foundation has built an SLT specifically for digital identity key management, and has donated the required code to the Hyperledger project under ‘Project Indy.’…
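
The tamper-resistance Young describes comes from hash-chaining. A minimal Python sketch (a generic illustration, not Sovrin's or Indy's actual implementation, with invented entries) shows why altering an early record is detectable:

```python
# Minimal hash chain: each block commits to the previous block's hash,
# so changing any entry breaks every link after it.
import hashlib
import json

def block_hash(block):
    return hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()

def append(chain, data):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev": prev, "data": data})

chain = []
append(chain, "alice registers identity key")   # illustrative entries
append(chain, "alice rotates identity key")

# Tampering with the first entry invalidates the link in the second.
chain[0]["data"] = "mallory registers identity key"
print(block_hash(chain[0]) == chain[1]["prev"])  # False: tamper detected
```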

To put it simply, technical technologies are easier to turn in the direction of democracy and social justice if they are developed and applied with social and emotional intelligence. Combining all three together is the key to using technology for liberating ends….(More)”.

These 16 companies want to make technology work for everyone


MIT Sloan School Press Release: “One company helps undocumented people create a digital identity. Another uses artificial intelligence to help students transition to college. Yet another provides free training to budding tech pros.

These organizations are just a few of the many that are using technology to solve problems and help people all over the world — and they are all finalists in the MIT Initiative on the Digital Economy’s second annual Inclusive Innovation Challenge. At a time of great technological innovation, many people are not benefiting from this progress. The challenge recognizes companies that are using technology to improve opportunities for working people…..

Here are the finalists:

AdmitHub
Did you know that of the students admitted to college each spring, 14 percent don’t actually attend come fall? Or that of those who do attend, 48 percent haven’t graduated six years later? Boston-based AdmitHub created a virtual assistant powered by artificial intelligence to help students navigate the financial, academic, and social situations that accompany going to college, and they do it all through text messaging, communicating with students on their terms and easing the transition to college.

African Renewable Energy Distributor Ltd.
This company has developed solar-powered, portable kiosks where people can charge their phones, access Wi-Fi, or access an intranet while offline. Using a micro franchise business model, the Rwanda-based company hopes to empower women and people with disabilities who can run the kiosks.

AID:Tech
More than two billion people worldwide have no legal identity, something that is necessary for accessing public and financial services. Aid:Tech aims to end that by providing a platform for undocumented people to create a digital ID using blockchain, so that every transaction is secure and traceable. Aid:Tech is based in Dublin, with offices in New York and London….(More)”

Inside the Lab That’s Quantifying Happiness


Rowan Jacobsen at Outside: “In Mississippi, people tweet about cake and cookies an awful lot; in Colorado, it’s noodles. In Mississippi, the most-tweeted activity is eating; in Colorado, it’s running, skiing, hiking, snowboarding, and biking, in that order. In other words, the two states fall on opposite ends of the behavior spectrum. If you were to assign a caloric value to every food mentioned in every tweet by the citizens of the United States and a calories-burned value to every activity, and then totaled them up, you would find that Colorado tweets the best caloric ratio in the country and Mississippi the worst.

Sure, you’d be forgiven for doubting people’s honesty on Twitter. On those rare occasions when I destroy an entire pint of Ben and Jerry’s, I most assuredly do not tweet about it. Likewise, I don’t reach for my phone every time I strap on a pair of skis.

And yet there’s this: Mississippi has the worst rate of diabetes and heart disease in the country and Colorado has the best. Mississippi has the second-highest percentage of obesity; Colorado has the lowest. Mississippi has the worst life expectancy in the country; Colorado is near the top. Perhaps we are being more honest on social media than we think. And perhaps social media has more to tell us about the state of the country than we realize.

That’s the proposition of Peter Dodds and Chris Danforth, who co-direct the University of Vermont’s Computational Story Lab, a warren of whiteboards and grad students in a handsome brick building near the shores of Lake Champlain. Dodds and Danforth are applied mathematicians, but they would make a pretty good comedy duo. When I stopped by the lab recently, both were in running clothes and cracking jokes. They have an abundance of curls between them and the wiry energy of chronic thinkers. They came to UVM in 2006 to start the Vermont Complex Systems Center, which crunches big numbers from big systems and looks for patterns. Out of that, they hatched the Computational Story Lab, which sifts through some of that public data to discern the stories we’re telling ourselves. “It took us a while to come up with the name,” Dodds told me as we shotgunned espresso and gazed into his MacBook. “We were going to be the Department of Recreational Truth.”

This year, they teamed up with their PhD student Andy Reagan to launch the Lexicocalorimeter, an online tool that uses tweets to compute the calories in and calories out for every state. It’s no mere party trick; the Story Labbers believe the Lexicocalorimeter has important advantages over slower, more traditional methods of gathering health data….(More)”.