Stefaan Verhulst

Can You Really Spot Cancer Through a Search Engine?

Curated on June 9, 2016 (updated August 3, 2018) by Stefaan Verhulst

Michael Reilly at MIT Technology Review: “In the world of cancer treatment, early diagnosis can mean the difference between being cured and being handed a death sentence. At the very least, catching a tumor early increases a patient’s chances of living longer.

Researchers at Microsoft think they may know of a tool that could help detect cancers before you even think to go to a doctor: your search engine.

In a study published Tuesday in the Journal of Oncology Practice, the Microsoft team showed that it was able to mine the anonymized search queries of 6.4 million Bing users to find searches that indicated someone had been diagnosed with pancreatic cancer (such as “why did I get cancer in pancreas,” and “I was told I have pancreatic cancer what to expect”). Then, looking at people’s search patterns before their diagnosis, they identified patterns of search that indicated they had been experiencing symptoms before they ever sought medical treatment.

Pancreatic cancer is a particularly deadly form of the disease. It’s the fourth-leading cause of cancer death in the U.S., and three-quarters of people diagnosed with it die within a year. But catching it early still improves the odds of living longer.

By looking for searches for symptoms—which include yellowing, itchy skin, and abdominal pain—and checking the user’s search history for signs of other risk factors like alcoholism and obesity, the team was often able to identify searches for symptoms up to five months before they were diagnosed.

In their paper, the team acknowledged the limitations of the work, saying that it is not meant to provide people with a diagnosis. Instead they suggested that it might one day be turned into a tool that warns users whose searches indicate they may have symptoms of cancer.

“The goal is not to perform the diagnosis,” said Ryen White, one of the researchers, in a post on Microsoft’s blog. “The goal is to help those at highest risk to engage with medical professionals who can actually make the true diagnosis.”…(More)”

The Spanish Town That Runs on Twitter

Curated on June 9, 2016 (updated August 3, 2018) by Stefaan Verhulst

Mark Scott at the New York Times: “…For the town’s residents, more than half of whom have Twitter accounts, their main way to communicate with local government officials is now the social network. Need to see the local doctor? Send a quick Twitter message to book an appointment. See something suspicious? Let Jun’s policeman know with a tweet.

People in Jun can still use traditional methods, like completing forms at the town hall, to obtain public services. But Mr. Rodríguez Salas said that by running most of Jun’s communications through Twitter, he not only has shaved on average 13 percent, or around $380,000, from the local budget each year since 2011, but he also has created a digital democracy where residents interact online almost daily with town officials.

“Everyone can speak to everyone else, whenever they want,” said Mr. Rodríguez Salas in his office surrounded by Twitter paraphernalia, while sporting a wristband emblazoned with #LoveTwitter. “We are on Twitter because that’s where the people are.”…

By incorporating Twitter into every aspect of daily life — even the local school’s lunch menu is sent out through social media — this Spanish town has become a test bed for how cities may eventually use social networks to offer public services….

Using Twitter has also reduced the need for some jobs. Jun cut its police force by three-quarters, to just one officer, soon after turning to Twitter as its main form of communication when residents began tweeting potential problems directly to the mayor.

“We don’t have one police officer,” Mr. Rodríguez Salas said. “We have 3,500.”

For Justo Ontiveros, Jun’s remaining police officer, those benefits are up close and personal. He now receives up to 20, mostly private, messages from locals daily with concerns ranging from advice on filling out forms to reporting crimes like domestic abuse and speeding.

Mr. Ontiveros said his daily Twitter interactions have given him both greater visibility within the community and a higher level of personal satisfaction, as neighbors now regularly stop him in the street to discuss things that he has posted on Twitter.

“It gives people more power to come and talk to me about their problems,” said Mr. Ontiveros, whose department Twitter account has more than 3,500 followers.

Still, Jun’s reliance on Twitter has not been universally embraced….(More)”

White House Challenges Artificial Intelligence Experts to Reduce Incarceration Rates

Curated on June 9, 2016 (updated May 29, 2019) by Stefaan Verhulst

Jason Shueh at GovTech: “The U.S. spends $270 billion on incarceration each year, has a prison population of about 2.2 million and an incarceration rate that’s spiked 220 percent since the 1980s. But with the advent of data science, White House officials are asking experts for help.

On Tuesday, June 7, the White House Office of Science and Technology Policy’s Lynn Overmann, who also leads the White House Police Data Initiative, stressed the severity of the nation’s incarceration crisis while asking a crowd of data scientists and artificial intelligence specialists for aid.

“We have built a system that is too large, and too unfair and too costly — in every sense of the word — and we need to start to change it,” Overmann said, speaking at a Computing Community Consortium public workshop.

She argued that the U.S., a country that has the highest number of incarcerated citizens in the world, is in need of systematic reforms with both data tools to process alleged offenders and at the policy level to ensure fair and measured sentences. As a longtime counselor, advisor and analyst for the Justice Department and at the city and state levels, Overmann said she has studied and witnessed an alarming number of issues in terms of bias and unwarranted punishments.

For instance, she said that statistically, while drug use is about equal between African-Americans and Caucasians, African-Americans are more likely to be arrested and convicted. They also receive longer prison sentences compared to Caucasian inmates convicted of the same crimes….

Data and digital tools can help curb such pitfalls by increasing efficiency, transparency and accountability, she said.

“We think these types of data exchanges [between officials and technologists] can actually be hugely impactful if we can figure out how to take this information and operationalize it for the folks who run these systems,” Overmann noted.

The opportunities to apply artificial intelligence and data analytics, she said, might include using it to improve questions on parole screenings, using it to analyze police body camera footage, and applying it to criminal justice data for legislators and policy workers….

If the private sector is any indication, artificial intelligence and machine learning techniques could be used to interpret this new and vast supply of law enforcement data. In an earlier presentation by Eric Horvitz, the managing director at Microsoft Research, Horvitz showcased how the company has applied artificial intelligence to vision and language to interpret live video content for the blind. The app, titled SeeingAI, can translate live video footage, captured from an iPhone or a pair of smart glasses, into instant audio messages for the seeing impaired. Twitter’s live-streaming app Periscope has employed similar technology to guide users to the right content….(More)”

Open Data For Social Good: The Case For Better Transport Services

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Martin Howell at TechWeek Europe: “The growing focus on data protection, driven partly by stronger legislation and partly by consumer pressure, has put the debate on the benefits of open data somewhat on the back burner.

The continuing spate of high-profile data breaches and the abuse of public trust in the form of constant bombardment of automated calls, spam emails and clumsily ‘personalised’ advertising has done little to further the open data agenda. In fact it left many consumers feeling lukewarm about the prospects of organisations opening up their data feeds, even at a promise of a better service in return.

That’s a worrying trend. In many industries effective use of open data can lead to development of solutions that address some of the major challenges populations are faced with today, allowing for faster innovation and adaptability to change. There are significant ways in which individuals, and society as a whole could benefit from open data, if organisations and governments get data sharing right.

Open data for transport

A good example is city transportation. Many metropolises face a major challenge – growing populations are placing pressure on current infrastructure systems, leading to congestion and inefficiency.

An open data system, where commuters use a single travel account for all travel transactions and information – whether that’s public transport, walking, using the bike, using Uber, and so on, would give the city unprecedented insight into how people commute and what’s behind their travel choices.

The key to engaging the public with this is the condition that data is used responsibly and for the greater good. Currently, Transport for London (TfL) operates a meet-in-the-middle model. Consumers can travel anonymously on the TfL network, with only the point of entry and point of exit being recorded, and the company provides that anonymised data to third-party app developers who can then use it to release useful travel applications.

TfL doesn’t profit from sharing consumer data but it does enjoy the benefits that come with it. Third-party travel applications make it easier for commuters to use TfL’s network and make the service itself appear more efficient – in short, everyone benefits.

Mutual benefit

Let’s now imagine a scenario that takes this mutually beneficial relationship a step forward, with consumers willingly giving up some information about themselves to the responsible parties (in this case, the city) and receiving personalised service in return. In this scenario, the more information commuters can provide to the system, the more useful the system can be to them.

Apart from providing personalised travel information and recommendations, such a system would have one more important benefit – it would enable cities to encourage greater social responsibility, extending the benefits from the individual to the community as a whole….(More)”

Citizen Lobbying: How Your Skills Can Fix Democracy

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

TEDxBrussels Presentation: “The more society professionalises, the less it takes advantage of its own skills. Indeed, each of us has much more to give to society than what our job descriptions allow. How then can we mobilize our skills for the greater good? Alberto Alemanno, an engaged academic and civic advocate, argues that besides voting and running for office there is also a third, less known yet more promising, way to make society progress: lobbying. Lobbying is no longer a prerogative of well-funded groups with huge memberships and countless political connections. This talk offers you a guide on how to become an effective citizen lobbyist in your daily life by tapping into your own talents, skills and experience….(More)” See also http://www.thegoodlobby.eu/

Digital Keywords: A Vocabulary of Information Society and Culture

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Book edited by Benjamin Peters: “In the age of search, keywords increasingly organize research, teaching, and even thought itself. Inspired by Raymond Williams’s 1976 classic Keywords, the timely collection Digital Keywords gathers pointed, provocative short essays on more than two dozen keywords by leading and rising digital media scholars from the areas of anthropology, digital humanities, history, political science, philosophy, religious studies, rhetoric, science and technology studies, and sociology. Digital Keywords examines and critiques the rich lexicon animating the emerging field of digital studies.

This collection broadens our understanding of how we talk about the modern world, particularly of the vocabulary at work in information technologies. Contributors scrutinize each keyword independently: for example, the recent pairing of digital and analog is separated, while classic terms such as community, culture, event, memory, and democracy are treated in light of their historical and intellectual importance. Metaphors of the cloud in cloud computing and the mirror in data mirroring combine with recent and radical uses of terms such as information, sharing, gaming, algorithm, and internet to reveal previously hidden insights into contemporary life. Bookended by a critical introduction and a list of over two hundred other digital keywords, these essays provide concise, compelling arguments about our current mediated condition.

Digital Keywords delves into what language does in today’s information revolution and why it matters…(More)”.

Searching for Someone: From the “Small World Experiment” to the “Red Balloon Challenge,” and beyond

Curated on June 8, 2016 (updated July 19, 2019) by Stefaan Verhulst

Essay by Manuel Cebrian, Iyad Rahwan, Victoriano Izquierdo, Alex Rutherford, Esteban Moro and Alex (Sandy) Pentland: “Our ability to search social networks for people and information is fundamental to our success. We use our personal connections to look for new job opportunities, to seek advice about what products to buy, to match with romantic partners, to find a good physician, to identify business partners, and so on.

Despite living in a world populated by seven billion people, we are able to navigate our contacts efficiently, only needing a handful of personal introductions before finding the answer to our question, or the person we are seeking. How does this come to be? In folk culture, the answer to this question is that we live in a “small world.” The catch-phrase was coined in 1929 by the visionary author Frigyes Karinthy in his Chain-Links essay, where these ideas are put forward for the first time.

Let me put it this way: Planet Earth has never been as tiny as it is now. It shrunk — relatively speaking of course — due to the quickening pulse of both physical and verbal communication. We never talked about the fact that anyone on Earth, at my or anyone’s will, can now learn in just a few minutes what I think or do, and what I want or what I would like to do. Now we live in fairyland. The only slightly disappointing thing about this land is that it is smaller than the real world has ever been. — Frigyes Karinthy, Chain-Links, 1929

Then, it was just a dystopian idea reflecting the anxiety of living in an increasingly more connected world. But there was no empirical evidence that this was actually the case, and it took almost 30 years to find any.

Six Degrees of Separation

In 1967, legendary psychologist Stanley Milgram conducted a ground-breaking experiment to test this “small world” hypothesis. He started with random individuals in the U.S. midwest, and asked them to send packages to people in Boston, Massachusetts, whose address was not given. They must contribute to this “search” only by sending the package to individuals known on a first-name basis. Milgram expected that successful searches (if any!) would require hundreds of individuals along the chain from the initial sender to the final recipient.

Surprisingly, however, Milgram found that the average path length was somewhere between five point five and six individuals, which made social search look astonishingly efficient. Although the experiment raised some methodological criticisms, its findings were profound. However, what it did not answer is why social networks have such short paths in the first place. The answer was not obvious. In fact, there were reasons to suspect that short paths were just a myth: social networks are very cliquish. Your friends’ friends are likely to also be your friends, and thus most social paths are short and circular. This “cliquishness” suggests that our search through the social network can easily get “trapped” within our close social community, making social search highly inefficient.

Architectures for Social Search

Again, it took a long time — more than 40 years — before this riddle was solved. In a 1998 seminal paper in Nature, Duncan Watts & Steven Strogatz came up with an elegant mathematical model to explain the existence of these short paths. They started from a social network that is very cliquish, i.e., most of your friends are also friends of one another. In this model, the world is “large” since the social distance among individuals is very long. However, if we take only a tiny fraction of these connections (say one out of every hundred links), and rewire them to random individuals in the network, that same world suddenly becomes “small.” These random connections allow individuals to jump to faraway communities very quickly — using them as social network highways — thus reducing average path length in a dramatic fashion.
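The rewiring idea can be tried out in a few lines of Python. What follows is only a minimal sketch and not the authors' code: the ring-shaped starting network, the sizes (200 people, 6 acquaintances each) and the function names `ring_lattice`, `rewire` and `average_path_length` are all assumptions made for illustration. It builds a cliquish network, redirects roughly one link in a hundred to a random person, and compares the average number of introductions needed before and after.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Cliquish start: each of n people knows their k nearest neighbours on a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """Return a copy where each link is, with probability p, redirected to a random person."""
    n = len(adj)
    new = {u: set(vs) for u, vs in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < p:
            # Avoid self-links and duplicate links when choosing the new endpoint.
            candidates = [w for w in range(n) if w != u and w not in new[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            new[u].discard(v)
            new[v].discard(u)
            new[u].add(w)
            new[w].add(u)
    return new


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (breadth-first search)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same network
    lattice = ring_lattice(200, 6)
    small_world = rewire(lattice, 0.01, rng)  # about one link in a hundred
    print("cliquish network:", round(average_path_length(lattice), 2))
    print("after rewiring:  ", round(average_path_length(small_world), 2))
```

Raising `p` adds more shortcuts and shortens paths further, at the price of making the network less cliquish; the exact numbers depend on the seed and on the illustrative sizes chosen here.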

While this theoretical insight suggests that social networks are searchable due to the existence of short paths, it does not yet say much about the “procedure” that people use to find these paths. There is no reason, a priori, that we should know how to find these short chains, especially since there are many chains, and no individuals have knowledge of the network structure beyond their immediate communities. People do not know how the friends of their friends are connected among themselves, and therefore it is not obvious that they would have a good way of navigating their social network while searching.

Soon after Watts and Strogatz came up with this model at Cornell University, a computer scientist across campus, Jon Kleinberg, set out to investigate whether such “small world” networks are searchable. In a landmark Nature article, “Navigation in a Small World,” published in 2000, he showed that social search is easy without global knowledge of the network, but only for a very specific value of the probability of long-range connectivity (i.e., the probability that we know somebody far removed from us, socially, in the social network). With the advent of a publicly available social media dataset such as LiveJournal, David Liben-Nowell and colleagues showed that real-world social networks do indeed have these particular long-range ties. It appears the social architecture of the world we inhabit is remarkably fine-tuned for searchability….
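A rough sketch of what "search without global knowledge" can look like is given below. It is an illustration under stated assumptions, not a reproduction of Kleinberg's analysis: people sit on a square grid, each knows their grid neighbours plus one long-range contact drawn with probability proportional to distance raised to the power `-r`, and each message holder simply passes the message to whichever contact is closest to the target. The grid layout, the exponent parameter `r` and the names `manhattan`, `long_range_contact` and `greedy_route` are all hypothetical choices for this example.

```python
import random


def manhattan(a, b):
    """Grid distance between two people, used here as a stand-in for social distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def long_range_contact(node, size, r, rng):
    """Pick one far-away acquaintance with probability proportional to distance ** -r."""
    others = [(x, y) for x in range(size) for y in range(size) if (x, y) != node]
    weights = [manhattan(node, other) ** -r for other in others]
    return rng.choices(others, weights=weights, k=1)[0]


def greedy_route(source, target, size, r, rng):
    """Forward a message using only local knowledge and return the number of hops.

    Each holder looks at its own contacts (grid neighbours plus one long-range
    tie) and hands the message to the contact closest to the target. A grid
    neighbour is always one step closer, so the loop is guaranteed to finish.
    """
    current, hops = source, 0
    while current != target:
        x, y = current
        contacts = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < size and 0 <= y + dy < size
        ]
        # Long-range ties are sampled only when a node is visited; because the
        # message never returns to a node, this matches fixing them in advance.
        contacts.append(long_range_contact(current, size, r, rng))
        current = min(contacts, key=lambda c: manhattan(c, target))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(0)
    size = 30
    for r in (0, 2, 4):
        trials = [greedy_route((0, 0), (size - 1, size - 1), size, r, rng) for _ in range(50)]
        print(f"r={r}: average hops {sum(trials) / len(trials):.1f}")
```

On a grid this small the differences between exponents are modest, so the printed averages should be read as a feel for the mechanism rather than as a demonstration of the theoretical result, which concerns how hop counts grow as the network gets very large.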

The Tragedy of the Crowdsourcers

Some recent efforts have been made to try and disincentivize sabotage. If verification is also rewarded along the recruitment tree, then the individuals who recruited the saboteurs would have a clear incentive to verify, halt, and punish the saboteurs. This theoretical solution is yet to be tested in practice, and it is conjectured that a coalition of saboteurs, where saboteurs recruit other saboteurs pretending to “vet” them, would make recursive verification futile.

If we are to believe in theory, theory does not shed a promising light on reducing sabotage in social search. We recently proposed the “Crowdsourcing Dilemma.” In it, we perform a game-theoretic analysis of the fundamental tradeoff between the potential for increased productivity of social search and the possibility of being set back by malicious behavior, including misinformation. Our results show that, in competitive scenarios, such as those with multiple social searches competing for the same information, malicious behavior is the norm, not an anomaly — a result contrary to conventional wisdom. Even worse: counterintuitively, making sabotage more costly does not deter saboteurs, but leads all the competing teams to a less desirable outcome, with more aggression, and less efficient collective search for talent.

These empirical and theoretical findings have cautionary implications for the future of social search, and crowdsourcing in general. Social search is surprisingly efficient, cheap, easy to implement, and functional across multiple applications. But there are also surprises in the amount of evildoing that the social searchers will stumble upon while recruiting. As we get deeper and deeper into the recruitment tree, we stumble upon that evil force lurking in the dark side of the network.

Evil mutates and regenerates in the crowd in new forms impossible to anticipate by the designers or participants themselves. Crowdsourcing and its enemies will always be engaged in a co-evolutionary arms race.

Talent is there to be searched and recruited. But so are evil and malice. Ultimately, crowdsourcing experts need to figure out how to recruit more of the former, while deterring more of the latter. We might be living in a small world, but the cost and fragility of navigating it could harm any potential strategy to leverage the power of social networks….

Being searchable is a way of being closely connected to everyone else, which is conducive to contagion, group-think, and, most crucially, makes it hard for individuals to differentiate from each other. Evolutionarily, for better or worse, our brain makes us mimic others, and whether this copying of others ends up being part of the Wisdom of the Crowds, or the “stupidity of many,” it is highly sensitive to the scenario at hand.

Katabasis, or the myth of the hero that descends to the underworld and comes back stronger, is as old as time and pervasive across ancient cultures. Creative people seem to need to “get lost.” Grigori Perelman, Shinichi Mochizuki, and Bob Dylan all disappeared for a few years to reemerge later as more creative versions of themselves. Others like J. D. Salinger and Bobby Fischer also vanished, and never came back to the public sphere. If others cannot search and find us, we gain some slack, some room to escape from what we are known for by others. Searching for our true creative selves may rest on the difficulty of others finding us….(More)”

Fan Favorites

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Erin Reilly at Strategy + Business: “…In theory, new technological advances such as big data and machine learning, combined with more direct access to audience sentiment, behaviors, and preferences via social media and over-the-top delivery channels, give the entertainment and media industry unprecedented insight into what the audience actually wants. But as a professional in the television industry put it, “We’re drowning in data and starving for insights.” Just as my data trail didn’t trace an accurate picture of my true interest in soccer, no data set can quantify all that consumers are as humans. At USC’s Annenberg Innovation Lab, our research has led us to an approach that blends data collection with a deep understanding of the social and cultural context in which the data is created. This can be a powerful practice for helping researchers understand the behavior of fans — fans of sports, brands, celebrities, and shows.

A Model for Understanding Fans

Marketers and creatives often see audiences and customers as passive assemblies of listeners or spectators. But we believe it’s more useful to view them as active participants. The best analogy may be fans. Broadly characterized, fans have a continued connection with the property they are passionate about. Some are willing to declare their affinity through engagement, some have an eagerness to learn more about their passion, and some want to connect with others who share their interests. Fans are emotionally linked to the object of their passion, and experience their passion through their own subjective lenses. We all start out as audience members. But sometimes, when the combination of factors aligns in just the right way, we become engaged as fans.

For businesses, the key to building this engagement and solidifying the relationship is understanding the different types of fan motivations in different contexts, and learning how to turn the data gathered about them into actionable insights. Even if Jane Smith and her best friend are fans of the same show, the same team, or the same brand, they’re likely passionate for different reasons. For example, some viewers may watch the ABC melodrama Scandal because they’re fashionistas and can’t wait to see the newest wardrobe of star Kerry Washington; others may do so because they’re obsessed with politics and want to see how the newly introduced Donald Trump–like character will behave. And those differences mean fans will respond in varied ways to different situations and content.

Though traditional demographics may give us basic information about who fans are and where they’re located, current methods of understanding and measuring engagement are missing the answers to two essential questions: (1) Why is a fan motivated? and (2) What triggers the fan’s behavior? Our Innovation Lab research group is developing a new model called Leveraging Engagement, which can be used as a framework when designing media strategy….(More)”

Big Data Quality: a Roadmap for Open Data

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Paper by Paolo Ciancarini, Francesco Poggi and Daniel Russo: “Open Data (OD) is one of the most discussed issues of Big Data, which has raised the joint interest of public institutions, citizens and private companies since 2009. In addition to transparency in public administrations, another key objective of these initiatives is to allow the development of innovative services for solving real world problems, creating value in some positive and constructive way. However, the massive amount of freely available data has not yet brought the expected effects: as of today, there is no application that has exploited the potential provided by large and distributed information sources in a non-trivial way, nor has any service substantially changed for the better the lives of people. The era of a new generation of applications based on open data is still far off. In this context, we observe that OD quality is one of the major threats to achieving the goals of the OD movement. The starting point of this study is the quality of the OD released by the five Constitutional offices of Italy. W3C standards about OD are widely known and accepted in Italy by the Italian Digital Agency (AgID). According to the most recent Italian laws, the Public Administration may release OD according to the AgID standards. Our exploratory study aims to assess the quality of such releases and the real implementations of OD. The outcome suggests the need for a drastic improvement in OD quality. Finally we highlight some key quality principles for OD, and propose a roadmap for further research….(More)”

The Perils of Experimentation

Curated on June 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Paper by Michael A. Livermore: “More than eighty years after Justice Brandeis coined the phrase “laboratories of democracy,” the concept of policy experimentation retains its currency as a leading justification for decentralized governance. This Article examines the downsides of experimentation, and in particular the potential for decentralization to lead to the production of information that exacerbates public choice failures. Standard accounts of experimentation and policy learning focus on information concerning the social welfare effects of alternative policies. But learning can also occur along a political dimension as information about ideological preferences, campaign techniques, and electoral incentives is revealed. Both types of information can be put to use in the policy arena by a host of individual and institutional actors that have a wide range of motives, from public-spirited concern for the general welfare to a desire to maximize personal financial returns. In this complex environment, there is no guarantee that the information that is generated by experimentation will lead to social benefits. This Article applies this insight to prior models of federalism developed in the legal and political science literature to show that decentralization can lead to the over-production of socially harmful information. As a consequence, policy makers undertaking a decentralization calculation should seek a level of decentralization that best balances the costs and benefits of information production. To illustrate the legal and policy implications of the arguments developed here, this Article examines two contemporary environmental rulemakings of substantial political, legal, and economic significance: a rule to define the jurisdictional reach of the Clean Water Act; and a rule to limit greenhouse gas emissions from the electricity generating sector….(More)”.