Kalev Leetaru at Forbes: “One of the most talked-about stories in the world of polling and survey research in recent years has been the gradual death of survey response rates and the reliability of those insights….
The online world’s perceived anonymity has offered some degree of reprieve in which online polls and surveys have often bested traditional approaches in assessing views towards society’s most controversial issues. Yet, here as well increasing public understanding of phishing and online safety are ever more problematic.
The answer has been the rise of “big data” analysis of society’s digital exhaust to fill in the gaps….
Is it truly the same answer though?
Constructing and conducting a well-designed survey means being able to ask the public exactly the questions of interest. Most importantly, it entails being able to ensure representative demographics of respondents.
An online-only poll is unlikely to accurately capture the perspectives of the three quarters of the earth’s population that the digital revolution has left behind. Even within the US, social media platforms are extraordinarily skewed.
The far greater problem is that society’s data exhaust is rarely a perfect match for the questions of greatest interest to policymakers and public.
Cellphone mobility records can offer an exquisitely detailed look at how the people of a city go about their daily lives, but beneath all that blinding light are the invisible members of society not deemed valuable to advertisers and thus not counted. Even for the urban society members whose phones are their ever-present companions, mobility data only goes so far. It can tell us that occupants of a particular part of the city during the workday spend their evenings in a particular part of the city, allowing us to understand their work/life balance, but it offers few insights into their political leanings.
One of the greatest challenges of today’s “big data” surveying is that it requires us to narrow our gaze to only those questions which can be easily answered from the data at hand.
Much as AI’s crisis of bias comes from the field’s steadfast refusal to pay for quality data, settling for highly biased free data, so too has “big data” surveying limited itself largely to datasets it can freely and easily acquire.
The result is that with traditional survey research, we are free to ask the precise questions we are most interested in. With data exhaust research, we must imperfectly shoehorn our questions into the few available metrics. With sufficient creativity it is typically possible to find some way of proxying the given question, but the resulting proxies may be highly unstable, with little understanding of when and where they may fail.
Much like how the early rise of the cluster computing era caused “big data” researchers to limit the questions they asked of their data to just those they could fit into a set of tiny machines, so too has the era of data exhaust surveying forced us to greatly restrict our understanding of society.
Most dangerously, however, big data surveying implicitly means we are measuring only the portion of society our vast commercial surveillance state cares about.
In short, we are only able to measure those deemed of greatest interest to advertisers and thus the most monetizable.
Putting this all together, the decline of traditional survey research has led to the rise of “big data” analysis of society’s data exhaust. Instead of giving us an unprecedented new view into the heartbeat of daily life, this reliance on the unintended output of our digital lives has forced researchers to greatly narrow the questions they can explore and severely skews them to the most “monetizable” portions of society.
In the end, the shift of societal understanding from precision surveys to the big data revolution has led not to an incredible new understanding of what makes us tick, but rather a far smaller, less precise and less accurate view than ever before, just our need to understand ourselves has never been greater….(More)”.