Book edited by Jeff Collmann and Sorin Adam Matei: “This book springs from a multidisciplinary, multi-organizational, and multi-sector conversation about the privacy and ethical implications of research in human affairs using big data. The need to cultivate and enlist the public’s trust in the abilities of particular scientists and scientific institutions constitutes one of this book’s major themes. The advent of the Internet, the mass digitization of research information, and social media brought about, among many other things, the ability to harvest – sometimes implicitly – a wealth of human genomic, biological, behavioral, economic, political, and social data for the purposes of scientific research as well as commerce, government affairs, and social interaction. What type of ethical dilemmas did such changes generate? How should scientists collect, manipulate, and disseminate this information? The effects of this revolution and its ethical implications are wide-ranging.
This book includes the opinions of myriad investigators, practitioners, and stakeholders in big data on human beings who also routinely reflect on the privacy and ethical issues of this phenomenon. Dedicated to the practice of ethical reasoning and reflection in action, the book offers a range of observations, lessons learned, reasoning tools, and suggestions for institutional practice to promote responsible big data research on human affairs. It caters to a broad audience of educators, researchers, and practitioners. Educators can use the volume in courses related to big data handling and processing. Researchers can use it for designing new methods of collecting, processing, and disseminating big data, whether in raw form or as analysis results. Lastly, practitioners can use it to steer future tools or procedures for handling big data. As this topic represents an area of great interest that still remains largely undeveloped, this book is sure to attract significant interest by filling an obvious gap in currently available literature. …(More)”
Jesse Dunietz at Nautilus: “…A feverish push for “big data” analysis has swept through biology, linguistics, finance, and every field in between. Although no one can quite agree how to define it, the general idea is to find datasets so enormous that they can reveal patterns invisible to conventional inquiry. The data are often generated by millions of real-world user actions, such as tweets or credit-card purchases, and they can take thousands of computers to collect, store, and analyze. To many companies and researchers, though, the investment is worth it because the patterns can unlock information about anything from genetic disorders to tomorrow’s stock prices.
But there’s a problem: It’s tempting to think that with such an incredible volume of data behind them, studies relying on big data couldn’t be wrong. But the bigness of the data can imbue the results with a false sense of certainty. Many of them are probably bogus—and the reasons why should give us pause about any research that blindly trusts big data.
In the case of language and culture, big data showed up in a big way in 2011, when Google released its Ngrams tool. Announced with fanfare in the journal Science, Google Ngrams allowed users to search for short phrases in Google’s database of scanned books—about 4 percent of all books ever published!—and see how the frequency of those phrases has shifted over time. The paper’s authors heralded the advent of “culturomics,” the study of culture based on reams of data. Since then, Google Ngrams has been, well, largely an endless source of entertainment—but also a goldmine for linguists, psychologists, and sociologists. They’ve scoured its millions of books to show that, for instance, yes, Americans are becoming more individualistic; that we’re “forgetting our past faster with each passing year”; and that moral ideals are disappearing from our cultural consciousness.
[Image caption] WE’RE LOSING HOPE: An Ngrams chart for the word “hope,” one of many intriguing plots found by xkcd author Randall Munroe. If Ngrams really does reflect our culture, we may be headed for a dark place.
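For readers who want the mechanics made concrete, here is a minimal, hypothetical sketch (in Python, with an invented toy corpus) of the kind of computation an Ngrams-style query performs: counting how often a phrase appears among all the words published in a given year. It illustrates the idea only; it is not Google’s actual pipeline.

```python
from collections import defaultdict

# Toy corpus of (publication year, text) pairs standing in for scanned books.
corpus = [
    (1900, "dignity and character were praised in every sermon"),
    (1900, "a man of character and hope"),
    (2000, "big data and hope for the future"),
    (2000, "data about data about data"),
]

def ngram_frequency(corpus, phrase):
    """Relative frequency of `phrase` among all words published in each year."""
    phrase_tokens = phrase.lower().split()
    n = len(phrase_tokens)
    hits, totals = defaultdict(int), defaultdict(int)
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase_tokens:
                hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}

print(ngram_frequency(corpus, "hope"))  # frequency of "hope" per year
```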
The problems start with the way the Ngrams corpus was constructed. In a study published last October, three University of Vermont researchers pointed out that, in general, Google Books includes one copy of every book. This makes perfect sense for its original purpose: to expose the contents of those books to Google’s powerful search technology. From the angle of sociological research, though, it makes the corpus dangerously skewed….
Even once you get past the data sources, there’s still the thorny issue of interpretation. Sure, words like “character” and “dignity” might decline over the decades. But does that mean that people care about morality less? Not so fast, cautions Ted Underwood, an English professor at the University of Illinois, Urbana-Champaign. Conceptions of morality at the turn of the last century likely differed sharply from ours, he argues, and “dignity” might have been popular for non-moral reasons. So any conclusions we draw by projecting current associations backward are suspect.
Of course, none of this is news to statisticians and linguists. Data and interpretation are their bread and butter. What’s different about Google Ngrams, though, is the temptation to let the sheer volume of data blind us to the ways we can be misled.
This temptation isn’t unique to Ngrams studies; similar errors undermine all sorts of big data projects. Consider, for instance, the case of Google Flu Trends (GFT). Released in 2008, GFT would count words like “fever” and “cough” in millions of Google search queries, using them to “nowcast” how many people had the flu. With those estimates, public health officials could act two weeks before the Centers for Disease Control could calculate the true numbers from doctors’ reports.
Initially, GFT was claimed to be 97 percent accurate. But as a study out of Northeastern University documents, that accuracy was a fluke. First, GFT completely missed the “swine flu” pandemic in the spring and summer of 2009. (It turned out that GFT was largely predicting winter.) Then, the system began to overestimate flu cases. In fact, it overshot the peak 2013 numbers by a whopping 140 percent. Eventually, Google just retired the program altogether.
So what went wrong? As with Ngrams, people didn’t carefully consider the sources and interpretation of their data. The data source, Google searches, was not a static beast. When Google started auto-completing queries, users started just accepting the suggested keywords, distorting the searches GFT saw. On the interpretation side, GFT’s engineers initially let GFT take the data at face value; almost any search term was treated as a potential flu indicator. With millions of search terms, GFT was practically guaranteed to over-interpret seasonal words like “snow” as evidence of flu.
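The seasonal-confound problem is easy to reproduce. The sketch below (synthetic data invented for illustration, not GFT’s real inputs) shows that a winter-peaking query like “snow” correlates with flu counts almost as well as “fever” does—which is exactly the trap a purely data-driven term screen falls into.

```python
import numpy as np

# Hypothetical illustration of over-interpretation: both flu activity and
# "snow" searches peak every winter, so their correlation is high even
# though snow has nothing to do with influenza.
rng = np.random.default_rng(0)
weeks = np.arange(156)                      # three years of weekly data
season = np.cos(2 * np.pi * weeks / 52.0)   # peaks each winter

flu_cases = 1000 + 800 * season + rng.normal(0, 50, weeks.size)
snow_searches = 500 + 400 * season + rng.normal(0, 80, weeks.size)
fever_searches = 300 + 250 * season + rng.normal(0, 60, weeks.size)

for term, series in [("snow", snow_searches), ("fever", fever_searches)]:
    r = np.corrcoef(series, flu_cases)[0, 1]
    print(f"correlation of '{term}' searches with flu cases: {r:.2f}")
# Both correlations come out above 0.9, so a screen over millions of terms
# will happily keep "snow" as a flu signal -- until an off-season outbreak
# like the 2009 swine flu breaks the pattern.
```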
But when big data isn’t seen as a panacea, it can be transformative. Several groups, like Columbia University researcher Jeffrey Shaman’s, have outperformed the flu predictions of both the CDC and GFT by using the former to compensate for the skew of the latter. “Shaman’s team tested their model against actual flu activity that had already occurred during the season,” according to the CDC. By taking the immediate past into consideration, Shaman and his team fine-tuned their mathematical model to better predict the future. All it takes is for teams to critically assess their assumptions about their data….(More)
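The general idea of using official reports to correct a skewed nowcast can be sketched in a few lines. This is a hedged illustration of the principle only—invented numbers, and not Shaman’s actual epidemiological model: estimate the recent systematic bias of the search-based estimate against the CDC’s lagged counts, then remove it.

```python
import numpy as np

def calibrated_nowcast(nowcast_history, cdc_history, raw_nowcast, window=8):
    """Rescale `raw_nowcast` by the average nowcast/CDC ratio over a recent window."""
    recent_ratio = np.mean(
        np.asarray(nowcast_history[-window:]) / np.asarray(cdc_history[-window:])
    )
    return raw_nowcast / recent_ratio

# Toy usage: the search-based nowcast has been running ~40% hot
# relative to the CDC's (delayed but reliable) counts.
nowcast_history = [140, 154, 133, 147, 161, 138, 152, 144]
cdc_history =     [100, 110,  95, 105, 115,  99, 108, 103]
print(calibrated_nowcast(nowcast_history, cdc_history, raw_nowcast=168))  # ~120
```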
Elizabeth Radziszewski at The Wilson Quarterly: “Although the landscape of threats has changed in recent years, U.S. strategies bear a striking resemblance to the ways policymakers dealt with crises in the past. Whether it involves diplomatic overtures, sanctions, bombing campaigns, or the use of special ops and covert operations, the range of responses suffers from an innovation deficit. Even the use of drones, while a new tool of warfare, is still part of the limited categories of responses that focus mainly on whether to kill, cooperate, or do nothing. To meet the evolving nature of threats posed by nonstate actors such as ISIS, the United States needs a strategy makeover — a creative lift, so to speak.
Enter the business world. Today’s top companies face an increasingly competitive marketplace where innovative approaches to product and service development are a necessity. Just as the market has changed for companies since the forces of globalization and the digital economy took over, so has the security landscape evolved for the world’s leading hegemon. Yet the responses of top businesses to these changes stand in stark contrast to the United States’ stagnant approaches to current national security threats. Many of today’s thriving businesses have embraced design thinking (DT), an innovative process that identifies consumer needs through immersive ethnographic experiences that are melded with creative brainstorming and quick prototyping.
What would happen if U.S. policymakers took cues from the business world and applied DT in policy development? Could the United States prevent the threats from metastasizing with more proactive rather than reactive strategies — by discovering, for example, how ideas from biology, engineering, and other fields could help analysts inject fresh perspective into tired solutions? Put simply, if U.S. policymakers want to succeed in managing future threats, then they need to start thinking more like business innovators who integrate human needs with technology and economic feasibility.
In his 1969 book The Sciences of the Artificial, Herbert Simon made the first connection between design and a way of thinking. But it was not until the 1980s and 1990s that Stanford scientists began to see the benefits of design practices used by industrial designers as a method for creative thinking. At the core of DT is the idea that solving a challenge requires a deeper understanding of the problem’s true nature and the processes and people involved. This approach contrasts greatly with more standard innovation styles, where a policy solution is developed and then resources are used to fit the solution to the problem. DT reverses the order.
DT encourages divergent thinking, the process of generating many ideas before converging to select the most feasible ones, including making connections between different-yet-related worlds. Finally, the top ideas are quickly prototyped and tested so that early solutions can be modified without investing many resources and risking the biggest obstacle to real innovation: the impulse to try fitting an idea, product, or policy to the people, rather than the other way around…
If DT has reenergized the innovative process in the business and nonprofit sectors, a systematic application of its methodology could just as well revitalize U.S. national security policies. Innovation in security and foreign policy is often framed around the idea of technological breakthroughs. Thanks to the Defense Advanced Research Projects Agency (DARPA), the Department of Defense has been credited with such groundbreaking inventions as GPS, the Internet, and stealth fighters — all of which have created rich opportunities to explore new military strategies. Reflecting this infatuation with technology, but with a new edge, is Defense Secretary Ashton Carter’s unveiling of the Defense Innovation Unit Experimental, an initiative to scout for new technologies, improve outreach to startups, and form deeper relationships between the Pentagon and Silicon Valley. The new DIUE effort signals what businesses have already noticed: the need to be more flexible in establishing linkages with people outside of the government in the search for new ideas.
Yet because the primary objective of DIUE remains technological prowess, the effort alone is unlikely to drastically improve the management of national security. Technology is not a substitute for an innovative process. When new invention is prized as the sole focus of innovation, it can, paradoxically, paralyze innovation. Once an invention is adopted, it is all too tempting to mold subsequent policy development around emergent technology, even if other solutions could be more appropriate….(More)”
Chapter by Thakuriah, P., Dirks, L., and Keita, Y. in Seeing Cities Through Big Data: Research Methods and Applications in Urban Informatics (forthcoming): “This paper assesses non-traditional urban digital infomediaries who are pushing the agenda of urban Big Data and Open Data. Our analysis identified a mix of private, public, non-profit and informal infomediaries, ranging from very large organizations to independent developers. Using a mixed-methods approach, we identified four major groups of organizations within this dynamic and diverse sector: general-purpose ICT providers, urban information service providers, open and civic data infomediaries, and independent and open source developers. A total of nine organizational types are identified within these four groups. We align these nine organizational types along five dimensions that account for their missions and major interests, products and services, as well as the activities they undertake: techno-managerial, scientific, business and commercial, urban engagement, and openness and transparency. We discuss urban ICT entrepreneurs, and the role of informal networks involving independent developers, data scientists and civic hackers in a domain that historically involved professionals in the urban planning and public management fields. Additionally, we examine convergence in the sector by analyzing overlaps in their activities, as determined by a text mining exercise of organizational webpages. We also consider increasing similarities in products and services offered by the infomediaries, while highlighting ideological tensions that might arise given the overall complexity of the sector, and differences in the backgrounds and end-goals of the participants involved. There is much room for the creation of knowledge and value networks in the urban data sector and for improved cross-fertilization among bodies of knowledge….(More)”
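As an illustration of the kind of webpage text mining the chapter describes, the sketch below—with invented organization labels and snippets, and scikit-learn chosen here purely for convenience—scores pairwise overlap between organizations’ self-descriptions using TF-IDF vectors and cosine similarity.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical webpage snippets; the real study mined actual organizational pages.
pages = {
    "open_data_portal": "open data transparency civic datasets city government",
    "analytics_vendor": "urban analytics dashboards sensor data city services",
    "civic_hackers": "civic hacking open data volunteers transparency apps",
}

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(pages.values())   # one TF-IDF row per organization
overlap = cosine_similarity(matrix)                 # pairwise similarity scores

names = list(pages)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs {names[j]}: {overlap[i, j]:.2f}")
```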
Responsible Data Forum: “The Engine Room is excited to release new adaptations of the responsible development data book that we now fondly refer to as “The Hand-Book of the Modern Development Specialist: Being a Complete Illustrated Guide to Responsible Data Usage, Manners & General Deportment.”
You can now view this resource on its new webpage, where you can read chapter summaries for quick reference, utilize slide decks complete with presenter notes, and read the original resource with a new design make-over….
Freshly Released Adaptations
The following adaptations can be found on our Hand-book webpage.
Chapter summaries: Chapter summaries enable readers to get a taste of section content, let them know whether a particular section is of use, provide a simple overview for those who aren’t comfortable diving right into the book, and give a memory jog to those who are already familiar with the content.
Slide deck templates: The slide decks enable in-depth presentations based on the structure of the book by using its diagrams. This will help responsible data advocates customize slides for their own organization’s needs. These decks are complete with thorough notes to aid a presenter who may not be an expert on the contents.
New & improved book format: Who doesn’t love a makeover? The original resource is still available to download as a printable file for those who prefer book formatting, and now the document sports improved visuals and graphics….(More)”
Brief by Dennis Anderson, Robert Wu, Dr. June-Suh Cho, and Katja Schroeder: “This book discusses three levels of e-government and national strategies to reach a citizen-centric participatory e-government, and examines how disruptive technologies help shape the future of e-government. The authors examine how e-government can facilitate a symbiotic relationship between the government and its citizens. ICTs aid this relationship and promote transparency so that citizens can place greater trust in the activities of their government. If a government can manage resources more effectively by better understanding the needs of its citizens, it can create a sustainable environment for citizens. Having a national strategy on ICT in government and e-government can significantly reduce government waste, corruption, and inefficiency. Businesses, CIOs and CTOs in the public sector interested in meeting sustainability requirements will find this book useful. …(More)”
Book by Calestous Juma: “The rise of artificial intelligence has rekindled a long-standing debate regarding the impact of technology on employment. This is just one of many areas where exponential advances in technology signal both hope and fear, leading to public controversy. This book shows that many debates over new technologies are framed in the context of risks to moral values, human health, and environmental safety. But it argues that behind these legitimate concerns often lie deeper, but unacknowledged, socioeconomic considerations. Technological tensions are often heightened by perceptions that the benefits of new technologies will accrue only to small sections of society while the risks will be more widely distributed. Similarly, innovations that threaten to alter cultural identities tend to generate intense social concern. As such, societies that exhibit great economic and political inequities are likely to experience heightened technological controversies.
Drawing from nearly 600 years of technology history, Innovation and Its Enemies identifies the tension between the need for innovation and the pressure to maintain continuity, social order, and stability as one of today’s biggest policy challenges. It reveals the extent to which modern technological controversies grow out of distrust in public and private institutions. Using detailed case studies of coffee, the printing press, margarine, farm mechanization, electricity, mechanical refrigeration, recorded music, transgenic crops, and transgenic animals, it shows how new technologies emerge, take root, and create new institutional ecologies that favor their establishment in the marketplace. The book uses these lessons from history to contextualize contemporary debates surrounding technologies such as artificial intelligence, online learning, 3D printing, gene editing, robotics, drones, and renewable energy. It ultimately makes the case for shifting greater responsibility to public leaders to work with scientists, engineers, and entrepreneurs to manage technological change, make associated institutional adjustments, and expand public engagement on scientific and technological matters….(More)”
Chapter by Ricard Munné in New Horizons for a Data-Driven Economy: “The public sector is becoming increasingly aware of the potential value to be gained from big data, as governments generate and collect vast quantities of data through their everyday activities.
The benefits of big data in the public sector can be grouped into three major areas, based on a classification of the types of benefits: advanced analytics, through automated algorithms; improvements in effectiveness, providing greater internal transparency; and improvements in efficiency, where better services can be provided through the personalization of services and learning from the performance of such services.
The chapter examines several drivers and constraints that have been identified, which can boost or stall the development of big data in the sector depending on how they are addressed. The findings, after analysing the requirements and the technologies currently available, show that there are open research questions to be addressed in order to develop these technologies so that competitive and effective solutions can be built. The main developments are required in the fields of scalability of data analysis, pattern discovery, and real-time applications. Also required are improvements in provenance for the sharing and integration of data from the public sector. It is also extremely important to provide integrated security and privacy mechanisms in big data applications, as the public sector collects vast amounts of sensitive data. Finally, respecting the privacy of citizens is a mandatory obligation in the European Union….(More)”
Book edited by Jennifer Howard-Grenville, Claus Rerup, Ann Langley, and Haridimos Tsoukas: “Over the past 15 years, organizational routines have been increasingly investigated from a process perspective to challenge the idea that routines are stable entities that are mindlessly enacted.
A process perspective explores how routines are performed by specific people in specific settings. It shows how action, improvisation, and novelty are part of routine performances. It also departs from a view of routines as “black boxes” that transform inputs into organizational outputs and places attention on the actual actions and patterns that comprise routines. Routines are both effortful accomplishments, in that it takes effort to perform, sustain, or change them, and emergent accomplishments, because sometimes the effort to perform routines leads to unforeseen change.
While a process perspective has enabled scholars to open up the “black box” of routines and explore their actions and patterns in fine-grained, dynamic ways, there is much more work to be done. Chapters in this volume make considerable progress, through the three main themes expressed across these chapters. These are: Zooming out to understand routines in larger contexts; Zooming in to reveal actor dispositions and skill; and Innovation, creativity and routines in ambiguous contexts….(More)”
Book by Geoffrey Rockwell and Stéfan Sinclair: “The image of the scholar as a solitary thinker dates back at least to Descartes’ Discourse on Method. But scholarly practices in the humanities are changing as older forms of communal inquiry are combined with modern research methods enabled by the Internet, accessible computing, data availability, and new media. Hermeneutica introduces text analysis using computer-assisted interpretive practices. It offers theoretical chapters about text analysis, presents a set of analytical tools (called Voyant) that instantiate the theory, and provides example essays that illustrate the use of these tools. Voyant allows users to integrate interpretation into texts by creating hermeneutica—small embeddable “toys” that can be woven into essays published online or into such online writing environments as blogs or wikis. The book’s companion website, Hermeneutic.ca, offers the example essays with both text and embedded interactive panels. The panels show results and allow readers to experiment with the toys themselves.
The use of these analytical tools results in a hybrid essay: an interpretive work embedded with hermeneutical toys that can be explored for technique. The hermeneutica draw on and develop such common interactive analytics as word clouds and complex data journalism interactives. Embedded in scholarly texts, they create a more engaging argument. Moving between tool and text becomes another thread in a dynamic dialogue….(More)”
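For a sense of what sits behind one of these panels, here is a small illustrative sketch—not Voyant’s actual code—of the computation a word-cloud panel relies on: tallying word frequencies in a text after removing common stopwords, then keeping the top terms for display.

```python
from collections import Counter
import re

# A minimal stopword list for illustration; real tools use longer lists.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "that", "it", "as"}

def word_cloud_counts(text, top_n=25):
    """Return the top_n most frequent non-stopword terms in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

sample = "The image of the scholar as a solitary thinker dates back to Descartes."
print(word_cloud_counts(sample))
```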