Stefaan Verhulst
…to perform top-quality and cost-effective research, scientists need these technologies and the technical knowledge of experts to run them. When money is tight, where can scientists turn for the tools they need to complete their projects?
Sharing resources
An early solution to this problem was to create what the academic world calls “resource labs,” which specialize in one or more specific types of experiments (e.g., genomics, cell culture, proteomics). Researchers can then order and pay for that type of experiment from the resource lab instead of doing it on their own.
By focusing on one area of science, resource labs become the experts in that area and do the experiments better, faster and cheaper than most scientists could do in their own labs. Scientists no longer stumble through failed experiments trying to learn a new technique when a resource lab can do it correctly from the start.
The pooled funds from many research projects allow resource labs to buy better and faster equipment than any individual scientist could afford. This gives more researchers access to better technology at lower cost – which also saves taxpayers money, since many grants are government-backed….
Connecting people on a scientific Craigslist
This is a common paradox, with several efforts under way to address it. For example, MIT has created several “remote online laboratories” running experiments that can be controlled via the internet, to help enrich teaching in places that can’t afford advanced equipment. Harvard’s eagle-i system is a directory where researchers can list information, data and equipment they are willing to share with others – including cell lines, research mice, and equipment. Different services work for different institutions.
In 2011, Dr. Elizabeth Iorns, a breast cancer researcher, developed a mouse model to study how breast cancer spreads, but her institution didn’t have the equipment to finish one part of her study. My resource lab could complete the project, but despite significant searching, Dr. Iorns did not have an effective way to find labs like mine.
Actively connecting scientists with resource labs, and helping resource labs keep their equipment optimally busy, is a model Iorns and cofounder Dan Knox have developed into a business, called Science Exchange. (I am on its Lab Advisory Board, but have no financial interest in the company.) A little bit Craigslist and Travelocity for science rolled into one, Science Exchange provides scientists and expert resource labs a way to find each other to keep research progressing.
Unlike Starbucks, resource labs are not found on every corner and can be difficult for scientists to locate. Now a simple search provides scientists with a list of multiple resource labs that could do the experiments, including estimated costs and turnaround times – and even previous users’ reviews of the choices.
I signed onto Science Exchange soon after it went live and Iorns immediately sent her project to my lab. We completed the project quickly, resulting in the first peer-reviewed publication made possible through Science Exchange….(More).
Paper by Ben Goldacre and Jonathan Gray at BioMed Central: “OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine.

The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces onto the data and conduct user engagement workshops to optimise the database and interface designs.

Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data….(More)”
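To make the “basic record linkage” step concrete, here is a minimal illustrative sketch, in Python, of how a journal paper might be threaded to a registry entry: first by spotting a registry identifier in the abstract, then by falling back to rough title similarity. The field names, sample records, and threshold are hypothetical; OpenTrials’ actual schema and matching pipeline may differ.

```python
# Illustrative sketch only: OpenTrials does not publish this exact code or schema.
# It shows the kind of "basic record linkage" the project describes, threading
# documents to a trial via registry IDs, with a rough title-similarity fallback.
import re
from difflib import SequenceMatcher

# Hypothetical minimal records: registry entries and journal-paper abstracts.
registry_entries = [
    {"trial_id": "NCT00461630", "title": "Aspirin for Primary Prevention of Cardiovascular Events"},
    {"trial_id": "ISRCTN12345678", "title": "Exercise Therapy in Chronic Low Back Pain"},
]

papers = [
    {"doi": "10.1000/example.1",
     "abstract": "We report results of trial NCT00461630, a randomised study of aspirin..."},
    {"doi": "10.1000/example.2",
     "abstract": "A pragmatic randomised trial of exercise therapy in chronic low back pain."},
]

ID_PATTERN = re.compile(r"\b(NCT\d{8}|ISRCTN\d{8})\b")

def link_paper(paper, registry):
    """Return the best-matching trial_id for a paper, or None."""
    # 1. Exact registry-ID match inside the abstract (the strong signal).
    found = ID_PATTERN.search(paper["abstract"])
    if found:
        return found.group(1)
    # 2. Fallback: fuzzy similarity between registry titles and the abstract.
    best_id, best_score = None, 0.0
    for entry in registry:
        score = SequenceMatcher(None, entry["title"].lower(),
                                paper["abstract"].lower()).ratio()
        if score > best_score:
            best_id, best_score = entry["trial_id"], score
    return best_id if best_score > 0.3 else None  # arbitrary illustrative threshold

for paper in papers:
    print(paper["doi"], "->", link_paper(paper, registry_entries))
```

In practice the project pairs such automated matching with crowd-sourced curation, so low-confidence links would presumably go to human review rather than being accepted outright.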
Kaveh Waddell in the Atlantic: “Big data can help solve problems that are too big for one person to wrap their head around. It’s helped businesses cut costs, cities plan new developments, intelligence agencies discover connections between terrorists, health officials predict outbreaks, and police forces get ahead of crime. Decision-makers are increasingly told to “listen to the data,” and make choices informed by the outputs of complex algorithms.
But when the data is about humans—especially those who lack a strong voice—those algorithms can become oppressive rather than liberating. For many poor people in the U.S., the data that’s gathered about them at every turn can obstruct attempts to escape poverty.
Low-income communities are among the most surveilled communities in America. And it’s not just the police that are watching, says Michele Gilman, a law professor at the University of Baltimore and a former civil-rights attorney at the Department of Justice. Public-benefits programs, child-welfare systems, and monitoring programs for domestic-abuse offenders all gather large amounts of data on their users, who are disproportionately poor.
In certain places, in order to qualify for public benefits like food stamps, applicants have to undergo fingerprinting and drug testing. Once people start receiving the benefits, officials regularly monitor them to see how they spend the money, and sometimes check in on them in their homes.
Data gathered from those sources can end up feeding back into police systems, leading to a cycle of surveillance. “It becomes part of these big-data information flows that most people aren’t aware they’re captured in, but that can have really concrete impacts on opportunities,” Gilman says.
Once an arrest crops up on a person’s record, for example, it becomes much more difficult for that person to find a job, secure a loan, or rent a home. And that’s not necessarily because loan officers or hiring managers pass over applicants with arrest records—computer systems that whittle down tall stacks of resumes or loan applications will often weed some out based on run-ins with the police.
When big-data systems make predictions that cut people off from meaningful opportunities like these, they can violate the legal principle of presumed innocence, according to Ian Kerr, a professor and researcher of ethics, law, and technology at the University of Ottawa.
Outside the court system, “innocent until proven guilty” is upheld by people’s due-process rights, Kerr says: “A right to be heard, a right to participate in one’s hearing, a right to know what information is collected about me, and a right to challenge that information.” But when opaque data-driven decision-making takes over—what Kerr calls “algorithmic justice”—some of those rights begin to erode….(More)”
Book by Calestous Juma: “The rise of artificial intelligence has rekindled a long-standing debate regarding the impact of technology on employment. This is just one of many areas where exponential advances in technology signal both hope and fear, leading to public controversy. This book shows that many debates over new technologies are framed in the context of risks to moral values, human health, and environmental safety. But it argues that behind these legitimate concerns often lie deeper, but unacknowledged, socioeconomic considerations. Technological tensions are often heightened by perceptions that the benefits of new technologies will accrue only to small sections of society while the risks will be more widely distributed. Similarly, innovations that threaten to alter cultural identities tend to generate intense social concern. As such, societies that exhibit great economic and political inequities are likely to experience heightened technological controversies.
Drawing from nearly 600 years of technology history, Innovation and Its Enemies identifies the tension between the need for innovation and the pressure to maintain continuity, social order, and stability as one of today’s biggest policy challenges. It reveals the extent to which modern technological controversies grow out of distrust in public and private institutions. Using detailed case studies of coffee, the printing press, margarine, farm mechanization, electricity, mechanical refrigeration, recorded music, transgenic crops, and transgenic animals, it shows how new technologies emerge, take root, and create new institutional ecologies that favor their establishment in the marketplace. The book uses these lessons from history to contextualize contemporary debates surrounding technologies such as artificial intelligence, online learning, 3D printing, gene editing, robotics, drones, and renewable energy. It ultimately makes the case for shifting greater responsibility to public leaders to work with scientists, engineers, and entrepreneurs to manage technological change, make associated institutional adjustments, and expand public engagement on scientific and technological matters….(More)”
Chapter by Ricard Munné in New Horizons for a Data-Driven Economy: “The public sector is becoming increasingly aware of the potential value to be gained from big data, as governments generate and collect vast quantities of data through their everyday activities.
The benefits of big data in the public sector can be grouped into three major areas, based on a classification of the types of benefits: advanced analytics, through automated algorithms; improvements in effectiveness, through greater internal transparency; and improvements in efficiency, where better services can be provided based on the personalization of services and on learning from the performance of such services.
The chapter examined several drivers and constraints that can boost or halt the development of big data in the sector, depending on how they are addressed. After analysing the requirements and the technologies currently available, the findings show that open research questions remain to be addressed before competitive and effective solutions can be built. The main developments required are in the scalability of data analysis, pattern discovery, and real-time applications. Also required are improvements in provenance to support the sharing and integration of public-sector data. It is also extremely important to provide integrated security and privacy mechanisms in big data applications, as the public sector collects vast amounts of sensitive data. Finally, respecting the privacy of citizens is a mandatory obligation in the European Union….(More)”
John Wilbanks & Stephen H Friend in Nature Biotechnology: “To upend current barriers to sharing clinical data and insights, we need a framework that not only accounts for choices made by trial participants but also qualifies researchers wishing to access and analyze the data.
This March, Sage Bionetworks (Seattle) began sharing curated data collected from >9,000 participants of mPower, a smartphone-enabled health research study for Parkinson’s disease. The mPower study is notable as one of the first observational assessments of human health to rapidly achieve scale as a result of its design and execution purely through a smartphone interface. To support this unique study design, we developed a novel electronic informed consent process that includes participant-determined data-sharing preferences. It is through these preferences that the new data—including self-reported outcomes and quantitative sensor data—are shared broadly for secondary analysis. Our hope is that by sharing these data immediately, prior even to our own complete analysis, we will shorten the time to harnessing any utility that this study’s data may hold to improve the condition of patients who suffer from this disease.
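As a minimal sketch of how participant-determined sharing preferences might gate a data release, consider the following Python fragment. The field names and preference values are invented for illustration and do not reflect Sage Bionetworks’ actual implementation.

```python
# Illustrative only; not Sage Bionetworks' actual consent code. It sketches how
# participant-chosen sharing preferences might gate which study records are
# released for broad secondary analysis versus kept with the original study team.
from dataclasses import dataclass

@dataclass
class Record:
    participant_id: str
    sharing_choice: str   # hypothetical values: "broad" or "study_team_only"
    tapping_score: float  # example sensor-derived outcome

def release_for_secondary_use(records):
    """Return only records whose participants opted in to broad sharing."""
    return [r for r in records if r.sharing_choice == "broad"]

records = [
    Record("p001", "broad", 42.7),
    Record("p002", "study_team_only", 38.1),
    Record("p003", "broad", 51.3),
]

print([r.participant_id for r in release_for_secondary_use(records)])
# -> ['p001', 'p003']
```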
Turbulent times for data sharing
Our release of mPower comes at a turbulent time in data sharing. The power of data for secondary research is top of mind for many these days. Vice President Joe Biden, in heading President Barack Obama’s ambitious cancer ‘moonshot’, describes data sharing as second only to funding in its importance to the success of the effort. However, this powerful support for data sharing stands in opposition to the opinions of many within the research establishment. Witness the august New England Journal of Medicine (NEJM)’s recent editorial suggesting that those who wish to reuse clinical trial data without the direct participation and approval of the original study team are “research parasites”4. In the wake of colliding perspectives on data sharing, we must not lose sight of the scientific and societal ends served by such efforts.
It is important to acknowledge that meaningful data sharing is a nontrivial process that can require substantial investment to ensure that data are shared with sufficient context to guide data users. When data analysis is narrowly targeted to answer a specific and straightforward question—as with many clinical trials—this added effort might not result in improved insights. However, many areas of science, such as genomics, astronomy and high-energy physics, have moved to data collection methods in which large amounts of raw data are potentially of relevance to a wide variety of research questions, but the methodology of moving from raw data to interpretation is itself a subject of active research….(More)”
The Economist: “….Mr Rhoads is a member of a network started by the Alaska Longline Fishermen’s Association (ALFA), which aims to do something about this and to reduce by-catch of sensitive species such as rockfish at the same time. Network fishermen, who numbered only 20 at the project’s start, agreed to share data on where and what they were catching in order to create maps that highlighted areas of high by-catch. Within two years they had reduced accidental rockfish harvest by as much as 20%.
The rockfish mapping project expanded to create detailed maps of the sea floor, pooling data gathered by transducers fixed to the bottoms of boats. By combining thousands of data points as vessels traverse the fishing grounds, these “wikimaps”—created and updated through crowdsourcing—show gravel beds where bottom-dwelling halibut are likely to linger, craggy terrain where rockfish tend to lurk, and outcrops that could snag gear.
Public charts are imprecise, and equipment with the capability to sense this level of detail could cost a fisherman more than $70,000. Skippers join ALFA for as little as $250, invest a couple of thousand dollars in computers and software and enter into an agreement to turn over fishing data and not to share the information outside the network, which now includes 85 fishermen.
Skippers say the project makes them more efficient, better able to find the sort of fish they want and avoid squandering time on lost or tangled gear. It also means fewer hooks in the water and fewer hours at sea to catch the same amount of fish….(More)”
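For readers curious about the mechanics, the following is a minimal Python sketch of the kind of grid aggregation such crowdsourced “wikimaps” imply: pooled (latitude, longitude, depth) readings from many boats are binned into shared cells and averaged. The cell size and readings are invented for illustration; ALFA’s actual system is certainly more sophisticated.

```python
# Illustrative sketch, not ALFA's actual system: it shows how pooled transducer
# readings (lat, lon, depth) from many boats can be binned into a shared grid map.
from collections import defaultdict

CELL = 0.001  # grid cell size in degrees (roughly 100 m of latitude); assumed value

def cell_key(lat, lon):
    """Snap a position to its grid cell."""
    return (round(lat / CELL) * CELL, round(lon / CELL) * CELL)

def build_wikimap(soundings):
    """Average all depth readings that fall in the same cell."""
    cells = defaultdict(list)
    for lat, lon, depth_m in soundings:
        cells[cell_key(lat, lon)].append(depth_m)
    return {key: sum(d) / len(d) for key, d in cells.items()}

# Hypothetical readings contributed by two boats passing over the same ground.
boat_a = [(57.0521, -135.3302, 88.0), (57.0522, -135.3304, 90.5)]
boat_b = [(57.0521, -135.3303, 87.0), (57.0609, -135.3410, 142.0)]

wikimap = build_wikimap(boat_a + boat_b)
for (lat, lon), depth in sorted(wikimap.items()):
    print(f"cell ({lat:.3f}, {lon:.3f}): mean depth {depth:.1f} m")
```

Averaging per cell is just one possible choice; the value described above comes from many boats contributing repeated passes over the same ground, so each cell’s estimate keeps improving as the network grows.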
The GovLab: “As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. In this edition, we explore the literature on Data and Humanitarian Response. To suggest additional readings on this or any other topic, please email biblio@thegovlab.org. All our Selected Readings can be found here.
Context
Data, when used well and in a trusted manner, allows humanitarian organizations to innovate how they respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations, to respond to emergencies better and more quickly. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, which poses a number of risks related to the security of victims’ data. Furthermore, complex power dynamics that exist within humanitarian spaces may be further exacerbated through the introduction of new technologies and big data collection mechanisms. In the below we share:
- Selected Reading List (summaries and hyperlinks)
- Annotated Selected Reading List
- Additional Readings….(More)”
Lawrence H. Summers in the Washington Post: “I spoke at a World Bank conference on price statistics. … I am convinced that data is the ultimate public good and that we will soon have much more data than we do today. I made four primary observations.
First, scientific progress is driven more by new tools and new observations than by hypothesis construction and testing. I cited a number of examples: the observation that Jupiter was orbited by several moons clinched the case against the Ptolemaic system, the belief that all celestial objects circle around the Earth. We learned of cells by seeing them when the microscope was constructed. Accelerators made the basic structure of atoms obvious.
Second, if mathematics is the queen of the hard sciences then statistics is the queen of the social sciences. I gave examples of the power of very simple data analysis. We first learned that exercise is good for health from the observation that, in the 1940s, London bus conductors had much lower death rates than bus drivers. Similarly, data demonstrated that smoking was a major killer decades before the biological processes were understood. At a more trivial level, “Moneyball” shows how data-based statistics can revolutionize a major sport.
Third, I urged that what “you count counts” and argued that we needed much more timely and complete data. I noted the centrality of timely statistics to meaningful progress toward Sustainable Development Goals. In comparison to the nearly six-year lag in poverty statistics, it took the United States only about 3½ years to win World War II.
Fourth, I envisioned what might be possible in a world where there will soon be as many smartphones as adults. With the ubiquitous ability to collect data and nearly unlimited ability to process it will come more capacity to discover previously unknown relationships. We will improve our ability to predict disasters like famines, storms and revolutions. Communication technologies will allow us to better hold policymakers to account with reliable and rapid performance measures. And if history is any guide, we will gain capacities on dimensions we cannot now imagine but will come to regard as indispensable.
This is the work of both governments and the private sector. It is fantasy to suppose data, as the ultimate public good, will come into being without government effort. Equally, we will sell ourselves short if we stick with traditional collection methods and ignore innovative providers and methods such as the use of smartphones, drones, satellites and supercomputers. That is why something like the Billion Prices Project at MIT, which can provide daily price information, is so important. That is why I am excited to be a director and involved with Premise — a data company that analyzes information people collect on their smartphones about everyday life, like the price of local foods — in its capacity to mobilize these technologies as widely as possible. That is why Planet Labs, with its capacity to scan and monitor environmental conditions, represents such a profound innovation….(More)”
Book edited by Jennifer Howard-Grenville, Claus Rerup, Ann Langley, and Haridimos Tsoukas: “Over the past 15 years, organizational routines have been increasingly investigated from a process perspective to challenge the idea that routines are stable entities that are mindlessly enacted.
A process perspective explores how routines are performed by specific people in specific settings. It shows how action, improvisation, and novelty are part of routine performances. It also departs from a view of routines as “black boxes” that transform inputs into organizational outputs and places attention on the actual actions and patterns that comprise routines. Routines are both effortful accomplishments, in that it takes effort to perform, sustain, or change them, and emergent accomplishments, because sometimes the effort to perform routines leads to unforeseen change.
While a process perspective has enabled scholars to open up the “black box” of routines and explore their actions and patterns in fine-grained, dynamic ways, there is much more work to be done. Chapters in this volume make considerable progress, through the three main themes expressed across these chapters. These are: Zooming out to understand routines in larger contexts; Zooming in to reveal actor dispositions and skill; and Innovation, creativity and routines in ambiguous contexts….(More)”
