How The New York Times incorporates editorial judgment in algorithms to curate its home page


Article by Zhen Yang: “Whether on the web or the app, the home page of The New York Times is a crucial gateway, setting the stage for readers’ experiences and guiding them to the most important news of the day. The Times publishes over 250 stories daily, far more than the 50 to 60 stories that can be featured on the home page at a given time. Traditionally, editors have manually selected and programmed which stories appear, when and where, multiple times daily. This manual process presents challenges:

  • How can we provide readers a relevant, useful, and fresh experience each time they visit the home page?
  • How can we make our editorial curation process more efficient and scalable?
  • How do we maximize the reach of each story and expose more stories to our readers?

To address these challenges, the Times has been actively developing and testing editorially driven algorithms to assist in curating home page content. These algorithms are editorially driven in that a human editor’s judgment or input is incorporated into every aspect of the algorithm — including deciding where on the home page the stories are placed, informing the rankings, and potentially influencing and overriding algorithmic outputs when necessary. From the get-go, we’ve designed algorithmic programming to elevate human curation, not to replace it…

The Times began using algorithms for content recommendations in 2011 but only recently started applying them to home page modules. For years, we only had one algorithmically-powered module, “Smarter Living,” on the home page, and later, “Popular in The Times.” Both were positioned relatively low on the page.

Three years ago, the formation of a cross-functional team — including newsroom editors, product managers, data scientists, data analysts, and engineers — brought the momentum needed to advance our responsible use of algorithms. Today, nearly half of the home page is programmed with assistance from algorithms that help promote news, features, and sub-brand content, such as The Athletic and Wirecutter. Some of these modules, such as the features module located at the top right of the home page on the web version, are in highly visible locations. During major news moments, editors can also deploy algorithmic modules to display additional coverage to complement a main module of stories near the top of the page. (The topmost news package of Figure 1 is an example of this in action.)…(More)”

How is editorial judgment incorporated into algorithmic programming?

From Bits to Biology: A New Era of Biological Renaissance powered by AI


Article by Milad Alucozai: “…A new wave of platforms is emerging to address these limitations. Designed with the modern scientist in mind, these platforms prioritize intuitive interfaces, enabling researchers with diverse computational backgrounds to easily navigate and analyze data. They emphasize collaboration, allowing teams to share data and insights seamlessly. And they increasingly incorporate artificial intelligence, offering powerful tools for accelerating analysis and discovery. This shift marks a move towards more user-centric, efficient, and collaborative computational biology, empowering researchers to tackle increasingly complex biological questions. 

Emerging Platforms: 

  • Seqera Labs: Spearheading a movement towards efficient and reproducible research, Seqera Labs provides a suite of tools, including the popular open-source workflow language Nextflow. Their platform empowers researchers to design scalable and reproducible data analysis pipelines, particularly for cloud environments. Seqera streamlines complex computational workflows across diverse biological disciplines by emphasizing automation and flexibility, making data-intensive research scalable, flexible, and collaborative. 
  • Form Bio: Aimed at democratizing access to computational biology, Form Bio provides a comprehensive tech suite built to enable accelerated cell and gene therapy development and computational biology at scale. Its emphasis on collaboration and intuitive design fosters a more inclusive research environment to help organizations streamline therapeutic development and reduce time-to-market.  
  • Code Ocean: Addressing the critical need for reproducibility in research, Code Ocean provides a unique platform for sharing and executing research code, data, and computational environments. By encapsulating these elements in a portable and reproducible format, Code Ocean promotes transparency and facilitates the reuse of research methods, ultimately accelerating scientific discovery. 
  • Pluto Biosciences: Championing a collaborative approach to biological discovery, Pluto Biosciences offers an interactive platform for visualizing and analyzing complex biological data. Its intuitive tools empower researchers to explore data, generate insights, and seamlessly share findings with collaborators. This fosters a more dynamic and interactive research process, facilitating knowledge sharing and accelerating breakthroughs. 

 Open Source Platform: 

  • Galaxy: A widely used open-source platform for bioinformatics analysis. It provides a user-friendly web interface and a vast collection of tools for various tasks, from sequence analysis to data visualization. Its open-source nature fosters community development and customization, making it a versatile tool for diverse research needs. 
  • Bioconductor is a prominent open-source platform for bioinformatics analysis, akin to Galaxy’s commitment to accessibility and community-driven development. It leverages the power of the R programming language, providing a wealth of packages for tasks ranging from genomic data analysis to statistical modeling. Its open-source nature fosters a collaborative environment where researchers can freely access, utilize, and contribute to a growing collection of tools…(More)”

Who Owns AI?


Paper by Amy Whitaker: “While artificial intelligence (AI) stands to transform artistic practice and creative industries, little has been theorized about who owns AI for creative work. Lawsuits brought against AI companies such as OpenAI and Meta under copyright law invite novel reconsideration of the value of creative work. This paper synthesizes across copyright, hybrid practice, and cooperative governance to work toward collective ownership and decision-making. This paper adds to research in arts entrepreneurship because copyright and shared value is so vital to the livelihood of working artists, including writers, filmmakers, and others in the creative industries. Sarah Silverman’s lawsuit against OpenAI is used as the main case study. The conceptual framework of material and machine, one and many, offers a lens onto value creation and shared ownership of AI. The framework includes a reinterpretation of the fourth factor of fair use under U.S. copyright law to refocus on the doctrinal language of value. AI uses the entirety of creative work in a way that is overlooked because of the small scale of one whole work relative to the overall size of the AI model. Yet a theory of value for creative work gives it dignity in its smallness, the way that one vote still has dignity in a national election of millions. As we navigate these frontiers of AI, experimental models pioneered by artists may be instructive far outside the arts…(More)”.

The Deletion Remedy


Paper by Daniel Wilf-Townsend: “A new remedy has emerged in the world of technology governance. Where someone has wrongfully obtained or used data, this remedy requires them to delete not only that data, but also to delete tools such as machine learning models that they have created using the data. Model deletion, also called algorithmic disgorgement or algorithmic destruction, has been increasingly sought in both private litigation and public enforcement actions. As its proponents note, model deletion can improve the regulation of privacy, intellectual property, and artificial intelligence by providing more effective deterrence and better management of ongoing harms

But, this article argues, model deletion has a serious flaw. In its current form, it has the possibility of being a grossly disproportionate penalty. Model deletion requires the destruction of models whose training included illicit data in any degree, with no consideration of how much (or even whether) that data contributed to any wrongful gains or ongoing harms. Model deletion could thereby cause unjust losses in litigation and chill useful technologies.

This article works toward a well-balanced doctrine of model deletion by building on the remedy’s equitable origins. It identifies how traditional considerations in equity—such as a defendant’s knowledge and culpability, the balance of the hardships, and the availability of more tailored alternatives—can be applied in model deletion cases to mitigate problems of disproportionality. By accounting for proportionality, courts and agencies can develop a doctrine of model deletion that takes advantage of its benefits while limiting its potential excesses…(More)”.

Rethinking ‘Checks and Balances’ for the A.I. Age


Article by Steve Lohr: “A new project, orchestrated by Stanford University and published on Tuesday, is inspired by the Federalist Papers and contends that today is a broadly similar historical moment of economic and political upheaval that calls for a rethinking of society’s institutional arrangements.

In an introduction to its collection of 12 essays, called the Digitalist Papers, the editors overseeing the project, including Erik Brynjolfsson, director of the Stanford Digital Economy Lab, and Condoleezza Rice, secretary of state in the George W. Bush administration and director of the Hoover Institution, identify their overarching concern.

“A powerful new technology, artificial intelligence,” they write, “explodes onto the scene and threatens to transform, for better or worse, all legacy social institutions.”

The most common theme in the diverse collection of essays: Citizens need to be more involved in determining how to regulate and incorporate A.I. into their lives. “To build A.I. for the people, with the people,” as one essay summed it up.

The project is being published as the technology is racing ahead. A.I. enthusiasts see a future of higher economic growth, increased prosperity and a faster pace of scientific discovery. But the technology is also raising fears of a dystopian alternative — A.I. chatbots and automated software not only replacing millions of workers, but also generating limitless misinformation and worsening political polarization. How to govern and guide A.I. in the public interest remains an open question…(More)”.

Improving Governance Outcomes Through AI Documentation: Bridging Theory and Practice 


Report by Amy Winecoff, and Miranda Bogen: “AI documentation is a foundational tool for governing AI systems, via both stakeholders within and outside AI organizations. It offers a range of stakeholders insight into how AI systems are developed, how they function, and what risks they may pose. For example, it might help internal model development, governance, compliance, and quality assurance teams communicate about and manage risk throughout the development and deployment lifecycle. Documentation can also help external technology developers determine what testing they should perform on models they incorporate into their products, or it could guide users on whether or not to adopt a technology. While documentation is essential for effective AI governance, its success depends on how well organizations tailor their documentation approaches to meet the diverse needs of stakeholders, including technical teams, policymakers, users, and other downstream consumers of the documentation.

This report synthesizes findings from an in-depth analysis of academic and gray literature on documentation, encompassing 37 proposed methods for documenting AI data, models, systems, and processes, along with 21 empirical studies evaluating the impact and challenges of implementing documentation. Through this synthesis, we identify key theoretical mechanisms through which AI documentation can enhance governance outcomes. These mechanisms include informing stakeholders about the intended use, limitations, and risks of AI systems; facilitating cross-functional collaboration by bridging different teams; prompting ethical reflection among developers; and reinforcing best practices in development and governance. However, empirical evidence offers mixed support for these mechanisms, indicating that documentation practices can be more effectively designed to achieve these goals…(More)”.

AI in Global Development Playbook


USAID Playbook: “…When used effectively and responsibly, AI holds the potential to accelerate progress on sustainable development and close digital divides, but it also poses risks that could further impede progress toward these goals. With the right enabling environment and ecosystem of actors, AI can enhance efficiency and accelerate development outcomes in sectors such as health, education, agriculture, energy, manufacturing, and delivering public services. The United States aims to ensure that the benefits of AI are shared equitably across the globe.

Distilled from consultations with hundreds of government officials, non-governmental organizations, technology firms and startups, and individuals from around the world, the AI in Global Development Playbook is a roadmap to develop the capacity, ecosystems, frameworks, partnerships, applications, and institutions to leverage safe, secure, and trustworthy AI for sustainable development.

The United States’ current efforts are grounded in the belief that AI, when developed and deployed responsibly, can be a powerful force for achieving the Sustainable Development Goals and addressing some of the world’s most urgent challenges. Looking ahead, the United States will continue to support low- and middle-income countries through funding, advocacy, and convening efforts–collectively navigating the complexities of the digital age and working toward a future in which the benefits of technological development are widely shared.

This Playbook seeks to underscore AI as a uniquely global opportunity with far-reaching impacts and potential risks. It highlights that safe, secure, and trustworthy design, deployment, and use of AI is not only possible but essential. Recognizing that international cooperation and multi-stakeholder partnerships are key in achieving progress, we invite others to contribute their expertise, resources, and perspectives to enrich and expand this framework.

The true measure of progress in responsible AI is not in the sophistication of our machines but in the quality of life the technology enhances. Together we can work toward ensuring the promise of AI is realized in service of this goal…(More)”

Artificial intelligence (AI) in action: A preliminary review of AI use for democracy support


Policy paper by Grahm Tuohy-Gaydos: “…provides a working definition of AI for Westminster Foundation for Democracy (WFD) and the broader democracy support sector. It then provides a preliminary review of how AI is being used to enhance democratic practices worldwide, focusing on several themes including: accountability and transparency, elections, environmental democracy, inclusion, openness and participation, and women’s political leadership. The paper also highlights potential risks and areas of development in the future. Finally, the paper shares five recommendations for WFD and democracy support organisations to consider advancing their ‘digital democracy’ agenda. This policy paper also offers additional information regarding AI classification and other resources for identifying good practice and innovative solutions. Its findings may be relevant to WFD staff members, international development practitioners, civil society organisations, and persons interested in using emerging technologies within governmental settings…(More)”.

China’s biggest AI model is challenging American dominance


Article by Sam Eifling: “So far, the AI boom has been dominated by U.S. companies like OpenAI, Google, and Meta. In recent months, though, a new name has been popping up on benchmarking lists: Alibaba’s Qwen. Over the past few months, variants of Qwen have been topping the leaderboards of sites that measure an AI model’s performance.

“Qwen 72B is the king, and Chinese models are dominating,” Hugging Face CEO Clem Delangue wrote in June, after a Qwen-based model first rose to the top of his company’s Open LLM leaderboard.

It’s a surprising turnaround for the Chinese AI industry, which many thought was doomed by semiconductor restrictions and limitations on computing power. Qwen’s success is showing that China can compete with the world’s best AI models — raising serious questions about how long U.S. companies will continue to dominate the field. And by focusing on capabilities like language support, Qwen is breaking new ground on what an AI model can do — and who it can be built for.

Those capabilities have come as a surprise to many developers, even those working on Qwen itself. AI developer David Ng used Qwen to build the model that topped the Open LLM leaderboard. He’s built models using Meta and Google’s technology also but says Alibaba’s gave him the best results. “For some reason, it works best on the Chinese models,” he told Rest of World. “I don’t know why.”..(More)”

Synthetic Data and Social Science Research


Paper by Jordan C. Stanley & Evan S. Totty: “Synthetic microdata – data retaining the structure of original microdata while replacing original values with modeled values for the sake of privacy – presents an opportunity to increase access to useful microdata for data users while meeting the privacy and confidentiality requirements for data providers. Synthetic data could be sufficient for many purposes, but lingering accuracy concerns could be addressed with a validation system through which the data providers run the external researcher’s code on the internal data and share cleared output with the researcher. The U.S. Census Bureau has experience running such systems. In this chapter, we first describe the role of synthetic data within a tiered data access system and the importance of synthetic data accuracy in achieving a viable synthetic data product. Next, we review results from a recent set of empirical analyses we conducted to assess accuracy in the Survey of Income & Program Participation (SIPP) Synthetic Beta (SSB), a Census Bureau product that made linked survey-administrative data publicly available. Given this analysis and our experience working on the SSB project, we conclude with thoughts and questions regarding future implementations of synthetic data with validation…(More)”