Kevin Roose at The New York Times: “One of the fiercest debates in Silicon Valley right now is about who should control A.I., and who should make the rules that powerful artificial intelligence systems must follow.
Should A.I. be governed by a handful of companies that try their best to make their systems as safe and harmless as possible? Should regulators and politicians step in and build their own guardrails? Or should A.I. models be made open-source and given away freely, so users and developers can choose their own rules?
A new experiment by Anthropic, the maker of the chatbot Claude, offers a quirky middle path: What if an A.I. company let a group of ordinary citizens write some rules, and trained a chatbot to follow them?
The experiment, known as “Collective Constitutional A.I.,” builds on Anthropic’s earlier work on Constitutional A.I., a way of training large language models that relies on a written set of principles. It is meant to give a chatbot clear instructions for how to handle sensitive requests, what topics are off-limits and how to act in line with human values.
If Collective Constitutional A.I. works — and Anthropic’s researchers believe there are signs that it might — it could inspire other experiments in A.I. governance, and give A.I. companies more ideas for how to invite outsiders to take part in their rule-making processes.
That would be a good thing. Right now, the rules for powerful A.I. systems are set by a tiny group of industry insiders, who decide how their models should behave based on some combination of their personal ethics, commercial incentives and external pressure. There are no checks on that power, and there is no way for ordinary users to weigh in.
Opening up A.I. governance could increase society’s comfort with these tools, and give regulators more confidence that they’re being skillfully steered. It could also prevent some of the problems of the social media boom of the 2010s, when a handful of Silicon Valley titans ended up controlling vast swaths of online speech.
In a nutshell, Constitutional A.I. works by using a written set of rules (a “constitution”) to police the behavior of an A.I. model. The first version of Claude’s constitution borrowed rules from other authoritative documents, including the United Nations’ Universal Declaration of Human Rights and Apple’s terms of service…”