AI By the People, For the People


Article by Billy Perrigo/Karnataka: “…To create an effective English-speaking AI, it is enough to simply collect data from where it has already accumulated. But for languages like Kannada, you need to go out and find more.

This has created huge demand for datasets—collections of text or voice data—in languages spoken by some of the poorest people in the world. Part of that demand comes from tech companies seeking to build out their AI tools. Another big chunk comes from academia and governments, especially in India, where English and Hindi have long held outsize precedence in a nation of some 1.4 billion people with 22 official languages and at least 780 more indigenous ones. This rising demand means that hundreds of millions of Indians are suddenly in control of a scarce and newly valuable asset: their mother tongue.

Data work—creating or refining the raw material at the heart of AI— is not new in India. The economy that did so much to turn call centers and garment factories into engines of productivity at the end of the 20th century has quietly been doing the same with data work in the 21st. And, like its predecessors, the industry is once again dominated by labor arbitrage companies, which pay wages close to the legal minimum even as they sell data to foreign clients for a hefty mark-up. The AI data sector, worth over $2 billion globally in 2022, is projected to rise in value to $17 billion by 2030. Little of that money has flowed down to data workers in India, Kenya, and the Philippines.

These conditions may cause harms far beyond the lives of individual workers. “We’re talking about systems that are impacting our whole society, and workers who make those systems more reliable and less biased,” says Jonas Valente, an expert in digital work platforms at Oxford University’s Internet Institute. “If you have workers with basic rights who are more empowered, I believe that the outcome—the technological system—will have a better quality as well.”

In the neighboring villages of Alahalli and Chilukavadi, one Indian startup is testing a new model. Chandrika works for Karya, a nonprofit launched in 2021 in Bengaluru (formerly Bangalore) that bills itself as “the world’s first ethical data company.” Like its competitors, it sells data to big tech companies and other clients at the market rate. But instead of keeping much of that cash as profit, it covers its costs and funnels the rest toward the rural poor in India. (Karya partners with local NGOs to ensure that access to its jobs goes first to the poorest of the poor, as well as to historically marginalized communities.) In addition to its $5 hourly minimum, Karya gives workers de facto ownership of the data they create on the job, so whenever it is resold, the workers receive the proceeds on top of their past wages. It’s a model that doesn’t exist anywhere else in the industry…(More)”.

Corporate Responsibility in the Age of AI


Essay by Maria Eitel: “In the past year, a cacophony of conversations about artificial intelligence has erupted. Depending on whom you listen to, AI is either carrying us into a shiny new world of endless possibilities or propelling us toward a grim dystopia. Call them the Barbie and Oppenheimer scenarios – as attention-grabbing and different as the Hollywood blockbusters of the summer. But one conversation is getting far too little attention: the one about corporate responsibility.

I joined Nike as its first Vice President of Corporate Responsibility in 1998, landing right in the middle of the hyper-globalization era’s biggest corporate crisis: the iconic sports and fitness company had become the face of labor exploitation in developing countries. In dealing with that crisis and setting up corporate responsibility for Nike, we learned hard-earned lessons, which can now help guide our efforts to navigate the AI revolution.

There is a key difference today. Taking place in the late 1990s, the Nike drama played out relatively slowly. When it comes to AI, however, we don’t have the luxury of time. This time last year, most people had not heard about generative AI. The technology entered our collective awareness like a lightning strike in late 2022, and we have been trying to make sense of it ever since…

Our collective future now hinges on whether companies – in the privacy of their board rooms, executive meetings, and closed-door strategy sessions – decide to do what is right. Companies need a clear North Star to which they can always refer as they pursue innovation. Google had it right in its early days, when its corporate credo was, “Don’t Be Evil.” No corporation should knowingly harm people in the pursuit of profit.

It will not be enough for companies simply to say that they have hired former regulators and propose possible solutions. Companies must devise credible and effective AI action plans that answer five key questions:

  • What are the potential unanticipated consequences of AI?
  • How are you mitigating each identified risk?
  • What measures can regulators use to monitor companies’ efforts to mitigate potential dangers and hold them accountable?
  • What resources do regulators need to carry out this task?
  • How will we know that the guardrails are working?

The AI challenge needs to be treated like any other corporate sprint. Requiring companies to commit to an action plan in 90 days is reasonable and realistic. No excuses. Missed deadlines should result in painful fines. The plan doesn’t have to be perfect – and it will likely need to be adapted as we continue to learn – but committing to it is essential…(More)”.

Journalism Is a Public Good and Should Be Publicly Funded


Essay by Patrick Walters: “News deserts” have proliferated across the U.S. Half of the nation’s more than 3,140 counties now have only one newspaper—and nearly 200 of them have no paper at all. Of the publications that survive, researchers have found many are “ghosts” of their former selves.

Journalism has problems nationally: CNN announced hundreds of layoffs at the end of 2022, and National Geographic laid off the last of its staff writers this June. That same month, the Los Angeles Times cut 13 percent of its newsroom staff. But the crisis is even more acute at the local level, with jobs in local news plunging from 71,000 in 2008 to 31,000 in 2020. Closures and cutbacks often leave people without reliable sources that can provide them with what the American Press Institute has described as “the information they need to make the best possible decisions about their daily lives.”

Americans need to understand that journalism is a vital public good—one that, like roads, bridges and schools, is worthy of taxpayer support. We are already seeing the disastrous effects of leaving news to disintegrate in the free market: namely, a steady supply of misinformation, often masquerading as legitimate news, and too many communities left without a quality source of local news. Former New York Times public editor Margaret Sullivan has called this a “crisis of American democracy.”

The terms “crisis” and “collapse” have become nearly ubiquitous in the past decade when describing the state of American journalism, which has been based on a for-profit commercial model since the rise of the “penny press” in the 1830s. Now that commercial model has collapsed amid the near disappearance of print advertising. Digital ads have not come close to closing the gap because Google and other platforms have “hoovered up everything,” as Emily Bell, founding director of the Tow Center for Digital Journalism at Columbia University, told the Nieman Journalism Lab in a 2018 interview. In June the newspaper chain Gannett sued Google’s parent company, alleging it has created an advertising monopoly that has devastated the news industry.

Other journalism models—including nonprofits such as MinnPost, collaborative efforts such as Broke in Philly and citizen journalism—have had some success in fulfilling what Lewis Friedland of the University of Wisconsin–Madison called “critical community information needs” in a chapter of the 2016 book The Communication Crisis in America, and How to Fix It. Friedland classified those needs as falling into eight areas: emergencies and risks, health and welfare, education, transportation, economic opportunities, the environment, civic information and political information. Nevertheless, these models have proven incapable of fully filling the void, as shown by the dearth of quality information during the early years of the COVID pandemic. Scholar Michelle Ferrier and others have worked to bring attention to how news deserts leave many rural and urban areas “impoverished by the lack of fresh, daily local news and information,” as Ferrier wrote in a 2018 article. A recent study also found evidence that U.S. judicial districts with lower newspaper circulation were likely to see fewer public corruption prosecutions.

A growing chorus of voices is now calling for government-funded journalism, a model that many in the profession have long seen as problematic…(More)”.

Why This AI Moment May Be the Real Deal


Essay by Ari Schulman: “For many years, those in the know in the tech world have known that “artificial intelligence” is a scam. It’s been true for so long in Silicon Valley that it was true before there even was a Silicon Valley.

That’s not to say that AI hadn’t done impressive things, solved real problems, generated real wealth and worthy endowed professorships. But peek under the hood of Tesla’s “Autopilot” mode and you would find odd glitches, frustrated promise, and, well, still quite a lot of people hidden away in backrooms manually plugging gaps in the system, often in real time. Study Deep Blue’s 1997 defeat of world chess champion Garry Kasparov, and your excitement about how quickly this technology would take over other cognitive work would wane as you learned just how much brute human force went into fine-tuning the software specifically to beat Kasparov. Read press release after press release of Facebook, Twitter, and YouTube promising to use more machine learning to fight hate speech and save democracy — and then find out that the new thing was mostly a handmaid to armies of human grunts, and for many years relied on a technological paradigm that was decades old.

Call it AI’s man-behind-the-curtain effect: What appear at first to be dazzling new achievements in artificial intelligence routinely lose their luster and seem limited, one-off, jerry-rigged, with nothing all that impressive happening behind the scenes aside from sweat and tears, certainly nothing that deserves the name “intelligence” even by loose analogy.

So what’s different now? What follows in this essay is an attempt to contrast some of the most notable features of the new transformer paradigm (the T in ChatGPT) with what came before. It is an attempt to articulate why the new AIs that have garnered so much attention over the past year seem to defy some of the major lines of skepticism that have rightly applied to past eras — why this AI moment might, just might, be the real deal…(More)”.

Wikipedia’s Moment of Truth


Article by Jon Gertner at the New York Times: “In early 2021, a Wikipedia editor peered into the future and saw what looked like a funnel cloud on the horizon: the rise of GPT-3, a precursor to the new chatbots from OpenAI. When this editor — a prolific Wikipedian who goes by the handle Barkeep49 on the site — gave the new technology a try, he could see that it was untrustworthy. The bot would readily mix fictional elements (a false name, a false academic citation) into otherwise factual and coherent answers. But he had no doubts about its potential. “I think A.I.’s day of writing a high-quality encyclopedia is coming sooner rather than later,” he wrote in “Death of Wikipedia,” an essay that he posted under his handle on Wikipedia itself. He speculated that a computerized model could, in time, displace his beloved website and its human editors, just as Wikipedia had supplanted the Encyclopaedia Britannica, which in 2012 announced it was discontinuing its print publication.

Recently, when I asked this editor — he asked me to withhold his name because Wikipedia editors can be the targets of abuse — if he still worried about his encyclopedia’s fate, he told me that the newer versions made him more convinced that ChatGPT was a threat. “It wouldn’t surprise me if things are fine for the next three years,” he said of Wikipedia, “and then, all of a sudden, in Year 4 or 5, things drop off a cliff”…(More)”.

‘Not for Machines to Harvest’: Data Revolts Break Out Against A.I.


Article by Sheera Frenkel and Stuart A. Thompson: “Fan fiction writers are just one group now staging revolts against A.I. systems as a fever over the technology has gripped Silicon Valley and the world. In recent months, social media companies such as Reddit and Twitter, news organizations including The New York Times and NBC News, authors such as Paul Tremblay and the actress Sarah Silverman have all taken a position against A.I. sucking up their data without permission.

Their protests have taken different forms. Writers and artists are locking their files to protect their work or are boycotting certain websites that publish A.I.-generated content, while companies like Reddit want to charge for access to their data. At least 10 lawsuits have been filed this year against A.I. companies, accusing them of training their systems on artists’ creative work without consent. This past week, Ms. Silverman and the authors Christopher Golden and Richard Kadrey sued OpenAI, the maker of ChatGPT, and others over A.I.’s use of their work.

At the heart of the rebellions is a newfound understanding that online information — stories, artwork, news articles, message board posts and photos — may have significant untapped value.

The new wave of A.I. — known as “generative A.I.” for the text, images and other content it generates — is built atop complex systems such as large language models, which are capable of producing humanlike prose. These models are trained on hoards of all kinds of data so they can answer people’s questions, mimic writing styles or churn out comedy and poetry.

That has set off a hunt by tech companies for even more data to feed their A.I. systems. Google, Meta and OpenAI have essentially used information from all over the internet, including large databases of fan fiction, troves of news articles and collections of books, much of which was available free online. In tech industry parlance, this was known as “scraping” the internet…(More)”.

Russia Is Trying to Leave the Internet and Build Its Own


Article by Timmy Broderick: “Last week the Russian government tried to disconnect its Internet infrastructure from the larger global Web. This test of Russia’s “sovereign Internet” seemingly failed, causing outages that suggest the system is not ready for practical use.

“Sovereign Internet is not really a whole different Internet; it is more like a project that uses various tools,” says Natalia Krapiva, tech-legal counsel at the international digital-rights nonprofit Access Now. “It involves technology like deep packet inspection, which allows major filtering of the Internet and gives governments the ability to throttle certain connections and websites.” By cutting off access to sites such as Western social media platforms, the Russian government could restrict residents from viewing any source of information other than the country’s accepted channels of influence.

This method of curtailing digital freedom goes beyond Russia: other countries are also attempting to develop their own nationwide Internet. And if successful, these endeavors could fragment the World Wide Web. Scientific American talked with Krapiva over Zoom about the implications of this latest test, the motive behind Russia’s actions and the ways the push for a sovereign Internet affects the digital rights of all users…(More)”.

Digital divides are lower in Smart Cities


Paper by Andrea Caragliu and Chiara F. Del Bo: “Ever since the emergence of digital technologies in the early 1990s, the literature has discussed the potential pitfalls of an uneven distribution of e-skills under the umbrella of the digital divide. To provide a definition of the concept, “Lloyd Morrisett coined the term digital divide to mean ‘a discrepancy in access to technology resources between socioeconomic groups’” (Roblyer and Doering, 2014, p. 27).

Despite the digital divide being high on the policy agenda, statistics suggest the persisting relevance of this issue. For instance, focusing on Europe, according to EUROSTAT statistics, in 2021 about 90 per cent of people living in Zeeland, a NUTS2 region in the Netherlands, had at least once ordered goods or services over the internet for private use, against a minimum in the EU27 of 15 per cent (in the region of Yugoiztochen, in Bulgaria). In the same year, while virtually all (99 per cent) interviewees in the NUTS2 region of Northern and Western Ireland declared using the internet at least once a week, the same statistic drops to two thirds of the sample in the Bulgarian region of Severozapaden. While over time these territorial divides are converging, they can still significantly affect the potential positive impact of the diffusion of digital technologies.

Over the past three years, the digital divide has been made dramatically apparent by the COVID-19 pandemic outbreak. When, during the first waves of full lockdowns enacted in most countries, tertiary and schooling activities were moved online, many economic outcomes worsened significantly. Among these, pupils’ learning outcomes and service sectors’ productivity were particularly affected.

A simultaneous development in the scientific literature has discussed the attractive features of planning and managing cities ‘smartly’. Smart Cities were initially identified as urban areas with a tendency to invest in and deploy ICTs. More recently, this notion also started to encompass the context characteristics that make a city capable of reaping the benefits of ICTs – social and human capital, soft and hard institutions.

While mounting empirical evidence suggests a superior economic performance of cities ticking all these boxes, the Smart City movement did not come without critiques. The debate on urban smartness as an instrument for planning and managing more efficient cities has recently posited that Smart Cities could be raising inequalities. This effect would be due to multinational corporations acting as drivers of smart urban transformations and, in a dystopian view, influencing local policymakers’ agendas.

Given these issues, and our own research on Smart Cities, we started asking ourselves whether the risks of increasing inequalities associated with the Smart City model were substantiated. To this end, we focused on empirically verifying whether cities moving forward along the smart city model were facing increases in income and digital inequalities. We answered the first question in Caragliu and Del Bo (2022), and found compelling evidence that smart city characteristics actually decrease income inequalities…(More)”.

How do we know how smart AI systems are?


Article by Melanie Mitchell: “In 1967, Marvin Minsky, a founder of the field of artificial intelligence (AI), made a bold prediction: “Within a generation…the problem of creating ‘artificial intelligence’ will be substantially solved.” Assuming that a generation is about 30 years, Minsky was clearly overoptimistic. But now, nearly two generations later, how close are we to the original goal of human-level (or greater) intelligence in machines?

Some leading AI researchers would answer that we are quite close. Earlier this year, deep-learning pioneer and Turing Award winner Geoffrey Hinton told Technology Review, “I have suddenly switched my views on whether these things are going to be more intelligent than us. I think they’re very close to it now and they will be much more intelligent than us in the future.” His fellow Turing Award winner Yoshua Bengio voiced a similar opinion in a recent blog post: “The recent advances suggest that even the future where we know how to build superintelligent AIs (smarter than humans across the board) is closer than most people expected just a year ago.”

These are extraordinary claims that, as the saying goes, require extraordinary evidence. However, it turns out that assessing the intelligence—or more concretely, the general capabilities—of AI systems is fraught with pitfalls. Anyone who has interacted with ChatGPT or other large language models knows that these systems can appear quite intelligent. They converse with us in fluent natural language, and in many cases seem to reason, to make analogies, and to grasp the motivations behind our questions. Despite their well-known unhumanlike failings, it’s hard to escape the impression that behind all that confident and articulate language there must be genuine understanding…(More)”.

To Save Society from Digital Tech, Enable Scrutiny of How Policies Are Implemented


Article by Ido Sivan-Sevilla: “…there is little discussion about how to create accountability when implementing tech policies. Decades of research exploring policy implementation across diverse areas consistently shows how successful implementation allows policies to be adapted and involves crucial bargaining. But this is rarely understood in the tech sector. For tech policies to work, those responsible for enforcement and compliance should be overseen and held to account. Otherwise, as history shows, tech policies will struggle to fulfill the intentions of their policymakers.

Scrutiny is required for three types of actors. First are regulators, who convert promising tech laws into enforcement practices but are often ill-equipped for their mission. My recent research found that across Europe, the rigor and methods of national privacy regulators tasked with enforcing the European Union’s GDPR vary greatly. The French data protection authority, for instance, proactively monitors for privacy violations and strictly sanctions companies that overstep; in contrast, Bulgarian authorities monitor passively and are hesitant to act. Reflecting on the first five years of the GDPR, Max Schrems, the chair of privacy watchdog NOYB, found authorities and courts reluctant to enforce the law, and companies free to take advantage: “It often feels like there is more energy spent in undermining the GDPR than in complying with it.” Variations in resources and technical expertise among regulators create regulatory arbitrage that the regulated eagerly exploit.

Tech companies are the second type of actor requiring scrutiny. Service providers such as Google, Meta, and Twitter, along with lesser-known technology companies, mediate digital services for billions around the world but enjoy considerable latitude on how and whether they comply with tech policies. Civil society groups, for instance, uncovered how Meta was trying to bypass the GDPR and use personal information for advertising…(More)”.