How artificial intelligence can facilitate investigative journalism

Article by Luiz Fernando Toledo: “A few years ago, I worked on a project for a large Brazilian television channel whose objective was to analyze the profiles of more than 250 guardianship counselors in the city of São Paulo. These elected professionals have the mission of protecting the rights of children and adolescents in Brazil.

Critics had pointed out that some counselors did not have any expertise or prior experience working with young people and were only elected with the support of religious communities. The investigation sought to verify whether these elected counselors had professional training in working with children and adolescents or had any relationships with churches.

After requesting the counselors’ resumes through Brazil’s access to information law, a small team combed through each resume in depth, a laborious and time-consuming task. But today, this project might have required far less time and labor. Rapid developments in generative AI hold the potential to significantly scale both access to and analysis of the data needed for investigative journalism.

Many articles address the potential risks of generative AI for journalism and democracy, such as the threat AI poses to journalism’s business model and the way it can facilitate the creation and spread of mis- and disinformation. No doubt there is cause for concern. But technology will continue to evolve, and it is up to journalists and researchers to understand how to use it in favor of the public interest.

I wanted to test how generative AI can help journalists, especially those who work with public documents and data. I tested several tools, including Ask Your PDF (ask questions about any document on your computer), Chatbase (create your own chatbot), and Document Cloud (upload documents and ask GPT-like questions of hundreds of documents simultaneously).

These tools make use of the same mechanism that operates OpenAI’s famous ChatGPT—large language models that create human-like text. But they analyze the user’s own documents rather than information on the internet, ensuring more accurate answers by using specific, user-provided sources…(More)”.
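
The idea of answering from the user's own documents rather than from the internet can be illustrated with a minimal Python sketch. It is not how any of the tools named above are actually implemented; the article does not describe their internals. The function names, the word-overlap scoring, and the two sample resumes are all hypothetical placeholders. The sketch only picks the user-provided passages that share words with a question and places them in a prompt that a language model could then be asked to answer from.

```python
import re
from collections import Counter

# Hypothetical placeholder documents, invented for illustration only.
SAMPLE_DOCS = {
    "resume_a.txt": "Worked ten years as a social worker with children and adolescents.",
    "resume_b.txt": "Managed a retail store and volunteered at a local church.",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep words of three or more characters."""
    return [w for w in re.findall(r"\w+", text.lower()) if len(w) >= 3]


def score(question: str, passage: str) -> int:
    """Count how often the question's words appear in the passage."""
    question_terms = set(tokenize(question))
    counts = Counter(tokenize(passage))
    return sum(counts[term] for term in question_terms)


def top_passages(question: str, documents: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return up to k (name, text) pairs that share at least one word with the question."""
    scored = [(score(question, text), name, text) for name, text in documents.items()]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, text) for _, name, text in scored[:k]]


def build_prompt(question: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble a prompt that restricts the answer to the selected passages."""
    sources = "\n".join(f"[{name}] {text}" for name, text in top_passages(question, documents, k))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


QUESTION = "Which counselors mention a church?"
print(build_prompt(QUESTION, SAMPLE_DOCS))
```

Real tools would likely use far more capable retrieval than word overlap, and a journalist would still need to check any generated answer against the cited source document.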