
AI and the future of policy evaluation

Blog by PUBLIC: “Across the M&E lifecycle, we are already seeing real, deployable applications of AI tools.

  • AI-assisted evidence synthesis is probably the most mature area. Tools can now search, screen, and summarise bodies of literature at a scale that would take human teams weeks. For evaluation teams scoping a new programme area, or exploring what an adjacent field can tell them about their topic, this is genuinely useful today.

A recent example is the development of systems like InsightAgent, a multi-agent framework designed for complex systematic reviews. Researchers demonstrated that this tool could partition a massive body of literature, read and synthesise findings, and draft a rigorous review in just 1.5 hours – a process that traditionally takes months to complete manually. Researchers could also visually monitor the AI's reading trajectory, adjust its inclusion criteria, and verify its sources in real time.
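The partition-then-screen pattern described above can be sketched in a few lines. This is a hypothetical illustration, not InsightAgent's actual implementation: the `screen` function stands in for an LLM screening agent, and the keyword-based inclusion criteria are invented for the example.

```python
# Minimal sketch of the partition-and-screen step in a multi-agent
# systematic review. All names and criteria here are illustrative.
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    abstract: str

def screen(paper: Paper, keywords: list[str]) -> bool:
    """Stand-in for an LLM screening agent: include a paper if its
    abstract matches any inclusion keyword."""
    text = paper.abstract.lower()
    return any(k in text for k in keywords)

def partition(papers: list[Paper], n_agents: int) -> list[list[Paper]]:
    """Split the corpus into roughly equal batches, one per agent."""
    return [papers[i::n_agents] for i in range(n_agents)]

papers = [
    Paper("A", "randomised trial of a job-training programme"),
    Paper("B", "a history of municipal archives"),
    Paper("C", "quasi-experimental evaluation of a housing subsidy"),
]
batches = partition(papers, 2)
included = [p for batch in batches for p in batch
            if screen(p, ["trial", "evaluation", "quasi-experimental"])]
print([p.title for p in included])  # → ['A', 'C']
```

In a real system the keyword match would be replaced by a model call, and the human-in-the-loop controls (adjusting criteria, verifying sources) would sit on top of this loop.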

  • AI-led qualitative interviews – including voice – have been shown to generate substantially richer responses than conventional open text fields. For public sector evaluations, the possibility of running qualitative research at a fraction of the cost is a meaningful shift. These practices are also well suited to settings with multiple layers of governance, such as evaluation framework development and qualitative evaluation of ‘unmonetisable’ outcomes, as set out in the Green Book.

For example, PUBLIC recently utilised Salomo to conduct user research for a major public sector project. Traditionally, gathering and synthesising user research at this scale would take a team of multiple researchers many months to complete. However, by leveraging Salomo’s agentic capabilities, a team of just two researchers was able to process, code, and extract insights from 100 interviews in less than a week.

  • Getting to concrete outputs and models more quickly. Analysis and reporting workflows are starting to allow evaluators to go from a research question to a documented, reproducible output – with code, findings, and visualisations – in a fraction of the time previously required.

For example, AI Scientist-V2 is a system capable of automating the scientific research lifecycle. Given a high-level prompt, the agent autonomously formulates hypotheses, writes and debugs experiment code, visualises data, and drafts a complete manuscript in under 15 hours. It also recently produced a research paper that successfully passed a double-blind peer review.
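What "documented, reproducible output" means in practice can be shown with a toy example: a seeded analysis script that produces an estimate and a draft findings note in a single run, so anyone re-running it gets identical numbers. The data and figures here are simulated, not from any real evaluation or from AI Scientist-V2.

```python
# Toy sketch of a reproducible analysis output: simulated treated and
# control outcomes, a difference-in-means estimate, and a draft note
# generated in one seeded run. All numbers are simulated.
import random
import statistics

random.seed(42)  # fixed seed so every re-run yields the same result

treated = [random.gauss(mu=5.0, sigma=1.0) for _ in range(200)]
control = [random.gauss(mu=4.5, sigma=1.0) for _ in range(200)]

effect = statistics.mean(treated) - statistics.mean(control)

report = (
    "## Findings (draft)\n"
    f"- Sample: {len(treated)} treated, {len(control)} control\n"
    f"- Estimated mean difference: {effect:.2f}\n"
)
print(report)
```

The point is not the econometrics, which here is deliberately trivial, but the workflow: code, findings, and narrative are produced together from one script, which is the property the tools described above automate at scale.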

While public sector policy evaluation has its own unique complexities and stakeholder dynamics, the implication is clear. These are tools that can handle the heavy mechanical execution – running the econometrics, generating charts, and drafting technical annexes – freeing up evaluators to focus on the harder interpretive questions and policy implications…(More)”.

How to contribute:

Did you come across – or create – a compelling project/report/book/app at the leading edge of innovation in governance?

Share it with us at info@thelivinglib.org so that we can add it to the Collection!


Get the latest news right in your inbox

Subscribe to curated findings and actionable knowledge from The Living Library, delivered to your inbox every Friday
