Paper by Mihály Fazekas et al.: “One-third of total government spending across the globe goes to public procurement, amounting to about 10 trillion dollars a year. Despite its vast size and crucial importance for economic and political developments, there is a lack of globally comparable data on contract awards and tenders run. To fill this gap, this article introduces the Global Public Procurement Dataset (GPPD). Using web scraping methods, we collected official public procurement data on over 72 million contracts from 42 countries between 2006 and 2021 (time period covered varies by country due to data availability constraints). To overcome the inconsistency of data publishing formats in each country, we standardized the published information to fit a common data standard. For each country, key information is collected on the buyer(s) and supplier(s), geolocation information, product classification, price information, and details of the contracting process such as contract award date or the procedure type followed. GPPD is a contract-level dataset where specific filters are calculated allowing to reduce the dataset to the successfully awarded contracts if needed. We also add several corruption risk indicators and a composite corruption risk index for each contract which allows for an objective assessment of risks and comparison across time, organizations, or countries. The data can be reused to answer research questions dealing with public procurement spending efficiency among others. Using unique organizational identification numbers or organization names allows connecting the data to company registries to study broader topics such as ownership networks…(More)”.