Government Transparency Institute (2025) Public procurement data processing (Version 1.0).
Understanding government spending requires standardised, cross-country comparable data. This technical report accompanies GTI’s Global Public Procurement Dataset (GPPD) publication and explains how public procurement announcements are collected, parsed, cleaned, matched and mastered. The process starts with comprehensive source mapping and automated scraping of HTML portals, XML feeds, APIs and CSV dumps. Each publication is then parsed into a unified JSON template. The cleaning process converts text values …
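
To illustrate the parsing step described above, the following is a minimal sketch of how records from differently structured sources might be mapped into one shared JSON template. It is not GTI's implementation: the source names (`portal_a`, `portal_b`), the template fields and the `to_template` helper are all hypothetical, chosen only to show the idea. Values are kept as raw text here, since converting them belongs to the later cleaning step.

```python
import json
from typing import Any

# Hypothetical unified template; these field names are illustrative only
# and are not taken from the GPPD documentation.
TEMPLATE_FIELDS = ("source", "tender_id", "title", "buyer_name", "price_raw")

# Hypothetical per-source mappings: source field name -> template field name.
FIELD_MAPS = {
    "portal_a": {"id": "tender_id", "subject": "title",
                 "authority": "buyer_name", "value": "price_raw"},
    "portal_b": {"noticeNo": "tender_id", "name": "title",
                 "buyer": "buyer_name", "amount": "price_raw"},
}


def to_template(source: str, record: dict[str, Any]) -> dict[str, Any]:
    """Map one scraped record onto the shared template.

    Fields the source does not provide stay None, so every output
    has the same keys regardless of where it came from.
    """
    out: dict[str, Any] = {field: None for field in TEMPLATE_FIELDS}
    out["source"] = source
    for src_key, dst_key in FIELD_MAPS[source].items():
        if src_key in record:
            out[dst_key] = record[src_key]  # kept as raw text, not cleaned
    return out


if __name__ == "__main__":
    example = to_template(
        "portal_a",
        {"id": "A-1", "subject": "Road works", "value": "1 200,50 EUR"},
    )
    print(json.dumps(example, ensure_ascii=False, indent=2))
```

Keeping one mapping table per source means a new portal can be added by describing its fields, without changing the downstream cleaning or matching code.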