Learning from sanctioned government suppliers: a machine learning and network science approach to detecting fraud and corruption in Mexico

Medina-Hernández, M., Kertész, J. & Fazekas, M. (2026) Learning from sanctioned government suppliers: a machine learning and network science approach to detecting fraud and corruption in Mexico. Scientific Reports

Detecting fraud and corruption in public procurement remains a major challenge for governments worldwide. Most research to date builds on domain-knowledge-based corruption risk indicators derived from individual contract-level features, and some also analyses contracting network patterns. A critical barrier for supervised machine learning is the absence of confirmed non-corrupt (negative) examples, which makes conventional supervised classifiers inappropriate for this task. Using publicly available data on federally funded procurement in Mexico and company sanction records, this study implements positive-unlabeled (PU) learning algorithms that integrate domain-knowledge-based red flags with network-derived features to identify likely corrupt and fraudulent contracts. On average, the best-performing PU model captures 32% more known positives than random guessing and achieves at least 20% precision on the top 5% of predictions, substantially outperforming approaches based solely on traditional red flags. Analysis of SHapley Additive exPlanations (SHAP) values reveals that network-derived features, particularly those associated with contracts in the network core or suppliers with high eigenvector centrality, are the most important. Traditional red flags further enhance model performance in line with expectations, albeit mainly for contracts awarded through competitive tenders. This methodology can support law enforcement in Mexico and can be adapted to other national contexts.
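The two headline metrics in the abstract (known positives captured relative to random guessing, and precision on the top 5% of predictions) can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `top_fraction_metrics`, the toy scores and labels, and the exact definition of lift over random used here are assumptions made for illustration only. In a PU setting a label of 0 means "unlabeled", not "clean", so precision measured against known positives is only a lower bound on the true precision.

```python
def top_fraction_metrics(scores, labels, fraction=0.05):
    """Evaluate the highest-scored `fraction` of contracts (hypothetical helper).

    scores: model risk scores, higher means more suspicious.
    labels: 1 = known positive (e.g. contract of a sanctioned supplier),
            0 = unlabeled (may be corrupt or clean; PU-learning assumption).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    n = len(scores)
    k = max(1, round(n * fraction))  # number of contracts flagged for review

    # Rank contract indices from highest to lowest score and keep the top k.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    total_positives = sum(labels)

    # Share of flagged contracts that are known positives (lower bound in PU).
    precision = hits / k
    # Share of all known positives that fall inside the flagged set.
    recall = hits / total_positives if total_positives else 0.0
    # Random selection of k out of n contracts captures k/n of the positives
    # in expectation, so lift > 1 means the model beats random guessing.
    lift = recall / (k / n)

    return {"k": k, "precision": precision, "recall": recall, "lift": lift}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
    print(top_fraction_metrics(toy_scores, toy_labels, fraction=0.2))
```

Under this definition, a lift of 1.32 would correspond to capturing 32% more known positives than random selection of the same number of contracts; whether the paper defines its figure in exactly this way is not stated in the abstract.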
