Minimal-resource AI Detects CKD in Type 2 Diabetes Patients Across Six LMICs with 90% Sensitivity
Minimal-data ML screened CKD in T2D across 6 LMICs: 90% sensitivity, AUC 0.63.
AI-Driven Early Detection of Chronic Kidney Disease in Type 2 Diabetes: Validating a Minimal-Resource Machine Learning Model Across Six LMICs
Chronic kidney disease (CKD) is a silent but escalating global health crisis, affecting more than 850 million people worldwide. Among individuals with Type 2 Diabetes (T2D), a key driver of CKD, over 40% eventually develop kidney impairment, highlighting the critical need for timely screening. Yet, in many low- and middle-income countries (LMICs), limited access to laboratory diagnostics and specialized screening tools hampers early identification, leaving substantial gaps in care.
In this context, machine learning (ML) offers a promising path forward. By leveraging easily obtainable clinical data, AI models can predict CKD risk without expensive or invasive procedures, making screening more accessible, particularly where resources are scarce. A recent study led by Arkangel AI and AstraZeneca sought to externally validate such a minimal-resource ML model for CKD detection in T2D patients across six LMICs. In 4,342 patients, the model achieved a sensitivity of 90% with modest discriminative ability (AUC 0.63). This work supports the real-world potential of AI tools to augment CKD care in resource-limited settings.
Study Partnership and Context
This collaborative study brought together data scientists, clinicians, and medical affairs specialists from Arkangel AI and AstraZeneca, drawing on the iCaReMe Global Registry—a real-world evidence platform capturing comprehensive clinical information from primary care centers across Argentina, Mexico, Egypt, India, Malaysia, and the Philippines. These sites represent diverse healthcare environments in LMICs, where CKD screening remains underutilized due to infrastructural and economic barriers. Importantly, the study population reflected typical patients encountered in routine clinical practice, strengthening the applicability of the findings.
Study Design and Methodology
The investigation was a retrospective observational validation using data from 4,342 adult T2D patients collected between June 2020 and December 2021. Inclusion required a confirmed diagnosis of T2D with the following clinical variables available: age, gender, diabetes duration, body mass index (BMI), blood pressure, and history of hypertension, plus serum creatinine to calculate estimated glomerular filtration rate (eGFR). Patients with prior cardiorenal complications were excluded to focus on early CKD detection.
Data were analyzed using Arkangel AI’s ensemble machine learning model, which predicts the probability of a patient’s eGFR falling below 60 mL/min/1.73 m², a threshold indicative of moderate to severe CKD. The model incorporates six non-invasive predictors readily accessible in primary care settings, simplifying implementation.
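To make the reference label concrete, here is a minimal sketch of how an eGFR below 60 could be derived from serum creatinine. The study summary does not say which eGFR equation was used; this sketch assumes the 2021 CKD-EPI creatinine equation, with creatinine in mg/dL. The function names are hypothetical and chosen for illustration only.

```python
REDUCED_EGFR_THRESHOLD = 60.0  # mL/min/1.73 m^2, the cut-off named in the study


def egfr_ckd_epi_2021(creatinine_mg_dl: float, age_years: float, female: bool) -> float:
    """Estimate GFR (mL/min/1.73 m^2) with the 2021 CKD-EPI creatinine equation.

    Assumption: the source does not state which equation the study applied.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = creatinine_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    # The equation applies a 1.012 multiplier for female patients.
    return egfr * 1.012 if female else egfr


def has_reduced_egfr(creatinine_mg_dl: float, age_years: float, female: bool) -> bool:
    """Return True when the estimated GFR falls below the 60 threshold."""
    return egfr_ckd_epi_2021(creatinine_mg_dl, age_years, female) < REDUCED_EGFR_THRESHOLD
```

A label built this way would serve only as the target for evaluation; the model itself, as described above, does not take creatinine as an input.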
Performance was evaluated using established metrics: sensitivity, specificity, positive predictive value (PPV), accuracy, F1 score, and area under the receiver operating characteristic curve (AUC). The ROC curve was plotted to visualize the trade-off between sensitivity and false positive rate at different thresholds.
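The threshold-based metrics listed above all follow from the four cells of a confusion matrix. The sketch below shows the standard definitions; the class name, function name, and counts are hypothetical and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # CKD present, flagged by the model
    fp: int  # CKD absent, flagged by the model
    tn: int  # CKD absent, not flagged
    fn: int  # CKD present, not flagged


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard screening metrics from confusion-matrix counts."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only; these are not the study's results.
example = ConfusionCounts(tp=90, fp=60, tn=40, fn=10)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```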
Key Results
- Patient demographics: Mean age was 57.6 years; 51.5% female; average diabetes duration 14.4 years; 33.6% had coexisting hypertension.
- Sensitivity: 90.05% (95% CI: 88.05%–92.04%), indicating excellent ability to identify patients with CKD.
- Specificity: 36.11% (95% CI: 34.52%–37.71%), which is low: roughly two in three patients without CKD were flagged as false positives, the cost of prioritizing detection.
- Positive predictive value (PPV): 25.93%, meaning approximately one in four flagged patients actually had CKD, consistent with screening tools designed to minimize missed cases.
- Overall accuracy: 46.84%, reflecting the low specificity rather than missed cases.
- F1 score: 40.27%, the harmonic mean of PPV and sensitivity, held down by the low PPV.
- AUC: 63.08%, indicating modest discrimination between CKD and non-CKD patients.
Additionally, the confusion matrix reported 31.2% true positives and 17.1% true negatives in this large, heterogeneous cohort.
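Two of the reported figures can be related with simple arithmetic. The sketch below uses only the reported sensitivity and PPV: it recomputes the F1 score as their harmonic mean and converts the PPV into the number of flagged patients per confirmed CKD case. The function names are hypothetical.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def flagged_per_true_case(ppv: float) -> float:
    """Average number of flagged patients needed to find one true CKD case."""
    return 1 / ppv


reported_sensitivity = 0.9005
reported_ppv = 0.2593

print(f"F1 = {f1_from_ppv_and_sensitivity(reported_ppv, reported_sensitivity):.4f}")  # F1 = 0.4027
print(f"Flagged per true case = {flagged_per_true_case(reported_ppv):.2f}")  # 3.86
```

The recomputed F1 matches the reported 40.27%, and a PPV near 26% means roughly four confirmatory tests per CKD case found.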
Interpretation and Implications
The high sensitivity of the AI model means it effectively identifies those at risk for CKD—crucial for a screening tool where missing cases could lead to delayed interventions and disease progression. Although specificity was lower, this trade-off is acceptable in early screening contexts, especially where follow-up confirmatory testing can refine diagnosis.
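The sensitivity and specificity trade-off comes from where the decision threshold is placed on the model's predicted probability. The sketch below illustrates this on a small set of made-up scores and labels; the data, thresholds, and function name are hypothetical and say nothing about the operating point used in the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities and true labels (1 = eGFR below 60).
scores = [0.15, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.85]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

On this toy data, lowering the threshold from 0.7 to 0.3 raises sensitivity from 0.50 to 1.00 while specificity drops from 1.00 to 0.25, which is the same direction of trade-off a screening-oriented threshold accepts.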
In practice, this model empowers primary care providers to stratify kidney disease risk among T2D patients using routine clinical information, circumventing the need for immediate lab testing that may be unavailable or costly. Earlier identification can prompt timely interventions such as glycemic control optimization, blood pressure management, and nephroprotective therapies, ultimately reducing CKD progression and associated complications.
Limitations to note include the model’s modest discriminative ability (AUC ~0.63), signaling room for improvement by integrating additional data sources or refining algorithms. Furthermore, the performance was evaluated retrospectively, and prospective real-world deployment studies are necessary to confirm clinical utility and impact on patient outcomes.
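One way to read an AUC of about 0.63 is as a probability: if one CKD patient and one non-CKD patient are picked at random, the model gives the CKD patient the higher score about 63% of the time (0.5 would be chance). The sketch below computes AUC directly from that pairwise definition on made-up data; the function name and values are hypothetical.

```python
def pairwise_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and labels for illustration only.
scores = [0.2, 0.4, 0.4, 0.6, 0.8]
labels = [0, 1, 0, 0, 1]
print(pairwise_auc(scores, labels))  # 0.75
```

The nested loop is quadratic in cohort size, which is fine for a sketch; library implementations typically use a rank-based computation instead.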
Deployment and Scalability
This ML model is well-suited for integration into electronic health record systems or decision support tools in primary care, particularly in LMICs where laboratory infrastructure is limited. Its reliance on non-invasive, easily captured variables facilitates rapid screening without disrupting workflow. Potential barriers include digital infrastructure variability, clinician training, and patient data privacy concerns, all of which require thoughtful program design.
Because the model was validated across multiple countries with diverse patient populations, it demonstrates adaptability and potential for broader application. The underlying AI platform could also be extended to other chronic disease screening scenarios, leveraging minimal data inputs for early detection in resource-constrained settings.
Conclusion and Next Steps
This study represents an important step toward democratizing CKD screening in patients with Type 2 Diabetes, confirming that a minimal-variable machine learning model can perform effectively across varied LMIC healthcare environments. With high sensitivity and practical usability, such AI tools could reshape early CKD detection, enabling preventive care at scale where it is most needed.
Future research should focus on prospective validations, model refinement incorporating additional biomarkers or imaging data, and integration into clinical workflows to evaluate effects on care delivery and patient outcomes. Continued collaboration between AI developers, clinicians, and health systems will be essential to translate these innovations into improved global kidney health.
References available upon request. For more information on the study, please refer to the iCaReMe registry reports and publications from Arkangel AI and AstraZeneca.