Equity and Generalizability of Artificial Intelligence for Skin-Lesion Diagnosis Using Clinical, Dermoscopic, and Smartphone Images: A Systematic Review and Meta-Analysis.
Background and Objectives: Artificial intelligence (AI) has shown promising performance in skin-lesion classification; however, its fairness, external validity, and real-world reliability remain uncertain. This systematic review and meta-analysis evaluated the diagnostic accuracy, equity, and generalizability of AI-based dermatology systems across diverse imaging modalities and clinical settings. Materials and Methods: A comprehensive search of PubMed, Embase, Web of Science, and ClinicalTrials.gov (from inception to 31 October 2025) identified diagnostic accuracy studies using clinical, dermoscopic, or smartphone images. Eighteen studies (11 melanoma-focused; 7 mixed benign-malignant) met the inclusion criteria. Six studies provided complete 2 × 2 contingency data for bivariate (Reitsma) hierarchical summary receiver operating characteristic (HSROC) modeling, while seven reported area under the receiver operating characteristic curve (AUROC) values with extractable variance. Risk of bias was assessed using QUADAS-2, and evidence certainty was graded using GRADE. Results: Across more than 70,000 test images, pooled sensitivity and specificity were 0.91 (95% CI 0.74-0.97) and 0.64 (95% CI 0.47-0.78), respectively, corresponding to an HSROC-derived AUROC of 0.88 (95% CI 0.84-0.92). The AUROC-only meta-analysis yielded a similar pooled AUROC of 0.88 (95% CI 0.87-0.90). Diagnostic performance was highest in specialist settings (AUROC 0.90), followed by community care (0.85) and smartphone environments (0.81). Notably, performance was lower for darker skin tones (Fitzpatrick IV-VI: AUROC 0.82) than for lighter skin tones (Fitzpatrick I-III: AUROC 0.89), indicating persistent fairness gaps. Conclusions: AI-based dermatology systems achieve high sensitivity and good overall discrimination but only moderate specificity, and they perform worse in darker skin tones and non-specialist environments. These findings emphasize the need for diverse training datasets, skin-tone-stratified reporting, and rigorous external validation before broad clinical deployment.
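As an illustrative aside, the relationship between a 2 × 2 contingency table and the accuracy measures reported above can be sketched as follows. This is a minimal sketch, not the review's analysis code: the counts are hypothetical, chosen only so that they reproduce the pooled point estimates (0.91 and 0.64), and do not come from any included study. The pooled values themselves came from a bivariate model, not from summing tables. The `TwoByTwo` class and `likelihood_ratios` helper are names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: int  # malignant lesions flagged by the AI
    fp: int  # benign lesions flagged by the AI
    fn: int  # malignant lesions missed by the AI
    tn: int  # benign lesions correctly cleared

    def sensitivity(self) -> float:
        # Proportion of diseased cases that test positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased cases that test negative.
        return self.tn / (self.tn + self.fp)


def likelihood_ratios(sens: float, spec: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a given sensitivity and specificity."""
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    return lr_pos, lr_neg


# Hypothetical counts: 100 malignant and 100 benign lesions, picked to match
# the pooled point estimates in the abstract. Not real study data.
table = TwoByTwo(tp=91, fp=36, fn=9, tn=64)
sens = table.sensitivity()
spec = table.specificity()
lr_pos, lr_neg = likelihood_ratios(sens, spec)

print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
```

Under these assumed counts, the positive likelihood ratio is roughly 2.5 and the negative likelihood ratio roughly 0.14, which is consistent with a tool that is better at ruling out malignancy than at confirming it.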