Research on machine learning-based clinical prediction models: a bibliometric analysis.

Machine learning (ML) has emerged as a transformative approach for developing high-performance clinical prediction models (CPMs). By leveraging multidimensional patient data, ML enables more accurate disease risk stratification, prognostic assessment, and clinical decision-making. In recent years, research on CPMs has expanded rapidly, with nearly 250,000 publications indexed as of 2024. Despite this remarkable growth, a comprehensive bibliometric analysis of the field is currently lacking.

This study aimed to analyze the global research status, evolutionary trends, and thematic hotspots of machine learning-based clinical prediction models (ML-CPMs) through bibliometric and visualization techniques.

Publications related to ML-CPMs were retrieved from the Web of Science Core Collection and the Scopus database (up to May 9, 2025). Bibliometric analyses were performed using various tools, including R, VOSviewer, and CiteSpace, to generate annual publication trends, collaboration networks, and journal distributions, as well as co-citation, clustering, and keyword analyses.

A total of 8,619 publications (8,000 original articles and 619 reviews) from 118 countries were identified. Since 2015, annual publications have grown exponentially (R² = 0.9919). While China led in total publication volume, the United States maintained the highest academic influence (H-index = 105; Total Citations = 66,788). Harvard University and BMC Medical Informatics and Decision Making emerged as the most productive institution and journal, respectively. Tian J from the Chinese Academy of Sciences led in publication count, while Wynants L from KU Leuven in Belgium recorded the highest citation frequency. Key research hotspots include algorithm optimization, multimodal data integration, and model interpretability, with clinical applications primarily focused on oncology, cardiovascular diseases, and critical care medicine.
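The two summary statistics reported above, the H-index and the goodness of fit of an exponential growth curve, can be illustrated with a minimal sketch. The abstract does not state how either was computed, so this is one plausible approach under stated assumptions: the fit is done by ordinary least squares on log-transformed counts, and the goodness of fit is the coefficient of determination on that log scale. The function names `h_index` and `exponential_fit_r2` are hypothetical and are not taken from the study or from any of the tools it names.

```python
import math


def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def exponential_fit_r2(years, counts):
    """Fit counts ~ a * exp(b * (year - first_year)) via least squares on log(counts).

    Returns (a, b, r2), where r2 is the coefficient of determination
    measured on the log scale (an assumption; the source does not say
    which scale its fit statistic refers to). Counts must be positive.
    """
    xs = [y - years[0] for y in years]        # years since the first year
    ys = [math.log(c) for c in counts]        # log-transform the counts
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                             # growth rate per year
    log_a = mean_y - b * mean_x               # intercept on the log scale
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2


# Illustrative, made-up inputs (not data from the study).
example_citations = [10, 8, 5, 4, 3]
example_years = [2015, 2016, 2017, 2018, 2019]
example_counts = [100, 200, 400, 800, 1600]   # doubles every year

example_h = h_index(example_citations)
example_a, example_b, example_r2 = exponential_fit_r2(example_years, example_counts)
```

With the made-up inputs above, the citation list yields an H-index of 4, and the perfectly doubling series gives a growth rate equal to ln 2 with a log-scale fit statistic of essentially 1. Real publication counts would deviate from the curve, which is why a value such as 0.9919 indicates a close but not exact exponential trend.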

Research on ML-CPMs has experienced rapid global growth over the past decade, forming extensive international collaboration networks. However, challenges such as limited interpretability, data heterogeneity, and privacy concerns persist. Future studies should prioritize external validation, clinical applicability, and the integration of human-AI collaborative decision-making to ensure robust implementation in real-world clinical settings.

Authors

Li Li, Zeng Zeng, Liang Liang, Zheng Zheng, Zhang Zhang, Martin-Payo Martin-Payo