Machine learning methods in real-world studies of cardiovascular disease

Preprint | 10.55415/deep-2023-0019.v1
This is not the most recent version. There is a newer version of this content available.
Jiawei Zhou#
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Dongfang You#
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Jianling Bai
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Xin Chen
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Yaqian Wu
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Zhongtian Wang
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Yingdan Tang
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Yang Zhao*
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166
Guoshuang Feng*
Big Data Center, Beijing Children’s Hospital, Capital Medical University, National Center for Children's Health, Beijing, China, 100045; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University & Capital Medical University, Beijing, China, 100083

# Contributed equally to this work; * Corresponding author


Abstract

Objective: Cardiovascular disease (CVD) is one of the leading causes of death worldwide, and many questions remain to be answered, especially in risk identification and prognosis prediction. Real-world studies (RWS), with their large numbers of observations, are an important data source for CVD research, but they are constrained by high dimensionality, missing data, and unstructured data. Machine learning (ML) methods, including a variety of supervised and unsupervised algorithms, are useful for data governance and effective for high-dimensional data analysis and imputation in real-world studies. This study reviewed the theory, strengths, limitations, and applications of several popular ML methods in the CVD field as a reference for further application.

Methods: This study introduced the origin, purpose, theory, strengths, limitations, and applications of several popular ML algorithms, including hierarchical and k-means clustering, principal component analysis, random forest, support vector machine, and neural networks. As a worked example, a random forest was applied to data from the Systolic Blood Pressure Intervention Trial (SPRINT) to demonstrate the process and main results of applying ML in CVD research.

Conclusion: ML methods are effective tools for producing real-world evidence to support clinical decisions and meet clinical needs. This review explains the principles of several ML methods in accessible language and can serve as a reference for further application. Future research is warranted on accurate ensemble learning methods and on their wider application in the medical field.
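
Illustration: of the unsupervised methods named in the abstract, k-means clustering is perhaps the simplest to sketch. The snippet below is a minimal pure-Python version of Lloyd's algorithm, not the authors' code and not the SPRINT analysis. The function name `kmeans`, its parameters, and the six toy observations (made-up systolic blood pressure and age pairs) are assumptions introduced only for illustration.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Toy Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen observations.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (systolic BP in mmHg, age in years) pairs; not real patient data.
toy = [(118, 45), (122, 50), (120, 48), (160, 70), (158, 72), (165, 68)]
labels, centroids = kmeans(toy, k=2)
print(labels)
```

In a real analysis the variables would usually be standardised first, since k-means relies on Euclidean distance and is therefore sensitive to measurement scale.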

Version History
  • 10 Mar 2023 15:03 Version 1
Scores
 4.5
Rapid Rating Times: 1
· Level of Quality: 4
· Level of Repeatability: 5
· Level of Innovation: 5
· Level of Impact: 4

*Each rating ranges from 0-5
