Machine learning methods in real-world studies of cardiovascular disease-Deep Science

Machine learning methods in real-world studies of cardiovascular disease

Preprint

Published: 10 Mar 2023

10.55415/deep-2023-0019.v1

This is not the most recent version. There is a newer version of this content available.

Jiawei Zhou#

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Dongfang You#

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Jianling Bai

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Xin Chen

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Yaqian Wu

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Zhongtian Wang

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Yingdan Tang

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Yang Zhao*

Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China, 211166

Guoshuang Feng*

Big Data Center, Beijing Children’s Hospital, Capital Medical University, National Center for Children's Health, Beijing, China, 100045; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University & Capital Medical University, Beijing, China, 100083

# contributed equally to this work, * Corresponding author

Abstract

Objective: Cardiovascular disease (CVD) is one of the leading causes of death worldwide, and many questions urgently need answering, especially in risk identification and prognosis prediction. Real-world studies (RWS), with their very large numbers of observations, are an important data source for CVD research, but they are constrained by high dimensionality, missing data, and unstructured data. Machine learning (ML) methods, including a variety of supervised and unsupervised algorithms, are useful for data governance and effective for high-dimensional data analysis and imputation in real-world studies. This study reviewed the theory, strengths, limitations, and applications of several popular ML methods in the CVD field as a reference for further application.

Methods: This study introduced the origin, purpose, theory, strengths, limitations, and applications of multiple popular ML algorithms, including hierarchical and k-means clustering, principal component analysis, random forest, support vector machine, and neural networks. An example analysis of the Systolic Blood Pressure Intervention Trial (SPRINT) data was performed with random forest to demonstrate the process and main results of applying ML in CVD.

Conclusion: ML methods are effective tools for producing real-world evidence to support clinical decisions and meet clinical needs. This review explains the principles of multiple ML methods in easy-to-understand language and could serve as a reference for further application. Future research is warranted on accurate ensemble learning methods and on wider application in the medical field.
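Illustrative sketch: to make one of the reviewed methods concrete, the following is a minimal k-means clustering example (Lloyd's algorithm) in plain Python. It is not the authors' analysis and does not use SPRINT data. The six (systolic blood pressure, age) pairs are invented for illustration, the function name `kmeans` is hypothetical, and the centroids are initialised from the first k points only to keep the result reproducible (practical implementations usually use random or k-means++ initialisation).

```python
def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm for k-means on a list of equal-length numeric tuples.

    Simplifying assumption: centroids start at the first k points.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return labels, centroids


# Invented (systolic blood pressure in mmHg, age in years) pairs.
patients = [(118, 45), (122, 50), (125, 48), (158, 68), (162, 72), (165, 70)]
labels, centroids = kmeans(patients, k=2)
print(labels)     # cluster index assigned to each patient
print(centroids)  # mean (blood pressure, age) of each cluster
```

On this toy input the algorithm separates the three lower-pressure, younger records from the three higher-pressure, older ones. With real-world data the variables would normally be standardised first, because k-means is sensitive to the scale of each feature.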

Keywords

Cardiovascular disease ; Machine learning ; Real-world study

Subject Area

Artificial Intelligence & Robotics ; General Medical Research

Now Published

Machine Learning Methods in Real-World Studies of Cardiovascular Disease

24 Mar 2023

Version History

10 Mar 2023 15:03 Version 1

Reviewer's Comments on this Article.

Not yet Peer Reviewed

License

The content is available under CC BY 4.0 License CreativeCommons.org

Competing Interest Statement

The author(s) have declared that they have no conflict of interest with regard to this content.
