Ischemic stroke subtyping is highly valuable for effective intervention and treatment, and it is also important for the prognosis of ischemic stroke. Manual adjudication of disease classification is time-consuming and error-prone, and it limits scaling to large datasets. In this study, an integrated machine learning approach was used to classify ischemic stroke subtypes on the International Stroke Trial (IST) dataset. We addressed the common problems of feature selection and prediction in medical datasets. First, feature importances were ranked with the Shapiro-Wilk algorithm, and Pearson correlations between features were analyzed. Then, we used Recursive Feature Elimination with Cross-Validation (RFECV), with linear SVC, Random-Forest-Classifier, Extra-Trees-Classifier, AdaBoost-Classifier, and Multinomial-Naïve-Bayes-Classifier each serving in turn as the estimator, to select robust features important to ischemic stroke subtyping. Furthermore, the importances of the selected features were determined by Extra-Trees-Classifier. Finally, Extra-Trees-Classifier and a simple deep learning model used the selected features to classify ischemic stroke subtypes on the IST dataset. The results suggest that the described method can classify ischemic stroke subtypes accurately and that the machine learning approaches outperformed human professionals.
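The RFECV-based selection step described above can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. The data are synthetic stand-ins for the IST features, only three of the five named estimators are used for brevity, and the "kept by at least two estimators" rule for calling a feature robust is an assumption made for illustration, as are all hyperparameters.

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

# Synthetic three-class data standing in for the IST feature matrix (assumption).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, n_redundant=2,
    n_classes=3, random_state=0,
)

# A subset of the estimators named in the abstract; settings are illustrative.
estimators = {
    "linear_svc": LinearSVC(dual=False, max_iter=5000, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=30, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=30, random_state=0),
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
votes = Counter()
for name, estimator in estimators.items():
    # RFECV drops one feature per step and keeps the subset with the best CV score.
    selector = RFECV(estimator, step=1, cv=cv, scoring="accuracy")
    selector.fit(X, y)
    selected = [i for i, keep in enumerate(selector.support_) if keep]
    votes.update(selected)
    print(f"{name}: kept {selector.n_features_} features -> {selected}")

# Hypothetical robustness rule: a feature must be kept by at least two estimators.
robust = sorted(i for i, n in votes.items() if n >= 2)
if not robust:
    # Fall back to the union if no feature reaches two votes.
    robust = sorted(votes)

# Rank the retained features with Extra-Trees impurity-based importances.
final_model = ExtraTreesClassifier(n_estimators=100, random_state=0)
final_model.fit(X[:, robust], y)
importances = dict(zip(robust, final_model.feature_importances_))
for idx, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"feature {idx}: importance {score:.3f}")
```

Running a separate RFECV per estimator and then intersecting the results is one way to reduce dependence on any single model's notion of importance. The abstract does not state how the five selections were combined, so the voting threshold here should be read as a placeholder.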