China Journal of Leprosy and Skin Diseases ›› 2025, Vol. 41 ›› Issue (12): 858-864. doi: 10.12144/zgmfskin202512858

• Original Article •

Constructing a machine learning-based prediction model for the progression of psoriasis to psoriatic arthritis

Wu Kai1, Zhang Qiang1, Xue Yadong2, Zhao Yajie3, Meng Bo1, Yang Qiuhong1, Deng Weizhe1, Zhao Fenglian1

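  1. The 962nd Hospital of the Joint Logistics Support Force of the Chinese People's Liberation Army, Harbin, Heilongjiang 150006; 2. The First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang 150007; 3. College of Pharmacy, Harbin Medical University, Harbin, Heilongjiang 150076
  • Publication date: 2025-12-15 Release date: 2025-11-27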

A machine learning-based prediction model for psoriasis to psoriatic arthritis

WU Kai1, ZHANG Qiang1, XUE Yadong2, ZHAO Yajie3, MENG Bo1, YANG Qiuhong1, DENG Weizhe1, ZHAO Fenglian1   

  1. The 962nd Hospital of the Chinese PLA, Harbin 150006, China; 2. The First Affiliated Hospital of Harbin Medical University, Harbin 150007, China; 3. College of Pharmacy, Harbin Medical University, Harbin 150076, China
  • Online: 2025-12-15 Published: 2025-11-27

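Abstract: Objective: To explore risk factors for the development of psoriatic arthritis (PsA) in patients with psoriasis (PsO) and to construct a prediction model. Methods: Data from PsO patients in NHANES were used as the training set, and data from Chinese hospital-based PsO patients as the validation set. Variables were screened by univariate analysis and multivariate backward stepwise regression. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were used to assess discrimination, predicted probabilities, and clinical benefit, respectively. A nomogram was drawn and its rationality analyzed. Four machine learning models were then selected to re-evaluate applicability and overall performance. Results: The training set ultimately included 328 PsO patients and the validation set 306 PsO patients. In the training set, PsA and non-PsA patients differed significantly (P<0.05) in age, sex, mean systolic blood pressure, prevalence of hypertension and chronic kidney disease (CKD), urine protein, and glycated hemoglobin. Multivariate regression on the training set retained nine variables for the prediction model: age, sex, hypertension, mean systolic blood pressure, neutrophil count, aspartate aminotransferase, glucose, total cholesterol, and high-density lipoprotein. Discrimination was reasonably consistent between the training set (AUC=0.741) and the validation set (AUC=0.694). The nomogram total score yielded greater benefit than any single variable. Among the four machine learning models, the decision tree model had the best discrimination (AUC=0.886), with robust sensitivity and specificity but slightly lower accuracy. Conclusion: This study identified nine factors influencing the progression of PsO to PsA and successfully constructed a prediction model, which can help clinicians distinguish high-risk from low-risk populations and improve the efficiency of disease management.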

Keywords: psoriasis, psoriatic arthritis, machine learning, prediction model, decision tree

Abstract: Objective: To investigate risk factors for the development of psoriatic arthritis (PsA) in patients with psoriasis (PsO) and to construct a prediction model. Methods: Data from PsO patients in the National Health and Nutrition Examination Survey (NHANES) were used as the training set (n=328), and data from Chinese hospital-based PsO patients served as the validation set (n=306). Predictor variables were screened using univariate analysis followed by multivariate backward stepwise regression. Discrimination, calibration of the predicted probabilities, and clinical utility were assessed with receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA), respectively. A nomogram was developed and its rationality analyzed. Four machine learning models were additionally used to evaluate applicability and overall performance. Results: In the training set, PsA and non-PsA patients differed significantly (P<0.05) in age, sex, mean systolic blood pressure, prevalence of hypertension and chronic kidney disease (CKD), urine protein, and glycated hemoglobin. Nine variables were retained in the final model: age, sex, hypertension, mean systolic blood pressure, neutrophil count, aspartate aminotransferase, glucose, total cholesterol, and high-density lipoprotein cholesterol. Discrimination was consistent between the training set (AUC=0.741) and the validation set (AUC=0.694), although the predicted probabilities showed bias and the range of clinical net benefit was limited. The nomogram's total score yielded greater clinical net benefit than any individual variable. Among the four machine learning models, the decision tree had the best discrimination (AUC=0.886), with robust sensitivity and specificity but slightly lower accuracy.
Conclusion: The prediction model developed in this study provides a convenient and effective tool for stratifying PsA risk in PsO patients, facilitating the identification of high-risk individuals for enhanced clinical monitoring and personalized disease management strategies.
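
The discrimination metrics reported above (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `roc_auc` and `sensitivity_specificity`, the 0.5 cutoff, and the eight toy cases are all assumptions made up for illustration, and none of the numbers come from the study. AUC is computed here as the fraction of (PsA, non-PsA) pairs in which the PsA case receives the higher predicted risk.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive, then return
    (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT from the study): 1 = PsA, 0 = non-PsA.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.6]

print(roc_auc(toy_labels, toy_scores))                       # 0.9375
print(sensitivity_specificity(toy_labels, toy_scores, 0.5))  # (0.75, 1.0)
```

The pairwise loop is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a rank-based formula would be preferable for much larger data.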

Key words: psoriasis, psoriatic arthritis, machine learning, prediction model, decision tree
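
Decision curve analysis, which the abstract uses to assess clinical net benefit, compares a model against a "treat all" strategy across risk thresholds. The sketch below uses the standard net benefit definition, TP/N minus FP/N weighted by the odds of the threshold. It is an illustrative assumption rather than the authors' implementation: the function names and the toy data are hypothetical, and the numbers are not from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of acting on cases with predicted risk >= threshold:
    TP/N - FP/N * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = PsA, 0 = non-PsA.
dca_labels = [1, 1, 1, 0, 0, 0, 0, 1]
dca_probs = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.6]

for pt in (0.2, 0.5):
    model_nb = net_benefit(dca_labels, dca_probs, pt)
    all_nb = net_benefit_treat_all(dca_labels, pt)
    print(f"threshold={pt}: model={model_nb:.5f}, treat-all={all_nb:.5f}")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the "treat none" strategy); a constrained range of net benefit means this holds only over a narrow band of thresholds.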