A Comparison of Three Research Methods: Logistic Regression, Decision Tree, and Random Forest to Reveal Association of Type 2 Diabetes with Risk Factors and Classify Subjects in a Military Population

Author: Mohammad Sahebhonar
Author: Mehrzad Gholampour Dehaki
Author: Mohammad Hassan Kazemi-Galougahi
Author: Saeed Soleiman-Meigooni
ORCID: Mehrzad Gholampour Dehaki [0000-0001-8381-5122]
ORCID: Mohammad Hassan Kazemi-Galougahi [0000-0003-4601-6457]
ORCID: Saeed Soleiman-Meigooni [0000-0001-5641-7380]
Issued Date: 2022-06-30
Abstract: Background: Type 2 diabetes mellitus (T2DM) is one of the major non-communicable diseases, causing morbidity and mortality worldwide. No study has examined T2DM status in the Iran Army Forces. Objectives: We aimed to measure the prevalence of T2DM in this population and to identify variables associated with T2DM risk in order to classify individuals. Methods: Data from 3661 members of the Iran Army Ground Forces were used. Characteristics of the subjects with and without T2DM were compared. We compared the classification ability of logistic regression with that of two tree-based supervised learning algorithms, decision tree and random forest (RF). The ethical committee of AJA University of Medical Sciences approved this study (approval code 995685). Results: The prevalence of T2DM was 3% less than in the general population. Our results showed that the incidence of T2DM increases as subjects become older. The proportion of staff members with T2DM was higher than in the other military ranks. T2DM was more common in the obese and overweight groups. The highest prevalence of T2DM was in subjects with high lipid profile levels. The areas under the receiver operating characteristic curve for logistic regression, decision tree, and RF were 73.8%, 77.1%, and 97.1%, respectively. Conclusions: Age, body mass index, total cholesterol, low-density lipoprotein cholesterol, and triglyceride are associated with T2DM risk. RF has superior classification performance compared with logistic regression and decision tree.
DOI: https://doi.org/10.5812/jamm-118525
Keyword: Diabetes
Keyword: Epidemiology
Keyword: Risk Management
Keyword: Qualitative Research
Publisher: Brieflands
Type: Research Article
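
The abstract compares the three classifiers by the area under the receiver operating characteristic (ROC) curve. As a minimal sketch of what that metric measures, the function below computes it in its rank form: the probability that a randomly chosen subject with T2DM receives a higher predicted score than a randomly chosen subject without it, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve in its rank (Mann-Whitney) form.

    labels: 1 for a positive case (here, T2DM), 0 for a negative case.
    scores: predicted scores, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.60, 0.55]

print(roc_auc(labels, scores))  # 13 of 16 pairs ranked correctly -> 0.8125
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the scale on which the reported 73.8%, 77.1%, and 97.1% should be read.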

Files

Original bundle

Name: jamm-10-2-118525.pdf
Size: 989.68 KB
Format: Adobe Portable Document Format
Description: Article/s PDF