Identifying Survival Subtypes of Esophageal Squamous Cell Carcinoma Patients: An Application of Deep Learning in Gene Expression Data Analysis

Author: Zahra Kousehlou
Author: Ebrahim HajiZadeh
Author: Leili Tapak
Author: Ahmad Shalbaf
ORCID: Zahra Kousehlou [0000-0002-4807-4813]
ORCID: Ebrahim HajiZadeh [0000-0001-7863-4837]
ORCID: Leili Tapak [0000-0002-4378-3143]
ORCID: Ahmad Shalbaf [0000-0002-1595-7281]
Issued Date: 2024-12-31
Abstract: Background: Esophageal squamous cell carcinoma (ESCC) is one of the most lethal types of cancer, and late diagnosis significantly decreases patient survival rates. Objectives: This study aimed to identify survival groups among patients with ESCC and to find biomarkers predictive of time-to-death from ESCC using deep learning (DL) and machine learning algorithms. Methods: Expression profiles of 60 ESCC patients, along with their demographic and clinical variables, were downloaded from the Gene Expression Omnibus (GEO) database. A DL autoencoder model was employed to extract long noncoding RNA (lncRNA) features. The univariate Cox proportional hazards (Cox-PH) model was used to select the extracted features significantly related to patient survival. Hierarchical clustering (HC) identified risk groups, followed by a decision tree algorithm used to identify lncRNA profiles. Analyses were performed in Python 3.7 and R 4.0.1. Results: The inputs of the autoencoder were 8,900 lncRNAs, from which 1,000 features were extracted. Of these, 42 lncRNAs were significantly related to time-to-death in the Cox-PH model and were used as input for clustering patients into high- and low-risk groups (log-rank test P-value = 0.022). These groups then served as class labels for supervised classification. The C5.0 decision tree algorithm achieved an overall accuracy of 0.929 on the test set and identified four hub lncRNAs associated with time-to-death. Conclusions: The newly discovered lncRNAs lnc-FAM84A-1, LINC01866, lnc-KCNE4-2, and lnc-NUDT12-4 are implicated in death from ESCC. Our findings advance the understanding of the role of lncRNAs in ESCC prognosis. Further research is necessary to confirm the potential and clinical applicability of these lncRNAs.
DOI: https://doi.org/10.5812/ijcm-145929
Keyword: Esophageal Squamous Cell Carcinoma
Keyword: Deep Learning
Keyword: Machine Learning
Keyword: Survival
Keyword: Gene Expression
Keyword: Decision Trees
Publisher: Brieflands
Title: Identifying Survival Subtypes of Esophageal Squamous Cell Carcinoma Patients: An Application of Deep Learning in Gene Expression Data Analysis
Type: Research Article
