Could the Utilization of Large Language Models Contribute to Magnetic Resonance Imaging Acquisition Protocols?

Author: Eren Çamur
Author: Turay Cesur
Author: Yusuf Öztürk
Author: Mehmet Kutlu
Author: Halil Karataş
Author: Sadettin Emre Eroglu
Author: Rıza Sarper Ökten
Author: Semra Duran
Author: Arzu Özsoy
Author: Hatice Gül Hatipoglu
ORCID: Eren Çamur [0000-0002-8774-5800]
ORCID: Turay Cesur [0000-0002-2726-8045]
ORCID: Yusuf Öztürk [0009-0009-6372-1845]
ORCID: Mehmet Kutlu [0000-0002-5922-0169]
ORCID: Halil Karataş [0009-0007-3858-555X]
ORCID: Sadettin Emre Eroglu [0009-0008-2817-6565]
ORCID: Rıza Sarper Ökten [0000-0002-4721-6357]
ORCID: Semra Duran [0000-0003-0863-2443]
ORCID: Arzu Özsoy [0000-0002-0367-4794]
ORCID: Hatice Gül Hatipoglu [0000-0002-4231-3983]
Accessioned Date: 2025-08-07T01:31:38Z
Issued Date: 2025-04-30
Abstract: Background: Large language models (LLMs) are increasingly integrated into radiology, offering potential benefits in workflow optimization and imaging study selection. The Turkish Society of Radiology published the "Turkish Society of Radiology 2018 Magnetic Resonance Imaging and Computed Tomography Acquisition Standards Guideline" (TSR-2018 MCASG) in 2018. This guideline covers sequence selection, patient positioning, scanning parameters, and specific sequence requirements. To our knowledge, no study has assessed the knowledge and proficiency of LLMs in determining magnetic resonance acquisitions and compared them with those of radiologists. Objectives: This study aims to evaluate the performance of various LLMs in guiding magnetic resonance imaging (MRI) acquisition protocols based on the TSR-2018 MCASG and to compare their proficiency with that of radiologists across different experience levels. Materials and Methods: In this cross-sectional observational study, eight LLMs (including ChatGPT-4o models, ChatGPT-o1, Claude 3 Opus, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1 405B, and Mistral Large 2) were assessed alongside radiologists ranging from junior residents to senior radiologists (SRs). A total of 105 open-ended questions (OEQs) and 105 case-based questions (CBQs) covering different sections were prepared from the TSR-2018 MCASG. Statistical analyses employed non-parametric tests, including the Kruskal-Wallis test with Tamhane's T2 post hoc comparisons and McNemar's test, with a Bonferroni-adjusted significance threshold set at P < 0.0004. Results: Claude 3.5 Sonnet was the top performer, achieving a mean Likert score of 3.51 ± 0.54 in OEQs and 83.8% accuracy in CBQs, outperforming other LLMs and radiology residents (P < 0.0004). While SRs demonstrated strong performance, Claude 3.5 Sonnet outperformed them in both OEQs and CBQs. Furthermore, the LLMs performed competitively with junior radiologists (JRs) in both OEQs and CBQs.
Conclusion: Our findings point to a transformative era in radiology, with Claude 3.5 Sonnet leading among LLMs in MRI sequence selection and in contributing to MRI acquisitions. LLMs can make an important contribution as supportive tools for MRI acquisition optimization.
DOI: https://doi.org/10.5812/iranjradiol-161555
URI: https://repository.brieflands.com/handle/123456789/65626
Keyword: MR Sequences
Keyword: Magnetic Resonance Imaging
Keyword: ChatGPT
Keyword: Large Language Models
Keyword: Acquisition
Keyword: Imaging Protocol
Publisher: Brieflands
Title: Could the Utilization of Large Language Models Contribute to Magnetic Resonance Imaging Acquisition Protocols?
Type: Research Article
