Currently submitted to: JMIR Medical Informatics
Date Submitted: Sep 10, 2025
Open Peer Review Period: Sep 16, 2025 - Nov 11, 2025
(currently open for review)
Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.
GPT-4o Powered Pre-Anesthetic AI: Development and Validation
ABSTRACT
Background:
Pre-anesthetic assessment is essential for identifying high-risk surgical patients and minimizing perioperative complications. However, conventional tools such as the American Society of Anesthesiologists (ASA) classification and postoperative nausea and vomiting (PONV) risk scores are limited by subjectivity and reliance on manual data input, reducing consistency and scalability.
Objective:
This study aimed to develop and retrospectively validate an artificial intelligence (AI)–enabled pre-anesthetic assessment system powered by GPT-4o. The system was designed to predict ASA physical status and PONV risk using structured and unstructured data from electronic medical records.
Methods:
A retrospective, single-center study was conducted at a medical center in Taiwan between January and May 2025. A total of 600 hospitalized surgical patients aged ≥18 years were selected using stratified random sampling. For PONV, the primary analysis treated only the High-risk category as test-positive; a sensitivity analysis treated both the Moderate and High categories as positive.
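The two PONV dichotomizations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the label "Low" for the remaining category and the example predictions are assumptions, since the abstract names only the Moderate and High categories.

```python
# Categories counted as test-positive under each analysis.
# "Low" is an assumed label for the third category (not named in the abstract).
PRIMARY_POSITIVE = {"High"}
SENSITIVITY_POSITIVE = {"Moderate", "High"}


def is_test_positive(ponv_category: str, positive_set: set) -> bool:
    """Return True when a predicted PONV category counts as test-positive."""
    return ponv_category in positive_set


# Made-up predictions for illustration only; these are not study data.
predictions = ["Low", "Moderate", "High", "Moderate"]

primary_analysis = [is_test_positive(p, PRIMARY_POSITIVE) for p in predictions]
sensitivity_analysis = [is_test_positive(p, SENSITIVITY_POSITIVE) for p in predictions]
```

Widening the positive set in the sensitivity analysis can only move cases from test-negative to test-positive, so sensitivity cannot fall and specificity cannot rise relative to the primary analysis.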
Results:
With National Health Insurance (NHI) data, agreement for ASA was near-perfect (κ=0.883); without NHI it was moderate (κ=0.518). For PONV (High=positive), the AI achieved sensitivity 34.7% (95% CI 22.9–48.7), specificity 99.1% (95% CI 97.9–99.6), and accuracy 93.8% (95% CI 91.6–95.5). The 2×3 association was significant (χ²(2)=169.25, p<0.001; Cramér’s V=0.531).
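As a consistency check on the PONV figures above, the sketch below recomputes the proportions, their 95% CIs, and Cramér's V. It rests on two assumptions not stated in the abstract: the 2×2 counts (17 true positives, 32 false negatives, 5 false positives, 546 true negatives) are back-calculated because they reproduce the reported percentages with n=600, and the intervals are assumed to be Wilson score intervals. Only χ²=169.25, n=600, and the 2×3 table shape come from the text.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V effect size for an n_rows x n_cols contingency table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# ASSUMED counts, back-calculated from the reported percentages (not in the abstract).
tp, fn, fp, tn = 17, 32, 5, 546
n_total = tp + fn + fp + tn  # 600, matching the stated sample size

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / n_total

sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
acc_ci = wilson_interval(tp + tn, n_total)

# Chi-square statistic and table shape as reported in the Results.
v = cramers_v(chi2=169.25, n=n_total, n_rows=2, n_cols=3)

for name, est, ci in [
    ("sensitivity", sensitivity, sens_ci),
    ("specificity", specificity, spec_ci),
    ("accuracy", accuracy, acc_ci),
]:
    print(f"{name}: {100 * est:.1f}% (95% CI {100 * ci[0]:.1f}-{100 * ci[1]:.1f})")
print(f"Cramér's V: {v:.3f}")
```

Under these assumptions the script reproduces the reported values (34.7%, 99.1%, 93.8%, their intervals, and V=0.531). This indicates that the reported figures are internally consistent. It does not confirm the underlying counts.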
Conclusions:
The GPT-4o–powered AI system demonstrated robust validity in pre-anesthetic risk assessment. Incorporating comprehensive data sources, such as NHI datasets, markedly improved agreement on ASA classification (κ rose from 0.518 to 0.883). These findings support the integration of large language model (LLM)–based tools into preoperative workflows, with potential to enhance decision support, optimize resource use, and advance smart healthcare delivery.
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.