Natural language processing for automated quantification of brain metastases reported in free-text radiology reports
EANS Academy. Senders J. Sep 27, 2019; 276113; EP12065
Introduction: Although the volume of patient-generated health data is increasing exponentially, its utilization is impeded because most of it is stored in an unstructured format, namely free-text clinical reports. A variety of natural language processing (NLP) methods, ranging from statistical to deep learning-based models, have emerged to automate the processing of free text; however, the optimal approach for medical text analysis remains to be determined. The aim of this study was to provide a head-to-head comparison of novel NLP techniques and to inform future studies about their utility for automated medical text analysis.
Methods: Magnetic resonance imaging reports of patients with brain metastases treated in two tertiary centers were retrieved and manually annotated using a binary classification (single metastasis versus two or more metastases). Multiple bag-of-words and sequence-based NLP models were developed and compared after randomly splitting the annotated reports into a training and test set in an 80:20 ratio.
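As an illustration of the bag-of-words representation and the 80:20 random split described in the Methods, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function names, the tokenization rule, the random seed, and the toy reports are all assumptions made for the example.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(reports):
    """Map every token seen in the given reports to a column index."""
    vocab = sorted({tok for report in reports for tok in tokenize(report)})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(report, vocabulary):
    """Return a count vector; word order is discarded, unknown tokens ignored."""
    counts = Counter(tokenize(report))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


def split_80_20(items, seed=0):
    """Shuffle a copy of the items and split it into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy reports: label 0 = single metastasis, 1 = two or more.
toy = [
    ("Solitary enhancing lesion in the left frontal lobe.", 0),
    ("Multiple enhancing lesions in both hemispheres.", 1),
    ("Single metastasis in the right cerebellum.", 0),
    ("Two lesions, one frontal and one parietal.", 1),
    ("Numerous metastases throughout the brain.", 1),
]

train, test = split_80_20(toy, seed=42)
# The vocabulary is built from the training reports only, so that no
# information from the test set leaks into the features.
vocabulary = build_vocabulary(text for text, _ in train)
X_train = [bag_of_words(text, vocabulary) for text, _ in train]
y_train = [label for _, label in train]
```

The resulting count vectors could then be passed to any classifier; a sequence-based model would instead keep the token order that this representation throws away.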
Results: A total of 1479 radiology reports of patients diagnosed with brain metastases were retrieved. The LASSO regression model demonstrated the best overall performance on the hold-out test set, with an area under the receiver operating characteristic curve of 0.92 (95% CI 0.89-0.94), accuracy of 83% (95% CI 80-87%), calibration intercept of -0.06 (95% CI -0.14 to 0.01), and calibration slope of 1.06 (95% CI 0.95-1.17).
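For readers unfamiliar with the discrimination metrics reported above, the sketch below shows one way accuracy and the area under the ROC curve can be computed from predicted probabilities on a hold-out set. It uses the pairwise (Mann-Whitney) formulation of the AUC. The 0.5 decision threshold and the toy values are assumptions for illustration and are not taken from the study; calibration intercept and slope are not covered here.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(predictions, y_true))
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random
    negative case, with ties counted as one half."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical hold-out labels and predicted probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(accuracy(y_true, y_prob))  # 0.75
print(roc_auc(y_true, y_prob))   # 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a test set of a few hundred reports; a rank-based implementation would scale better.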
Conclusion: Among various NLP techniques, the bag-of-words approach combined with a LASSO regression model demonstrated the best overall performance in extracting binary outcomes from free-text clinical reports. This study provides a framework for the development of machine learning-based NLP models, as well as a clinical vignette in patients diagnosed with brain metastasis.