Screening patients one by one for clinical trials has long been a costly, labor-intensive task. However, a study by a group of researchers on the GPT-4V AI model suggests that much of this work could soon be automated. Let’s take a closer look at the study and at what the model can do!
Highlights:
- Study shows GPT-4V along with RAG offers numerous benefits in Clinical Trial Screening.
- Results show that the model can increase efficiency and reduce costs.
- Holds big potential for the future of AI in the medical sector.
GPT-4V for Clinical Trials Screening Research
To determine whether an AI model could analyze medical records and identify potential clinical trial participants, a group of researchers from Mass General Brigham Personalised Medicine, Harvard Medical School, and Brigham and Women’s Hospital performed a study.
They processed clinical notes and electronic health records (EHR) of possible candidates using GPT-4V, OpenAI’s LLM with image-processing capability, combined with Retrieval-Augmented Generation (RAG).
Selecting patients who met the criteria for the study required a thorough process. It involved several inclusion and exclusion criteria and entailed having trained personnel search through hundreds or thousands of patients’ EHRs to find those who met the requirements.
The structured data already in the EHRs was enough to determine 5 of the 6 inclusion criteria and 5 of the 17 exclusion criteria. The labor-intensive task that the researchers believed artificial intelligence (AI) could help with was analyzing the unstructured data in each patient’s clinical notes to establish the remaining 13 criteria (1 inclusion and 12 exclusion).
They built a question-answering system over clinical notes, driven by a RAG architecture and GPT-4, and named it RECTIFIER (RAG-Enabled Clinical Trial Infrastructure for Inclusion Exclusion Review). During development, they used the structured assessments completed by the study staff, together with clinical notes from the past two years, as a reference.
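To make the RAG idea more concrete, here is a minimal sketch of the retrieval step: pick the note excerpts most relevant to a screening question and place them in a prompt. This is not RECTIFIER’s actual implementation. The function names (`retrieve`, `build_prompt`), the prompt wording, and the sample notes are all invented for illustration, and a simple bag-of-words cosine similarity stands in for whatever embedding search a real system would use. The call to the language model itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k note chunks most similar to the question.

    Assumption: bag-of-words similarity is a stand-in for an
    embedding-based search.
    """
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a prompt from the retrieved excerpts (hypothetical wording)."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return (
        "Answer Yes or No using only the clinical note excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}\n"
    )


# Invented sample notes, for illustration only.
notes = [
    "Patient reports shortness of breath on exertion; "
    "assessment: symptomatic heart failure.",
    "Family history notable for type 2 diabetes.",
    "Follow-up visit scheduled in three months.",
]
question = "Does the patient have symptomatic heart failure?"
print(build_prompt(question, notes, k=1))
```

The point of retrieving first is that the model only sees the few excerpts that bear on the question, rather than two years of notes at once.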
The dataset was split into three parts: 100 patient notes as a development set, 282 patient notes as a validation set, and 1894 patient notes as a test set.
For every question and screening method, they computed four metrics: sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC). They also used bootstrapping to estimate confidence intervals for each statistic.
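For readers who want to see how these metrics fit together, the sketch below computes all four from binary labels (1 = criterion met, 0 = not met) and estimates a percentile bootstrap confidence interval. The labels are made up, the function names are ours, and the percentile method with 1000 resamples is an assumption: the article does not say which bootstrap variant the researchers used.

```python
import math
import random


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def screening_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC (NaN when undefined)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


def bootstrap_ci(y_true, y_pred, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one metric.

    Assumption: patients are resampled with replacement and the
    alpha/2 and 1 - alpha/2 percentiles are reported.
    """
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        value = screening_metrics(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )[metric]
        if not math.isnan(value):  # skip resamples where the metric is undefined
            values.append(value)
    values.sort()
    lower = values[int((alpha / 2) * (len(values) - 1))]
    upper = values[int((1 - alpha / 2) * (len(values) - 1))]
    return lower, upper


# Made-up labels: the expert's answer vs. a screener's answer for 10 patients.
expert = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
screener = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(screening_metrics(expert, screener))
print(bootstrap_ci(expert, screener, "accuracy"))
```

MCC is worth reporting next to accuracy because it stays informative when one answer is much more common than the other, which is typical for exclusion criteria that most patients do not meet.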
What did the Results Show?
The results of the study were quite impressive. Measured against the expert physician’s answers across all criteria, RECTIFIER’s accuracy ranged from 97.9% to 100% (MCC from 0.837 to 1), while the study personnel’s accuracy ranged from 91.7% to 100% (MCC from 0.644 to 1).
The standout result was that RECTIFIER performed better than the study staff on the inclusion criterion of “symptomatic heart failure”, with an accuracy of 97.9% vs 91.7% and an MCC of 0.924 vs 0.721, respectively.
RECTIFIER also showed strong sensitivity and specificity, at 92.3% (CI) and 93.9% (CI) respectively, compared to the study staff’s 90.1% (CI) and 83.6% (CI).
Overall, the results suggest that GPT-4V with RAG could complete the task more quickly than study personnel while retaining accuracy.
What does this mean for the Future of AI in the Medical Field?
This study marks a promising step for AI in the medical sector. With systems such as RECTIFIER emerging, we can expect many more similar models in the future, likely with further benefits.
For now, let’s explore four ways in which RAG-enabled GPT-4 could change the medical field:
- Enhanced Clinical Decision Support: GPT-4V along with RAG could analyze vast amounts of medical literature, patient data, and clinical guidelines to provide real-time, personalized recommendations to healthcare professionals, aiding in diagnosis, treatment selection, and risk assessment.
- Improved Efficiency and Accuracy in Research: GPT-4’s ability to process and synthesize information could accelerate research by generating summaries of relevant research papers to save researchers time. It could also identify potential research gaps, suggest new avenues for investigation, and automate tedious tasks like data mining and literature review.
- Drug Discovery and Development: GPT-4 could analyze data on existing drugs, clinical trials, and molecular structures to identify promising candidates for new drug development, potentially leading to faster and more effective treatments.
- Streamlining Administrative Tasks: GPT-4 could automate administrative tasks such as scheduling appointments, generating reports, and processing medical claims, freeing up valuable time for healthcare professionals to focus on patient care.
Thus GPT-4, a large language model (LLM), has the potential to change how healthcare data is used and queried, which could lead to better patient care and medical research.
Conclusion
The researchers concluded that, in comparison to using human personnel, utilizing an AI model like GPT-4V with RAG can significantly reduce costs in the identification of clinical trial candidates. They also acknowledged that caution is needed when entrusting automated systems with medical care. Still, the results suggest that, given the right guidance, AI can match or even outperform human screeners on this task!