Papers
arxiv:2310.11266

Emulating Human Cognitive Processes for Expert-Level Medical Question-Answering with Large Language Models

Published on Oct 17, 2023
Authors:
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,

Abstract

In response to the pressing need for advanced clinical problem-solving tools in healthcare, we introduce BooksMed, a novel framework based on a Large Language Model (LLM). BooksMed uniquely emulates human cognitive processes to deliver evidence-based and reliable responses, utilizing the GRADE (Grading of Recommendations, Assessment, Development, and Evaluations) framework to effectively quantify evidence strength. For clinical decision-making to be appropriately assessed, an evaluation metric that is clinically aligned and validated is required. As a solution, we present ExpertMedQA, a multispecialty clinical benchmark comprised of open-ended, expert-level clinical questions, and validated by a diverse group of medical professionals. By demanding an in-depth understanding and critical appraisal of up-to-date clinical literature, ExpertMedQA rigorously evaluates LLM performance. BooksMed outperforms existing state-of-the-art models Med-PaLM 2, Almanac, and ChatGPT in a variety of medical scenarios. Therefore, a framework that mimics human cognitive stages could be a useful tool for providing reliable and evidence-based responses to clinical inquiries.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2310.11266 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2310.11266 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2310.11266 in a Space README.md to link it from this page.

Collections including this paper 1