DigitalCommons@Kennesaw State University - C-Day Computing Showcase: GRM-038 Optimizing Prompts for Alzheimer's Speech Classification Using LLM

 

Presenter Information

Imaan Shahid

Location

https://www.kennesaw.edu/ccse/events/computing-showcase/sp25-cday-program.php

Document Type

Event

Start Date

15-4-2025 4:00 PM

Description

Large Language Models (LLMs) are widely used in Alzheimer's disease research to classify speech patterns. However, there is no standardized framework to ensure the reliability of the prompts used in these classifications. This study investigates how sensitive Alzheimer's disease classification prompts are to small variations and finds that they are indeed sensitive, which leads to inconsistent model performance. To address this, we implement an automatic prompt optimization framework to refine the base prompt. Experimental results show that the optimized prompt improves classification accuracy by 12.83% over the baseline, underscoring the importance of systematic prompt engineering for reliable LLM-based Alzheimer's disease detection. Although the optimized prompt remained sensitive to variations, it consistently showed improved overall accuracy.
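The sensitivity check described in the abstract can be illustrated with a minimal sketch. This is not the study's framework or data: `toy_classify` is a hypothetical, deterministic stand-in for an LLM call, and the prompts, transcripts, and labels are invented for illustration only. The sketch scores each prompt variant on the same labeled samples and reports how far accuracy moves between variants.

```python
from statistics import mean, pstdev

def accuracy(classify, prompt, samples):
    """Fraction of (transcript, label) pairs the classifier gets right."""
    correct = sum(1 for text, label in samples if classify(prompt, text) == label)
    return correct / len(samples)

def sensitivity_report(classify, prompts, samples):
    """Score every prompt variant and summarize the variation in accuracy."""
    scores = {p: accuracy(classify, p, samples) for p in prompts}
    values = list(scores.values())
    return {
        "per_prompt": scores,
        "mean": mean(values),
        "spread": max(values) - min(values),  # best minus worst variant
        "std": pstdev(values),
    }

def toy_classify(prompt, transcript):
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    It counts filler words and uses a threshold that depends on the prompt
    wording, which mimics a model whose output shifts with small prompt edits.
    """
    threshold = 2 if "hesitation" in prompt.lower() else 3
    words = transcript.lower().split()
    fillers = words.count("uh") + words.count("um")
    return "AD" if fillers >= threshold else "control"

# Invented prompt variants and labeled transcripts (illustrative only).
PROMPTS = [
    "Classify the transcript as AD or control.",
    "Classify the transcript as AD or control, noting hesitation markers.",
]
SAMPLES = [
    ("uh the boy is um uh taking the cookie", "AD"),
    ("the um water is uh overflowing", "AD"),
    ("the mother is washing dishes", "control"),
    ("um the girl asks for a cookie", "control"),
]

report = sensitivity_report(toy_classify, PROMPTS, SAMPLES)
for prompt, score in report["per_prompt"].items():
    print(f"{score:.2f}  {prompt}")
print(f"spread = {report['spread']:.2f}")
```

In practice, `toy_classify` would be replaced by a call to the model under study, and an optimization loop could keep whichever candidate prompt scores highest on held-out data. A large spread across near-identical prompts is the kind of sensitivity the abstract reports.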
