Presenter Information

Kris Prasad

Location

https://www.kennesaw.edu/ccse/events/computing-showcase/fa24-cday-program.php

Document Type

Event

Start Date

19-11-2024 4:00 PM

Description

This study evaluates the effectiveness of LLMs in supporting mental health applications by analyzing their performance in understanding and categorizing mental health-related user inputs. We collected data from various mental health apps on the Google Play Store, including user reviews and app descriptions, and filtered the content using a targeted mental health keyword bank. Sentiment analysis and keyword similarity scores were generated for each review using RoBERTa-based models; these scores showed how well each review aligned with the mental health keywords advertised by the app and how users felt about the app. We prompted four modern LLMs: GPT-4o, Claude 3.5 Sonnet, Gemma 2, and GPT-3.5-Turbo, providing Gemma 2 and GPT-3.5-Turbo with our dataset for more informed outputs. Our prompts covered five common mental health conditions (depression, anxiety, ADHD, PTSD, and insomnia), and we asked each model to provide up to five app recommendations. The results showed that our data-enhanced LLMs noticeably outperformed the other state-of-the-art LLMs in the accuracy, quality, and variety of their outputs while being far more cost-effective. This suggests that data-enhanced, low-cost LLMs can serve as an effective alternative to newer, more powerful, and more expensive models, achieving notably better results when interpreting nuanced text for mental health applications.
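To make the scoring step concrete, below is a minimal sketch of how per-review sentiment and keyword-similarity scores could be produced. It assumes the Hugging Face cardiffnlp/twitter-roberta-base-sentiment-latest checkpoint for sentiment and the sentence-transformers all-roberta-large-v1 model for embeddings; the study's exact RoBERTa variants, keyword bank, and thresholds are not specified, so those names and the sample keywords are illustrative assumptions only.

# Illustrative sketch of the review-scoring step: sentiment via a RoBERTa
# classifier and keyword similarity via RoBERTa sentence embeddings.
# Model checkpoints, the keyword bank, and all parameters are assumptions,
# not the study's actual configuration.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Hypothetical keyword bank drawn from the five conditions named in the abstract.
KEYWORDS = ["depression", "anxiety", "ADHD", "PTSD", "insomnia"]

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)
embedder = SentenceTransformer("sentence-transformers/all-roberta-large-v1")
keyword_vecs = embedder.encode(KEYWORDS, convert_to_tensor=True)

def score_review(review: str) -> dict:
    """Return the sentiment label/score and the closest keyword with its similarity."""
    sent = sentiment(review, truncation=True)[0]
    review_vec = embedder.encode(review, convert_to_tensor=True)
    sims = util.cos_sim(review_vec, keyword_vecs)[0]  # cosine similarity per keyword
    best = int(sims.argmax())
    return {
        "sentiment": sent["label"],
        "sentiment_score": round(sent["score"], 3),
        "top_keyword": KEYWORDS[best],
        "keyword_similarity": round(float(sims[best]), 3),
    }

print(score_review("This app finally helped me manage my anxiety and sleep better."))

A full pipeline would batch reviews and retain the similarity to every keyword rather than only the top match, but the shape of the output mirrors the per-review scores described in the abstract.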

Nov 19th, 4:00 PM

UR-172 A Comparative Study of LLM Effectiveness in Mental Health Assistance
