Location
https://www.kennesaw.edu/ccse/events/computing-showcase/sp26-cday-program.php
Document Type
Event
Start Date
April 22, 2026, 4:00 PM
Description
Medical AI systems are increasingly deployed in clinical settings, yet most published models report only clean accuracy, dataset details, and training procedures—while omitting security‑critical evaluations such as robustness to perturbations, adversarial vulnerability, and failure modes under realistic noise. This project addresses that gap by building a cross‑modal robustness assessment for Parkinson’s disease (PD) screening models across handwriting trajectories, speech‑derived acoustic features, and an LLM‑based preprocessing layer. Despite strong clean performance (visual subject‑level ROC AUC ≈ 0.99; audio ≈ 1.0), the visual pipeline proved highly brittle to realistic acquisition distortions. Downsampling and point‑dropout caused near‑chance collapse, while pressure noise and XY jitter produced monotonic degradation and threshold instability. Simple defenses improved robustness in corruption‑specific ways—augmentation was the strongest general‑purpose method, and resampling+augmentation best mitigated sampling‑density failures—yet all defenses reduced clean accuracy. Small adversarial perturbations (ε ≈ 0.2–0.3) reliably flipped predictions. Speech‑feature models were naturally robust to random corruptions and generalized well across utterance types, but adversarial attacks again caused sharp collapse at small ε, revealing a shared vulnerability across modalities. An LLM layer introduced additional instability: Llama‑3 was format‑stable but less feature‑grounded, while Gemma‑3 was more accurate but more brittle. Overall, the project demonstrates that high clean accuracy does not imply real‑world reliability, and that security evaluation must become a standard component of medical AI development.
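The abstract does not include code, but its corruption suite is concrete enough to sketch. The Python snippet below illustrates, under stated assumptions, what the four acquisition distortions (downsampling, point dropout, XY jitter, pressure noise) could look like for handwriting trajectories stored as (N, 3) arrays of x, y, and pen pressure; the array layout, function names, and severity values are all illustrative, not the project's actual implementation.

```python
import numpy as np

# Assumed trajectory format: float array of shape (N, 3) holding
# x, y, and pen pressure per sampled point. Severities are illustrative.
rng = np.random.default_rng(0)

def downsample(traj, keep_every=4):
    """Simulate a lower acquisition rate by keeping every k-th point."""
    return traj[::keep_every]

def point_dropout(traj, p=0.2):
    """Randomly discard a fraction p of points, as if samples were lost."""
    keep = rng.random(len(traj)) >= p
    return traj[keep]

def xy_jitter(traj, sigma=0.01):
    """Add Gaussian noise to the spatial coordinates only."""
    out = traj.copy()
    out[:, :2] += rng.normal(0.0, sigma, size=out[:, :2].shape)
    return out

def pressure_noise(traj, sigma=0.05):
    """Perturb the pressure channel, clipped to a valid [0, 1] range."""
    out = traj.copy()
    out[:, 2] = np.clip(out[:, 2] + rng.normal(0.0, sigma, len(out)), 0.0, 1.0)
    return out
```

Sweeping the severity parameters (keep_every, p, sigma) and re-measuring subject-level ROC AUC at each setting is one standard way to produce the kind of corruption curves, and the monotonic degradation, reported above.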
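The abstract reports that perturbations of ε ≈ 0.2–0.3 reliably flip predictions but does not name the attack used. A one-step FGSM-style attack is one common way to generate such ε-bounded perturbations; the PyTorch sketch below is a generic illustration under that assumption, not the project's method.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.25):
    """One-step FGSM: shift each input by eps along the sign of the loss
    gradient. eps = 0.25 sits in the 0.2-0.3 range cited in the abstract;
    the attack choice itself is an assumption, not taken from the project."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()
```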
Included in
GRP-03-141 Stress-Testing Parkinson’s Disease Screening: A Cross-Modal Analysis of Drawing and Speech Models