C-Day Fall 2025 Graduate Projects

GC-1215 ClinicalRAG: A Scalable Benchmark of Privacy, Relevance, and Speed in Semantic Retrieval for clinical transcriptions

Pradyumna Kumar
Sai Sruti Dandibhatla
Srinivasan Subramanian
Purna Chandu Anukula
Pranitha Athukuri

Location

https://www.kennesaw.edu/ccse/events/computing-showcase/fa25-cday-program.php

Document Type

Event

Start Date

24-11-2025 4:00 PM

Description

Traditional keyword search struggles with the scale, complexity, and contextual depth of clinical data. This project develops and evaluates semantic search systems that better understand medical language, enabling physicians and researchers to retrieve contextually relevant information through a Retrieval-Augmented Generation (RAG) framework. We integrate privacy-preserving methods, including differential privacy and homomorphic encryption, to protect sensitive clinical transcriptions. To improve speed and accuracy, we enhance the baseline RAG architecture with Hierarchical Navigable Small World (HNSW) indexing and Maximal Marginal Relevance (MMR)-based reranking. To ensure scalability, clinical documents are ingested using PySpark and stored in a vector database optimized for high-dimensional queries, enabling fast, accurate, and privacy-aware retrieval of medical transcriptions.
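
The MMR-based reranking mentioned above can be illustrated with a minimal sketch. This is not the project's implementation: the function names `cosine` and `mmr_rerank`, the `lambda_param` trade-off weight, and the use of cosine similarity on plain Python lists are all assumptions made for illustration. The idea is that each step picks the candidate that is most relevant to the query while being least similar to the results already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_rerank(query_vec, doc_vecs, k, lambda_param=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_param (hypothetical name) weights relevance against redundancy:
    1.0 ranks purely by relevance, lower values favor diversity.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_param * relevance[i] - (1 - lambda_param) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


if __name__ == "__main__":
    # Toy embeddings (made up): documents 0 and 1 are near-duplicates.
    query = [1.0, 0.0]
    docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
    print(mmr_rerank(query, docs, k=2, lambda_param=1.0))
    print(mmr_rerank(query, docs, k=2, lambda_param=0.3))
```

In a pipeline like the one described, the candidates passed to such a function would presumably be the top results returned by the HNSW index, so the reranking cost stays small relative to the initial search.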

Included in

Computer Sciences Commons
