Date of Award

Spring 4-28-2023

Degree Type

Dissertation

Degree Name

Doctor of Philosophy in Analytics and Data Science

Department

Statistics and Analytical Sciences

Committee Chair/First Advisor

Dr. Ramazan Aygun

Committee Member

Dr. Ying Xie

Committee Member

Dr. Yifan Zhang

Abstract

Natural Language Processing (NLP) systems are ubiquitous on the internet, from search engines and language translation to more advanced applications such as voice assistants and customer service. Since humans are always on the receiving end of NLP technologies, it is important to analyze whether the Large Language Models (LLMs) in use are biased and therefore unfair. The majority of research on NLP bias has focused on societal stereotype biases embedded in LLMs. Our research, however, addresses all types of biases present in LLMs, namely model class-level bias, stereotype bias, and domain bias. Model class-level bias occurs when a model tends to favor some classification labels or outputs over others. We investigate how a classification model can heavily favor one class with respect to another, and we propose a bias evaluation technique, called directional pairwise class confusion bias, that highlights an LLM's bias on pairs of classes. Unfavorable stereotype bias occurs when LLMs cause significant injustice or harm to disadvantaged or marginalized groups of people. Although the most advanced deep LLMs claim to mimic human responses via powerful and sophisticated algorithms, the capabilities such models offer have been shown to possess bias. Quantifying such stereotype biases appropriately is essential so that the bias measures can be used to calibrate the potential harm the models can cause. Domain biases, on the other hand, are desirable, because they indicate that the model is learning the facts necessary for it to be powerful. We devise techniques to measure class-level, stereotype, and domain biases appropriately.
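To make the idea of directional pairwise class confusion concrete, here is a minimal sketch in Python. It assumes the metric compares asymmetric misclassification rates between a pair of classes in a confusion matrix; the exact formulation is defined in the dissertation itself, so the function and its formula below are illustrative assumptions, not the author's published definition.

    import numpy as np

    def directional_pairwise_confusion(conf_matrix, a, b):
        """Illustrative directional confusion rates between classes a and b.

        conf_matrix[i, j] = count of true-class-i examples predicted as class j.
        Returns the rate at which true-a is mislabeled as b, and vice versa.
        (Hypothetical formulation; the dissertation defines the actual metric.)
        """
        a_total = conf_matrix[a].sum()
        b_total = conf_matrix[b].sum()
        a_to_b = conf_matrix[a, b] / a_total if a_total else 0.0
        b_to_a = conf_matrix[b, a] / b_total if b_total else 0.0
        return a_to_b, b_to_a

    # Example: confusion matrix of a hypothetical 3-class text classifier
    cm = np.array([
        [80, 15,  5],   # true class 0
        [ 3, 90,  7],   # true class 1
        [ 2, 30, 68],   # true class 2
    ])
    fwd, rev = directional_pairwise_confusion(cm, 2, 1)
    print(f"P(pred=1 | true=2) = {fwd:.2f} vs P(pred=2 | true=1) = {rev:.2f}")
    # A large asymmetry (0.30 vs 0.07 here) suggests the model favors
    # class 1 over class 2 in that direction, i.e., class-level bias.

In this toy example, the model confuses true class 2 for class 1 far more often than the reverse, which is the kind of directional, pairwise favoritism the proposed technique is designed to surface.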
