Semester of Graduation

Spring 2026

Degree Type

Thesis

Degree Name

Master of Science in Artificial Intelligence

Department

Computer Science - College of Computing and Software Engineering

Committee Chair/First Advisor

Dr. Jiho Noh

Second Advisor

Dr. Nasrin Dehbozorgi

Third Advisor

Dr. Dylan Gaines

Abstract

Assessing creativity at scale remains a persistent challenge in cognitive science, as human raters are costly, slow, and often inconsistent in their judgments. This thesis introduced a novel framework for automated scientific creativity assessment using forced pairwise ranking, in which fine-tuned large language models compared response pairs and determined which was more creative. Five empirical studies were conducted using Llama-2-7B and Llama-2-13B models adapted via LoRA fine-tuning and benchmarked against human-scored responses from the Scientific Creative Thinking Test. A regression baseline achieved Pearson r = .74 on the test set, matching the human inter-rater ceiling reported in the literature. A pairwise classification paradigm aggregated judgments into continuous creativity scores via Elo rating, achieving r = .69. Increased model scale provided no meaningful performance gains under matched training conditions. Uncertainty quantification via stochastic sampling and DBSCAN clustering did not reliably predict prediction error, while tolerance accuracy analysis confirmed that both models were most reliable in the mid-range of the creativity spectrum and could approximate whether a response reflects lower, average, or elevated creativity. Inter-rater disagreement among human judges was identified as a significant contributing source of the performance ceiling, with disagreement concentrated at the upper end of the creativity spectrum where model predictions were also least reliable. These findings suggest that future progress in automated creativity assessment depends more on improving ground truth label quality than on scaling model size or changing training paradigms.
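The abstract describes aggregating pairwise "which is more creative" judgments into continuous scores via Elo rating. The following is a minimal sketch of that general idea, not the thesis's actual procedure: the function name `elo_scores`, the K-factor of 32, the starting rating of 1000, and the sequential single-pass update are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Tuple


def elo_scores(
    comparisons: Iterable[Tuple[str, str]],
    k: float = 32.0,        # assumed K-factor (hypothetical choice)
    initial: float = 1000.0,  # assumed starting rating (hypothetical choice)
) -> Dict[str, float]:
    """Aggregate (winner, loser) pairwise judgments into Elo ratings."""
    ratings: Dict[str, float] = {}
    for winner, loser in comparisons:
        r_w = ratings.setdefault(winner, initial)
        r_l = ratings.setdefault(loser, initial)
        # Expected probability that the winner is judged more creative,
        # given the current ratings (standard Elo logistic curve).
        expected_w = 1.0 / (1.0 + 10.0 ** ((r_l - r_w) / 400.0))
        # Zero-sum update: the winner gains exactly what the loser gives up.
        delta = k * (1.0 - expected_w)
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return ratings


# Hypothetical judgments: each tuple is (more creative, less creative).
judgments = [("A", "B"), ("A", "C"), ("B", "C")]
scores = elo_scores(judgments)
```

Because sequential Elo updates depend on the order of comparisons, a practical pipeline might shuffle the comparisons and average ratings over several passes; whether the thesis did so is not stated in the abstract.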

Comments

None

Included in

Cognitive Science Commons, Computer and Systems Architecture Commons, Multivariate Analysis Commons

Master's Theses

A Novel Approach to Creativity Assessment: Forced Pairwise Ranking With Large Language Models
