Semester of Graduation

Summer 2025

Degree Type

Dissertation

Degree Name

Doctor of Philosophy in Data Science and Analytics

Department

Data Science and Analytics

Committee Chair/First Advisor

Herman E. Ray

Second Advisor

Linh Le

Third Advisor

Xinyan Zhang

Abstract

Recent advancements in deep learning, particularly the development of large language models, have generated substantial interest, yet there remains limited evidence that these technologies consistently fulfill their anticipated potential. While uncertainty quantification has been extensively studied for classification and regression tasks, it is comparatively underdeveloped for generative models, particularly image captioning models. At present, there is limited consensus on appropriate methodologies for quantifying uncertainty in these systems. This research examines existing uncertainty quantification approaches and evaluates their suitability for image captioning models. The findings indicate that current methods are generally inadequate for the generative setting, owing to the conditional and recursive nature of language generation. To address this gap, we conduct experiments on the generation of structured captions and develop a distributional framework to quantify uncertainty based on the predicted probabilities associated with generated tokens. We find that the distributional method works when the number of generated tokens is limited. Subsequently, the investigation extends to unstructured captions, wherein we introduce a method for constructing prediction sets around parts of speech, thereby providing a specified level of confidence that the true value resides within the set. These prediction sets can be used to score captions, facilitating the identification of captions that warrant further review. This approach not only enables the quantification of uncertainty in generated text captions but also supports the formation of word sets that are most relevant to the image.
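As a rough illustration of what a prediction set with a specified confidence level can look like, the sketch below uses split conformal prediction over a small candidate vocabulary. This is an assumption made for illustration only: the abstract does not state which construction the dissertation uses, and the function names, the score (one minus the predicted probability of the true word), and the calibration data are all hypothetical.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from held-out calibration data (split conformal, assumed setup).

    cal_probs:  list of per-example probability lists over candidate words
    cal_labels: index of the true word for each calibration example
    alpha:      tolerated miscoverage, e.g. 0.1 for 90% target coverage
    """
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability assigned to the true word.
    scores = sorted(1.0 - row[label] for row, label in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration examples for this alpha: keep every candidate.
        return math.inf
    return scores[k - 1]

def prediction_set(probs, threshold):
    """Indices of candidate words whose score does not exceed the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= threshold]

# Hypothetical calibration data over a two-word vocabulary.
cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
cal_labels = [0, 0, 1, 0]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
candidates = prediction_set([0.75, 0.25], threshold)
```

Under this construction, a larger set for a given word position signals higher uncertainty, which is one way set size could feed a caption-level review score.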

Available for download on Tuesday, July 27, 2027
