Date of Award

Spring 3-28-2024

Degree Type


Degree Name

PhD in Analytics and Data Science


School of Data Science

Committee Chair/First Advisor

Ying Xie

Second Advisor

Sherry Ni

Third Advisor

Xinyue Zhang

Fourth Advisor

Sumit Chakravarty


The rapid growth of e-commerce has created a need for sophisticated product retrieval systems that can effectively match user queries with relevant products. However, the semantic gap between queries and products remains a significant challenge, as traditional retrieval methods often fail to capture the nuances of user purchase intent. E-commerce click-stream data and product catalogs offer critical insights into user behavior and product knowledge that current product search algorithms leave untapped. This dissertation presents learning strategies that leverage query-product transaction logs to enrich the pipeline of our proposed multi-modal transformer model, which transforms initial user queries into pseudo-product embeddings. The proposed architecture uses a two-stage training process: the first stage estimates purchase intent and extracts granular product features, and the second refines the model to generate high-quality pseudo-product representations. By extracting novel information from purchase history transactions, the model can infer users' potential purchase intent from their limited queries and purchased products, which helps bridge the semantic gap between queries and relevant products. We demonstrate that our model outperforms state-of-the-art alternatives on online e-commerce retrieval in both controlled and real-world experiments. Our ablation studies confirm that the proposed transformer architecture and integrated learning strategies enable the mining of key data sources to infer purchase intent, extract product features, and enhance the transformation pipeline from queries to more accurate pseudo-product representations.
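To make the retrieval idea concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes that a query vector is mapped into product-embedding space and that products are then ranked by cosine similarity to the resulting pseudo-product embedding. The linear map `WEIGHTS`, the function names, and the toy catalog are all hypothetical stand-ins for the trained multi-modal transformer and real data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def to_pseudo_product(query_vec, weights):
    """Hypothetical stand-in for the trained model: a plain linear map
    from query space into product-embedding space."""
    return [sum(w * q for w, q in zip(row, query_vec)) for row in weights]


def retrieve(query_vec, weights, catalog, k=2):
    """Return the k product ids most similar to the pseudo-product embedding."""
    pseudo = to_pseudo_product(query_vec, weights)
    ranked = sorted(
        catalog,
        key=lambda pid: cosine(pseudo, catalog[pid]),
        reverse=True,
    )
    return ranked[:k]


# Toy values invented purely for illustration (assumption, not real data).
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 2-d query -> 3-d product space
CATALOG = {
    "running_shoes": [0.9, 0.1, 0.5],
    "hiking_boots": [0.6, 0.4, 0.5],
    "coffee_maker": [0.0, 1.0, 0.5],
}

top = retrieve([1.0, 0.0], WEIGHTS, CATALOG, k=2)
print(top)
```

In this sketch the ranking step is ordinary nearest-neighbor search. Under these assumptions, the learned components described in the abstract would replace `to_pseudo_product`, while the similarity ranking itself would stay the same.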

Available for download on Tuesday, May 06, 2025