Date of Award
Spring 3-28-2024
Degree Type
Dissertation
Degree Name
Doctor of Philosophy in Data Science and Analytics
Department
School of Data Science and Analytics
Committee Chair/First Advisor
Ying Xie
Second Advisor
Sherry Ni
Third Advisor
Xinyue Zhang
Fourth Advisor
Sumit Chakravarty
Abstract
The rapid growth of e-commerce has necessitated the development of sophisticated product retrieval systems that can effectively match user queries with relevant products. However, the semantic gap between queries and products remains a significant challenge, as traditional retrieval methods often fail to capture the nuances of user purchase intentions. E-commerce click-stream data and product catalogs offer critical user behavior insights and product knowledge that current product search algorithms leave untapped. This dissertation presents learning strategies that leverage query-product transaction logs to enrich the pipeline of our proposed multi-modal transformer model, which transforms initial user queries into pseudo-product embeddings. The proposed architecture integrates a two-stage training process that first estimates purchase intent and extracts granular product features, and then refines the model to generate high-quality pseudo-product representations. By extracting novel information from purchase history transactions, the model can infer users' potential purchase intent from their limited queries and purchased products, enabling it to bridge the semantic gap between queries and relevant products. We demonstrate our model's superior performance over state-of-the-art alternatives on e-commerce online retrieval in both controlled and real-world experiments. Our ablation studies confirm that the proposed transformer architecture and integrated learning strategies enable the mining of key data sources to infer purchase intent, extract product features, and enhance the transformation pipeline from queries to more accurate pseudo-product representations.
Included in
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Data Science Commons, Theory and Algorithms Commons