Measuring Customer Similarity and Identifying Cross-Selling Products by Community Detection


School of Data Science and Analytics


Product affinity segmentation discovers groups of customers with similar purchase preferences for cross-selling opportunities to increase sales and customer loyalty. However, this concept can be challenging to implement efficiently and effectively for actionable strategies. First, the nature of skewed and sparse product-level data in the clustering process results in less meaningful solutions. Second, customer segmentation becomes challenging on massive data sets due to the computational complexity of traditional clustering methods. Third, market basket analysis may suffer from association rules too general to be relevant for important segments. In this article, we propose to partition customers into groups with their product purchase similarity maximized by detecting communities in the customer-product bipartite graph using the Louvain algorithm. Through a case study using data from a large U.S. retailer, we demonstrate that the proposed method generates interpretable clustering results with distinct product purchase patterns. Comprehensive characteristics of customers and products in each cluster can be inferred with statistical significance since they are essentially driven by products purchased by customers. Compared with the conventional RFM (recency, frequency, monetary) model, the proposed approach leads to higher response rates in the recommendation of products to customers in the same cluster. Our analysis provides greater insights into customer purchase behaviors, improves product recommendation effectiveness, and addresses computational complexity in the context of skewed and sparse big data.
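The core idea of the abstract, detecting communities in a customer-product bipartite graph with the Louvain algorithm, can be illustrated with a small sketch. This is not the authors' implementation: the toy purchase records, the quantity-based edge weights, and the function names (`bipartite_graph`, `local_moving`, `aggregate`, `louvain`) are all assumptions made for illustration, and the code is a compact, unoptimized rendering of the standard two-phase Louvain procedure (local node moves followed by community aggregation).

```python
from collections import defaultdict

def bipartite_graph(purchases):
    """Build a weighted customer-product graph as a nested adjacency dict."""
    adj = defaultdict(lambda: defaultdict(float))
    for customer, product, weight in purchases:
        c, p = ("customer", customer), ("product", product)
        adj[c][p] += weight
        adj[p][c] += weight
    return adj

def local_moving(adj):
    """Louvain phase 1: greedily move nodes to the neighboring community
    that yields the largest modularity gain, until no move helps."""
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    two_m = sum(degree.values())
    comm = {n: n for n in adj}      # every node starts in its own community
    comm_degree = dict(degree)      # total degree of each community
    moved_any, improved = False, True
    while improved:
        improved = False
        for node in sorted(adj):    # fixed order keeps the sketch deterministic
            links = defaultdict(float)  # edge weight from node to each community
            for nbr, w in adj[node].items():
                if nbr != node:         # self-loops do not affect the choice
                    links[comm[nbr]] += w
            current = comm[node]
            comm_degree[current] -= degree[node]  # take node out temporarily

            def gain(c):
                # Modularity gain of joining community c, up to a constant factor.
                return links.get(c, 0.0) - comm_degree[c] * degree[node] / two_m

            best = max(sorted(links), key=gain, default=current)
            if gain(best) <= gain(current) + 1e-12:
                best = current          # only move on a strict improvement
            comm_degree[best] += degree[node]
            if best != current:
                comm[node] = best
                improved = moved_any = True
    return comm, moved_any

def aggregate(adj, comm):
    """Louvain phase 2: collapse each community into a single super-node.
    Internal edges become self-loops, counted once from each endpoint."""
    agg = defaultdict(lambda: defaultdict(float))
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            agg[comm[u]][comm[v]] += w
    return agg

def louvain(adj):
    """Alternate the two phases until no node changes community."""
    membership = {n: n for n in adj}    # original node -> community label
    while True:
        comm, moved = local_moving(adj)
        if not moved:
            return membership
        membership = {n: comm[c] for n, c in membership.items()}
        adj = aggregate(adj, comm)

# Hypothetical purchase records: (customer, product, quantity).
purchases = [
    ("a", "bread", 3), ("a", "milk", 2),
    ("b", "bread", 2), ("b", "milk", 3), ("b", "nails", 1),
    ("c", "drill", 2), ("c", "nails", 3),
    ("d", "drill", 3), ("d", "nails", 2),
]
membership = louvain(bipartite_graph(purchases))
```

In this toy data, customers `a` and `b` end up in one community together with `bread` and `milk`, while `c` and `d` are grouped with `drill` and `nails`, despite the single cross-purchase by `b`. Because customers and products share community labels, each segment comes with the products that define it, which is the property the abstract relies on for interpretability and within-cluster recommendation. For data at the scale the abstract describes, an established graph library would be a more practical choice than this pure-Python sketch.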

Journal Title

Big Data




