Department
Statistics and Analytical Sciences
Document Type
Article
Submission Date
2016
Abstract
An isotonic regression model fits an isotonic function of the explanatory variables to estimate the expectation of the response variable. In other words, as the function increases, the estimated expectation of the response must be non-decreasing. With this characteristic, isotonic regression could be a suitable option for analyzing and predicting business risk scores. A current challenge of isotonic regression is the loss of performance when the model is fitted to a large data set, e.g. one with more than four or five dimensions. This paper attempts to apply isotonic regression models to the prediction of business risk scores using a large data set with approximately 50 numeric variables and 24 million observations. The new models are evaluated by comparing them with a traditional logistic regression model built for the same data set. The primary finding is that isotonic regression using distance aggregate functions does not outperform logistic regression. The performance gap is narrow, however, suggesting that isotonic regression may still be used if necessary, since it may achieve better convergence speed on massive data sets.
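To illustrate the non-decreasing constraint described in the abstract, the following is a minimal one-dimensional sketch using the pool-adjacent-violators algorithm, a standard way to compute a least-squares isotonic fit. It is not the article's method: the article works with roughly 50 variables and distance aggregate functions, which this sketch does not attempt to reproduce. The function name `pava` and the sample data are hypothetical and chosen only for illustration.

```python
def pava(y, w=None):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators).

    y is assumed to be already ordered by the explanatory score.
    w holds optional positive observation weights (defaults to 1 each).
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the pooled blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical data: a binary risk outcome, ordered by an increasing score.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
fitted = pava(outcomes)
print(fitted)
```

Each fitted value is the mean of a pooled block of neighbouring observations, so the estimated expectation never decreases as the score increases, which is the property that makes the model a candidate for risk scores.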