2. Review of Literature
Aggregate claims for a homogeneous insurance portfolio have long been estimated with purely algorithmic methods (Chain-Ladder, Bornhuetter-Ferguson, and Poisson) or simple stochastic methods (generalized linear models, Bayesian, distributional, and bootstrap methods, among others) (Wüthrich & Merz, 2008; De Jong & Heller, 2008). Algorithmic, distribution-free methods use mechanical techniques (the run-off triangle) to predict claim reserves. This approach does not allow the uncertainty in these predictions to be quantified; uncertainty can be assessed only when the prediction algorithm is grounded in an underlying stochastic model. Several recent studies propose improvements to the existing stochastic models (Björkwall, Hössjer, Ohlsson & Verrall, 2011; Brillinger, 2012; Zhang, Dukic & Guszcza, 2012).
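The run-off-triangle mechanics behind the chain-ladder method can be sketched as follows. This is a minimal illustration, not an implementation from any of the cited works: the 4x4 cumulative triangle and its figures are invented for the example, and the development factors are the standard volume-weighted column ratios.

```python
import numpy as np

# Hypothetical 4x4 cumulative run-off triangle (rows: accident years,
# columns: development years); NaN marks the unobserved lower-right part.
# All figures are illustrative.
triangle = np.array([
    [100.0, 150.0, 170.0, 180.0],
    [110.0, 168.0, 188.0, np.nan],
    [120.0, 175.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])

def chain_ladder(tri):
    """Complete the lower triangle using volume-weighted development factors."""
    tri = tri.copy()
    n = tri.shape[1]
    for j in range(n - 1):
        # Development factor from column j to j+1, estimated on the rows
        # where both columns are observed.
        obs = ~np.isnan(tri[:, j + 1])
        f = tri[obs, j + 1].sum() / tri[obs, j].sum()
        missing = np.isnan(tri[:, j + 1])
        tri[missing, j + 1] = tri[missing, j] * f
    return tri

completed = chain_ladder(triangle)

# The outstanding reserve is the ultimate claim (last column) minus the
# latest observed diagonal of the original triangle.
latest = np.array([triangle[i, 3 - i] for i in range(4)])
reserve = completed[:, -1].sum() - latest.sum()
```

The prediction is purely mechanical, which is exactly the point made above: nothing in this algorithm quantifies the uncertainty of `reserve`; that requires an underlying stochastic model.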
At the micro level (the level of individual claims), recent studies have found that a mixed discrete-continuous model may be appropriate for estimating claims and risk in insurance data (Christmann, 2004; Heller, Stasinopoulos & Rigby, 2006; Parnitzke, 2008; Bortoluzzo, Claro, Caetano & Artes, 2011; Huo, Wang & Yang, 2013). According to Parnitzke (2008), such a model explicitly specifies a logit-linear model for the occurrence of a claim (i.e., the claim probability) and a linear model for the mean claim size. Generalized linear models and the more flexible Tweedie compound Poisson model are often used to construct insurance tariffs (Smyth & Jorgensen, 2002). However, even these more general models can still struggle to capture the high-dimensional relationships that are common in insurance data sets. In these circumstances, the most effective modeling draws on methods from machine learning and data mining (Christmann, 2004). In recent years, many papers have dealt with the application of data mining methods to loss cost estimation and risk analysis in insurance (Xiahou & Mu, 2010; Guelman, 2012; Thakur & Sing, 2013; Huo, Wang & Yang, 2013).
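The two-part structure described by Parnitzke (2008) can be sketched on simulated data. This is an illustrative sketch only: the data-generating coefficients, the log-linear severity form, and the lognormal back-transform are assumptions made for the example, not choices taken from the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated policy-level data (all figures illustrative): one rating
# factor x, a logit-linear model for claim occurrence, and a log-linear
# model for claim size given a claim.
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])                  # design matrix
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * x)))      # claim probability
claim = rng.random(n) < p_true                        # claim indicator
log_size = 7.0 + 0.5 * x + rng.normal(scale=0.3, size=n)
size = np.where(claim, np.exp(log_size), 0.0)         # observed claim size

def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        beta += np.linalg.solve((X * w[:, None]).T @ X, X.T @ (y - p))
    return beta

# Part 1: logit model for the probability of a claim.
beta_freq = fit_logit(X, claim.astype(float))
# Part 2: linear model for log claim size, fitted on claimants only.
beta_sev, *_ = np.linalg.lstsq(X[claim], np.log(size[claim]), rcond=None)

# Expected pure premium per policy: P(claim) * E[size | claim], with a
# naive lognormal back-transform using the residual variance.
resid = np.log(size[claim]) - X[claim] @ beta_sev
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_freq))
pure_premium = p_hat * np.exp(X @ beta_sev + resid.var() / 2.0)
```

With a single rating factor the two parts are simple regressions; the text's point is that as the feature space grows high-dimensional, such parametric forms become restrictive, which motivates the machine-learning approaches cited above.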