Introduction
Credit risk management is central to finance. A sound credit system plays an important role in the rapid development of the financial industry, especially the modern financial credit industry, and measuring risk in that industry has long been a key research issue in financial risk assessment. Financial credit risk forecasting supports a data-centric early warning system for financial risk and thereby improves the accuracy of credit risk prediction. It also identifies the key factors and paths that affect credit risk, helping investors reduce their losses. In turn, it helps the financial and credit industry improve its business risk management so that related risks can be better regulated and prevented (Wickens, 2017).
The financial and credit industry has typical long-tail characteristics. Traditional financial institutions focus mainly on meeting the financial needs of the 20% of customers with high net worth. The other 80% of demand, made up of small, scattered, personalized financing needs, sits in the financial long tail. These needs long went unmet because serving them costs traditional financial institutions too much money and effort. The modern financial and credit industry makes up for this deficiency by specializing in credit financing services for the small, scattered, and personalized customers in the long tail. While it meets the financing needs of many long-tail customers, it also steadily widens the industry's exposure to risk. A financial risk early warning system for the financial credit industry is therefore necessary.
There are two main approaches to measuring and studying risk in the financial and credit industry: traditional statistical and econometric methods and modern artificial intelligence methods (Zhao & Jin, 2018). The statistical and econometric methods used in traditional financial credit risk prediction include discriminant analysis, logistic regression analysis, multivariate statistical regression, and probit models. Comparing multiple models, West (2000) concluded that the multiexpert model and the radial basis neural network are more effective and more accurate in credit risk prediction, and that logistic regression is the most accurate of the traditional methods. In recent years, however, the data have changed: sample sizes have grown, the number of characteristic indicators has increased, and data structures now come from multiple sources. Problems such as correlation among characteristic variables, multicollinearity, sample imbalance, and heterogeneity have also become prominent. As a result, traditional econometric methods struggle to model such data, and prediction accuracy falls (Efron et al., 2004).
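As a brief reminder of two of the traditional models named above, logistic regression and the probit model both map a linear score to a default probability. The forms below are the standard textbook ones; the symbols $x_i$ (borrower features), $\beta$ (coefficients), and $y_i$ (default indicator) are notation assumed here for illustration and are not taken from the cited studies.

```latex
\begin{align}
\text{logit:}  \quad & P(y_i = 1 \mid x_i) = \frac{1}{1 + e^{-x_i^\top \beta}} \\
\text{probit:} \quad & P(y_i = 1 \mid x_i) = \Phi\!\left(x_i^\top \beta\right)
\end{align}
```

Here $\Phi$ denotes the standard normal cumulative distribution function. Both models are linear in $x_i$ inside the link, which is one reason strongly correlated or very numerous indicators make the estimates unstable.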
Scholars therefore began to improve the algorithms, building on traditional methods with model designs that address dimension reduction and sample imbalance (Ozturk et al., 2016). For example, Hwang et al. (2010) replaced the linear regression function with an ordered semiparametric function, establishing an ordered semiparametric probit credit scoring model that achieved good results. Aburrous et al. (2010) used fuzzy mathematics methods to identify the electronic payment security problems faced by traditional online banks, noting that virus attacks posed significant security risks to payment pages.
At present, improved logistic methods are widely used. Examples include multiple logit market share models for studying peer-to-peer (P2P) loan evaluations (Lee & Lee, 2012); principal component analysis (PCA) and logistic regression models with bilevel selection within and between groups for grouped structural variables (Rimiru et al., 2017); credit models based on logistic regression and machine learning techniques (Paula et al., 2019); and lasso-logistic regression models with dimension reduction (Chen & Xiang, 2017). These models handle problematic data that have multiple sources, complex structures, missing labels, high dimensions, and imbalanced classes. They also perform better in credit risk measurement.
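To make the idea behind the lasso-logistic models mentioned above concrete, the following is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the method of any cited study. The synthetic data, the function names (`fit_lasso_logistic`, `make_synthetic_credit_data`), and the settings `lam`, `step`, and `n_iter` are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=500):
    """Minimize mean log-loss + lam * sum(|w_j|) by proximal gradient descent.

    The intercept b is left unpenalized. All defaults are illustrative.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step followed by soft-thresholding (the lasso part).
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_synthetic_credit_data(n=300, seed=0):
    """Hypothetical data: features 0 and 1 drive default, features 2-5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(6)]
        # Negative offset makes defaults (y = 1) the less frequent class.
        score = 2.0 * x[0] - 1.5 * x[1] - 1.0
        y.append(1 if rng.random() < sigmoid(score) else 0)
        X.append(x)
    return X, y


X, y = make_synthetic_credit_data()
weights, intercept = fit_lasso_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", [round(wj, 3) for wj in weights])
print("selected feature indices:", selected)
```

The soft-thresholding step is what lets the penalty set weak coefficients to exactly zero, so the fit doubles as a form of variable selection. With a larger `lam`, more of the noise features would be expected to drop out, at the cost of shrinking the informative coefficients further.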