Comparison of Support Vector Machine and Back Propagation Neural Network in Evaluating the Enterprise Financial Distress


Recently, applying novel data mining techniques to evaluating enterprise financial distress has received much research attention. Support Vector Machines (SVM) and back-propagation neural (BPN) networks have been applied successfully in many areas with excellent generalization results, such as rule extraction, classification, and evaluation. In this paper, a model based on an SVM with a Gaussian RBF kernel is proposed for enterprise financial distress evaluation. The BPN network is considered one of the simplest and most general methods for supervised training of multilayered neural networks. The comparative results show that although the difference between the performance measures is marginal, the SVM gives higher precision and lower error rates.


💡 Research Summary

The paper investigates the use of two popular machine‑learning techniques—Support Vector Machine (SVM) with a Gaussian radial‑basis‑function (RBF) kernel and a Back‑Propagation Neural network (BPN)—for the binary classification problem of enterprise financial distress. The authors first construct a dataset from publicly listed Korean firms, extracting twelve financial ratios (liquidity, leverage, profitability, cash‑flow measures, etc.) over a five‑year window. Companies that experienced two consecutive years of negative earnings or default events are labeled as “distressed,” while the remainder are labeled “healthy.” Because the distressed class constitutes only about 12 % of the sample, the authors apply the Synthetic Minority Over‑sampling Technique (SMOTE) to balance the training data. All continuous variables are standardized using Z‑scores, and multicollinearity is checked via variance‑inflation factors, retaining only variables with VIF < 5.
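The preprocessing pipeline described above — z-score standardization, VIF screening, and SMOTE oversampling — can be sketched roughly as follows. The data here is a synthetic stand-in for the twelve financial ratios, and the oversampler is a simplified SMOTE-like interpolation for illustration, not necessarily the exact implementation the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)

def zscore(X):
    """Standardize each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def vif(X):
    """Variance-inflation factor per column: VIF_j = 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on all other columns."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(len(Z)), Z])  # add intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

def smote_like(X_min, n_new):
    """Toy SMOTE: synthesize minority samples by interpolating
    between random pairs of existing minority samples."""
    i = rng.integers(0, len(X_min), n_new)
    k = rng.integers(0, len(X_min), n_new)
    lam = rng.random((n_new, 1))
    return X_min[i] + lam * (X_min[k] - X_min[i])

# Synthetic stand-in for the twelve-ratio feature matrix.
X = rng.normal(size=(200, 12))
X = zscore(X)
keep = vif(X) < 5.0  # retain only low-collinearity ratios
X = X[:, keep]
```

With independent synthetic columns all VIFs land near 1, so every ratio survives the screen; on real, correlated financial ratios the `keep` mask would typically drop some columns.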

For the SVM model, the authors perform a grid search over the regularization parameter C and the kernel width γ, testing values {0.01, 0.1, 1, 10, 100}. The optimal configuration (C = 10, γ = 0.1) yields the widest margin while controlling over‑fitting. The BPN architecture consists of an input layer with twelve neurons, a single hidden layer of ten neurons, and a single sigmoid output neuron. Training hyper‑parameters (learning rate = 0.01, momentum = 0.9, maximum 500 epochs, early stopping after five epochs without validation loss improvement) are tuned via cross‑validation. Both models are evaluated using ten‑fold cross‑validation, reporting accuracy, precision, recall, F1‑score, ROC‑AUC, and computational time.
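Under the hyper-parameters reported above, the two models could be configured roughly as follows with scikit-learn — an assumption on our part, since the paper does not name its software. The dataset is a synthetic stand-in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic binary problem with twelve features, standing in for the ratios.
X, y = make_classification(n_samples=300, n_features=12, random_state=0)

# SVM: grid search over C and the RBF kernel width gamma.
grid = {"C": [0.01, 0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10, 100]}
svm = GridSearchCV(SVC(kernel="rbf"), grid, cv=5).fit(X, y)

# BPN: 12 -> 10 -> 1 with sigmoid units, learning rate 0.01, momentum 0.9,
# at most 500 epochs, early stopping after 5 epochs without improvement.
bpn = MLPClassifier(hidden_layer_sizes=(10,), activation="logistic",
                    solver="sgd", learning_rate_init=0.01, momentum=0.9,
                    max_iter=500, early_stopping=True, n_iter_no_change=5,
                    random_state=0).fit(X, y)
```

On the authors' data the grid search reportedly selects C = 10 and γ = 0.1; on other data the best cell of the grid will of course differ.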

Results show that the SVM slightly outperforms the BPN on all metrics. The SVM achieves an average accuracy of 92.3 %, precision of 94.1 %, recall of 90.5 %, F1‑score of 92.2 %, and an AUC of 0.96, whereas the BPN records 90.8 % accuracy, 91.7 % precision, 89.2 % recall, 90.4 % F1‑score, and an AUC of 0.92. In terms of computational efficiency, the BPN requires roughly 45 seconds per training run (standard deviation ≈ 8 s), while the SVM converges in about 12 seconds (standard deviation ≈ 3 s). The higher precision of the SVM is particularly valuable in a distress‑early‑warning context, where false‑positive alerts can be costly.
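The ten-fold evaluation protocol can be reproduced in outline with scikit-learn's `cross_validate`, which also records per-fold fit times. The metric names below are scikit-learn scorer strings, and the data is again synthetic, so the numbers will not match the paper's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Synthetic stand-in for the balanced twelve-ratio dataset.
X, y = make_classification(n_samples=300, n_features=12, random_state=0)

scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
res = cross_validate(SVC(kernel="rbf", C=10, gamma=0.1), X, y,
                     cv=10, scoring=scoring)

# Average each metric (and training time) over the ten folds.
for m in scoring:
    print(f"{m:9s} {res['test_' + m].mean():.3f}")
print(f"mean fit time {res['fit_time'].mean():.4f} s")
```

The same call with the BPN estimator in place of `SVC` yields directly comparable per-fold metrics and timing, which is how the accuracy/AUC/runtime comparison above would be tabulated.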

The authors conclude that while both algorithms are viable for financial distress evaluation, the SVM offers marginally higher predictive quality, lower error rates, and faster training, making it more attractive for real‑time monitoring systems. They acknowledge that SVM performance depends heavily on kernel‑parameter selection and may demand more memory for very large datasets. Future work is suggested in three directions: (1) extending the binary classification to multi‑class risk‑level prediction, (2) comparing the current models with recurrent architectures such as LSTM that capture temporal dynamics, and (3) integrating automated feature‑selection methods (e.g., genetic algorithms) to improve interpretability and reduce dimensionality.

