A two-stage architecture for stock price forecasting by combining SOM and fuzzy-SVM

This paper proposes a model for predicting stock prices that combines a Self-Organizing Map (SOM) with fuzzy Support Vector Machines (f-SVM). The approach rests on extracting fuzzy rules from raw data by combining statistical machine learning models. In the proposed model, SOM serves as a clustering algorithm that partitions the whole input space into several disjoint regions. For each partition, a set of fuzzy rules is extracted with an f-SVM combining model. The resulting fuzzy rule sets are then used to predict the test data with fuzzy inference algorithms. The performance of the proposed approach is compared with that of other models on four data sets.

💡 Research Summary

The paper introduces a two-stage hybrid architecture that integrates Self-Organizing Maps (SOM) with fuzzy Support Vector Machines (f-SVM) to forecast stock prices. In the first stage, SOM, a competitive, unsupervised neural network, clusters the high-dimensional input space into a set of disjoint, topologically coherent regions. Because it preserves the intrinsic data topology, SOM isolates sub-spaces with relatively homogeneous statistical characteristics, which reduces the non-linearity and noise that typically plague financial time series.
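
The first-stage partitioning can be illustrated with a minimal sketch, assuming a one-dimensional SOM grid, a linearly decaying learning rate, and a Gaussian neighbourhood. None of these settings, nor the names `train_som`, `best_matching_unit`, and `partition`, come from the paper; they are hypothetical choices made only to show how a trained map splits the samples into disjoint groups.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)),
    )

def train_som(data, n_units=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM (illustrative settings); returns one weight vector per unit."""
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                    # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 1e-3)   # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured on the 1-D grid of units.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights

def partition(data, weights):
    """Assign every sample to exactly one unit, giving disjoint regions."""
    clusters = {}
    for x in data:
        clusters.setdefault(best_matching_unit(weights, x), []).append(x)
    return clusters
```

Each sample lands in exactly one cluster, so the groups returned by `partition` are disjoint by construction. In the architecture described here, each group would then be passed to its own second-stage model.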

In the second stage, each SOM‑derived cluster is processed independently by an f‑SVM. The f‑SVM extends the classic SVM formulation by embedding fuzzy membership functions into the Lagrange multipliers, which enables the extraction of interpretable fuzzy rules from the support vectors. Concretely, a rule takes the form “If feature A is high and feature B is low then the output is upward,” where the antecedent conditions are defined by triangular (or Gaussian) membership functions and the consequent is a crisp prediction obtained through fuzzy inference. The authors employ a Gaussian kernel for the underlying SVM and adopt a triangular membership shape for computational simplicity.
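
To make the rule form concrete, the following is a minimal sketch of fuzzy inference over triangular membership functions. It assumes the rules are already given, uses `min` as the AND operator, and combines rule outputs by a firing-strength-weighted average. These choices, the rule values in the usage note, and the names `triangular` and `infer` are illustrative assumptions, not the paper's extraction procedure or its exact inference scheme.

```python
def triangular(x, left, centre, right):
    """Triangular membership: 0 outside (left, right), 1 at centre."""
    if x <= left or x >= right:
        return 0.0
    if x <= centre:
        return (x - left) / (centre - left)
    return (right - x) / (right - centre)

def infer(rules, sample):
    """Weighted-average inference (an assumed scheme).

    rules: list of (antecedents, consequent), where antecedents maps a
           feature index to a (left, centre, right) triangle and
           consequent is a crisp output value.
    Returns None when no rule fires for the sample.
    """
    num = 0.0
    den = 0.0
    for antecedents, consequent in rules:
        # AND over the antecedent conditions, using min as the t-norm.
        strength = min(
            triangular(sample[i], *mf) for i, mf in antecedents.items()
        )
        num += strength * consequent
        den += strength
    return num / den if den > 0 else None
```

For example, with the hypothetical rules `[({0: (0, 1, 2)}, 10.0), ({0: (1, 2, 3)}, 20.0)]`, a sample whose first feature is 1.5 fires both rules with strength 0.5, so the crisp output is the midpoint of the two consequents.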

To evaluate the proposed framework, four real‑world datasets were assembled: the Korean KOSPI index (200 daily observations), the US NASDAQ index (150 days), the Dow Jones Industrial Average (180 days), and a single‑stock series (250 days). Each dataset includes eight engineered features—closing price, volume, several moving averages, volatility measures, etc.—which are linearly interpolated to fill missing values and normalized to the