Currency exchange prediction using machine learning, genetic algorithms and technical analysis


Technical analysis is used to discover investment opportunities. To test this hypothesis, we propose a hybrid system that combines machine learning techniques with genetic algorithms. Technical analysis offers more ways to represent a currency exchange time series than can be tested computationally; because searching the whole input feature space is infeasible, a genetic algorithm is used as an alternative. In this work, an architecture for automatic feature selection is proposed that uses a genetic algorithm to optimize the cross-validated performance estimate of a Naive Bayes model. The proposed architecture improves the return on investment of the unoptimized system from 0.43% to 10.29% on the validation set. The selected features and the model decision boundary are visualized using t-Distributed Stochastic Neighbor Embedding.


💡 Research Summary

The paper investigates whether technical analysis (TA) can provide exploitable signals for the EUR/USD foreign‑exchange market by integrating it with machine‑learning techniques and evolutionary optimization. The authors first compute six widely used TA indicators—Relative Strength Index (RSI), Commodity Channel Index (CCI), Moving Average Convergence Divergence (MACD), Rate of Change (ROC), Stochastic Oscillator, and Average True Range (ATR)—on an hourly EUR/USD price series spanning from January 1 2013 to March 9 2017. Each indicator contains one or more free parameters (e.g., look‑back windows), leading to an astronomically large search space (>10²¹ possible configurations). Because exhaustive search is infeasible, a modified Genetic Algorithm (GA) is employed to select a subset of indicators and simultaneously tune their parameters.
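To make the notion of a free indicator parameter concrete, here is a minimal sketch of one of the six indicators, Rate of Change, with its look-back window exposed as an argument. The function name `rate_of_change`, the percentage formulation, and the price values are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional


def rate_of_change(closes: List[float], window: int) -> List[Optional[float]]:
    """Percentage change of each close relative to the close `window` bars earlier.

    `window` is the free look-back parameter; the first `window` entries
    have no history to compare against and are returned as None.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    out: List[Optional[float]] = []
    for t, price in enumerate(closes):
        if t < window:
            out.append(None)  # not enough history yet
        else:
            past = closes[t - window]
            out.append(100.0 * (price - past) / past)
    return out


# Made-up hourly closes, for illustration only.
closes = [1.10, 1.11, 1.12, 1.10, 1.13]
roc_1 = rate_of_change(closes, window=1)
roc_3 = rate_of_change(closes, window=3)
```

The same price series yields a different feature for every choice of `window`, which is why the number of candidate configurations grows so quickly once several indicators are combined.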

The GA individuals encode both binary inclusion flags for each indicator and real‑valued parameter settings. The fitness function combines 5‑fold cross‑validation accuracy of a Naïve Bayes (NB) classifier with the estimated return on investment (ROI) on the training data. To improve exploration, the authors augment the classic GA with two mechanisms: (1) Random Immigrants, which replace a configurable fraction of the worst individuals each generation with newly generated random chromosomes, and (2) Hyper‑Mutation, which temporarily raises the mutation probability when fitness stagnates over several generations. These extensions aim to avoid premature convergence and to explore distant regions of the feature space.
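The two extensions can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's algorithm: the population size, rates, `patience`, parameter bounds, tournament selection, one-point crossover, elitism, and the `toy_fitness` function are all hypothetical (the summary notes that the paper gives few GA hyper-parameters), and in the real system the fitness would come from cross-validating the Naïve Bayes model.

```python
import random
from typing import Callable, List, Tuple

# A chromosome pairs binary inclusion flags with one look-back value per indicator.
Chromosome = Tuple[List[int], List[float]]

N_INDICATORS = 6
PARAM_RANGE = (2.0, 50.0)  # hypothetical look-back bounds


def random_chromosome(rng: random.Random) -> Chromosome:
    flags = [rng.randint(0, 1) for _ in range(N_INDICATORS)]
    params = [rng.uniform(*PARAM_RANGE) for _ in range(N_INDICATORS)]
    return flags, params


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    cut = rng.randint(1, N_INDICATORS - 1)  # one-point crossover
    return a[0][:cut] + b[0][cut:], a[1][:cut] + b[1][cut:]


def mutate(ch: Chromosome, rate: float, rng: random.Random) -> Chromosome:
    flags = [1 - f if rng.random() < rate else f for f in ch[0]]
    params = [rng.uniform(*PARAM_RANGE) if rng.random() < rate else p for p in ch[1]]
    return flags, params


def evolve(fitness: Callable[[Chromosome], float], generations: int = 40,
           pop_size: int = 30, base_rate: float = 0.05, hyper_rate: float = 0.5,
           immigrant_frac: float = 0.2, patience: int = 5,
           seed: int = 0) -> Tuple[Chromosome, float]:
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    n_immigrants = int(immigrant_frac * pop_size)
    best_fit, stalled = float("-inf"), 0
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        top_fit = fitness(pop[0])
        if top_fit > best_fit:
            best_fit, stalled = top_fit, 0
        else:
            stalled += 1
        # Hyper-mutation: use the raised rate while the best fitness has stalled.
        rate = hyper_rate if stalled >= patience else base_rate
        # Random immigrants: overwrite the worst individuals with fresh random ones.
        pop[pop_size - n_immigrants:] = [random_chromosome(rng) for _ in range(n_immigrants)]
        children = [pop[0]]  # elitism keeps the current best unchanged
        while len(children) < pop_size:
            a = max(rng.sample(pop, 2), key=fitness)  # binary tournament
            b = max(rng.sample(pop, 2), key=fitness)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)


def toy_fitness(ch: Chromosome) -> float:
    """Stand-in objective: prefer indicators 0 and 3 with look-backs near 14."""
    flags, params = ch
    target = [1, 0, 0, 1, 0, 0]
    flag_score = sum(f == t for f, t in zip(flags, target))
    penalty = sum(abs(p - 14.0) for f, p in zip(flags, params) if f) / 100.0
    return flag_score - penalty


best, best_fit = evolve(toy_fitness, seed=1)
```

Because the elite individual is carried over unchanged, the best fitness never decreases between generations, so the stall counter only resets on a genuine improvement.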

The binary target variable is defined as the sign of the one‑hour price change: y = 1 if the closing price at time t is greater than or equal to the closing price at t‑1, otherwise y = 0. The NB classifier assumes conditional independence among the selected TA features and models each feature with a Gaussian distribution. A rejection threshold (P_rejection) is introduced: predictions with posterior probability below this threshold are discarded, mimicking a “no‑trade” decision.
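The labelling rule and the rejection step can be illustrated with a small, dependency-free Gaussian Naïve Bayes. This is a sketch under assumptions: the class name `GaussianNB`, the helper names, the variance floor of `1e-9`, and the feature values are invented for illustration, and the paper's actual implementation may differ.

```python
import math
from typing import Dict, List, Optional, Sequence


def direction_labels(closes: Sequence[float]) -> List[int]:
    """y = 1 if close[t] >= close[t-1], else 0, for t = 1 .. len(closes) - 1."""
    return [1 if closes[t] >= closes[t - 1] else 0 for t in range(1, len(closes))]


class GaussianNB:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "GaussianNB":
        self.stats = {}
        for c in sorted(set(y)):
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            # Small floor (an assumption) avoids division by zero for constant features.
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                         for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def predict_proba(self, x: Sequence[float]) -> Dict[int, float]:
        log_post = {}
        for c, (log_prior, means, variances) in self.stats.items():
            log_lik = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                          for xi, m, v in zip(x, means, variances))
            log_post[c] = log_prior + log_lik
        top = max(log_post.values())  # subtract the max for numerical stability
        unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
        total = sum(unnorm.values())
        return {c: u / total for c, u in unnorm.items()}


def predict_with_rejection(model: GaussianNB, x: Sequence[float],
                           p_rejection: float) -> Optional[int]:
    """Return the most probable class, or None (no trade) below the threshold."""
    proba = model.predict_proba(x)
    label, p = max(proba.items(), key=lambda kv: kv[1])
    return label if p >= p_rejection else None


labels = direction_labels([1.10, 1.11, 1.11, 1.09])

# One made-up oscillator-style feature per sample.
X = [[30.0], [35.0], [28.0], [70.0], [65.0], [72.0]]
y = [0, 0, 0, 1, 1, 1]
model = GaussianNB().fit(X, y)
confident = predict_with_rejection(model, [71.0], p_rejection=0.6)
ambiguous = predict_with_rejection(model, [50.0], p_rejection=0.6)
```

In this toy data the point at 50.0 lies midway between the two class means, so neither posterior reaches the threshold and the prediction is withheld, which is the "no-trade" behaviour described above.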

Data are split chronologically: 70 % for training (including GA‑driven feature selection) and 30 % for a held‑out validation set. Within the training phase, the GA iteratively searches for the best feature‑parameter combination, and the final NB model is retrained on the entire training set using the selected configuration. Performance is then assessed on the untouched validation set.
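A chronological split differs from the usual shuffled split in that order is preserved, so every validation sample is later in time than every training sample. A minimal sketch, where the function name `chronological_split` and the integer `train_pct` parameter are illustrative choices:

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(rows: Sequence[T], train_pct: int = 70) -> Tuple[List[T], List[T]]:
    """Split time-ordered rows into (train, validation) without shuffling.

    `rows` must already be sorted from oldest to newest. Integer percentages
    are used so the cut index does not depend on floating-point rounding.
    """
    if not 0 < train_pct < 100:
        raise ValueError("train_pct must be between 1 and 99")
    cut = len(rows) * train_pct // 100
    return list(rows[:cut]), list(rows[cut:])


hours = list(range(10))  # stand-in for ten time-ordered hourly bars
train, validation = chronological_split(hours)
```

Feature selection and cross-validation would then see only `train`, leaving `validation` untouched until the final assessment.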

Results show a dramatic improvement after GA optimization. The baseline NB model (using all six indicators with default parameters) yields an ROI of only 0.43 % on the validation set. After GA‑driven selection and tuning, ROI rises to 10.29 %, an increase of roughly 24‑fold. Classification accuracy also improves modestly (by about 5–7 %). The GA typically discards redundant indicators (e.g., overlapping moving‑average windows) and settles on a compact subset with tuned look‑back periods, thereby better satisfying the NB independence assumption. To provide interpretability, the authors apply t‑Distributed Stochastic Neighbor Embedding (t‑SNE) to the high‑dimensional feature space. The resulting 2‑D plots reveal clearer separation between the two classes after feature selection, suggesting that the model captures meaningful market structure rather than random noise.

Despite these promising findings, several limitations are acknowledged. First, ROI is computed without accounting for transaction costs, slippage, or leverage, which could substantially erode the reported gains in a real trading environment. Second, the Naïve Bayes independence assumption is unlikely to hold perfectly for TA indicators, potentially biasing probability estimates. Third, the paper provides limited details on GA hyper‑parameters (population size, number of generations, mutation/crossover rates), hindering reproducibility. Fourth, the validation period ends in early 2017, so the model’s robustness to more recent market regimes, high‑frequency trading effects, or macro‑economic shocks remains untested. Finally, the binary target based solely on price direction ignores risk‑adjusted considerations such as drawdown, position sizing, or stop‑loss rules.

In conclusion, the study contributes a coherent pipeline—technical‑indicator generation → evolutionary feature selection → lightweight Naïve Bayes classification—augmented with visual analytics via t‑SNE. It demonstrates that carefully selected TA features can substantially improve a simple probabilistic classifier’s profitability on EUR/USD hourly data. Future work could extend the approach to multiple currency pairs and timeframes, incorporate realistic trading cost models, compare against more sophisticated classifiers (e.g., deep neural networks, gradient boosting), and explore meta‑optimization of GA parameters to further enhance robustness and practical applicability.

