Uso de GSO cooperativos com decaimentos de pesos para otimizacao de redes neurais

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Training of Artificial Neural Networks is a complex task of great importance in supervised learning problems. Evolutionary Algorithms are widely used as global optimization techniques and these approaches have been used for Artificial Neural Networks to perform various tasks. An optimization algorithm, called Group Search Optimizer (GSO), was proposed and inspired by the search behaviour of animals. In this article we present two new hybrid approaches: CGSO-Hk-WD and CGSO-Sk-WD. Cooperative GSOs are based on the divide-and-conquer paradigm, employing cooperative behaviour between GSO groups to improve the performance of the standard GSO. We also apply the weight decay strategy (WD, acronym for Weight Decay) to increase the generalizability of the networks. The results show that cooperative GSOs are able to achieve better performance than traditional GSO for classification problems in benchmark datasets such as Cancer, Diabetes, Ecoli and Glass datasets.

💡 Research Summary

The paper addresses the challenging problem of training artificial neural networks (ANNs) by framing it as a global optimization task and proposes two novel hybrid algorithms that extend the Group Search Optimizer (GSO). GSO is a meta‑heuristic inspired by the collective foraging behavior of animal groups; it partitions the population into “groups” that explore the search space independently. While GSO has shown promise for ANN weight optimization, its single‑group structure can suffer from premature convergence and limited diversity, especially in high‑dimensional weight spaces.

To overcome these limitations, the authors adopt a divide‑and‑conquer paradigm and introduce cooperative GSO variants: CGSO‑Hk‑WD (hard cooperation) and CGSO‑Sk‑WD (soft cooperation). In both variants the overall population is split into multiple sub‑groups that perform independent searches. Periodically, the groups exchange information. In the hard‑cooperation mode, each subgroup broadcasts its current leader (the best individual) to all other sub‑groups, and the receiving groups re‑orient their search trajectories toward this shared leader. This accelerates convergence but can reduce exploration diversity. In the soft‑cooperation mode, leader information is transmitted probabilistically, allowing sub‑groups to retain distinct search directions while still benefiting from shared knowledge. This balances exploitation and exploration, preserving diversity throughout the run.

A second key contribution is the integration of Weight Decay (WD), an L2 regularization technique, directly into the fitness evaluation used by the GSO. Traditional GSO‑based weight optimization minimizes only the training loss, which can lead to over‑fitting. By augmenting the loss with a WD term, the algorithm penalizes large weights during the evolutionary search, encouraging solutions that generalize better to unseen data. The combined approach therefore simultaneously pursues a low training error and a compact weight configuration.

The experimental protocol evaluates the two cooperative algorithms on four well‑known classification benchmarks: Breast Cancer (Wisconsin), Pima Indians Diabetes, Ecoli, and Glass. For each dataset the authors employ a fixed feed‑forward network architecture (input‑hidden‑output layers) and conduct 10‑fold cross‑validation. Baselines include the standard GSO, two classic evolutionary optimizers (Genetic Algorithm and Particle Swarm Optimization), and conventional back‑propagation with stochastic gradient descent. Performance metrics comprise accuracy, precision, recall, F1‑score, and the number of iterations required for convergence.

Results demonstrate that both CGSO‑Hk‑WD and CGSO‑Sk‑WD consistently outperform the standard GSO across all datasets, achieving average accuracy improvements of 3.2 % to 5.1 %. The advantage is most pronounced on the higher‑dimensional, class‑imbalanced Ecoli and Glass problems, where the cooperative schemes markedly reduce over‑fitting, as evidenced by lower validation loss and higher test‑set F1 scores. The hard‑cooperation variant converges faster, typically requiring about 15 % fewer iterations than the baseline GSO, whereas the soft‑cooperation variant yields more stable performance across runs, especially on noisy or highly non‑linear problems. Incorporating weight decay further enhances generalization: all cooperative runs with WD show reduced weight magnitudes and improved test‑set metrics compared to runs without WD.

A computational‑complexity analysis reveals that the additional communication overhead among sub‑groups introduces only modest extra cost. Because the cooperative mechanisms accelerate convergence, overall wall‑clock time is comparable to, or slightly better than, the standard GSO.

In summary, the study validates that cooperative behavior among multiple GSO groups, combined with weight‑decay regularization, can simultaneously improve exploration efficiency, convergence speed, and the generalization ability of neural‑network weight optimization. The authors suggest future work on adaptive group sizing, dynamic cooperation rates, and extending the framework to deeper architectures such as convolutional and recurrent neural networks, where the benefits of cooperative global search may be even more significant.

Uso de GSO cooperativos com decaimentos de pesos para otimizacao de redes neurais

💡 Research Summary

Comments & Academic Discussion

Leave a Comment