The Cellular Simultaneous Recurrent Neural Network (SRN) has been shown to be a more powerful function approximator than the MLP. This means that for some problems the required complexity of an MLP would be prohibitively large, while an SRN can realize the desired mapping within acceptable computational constraints. The speed of training of complex recurrent networks is crucial to their successful application. The present work improves on previous results by training the network with the extended Kalman filter (EKF). We implemented a generic Cellular SRN and applied it to two challenging problems: 2D maze navigation and a subset of the connectedness problem. In the case of maze navigation, the speed of convergence improved by several orders of magnitude compared with earlier results; in the case of connectedness, superior generalization was demonstrated. The implications of these improvements are discussed.
Artificial neural networks, inspired by the enormous capabilities of living brains, are one of the cornerstones of today's field of artificial intelligence. Their applicability to real-world engineering problems has become unquestionable in recent decades; see for example [1]. Yet most networks used in real-world applications employ the feedforward architecture, which is a far cry from the massively recurrent architecture of biological brains. The widespread use of the feedforward architecture is facilitated by the availability of numerous efficient training methods. However, the introduction of recurrent elements makes training more difficult, and even impractical for most non-trivial cases.
SRNs have been shown to be more powerful function approximators by several researchers [2], [3]. It has been shown experimentally that an arbitrary function generated by an MLP can always be learned by an SRN. The opposite, however, is not true: not all functions given by an SRN can be learned by an MLP. These results support the idea that recurrent networks are essential in harnessing the power of brain-like computing.
It is well known that MLPs and a variety of kernel-based networks (such as RBF networks) are universal function approximators, in some sense. Andrew Barron [4] proved that MLPs are better than linear basis function systems, such as Taylor series, at approximating smooth functions; more precisely, as the number of inputs N to a learning system grows, the required complexity of an MLP grows only as O(N), while the complexity of a linear basis function approximator grows exponentially, for a given degree of approximation accuracy.
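Stated informally (our paraphrase; see [4] for the precise conditions), for a target function f whose Fourier transform has a finite first moment C_f, an MLP f_n with n sigmoidal hidden units can achieve

\[
\| f - f_n \|_{L^2}^2 \;\le\; \frac{(2 C_f)^2}{n},
\]

a rate that does not depend on the input dimension N, whereas approximation from any fixed linear basis cannot, in the worst case, avoid a number of terms that grows exponentially with N.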
However, when the function to be approximated does not live up to the usual concept of smoothness, or when the number of inputs becomes even larger than what an MLP can readily handle, it becomes ever more important to use a more general class of neural network.
The area of intelligent control provides examples of functions that are very difficult for ANNs to tackle. Such functions arise as solutions to multistage optimization problems given by the Bellman equation. The design of non-linear control systems known as "Adaptive Critics" presupposes the ability of the so-called "Critic" network to approximate the solution of the Bellman equation; see [5] for an overview of adaptive critic designs. Such problems are also classified as Approximate Dynamic Programming (ADP). A simple example of such a function is the 2D maze navigation problem considered in this contribution; see [6] for an in-depth overview of ADP and the maze navigation problem. Applications of the EKF to NN training have been developed by researchers in the field of control [7], [8], [5].
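For concreteness, a standard discrete-time form of the Bellman equation used in the ADP literature (our notation, following e.g. [5], [6]) is

\[
J(\mathbf{x}(t)) \;=\; \max_{\mathbf{u}(t)} \Big[\, U(\mathbf{x}(t), \mathbf{u}(t)) \;+\; \tfrac{1}{1+r}\, \big\langle J(\mathbf{x}(t+1)) \big\rangle \,\Big],
\]

where J is the cost-to-go (value) function that the Critic network must approximate, U is the immediate utility, u(t) is the control action, r is a discount rate, and the angle brackets denote expectation over the next state x(t+1). In the 2D maze problem, J assigns to each cell the optimal remaining cost to the goal.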
The classic challenge posed by Rosenblatt to perceptron theory is the recognition of topological relations [9]. Minsky and Papert [10] showed that such problems fundamentally cannot be solved by perceptrons because of their exponential complexity. Multi-layer perceptrons are more powerful than Rosenblatt's perceptron, but they too are claimed to be fundamentally limited in their ability to solve topological relation problems [11]. An example of such a problem is the connectedness predicate: the task is to determine whether the input pattern is connected, regardless of its shape and size.
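The predicate itself is easy to state algorithmically; the difficulty lies in realizing it with a network of fixed, bounded order. A minimal Python sketch of the predicate (our illustration, independent of any network architecture) is:

```python
from collections import deque

def is_connected(grid):
    """Return True if all foreground (1) cells of a binary 2D grid
    form a single 4-connected component."""
    rows, cols = len(grid), len(grid[0])
    fg = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c] == 1]
    if not fg:
        return True  # empty pattern: vacuously connected
    # Breadth-first flood fill from an arbitrary foreground cell.
    seen = {fg[0]}
    queue = deque([fg[0]])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 1 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    # Connected iff the fill reached every foreground cell.
    return len(seen) == len(fg)
```

The sequential propagation in flood fill is precisely the kind of iterative computation that a recurrent network can emulate across its internal iterations, while a perceptron of bounded order cannot.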
The two problems described above pose fundamental challenges to new types of neural networks, just as the XOR problem posed a fundamental challenge to perceptrons, one that could be overcome only by the introduction of a hidden layer, and thus effectively by moving to a new type of ANN.
In this contribution, we present the Cellular Simultaneous Recurrent Network (CSRN) architecture. It is a special case of a more generic architecture called ObjectNet; see [12], chapter 6, page 120. We use the Extended Kalman Filter (EKF) methodology to train our networks and obtain very encouraging results. For the first time, an efficient training methodology is applied to this complex recurrent network architecture. Extending the preliminary results introduced in [13], the present study addresses not only learning but also generalization of the network on two problems: maze navigation and connectedness. An improvement in learning speed by several orders of magnitude as a result of using the EKF is also demonstrated. We consider the results introduced in this work an initial demonstration of the proposed learning principle, which should be thoroughly studied and implemented in various domains.
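The exact formulation used here is given in Section 4. As a rough orientation, a single EKF step that treats the weight vector as the state to be estimated has the familiar form sketched below (a minimal NumPy illustration under generic assumptions about the noise covariances R and Q, not the implementation of this paper):

```python
import numpy as np

def ekf_weight_update(w, P, H, y, y_hat, R, Q):
    """One EKF step treating the network weights as the state.
    w: (n,) weight vector;        P: (n, n) weight covariance;
    H: (m, n) Jacobian d y_hat/dw; y, y_hat: (m,) target and network output;
    R: (m, m) measurement noise;  Q: (n, n) process noise."""
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    w_new = w + K @ (y - y_hat)      # correct weights toward the target
    P_new = P - K @ H @ P + Q        # update weight covariance
    return w_new, P_new
```

The Jacobian H plays the role that the gradient plays in backpropagation; computing it for a recurrent network is the subject of Section 2.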
The rest of this paper is organized as follows. Section 2 describes the calculation of derivatives in the recurrent network. Section 3 describes the CSRN architecture. Section 4 gives the EKF formulas. Section 5 describes the operation of a generic CSRN application. Sections 6 and 7 describe the two problems addressed by this contribution and give the simulation results. Section 8 presents the discussion and conclusions.
The backpropagation algorithm is the foundation of derivative calculation in neural network training.