Quaternion-Valued Recurrent Projection Neural Networks on Unit Quaternions
Authors: Marcos Eduardo Valle, Rodolfo Anibal Lobo
Institute of Mathematics, Statistics, and Scientific Computing, University of Campinas, Campinas, Brazil
valle@ime.unicamp.br, rodolfolobo@ug.uchile.cl

Abstract

Hypercomplex-valued neural networks, including quaternion-valued neural networks, can treat multi-dimensional data as a single entity. In this paper, we present the quaternion-valued recurrent projection neural networks (QRPNNs). Briefly, QRPNNs are obtained by combining the non-local projection learning with the quaternion-valued recurrent correlation neural networks (QRCNNs). We show that QRPNNs overcome the cross-talk problem of QRCNNs. Thus, they are appropriate to implement associative memories. Furthermore, computational experiments reveal that QRPNNs exhibit greater storage capacity and noise tolerance than their corresponding QRCNNs.

Keywords: Recurrent neural network, Hopfield network, associative memory, quaternion-valued neural network.

1. Introduction

The Hopfield neural network, developed in the early 1980s, is an important and widely-known recurrent neural network which can be used to implement associative memories [1, 2]. Successful applications of the Hopfield network include control [3, 4], computer vision and image processing [5, 6], classification [7, 8], and optimization [2, 9, 10].

Despite its many successful applications, the Hopfield network may suffer from a very low storage capacity when used to implement associative memories. Precisely, due to cross-talk between the stored items, the Hebbian learning adopted by Hopfield in his original work allows for the storage of approximately n/(2 ln n) items, where n denotes the length of the stored vectors [11].

Several neural networks and learning rules have been proposed in the literature to increase the storage capacity of the original bipolar Hopfield network. For example, Personnaz et al. [12] as well as Kanter and Sompolinsky [13] proposed the projection rule to determine the synaptic weights of the Hopfield networks. The projection rule increases the storage capacity of the Hopfield network to n − 1 items. Another simple but effective improvement on the storage capacity of the original Hopfield networks was achieved by Chiueh and Goodman's recurrent correlation neural networks (RCNNs) [14, 15]. Briefly, an RCNN is obtained by decomposing the Hopfield network with Hebbian learning into a two-layer recurrent neural network. The first layer computes the inner product (correlation) between the input and the memorized items, followed by the evaluation of a non-decreasing continuous excitation function. The subsequent layer yields a weighted average of the stored items. Alternatively, certain RCNNs can be viewed as kernelized versions of the Hopfield network with Hebbian learning [16, 17, 18].

It turns out that the associative memory models described in the previous paragraphs are all designed for the storage and recall of bipolar real-valued vectors. In many applications, however, we have to process multivalued or multidimensional data [19]. In view of this remark, the Hopfield neural network as well as the RCNNs have been extended to hypercomplex systems such as complex numbers and quaternions.

Research on complex-valued Hopfield neural networks dates to the late 1980s [20, 21, 22]. In 1996, Jankowski et al.
[23] proposed a multistate complex-valued Hopfield network with Hebbian learning that contributed to the development of many other hypercomplex-valued networks. Among the many papers on hypercomplex-valued versions of the Hopfield network, we would like to mention the following works, which are strongly related to the models addressed in this paper. First, Lee developed the projection rule for (multistate) complex-valued Hopfield networks [24]. Based on the works of Jankowski et al. and Lee, Isokawa et al. proposed a multistate quaternion-valued Hopfield neural network using either Hebbian learning or the projection rule [25, 26]. Unfortunately, the multistate quaternion-valued Hopfield neural network may fail to settle at an equilibrium state [27, 28]. In contrast, the continuous-valued quaternionic model proposed independently by Valle and Kobayashi always comes to rest at a stable equilibrium point under the usual conditions on the synaptic weights [29, 30]. Apart from hypercomplex-valued Hopfield networks, Valle proposed a complex-valued version of the RCNNs [31]. Recently, the RCNNs have been further extended to quaternions [32].

In this paper we propose an improved version of the quaternion-valued RCNNs [32]. Although real, complex, and quaternion-valued RCNNs can be used to implement high-capacity associative memories, they require a sufficiently large parameter, which can be impossible in practical implementations [15, 32]. In order to circumvent this problem, we combine the projection rule and the QRCNNs to obtain the new quaternion-valued recurrent projection neural networks (QRPNNs). As we will show, QRPNNs always have optimal absolute storage capacity. In other words, the fundamental memories are all stationary states (fixed points) of a QRPNN under mild conditions on the stored items and the excitation function. Furthermore, the noise tolerance of QRPNNs is usually higher than that of their corresponding QRCNNs. Also, bipolar RPNNs are strongly related to the kernel associative memories proposed by García and Moreno and further investigated by Perfetti and Ricci [16, 17, 18].

We would like to point out that this paper corresponds to an extended version of the conference paper [33]. The most significant differences in this paper include:

• A matrix-based formulation of QRCNNs and QRPNNs.
• A detailed algorithm describing the new QRPNN, which can also be used to implement QRCNNs.
• Formalized results concerning the storage capacity of QRPNNs (Theorem 1) and their relationship with QRCNNs and RKAMs in the bipolar case (Theorems 2 and 3).
• Additional computational experiments, including experiments concerning the storage and recall of color images from the CIFAR dataset [34].

The paper is organized as follows: The next section presents some basic concepts on quaternions. Brief reviews of the quaternion-valued Hopfield neural network (QHNN) and of the quaternion-valued recurrent correlation neural networks (QRCNNs) are given in Sections 3 and 4, respectively. Quaternion-valued recurrent projection neural networks (QRPNNs) are introduced in Section 5. Computational experiments are presented in Section 6. The paper finishes with the concluding remarks in Section 7.

2. Some Basic Concepts on Quaternions

Quaternions are hypercomplex numbers that extend the real and complex number systems. A quaternion may be regarded as a 4-tuple of real numbers, i.e., q = (q_0, q_1, q_2, q_3).
Alternatively, a quaternion q can be written as follows:

    q = q_0 + q_1 i + q_2 j + q_3 k,   (1)

where i, j, and k are imaginary units that satisfy the identities

    i^2 = j^2 = k^2 = ijk = −1.   (2)

Note that 1, i, j, and k form a basis for the set of all quaternions, denoted by H. A quaternion q = q_0 + q_1 i + q_2 j + q_3 k can also be written as q = q_0 + ~q, where q_0 and ~q = q_1 i + q_2 j + q_3 k are called respectively the real part and the vector part of q. The real and the vector part of a quaternion q are also denoted by Re{q} := q_0 and Ve{q} := ~q.

The sum p + q of two quaternions p = p_0 + p_1 i + p_2 j + p_3 k and q = q_0 + q_1 i + q_2 j + q_3 k is the quaternion obtained by adding their components, that is,

    p + q = (p_0 + q_0) + (p_1 + q_1) i + (p_2 + q_2) j + (p_3 + q_3) k.   (3)

Furthermore, the product pq of two quaternions p = p_0 + ~p and q = q_0 + ~q is the quaternion given by

    pq = p_0 q_0 − ~p · ~q + p_0 ~q + q_0 ~p + ~p × ~q,   (4)

where ~p · ~q and ~p × ~q denote respectively the scalar and cross products commonly defined in vector algebra. Quaternion algebra is implemented in many programming languages, including MATLAB, GNU Octave, Julia, and Python. We would like to recall that the product of quaternions is not commutative. Thus, special attention should be given to the order of the terms in a quaternion product.

The conjugate and the norm of a quaternion q = q_0 + ~q, denoted respectively by q̄ and |q|, are defined by

    q̄ = q_0 − ~q   and   |q| = √(q̄ q) = √(q_0^2 + q_1^2 + q_2^2 + q_3^2).   (5)

We say that q is a unit quaternion if |q| = 1. We denote by S the set of all unit quaternions, i.e., S = {q ∈ H : |q| = 1}. Note that S can be regarded as a hypersphere in R^4. The quaternion-valued function σ : H* → S given by

    σ(q) = q / |q|,   (6)

maps the set of non-zero quaternions H* = H \ {0} to the set of all unit quaternions. The function σ can be interpreted as a generalization of the signal function to unit quaternions. Furthermore, σ generalizes to the quaternion domain the complex-valued activation function proposed by Aizenberg and Moraga [35].

Finally, the inner product of two quaternion-valued column vectors x = [x_1, ..., x_n]^T ∈ H^n and y = [y_1, ..., y_n]^T ∈ H^n is given by

    ⟨x, y⟩ = Σ_{i=1}^{n} ȳ_i x_i.   (7)

Note that ⟨x, x⟩ = n for all unit quaternion-valued vectors x ∈ S^n. The Euclidean norm of a quaternion-valued vector x ∈ H^n is defined by ‖x‖_2 = √⟨x, x⟩.
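For concreteness, the following is a minimal Julia sketch of the quaternion operations of Eqs. (1)-(7). It is only an illustration, not the authors' released implementation; the struct and the helper names (Quat, sigma, inner, realpart) are ours.

    # q = a + b*i + c*j + d*k represented by four real fields.
    struct Quat
        a::Float64; b::Float64; c::Float64; d::Float64
    end

    # Hamilton product, Eq. (4); note that it is NOT commutative.
    function Base.:*(p::Quat, q::Quat)
        Quat(p.a*q.a - p.b*q.b - p.c*q.c - p.d*q.d,
             p.a*q.b + p.b*q.a + p.c*q.d - p.d*q.c,
             p.a*q.c - p.b*q.d + p.c*q.a + p.d*q.b,
             p.a*q.d + p.b*q.c - p.c*q.b + p.d*q.a)
    end

    Base.:+(p::Quat, q::Quat) = Quat(p.a+q.a, p.b+q.b, p.c+q.c, p.d+q.d)   # Eq. (3)
    Base.conj(q::Quat) = Quat(q.a, -q.b, -q.c, -q.d)                       # conjugate, Eq. (5)
    Base.abs(q::Quat)  = sqrt(q.a^2 + q.b^2 + q.c^2 + q.d^2)               # |q|, Eq. (5)
    realpart(q::Quat)  = q.a                                               # Re{q}

    # Normalization to the unit hypersphere S, Eq. (6); defined for q != 0.
    function sigma(q::Quat)
        m = abs(q)
        Quat(q.a/m, q.b/m, q.c/m, q.d/m)
    end

    # Inner product of quaternion-valued vectors, Eq. (7): <x,y> = sum_i conj(y_i) * x_i.
    inner(x::Vector{Quat}, y::Vector{Quat}) = reduce(+, conj(y[i]) * x[i] for i in eachindex(x))

These helpers are reused in the sketches of the following sections.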
3. Quaternion-Valued Hopfield Neural Networks

The famous Hopfield neural network (HNN) is a recurrent model which can be used to implement associative memories [1]. Quaternion-valued versions of the Hopfield network, which generalize complex-valued models, have been extensively investigated in the past years [28, 29, 30, 36, 37, 38, 39, 40, 41, 42]. A comprehensive review of several types of quaternionic HNNs (QHNNs) can be found in [26, 28]. Briefly, the main difference between the several QHNN models resides in the activation function. In this paper, we consider the quaternion-valued activation function σ given by (6), whose output is obtained by normalizing its argument to length one [29, 30]. The resulting network can be implemented and analyzed more easily than the multistate quaternion-valued Hopfield neural network proposed by Isokawa et al. [25, 26].

Furthermore, as far as we know, it is the unique version of the Hopfield network on unit quaternions that always yields a convergent sequence in the asynchronous update mode under the usual conditions on the synaptic weights [28]. We would like to point out that, together with the QHNN on unit quaternions, the QHNN based on the twin-multistate activation function is the only other stable quaternion-valued Hopfield network [41]. The QHNN based on the twin-multistate activation function, however, does not generalize the bipolar and complex-valued models, i.e., the bipolar and complex-valued models are not particular instances of the twin-multistate QHNN.

The QHNN is defined as follows: Let w_{ij} ∈ H denote the jth quaternionic synaptic weight of the ith neuron of a network with n neurons. Also, let the state of the QHNN at time t be represented by a column quaternion-valued vector x(t) = [x_1(t), ..., x_n(t)]^T ∈ S^n, that is, the unit quaternion x_i(t) = x_{i0}(t) + x_{i1}(t) i + x_{i2}(t) j + x_{i3}(t) k corresponds to the state of the ith neuron at time t. Given an initial state (or input vector) x(0) = [x_1, ..., x_n]^T ∈ S^n, the QHNN defines recursively the sequence of quaternion-valued vectors x(0), x(1), x(2), ... by means of the equation

    x_i(t+1) = σ(a_i(t)),  if 0 < |a_i(t)| < +∞,   and   x_i(t+1) = x_i(t),  otherwise,   (8)

where

    a_i(t) = Σ_{j=1}^{n} w_{ij} x_j(t),   (9)

is the activation potential of the ith neuron at iteration t.

In analogy with the traditional real-valued bipolar Hopfield network, the sequence produced by (8) and (9) using the asynchronous update mode is convergent for any initial state x(0) ∈ S^n if the synaptic weights satisfy [29]:

    w_{ij} = w̄_{ji}   and   w_{ii} ≥ 0,   ∀ i, j ∈ {1, ..., n}.   (10)

Here, the inequality w_{ii} ≥ 0 means that w_{ii} is a non-negative real number. Moreover, the synaptic weights of a QHNN are usually determined using either the correlation or the projection rule [26]. Both the correlation and the projection rule yield synaptic weights that satisfy (10).

Consider a fundamental memory set U = {u^1, ..., u^p}, where each u^ξ = [u^ξ_1, ..., u^ξ_n]^T is a quaternion-valued column vector whose components u^ξ_i = u^ξ_{i0} + u^ξ_{i1} i + u^ξ_{i2} j + u^ξ_{i3} k are unit quaternions. In the quaternionic version of the correlation rule, also called Hebbian learning [26], the synaptic weights are given by

    w^c_{ij} = (1/n) Σ_{ξ=1}^{p} u^ξ_i ū^ξ_j,   ∀ i, j ∈ {1, 2, ..., n}.   (11)

Unfortunately, like the real-valued correlation recording recipe, the quaternionic correlation rule is subject to cross-talk between the original vectors u^1, ..., u^p. In contrast, the projection rule, also known as the generalized-inverse recording recipe, is a non-local storage prescription that can suppress the cross-talk effect between the fundamental memories u^1, ..., u^p [13]. Formally, in the projection rule the synaptic weights are defined by

    w^p_{ij} = (1/n) Σ_{η=1}^{p} Σ_{ξ=1}^{p} u^η_i c^{-1}_{ηξ} ū^ξ_j,   (12)

where c^{-1}_{ηξ} denotes the (η, ξ)-entry of the quaternion-valued inverse of the matrix C ∈ H^{p×p} given by

    c_{ηξ} = (1/n) Σ_{j=1}^{n} ū^η_j u^ξ_j = (1/n) ⟨u^ξ, u^η⟩,   ∀ η, ξ ∈ {1, ..., p}.   (13)

It is not hard to show that, if the matrix C is invertible, then Σ_{j=1}^{n} w^p_{ij} u^ξ_j = u^ξ_i for all ξ = 1, ..., p and i = 1, ..., n [26]. Therefore, all the fundamental memories are fixed points of the QHNN with the projection rule. On the downside, the projection rule requires the inversion of a p × p quaternion-valued matrix.
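To illustrate the QHNN dynamics, the following hedged Julia sketch builds the correlation (Hebbian) weights of Eq. (11) and performs one asynchronous sweep of Eqs. (8)-(9). It reuses the Quat helpers sketched in Section 2; scale and zeroq are auxiliary names introduced here for illustration only.

    scale(q::Quat, s::Real) = Quat(s*q.a, s*q.b, s*q.c, s*q.d)
    zeroq() = Quat(0.0, 0.0, 0.0, 0.0)

    # Correlation rule, Eq. (11): w_ij = (1/n) * sum_xi u_i^xi * conj(u_j^xi).
    function correlation_weights(U::Vector{Vector{Quat}})   # U[ξ][i] = u_i^ξ
        n = length(U[1])
        W = [zeroq() for i in 1:n, j in 1:n]
        for u in U, i in 1:n, j in 1:n
            W[i, j] = W[i, j] + scale(u[i] * conj(u[j]), 1 / n)
        end
        return W
    end

    # One asynchronous sweep of Eqs. (8)-(9): neurons are visited in order and updated in place.
    function qhnn_sweep!(x::Vector{Quat}, W::Matrix{Quat})
        n = length(x)
        for i in 1:n
            a = reduce(+, W[i, j] * x[j] for j in 1:n)   # activation potential, Eq. (9)
            if abs(a) > 0                                # keep the previous state when a = 0
                x[i] = sigma(a)
            end
        end
        return x
    end

Repeating qhnn_sweep! until the state stops changing mimics the convergent asynchronous dynamics discussed above.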
4. Quaternion-Valued Recurrent Correlation Neural Networks

Recurrent correlation neural networks (RCNNs), formerly known as recurrent correlation associative memories (RCAMs), were introduced in 1991 by Chiueh and Goodman for the storage and recall of n-bit vectors [14]. In contrast to the original Hopfield neural network, which has a limited storage capacity, some RCNN models can reach the storage capacity of an ideal associative memory [43]. Furthermore, certain RCNNs can be viewed as kernelized versions of the Hopfield network with Hebbian learning [16, 17, 18]. Finally, RCNNs are closely related to the dense associative memory model introduced by Krotov and Hopfield to establish the duality between associative memories and deep learning [44, 45].

The RCNNs have been generalized for the storage and recall of complex-valued and quaternion-valued vectors [31, 32]. In the following, we briefly review the quaternionic recurrent correlation neural networks (QRCNNs). Precisely, to pave the way for the development of the new models introduced in the next section, let us derive the QRCNNs from the correlation-based QHNN described by (8), (9), and (11).

Consider a fundamental memory set U = {u^1, ..., u^p} ⊂ S^n. Using the synaptic weights w^c_{ij} given by (11), we conclude from (9) that the activation potential of the ith neuron at iteration t of the correlation-based QHNN satisfies

    a_i(t) = Σ_{j=1}^{n} w^c_{ij} x_j(t) = Σ_{j=1}^{n} [ (1/n) Σ_{ξ=1}^{p} u^ξ_i ū^ξ_j ] x_j(t) = Σ_{ξ=1}^{p} u^ξ_i [ (1/n) Σ_{j=1}^{n} ū^ξ_j x_j(t) ] = Σ_{ξ=1}^{p} u^ξ_i ( (1/n) ⟨x(t), u^ξ⟩ ).

In words, the activation potential a_i(t) is given by a weighted sum of u^1_i, ..., u^p_i. Moreover, the weights are proportional to the inner product between the current state x(t) and the fundamental memory u^ξ.

In the QRCNN, the activation potential a_i(t) is also given by a weighted sum of u^1_i, ..., u^p_i. Following a reasoning similar to the "kernel trick" [46], however, the weights are given by a function of the real part of the inner product ⟨x(t), u^ξ⟩. Precisely, let f : [−1, 1] → R be a (real-valued) continuous and monotone non-decreasing function referred to as the excitation function. Given a quaternionic input vector x(0) = [x_1(0), ..., x_n(0)]^T ∈ S^n, a QRCNN defines recursively a sequence {x(t)}_{t≥0} of quaternion-valued vectors by means of (8), where the activation potential of the ith output neuron at time t is given by

    a_i(t) = Σ_{ξ=1}^{p} w_ξ(t) u^ξ_i,   ∀ i = 1, ..., n,   (14)

with

    w_ξ(t) = f( (1/n) Re⟨x(t), u^ξ⟩ ),   ∀ ξ ∈ {1, ..., p}.   (15)

Alternatively, the dynamics of a QRCNN can be described using a matrix-vector notation. Let U = [u^1, ..., u^p] ∈ S^{n×p} be the matrix whose columns correspond to the fundamental memories, let U* denote the conjugate transpose of U, and assume the functions f, σ, and Re{·} are evaluated in a component-wise manner. Given an initial state x(0), the dynamics of a QRCNN using synchronous update is described by the equations

    w(t) = f( Re{U* x(t)} / n ),   (16)

and

    x(t+1) = σ( U w(t) ).   (17)

Like the original [14] and the complex-valued RCNNs [31], a QRCNN is implemented by the fully connected two-layer neural network with p hidden neurons shown in Figure 1a). The first layer evaluates f at the real part of the inner product ⟨x(t), u^ξ⟩ divided by n.
Equivalently, the first layer is equipped with quaternion-valued neurons whose quaternionic synaptic weights are given by the matrix U* and whose activation function is ϕ(·) = f(Re{·}). The output layer evaluates σ at a weighted sum of u^1, ..., u^p. In other words, the synaptic weights of the output neurons are given by the quaternionic matrix U and their activation function is σ. Note that the first layer encodes the quaternion-valued vector x(t) of length n into a real-valued vector w(t) of length p. Similarly, the next quaternion-valued state vector x(t+1), produced by the output layer, corresponds to the decoded version of the p-dimensional real-valued vector w(t).

[Figure 1: The network topology of quaternionic recurrent correlation and projection neural networks: a) QRCNN, with layers U*, f(Re{·}/n), U, and σ(·); b) QRPNN, with layers U*, f(Re{·}/n), V, and σ(·).]

Examples of QRCNNs include the following straightforward quaternionic generalizations of the bipolar RCNNs:

1. The identity QRCNN, obtained by considering in (15) the identity function f_i(x) = x.

2. The high-order QRCNN, which is determined by the function

    f_h(x; q) = (1 + x)^q,   q > 1.   (18)

3. The potential-function QRCNN, which is obtained by considering in (15) the function

    f_p(x; L) = 1 / (1 − x + ε_p)^L,   L ≥ 1,   (19)

where ε_p > 0 is a small number introduced to avoid a division by zero when x = 1. (Like in our previous works [31, 32], we adopted the value ε_p = √ε_mach, where ε_mach denotes the machine floating-point relative accuracy, in our computational implementation of f_p.)

4. The exponential QRCNN, which is determined by an exponential

    f_e(x; α) = e^{αx},   α > 0.   (20)

Note that QRCNNs generalize both bipolar and complex-valued RCNNs [14, 31]. Precisely, the bipolar and the complex-valued models are obtained by considering vectors x = [x_1, ..., x_n]^T ∈ S^n whose components satisfy respectively x_j = x_{j0} + 0i + 0j + 0k and x_j = x_{j0} + x_{j1} i + 0j + 0k for all j = 1, ..., n. Furthermore, the identity QRCNN generalizes the traditional bipolar correlation-based Hopfield neural network, but it does not generalize the correlation-based QHNN. Indeed, in contrast to the correlation-based QHNN, the identity QRCNN uses only the real part of the inner product ⟨x(t), u^ξ⟩.

Let us briefly turn our attention to the high-order, potential-function, and exponential excitation functions. For a fixed first argument x, the functions f_h, f_p, and f_e are all exponential with respect to their parameters. Precisely, these three excitation functions belong to the following family of parametric functions:

    F = { f(x; λ) : f(x; λ) = [A(x)]^λ for λ ≥ 0 and a continuous function A such that 0 ≤ A(x_1) < A(x_2) whenever −1 ≤ x_1 < x_2 ≤ 1 }.   (21)

The family F is particularly important for the implementation of associative memories [32]. Precisely, if the excitation function belongs to the family F, then a QRCNN can reach the storage capacity of an ideal associative memory by choosing a sufficiently large parameter λ. On the downside, overflow imposes an upper bound on the parameter λ of an excitation function f ∈ F, which may limit the application of a QRCNN as an associative memory.

Finally, we would like to point out that, independently of the initial state x(0) ∈ S^n and the update mode (synchronous or asynchronous), a QRCNN model always yields a convergent sequence {x(t)}_{t≥0} [32]. Therefore, QRCNNs are potential models to implement associative memories.
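For reference, the four excitation functions above can be written in Julia as one-liners. This is only a sketch; the default parameter values shown here (q = 5, L = 3, α = 4) match the bipolar experiments of Section 6.1 and are otherwise arbitrary, and ε_p = √ε_mach follows the footnote above.

    f_identity(x)           = x                                   # identity QRCNN
    f_high_order(x; q = 5)  = (1 + x)^q                           # Eq. (18)
    f_potential(x; L = 3)   = 1 / (1 - x + sqrt(eps()))^L         # Eq. (19), eps_p = sqrt(eps_mach)
    f_exponential(x; a = 4) = exp(a * x)                          # Eq. (20)

Any of these (with a parameter fixed, e.g. x -> f_exponential(x; a = 15)) can be passed as the excitation function f in the recall sketches of the next section.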
Moreover, due to the non-linearity of the activation functions of the hidden neurons, the high-order, potential-function, and exponential QRCNNs may overcome the rotational invariance problem found in the quaternionic Hopfield neural network [30]. On the downside, like the correlation-based quaternionic Hopfield network, QRCNNs may suffer from cross-talk between the fundamental memories u^1, ..., u^p. Inspired by the projection rule, the next section introduces improved models which overcome the cross-talk problem of the QRCNNs.

5. Quaternion-Valued Recurrent Projection Neural Networks

Quaternion-valued recurrent projection neural networks (QRPNNs) combine the main idea behind the projection rule and the QRCNN models to yield high-capacity associative memories. Specifically, using the synaptic weights w^p_{ij} given by (12), the activation potential of the ith neuron at time t of the projection-based QHNN is

    a_i(t) = Σ_{j=1}^{n} w^p_{ij} x_j(t) = Σ_{j=1}^{n} [ (1/n) Σ_{η=1}^{p} Σ_{ξ=1}^{p} u^η_i c^{-1}_{ηξ} ū^ξ_j ] x_j(t) = Σ_{η=1}^{p} Σ_{ξ=1}^{p} u^η_i c^{-1}_{ηξ} [ (1/n) Σ_{j=1}^{n} ū^ξ_j x_j(t) ] = Σ_{ξ=1}^{p} ( Σ_{η=1}^{p} u^η_i c^{-1}_{ηξ} ) (1/n) ⟨x(t), u^ξ⟩.

In analogy to the QRCNN, we replace the term proportional to the inner product between x(t) and u^ξ by the weight w_ξ(t) given by (15). Accordingly, we define c^{-1}_{ηξ} as the (η, ξ)-entry of the inverse of the real-valued matrix C ∈ R^{p×p} given by

    c_{ηξ} = f( (1/n) Re⟨u^ξ, u^η⟩ ),   ∀ η, ξ ∈ {1, ..., p}.   (22)

Furthermore, to simplify the computation, we define

    v^ξ_i = Σ_{η=1}^{p} u^η_i c^{-1}_{ηξ},   (23)

for all i = 1, ..., n and ξ = 1, ..., p. Thus, the activation potential of a QRPNN is given by

    a_i(t) = Σ_{ξ=1}^{p} w_ξ(t) v^ξ_i,   ∀ i = 1, ..., n.   (24)

Concluding, given a fundamental memory set U = {u^1, ..., u^p} ⊂ S^n, define the p × p real-valued matrix C by means of (22) and compute the quaternion-valued vectors v^1, ..., v^p using (23). Like the QRCNN, given an input vector x(0) ∈ S^n, a QRPNN yields the sequence {x(t)}_{t≥0} by means of (8), where the activation potential of the ith output neuron at time t is given by (24), with w_ξ(t) defined by (15).

Alternatively, using a matrix-vector notation, a synchronous QRPNN can be described as follows: Let U = [u^1, ..., u^p] ∈ S^{n×p} be the quaternion-valued matrix whose columns correspond to the fundamental memories. Define the real-valued matrix C ∈ R^{p×p} and the quaternion-valued matrix V ∈ H^{n×p} by means of the equations

    C = f( Re{U* U} / n )   and   V = U C^{-1},   (25)

where the excitation function f is evaluated in an entry-wise manner. Note that the two equations in (25) are equivalent to (22) and (23), respectively. Given the initial state x(0), a QRPNN recursively defines

    w(t) = f( Re{U* x(t)} / n ),   (26)

and

    x(t+1) = σ( V w(t) ),   (27)

where f : [−1, 1] → R and σ : H* → S are evaluated in a component-wise manner. Algorithm 1, formulated using matrix notation, summarizes the implementation of a QRPNN using synchronous update. We would like to point out that a QRCNN is obtained by setting V = U in Algorithm 1.
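The following hedged Julia sketch mirrors the design phase (Eqs. (22)-(23)) and the synchronous recall of Algorithm 1 in index form, reusing the Quat helpers of the previous sections; the function names qrpnn_design and qrpnn_recall are ours and this is not the authors' released code (available at the repository cited in Section 6). Passing V = U to the recall routine recovers the corresponding QRCNN.

    using LinearAlgebra   # for inv

    Base.:-(p::Quat, q::Quat) = Quat(p.a - q.a, p.b - q.b, p.c - q.c, p.d - q.d)

    # Design phase: C is real and p-by-p; v^ξ_i = Σ_η u^η_i * (C^{-1})[η,ξ], stored as V[ξ,i].
    function qrpnn_design(U::Vector{Vector{Quat}}, f::Function)
        p, n = length(U), length(U[1])
        C  = [f(realpart(inner(U[ξ], U[η])) / n) for η in 1:p, ξ in 1:p]   # Eq. (22)
        Ci = inv(C)                                                        # assumes C is invertible
        V  = [reduce(+, scale(U[η][i], Ci[η, ξ]) for η in 1:p) for ξ in 1:p, i in 1:n]   # Eq. (23)
        return V
    end

    # Recall phase (Algorithm 1, synchronous update).
    function qrpnn_recall(U, V, f::Function, x0::Vector{Quat}; tmax = 1000, tol = 1e-6)
        p, n = length(U), length(x0)
        x = copy(x0)
        for t in 1:tmax
            w = [f(realpart(inner(x, U[ξ])) / n) for ξ in 1:p]                           # Eq. (15)/(26)
            y = [sigma(reduce(+, scale(V[ξ, i], w[ξ]) for ξ in 1:p)) for i in 1:n]       # Eq. (24)/(27)
            maximum(abs(y[i] - x[i]) for i in 1:n) < tol && return y                     # converged
            x = y
        end
        return x
    end

For example, V = qrpnn_design(U, x -> f_exponential(x; a = 15)) followed by qrpnn_recall(U, V, x -> f_exponential(x; a = 15), x0) retrieves a stored pattern from a corrupted input x0.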
Like the QRCNN, a QRPNN is also implemented by the fully connected two-layer neural network with p hidden neurons shown in Figure 1b). The difference between the QRCNN and the QRPNN is the synaptic weight matrix of the output layer. In other words, they differ in the way the real-valued vector w(t) is decoded to yield the next state x(t+1).

From the computational point of view, although the training phase of a QRPNN requires O(p^3 + np^2) operations to compute the matrices C^{-1} and V, QRPNNs usually exhibit better noise tolerance than the QRCNNs. Moreover, the following theorem shows that QRPNNs overcome the cross-talk between the fundamental memories if the matrix C is invertible. Precisely, the following theorem shows that all the fundamental memories are stationary states of a QRPNN if the matrix C is invertible.

Algorithm 1: Quaternion-valued Recurrent Projection Neural Network
Data:
1. A continuous and non-decreasing real-valued function f : R → R.
2. Matrices U = [u^1, ..., u^p] and V = [v^1, ..., v^p].
3. The input vector x = [x_1, ..., x_n].
4. Maximum number of iterations t_max and a tolerance τ > 0.
Result: Retrieved vector y.
Initialize t = 0 and Δ = τ + 1.
while t ≤ t_max and Δ ≥ τ do
  1. Compute the weights w = f( Re{U* x} / n ).
  2. Compute the next state y = σ(V w).
  3. Update t ← t + 1, Δ ← ‖y − x‖, and x ← y.

Theorem 1. Given a fundamental memory set U = {u^1, ..., u^p}, define the real-valued p × p matrix C by (22). If C is invertible, then all fundamental memories u^1, ..., u^p are stationary states of the QRPNN defined by (8), (15), and (24).

Proof. Let us assume that the matrix C given by (22) is invertible. Also, suppose the QRPNN is initialized at a fundamental memory, that is, x(0) = u^γ for some γ ∈ {1, ..., p}. From (15) and (22), we conclude that

    w_ξ(0) = f( Re⟨u^γ, u^ξ⟩ / n ) = c_{ξγ},   ∀ ξ = 1, ..., p.

Furthermore, from (24) and (23), we obtain the following identities for any i ∈ {1, ..., n}:

    a_i(0) = Σ_{ξ=1}^{p} w_ξ(0) v^ξ_i = Σ_{ξ=1}^{p} ( Σ_{η=1}^{p} u^η_i c^{-1}_{ηξ} ) c_{ξγ} = Σ_{η=1}^{p} u^η_i ( Σ_{ξ=1}^{p} c^{-1}_{ηξ} c_{ξγ} ) = Σ_{η=1}^{p} u^η_i δ_{ηγ} = u^γ_i,

where δ_{ηγ} is the Kronecker delta, that is, δ_{ηγ} = 1 if η = γ and δ_{ηγ} = 0 if η ≠ γ. From (8), we conclude that the fundamental memory u^γ is a fixed point of the QRPNN if the matrix C is invertible.

Let us investigate further the relationship between QRPNNs and QRCNNs. Theorem 1 shows that QRPNNs can be used to implement an associative memory whenever the matrix C is invertible. It turns out that QRCNNs can also be used to implement an associative memory using an excitation function f ∈ F with a sufficiently large parameter λ [32]. Assuming f ∈ F, the following theorem shows that the matrix C given by (22) is invertible if the parameter λ is sufficiently large. Moreover, the QRPNN and the QRCNN coincide in this case.

Theorem 2. Consider a fundamental memory set U = {u^1, ..., u^p} ⊂ S^n and an excitation function f ∈ F, where F is the family of parameterized functions given by (21). The matrix C given by (22) is invertible for a sufficiently large parameter λ. Furthermore, given an arbitrary state vector x ∈ S^n, let x^P and x^C denote respectively the states of the QRPNN and QRCNN models after one single synchronous update, that is,

    x^P = σ(V w)   and   x^C = σ(U w),   (28)

where w = f( Re{U* x} / n ; λ ). In this case, x^P approaches x^C as λ tends to infinity.
Formally, we have

    lim_{λ→∞} ‖x^P − x^C‖_2 = 0,   (29)

where ‖x‖_2 = √⟨x, x⟩ denotes the Euclidean norm.

Proof. First of all, recall that f(x; λ) = [A(x)]^λ because f ∈ F. Let A_1 = A(1) > 0 denote the maximum value of the function A. Also, note that

    Re{⟨x, y⟩} = n − (1/2) ‖x − y‖_2^2,   ∀ x, y ∈ S^n.

As a consequence, Re{⟨x, y⟩}/n = 1 if and only if x = y, and Re{⟨x, y⟩}/n < 1 if x ≠ y. Therefore, an entry of the matrix C given by (22) satisfies

    lim_{λ→∞} c_{ηξ} / A_1^λ = lim_{λ→∞} (1/A_1^λ) f( Re⟨u^ξ, u^η⟩/n ; λ ) = lim_{λ→∞} [ A(Re⟨u^ξ, u^η⟩/n) / A_1 ]^λ = 1 if ξ = η, and 0 otherwise.

Equivalently,

    lim_{λ→∞} (1/A_1^λ) C = I,   (30)

where I denotes the identity matrix. In a similar fashion, we conclude that

    lim_{λ→∞} w / A_1^λ = e_ξ if x = u^ξ, and 0 otherwise,   (31)

where e_ξ denotes the ξth column of the p × p identity matrix. Since the matrix product is continuous and the inverse of C is unique, C^{-1} exists and approaches A_1^{-λ} I as λ increases. In other words, (30) ensures the existence of C^{-1} for λ sufficiently large.

Let us now show (29). To this end, let us define z = C^{-1} w or, equivalently, w = C z. From (30) and (31), we conclude that

    lim_{λ→∞} z = [ lim_{λ→∞} (1/A_1^λ) C ] [ lim_{λ→∞} z ] = lim_{λ→∞} (1/A_1^λ) C z = lim_{λ→∞} (1/A_1^λ) (C z) = lim_{λ→∞} w / A_1^λ = e_ξ.

Now, recalling that V = U C^{-1}(λ), that ‖·‖ and σ are continuous, that A_1 > 0, and that σ(cq) = σ(q) for any real c > 0, we obtain

    lim_{λ→∞} ‖x^P − x^C‖_2 = lim_{λ→∞} ‖σ(V w) − σ(U w)‖_2 = lim_{λ→∞} ‖σ(U C^{-1} w) − σ( (1/A_1^λ) U w )‖_2 = ‖σ(U lim_{λ→∞} z) − σ(U lim_{λ→∞} (1/A_1^λ) w)‖_2 = ‖σ(U e_ξ) − σ(U e_ξ)‖_2 = 0.

The last identity concludes the proof of the theorem.

We would like to point out that the basic idea behind Theorem 2 is that the matrix C given by (22) can be approximated by a multiple of the identity matrix for a sufficiently large parameter λ of an excitation function f ∈ F. Borrowing the terminology from [18], we say that a QRCNN as well as a QRPNN is in saturated mode if the matrix C can be approximated by cI for some c > 0. In the saturated mode, QRCNNs and QRPNNs coincide.

In analogy to the QRCNNs, the identity mapping and the functions f_h, f_p, and f_e given by (18), (19), and (20) are used to define respectively the identity QRPNN, the high-order QRPNN, the potential-function QRPNN, and the exponential QRPNN.

Note that the identity QRPNN generalizes the traditional bipolar projection-based Hopfield neural network. The identity QRPNN, however, does not generalize the projection-based QHNN because the former uses only the real part of the inner product between x(t) and u^ξ. In fact, in contrast to the projection-based QHNN, the design of a QRPNN does not require the inversion of a quaternion-valued matrix but only the inversion of a real-valued matrix.

5.1. Bipolar RPNNs and Recurrent Kernel Associative Memories

As pointed out previously, quaternion-valued RPNNs reduce to bipolar models when the fundamental memories are all real-valued, that is, when their vector part is zero. In this subsection, we address the relationship between bipolar RPNNs and the recurrent kernel associative memories (RKAMs) proposed by García and Moreno [16, 17] and further investigated by Perfetti and Ricci [18].

An RKAM model is defined as follows. Let κ denote an inner-product kernel and ρ > 0 be a user-defined parameter.
Given a fundamental memory set U = {u^1, ..., u^p} ⊆ {−1, +1}^n, define the Lagrange multiplier vector β_i = [β_{i1}, ..., β_{ip}] as the solution of the following quadratic problem for i = 1, ..., n:

    minimize   Q(β_i) = (1/2) Σ_{ξ,η=1}^{p} β^ξ_i β^η_i u^ξ_i u^η_i κ(u^ξ, u^η) − Σ_{ξ=1}^{p} β^ξ_i,
    subject to   0 ≤ β^ξ_i ≤ ρ,   ∀ ξ = 1, ..., p.   (32)

Then, given a bipolar initial state x(0) ∈ {−1, +1}^n, an RKAM evolves according to the following equation for all i = 1, ..., n:

    x_i(t+1) = sgn( Σ_{ξ=1}^{p} β^ξ_i u^ξ_i κ(u^ξ, x(t)) ).   (33)

Note that x_i(t+1), the next state of the ith neuron of an RKAM, corresponds to the output of a support vector machine (SVM) classifier without the bias term, determined using the training set T_i = {(u^ξ, u^ξ_i) : ξ = 1, ..., p}. Therefore, the design of an RKAM requires, in some sense, training n independent support vector classifiers (one SVM for each output neuron of the RKAM!). Furthermore, like in the soft-margin SVM, the user-defined parameter ρ controls the trade-off between the training error and the separation margin [46]. In the associative memory context, the larger the parameter ρ, the larger the storage capacity of the RKAM. Conversely, some fundamental memories may fail to be stationary states of the RKAM if ρ is small.

Let us now compare the RKAMs with bipolar RCNNs and RPNNs. To this end, let us assume the excitation function f is a valid kernel. The exponential function f_e, for example, yields a valid kernel, namely the Gaussian radial-basis function kernel [18]. As pointed out by Perfetti and Ricci [18], the main difference between an RKAM and an RCNN is the presence of the Lagrange multipliers in the former. Precisely, the Lagrange multipliers are β^ξ_i = 1 in the RCNNs while, in the RKAM, they are obtained by solving (32). Furthermore, the RKAM is equivalent to the corresponding bipolar RCNN in the saturated mode, that is, when the matrix C given by (22) exhibits diagonal dominance [18]. In a similar fashion, we observe that an RKAM and an RPNN coincide if v^ξ_i = β^ξ_i u^ξ_i for all i = 1, ..., n and ξ = 1, ..., p. The following theorem shows that this equation holds true if all the constraints are inactive at the solution of the quadratic problem (32).

Theorem 3. Let f : [−1, +1] → R be a continuous and non-decreasing function such that the following equation yields a valid kernel:

    κ(x, y) = f( ⟨x, y⟩ / n ),   ∀ x, y ∈ {−1, +1}^n.   (34)

Consider a fundamental memory set U = {u^1, ..., u^p} ⊂ {−1, +1}^n such that the matrix C given by (22) is invertible. If the solutions of the quadratic problem defined by (32) satisfy 0 < β^ξ_i < ρ for all i = 1, ..., n and ξ = 1, ..., p, then the RKAM defined by (32) and (33) coincides with the RPNN defined by (8), (15), and (24). Alternatively, the RKAM and the bipolar RPNN coincide if the vectors v^ξ given by (23) satisfy 0 < v^ξ_i u^ξ_i < ρ for all i = 1, ..., n and ξ = 1, ..., p.

Proof. First of all, note that κ(u^ξ, u^η) equals c_{ξη} given by (22). Let us first show that the RKAM coincides with the bipolar RPNN if the inequalities 0 < β^ξ_i < ρ hold true for all i and ξ. If there is no active constraint at the solution of (32), then the Lagrange multipliers β_i are also the solution of the unconstrained version of the quadratic problem (32):

    minimize   Q(β_i) = (1/2) Σ_{ξ,η=1}^{p} β^ξ_i u^ξ_i c_{ξη} u^η_i β^η_i − Σ_{ξ=1}^{p} β^ξ_i,   ∀ i = 1, ..., n.   (35)
It turns out that the minimum of (35), obtained by imposing ∂Q/∂β^ξ_i = 0, is the solution of the linear system

    Σ_{η=1}^{p} u^ξ_i c_{ξη} u^η_i β^η_i = 1,   ξ = 1, ..., p.   (36)

Multiplying (36) by u^ξ_i and recalling that (u^ξ_i)^2 = 1, we obtain

    Σ_{η=1}^{p} c_{ξη} u^η_i β^η_i = u^ξ_i,   ξ = 1, ..., p.   (37)

Since the matrix C is invertible, the solution of the linear system of equations (37) is

    u^ξ_i β^ξ_i = Σ_{η=1}^{p} c^{-1}_{ξη} u^η_i,   ∀ ξ = 1, ..., p and i = 1, ..., n,   (38)

where c^{-1}_{ξη} denotes the (ξ, η)-entry of C^{-1}. We conclude the first part of the proof by noting that the right-hand sides of equations (38) and (23) coincide. Therefore, v^ξ_i = β^ξ_i u^ξ_i for all i = 1, ..., n and ξ = 1, ..., p, and the RKAM coincides with the bipolar RPNN.

On the other hand, if 0 < v^ξ_i u^ξ_i < ρ for all i = 1, ..., n and ξ = 1, ..., p, then v^ξ_i = β^ξ_i u^ξ_i is a solution of (38). Equivalently, β^ξ_i = u^ξ_i v^ξ_i is the solution of the unconstrained quadratic problem (35) as well as of the quadratic problem with bound constraints (32). As a consequence, the RKAM defined by (32) and (33) coincides with the RPNN defined by (8), (15), and (24).

We would like to point out that the condition 0 < β^ξ_i < ρ often occurs in the saturated mode, that is, when the matrix C exhibits diagonal dominance. In particular, the matrix C is diagonally dominant for a sufficiently large parameter λ of an excitation function f ∈ F. In fact, given a parametric function f ∈ F, the Lagrange multiplier β^ξ_i approaches 1/f(1; λ) as λ increases [18]. Concluding, in the saturated mode, the RKAM, RCNN, and RPNN are all equivalent. We shall confirm this remark in the computational experiments presented in the next section.

6. Computational Experiments

This section provides computational experiments comparing the performance of the QHNNs, QRCNNs, and the new QRPNN models as associative memory models. Although the paper addresses quaternionic models, let us begin by addressing the noise tolerance and storage capacity of the recurrent neural networks for the storage and recall of bipolar real-valued vectors. The bipolar case is also used to confirm the results presented in Section 5.1. We address the noise tolerance and storage capacity for quaternion-valued vectors subsequently. We would like to point out that the source codes of the computational experiments, implemented in the Julia language, are available at https://github.com/mevalle/Quaternion-valued-Recurrent-Projection-Neural-Networks.

6.1. Bipolar Associative Memories

Let us compare the storage capacity and noise tolerance of the Hopfield neural networks (HNNs), the original RCNNs, and the new RPNNs designed for the storage of p = 36 randomly generated bipolar (real-valued) vectors of length n = 100. Precisely, we consider the correlation-based and projection-based Hopfield neural networks, and the identity, high-order, potential-function, and exponential RCNN and RPNN models with parameters q = 5, L = 3, and α = 4, respectively.

To evaluate the storage capacity and noise tolerance of the bipolar associative memories, the following steps have been performed 100 times for n = 100 and p = 36:

1. We synthesized associative memories designed for the storage and recall of a randomly generated fundamental memory set U = {u^1, ..., u^p} ⊂ {−1, +1}^n, where Pr[u^ξ_i = 1] = Pr[u^ξ_i = −1] = 0.5 for all i = 1, ..., n and ξ = 1, ..., p.
2. We probed the associative memories with an input vector x(0) = [x_1(0), ..., x_n(0)]^T obtained by reversing the components of u^1 with probability π, i.e., Pr[x_i(0) = −u^1_i] = π and Pr[x_i(0) = u^1_i] = 1 − π, for all i = 1, ..., n.

3. The associative memories have been iterated until they reached a stationary state or completed a maximum of 1000 iterations. A memory model succeeded to recall a stored item if the output equals u^1.

Figure 2 shows the probability of an associative memory recalling a fundamental memory as a function of the probability of noise introduced in the initial state. Note that the projection-based HNN coincides with the identity RPNN. Similarly, the correlation-based HNN coincides with the identity RCNN. Also, note that the RPNNs always succeeded to recall undistorted fundamental memories (zero noise probability). The high-order, potential-function, and exponential RCNNs also succeeded to recall undistorted fundamental memories. Nevertheless, the recall probabilities of the high-order and exponential RPNNs are greater than or equal to the recall probabilities of the corresponding RCNNs. In other words, the RPNNs exhibit better noise tolerance than the corresponding RCNNs. The potential-function RCNN and RPNN yielded similar recall probabilities.

[Figure 2: Recall probability of bipolar associative memories by the noise intensity introduced in the input vector.]

In a similar fashion, let us compare the storage capacity and noise tolerance of the RCNN, the RPNN, and the RKAM with the kernel given by (34) with f ≡ f_e. In this experiment, we considered ρ = 1000 and α = 1 and α = 3. Figure 3 shows the recall probabilities of the exponential bipolar RCNN, RPNN, and RKAM, with different parameter values, as a function of the noise probability introduced in the input. Note that both the RPNN and the RKAM outperformed the RCNN. Furthermore, the exponential bipolar RPNN and the RKAM coincided for both values of the parameter α ∈ {1, 3}.

[Figure 3: Recall probability of the exponential bipolar RCNN, RPNN, and RKAM by the noise intensity introduced in the input vector.]

According to Theorem 3, an RKAM coincides with a bipolar RPNN if the Lagrange multipliers satisfy 0 < β^ξ_i < ρ for all i = 1, ..., n and ξ = 1, ..., p. Figure 4 shows the histogram of the Lagrange multipliers obtained by solving (32) for a randomly generated matrix U and the exponential kernel with α = 1, 2, and 3. The vertical dotted lines correspond to the values e^{−3}, e^{−2}, and e^{−1}. Note that β^ξ_i approaches 1/f(1) = e^{−α} as α increases. More importantly, the inequalities 0 < β^ξ_i < ρ are satisfied for α > 1 and ρ ≥ 2.

[Figure 4: Histogram of the Lagrange multipliers of an RKAM model.]
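For illustration, one trial of the bipolar experiment described at the beginning of this subsection can be sketched in Julia as follows, reusing the qrpnn_design and qrpnn_recall sketches of Section 5 and treating bipolar entries as quaternions with zero vector part. The function names and the success test are ours; this is a simplified illustration of the protocol, not the released experimental code.

    bipolar_to_quat(v) = [Quat(float(vi), 0.0, 0.0, 0.0) for vi in v]

    function bipolar_trial(; n = 100, p = 36, prob_flip = 0.2, f = x -> exp(4x), tmax = 1000)
        U  = [bipolar_to_quat(rand([-1, 1], n)) for _ in 1:p]                      # random ±1 memories
        V  = qrpnn_design(U, f)                                                    # projection-based decoding
        x0 = [rand() < prob_flip ? scale(U[1][i], -1.0) : U[1][i] for i in 1:n]    # flip components of u^1
        y  = qrpnn_recall(U, V, f, x0; tmax = tmax)
        return maximum(abs(y[i] - U[1][i]) for i in 1:n) < 1e-6                    # success iff u^1 is recovered
    end

Averaging bipolar_trial over many repetitions and over a grid of prob_flip values reproduces, in spirit, the recall-probability curves of Figure 2.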
6.2. Quaternion-Valued Associative Memories

Let us now investigate the storage capacity and noise tolerance of the associative memory models for the storage and recall of p = 36 randomly generated quaternion-valued vectors of length n = 100. In this example, we considered the projection-based and the correlation-based quaternion-valued Hopfield neural networks (QHNNs) as well as the identity, high-order, potential-function, and exponential QRCNNs and QRPNNs with parameters q = 20, L = 3, and α = 15. These parameters have been determined so that the QRCNNs have more than a 50% probability of recalling an undistorted fundamental memory. In analogy to the previous example, the following steps have been performed 100 times:

1. We synthesized associative memories designed for the storage and recall of uniformly distributed fundamental memories U = {u^1, ..., u^p}. Formally, we defined u^ξ_i = RandQ for all i = 1, ..., n and ξ = 1, ..., p, where

    RandQ = (cos φ + i sin φ)(cos ψ + k sin ψ)(cos θ + j sin θ),

is a randomly generated unit quaternion obtained by sampling angles φ ∈ [−π, π), ψ ∈ [−π/4, π/4], and θ ∈ [−π/2, π/2) using a uniform distribution.

2. We probed the associative memories with an input vector x(0) = [x_1(0), ..., x_n(0)]^T obtained by replacing components of u^1 with probability π by a uniformly distributed component, i.e., Pr[x_i(0) = RandQ] = π and Pr[x_i(0) = u^1_i] = 1 − π, for all i = 1, ..., n.

3. The associative memories have been iterated until they reached a stationary state or completed a maximum of 1000 iterations. The memory model succeeded if the output equals the fundamental memory u^1.

Figure 5 shows the probability of a quaternion-valued associative memory recalling a fundamental memory as a function of the probability of noise introduced in the initial state. As expected, the QRPNNs always succeeded to recall undistorted fundamental memories. The potential-function and exponential QRCNNs also succeeded to recall undistorted fundamental memories. Indeed, the potential-function QRCNN and QRPNN yielded the same recall probability. The noise tolerances of the exponential QRCNN and QRPNN also coincided. Nevertheless, the recall probability of the high-order QRPNN is greater than the recall probability of the corresponding QRCNN. Furthermore, in contrast to the real-valued case, the projection QHNN differs from the identity QRPNN. In fact, the noise tolerance of the identity QRPNN is far greater than the noise tolerance of the projection QHNN.

[Figure 5: Recall probability of quaternion-valued associative memories by the noise intensity introduced in the input vector.]
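For completeness, the RandQ sampling and the component-replacement noise model used in steps 1 and 2 above can be sketched in Julia as follows, using the Quat helpers of Section 2; randq and corrupt are illustrative names of ours.

    # Random unit quaternion following the RandQ construction above.
    function randq()
        φ = -π + 2π * rand()            # φ ∈ [-π, π)
        ψ = -π/4 + (π/2) * rand()       # ψ ∈ [-π/4, π/4]
        θ = -π/2 + π * rand()           # θ ∈ [-π/2, π/2)
        Quat(cos(φ), sin(φ), 0.0, 0.0) * Quat(cos(ψ), 0.0, 0.0, sin(ψ)) * Quat(cos(θ), 0.0, sin(θ), 0.0)
    end

    # Replace each component of u by a fresh RandQ with probability prob (noise model of step 2).
    corrupt(u::Vector{Quat}, prob) = [rand() < prob ? randq() : u[i] for i in eachindex(u)]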
6.3. Storage and Recall of Color Images

In the previous subsection, we compared the performance of the QHNN, QRCNN, and QRPNN models designed for the storage and recall of uniformly distributed fundamental memories. Let us now compare the performance of the quaternion-valued associative memories for the storage and recall of color images. Specifically, let us compare the noise tolerance of the QHNN, QRCNN, and QRPNN models when the input is a color image corrupted by Gaussian noise. Recall that Gaussian noise is introduced in a color image, for example, due to faulty sensors [47]. Computationally, an image corrupted by Gaussian noise is obtained by adding a term drawn from a Gaussian distribution with zero mean and a fixed standard deviation to each channel of the color image.

At this point, we would like to recall that an RGB color image I can be converted to a unit quaternion-valued vector x = [x_1, ..., x_n] ∈ S^n, n = 1024, as follows [48]: Let I^R_i ∈ [0, 1], I^G_i ∈ [0, 1], and I^B_i ∈ [0, 1] denote respectively the red, green, and blue intensities at the ith pixel of the RGB color image I. For i = 1, ..., n, we first compute the phase angles

    φ_i = (−π + ε) + 2(π − ε) I^R_i,   (39)
    ψ_i = −π/4 + ε + (π/2 − 2ε) I^G_i,   (40)
    θ_i = −π/2 + ε + (π − 2ε) I^B_i,   (41)

where ε > 0 is a small number such that φ_i ∈ [−π, π), ψ_i ∈ [−π/4, π/4], and θ_i ∈ [−π/2, π/2) [49]. In our computational experiments, we adopted ε = 10^{−4}. Then, we define the unit quaternion-valued vector x using the phase-angle representation x_i = e^{φ_i i} e^{ψ_i k} e^{θ_i j} of its components. Equivalently, we have

    x_i = (cos φ_i + i sin φ_i)(cos ψ_i + k sin ψ_i)(cos θ_i + j sin θ_i),   ∀ i = 1, ..., n.   (42)

Conversely, given a unit quaternion-valued vector x, we first compute the phase angles φ_i, ψ_i, and θ_i of the component x_i using Table 2.2 in [49]. Afterwards, we obtain the RGB color image I by inverting (39)-(41), that is,

    I^R_i = (φ_i + π − ε) / (2(π − ε)),   I^G_i = (ψ_i + π/4 − ε) / (π/2 − 2ε),   and   I^B_i = (θ_i + π/2 − ε) / (π − 2ε),   (43)

for all i = 1, ..., n.

In order to compare the performance of the quaternion-valued associative memories, we used color images from the CIFAR dataset [34]. Recall that the CIFAR dataset contains 60000 RGB color images of size 32 × 32. In this experiment, we randomly selected p = 200 color images and converted them to unit quaternion-valued vectors u^1, ..., u^p. Similarly, we corrupted one of the p selected images with Gaussian noise and converted it to a quaternion-valued vector x ∈ S^n. The corrupted vector x has been presented to associative memories designed for the storage of the fundamental memory set U = {u^1, ..., u^p} ⊂ S^n.
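As an illustration of the encoding of Eqs. (39)-(42), the following Julia sketch maps RGB intensities in [0, 1] to unit quaternions; the decoding follows by solving (43) for the phase angles. The names rgb_to_quat, encode_image, and eps_angle are ours (eps_angle plays the role of ε = 10^{−4} above), and the sketch assumes the Quat product defined in Section 2.

    function rgb_to_quat(r, g, b; eps_angle = 1e-4)
        φ = (-π + eps_angle) + 2 * (π - eps_angle) * r        # Eq. (39)
        ψ = -π/4 + eps_angle + (π/2 - 2 * eps_angle) * g      # Eq. (40)
        θ = -π/2 + eps_angle + (π - 2 * eps_angle) * b        # Eq. (41)
        Quat(cos(φ), sin(φ), 0.0, 0.0) * Quat(cos(ψ), 0.0, 0.0, sin(ψ)) * Quat(cos(θ), 0.0, sin(θ), 0.0)
    end

    # Encode an image given as three pixel-intensity vectors R, G, B (length n = 1024 for a 32×32 image).
    encode_image(R, G, B) = [rgb_to_quat(R[i], G[i], B[i]) for i in eachindex(R)]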
Figure 6 shows an original color image selected from the CIFAR dataset, a color image corrupted by Gaussian noise with standard deviation 0.1, and the corresponding images retrieved by the associative memory models. Note that the correlation-based QHNN as well as the identity QRCNN failed to retrieve the original image due to the cross-talk between the stored items. Although the projection-based QHNN yielded an image visually similar to the original cab's image, this memory model also failed to retrieve the original image due to the magenta pixels near the cab's bumpers. All the other associative memories succeeded in retrieving the original image.

[Figure 6: Original color image, input image corrupted by Gaussian noise with standard deviation 0.1, and the corresponding images retrieved by the quaternion-valued associative memories: a) Original image; b) Corrupted image; c) Correlation-based QHNN; d) Projection-based QHNN; e) Identity QRCNN; f) High-order QRCNN (q = 70); g) Potential-function QRCNN (L = 5); h) Exponential QRCNN (α = 40); i) Identity QRPNN; j) High-order QRPNN (q = 70); k) Potential-function QRPNN (L = 5); l) Exponential QRPNN (α = 40).]

Quantitatively, we say that an associative memory succeeded to recall a stored image if the error given by the Euclidean norm ‖u^1 − y‖, where y denotes the retrieved quaternion-valued vector, is less than or equal to a tolerance τ = 10^{−4}. Table 1 shows the error produced by the QHNN, QRCNN, and QRPNN memory models. This table also contains the error between the fundamental memory u^1 and the quaternion-valued vector corresponding to the corrupted image.

    Corrupted image:            20.8
    Correlation-based QHNN:     35.2
    Projection-based QHNN:       1.9
    Identity QRCNN:             41.2
    High-order QRCNN:            8.3 × 10^{−11}
    Potential-function QRCNN:    3.0 × 10^{−15}
    Exponential QRCNN:           3.2 × 10^{−10}
    Identity QRPNN:              9.0 × 10^{−5}
    High-order QRPNN:            3.3 × 10^{−15}
    Potential-function QRPNN:    3.9 × 10^{−15}
    Exponential QRPNN:           1.6 × 10^{−15}

Table 1: Absolute error between the fundamental memory u^1 and either the input or the quaternion-valued vector recalled by an associative memory model.

For a better comparison of the noise tolerance of the quaternion-valued associative memories, we repeated the preceding experiment 100 times. We also considered images corrupted by Gaussian noise with several different standard deviation values. Figure 7 shows the probability of successful recall as a function of the standard deviation of the Gaussian noise introduced in the input image. In agreement with Theorem 1, the QRPNNs always succeeded to recall undistorted patterns (zero standard deviation). Note that the potential-function QRPNN and QRCNN coincided. From Theorem 2, we conclude that these neural networks are in saturated mode. Furthermore, as in the experiment described in the previous subsection, the projection-based QHNN differs from the identity QRPNN. The latter, however, yielded larger recall probabilities because it circumvents the rotational invariance present in the QHNN model [30].

[Figure 7: Recall probability of quaternion-valued associative memories by the standard deviation of the Gaussian noise introduced in the input.]

Finally, we repeated the experiment used to generate Figure 7 considering only the exponential QRPNN and QRCNN, but with different values of the parameter α.
Moreover, to better discriminate the QRPNN and the QRCNN models, instead of computing the recall probability, we computed the Euclidean error between the desired output u^1 and the retrieved vector y, that is, the error is given by ‖u^1 − y‖_2. Figure 8 depicts the average error as a function of the standard deviation of the Gaussian noise introduced in the original color image. Note that the errors produced by the exponential QRPNNs from an undistorted input are all around the machine precision, that is, around 10^{−14}. Equivalently, the QRPNNs succeeded to recall undistorted images. Note also that the error produced by both the QRPNN and QRCNN associative memories decreases as the parameter α increases. Nevertheless, the average error produced by the QRPNN is always below that of the corresponding QRCNN model. Finally, in accordance with Theorem 2, the exponential QRPNN and QRCNN coincide when the parameter α is sufficiently large, i.e., α = 160 in this experiment.

[Figure 8: Error between the desired output and the retrieved quaternion-valued vector by the standard deviation of the Gaussian noise introduced in the input, for the exponential QRCNN and QRPNN with α ∈ {10, 20, 40, 80, 160}.]

7. Concluding Remarks

In this paper, we presented the quaternion-valued recurrent projection neural networks (QRPNNs). Briefly, QRPNNs are obtained by combining the projection rule with the quaternion-valued recurrent correlation neural networks (QRCNNs). In contrast to the QRCNNs, however, QRPNNs always exhibit optimal storage capacity (see Theorem 1). Nevertheless, QRPNNs and QRCNNs coincide in the saturated mode (see Theorem 2). Also, the bipolar QRPNN and the recurrent kernel associative memory (RKAM) models coincide under mild conditions (see Theorem 3). The computational experiments provided in Section 6 show that the storage capacity and noise tolerance of QRPNNs (including the real-valued case) are greater than or equal to the storage capacity and noise tolerance of their corresponding QRCNNs.

In the future, using recent results on hypercomplex-valued Hopfield neural networks [50], we plan to extend the RCNN and RPNN models to other hypercomplex algebras such as hyperbolic numbers, commutative quaternions, and octonions. We also intend to investigate further the noise tolerance of the QRPNNs as well as to address the performance of the new associative memories for pattern reconstruction and classification.

Acknowledgments

This work was supported in part by CNPq under grant no. 310118/2017-4, FAPESP under grant no. 2019/02278-2, and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

References

[1] J. J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proceedings of the National Academy of Sciences 79 (1982) 2554–2558.
[2] J. Hopfield, D. Tank, Neural computation of decisions in optimization problems, Biological Cybernetics 52 (1985) 141–152.

[3] J. Gan, Discrete Hopfield neural network approach for crane safety evaluation, in: 2017 International Conference on Mechanical, System and Control Engineering (ICMSC), 2017, pp. 40–43. doi:10.1109/ICMSC.2017.7959439.

[4] Y. Song, B. Xing, L. Guo, X. Xu, System parameter identification experiment based on Hopfield neural network for self balancing vehicle, in: 2017 36th Chinese Control Conference (CCC), 2017, pp. 6887–6890. doi:10.23919/ChiCC.2017.8028442.

[5] Q. Wang, W. Shi, P. M. Atkinson, Z. Li, Land cover change detection at subpixel resolution with a Hopfield neural network, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8 (2015) 1339–1352. doi:10.1109/JSTARS.2014.2355832.

[6] J. Li, X. Li, B. Huang, L. Zhao, Hopfield neural network approach for supervised nonlinear spectral unmixing, IEEE Geoscience and Remote Sensing Letters 13 (2016) 1002–1006. doi:10.1109/LGRS.2016.2560222.

[7] G. Pajares, M. Guijarro, A. Ribeiro, A Hopfield neural network for combining classifiers applied to textured images, Neural Networks 23 (2010) 144–153. doi:10.1016/j.neunet.2009.07.019.

[8] H. Zhang, Y. Hou, J. Zhao, L. Wang, T. Xi, Y. Li, Automatic welding quality classification for the spot welding based on the Hopfield associative memory neural network and Chernoff face description of the electrode displacement signal features, Mechanical Systems and Signal Processing 85 (2017) 1035–1043. doi:10.1016/j.ymssp.2016.06.036.

[9] G. Serpen, Hopfield Network as Static Optimizer: Learning the Weights and Eliminating the Guesswork, Neural Processing Letters 27 (2008) 1–15. doi:10.1007/s11063-007-9055-8.

[10] C. Li, X. Yu, T. Huang, G. Chen, X. He, A generalized Hopfield network for nonsmooth constrained convex optimization: Lie derivative approach, IEEE Transactions on Neural Networks and Learning Systems 27 (2015) 1–14. doi:10.1109/TNNLS.2015.2496658.

[11] R. J. McEliece, E. C. Posner, E. R. Rodemich, S. Venkatesh, The capacity of the Hopfield associative memory, IEEE Transactions on Information Theory 1 (1987) 33–45.

[12] L. Personnaz, I. Guyon, G. Dreyfus, Information storage and retrieval in spin glass like neural networks, Journal of Physics Letters 46 (1985) L359–L365.

[13] I. Kanter, H. Sompolinsky, Associative Recall of Memory without Errors, Physical Review 35 (1987) 380–392.

[14] T. Chiueh, R. Goodman, Recurrent Correlation Associative Memories, IEEE Transactions on Neural Networks 2 (1991) 275–284.

[15] T. Chiueh, R. Goodman, Recurrent Correlation Associative Memories and their VLSI Implementation, in: M. H. Hassoun (Ed.), Associative Neural Memories: Theory and Implementation, Oxford University Press, 1993, pp. 276–287.

[16] C. García, J. A. Moreno, The Hopfield Associative Memory Network: Improving Performance with the Kernel "Trick", in: Lecture Notes in Artificial Intelligence - Proceedings of IBERAMIA 2004, volume 3315 of Advances in Artificial Intelligence – IBERAMIA 2004, Springer-Verlag, 2004, pp. 871–880.

[17] C. García, J. A. Moreno, The Kernel Hopfield Memory Network, in: P. M. A. Sloot, B. Chopard, A. G. Hoekstra (Eds.), Cellular Automata, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004, pp. 755–764.
[19] A. Hirose, Complex-Valued Neural Networks, Studies in Computational Intelligence, 2nd ed., Springer, Heidelberg, Germany, 2012.

[20] A. J. Noest, Associative memory in sparse phasor neural networks, EPL (Europhysics Letters) 6 (1988) 469.

[21] A. J. Noest, Discrete-state phasor neural networks, Physical Review A 38 (1988) 2196–2199. doi:10.1103/PhysRevA.38.2196.

[22] N. N. Aizenberg, I. N. Aizenberg, CNN based on multi-valued neuron as a model of associative memory for gray-scale images, in: Proceedings of the 2nd International Workshop on Cellular Neural Networks and Their Applications, 1992, pp. 36–42.

[23] S. Jankowski, A. Lozowski, J. Zurada, Complex-Valued Multi-State Neural Associative Memory, IEEE Transactions on Neural Networks 7 (1996) 1491–1496.

[24] D.-L. Lee, Improvements of complex-valued Hopfield associative memory by using generalized projection rules, IEEE Transactions on Neural Networks 17 (2006) 1341–1347.

[25] T. Isokawa, H. Nishimura, A. Saitoh, N. Kamiura, N. Matsui, On the Scheme of Multistate Quaternionic Hopfield Neural Network, in: Proceedings of the Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on Advanced Intelligent Systems (SCIS and ISIS 2008), Nagoya, Japan, 2008, pp. 809–813.

[26] T. Isokawa, H. Nishimura, N. Matsui, Quaternionic Neural Networks for Associative Memories, in: A. Hirose (Ed.), Complex-Valued Neural Networks, Wiley-IEEE Press, 2013, pp. 103–131. doi:10.1002/9781118590072.ch5.

[27] M. E. Valle, F. Z. Castro, Theoretical and computational aspects of quaternionic multivalued Hopfield neural networks, in: 2016 International Joint Conference on Neural Networks (IJCNN), 2016, pp. 4418–4425. doi:10.1109/IJCNN.2016.7727777.

[28] M. E. Valle, F. Z. Castro, On the Dynamics of Hopfield Neural Networks on Unit Quaternions, IEEE Transactions on Neural Networks and Learning Systems 29 (2018) 2464–2471. doi:10.1109/TNNLS.2017.2691462.

[29] M. E. Valle, A novel continuous-valued quaternionic Hopfield neural network, in: 2014 Brazilian Conference on Intelligent Systems, 2014, pp. 97–102. doi:10.1109/BRACIS.2014.28.

[30] M. Kobayashi, Rotational invariance of quaternionic Hopfield neural networks, IEEJ Transactions on Electrical and Electronic Engineering 11 (2016) 516–520. doi:10.1002/tee.22269.

[31] M. Valle, Complex-Valued Recurrent Correlation Neural Networks, IEEE Transactions on Neural Networks and Learning Systems 25 (2014) 1600–1612. doi:10.1109/TNNLS.2014.2341013.

[32] M. E. Valle, Quaternionic Recurrent Correlation Neural Networks, in: 2018 International Joint Conference on Neural Networks (IJCNN), 2018, pp. 1–8. doi:10.1109/IJCNN.2018.8489714.

[33] M. E. Valle, R. A. Lobo, An introduction to quaternion-valued recurrent projection neural networks, in: 2019 8th Brazilian Conference on Intelligent Systems (BRACIS), 2019, pp. 848–853. doi:10.1109/BRACIS.2019.00151.

[34] A. Krizhevsky, Learning multiple layers of features from tiny images, Technical Report, University of Toronto, 2009. URL: http://www.cs.toronto.edu/~kriz/cifar.html.

[35] I. Aizenberg, C. Moraga, Multilayer feedforward neural network based on multi-valued neurons (MLMVN) and a backpropagation learning algorithm, Soft Computing 11 (2007) 169–183. doi:10.1007/s00500-006-0075-5.

[36] T. Isokawa, H. Nishimura, N. Kamiura, N. Matsui, Fundamental Properties of Quaternionic Hopfield Neural Network, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vancouver, Canada, 2006, pp. 218–223. doi:10.1109/IJCNN.2006.246683.
[37] T. Isokawa, H. Nishimura, N. Kamiura, N. Matsui, Dynamics of Discrete-Time Quaternionic Hopfield Neural Networks, in: J. M. Sá, L. A. Alexandre, W. Duch, D. Mandic (Eds.), Artificial Neural Networks – ICANN 2007, volume 4668 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2007, pp. 848–857. doi:10.1007/978-3-540-74690-4_86.

[38] T. Isokawa, H. Nishimura, N. Kamiura, N. Matsui, Associative Memory in Quaternionic Hopfield Neural Network, International Journal of Neural Systems 18 (2008) 135–145. doi:10.1142/S0129065708001440.

[39] T. Isokawa, H. Nishimura, N. Matsui, On the fundamental properties of fully quaternionic Hopfield network, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN), Brisbane, Australia, 2012, pp. 1–4. doi:10.1109/IJCNN.2012.6252536.

[40] Y. Osana, Chaotic Quaternionic Associative Memory, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN), Brisbane, Australia, 2012, pp. 1–8. doi:10.1109/IJCNN.2012.6252775.

[41] M. Kobayashi, Quaternionic Hopfield neural networks with twin-multistate activation function, Neurocomputing 267 (2017) 304–310. doi:10.1016/j.neucom.2017.06.013.

[42] M. Kobayashi, Gradient descent learning for quaternionic Hopfield neural networks, Neurocomputing 260 (2017) 174–179. doi:10.1016/j.neucom.2017.04.025.

[43] M. H. Hassoun, A. M. Youssef, A New Recording Algorithm for Hopfield Model Associative Memories, in: Neural Network Models for Optical Computing, volume 882 of Proceedings of SPIE, 1988, pp. 62–70.

[44] D. Krotov, J. J. Hopfield, Dense associative memory for pattern recognition, in: Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, Curran Associates Inc., Red Hook, NY, USA, 2016, pp. 1180–1188.

[45] M. Demircigil, J. Heusel, M. Löwe, S. Upgang, F. Vermet, On a Model of Associative Memory with Huge Storage Capacity, Journal of Statistical Physics 168 (2017) 288–299. doi:10.1007/s10955-017-1806-y.

[46] B. Schölkopf, A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, USA, 2002. URL: http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=8684.

[47] K. Plataniotis, D. Androutsos, A. Venetsanopoulos, Adaptive Fuzzy Systems for Multichannel Signal Processing, Proceedings of the IEEE 87 (1999) 1601–1622. doi:10.1109/5.784243.

[48] F. Z. Castro, M. E. Valle, Continuous-Valued Quaternionic Hopfield Neural Network for Image Retrieval: A Color Space Study, in: 2017 Brazilian Conference on Intelligent Systems (BRACIS), 2017, pp. 186–191. doi:10.1109/BRACIS.2017.52.

[49] T. Bülow, Hypercomplex Spectral Signal Representations for Image Processing and Analysis, Technical Report NR 9903, Kiel University, 1999. Available at: http://www.informatik.uni-kiel.de/en/department-of-computer-science/technical-reports.

[50] F. Z. de Castro, M. E. Valle, A broad class of discrete-time hypercomplex-valued Hopfield neural networks, Neural Networks 122 (2020) 54–67. doi:10.1016/j.neunet.2019.09.040.