The structure of evolved representations across different substrates for artificial intelligence
Artificial neural networks (ANNs), while exceptionally useful for classification, are vulnerable to misdirection. Small amounts of noise can significantly affect their ability to correctly complete a task. Instead of generalizing concepts, ANNs seem to focus on surface statistical regularities in a given task. Here we compare how recurrent artificial neural networks, long short-term memory units, and Markov Brains sense and remember their environments. We show that information in Markov Brains is localized and sparsely distributed, while the other neural network substrates “smear” information about the environment across all nodes, which makes them vulnerable to noise.
💡 Research Summary
The paper investigates how different computational substrates encode internal representations of the world, using an information‑theoretic framework that measures the conditional mutual information R = H(W : B | S) between world states (W) and internal brain states (B) given sensor inputs (S). The authors evolve three types of agents—standard recurrent neural networks (RNNs), long short‑term memory networks (LSTMs), and Markov Brains (MBs)—to solve the same dynamic task, the Active Categorical Perception (ACP) task. In ACP, a 16 × 32 grid world drops blocks of two possible sizes (small or large) that move diagonally left or right. An agent, equipped with four upward‑facing sensors and two lateral actuators, must maneuver to catch small blocks and avoid large ones, requiring it to infer block size, direction, and relative location from temporally sparse sensory data.
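The measure R can be estimated directly from recorded time series of discretized states. The sketch below is our own minimal implementation (not the authors' code): it computes the conditional mutual information I(W; B | S) in bits from co-occurrence counts over three aligned sequences of discrete world, brain, and sensor states.

```python
# Minimal sketch of the representation measure R = H(W : B | S),
# i.e. the conditional mutual information I(W; B | S), estimated
# from co-occurrence counts of discretized states recorded over time.
from collections import Counter
from math import log2

def conditional_mutual_information(ws, bs, ss):
    """I(W; B | S) in bits from three aligned sequences of discrete states."""
    n = len(ws)
    p_wbs = Counter(zip(ws, bs, ss))
    p_ws = Counter(zip(ws, ss))
    p_bs = Counter(zip(bs, ss))
    p_s = Counter(ss)
    r = 0.0
    for (w, b, s), c in p_wbs.items():
        p_xyz = c / n
        # I(W;B|S) = sum p(w,b,s) * log2[ p(s) p(w,b,s) / (p(w,s) p(b,s)) ]
        r += p_xyz * log2((p_s[s] / n) * p_xyz / ((p_ws[(w, s)] / n) * (p_bs[(b, s)] / n)))
    return r

# Toy check: if the brain state copies the world state and the sensors are
# independent of it, R equals the full world entropy H(W) = 1 bit.
ws = [0, 1, 0, 1, 0, 1, 0, 1]
bs = ws[:]                      # brain mirrors the world
ss = [0, 0, 1, 1, 0, 0, 1, 1]  # sensors carry no information about W here
print(round(conditional_mutual_information(ws, bs, ss), 3))  # → 1.0
```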
All agents have ten hidden/internal nodes, and evolution is performed with the MABE framework using a genome that encodes either connection weights (RNN, LSTM) or deterministic logic gates (MB). Mutation, deletion, and duplication operators are applied identically across substrates, ensuring a fair comparison. After evolution, agents of all three substrates achieve high fitness on the ACP task, so subsequent differences in representation cannot be attributed to differences in task competence.
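The substrate-agnostic genome manipulation can be illustrated with a generic, hedged sketch (this is not the MABE implementation, and the mutation rates are illustrative assumptions): a genome is a flat list of numbers, and the same point-mutation, deletion, and duplication operators apply regardless of whether the genome later builds an RNN, an LSTM, or a Markov Brain.

```python
# Generic sketch of substrate-agnostic genome mutation (rates are
# illustrative assumptions, not the paper's parameters).
import random

def mutate(genome, p_point=0.005, p_del=0.0002, p_dup=0.0002, rng=random):
    g = list(genome)
    if rng.random() < p_dup and g:            # duplicate a random segment
        i, j = sorted(rng.sample(range(len(g) + 1), 2))
        g[i:i] = g[i:j]
    if rng.random() < p_del and len(g) > 1:   # delete a random segment
        i, j = sorted(rng.sample(range(len(g) + 1), 2))
        del g[i:j]
    # point-mutate each remaining site independently
    return [rng.randrange(256) if rng.random() < p_point else x for x in g]

random.seed(1)
child = mutate(list(range(16)))
print(len(child))
```

The downstream interpreter (weight table vs. logic-gate table) is what differs per substrate; the variation operators themselves stay fixed.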
To quantify representations, the authors record sensor, brain, and world variables over time and compute a representation matrix M, where entry M_{c,i} = H(W_c : B_i | S) measures how much information node B_i carries about world concept W_c beyond what the sensors already provide. Two “smearedness” metrics are defined: (1) node‑smearedness S_N, the sum over nodes of the pairwise minima of concept information within a node, indicating how many concepts a single node encodes; (2) concept‑smearedness S_C, the sum over concepts of the pairwise minima across nodes, indicating how widely a single concept is distributed. Low values correspond to sparse, localized representations; high values indicate that information is spread globally.
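The two scores can be read off the matrix directly. The sketch below assumes M is given as a list of rows, one per world concept, with one column per brain node (variable names are ours, not the paper's code):

```python
# Smearedness scores over a representation matrix M, where
# M[c][i] = H(W_c : B_i | S) for concept c and brain node i.
from itertools import combinations

def node_smearedness(M):
    """S_N: for each node, sum pairwise minima over concepts.
    High values mean single nodes encode several concepts at once."""
    n_concepts, n_nodes = len(M), len(M[0])
    return sum(
        min(M[c1][i], M[c2][i])
        for i in range(n_nodes)
        for c1, c2 in combinations(range(n_concepts), 2)
    )

def concept_smearedness(M):
    """S_C: for each concept, sum pairwise minima over nodes.
    High values mean a single concept is spread across many nodes."""
    n_concepts, n_nodes = len(M), len(M[0])
    return sum(
        min(M[c][i], M[c][j])
        for c in range(n_concepts)
        for i, j in combinations(range(n_nodes), 2)
    )

# Localized: each concept lives in one dedicated node -> both scores are 0.
local = [[1.0, 0.0], [0.0, 1.0]]
# Smeared: every node carries part of every concept -> both scores > 0.
smeared = [[0.5, 0.5], [0.5, 0.5]]
print(node_smearedness(local), concept_smearedness(local))      # → 0.0 0.0
print(node_smearedness(smeared), concept_smearedness(smeared))  # → 1.0 1.0
```

The two toy matrices show the extremes the paper contrasts: a diagonal (localized) matrix scores zero on both metrics, while a uniform (smeared) matrix scores high on both.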
Results show that MBs concentrate most of the relevant information in a few nodes, yielding very low S_N and S_C. In contrast, both RNNs and LSTMs distribute information across all ten hidden units, with LSTMs exhibiting the highest smearedness due to their continuous-valued states and complex gating.
Robustness to sensory noise is then tested by flipping each sensor with probability p. MBs maintain performance even at relatively high noise levels, whereas RNNs and LSTMs degrade sharply once p exceeds ~0.1. The authors argue that globally smeared representations are more vulnerable because a small perturbation can corrupt the shared information across many nodes, whereas localized representations protect the core knowledge in a few dedicated units.
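The noise model described above is simple to state in code. The following is our own hedged sketch of it, not the experiment's implementation: each binary sensor reading is flipped independently with probability p before it reaches the agent's brain.

```python
# Sketch of the sensor-noise model: flip each binary sensor bit
# independently with probability p.
import random

def noisy_sensors(sensors, p, rng=random):
    """Return a copy of the sensor vector with each bit flipped w.p. p."""
    return [bit ^ (rng.random() < p) for bit in sensors]

random.seed(0)
clean = [1, 0, 0, 1]
print(noisy_sensors(clean, 0.0))  # p = 0: reading passes through unchanged
```

Sweeping p from 0 upward and re-measuring task fitness yields the degradation curves the authors compare across substrates.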
The study draws three main conclusions. First, evolutionary processes naturally produce sparse, localized representations, reminiscent of functional specialization in biological brains. Second, the prevalent deep‑learning paradigm that relies on globally distributed representations, while powerful, carries an inherent susceptibility to adversarial perturbations. Third, future AI design may benefit from hybrid approaches that combine the performance of deep networks with the robustness of sparsely encoded, evolution‑driven architectures. The paper thus contributes both a methodological framework for measuring internal representations and empirical evidence that the structure of those representations critically determines noise robustness.