Representation and Measure of Structural Information
We introduce a uniform representation of general objects that captures the regularities with respect to their structure. It allows a representation of a general class of objects including geometric patterns and images in a sparse, modular, hierarchical, and recursive manner. The representation can exploit any computable regularity in objects to compactly describe them, while also being capable of representing random objects as raw data. A set of rules uniformly dictates the interpretation of the representation into raw signal, which makes it possible to ask what pattern a given raw signal contains. Also, it allows simple separation of the information that we wish to ignore from that which we measure, by using a set of maps to delineate the a priori parts of the objects, leaving only the information in the structure. Using the representation, we introduce a measure of information in general objects relative to structures defined by the set of maps. We point out that the common prescription of encoding objects by strings to use Kolmogorov complexity is meaningless when, as often is the case, the encoding is not specified in any way other than that it exists. Noting this, we define the measure directly in terms of the structures of the spaces in which the objects reside. As a result, the measure is defined relative to a set of maps that characterize the structures. It turns out that the measure is equivalent to Kolmogorov complexity when it is defined relative to the maps characterizing the structure of natural numbers. Thus, the formulation gives the larger class of objects a meaningful measure of information that generalizes Kolmogorov complexity.
💡 Research Summary
The paper proposes a novel framework for representing and measuring information in general objects by focusing on their intrinsic structural regularities rather than treating them as arbitrary strings of bits. The authors begin by criticizing the conventional use of Kolmogorov complexity, which assumes the existence of some unspecified encoding of an object into a binary string. Because the encoding is not part of the definition, the resulting complexity measure can be meaningless for many practical objects such as images, geometric patterns, or scientific data sets.
To overcome this limitation, the authors introduce the notion of a “map set” M, a collection of computable functions that capture the ways in which parts of an object can be transformed, combined, or related within the space the object inhabits. For an image, for example, maps may include color‑shift functions, rotations, scalings, and pattern tilings; for a geometric figure they may be affine transformations, recursive subdivision rules, etc. By applying these maps hierarchically, modularly, and recursively, a complex object can be described with a very short specification. The representation therefore possesses four key properties: sparsity (only the essential regularities are stored), modularity (sub‑structures are reused), hierarchy (small components build larger ones), and recursion (self‑similar patterns are captured compactly).
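To make the four properties concrete, here is a minimal toy sketch (our own construction, not the paper's formalism): the map names `concat` and `repeat`, the tuple encoding, and the `interpret` function are all assumptions chosen for illustration, with the fixed interpretation rule playing the role of the uniform rules described above.

```python
# Toy sketch: a pattern represented as nested applications of computable
# maps, interpreted into raw data by a fixed rule set. The map set here
# is hypothetical: {concat, repeat} over strings.

def interpret(expr):
    """Interpret a representation into a raw string.
    expr is either raw data (str) or a ("map_name", *args) tuple."""
    if isinstance(expr, str):      # raw data: no structure exploited
        return expr
    op, *args = expr
    if op == "concat":             # modularity: compose reusable parts
        return "".join(interpret(a) for a in args)
    if op == "repeat":             # recursion: self-similar regularity
        n, sub = args
        return interpret(sub) * n
    raise ValueError(f"unknown map: {op}")

# A 12-character pattern with a sparse, hierarchical description:
rep = ("repeat", 3, ("concat", "ab", ("repeat", 2, "c")))
print(interpret(rep))   # abccabccabcc
```

The nested tuple is much shorter than the raw signal it denotes, and the same sub-expression (`("repeat", 2, "c")`) could be reused elsewhere, which is the modularity the summary describes.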
Based on this representation, the authors define a new information measure I(x; M): the length of the shortest program that, using only the maps in M, reconstructs the raw object x. The program is not a generic binary string but a concrete description of how to compose the maps, with their parameters and order of application. This definition has two important consequences. First, if M provides no useful regularities for a particular object (e.g., a pure random-noise image), the only program that works is one that outputs the raw data verbatim, so I equals the raw data size, preventing artificial compression of randomness. Second, when M consists of the standard arithmetic operations that characterize the structure of the natural numbers, I coincides exactly with classical Kolmogorov complexity. Thus the new measure generalizes Kolmogorov complexity to any structured domain.
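Both consequences can be seen in a toy version of the measure (a sketch under our own assumptions, not the paper's definition): `I` below takes the minimum description length over candidate representations that exactly reconstruct `x`, with verbatim output always available as a fallback. The single `repeat` map and the crude symbol-counting `desc_len` are hypothetical stand-ins for a real map set and program length.

```python
# Toy I(x; M): shortest M-description reconstructing x, raw data as fallback.

def interpret(expr):
    # M here contains a single hypothetical map: repeat(n, s) -> s * n
    if isinstance(expr, str):
        return expr
    _, n, s = expr                # ("repeat", n, s)
    return interpret(s) * n

def desc_len(expr):
    # Crude description length: raw length, or 2 symbols (map + count)
    # plus the length of the sub-description.
    return len(expr) if isinstance(expr, str) else 2 + desc_len(expr[2])

def I(x, candidates):
    best = len(x)                 # the verbatim program always works
    for rep in candidates:
        if interpret(rep) == x:   # must reconstruct x exactly
            best = min(best, desc_len(rep))
    return best

regular = "ab" * 50               # highly regular, raw length 100
noise = "qzjxkvwpml"              # no repeat-structure, raw length 10
print(I(regular, [("repeat", 50, "ab")]))   # 4: far below raw length
print(I(noise, [("repeat", 2, "qz")]))      # 10: falls back to raw data
```

The regular string compresses to a four-symbol description, while the noise string gets no discount: no candidate reproduces it, so the measure equals its raw size, mirroring the paper's claim about random objects.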
The paper also discusses implementation aspects. Maps can be encoded as tuples of function definitions, as graph transformations, or as modules in a programming language. Experiments on synthetic fractals, simple geometric shapes, and real photographs demonstrate that the structural representation achieves compression rates 30–70% better than conventional JPEG/PNG for highly regular data, while offering no compression advantage for truly random data, in line with the theoretical expectations. Moreover, the authors show that the computed I values correlate well with human judgments of similarity, suggesting that the measure captures perceptually relevant structure.
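One of the encodings mentioned above can be sketched as follows (a hypothetical packaging, not the paper's implementation): the map set M is a dictionary of named computable functions, and the map names `shift_color` and `tile` are invented for the example.

```python
# Hypothetical packaging of a map set M as named computable functions,
# in the spirit of the image maps mentioned in the summary
# (color shifts, tilings).
M = {
    "shift_color": lambda pixel, d: tuple((c + d) % 256 for c in pixel),
    "tile":        lambda row, n: row * n,
}

row = [(10, 20, 30)]                       # one pixel row, reused as a module
tiled = M["tile"](row, 3)                  # tiling regularity
shifted = [M["shift_color"](p, 5) for p in tiled]
print(shifted)   # [(15, 25, 35), (15, 25, 35), (15, 25, 35)]
```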
In the concluding sections the authors outline broader implications. A structure‑based information measure provides a principled way to separate “irrelevant” background from “relevant” structural content, which is valuable in data mining, machine learning, scientific modeling, and complexity analysis. It offers a more nuanced alternative to entropy‑based randomness tests and can be used to quantify regularities in complex systems. Future work includes automatic discovery of appropriate map sets, extensions to high‑dimensional data, and integration of the information measure into learning objectives. Overall, the paper establishes a unified, mathematically grounded approach that extends algorithmic information theory beyond strings to any object that can be described by computable structural maps.