Notes on a New Philosophy of Empirical Science
This book presents a methodology and philosophy of empirical science based on large-scale lossless data compression. In this view, a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 GB, it is easy to justify a 10 MB model). The complex models and abstractions learned on the basis of the raw data (images, language, etc.) can then be reused to solve any specific learning problem, such as face recognition or machine translation.
💡 Research Summary
The manuscript “Notes on a New Philosophy of Empirical Science” proposes a radical re‑thinking of scientific methodology by grounding it in large‑scale lossless data compression. The central construct, the Compression Rate Method (CRM), treats any scientific theory as a program that compresses a benchmark data set. The total description length—model size plus compressed data size—serves as an objective measure of a theory’s explanatory power and parsimony, thereby unifying Occam’s razor and the demarcation problem in a single quantitative framework.
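The total-description-length criterion can be made concrete with a small sketch. The theory names and sizes below are hypothetical illustrations, not figures from the manuscript; the point is only that the score combines model length and compressed-data length into one number.

```python
# Sketch of the total-description-length criterion: a theory's score is the
# length of the compressor itself plus the length of the benchmark data after
# compression (smaller is better). All figures are hypothetical, in bits.

def total_description_length(model_bits: int, compressed_data_bits: int) -> int:
    """Model size plus compressed data size: the CRM score of a theory."""
    return model_bits + compressed_data_bits

# Two hypothetical theories of the same 1 GB (8e9-bit) benchmark:
simple_theory = total_description_length(model_bits=8_000,
                                         compressed_data_bits=6_400_000_000)
complex_theory = total_description_length(model_bits=80_000_000,
                                          compressed_data_bits=4_000_000_000)

# The complex theory wins: its extra model length is repaid many times
# over by the bits it saves on the data.
assert complex_theory < simple_theory
```

Note that the criterion penalizes gratuitous complexity automatically: a larger model is accepted only if it buys at least as many bits of compression as it costs to describe.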
The author begins by critiquing the current state of artificial intelligence, computer vision, and computational linguistics, arguing that these fields suffer from chronic evaluation problems, over‑reliance on subjective benchmarks, and the absence of the rapid cycle of theory generation and falsification that Platt called “Strong Inference”. To overcome these issues, the book replaces physical experiments with massive natural data collections (images, video, text) and uses compression performance as the empirical test.
Chapter 1 lays out the philosophical foundations: objectivity, irrationality, progress, and the role of meta‑theoretical commitments. It reformulates the classic scientific principles—falsifiability, theory choice, and the problem of induction—through the lens of description length. The “Sophie’s Method” thought experiments illustrate how subtle changes in these commitments lead to a compression‑centric version of the scientific method.
Chapter 2 connects CRM to machine learning. The standard supervised learning pipeline is recast as a compression problem: learning a model that minimizes the bits needed to encode the training data. The author argues that, given modern datasets of tens or hundreds of gigabytes, complex models (deep neural networks of several megabytes) are justified because the compression gain outweighs the model’s own description length. The chapter also discusses “manual overfitting” (researchers inadvertently tailoring models to specific benchmark quirks) and shows how CRM’s objective metric mitigates this bias. Indirect learning is presented as a two‑step process: first compress the raw data to obtain reusable representations, then apply those representations to downstream tasks such as face recognition or machine translation.
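The arithmetic behind the “complex models are justified” argument is worth spelling out. The sketch below uses the abstract's own 100 GB / 10 MB figures; the assumed compression rates (50% baseline, 40% with the complex model) are hypothetical numbers chosen only to make the trade-off visible.

```python
# Illustrative arithmetic for justifying a complex model under CRM.
# Sizes in bytes; the compression rates are assumed for illustration.

GB = 10**9
MB = 10**6

data_size = 100 * GB   # raw benchmark database (figure from the abstract)
model_size = 10 * MB   # a large learned model, e.g. a deep network

# Suppose the complex model improves the compression rate from 50% of the
# raw size to 40% -- a ten-percentage-point gain on the data term.
baseline_compressed = 0.50 * data_size
model_compressed = 0.40 * data_size

savings = baseline_compressed - model_compressed  # ~10 GB saved on the data
net_gain = savings - model_size                   # model cost barely registers

print(f"saved {savings / GB:.1f} GB of data for {model_size / MB:.0f} MB of model")
assert net_gain > 0  # the 10 MB model pays for itself roughly a thousand times over
```

At this scale the model's own description length is essentially noise, which is the chapter's point: the traditional bias toward tiny models is an artifact of tiny datasets.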
Chapter 3 applies the framework to computer vision. Edge detection, segmentation, stereo correspondence, and object detection are each interpreted as components of an image‑compression pipeline. Existing evaluation protocols (e.g., BSDS, PASCAL VOC) are criticized for lack of reproducibility and for encouraging over‑fitting to narrow metrics. By contrast, a compression‑based evaluation would directly measure how many bits a vision algorithm saves on a large natural‑image corpus, providing a single, comparable, and scalable performance number. The “Comperical” formulation (compression + empirical) is introduced as a new research agenda that aligns vision research with the goal of minimizing description length.
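A compression-based evaluation harness of the kind Chapter 3 envisions can be sketched in a few lines. Here `zlib` stands in for a real image-compression pipeline, and the toy byte-string “corpus” stands in for a natural-image database; a genuine evaluation would plug the vision algorithm under test into the `compress` slot. All names below are hypothetical.

```python
# Minimal sketch of compression-based evaluation: score an algorithm by the
# bits it saves on a corpus relative to raw storage. zlib is a stand-in for
# a real vision-based image compressor; the "corpus" is synthetic.
import zlib

def bits_saved(compress, corpus: list) -> int:
    """Bits saved relative to storing the corpus uncompressed."""
    raw_bits = sum(len(item) for item in corpus) * 8
    packed_bits = sum(len(compress(item)) for item in corpus) * 8
    return raw_bits - packed_bits

# Toy corpus: highly regular byte strings standing in for natural images.
corpus = [bytes([i % 7] * 10_000) for i in range(20)]

baseline = bits_saved(lambda b: zlib.compress(b, level=1), corpus)
stronger = bits_saved(lambda b: zlib.compress(b, level=9), corpus)

# A compressor that models the data's regularities better saves more bits,
# yielding a single comparable performance number per algorithm.
assert stronger >= baseline > 0
```

The appeal for vision research is that the score is computed the same way for every algorithm and every corpus, with no hand-labeled ground truth in the loop.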
Chapter 4 mirrors this treatment for language. Parsing, statistical machine translation, and language modeling are reframed as text‑compression tasks. The author demonstrates that traditional parsing metrics (e.g., PARSEVAL) and translation scores (BLEU) can be correlated with compression gains, suggesting that a better compressor is inherently a better linguistic model. The chapter also discusses the relationship between compression formats and linguistic theories, including a brief comparison with Chomskyan universal grammar, arguing that both can be viewed as meta‑theories that dictate how to encode linguistic data efficiently.
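The claim that “a better compressor is inherently a better linguistic model” rests on a standard information-theoretic fact: a model's ideal code length on a text is its cross-entropy, which an arithmetic coder can achieve up to rounding. The sketch below compares two deliberately weak stand-in models; a real language model would simply assign better probabilities and thus fewer bits.

```python
# Sketch of compression as language modeling: the bits a model needs per
# character equal its cross-entropy on the text (achievable, up to rounding,
# with an arithmetic coder). Both models here are toy stand-ins.
import math
from collections import Counter

def bits_per_char(text: str, probs: dict) -> float:
    """Ideal code length under the model: -log2 p(c), averaged over the text."""
    return sum(-math.log2(probs[c]) for c in text) / len(text)

text = "the theory that compresses the text best models the text best"

# Model A: uniform over the characters that occur (ignores frequency).
alphabet = set(text)
uniform = {c: 1 / len(alphabet) for c in alphabet}

# Model B: unigram frequencies estimated from the text itself.
counts = Counter(text)
unigram = {c: n / len(text) for c, n in counts.items()}

# The model that captures more linguistic regularity needs fewer bits
# per character -- i.e., it is the better compressor.
assert bits_per_char(text, unigram) < bits_per_char(text, uniform)
```

Extending the same logic from unigrams to syntax and semantics is exactly the research program the chapter proposes: each layer of linguistic structure a model captures shows up directly as a smaller compressed file.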
Chapter 5 argues that CRM constitutes a new scientific paradigm, comparable to the microprocessor or chess paradigms of the 20th century. It satisfies the classic criteria for a paradigm: conceptual clarity, methodological efficiency, scalable evaluation, and systematic progress. The “Casimir effect” analogy is used to illustrate how a paradigm can reshape the research landscape by providing a unifying metric that drives both theory generation and empirical testing.
Chapter 6 expands the discussion to meta‑theories and unification. Compression formats are treated as concrete instantiations of abstract scientific theories; meta‑formats (e.g., universal grammar, form‑of‑forms) enable the reuse of learned compressors across domains. The author sketches a “cartographic meta‑theory” that maps the structure of different data domains (vision, language, robotics) onto a common compression space, suggesting a path toward a unified empirical science of information.
The appendices provide background on information theory, the Hutter Prize, traditional compression techniques, and related work in unsupervised learning.
In sum, the manuscript offers a bold, mathematically grounded proposal: by measuring scientific theories through their ability to compress massive, real‑world data, we obtain an objective, quantitative yardstick that simultaneously enforces parsimony, enables rapid hypothesis testing, and resolves long‑standing evaluation dilemmas in AI‑related fields. If adopted, this compression‑centric methodology could catalyze systematic, reproducible progress across vision, language, and broader machine learning research.