Formal semantics of language and the Richard-Berry paradox

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The classical logical antinomy known as the Richard-Berry paradox is combined with plausible assumptions about the size, i.e. the descriptional complexity, of Turing machines formalizing certain sentences, to show that the formalization of natural language leads to contradiction.


Research Summary

The paper revisits the classic Richard‑Berry paradox – the definition of “the smallest natural number that cannot be defined in fewer than twenty words” – and recasts it in the language of algorithmic information theory. The author first sketches the historical development of formal semantics, from Frege and Russell through Tarski, Carnap and Montague, and notes its modern relevance to computational linguistics and AI.

The core of the argument is a two‑step construction. First, the author defines the formal complexity of an English text as the length (in bits) of the smallest Turing‑machine program that computes exactly that text. If a text admits several formalizations, the one with minimal program size is taken as the reference machine. This notion mirrors Kolmogorov‑Chaitin complexity.
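The minimal program size in this definition is essentially Kolmogorov‑Chaitin complexity and is not computable exactly, but computable upper bounds are easy to obtain. As an illustration (not part of the paper), a general‑purpose compressor bounds the formal complexity from above, since any text can be reconstructed from its compressed form plus a fixed decompressor:

```python
import zlib

def complexity_upper_bound_bits(text: str) -> int:
    """Computable upper bound on the formal complexity of `text`:
    the size in bits of its zlib-compressed form (the fixed
    decompressor contributes only an additive constant, ignored here)."""
    return 8 * len(zlib.compress(text.encode("utf-8"), level=9))

# A highly regular text needs far fewer bits than its raw length:
regular = "the " * 500  # 2000 characters, extremely compressible
print(complexity_upper_bound_bits(regular))
```

The true formal complexity can only be smaller than any such bound; no algorithm can compute it exactly, which is precisely what the paradoxical construction that follows exploits.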

Two plausible assumptions are then introduced:

  1. Unboundedness – for every integer n there exists an English sentence whose formal complexity exceeds n. This reflects the intuition that natural language can express arbitrarily intricate ideas.

  2. Logarithmic overhead – consider the family of sentences

    t(n) : “the first text whose formal complexity is not less than n.”

    The description of t(n) consists of a fixed part (independent of n) plus a representation of the integer n. Since an integer can be encoded in ⌈log₂ n⌉ bits, the formal complexity of t(n) exceeds that of t(20) by at most a quantity proportional to log n.
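To see why the logarithmic overhead matters, the description length of t(n) can be modeled as a fixed constant plus ⌈log₂ n⌉ bits. In the sketch below the constant C = 1000 bits is an arbitrary assumption chosen for illustration; the crossover point depends on it, but its existence does not:

```python
import math

C = 1000  # hypothetical fixed overhead, in bits, of the constant part of t(n)

def description_bits(n: int) -> int:
    # fixed part plus a binary encoding of the integer n
    return C + math.ceil(math.log2(n))

# The description length grows like log n, so it is eventually outrun
# by n itself: from this point on, t(n) describes a text of complexity
# >= n using fewer than n bits -- the paradoxical situation.
crossover = next(n for n in range(1, 10**6) if description_bits(n) < n)
print(crossover)  # → 1011
```

Whatever the fixed overhead is, a crossover of this kind always exists, which is what drives sub‑case (b) below.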

With these definitions, the author reproduces the paradoxical reasoning. Suppose a “first text” as described in assumption 2 exists. Two sub‑cases arise:

  • (a) If the formal complexity of t(20) is less than 20, then t(20) itself provides a definition of “the first text …”, contradicting the very statement that its complexity is at least 20.

  • (b) If the formal complexity of t(20) is k ≄ 20, then, because the complexity of t(n) grows only logarithmically in n, the logarithmic‑overhead assumption guarantees a sufficiently large K > k such that the formal complexity of t(K) is < K. Consequently, t(K) would be a definition of “the first text whose complexity is not less than K”, again violating its own definition.

If, on the other hand, no such “first text” exists, then every English sentence would have formal complexity below 20 (and likewise below any bound n), which directly contradicts the unboundedness assumption.
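The case analysis above is in essence the classical proof that minimal description length is uncomputable. If a computable formal‑complexity function existed (the very assumption the paradox refutes), the text t(n) could be produced by a short program, and that program plus the integer n would itself be a description of size O(log n) – eventually shorter than n. A sketch of this counterfactual construction; the `formal_complexity` oracle is hypothetical, and a toy stand‑in (plain string length) is used only to exercise the code:

```python
from itertools import count, product

def first_text_with_complexity_at_least(n, formal_complexity):
    """Enumerate binary strings in length-lexicographic order and return
    the first one whose complexity, per the supplied oracle, is >= n.
    This routine plus the integer n is a description of its own output,
    of size O(log n) -- eventually shorter than n, which is the paradox."""
    for length in count(0):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            if formal_complexity(s) >= n:
                return s

# With a toy stand-in oracle (string length), the search simply finds
# the first binary string of length 3:
print(first_text_with_complexity_at_least(3, len))  # → 000
```

No genuine `formal_complexity` can ever be supplied: if one existed, the function above would contradict the definition it implements, exactly as in cases (a) and (b).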

Thus, ordering English sentences by the size of their minimal Turing‑machine descriptions and then referring to “the first sentence whose description is at least n bits long” reproduces the self‑referential inconsistency of the Richard‑Berry paradox. The conclusion is that a complete computational formalization of natural language – i.e., a mapping that assigns to every sentence a unique, minimal Turing‑machine description – is logically impossible. The paper therefore highlights a fundamental limitation of formal semantics: the very act of trying to capture all of natural language within a fixed formal system inevitably leads to paradox.

