Quantum-Like Uncertain Conditionals for Text Analysis

📝 Abstract

Simple representations of documents based on the occurrences of terms are ubiquitous in areas like Information Retrieval, and also frequent in Natural Language Processing. In this work we propose a logical-probabilistic approach to the analysis of natural language text based on the concept of Uncertain Conditional, on top of a formulation of lexical measurements inspired by the theoretical concept of ideal quantum measurements. The proposed concept can be used for generating topic-specific representations of text, aiming to match in a simple way the perception of a user with a pre-established idea of what the usage of terms in the text should be. A simple example is developed with two versions of a text in two languages, showing how regularities in the use of terms are detected and easily represented.

📄 Content

QUANTUM-LIKE UNCERTAIN CONDITIONALS FOR TEXT ANALYSIS

Alvaro Francisco Huertas-Rosero¹ and C. J. van Rijsbergen²
{alvaro, keith}@dcs.gla.ac.uk
¹ University of Glasgow
² University of Cambridge

Abstract. Simple representations of documents based on the occurrences of terms are ubiquitous in areas like Information Retrieval, and also frequent in Natural Language Processing. In this work we propose a logical-probabilistic approach to the analysis of natural language text based on the concept of Uncertain Conditional, on top of a formulation of lexical measurements inspired by the theoretical concept of ideal quantum measurements. The proposed concept can be used for generating topic-specific representations of text, aiming to match in a simple way the perception of a user with a pre-established idea of what the usage of terms in the text should be. A simple example is developed with two versions of a text in two languages, showing how regularities in the use of terms are detected and easily represented.

1 Introduction

How do prior expectations and knowledge affect the way a user approaches a text, and how do they drive the user’s attention from one part of it to another? This is a very important but tremendously complex question; it is indeed as complex as human perception of text can be. Including such effects in the representation of text may be a relatively easy way to enhance the power of a text retrieval or processing system. In this work we will not address the question, but assume a simple answer to it, and follow it while building theoretical concepts that can be used to represent natural language text for retrieval or similar processing tasks. The key concept to be defined is an Uncertain Conditional between lexical measurements, which will allow us to exploit structures and features from both Boolean and Quantum logics to include certain features in a text representation.
Automatic procedures for acquiring information about term usage in natural language text can be viewed as lexical measurements, and can be put as statements such as [term t appears in the text]³, to which it is possible to assign true/false values. These can be regarded as a set of propositions. Some relations between propositions have the properties of an order relation ⊑: for example, when one is a particular case of the other, e.g. P1 = [term “research” appears in this text] and P2 = [term “research” appears in this text twice], we can say that P2 ⊑ P1, or that P2 is below P1 according to this ordering.

³ In this paper we will use the convention that anything between square brackets [ and ] is a proposition.

arXiv:1106.0411v1 [cs.CL] 2 Jun 2011

The set of propositions ordered by the relation ⊑ can be called a lattice when two conditions are fulfilled [2]: 1) a proposition exists that is above all the others (supremum), and 2) a proposition exists that is below all the others (infimum). When any pair of elements of a set has an order relation, the set is said to be totally ordered, as is the case with sets of integer, rational or real numbers and the usual order “greater than or equal to / less than or equal to”, ⩾ / ⩽. If there are pairs that are not ordered, the set is partially ordered.

Two operations can be defined in a lattice: the meet [A ∧ B] is the highest element that is below both A and B, and the join [A ∨ B] is the lowest element that is above both A and B. In this work, we consider only lattices where both the join and the meet exist and are unique. These operations are sometimes also called conjunction and disjunction, but we will avoid these denominations, which are associated with more subtle considerations elsewhere [5].

In terms of ordering alone, another concept can be defined: the complement. When referring to propositions, this can also be called negation.
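The ordering and the two lattice operations described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a proposition can be modelled as a dictionary mapping each term to a minimum number of occurrences (so "appears twice" is read as "appears at least twice"), and the names `below`, `meet` and `join` are hypothetical. It follows the standard lattice-theory naming, in which the meet A ∧ B is the highest element below both A and B and the join A ∨ B is the lowest element above both, which is the naming that equation (1) relies on.

```python
# Assumption: a proposition is a dict {term: minimum number of occurrences}.
# The empty dict {} imposes no requirement, so it sits above every other
# proposition and plays the role of the supremum in this toy model.

def below(p, q):
    """Return True when p ⊑ q, i.e. p is a particular case of q."""
    return all(p.get(term, 0) >= k for term, k in q.items())

def meet(p, q):
    """Highest proposition below both p and q: the stricter threshold per term."""
    terms = set(p) | set(q)
    return {t: max(p.get(t, 0), q.get(t, 0)) for t in terms}

def join(p, q):
    """Lowest proposition above both p and q: the looser threshold per term."""
    terms = set(p) | set(q)
    loosest = {t: min(p.get(t, 0), q.get(t, 0)) for t in terms}
    # A threshold of 0 is no requirement at all, so drop it.
    return {t: k for t, k in loosest.items() if k > 0}

# P1 = [term "research" appears in this text]
# P2 = [term "research" appears in this text twice] (read as "at least twice")
P1 = {"research": 1}
P2 = {"research": 2}
print(below(P2, P1))  # P2 ⊑ P1 holds
print(below(P1, P2))  # the converse does not

# Two propositions about different terms are not ordered (partial order).
A = {"research": 1}
B = {"text": 1}
print(below(A, B), below(B, A))
print(meet(A, B))
print(join(A, B))
```

In this model the pair P1, P2 is ordered while A and B are not, so the set is only partially ordered; the meet of A and B requires both terms, and their join is the requirement-free proposition at the top.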
For a given proposition P, the complement is a proposition ¬P such that their join is the supremum sup and their meet is the infimum inf:

$$[P \wedge \neg P = \mathrm{inf}] \wedge [P \vee \neg P = \mathrm{sup}] \tag{1}$$

Correspondences between two ordered sets where orderings are not altered are called valuations. A very useful valuation is the one that assigns “false” or “true” to every element of a lattice of propositions, where {“false”, “true”} is made an ordered set by stating [“false” ⊑ “true”]. With the example above, it can be checked that any sensible assignment of truth values to a set of propositions ordered by ⊑ preserves the order. Formally, a valuation V can be defined as

$$V : \{P_i\} \to \{Q_i\}, \quad \text{such that} \quad (P_i \sqsubseteq_P P_j) \Rightarrow (V(P_i) \sqsubseteq_Q V(P_j)) \tag{2}$$

where ⊑_P is an order relation defined in {P_i} and ⊑_Q is an order relation defined in {Q_i}. The symbol ⇒ represents material implication: [X ⇒ Y] is true unless X is true and Y is false.

Another very important and useful kind of valuation is the probability measure, which assigns a real number between 0 and 1 to every proposition.

Valuations allow for a different way of defining the negation or complement: for a proposition P, the complement ¬P is such that in any valuation V, when P is mapped to one extreme of the lattice (supremum sup or infimum inf) then
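The order-preserving condition in equation (2) can be checked mechanically on a small set of propositions. The following Python sketch is illustrative only: it assumes propositions are modelled as dictionaries of minimum term counts, and the helper names (`below`, `holds`, `is_valuation`), the sample text and the toy corpus are all hypothetical. It tests a truth valuation, a relative-frequency valuation standing in for a probability measure, and a deliberately wrong assignment that reverses the order.

```python
from collections import Counter
from itertools import product

# Assumption: a proposition is a dict {term: minimum number of occurrences}.

def below(p, q):
    """Return True when p ⊑ q, i.e. p is a particular case of q."""
    return all(p.get(term, 0) >= k for term, k in q.items())

def holds(prop, text):
    """Lexical measurement: does the text meet every threshold in prop?"""
    counts = Counter(text.lower().split())
    return all(counts[term] >= k for term, k in prop.items())

def is_valuation(props, value):
    """Check equation (2): p ⊑ q must imply value(p) <= value(q)."""
    return all(
        value(p) <= value(q)
        for p, q in product(props, repeat=2)
        if below(p, q)
    )

props = [
    {},                            # no requirement: the top element
    {"research": 1},
    {"research": 2},
    {"research": 3},
    {"text": 1},
    {"research": 1, "text": 1},
]

# Hypothetical sample text and corpus, invented for illustration.
text = "research on text and more research"
corpus = [text, "a text about cats", "pure research", "nothing relevant here"]

def truth(p):
    # Python orders False <= True, matching ["false" ⊑ "true"].
    return holds(p, text)

def frequency(p):
    # Fraction of corpus texts where p holds: a number between 0 and 1.
    return sum(holds(p, t) for t in corpus) / len(corpus)

def reversed_truth(p):
    # Deliberately not order-preserving.
    return not holds(p, text)

print(is_valuation(props, truth))
print(is_valuation(props, frequency))
print(is_valuation(props, reversed_truth))
```

The first two assignments preserve the order because any text that satisfies a proposition also satisfies every proposition above it; the reversed assignment fails as soon as a false proposition sits below a true one.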

This content is AI-processed based on ArXiv data.
