Experimenting with Transitive Verbs in a DisCoCat

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Formal and distributional semantic models offer complementary benefits in modeling meaning. The categorical compositional distributional (DisCoCat) model of meaning of Coecke et al. (arXiv:1003.4394v1 [cs.CL]) combines aspects of both to provide a general framework in which meanings of words, obtained distributionally, are composed using methods from the logical setting to form sentence meaning. Concrete consequences of this general abstract setting and applications to empirical data are under active study (Grefenstette et al., arXiv:1101.0309; Grefenstette and Sadrzadeh, arXiv:1106.4058v1 [cs.CL]). In this paper, we extend this study by examining transitive verbs, represented as matrices in a DisCoCat. We discuss three ways of constructing such matrices, and evaluate each method on a disambiguation task developed by Grefenstette and Sadrzadeh (arXiv:1106.4058v1 [cs.CL]).


💡 Research Summary

The paper investigates how to represent transitive verbs as matrices within the categorical compositional distributional (DisCoCat) framework and evaluates three concrete encoding strategies on a semantic disambiguation task. DisCoCat combines distributional word vectors with categorical grammar, interpreting a sentence’s meaning as the application of its grammatical type to the tensor product of its word vectors. In this setting, a transitive verb, which relates a subject and an object, is modeled as an r × r matrix in the space N ⊗ N, where N is the r‑dimensional noun vector space.

Previously, Grefenstette and Sadrzadeh (2011) introduced an “indirect” method: for each verb, all subject–object pairs observed with it in a corpus are collected, the Kronecker product of each pair’s noun vectors is computed, and the resulting matrices are summed to obtain the verb matrix. This approach directly reflects co‑occurrence frequencies, but it requires large corpora and can suffer from data sparsity.
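A minimal NumPy sketch of this indirect construction. The two‑dimensional noun vectors below are toy stand‑ins for corpus data; in practice the pairs would come from a parsed corpus:

```python
import numpy as np

def indirect_verb_matrix(subject_object_pairs):
    """Sum the Kronecker (outer) products of the noun vectors of the
    observed (subject, object) pairs to obtain the verb matrix."""
    pairs = list(subject_object_pairs)
    r = len(pairs[0][0])          # dimensionality of the noun space N
    verb = np.zeros((r, r))
    for subj, obj in pairs:
        # for vectors, the Kronecker product reshapes to the outer product
        verb += np.outer(subj, obj)
    return verb

# toy "corpus observations" for a single verb
pairs = [(np.array([1.0, 0.0]), np.array([0.0, 1.0])),
         (np.array([0.5, 0.5]), np.array([1.0, 0.0]))]
M = indirect_verb_matrix(pairs)   # a 2 x 2 matrix in N ⊗ N
```

Each observed pair contributes one rank‑one matrix, so unseen subject–object dimension combinations stay at zero, which is the sparsity issue noted above.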

The authors propose three alternative ways to turn a verb’s r‑dimensional distributional vector v into an r × r matrix:

  1. 0‑diag – place the vector on the diagonal and fill all off‑diagonal entries with zeros. This discards any cross‑information between different dimensions of the subject and object.

  2. 1‑diag – same diagonal placement but fill off‑diagonal entries with ones, thereby providing a uniform minimal interaction across dimensions.

  3. v ⊗ v – compute the Kronecker product of the verb vector with itself, yielding a full matrix where each entry is the product of two components of v. This encodes the verb’s information across all possible dimension pairs.
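The three encodings can be sketched directly in NumPy, assuming a toy two‑dimensional verb vector:

```python
import numpy as np

def zero_diag(v):
    """0-diag: verb vector on the diagonal, zeros elsewhere."""
    return np.diag(v)

def one_diag(v):
    """1-diag: verb vector on the diagonal, ones elsewhere."""
    m = np.ones((len(v), len(v)))
    np.fill_diagonal(m, v)
    return m

def kron_square(v):
    """v ⊗ v: Kronecker product of the verb vector with itself."""
    return np.outer(v, v)

v = np.array([2.0, 3.0])          # toy verb vector
A, B, C = zero_diag(v), one_diag(v), kron_square(v)
```

For `v = [2, 3]` these give `[[2,0],[0,3]]`, `[[2,1],[1,3]]`, and `[[4,6],[6,9]]` respectively, making concrete how the three methods differ only in their off‑diagonal (cross‑dimensional) entries.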

For any encoding, the meaning of a transitive sentence is computed as
  verb ⊙ (sub ⊗ obj)
where verb is the verb matrix, ⊙ denotes component‑wise (Hadamard) multiplication, and sub ⊗ obj is the Kronecker product of the subject and object vectors. The resulting matrix is compared to another sentence’s matrix using the Frobenius inner product, normalized to yield a cosine‑like similarity score.
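A minimal sketch of this composition and comparison step, using toy vectors and a toy verb matrix (any of the encodings above could supply the matrix):

```python
import numpy as np

def sentence_matrix(verb, subj, obj):
    """Sentence meaning: component-wise product of the verb matrix
    with the Kronecker (outer) product of subject and object vectors."""
    return verb * np.outer(subj, obj)

def frobenius_similarity(a, b):
    """Frobenius inner product of two matrices, normalized by their
    Frobenius norms to give a cosine-like similarity score."""
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

subj = np.array([1.0, 2.0])                 # toy subject vector
obj = np.array([0.5, 1.0])                  # toy object vector
verb = np.ones((2, 2))                      # toy verb matrix
s1 = sentence_matrix(verb, subj, obj)
s2 = sentence_matrix(verb, subj, obj)
sim = frobenius_similarity(s1, s2)          # identical sentences score 1.0
```

In the disambiguation task, one would compare the matrix of an ambiguous sentence against those of candidate paraphrases and pick the highest‑scoring sense.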

