Textual Fingerprinting with Texts from Parkin, Bassewitz, and Leander

Reading time: 6 minutes

📝 Original Info

  • Title: Textual Fingerprinting with Texts from Parkin, Bassewitz, and Leander
  • ArXiv ID: 0802.2234
  • Date: 2008-02-18
  • Authors: Christoph Schommer, Conny Uhde

📝 Abstract

Current research in author profiling to discover a legal author's fingerprint no longer relies on statistical parameters alone but increasingly includes dynamic methods that can learn and adapt to the specific behavior of an author. However, the question of how to represent a text appropriately remains one of the fundamental tasks, and the problem of which attributes should be used to fingerprint an author's style is still not precisely defined. In this work, we focus on the linguistic selection of attributes to fingerprint the style of the authors Parkin, Bassewitz, and Leander. We use texts of the fairy tale genre because it has a clear style and its texts are short, with a straightforward storyline and simple language.

📄 Full Content

Textual Fingerprinting with Texts from Parkin, Bassewitz, and Leander

Christoph Schommer
University of Luxembourg, Dept. of Computer Science - ILIAS Laboratory
6, Rue Richard Coudenhove-Kalergi, 1359 Luxembourg, Luxembourg
Email: christoph.schommer@uni.lu

Conny Uhde
JW Goethe-University Frankfurt am Main, Dept. of Computer Science and Mathematics
Robert-Mayer-Str. 11-15, D-60486 Frankfurt am Main, Germany
Email: uhde@cs.uni-frankfurt.de

November 14, 2021
arXiv:0802.2234v1 [cs.CL] 15 Feb 2008

1 What is it about?

Forensic linguistics¹ is concerned with a verification process for the decryption of texts and their analysis through pattern discovery. In this respect, verification means using existing, well-known stylistic attributes to discover an individual (linguistic) fingerprint. However, whether such a linguistic fingerprint is a clear indication per se remains controversial: stylistic tests assume that typical attributes can be directly influenced by the author and that a certain number of attributes nevertheless remain constant, even when the author consciously changes his or her behavior or style [7].

¹ This work has been supported by the University of Luxembourg within the project TRIAS - Logic of Trust and Reliability for Information Agents in Science.

However, Dixon and Mannion's evaluation of the texts of Oliver Goldsmith shows that the selection of stylistic attributes carries a risk: Goldsmith's style is characterized by an adaptive fluency, in which he adapts his own style to the reported speech of the respective character. To identify Goldsmith's characteristic style attributes, Dixon and Mannion compared his essays with those of four contemporary writers. They found that two of the four writers show a suspicious similarity to Goldsmith, as they come from the same Irish region and lived in exile in England [14]. Additionally, stylistic attributes that are influenced by the genre may interfere with the individual style [3]. This leads to the conclusion that only texts of the same domain are comparable with each other. Using the texts of the Nijmegen corpus, Baayen et al. analyzed the differences between diverse authors of the same genre as well as between texts by authors who represent different genres: they found that texts of the same genre are generally more similar than texts of different genres written by the same author.

2 Style Discovery

Stylometry refers to the measurement of style with the aim of fingerprinting a text by a certain number of linguistic attributes, in order to determine the authorship of a text and/or to order texts chronologically [18]. The content, meaning, and correctness of the text are not of concern. The general ambition is to discover those attributes that differentiate texts sufficiently [17]. Generally, the data is analyzed statistically, taking numerical attributes into account but disregarding categorical attributes. Figures of speech such as metaphors and symbols are indeed clearly defined, but cannot be discovered automatically. In [18], Oakes writes that any linguistic occurrence can be used for stylometric analysis, as long as it can be expressed as a numerical attribute. However, it must be ensured that the attribute is relevant for other genres as well [16].

Another important aspect is whether linguistic attributes are consciously controlled by the author or not [13]. Many studies explicitly take unconscious stylistic attributes as the relevant discrimination criterion, as they are a stronger sign of a stylistic fingerprint. However, this presupposes the existence of stylistic attributes that remain constant throughout the whole text as well as linguistic attributes that adapt [12]. In this respect, we focus on differentiating conscious and unconscious stylistic attributes, noting that different authors differ more in their style than texts by an individual author do. Furthermore, texts by an individual author differ more than passages within a single text [6]. We therefore conclude that an appropriate consistence of a continuous usage of conscious and unconscious stylistic attributes must

…(Full text truncated)…
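
To make concrete the statement in Section 2 that stylometric analysis works on numerical attributes, here is a minimal Python sketch that maps a text to a small vector of numbers and compares two such vectors by distance. It is an illustration under assumptions, not the authors' method: the attributes (average sentence length, average word length, type-token ratio, function-word rates), the `FUNCTION_WORDS` list, and the names `style_vector` and `euclidean` are all hypothetical choices, since the excerpt does not list the attributes actually used.

```python
import re
from math import sqrt

# Hypothetical function-word list, chosen only for illustration;
# the paper's actual attribute set is not shown in this excerpt.
FUNCTION_WORDS = ("the", "and", "of", "to", "a", "in")


def style_vector(text):
    """Map a text to a tuple of numerical style attributes (assumed set)."""
    # Split on runs of sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Runs of letters only (any alphabet), lower-cased.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    avg_sentence_len = len(words) / len(sentences)          # words per sentence
    avg_word_len = sum(len(w) for w in words) / len(words)  # letters per word
    type_token_ratio = len(set(words)) / len(words)         # vocabulary richness
    # Relative frequency of each function word.
    fw_rates = [words.count(fw) / len(words) for fw in FUNCTION_WORDS]
    return (avg_sentence_len, avg_word_len, type_token_ratio, *fw_rates)


def euclidean(u, v):
    """Euclidean distance between two equally long attribute vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    text_a = "The king rode to the forest. The moon was bright and cold."
    text_b = "A small bird sang in the garden, and the child listened."
    print(euclidean(style_vector(text_a), style_vector(text_b)))
```

In practice the attributes would need to be scaled before distances are compared, because sentence length is measured in words while the function-word rates lie between 0 and 1. With vectors like these, a comparison of the kind attributed to Baayen et al. would ask whether distances between texts of the same genre are smaller than distances between texts of one author across genres.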

Reference

This content is AI-processed based on ArXiv data.
