📝 Original Info
- Title: Textual Fingerprinting with Texts from Parkin, Bassewitz, and Leander
- ArXiv ID: 0802.2234
- Date: 2008-02-18
- Authors: Researchers from original ArXiv paper
📝 Abstract
Current research in author profiling, which seeks to discover an author's fingerprint for legal purposes, no longer relies on statistical parameters alone but increasingly includes dynamic methods that can learn and adapt to the specific behavior of an author. However, the question of how to represent a text appropriately remains one of the fundamental tasks, and the problem of which attributes should be used to fingerprint an author's style is still not exactly defined. In this work, we focus on the linguistic selection of attributes to fingerprint the style of the authors Parkin, Bassewitz, and Leander. We use texts of the fairy-tale genre because it has a clear style and its texts are short, with a straightforward story line and simple language.
📄 Full Content
Textual Fingerprinting with texts from
Parkin, Bassewitz, and Leander
Christoph Schommer
University of Luxembourg
Dept. of Computer Science - ILIAS Laboratory
6, Rue Richard Coudenhove-Kalergi, 1359 Luxembourg, Luxembourg
Email: christoph.schommer@uni.lu
Conny Uhde
JW Goethe-University Frankfurt am Main
Dept. of Computer Science and Mathematics
Robert-Mayer-Str. 11-15, D-60486 Frankfurt am Main, Germany.
Email: uhde@cs.uni-frankfurt.de
November 14, 2021
Abstract
Current research in author profiling, which seeks to discover an author’s
fingerprint for legal purposes, no longer relies on statistical parameters alone
but increasingly includes dynamic methods that can learn and adapt to the
specific behavior of an author. However, the question of how to represent a text
appropriately remains one of the fundamental tasks, and the problem of which
attributes should be used to fingerprint an author’s style is still not exactly
defined. In this work, we focus on the linguistic selection of attributes to
fingerprint the style of the authors Parkin, Bassewitz, and Leander. We use texts
of the fairy-tale genre because it has a clear style and its texts are short,
with a straightforward story line and simple language.
1 What is it about?
Forensic linguistics¹ is concerned with a verification process for the decryption
of texts and their analysis through pattern discovery. In this respect,
verification means the use of existing and well-known stylistic attributes to
discover an individual (linguistic) fingerprint. However, it is still
controversial whether such a linguistic fingerprint is a clear indication per se:
stylistic tests assume that typical attributes can be directly influenced by the
author and that a certain number of attributes nevertheless remain constant, even
when the author consciously changes his or her behavior or style [7]. Moreover,
Dixon and Mannion's evaluation of the texts of Oliver Goldsmith shows that
selecting appropriate stylistic attributes carries a risk: Goldsmith’s style is
characterized by an adaptive fluency, in which he adapts his own style to the
reported speech of the respective character. To identify Goldsmith’s
characteristic style attributes, Dixon and Mannion compared his essays with those
of four contemporary writers. They found that two of the four writers show a
suspicious similarity to Goldsmith, as they originate from the same Irish region
and lived in English exile [14].
¹ This work has been supported by the University of Luxembourg within the project
TRIAS - Logic of Trust and Reliability for Information Agents in Science.
arXiv:0802.2234v1 [cs.CL] 15 Feb 2008
Additionally, stylistic attributes that are influenced by the genre may interfere
with the individual style [3]. This leads to the conclusion that only texts of the
same domain are comparable with each other. Using the texts of the Nijmegen
corpus, Baayen et al. analyzed the differences between diverse authors of the same
genre as well as between texts by authors who represent different genres: they
found that texts of the same genre are generally more similar to each other than
texts of different genres written by the same author.
2 Style Discovery
Stylometry refers to the measurement of style with the aim of fingerprinting a
text according to a certain number of linguistic attributes, of determining the
authorship of a text, and/or of ordering texts chronologically [18]. The content,
the meaning, and the correctness of the text are not of concern. The general
ambition is to discover those attributes that differentiate texts sufficiently
[17]. Generally, the data is analyzed statistically, taking numerical attributes
into account but disregarding categorical attributes. Figures of speech such as
metaphors and symbols are indeed clearly defined, but cannot be discovered
automatically. In [18], Oakes writes that any linguistic occurrence can be used
for stylometric analysis as long as it can be expressed as a numerical attribute.
However, it must be ensured that the attribute is relevant for other genres as
well [16]. Another important aspect is the distinction between linguistic
attributes that are consciously controlled by the author and those that are not
[13]. Many studies explicitly take unconscious stylistic attributes as the
relevant discrimination criterion, as they are a stronger sign of a stylistic
fingerprint. However, this presupposes the existence of stylistic attributes that
stay constant throughout the whole text as well as the existence of linguistic
attributes that adapt [12]. In this respect, we focus on a differentiation of
conscious and unconscious stylistic attributes, noting that texts of different
authors differ more in style than texts of an individual author. Furthermore,
texts of an individual author differ more than passages within a text [6]. We
therefore conclude that an appropriate consistence of
a continuous usage of conscious and unconscious stylistic attributes must
…(Full text truncated)…
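The Style Discovery section notes that any linguistic occurrence can serve stylometric analysis once it is expressed as a numerical attribute. The following is a minimal sketch of that idea, not the authors' method: the function name `stylometric_features`, the choice of attributes, and the `FUNCTION_WORDS` list are all assumptions made for illustration, and the sentence splitter is a rough heuristic.

```python
import re
from collections import Counter

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ("the", "and", "of", "to", "in")


def stylometric_features(text):
    """Map a text to a dictionary of numerical style attributes."""
    # Rough sentence split at runs of '.', '!' or '?' (a heuristic, not a parser).
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    # Lowercased word tokens; an apostrophe is kept only inside a word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    counts = Counter(words)
    features = {
        # Mean number of word tokens per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word token.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct word forms divided by total tokens (vocabulary richness).
        "type_token_ratio": len(counts) / len(words),
    }
    # Relative frequency of each assumed function word.
    for fw in FUNCTION_WORDS:
        features["freq_" + fw] = counts[fw] / len(words)
    return features


if __name__ == "__main__":
    sample = "The king and the queen lived in a castle. The king was old."
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Each text becomes a small numeric vector, so texts by different authors, or different passages of one text, can be compared attribute by attribute. Function-word frequencies are included because they are a common stand-in for attributes an author is unlikely to control consciously; whether they suit fairy tales by Parkin, Bassewitz, and Leander is not established by this sketch.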