Construction and evaluation of classifiers for forensic document analysis

February 23, 2026

📝 Original Info

Title: Construction and evaluation of classifiers for forensic document analysis
ArXiv ID: 1004.0678
Date: 2015-03-14
Authors: Christopher P. Saunders, Linda J. Davis, Andrea C. Lamas, John J. Miller, Donald T. Gantz

📝 Abstract

In this study we illustrate a statistical approach to questioned document examination. Specifically, we consider the construction of three classifiers that predict the writer of a sample document based on categorical data. To evaluate these classifiers, we use a data set with a large number of writers and a small number of writing samples per writer. Since the resulting classifiers were found to have near perfect accuracy using leave-one-out cross-validation, we propose a novel Bayesian-based cross-validation method for evaluating the classifiers.

📄 Full Content

arXiv:1004.0678v2 [stat.AP] 28 Jun 2011

The Annals of Applied Statistics 2011, Vol. 5, No. 1, 381–399. DOI: 10.1214/10-AOAS379. © Institute of Mathematical Statistics, 2011.

CONSTRUCTION AND EVALUATION OF CLASSIFIERS FOR FORENSIC DOCUMENT ANALYSIS [1]

By Christopher P. Saunders [2], Linda J. Davis [3], Andrea C. Lamas [3], John J. Miller [3] and Donald T. Gantz [3]

George Mason University

Received May 2008; revised June 2010.

[1] Supported in part under a contract award from the Counterterrorism and Forensic Science Research Unit of the Federal Bureau of Investigation's Laboratory Division. Names of commercial manufacturers are provided for information only and inclusion does not imply endorsement by the FBI. Points of view in this document are those of the authors and do not necessarily represent the official position of the FBI or the US Government.

[2] Supported by IC Post Doctoral Research Fellowship, NGIA HM1582-06-1-2016.

[3] Supported by Gannon Technologies Group.

Key words and phrases. Classification, handwriting identification, cross-validation, Bayesian statistics.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2011, Vol. 5, No. 1, 381–399. This reprint differs from the original in pagination and typographic detail.

1. Introduction.

A common goal of forensic handwriting examination is the determination, by a forensic document examiner, of which individual is the actual writer of a given document. Recently, there has been a growing interest in the development of forensic handwriting biometric systems that can assist with this determination process. Forensic handwriting biometric systems tend to focus on two main tasks. The first task, known as writer verification, is the determination of whether or not two documents were written by a single writer. The second task, commonly referred to as handwriting biometric identification, is the selection from a set of known writers of a short list of potential writers for a given document. (Another example of a biometric identification problem in forensics is searching fingerprint databases to find a match for a latent fingerprint.)

In this paper we focus on closed-set biometric identification, which assumes that the writer of a document of unknown writership is one of W known writers with handwriting styles that have been modeled by the biometric system. It is important to note that the fundamental forensic writer identification problem, which is to verify that a document of questioned writership came from a "suspect" to the exclusion of all other possible writers, is not addressed in this paper. The "exclusion of all other possible writers" requires an assumption that the suspect writer has a unique handwriting profile and, further, that the handwriting quantification contains enough information to uniquely associate the writing sample of unknown writership with the suspect's writing profile. These issues are addressed in handwriting individuality studies. [See Srihari et al. (2002) and related discussion papers in the Journal of Forensic Sciences.] Ongoing research by Saunders et al. (2008) explores some of the issues associated with studying handwriting individuality using computational biometric systems.

At a basic level, closed-set biometric identification is similar to a traditional multi-group statistical discriminant analysis problem. In this paper, we implement three different discriminant functions (or classification procedures) for categorical data resulting from the quantification of a handwritten document. We determine the accuracy of these three classification procedures with respect to a database of 100 writers provided by the FBI. Each of the three classification procedures is shown to identify with close to 100% accuracy the writer of a short handwritten note.

The quantification technology used in this study is a derivative of the handwriting biometric identification system developed and implemented by the Gannon Technologies Group and the George Mason University Document Forensics Laboratory. Components of the system are described as needed. For a document of unknown writership, the system returns a short list of potential writers from a set of known writers. This functionality is the common goal of most forensic biometric systems [Dessimoz and Champod (2008)]. A forensic document examiner can pursue a fina

…(Full text truncated)…
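
The closed-set identification task and its leave-one-out evaluation, as described in the introduction above, can be illustrated with a minimal Python sketch. The excerpt does not specify the paper's three classifiers, so the scoring rule here (a multinomial log-likelihood with add-one smoothing) is a stand-in assumption, and the function names, category labels, and toy counts are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: each sample is (writer, counts of categorical features).
CATEGORIES = ["a", "b", "c"]
SAMPLES = [
    ("w1", Counter(a=8, b=1, c=1)),
    ("w1", Counter(a=7, b=2, c=1)),
    ("w2", Counter(a=1, b=8, c=1)),
    ("w2", Counter(a=2, b=7, c=1)),
    ("w3", Counter(a=1, b=1, c=8)),
    ("w3", Counter(a=1, b=2, c=7)),
]


def fit_profiles(samples, categories, alpha=1.0):
    """Pool each writer's counts and return smoothed log-probabilities per category.

    Assumption: a multinomial profile with additive (alpha) smoothing,
    not necessarily any of the classifiers used in the paper.
    """
    pooled = {}
    for writer, counts in samples:
        pooled.setdefault(writer, Counter()).update(counts)
    profiles = {}
    for writer, counts in pooled.items():
        denom = sum(counts.values()) + alpha * len(categories)
        profiles[writer] = {
            c: math.log((counts[c] + alpha) / denom) for c in categories
        }
    return profiles


def short_list(profiles, counts, k=3):
    """Rank the known writers by log-likelihood of the counts; return the top k."""
    scores = {
        writer: sum(n * logp[c] for c, n in counts.items())
        for writer, logp in profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


def loo_accuracy(samples, categories, k=1):
    """Leave-one-out: hold out each sample, refit, check the true writer is in the top k."""
    hits = 0
    for i, (writer, counts) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        profiles = fit_profiles(training, categories)
        if writer in short_list(profiles, counts, k):
            hits += 1
    return hits / len(samples)


if __name__ == "__main__":
    print("leave-one-out top-1 accuracy:", loo_accuracy(SAMPLES, CATEGORIES))
    profiles = fit_profiles(SAMPLES, CATEGORIES)
    print("short list:", short_list(profiles, Counter(a=5, b=1), k=2))
```

With only a few samples per writer, as in the data set the abstract describes, each held-out document removes a large share of that writer's training data, which is one reason a leave-one-out estimate deserves careful interpretation in this setting.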

Reference

This content is AI-processed based on ArXiv data.
