Quality assessment for short oligonucleotide microarray data

February 23, 2026

Reading time: 6 minutes

📝 Original Info

Title: Quality assessment for short oligonucleotide microarray data
ArXiv ID: 0710.0178
Date: 2011-11-10
Authors: Julia Brettschneider, François Collin, Benjamin M. Bolstad, Terence P. Speed

📝 Abstract

Quality of microarray gene expression data has emerged as a new research topic. As in other areas, microarray quality is assessed by comparing suitable numerical summaries across microarrays, so that outliers and trends can be visualized, and poor quality arrays or variable quality sets of arrays can be identified. Since each single array comprises tens or hundreds of thousands of measurements, the challenge is to find numerical summaries which can be used to make accurate quality calls. To this end, several new quality measures are introduced based on probe level and probeset level information, all obtained as a by-product of the low-level analysis algorithms RMA/fitPLM for Affymetrix GeneChips. Quality landscapes spatially localize chip or hybridization problems. Numerical chip quality measures are derived from the distributions of Normalized Unscaled Standard Errors and of Relative Log Expressions. Quality of chip batches is assessed by Residual Scale Factors. These quality assessment measures are demonstrated on a variety of datasets (spike-in experiments, small lab experiments, multi-site studies). They are compared with Affymetrix's individual chip quality report.
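The abstract names Relative Log Expressions (RLE) and Normalized Unscaled Standard Errors (NUSE) without defining them in the text available here. The sketch below assumes the definitions commonly used with RMA/fitPLM-style analyses, which are not quoted from this page: RLE subtracts from each probeset's log expression its median across arrays, and NUSE divides each probeset's standard error estimate by its median across arrays. The function names and the toy numbers are hypothetical, and the per-array median and interquartile range are only one plausible choice of summary.

```python
from statistics import median, quantiles


def relative_log_expression(log_expr):
    """Assumed RLE: for each probeset (row), subtract the row median across arrays."""
    result = []
    for row in log_expr:
        m = median(row)
        result.append([x - m for x in row])
    return result


def normalized_unscaled_se(std_err):
    """Assumed NUSE: for each probeset (row), divide by the row median across arrays."""
    result = []
    for row in std_err:
        m = median(row)
        result.append([s / m for s in row])
    return result


def per_array_summary(matrix):
    """Return (median, IQR) for each array, i.e. each column of the matrix."""
    summaries = []
    for column in zip(*matrix):
        q1, _, q3 = quantiles(column, n=4)
        summaries.append((median(column), q3 - q1))
    return summaries


# Hypothetical toy data: 3 probesets (rows) measured on 3 arrays (columns).
log_expr = [
    [8.0, 8.5, 10.0],
    [5.0, 5.25, 7.0],
    [10.0, 9.5, 12.0],
]
std_err = [
    [0.5, 0.5, 1.0],
    [0.25, 0.25, 0.75],
    [0.5, 0.25, 1.0],
]

rle_summary = per_array_summary(relative_log_expression(log_expr))
nuse_summary = per_array_summary(normalized_unscaled_se(std_err))

for i, ((rle_med, rle_iqr), (nuse_med, nuse_iqr)) in enumerate(
    zip(rle_summary, nuse_summary)
):
    print(f"array {i}: RLE median={rle_med:.2f} IQR={rle_iqr:.2f}, "
          f"NUSE median={nuse_med:.2f} IQR={nuse_iqr:.2f}")
```

In this toy data the third array is constructed to stand out: its RLE median is well above zero and its NUSE median is well above one, which is the kind of pattern such summaries are meant to flag when compared across arrays.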

📄 Full Content

To be published in Technometrics (with Discussion)

Julia Brettschneider (a, c, *), François Collin (b), Benjamin M. Bolstad (b), Terence P. Speed (b, d)

(a) University of Warwick, Department of Statistics, Coventry, UK; (b) University of California at Berkeley, Department of Statistics, Berkeley, California, USA; (c) Queen's University, Cancer Research Institute Division of Cancer Care & Epidemiology and Department of Community Health & Epidemiology, Kingston, Ontario, Canada; (d) Walter and Eliza Hall Institute Bioinformatics Division, Melbourne, Australia. (*) Corresponding author: julia.brettschneider@warwick.ac.uk

arXiv:0710.0178v2 [stat.ME] 16 Nov 2007

KEYWORDS: quality control, microarrays, Affymetrix chips, relative log expression, normalized unscaled standard errors, residual scale factors.

1. INTRODUCTION

With the introduction of microarrays, biologists have been witnessing entire labs shrinking to matchbox size. This paper invites quality researchers to join scientists on their fantastic journey into the world of microscopic high-throughput measurement technologies.

Building a biological organism as laid out by the genetic code is a multi-step process with room for variation at each step. The first steps, as described by the central dogma of molecular biology, are genes (and DNA sequence in general), their transcripts and proteins. Substantial factors contributing to their variation in both structure and abundance include cell type, developmental stage, genetic background and environmental conditions. Connecting molecular observations to the state of an organism is a central interest in molecular biology. This includes the study of gene and protein functions and interactions, and their alteration in response to changes in environmental and developmental conditions. Traditional methods in molecular biology generally work on a "one gene (or protein) in one experiment" basis. With the invention of microarrays, huge numbers of such macromolecules can now be monitored in one experiment. The most common kinds are gene expression microarrays, which measure the mRNA transcript abundance for tens of thousands of genes simultaneously.

For biologists, this high-throughput approach has opened up entirely new avenues of research. Rather than experimentally confirming the hypothesized role of a certain candidate gene in a certain cellular process, they can use genome-wide comparisons to screen for all genes which might be involved in that process. One of the first examples of such an exploratory approach is the expression profiling study of mitotic yeast cells by Cho et al. (1998), which determined a set of a few hundred genes involved in the cell cycle and triggered a cascade of articles re-analyzing the data or replicating the experiment. Microarrays have become a central tool in cancer research, initiated by the discovery and re-definition of tumor subtypes based on molecular signatures (see e.g. Perou et al. (2000), Alizadeh et al. (2000), Ramaswamy and Golub (2002), Yeoh et al. (2002)). In Section 2 we will explain different kinds of microarray technologies in more detail and describe their current applications in life sciences research.

A DNA microarray consists of a glass surface with a large number of distinct fragments of DNA, called probes, attached to it at fixed positions. A fluorescently labelled sample containing a mixture of unknown quantities of DNA molecules, called the target, is applied to the microarray. Under the right chemical conditions, single-stranded fragments of target DNA will base pair with the probes which are their complements, with great specificity. This reaction is called hybridization, and is the reason DNA microarrays work. The fixed probes are either fragments of DNA called complementary DNA (cDNA) obtain

…(Full text truncated)…
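As a small illustration of the hybridization principle described in the introduction, the sketch below checks whether a target fragment is the exact Watson-Crick complement of a probe. This is an idealized model chosen for illustration only: the function names and sequences are hypothetical, and real hybridization depends on chemical conditions and can tolerate partial mismatches, which this sketch ignores.

```python
# Watson-Crick base pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (both read 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(probe, target_fragment):
    """Idealized perfect-match test: True only for an exact reverse complement."""
    return reverse_complement(probe) == target_fragment.upper()


# Hypothetical short sequences, far shorter than real probes.
probe = "ATGCCGTA"
matching_target = reverse_complement(probe)
mismatched_target = "A" + matching_target[1:]  # first base altered on purpose

print(hybridizes(probe, matching_target))    # exact complement
print(hybridizes(probe, mismatched_target))  # one-base mismatch
```

A perfect-match test like this is the simplest possible stand-in for specificity; it says nothing about signal intensity, which is what the array actually measures.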

Reference

This content is AI-processed based on ArXiv data.
