📝 Original Info
- Title: Quality assessment for short oligonucleotide microarray data
- ArXiv ID: 0710.0178
- Date: 2011-11-10
- Authors: Julia Brettschneider, François Collin, Benjamin M. Bolstad, Terence P. Speed
📝 Abstract
Quality of microarray gene expression data has emerged as a new research topic. As in other areas, microarray quality is assessed by comparing suitable numerical summaries across microarrays, so that outliers and trends can be visualized, and poor quality arrays or variable quality sets of arrays can be identified. Since each single array comprises tens or hundreds of thousands of measurements, the challenge is to find numerical summaries which can be used to make accurate quality calls. To this end, several new quality measures are introduced based on probe level and probeset level information, all obtained as a by-product of the low-level analysis algorithms RMA/fitPLM for Affymetrix GeneChips. Quality landscapes spatially localize chip or hybridization problems. Numerical chip quality measures are derived from the distributions of Normalized Unscaled Standard Errors and of Relative Log Expressions. Quality of chip batches is assessed by Residual Scale Factors. These quality assessment measures are demonstrated on a variety of datasets (spike-in experiments, small lab experiments, multi-site studies). They are compared with Affymetrix's individual chip quality report.
📄 Full Content
To be published in Technometrics (with Discussion)
Julia Brettschneider (a, c, ∗), François Collin (b), Benjamin M. Bolstad (b), Terence P. Speed (b, d)
Quality assessment for
short oligonucleotide microarray data
Quality of microarray gene expression data has emerged as a new research topic. As in other
areas, microarray quality is assessed by comparing suitable numerical summaries across
microarrays, so that outliers and trends can be visualized, and poor quality arrays or variable
quality sets of arrays can be identified. Since each single array comprises tens or hundreds of
thousands of measurements, the challenge is to find numerical summaries which can be used
to make accurate quality calls. To this end, several new quality measures are introduced based
on probe level and probeset level information, all obtained as a by-product of the low-level
analysis algorithms RMA/fitPLM for Affymetrix GeneChips. Quality landscapes spatially
localize chip or hybridization problems. Numerical chip quality measures are derived from the
distributions of Normalized Unscaled Standard Errors and of Relative Log Expressions. Quality
of chip batches is assessed by Residual Scale Factors. These quality assessment measures are
demonstrated on a variety of datasets (spike-in experiments, small lab experiments, multi-site
studies). They are compared with Affymetrix’s individual chip quality report.
KEYWORDS: quality control, microarrays, Affymetrix chips, relative log expression, normalized
unscaled standard errors, residual scale factors.
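As a rough illustration of the two chip-level measures named in the abstract, the sketch below uses the definitions commonly given for them; the visible text of this section does not spell them out, so treat them as assumptions. Relative Log Expression (RLE) is taken to be each probeset's log-scale expression minus its median across arrays, and Normalized Unscaled Standard Error (NUSE) is taken to be each probeset's standard error divided by its median across arrays. The function names and the row-per-probeset, column-per-array layout are hypothetical, and the standard errors are supplied as input rather than obtained from a fitted probe level model.

```python
from statistics import median


def relative_log_expression(log_expr):
    """RLE sketch: subtract each probeset's median across arrays.

    log_expr is a list of rows, one row per probeset and one column
    per array, holding log-scale expression estimates (assumed layout).
    """
    result = []
    for row in log_expr:
        row_median = median(row)
        result.append([value - row_median for value in row])
    return result


def normalized_unscaled_se(std_errors):
    """NUSE sketch: divide each probeset's standard error by its median
    across arrays, so that a typical array has values near 1.

    std_errors uses the same assumed layout as log_expr.
    """
    result = []
    for row in std_errors:
        row_median = median(row)
        result.append([value / row_median for value in row])
    return result


def per_array_median(matrix):
    """Summarize one column (array) at a time by its median."""
    return [median(column) for column in zip(*matrix)]


# Toy input: 2 probesets measured on 3 arrays (made-up numbers).
rle = relative_log_expression([[1.0, 2.0, 3.0], [4.0, 4.0, 7.0]])
nuse = normalized_unscaled_se([[1.0, 2.0, 4.0], [3.0, 3.0, 6.0]])
rle_medians = per_array_median(rle)
nuse_medians = per_array_median(nuse)
```

Under these assumed definitions, an array whose RLE median lies far from 0, or whose NUSE median lies well above 1, would stand out against the others; in the toy input this is the third array.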
1. INTRODUCTION
With the introduction of microarrays, biologists have been witnessing entire labs shrinking to
matchbox size. This paper invites quality researchers to join scientists on their fantastic journey
into the world of microscopic high-throughput measurement technologies. Building a biological
organism as laid out by the genetic code is a multi-step process with room for variation at each
step. The first steps, as described by the central dogma of molecular biology, are genes (and DNA sequence
in general), their transcripts and proteins. Substantial factors contributing to their variation in
both structure and abundance include cell type, developmental stage, genetic background and
environmental conditions. Connecting molecular observations to the state of an organism is a
central interest in molecular biology. This includes the study of gene and protein functions
and interactions, and their alteration in response to changes in environmental and developmental
conditions. Traditional methods in molecular biology generally work on a “one gene (or protein) in
one experiment” basis. With the invention of microarrays, huge numbers of such macromolecules
can now be monitored in one experiment. The most common kinds are gene expression microarrays,
which measure the mRNA transcript abundance for tens of thousands of genes simultaneously.
(a) University of Warwick, Department of Statistics, Coventry, UK;
(b) University of California at Berkeley, Department of Statistics, Berkeley, California, USA;
(c) Queen’s University, Cancer Research Institute, Division of Cancer Care & Epidemiology and
Department of Community Health & Epidemiology, Kingston, Ontario, Canada;
(d) Walter and Eliza Hall Institute, Bioinformatics Division, Melbourne, Australia.
(∗) Corresponding author: julia.brettschneider@warwick.ac.uk
arXiv:0710.0178v2 [stat.ME] 16 Nov 2007
For biologists, this high-throughput approach has opened up entirely new avenues of research.
Rather than experimentally confirming the hypothesized role of a certain candidate gene in a
certain cellular process, they can use genome-wide comparisons to screen for all genes which might
be involved in that process. One of the first examples of such an exploratory approach is the
expression profiling study of mitotic yeast cells by Cho et al. (1998), which identified a set of a
few hundred genes involved in the cell cycle and triggered a cascade of articles re-analyzing the
data or replicating the experiment. Microarrays have become a central tool in cancer research,
a development initiated by the discovery and re-definition of tumor subtypes based on molecular
signatures (see e.g. Perou et al. (2000), Alizadeh et al. (2000), Ramaswamy and Golub (2002), Yeoh et al. (2002)).
In Section 2 we will explain different kinds of microarray technologies in more detail and describe
their current applications in life sciences research.
A DNA microarray consists of a glass surface with a large number of distinct fragments of
DNA called probes attached to it at fixed positions. A fluorescently labelled sample containing a
mixture of unknown quantities of DNA molecules called the target is applied to the microarray.
Under the right chemical conditions, single-stranded fragments of target DNA will base pair with
the probes which are their complements, with great specificity. This reaction is called hybridization,
and is the reason DNA microarrays work. The fixed probes are either fragments of DNA called
complementary DNA (cDNA) obtain
…(Full text truncated)…