A Multivariate Regression Approach to Association Analysis of Quantitative Trait Network

📝 Original Info

  • Title: A Multivariate Regression Approach to Association Analysis of Quantitative Trait Network
  • ArXiv ID: 0811.2026
  • Date: 2008-11-16
  • Authors: Seyoung Kim, Kyung-Ah Sohn, Eric P. Xing

📝 Abstract

Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to address this issue in a principled way. Our approach explicitly represents the dependency structure among the quantitative traits as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most traditional methods examined each phenotype independently and combined the results afterwards, our approach analyzes all of the traits jointly in a single statistical method, and borrows information across correlated phenotypes to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with the single-marker analysis, and with other regression methods such as ridge regression and the lasso that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal SNPs when we incorporate the correlation pattern in traits using our proposed methods.
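To make the idea of a "structured regularization" concrete, the sketch below evaluates one plausible form of a graph-guided fused lasso objective: a squared-error loss, a lasso penalty on all coefficients, and a fusion penalty that pulls together the coefficients of a marker for two traits joined by an edge. The exact objective is not given in this excerpt, so the function name `gflasso_objective`, the edge weight `|r|`, and the parameter names `lam` and `gamma` are assumptions made for illustration only.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Edge = Tuple[int, int, float]  # (trait m, trait l, correlation r_ml)


def sign(x: float) -> float:
    """Sign of a correlation; zero is treated as positive."""
    return 1.0 if x >= 0 else -1.0


def gflasso_objective(X: Matrix, Y: Matrix, B: Matrix,
                      edges: List[Edge], lam: float, gamma: float) -> float:
    """Evaluate an assumed GFlasso-style objective.

    X: n x J genotype matrix, Y: n x K trait matrix,
    B: J x K regression coefficients (marker j, trait k).
    """
    n, J, K = len(X), len(B), len(B[0])

    # Squared-error loss summed over samples and traits.
    loss = 0.0
    for i in range(n):
        for k in range(K):
            pred = sum(X[i][j] * B[j][k] for j in range(J))
            loss += (Y[i][k] - pred) ** 2

    # Lasso penalty: encourages most marker-trait effects to be zero.
    l1 = sum(abs(B[j][k]) for j in range(J) for k in range(K))

    # Fusion penalty: for each edge, encourages a marker's effects on the
    # two connected traits to match (up to the sign of their correlation).
    # Weighting by |r| is an assumption, not taken from the excerpt.
    fusion = 0.0
    for m, l, r in edges:
        for j in range(J):
            fusion += abs(r) * abs(B[j][m] - sign(r) * B[j][l])

    return loss + lam * l1 + gamma * fusion
```

With `gamma = 0` this reduces to an ordinary lasso over all traits, which is the sense in which the lasso baseline ignores the trait network.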

📄 Full Content

arXiv:0811.2026v1 [stat.ML] 13 Nov 2008

Submitted to The American Journal of Human Genetics

A Multivariate Regression Approach to Association Analysis of Quantitative Trait Network

Seyoung Kim, Kyung-Ah Sohn, Eric P. Xing

Technical Report CMU-ML-08-113, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213

Abstract

Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to address this issue in a principled way. Our approach explicitly represents the dependency structure among the quantitative traits as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most traditional methods examined each phenotype independently and combined the results afterwards, our approach analyzes all of the traits jointly in a single statistical method, and borrows information across correlated phenotypes to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with the single-marker analysis, and with other regression methods such as ridge regression and the lasso that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal SNPs when we incorporate the correlation pattern in traits using our proposed methods.
Keywords: lasso, fused lasso, association analysis, quantitative trait network

1 Introduction

Recent advances in high-throughput genotyping technologies have significantly reduced the cost and time of genome-wide screening of individual genetic differences over millions of single nucleotide polymorphism (SNP) marker loci, shedding light on an era of the “personalized genome” [1, 2]. Accompanying this trend, clinical and molecular phenotypes are being measured at phenome and transcriptome scale over a wide spectrum of diseases in various patient populations and laboratory models, creating an imminent need for appropriate methodology to identify omic-wide associations between genetic markers and complex traits that are suggestive of causal relationships between them.

Many statistical approaches have been proposed to address various challenges in identifying genetic loci associated with a phenotype from a large set of markers, with the primary focus on problems involving a univariate trait [3, 4, 5]. However, in modern studies the patient cohorts are routinely surveyed with a large number of traits (from measures of hundreds of clinical phenotypes to genome-wide profiling of thousands of gene expressions), many of which are correlated with one another. For example, in Figure 1, the correlation structure of the 53 clinical traits in the asthma dataset collected as a part of the Severe Asthma Research Program (SARP) [6] is represented as a network, with each trait as a node, the interaction between two traits as an edge, and the thickness of an edge representing the strength of correlation. Within this network, there exist several subnetworks involving a subset of traits, and furthermore, the large subnetwork on the left-hand side of Figure 1 contains two subgroups of densely connected traits with thick edges.
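A trait network of the kind described for Figure 1 can be sketched by computing pairwise correlations between traits and keeping the strongly correlated pairs as edges. The following is a minimal illustration, assuming Pearson correlation and a fixed absolute-correlation cutoff; the function names and the `threshold` parameter are hypothetical, and the excerpt does not state how the network in Figure 1 was actually constructed.

```python
import math
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (trait m, trait l, correlation r_ml)


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def trait_network(Y: List[List[float]], threshold: float) -> List[Edge]:
    """Build an edge list over traits from an n x K trait matrix Y.

    Traits are nodes; an edge (m, l, r) is kept when |r| >= threshold,
    so |r| plays the role of the edge thickness in the figure.
    """
    K = len(Y[0])
    columns = [[row[k] for row in Y] for k in range(K)]
    edges: List[Edge] = []
    for m in range(K):
        for l in range(m + 1, K):
            r = pearson(columns[m], columns[l])
            if abs(r) >= threshold:
                edges.append((m, l, r))
    return edges
```

The resulting edge list is the kind of structure a network-aware regression penalty would consume; lowering `threshold` yields a denser network with more, weaker edges.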
In order to understand how genetic variations in asthma patients affect various asthma-related clinical traits in the presence of such a complex correlation pattern among phenotypes, it is necessary to consider all of the traits jointly and take into account their correlation structure in the association analysis. Although numerous research efforts have been devoted to studying the interaction patterns among many quantitative traits represented as networks [7, 8, 9, 10, 11, 12, 13, 14] as well as discovering network submodules from such networks [11, 15], this type of network structure has not been exploited in association mapping [16, 17]. Many of the previous approaches examined one phenotype at a time to localize the SNP markers with a significant association and combined the results from a set of such single-phenotype association mappings across phenotypes. However, we conjecture that one can detect additional weak associations and at the same time reduce false signals by combining the information across multiple phenotypes under a single statistical framework. In QTL mapping studies with pedigree data, a number of approaches have been proposed to detect pleiotropic effects of markers on multiple

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
