Correction for Founder Effects in Host-Viral Association Studies via Principal Components
Karyn L. Reeves, Elizabeth J. McKinnon and Ian R. James
Abstract
Viruses such as HIV and hepatitis C virus (HCV) replicate rapidly and with high transcription error rates, which may facilitate their escape from immune detection through the encoding of mutations at key positions within human leukocyte antigen (HLA)-specific peptides, thus impeding T-cell recognition. Large-scale population-based host-viral association studies are conducted as hypothesis-generating analyses that aim to determine the positions within the viral sequence at which host HLA immune pressure may have led to these viral escape mutations. When transmission of the virus to the host is HLA-associated, however, standard tests of association can be confounded by the relatedness of contemporaneously circulating viral sequences: sequences descended from a common ancestor may share inherited patterns of polymorphisms, termed founder effects. Recognizing the correspondence between this problem and the confounding of case-control genome-wide association studies by population stratification, we adapt methods taken from that field to the analysis of host-viral associations. In particular, we consider methods based on principal components analysis within a logistic regression framework, motivated by alternative formulations in the Frisch-Waugh-Lovell theorem. We demonstrate via simulation their utility in detecting true host-viral associations whilst minimizing confounding by associations generated by founder effects. The proposed methods incorporate relatively robust, standard statistical procedures that can be easily implemented using widely available software, and provide alternatives to the more complex, computationally intensive methods often implemented in this area.
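As a rough illustration of the Frisch-Waugh-Lovell idea mentioned in the abstract, the sketch below uses ordinary least squares rather than the logistic regression framework the authors describe, so it should be read as a linear analogue and not as their procedure. All data are simulated; the sizes `n`, `p`, `k`, the variable names, and the choice to compute principal components from the positions other than the one being tested are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: hosts, viral positions, principal components retained.
n, p, k = 200, 30, 3

# Toy data: binary viral sequence matrix (1 = polymorphism present)
# and a binary indicator of HLA allele carriage for each host.
V = rng.integers(0, 2, size=(n, p)).astype(float)
hla = rng.integers(0, 2, size=n).astype(float)
y = V[:, 0]  # polymorphism at the position under test

# Principal component scores of the remaining positions (assumption:
# the tested position is left out), via SVD of the centred matrix.
rest = V[:, 1:] - V[:, 1:].mean(axis=0)
U, s, _ = np.linalg.svd(rest, full_matrices=False)
pcs = U[:, :k] * s[:k]  # n x k matrix of PC scores

# Covariate block standing in for viral relatedness: intercept plus PCs.
Z = np.column_stack([np.ones(n), pcs])

# (a) Full regression of y on [hla, Z]; keep the coefficient on hla.
X_full = np.column_stack([hla, Z])
beta_full = np.linalg.lstsq(X_full, y, rcond=None)[0][0]


def residualise(a, covariates):
    """Return the residuals of a after least-squares projection on covariates."""
    coef = np.linalg.lstsq(covariates, a, rcond=None)[0]
    return a - covariates @ coef


# (b) Partialled-out regression: remove Z from both y and hla,
# then regress residual on residual.
ry = residualise(y, Z)
rh = residualise(hla, Z)
beta_fwl = (rh @ ry) / (rh @ rh)

print(beta_full, beta_fwl)
```

In the linear case the two estimates agree up to floating-point error, which is the content of the theorem. That exact equivalence does not carry over automatically to logistic regression.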
©2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Articles in the same Issue
- Article
- A New Explained-Variance Based Genetic Risk Score for Predictive Modeling of Disease Risk
- Hessian Calculation for Phylogenetic Likelihood based on the Pruning Algorithm and its Applications
- Cluster-Localized Sparse Logistic Regression for SNP Data
- How to analyze many contingency tables simultaneously in genetic association studies
- Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure
- Estimating the Number of One-step Beneficial Mutations
- Testing clonality of three and more tumors using their loss of heterozygosity profiles
- Correction for Founder Effects in Host-Viral Association Studies via Principal Components
- A Non-Homogeneous Dynamic Bayesian Network with Sequentially Coupled Interaction Parameters for Applications in Systems and Synthetic Biology
- An Integrated Hierarchical Bayesian Model for Multivariate eQTL Mapping
- A Novel and Fast Normalization Method for High-Density Arrays
- Performance of MAX Test and Degree of Dominance Index in Predicting the Mode of Inheritance
- A Bayesian autoregressive three-state hidden Markov model for identifying switching monotonic regimes in Microarray time course data
- QTL Mapping Using a Memetic Algorithm with Modifications of BIC as Fitness Function
- Computing Posterior Probabilities for Score-based Alignments Using ppALIGN