An interactive effect of batch size and composition contributes to discordant results in GWAS with the CHIAMO genotyping algorithm.

Chierici, Marco; Furlanello, Cesare
2010-01-01

Abstract

Discordant results between independent genome-wide association studies (GWAS) point to potential Type I and Type II errors. To identify the sources of variability behind this lack of reproducibility, we present a repeatability experiment on a GWAS cohort of 1991 individuals with coronary artery disease and 1500 controls (National Blood Service), provided by the Wellcome Trust Case Control Consortium. As part of the MicroArray Quality Control project, we identified the quality control (QC) and association analysis steps that have a major impact on the identification of candidate markers for possible classifiers. The CHIAMO calling algorithm was run under different experimental conditions to assess the effects of batch size and case-control composition on downstream association analysis. Both batch composition and batch size produced discordant single-nucleotide polymorphism (SNP) results in QC and statistical analysis, and may therefore contribute to the lack of reproducibility in GWAS. An interaction between batch size and composition contributes to discordant results in significantly associated loci. About 800 significant SNPs (Cochran-Armitage trend test, P < 5.0 × 10^-7) were found for batches of 2000 samples with cases and controls separated, whereas only 14 significant markers were found when all samples formed a single batch.
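For readers unfamiliar with the association statistic named in the abstract, the following is a minimal sketch of a Cochran-Armitage trend test on a single SNP, assuming additive genotype weights (0, 1, 2) and the common N-scaled form of the statistic. It is not the authors' pipeline: the function names and the genotype counts are hypothetical, and only the significance threshold (5.0 × 10^-7) is taken from the abstract.

```python
import math

# Significance threshold quoted in the abstract.
GENOME_WIDE_ALPHA = 5.0e-7


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for one SNP.

    cases / controls: genotype counts, e.g. (AA, AB, BB).
    weights: per-genotype scores; (0, 1, 2) assumes an additive model.
    Returns (chi2, p), with chi2 referred to a 1-df chi-square distribution.
    """
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have equal length")

    totals = [r + s for r, s in zip(cases, controls)]
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_all = n_cases + n_controls

    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * n for w, n in zip(weights, totals))
    sum_w2n = sum(w * w * n for w, n in zip(weights, totals))

    denominator = n_cases * n_controls * (n_all * sum_w2n - sum_wn ** 2)
    if denominator == 0:
        raise ValueError("test undefined: empty group or monomorphic SNP")

    chi2 = n_all * (n_all * sum_wr - n_cases * sum_wn) ** 2 / denominator
    # Upper tail of a 1-df chi-square: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def is_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Apply the strict inequality P < alpha used in the abstract."""
    return p_value < alpha


if __name__ == "__main__":
    # Made-up genotype counts, for illustration only.
    chi2, p = cochran_armitage_trend(cases=(500, 900, 591),
                                     controls=(450, 700, 350))
    print(f"chi2={chi2:.3f} p={p:.3e} significant={is_significant(p)}")
```

Because genotype calls feed directly into the per-genotype counts, any batch-dependent shift in calling between cases and controls changes the counts, and with them the statistic. That is the mechanism the abstract points to when it reports different numbers of significant SNPs across batching schemes.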

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/10588