Read the previous part: Unraveling GWAS: A Researcher's 15-Minute Guide, Part 2
Recent Improvements in GWAS Analysis Methodology
Recent advances in GWAS analysis methodology have focused on improving computational efficiency and statistical power within the mixed linear model framework; see Tibbs Cortes et al. (2021) for details. An alternative framework based on Bayesian statistics has also gained attention; see Stephens and Balding (2009) for more on that approach.
Ongoing Challenges in GWAS Analysis
Several challenges in current GWAS practice remain under active discussion. One is the handling of rare variants, that is, variants with a low minor allele frequency (MAF). Because only a few individuals carry the minor allele, rare variants are prone to producing spuriously low p-values in linear model analysis. This has prompted debate over whether rare variants should be removed before conducting a GWAS. Although some argue for excluding them upfront, rare variants can be truly significant and often exhibit larger effect sizes. Current practice leans towards retaining rare variants and examining them further once they are identified as significant in a GWAS.
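To make the MAF criterion concrete, here is a minimal sketch of computing MAF for a biallelic marker and flagging rare variants for later review instead of dropping them. The 0/1/2 genotype coding, the marker names, and the 0.05 cutoff are illustrative assumptions, not values taken from this article.

```python
def minor_allele_frequency(genotypes):
    """MAF for one biallelic marker.

    Genotypes are assumed to be coded as the count of the alternate
    allele (0, 1 or 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this marker")
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1 - alt_freq)


def flag_rare(markers, maf_threshold=0.05):
    """Return {marker_name: True if MAF is below the (assumed) threshold}."""
    return {
        name: minor_allele_frequency(genos) < maf_threshold
        for name, genos in markers.items()
    }


# Hypothetical genotypes for three markers in twelve individuals.
markers = {
    "snp_a": [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "snp_b": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1, 0, None],
    "snp_c": [2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2],
}

flags = flag_rare(markers)
for name, is_rare in flags.items():
    print(name, round(minor_allele_frequency(markers[name]), 4), is_rare)
```

Flagging the markers instead of filtering them keeps rare variants in the analysis, so any that reach significance can still be examined afterwards.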
Another challenge in GWAS analysis is choosing an appropriate method to control the multiple testing problem. Bonferroni correction was commonly used in earlier GWASs, but it is overly stringent and frequently leads to false negatives. False discovery rate (FDR) control has gained popularity because it balances sensitivity and stringency. However, FDR control in its standard form assumes independence among test statistics, which is not always the case in GWAS, where nearby markers are often correlated through linkage disequilibrium. A further alternative is a permutation test, although this approach can be computationally intensive.
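The difference in stringency between the two corrections can be seen in a small sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR method (the article does not name a specific one), and the ten p-values are made up for illustration.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject a test when p <= alpha / m, where m is the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject every test ranked at or below k.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for ten markers (illustrative only).
pvals = [0.0001, 0.004, 0.012, 0.019, 0.03, 0.2, 0.35, 0.5, 0.7, 0.9]

bonf = bonferroni_reject(pvals)
bh = benjamini_hochberg_reject(pvals)
print("Bonferroni rejections:", sum(bonf))  # 2
print("BH rejections:", sum(bh))            # 4
```

With these made-up values, Bonferroni keeps only the two smallest p-values while Benjamini-Hochberg keeps four, which illustrates the trade-off described above. Neither function accounts for correlation between markers.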
GWAS-Derived Approaches
The remarkable success of GWAS has led to the development of several related approaches. An epigenome-wide association study (EWAS) aims to identify epigenetic variants, particularly DNA methylation at CpG sites, associated with a disease or phenotype. Transcriptome-wide association studies (TWAS) aim to identify gene expression changes linked to a disease or phenotype. Similarly, proteome-wide association studies (PWAS) and metabolome-wide association studies (MWAS) focus on evaluating protein expression changes and metabolite level changes, respectively, associated with a disease or phenotype.
Most recently, the GWAS approach has been extended to the pan-genome level, leading to the emergence of pangenome-wide association studies (Pan-GWAS). Pan-GWAS explores genetic variations beyond the reference genome and incorporates a more comprehensive understanding of genomic diversity.
References
Ozaki K, Ohnishi Y, Iida A, Sekine A, Yamada R, Tsunoda T, Sato H, Sato H, Hori M, Nakamura Y, Tanaka T (2002) Functional SNPs in the lymphotoxin-alpha gene that are associated with susceptibility to myocardial infarction. Nat Genet 32:650-654.
Stephens M, Balding DJ (2009) Bayesian statistical methods for genetic association studies. Nat Rev Genet 10:681-690.
Tibbs Cortes L, Zhang Z, Yu J (2021) Status and prospects of genome-wide association studies in plants. Plant Genome 14:e20077.
VanRaden PM (2008) Efficient methods to compute genomic predictions. J Dairy Sci 91:4414-4423.
Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, Doebley JF, McMullen MD, Gaut BS, Nielsen DM, Holland JB, Kresovich S, Buckler ES (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38:203-208.
About the Author: Justin T. Li received his B.S. in Biophysics from Peking University in 1991, his Ph.D. in Neurobiology from the University of Wisconsin-Madison in 2000, and an M.S. in Computer Science from the University of Houston in 2001. He served as an Assistant Professor at the University of Minnesota Medical School from 2004 to 2009, and as Chief Bioinformatics Officer at LC Sciences in Houston from 2009 to 2013. In June 2013, Justin joined AccuraScience as a Lead Bioinformatician. He has published more than 40 research articles in bioinformatics, computational biology, and related fields. From 2013 to 2022, Justin led a team of bioinformaticians at AccuraScience in the completion of more than 120 bioinformatics research and development projects, including several GWAS, EWAS and TWAS projects. More information about Justin can be found at https://www.accurascience.com/our_team.html.