Transcriptome-wide association studies (TWAS) have become an important tool in functional genomics. Unlike GWAS alone, TWAS incorporates gene expression data to help interpret associations. With the increasing availability of GTEx and other eQTL panels, TWAS is now a popular approach for connecting non-coding variants to gene-level regulatory consequences. In practice, however, many TWAS analyses suffer from major methodological issues, often without the researchers realizing it. We have seen too many projects where the TWAS looked promising at first but collapsed under scrutiny because of wrong assumptions or incorrect implementation.
In this article, we summarize nine major pitfalls we have encountered in TWAS projects from both academic and biotech clients. For each one, we explain why it happens, give an anonymized real-life case, and describe how experienced analysts approach it differently. Our goal is not just to warn about errors but to help you build a solid, reproducible TWAS that can support real biological insight.
TWAS can illuminate gene–trait links - or mislead you badly. We harmonize panels, LD, and models so your hits stand up to review. Request a free consultation →
The Problem
A TWAS combines GWAS summary statistics with an eQTL-derived expression prediction model. When the two are not matched in SNP coverage, genome build, or ancestry, the resulting associations can be invalid.
Why It Happens
Researchers use publicly available eQTL models such as GTEx v8, but their GWAS data may be on a different genome build (hg19 rather than hg38), come from a different genotyping array (e.g., Immunochip, Metabochip), or come from a different population (e.g., an East Asian GWAS paired with European GTEx models). Even slight mismatches in allele strand or SNP ID cause misalignment: SNPs are silently dropped or their effect directions are flipped, and the TWAS results become unpredictable.
Real Example
A client combined summary statistics from a European autoimmune GWAS, using a 1000 Genomes LD reference, with the GTEx v8 whole-blood expression model. More than 30% of the model SNPs were missing or mismatched. Some TWAS hits were driven entirely by proxy SNPs that were not actually present in the GWAS.
What We Do Differently
We harmonize the GWAS and eQTL data before running anything: allele strand, rsID matching, liftover between genome builds, and ancestry-matched LD panels. If necessary, we rebuild the expression models from eQTL data that better match the GWAS cohort (e.g., from eQTLGen or CAGE) to ensure compatibility.
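The allele-alignment part of that harmonization can be sketched in a few lines. This is a minimal illustration, not a description of any particular tool: the function name `harmonize` and its arguments are hypothetical, and it assumes biallelic SNPs that have already been matched by position or rsID on the same genome build. Strand-ambiguous (A/T, C/G) SNPs are simply dropped here, which is a conservative assumption.

```python
# Complement of each base, used to detect strand flips.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonize(gwas_a1, gwas_a2, gwas_z, model_a1, model_a2):
    """Return the GWAS Z-score aligned to the model's effect allele.

    gwas_a1 / model_a1 are the effect alleles, gwas_a2 / model_a2 the other
    alleles. Returns None when the SNP should be dropped (indel, palindromic
    SNP, or alleles that cannot be reconciled).
    """
    g1, g2 = gwas_a1.upper(), gwas_a2.upper()
    m1, m2 = model_a1.upper(), model_a2.upper()

    # Only single-base A/C/G/T alleles are handled in this sketch.
    if any(a not in COMPLEMENT for a in (g1, g2, m1, m2)):
        return None
    # Palindromic SNPs (A/T, C/G) look the same on both strands: drop them.
    if COMPLEMENT[g1] == g2:
        return None

    if (g1, g2) == (m1, m2):
        return gwas_z            # same strand, same effect allele
    if (g1, g2) == (m2, m1):
        return -gwas_z           # same strand, effect allele swapped

    f1, f2 = COMPLEMENT[g1], COMPLEMENT[g2]
    if (f1, f2) == (m1, m2):
        return gwas_z            # opposite strand, same effect allele
    if (f1, f2) == (m2, m1):
        return -gwas_z           # opposite strand, effect allele swapped
    return None                  # alleles do not match at all


if __name__ == "__main__":
    print(harmonize("G", "A", 2.0, "A", "G"))  # swapped effect allele
    print(harmonize("A", "T", 2.0, "A", "T"))  # palindromic, dropped
```

Counting how many model SNPs come back as `None` gives a quick estimate of the coverage loss described in the example above.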
The Problem
The TWAS uses GTEx models from irrelevant tissues, such as liver or testis, even though the phenotype is neurological or immunological. This dilutes the signal and adds noise.
Why It Happens
Analysts run TWAS using all available tissues and report hits from the most significant model, without evaluating whether the tissue is relevant to disease biology. Others pick tissues with the largest sample size, thinking it’s statistically safer.
Real Example
A schizophrenia TWAS identified strong associations with predicted gene expression in testis and esophagus muscularis, but no signal in brain cortex or cerebellum. The authors did not question this, but reviewers flagged the issue immediately.
What We Do Differently
We guide clients to select tissues based on phenotype relevance. If sample size is limited in key tissues (e.g., brain), we use cross-tissue models (UTMOST, S-MultiXcan) or integrate cell-type deconvolution results. Biological plausibility comes before p-values.
Your top TWAS hit may be a tissue artifact. We ensure tissue choice matches disease biology and sample constraints. Request a free consultation →
The Problem
The LD structure used in the TWAS is not matched to the GWAS population. The variance of the predicted expression is then estimated from the wrong covariance matrix, which miscalibrates or inflates the TWAS Z-scores.
Why It Happens
Summary-statistic TWAS needs an LD reference to estimate SNP–SNP correlations. Many pipelines use 1000 Genomes EUR data by default, even when the GWAS cohort is admixed or of East Asian or African ancestry. The TWAS output then becomes misleading.
Real Example
In a metabolic trait study in a Japanese population, the TWAS showed a strong association at a gene locus that did not replicate in an independent cohort. The signal came from the LD mismatch between the GTEx-based weights (European LD) and the Japanese GWAS.
What We Do Differently
We check GWAS ancestry explicitly and always use matching LD reference panels: 1000G EAS, AFR, or even custom-built LD matrices from the client's raw genotypes. When that is not possible, we run sensitivity analyses and warn the client about the interpretability risks.
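To see why the LD panel matters, it helps to write out the usual summary-statistic TWAS statistic, Z = w'z / sqrt(w'Rw), where w are the expression weights, z the GWAS Z-scores, and R the LD (correlation) matrix. The sketch below is illustrative only: it assumes weights defined on standardized genotypes, and the two-SNP weights, Z-scores, and LD values are made up for the demonstration.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-statistic TWAS Z-score: w'z / sqrt(w' R w).

    weights, gwas_z: lists in the same SNP order.
    ld: square correlation matrix (list of lists) in the same SNP order.
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must share SNP order and size")

    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of predicted expression implied by the LD reference.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted-expression variance must be positive")
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    weights = [0.5, 0.5]          # hypothetical expression weights
    gwas_z = [3.0, 3.0]           # hypothetical GWAS Z-scores
    ld_matched = [[1.0, 0.9], [0.9, 1.0]]      # LD in the GWAS population
    ld_mismatched = [[1.0, 0.2], [0.2, 1.0]]   # LD from the wrong panel
    print(twas_z(weights, gwas_z, ld_matched))
    print(twas_z(weights, gwas_z, ld_mismatched))
```

In this toy case the same GWAS Z-scores give a TWAS Z of about 3.08 with the matched panel and about 3.87 with the mismatched one, purely because the reference panel understates the correlation between the two SNPs.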
The Problem
Some genes included in TWAS models have very low predictive performance (cross-validated R² near zero), meaning genotype explains almost none of their expression. They can still yield significant p-values because of noise or model overfitting.
Why It Happens
Public model sets used with tools like FUSION or S-PrediXcan can include genes whose expression is only weakly predicted from genotype. Analysts often report all hits without filtering for model quality.
Real Example
One group reported a top TWAS hit for an uncharacterized gene on chr7, but the R² of its GTEx model was only 0.01. Follow-up showed the signal was driven by 3 SNPs with weak weights in high LD, and it did not replicate.
What We Do Differently
We filter TWAS results by prediction performance (cross-validated R² ≥ 0.1 or FDR-adjusted model q-values). For low-quality models, we suppress reporting or clearly label them as exploratory. We favor sparse models with interpretable weights.
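A minimal sketch of such a quality filter is shown below. The field names (`cv_r2`, `model_p`), the 0.05 q-value cutoff, and the choice to require both criteria at once are assumptions made for this illustration; the q-values are computed with the standard Benjamini-Hochberg procedure.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def label_models(models, min_r2=0.1, max_q=0.05):
    """Label each gene model 'reportable' or 'exploratory'.

    models: list of dicts with hypothetical keys 'gene', 'cv_r2' (cross-
    validated R^2) and 'model_p' (p-value of the prediction model itself).
    """
    qvals = bh_qvalues([m["model_p"] for m in models])
    labels = {}
    for model, q in zip(models, qvals):
        passes = model["cv_r2"] >= min_r2 and q <= max_q
        labels[model["gene"]] = "reportable" if passes else "exploratory"
    return labels


if __name__ == "__main__":
    models = [
        {"gene": "GENE_A", "cv_r2": 0.25, "model_p": 1e-6},
        {"gene": "GENE_B", "cv_r2": 0.01, "model_p": 0.04},
        {"gene": "GENE_C", "cv_r2": 0.15, "model_p": 0.5},
    ]
    print(label_models(models))
```

Running the filter before looking at the TWAS p-values avoids the temptation to rescue an attractive hit whose model fails the quality criteria.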
Not all gene prediction models are suitable for your trait. TWAS signals can vanish or reverse when models built on irrelevant tissues are used. We help select and justify tissue sources, correct for covariance artifacts, and re-estimate effects using biologically coherent priors. Ask us about tissue matching →
The Problem
Population stratification that is not fully controlled in the GWAS leads to spurious TWAS hits that cannot be replicated or interpreted.
Why It Happens
Even if the GWAS was adjusted for principal components (PCs), residual structure can remain. TWAS aggregates multiple SNPs, making it more sensitive to subtle stratification effects, especially if the expression prediction model amplifies them.
Real Example
In a cardiometabolic study, all TWAS hits clustered on chromosome 17. Further analysis showed that allele frequencies in this region were stratified because of a known inversion polymorphism, not because of disease biology.
What We Do Differently
We test for stratification-induced inflation using LD score regression intercepts and genomic control lambda. We also perform permutation TWAS or randomization tests to assess robustness. When possible, we validate TWAS signals using replication cohorts.
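The genomic control lambda mentioned above is simple to compute: it is the median of the observed chi-square statistics (Z squared) divided by the median of a 1-degree-of-freedom chi-square distribution, about 0.4549. The sketch below uses only the standard library; the simulated Z-scores in the demo are synthetic and stand in for real GWAS or TWAS output.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_lambda(z_scores):
    """Genomic control inflation factor from a list of Z-scores."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Synthetic "null" Z-scores: evenly spaced normal quantiles.
    n = 10001
    normal = statistics.NormalDist()
    null_z = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    inflated_z = [1.2 * z for z in null_z]  # uniformly inflated statistics

    print(genomic_lambda(null_z))      # close to 1 under the null
    print(genomic_lambda(inflated_z))  # clearly above 1
```

A lambda well above 1 does not by itself distinguish stratification from polygenicity, which is why it is paired with the LD score regression intercept in the workflow described above.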
The Problem
TWAS highlights genes whose predicted expression correlates with trait-associated SNPs, but many of these genes are not causal. They are merely co-regulated with, or in LD with, the true drivers.
Why It Happens
TWAS is a correlational method. If two genes in a region share regulatory SNPs, both may appear as TWAS hits. But only one might be functionally relevant. This misleads interpretation and leads to wrong biological conclusions.
Real Example
In an autoimmune TWAS, both IL10 and a nearby lncRNA showed strong signals. Follow-up experiments confirmed that only IL10 was functionally involved. The lncRNA was co-regulated through a shared enhancer.
What We Do Differently
We complement TWAS with fine-mapping tools: FOCUS, which assigns each gene in a region a posterior probability of being causal, and SuSiE, which fine-maps the underlying variants. We integrate expression correlation matrices to assess redundancy between neighboring genes. If needed, we rerun the TWAS after conditioning on the top signal.
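The conditioning step can be illustrated with a simple two-gene approximation. If two genes have TWAS Z-scores z1 and z2 and their predicted expression has correlation r (computed from the weights and the LD matrix), the Z-score of gene 2 after conditioning on gene 1 is (z2 - r*z1) / sqrt(1 - r^2). This is a sketch under the assumption that the two Z-scores are jointly normal with unit variance; the function names and the numbers in the demo are hypothetical, and dedicated tools implement fuller joint models.

```python
import math


def predicted_expression_corr(w1, w2, ld):
    """Correlation between two genes' predicted expression: w1'Rw2 scaled."""
    n = len(ld)

    def quad(a, b):
        return sum(a[i] * ld[i][j] * b[j] for i in range(n) for j in range(n))

    return quad(w1, w2) / math.sqrt(quad(w1, w1) * quad(w2, w2))


def conditional_z(z_target, z_top, r):
    """Z-score of the target gene after conditioning on the top gene."""
    if abs(r) >= 1.0:
        raise ValueError("genes are perfectly correlated; cannot condition")
    return (z_target - r * z_top) / math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical neighbouring genes with highly correlated predictions.
    z_top, z_neighbour, r = 6.0, 5.5, 0.9
    print(conditional_z(z_neighbour, z_top, r))
```

In this toy case a neighbor with a marginal Z of 5.5 drops to roughly 0.23 after conditioning, the pattern one would expect for a co-regulated passenger gene.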
Many TWAS hits reflect LD, not biology. We help separate true gene-level regulation from LD hitchhiking and colocalization artifacts. Our TWAS bioinformatics support includes conditional modeling, fine-mapping, and integration with eQTL data from matched populations. See how we validate TWAS hits →
The Problem
Analysts treat every TWAS hit as a novel insight without evaluating the credible set or checking whether the signal simply reflects known GWAS SNPs.
Why It Happens
Many studies don’t use TWAS fine-mapping tools and skip the step of comparing TWAS results with standard GWAS loci. They assume TWAS adds new information - but often it just redistributes known GWAS effects.
Real Example
A cancer study reported 14 TWAS genes near the MYC locus. But all of them traced to the same GWAS SNP cluster. No fine-mapping was performed, and claims of novel biology were unfounded.
What We Do Differently
We run FOCUS to identify credible gene sets. We report overlap with known GWAS peaks and test how much the TWAS model adds beyond the SNP-level associations. We label redundant hits and focus only on truly informative signals.
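The idea of a credible gene set can be shown with a simplified calculation: rank the genes in a region by posterior probability and keep adding genes until the cumulative probability reaches the chosen level. The sketch below is a simplification for illustration and is not the FOCUS procedure itself; it assumes the input values can be normalized into a single posterior over the candidate genes, and the gene names and probabilities are invented.

```python
def credible_set(posteriors, level=0.9):
    """Smallest set of genes whose normalized posterior mass reaches `level`.

    posteriors: dict mapping gene name -> posterior probability (or any
    non-negative score proportional to it).
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior values must sum to a positive number")

    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for gene, prob in ranked:
        chosen.append(gene)
        cumulative += prob / total
        if cumulative >= level:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical region with many TWAS-significant genes.
    region = {"GENE_A": 0.85, "GENE_B": 0.08, "GENE_C": 0.04, "GENE_D": 0.03}
    print(credible_set(region, level=0.9))
```

A region with many significant genes but a credible set of one or two is the typical signature of a single GWAS signal being redistributed across neighbors.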
The Problem
TWAS and colocalization are conflated. Analysts claim a gene is causal based on a TWAS p-value but do not check whether the GWAS and eQTL signals actually colocalize.
Why It Happens
Tools like S-PrediXcan or FUSION do not automatically test for colocalization. Some researchers believe a strong TWAS signal proves gene–trait causality, but this is not true if the GWAS and eQTL peaks are separate.
Real Example
In a brain-related TWAS, gene A showed strong association. But coloc analysis revealed the GWAS peak was independent from the eQTL. The real eQTL was in a nearby region, affecting another gene B.
What We Do Differently
We always run a colocalization analysis (e.g., with coloc, eCAVIAR, or enloc) to test for a shared causal variant between the GWAS and eQTL signals. We report PP4, coloc's posterior probability that both traits share one causal variant, and interpret TWAS hits as causal only if colocalization is supported.
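For readers who want to see where PP4 comes from, the sketch below combines per-SNP approximate Bayes factors for the two traits into posterior probabilities for the five coloc hypotheses (H0: no association, H1: GWAS only, H2: eQTL only, H3: two distinct causal variants, H4: one shared causal variant). It assumes at most one causal variant per trait in the region, takes the Bayes factors as given, and uses prior values equal to the defaults of the coloc R package. The Bayes factors in the demo are invented, and production code would work in log space for numerical stability.

```python
def coloc_posteriors(abf_gwas, abf_eqtl, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP Bayes factors.

    abf_gwas, abf_eqtl: approximate Bayes factors (not logs) for the same
    SNPs in the same order.
    """
    if len(abf_gwas) != len(abf_eqtl):
        raise ValueError("both traits must cover the same SNPs")

    s1 = sum(abf_gwas)
    s2 = sum(abf_eqtl)
    s12 = sum(a * b for a, b in zip(abf_gwas, abf_eqtl))

    support = {
        "H0": 1.0,
        "H1": p1 * s1,
        "H2": p2 * s2,
        "H3": p1 * p2 * (s1 * s2 - s12),  # different SNPs for each trait
        "H4": p12 * s12,                  # same SNP for both traits
    }
    total = sum(support.values())
    return {h: v / total for h, v in support.items()}


if __name__ == "__main__":
    shared = coloc_posteriors([1e6, 1.0, 1.0], [1e6, 1.0, 1.0])
    distinct = coloc_posteriors([1e6, 1.0, 1.0], [1.0, 1.0, 1e6])
    print(round(shared["H4"], 4), round(distinct["H3"], 4))
```

The second call mirrors the example above: both traits have a strong signal in the region, but at different SNPs, so the posterior mass goes to H3 rather than H4.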
The Problem
TWAS findings are presented as final results, without replication in independent datasets, or any experimental validation to support gene function.
Why It Happens
TWAS results can look statistically impressive, and some studies stop there. But reviewers often question robustness. Without replication, results remain hypothesis-generating only.
Real Example
A published TWAS in neurodegeneration reported several novel gene associations. But follow-up in an independent GWAS with larger sample size showed no signal for most of them. The paper received harsh criticism post-publication.
What We Do Differently
We assist clients in validating TWAS hits in independent GWAS (e.g., UK Biobank, FinnGen), or alternative eQTL panels. We help design follow-up experiments such as CRISPR perturbation or expression manipulation in disease-relevant cell lines.
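A basic replication check for the independent-GWAS step can be sketched as follows. The criteria used here (same direction of effect, and a two-sided replication p-value below a Bonferroni threshold over the number of hits carried forward) are a common convention and an assumption of this example, as are the function names and the Z-scores in the demo.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def replicates(discovery_z, replication_z, n_tested, alpha=0.05):
    """True if the hit has a consistent direction and passes Bonferroni."""
    same_direction = discovery_z * replication_z > 0
    return same_direction and two_sided_p(replication_z) < alpha / n_tested


if __name__ == "__main__":
    # Hypothetical (discovery Z, replication Z) pairs for three TWAS hits.
    hits = {"GENE_A": (5.0, 4.0), "GENE_B": (5.2, -3.5), "GENE_C": (-4.8, -1.0)}
    for gene, (z_disc, z_rep) in hits.items():
        print(gene, replicates(z_disc, z_rep, n_tested=len(hits)))
```

Only the first gene passes in this toy example: the second has a significant replication Z in the opposite direction, and the third has a consistent direction but no statistical support.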
TWAS can bridge the gap between GWAS and gene function - but only if done right. Misalignment in reference panels, wrong tissue models, LD mismatch, and lack of fine-mapping can all create misleading signals. Many TWAS studies produce impressive lists of hits that cannot be interpreted or reproduced. This is a dangerous situation for translational science.
If you want your TWAS analysis to truly guide downstream experiments and support biological discovery, we recommend addressing the pitfalls above with full rigor. A solid TWAS project is not about running a single tool - it is about harmonization, validation, and careful biological thinking at every step.
Missed isoform-level regulation can hide key targets. Many TWAS models ignore transcript-level specificity. We re-analyze key loci with isoform-resolved prediction models and integrate sQTL and transcriptomic splicing data to uncover hidden signals missed by default pipelines. Ask about isoform-aware TWAS →