RNA-seq Data Analysis (9/21/2014)


Sun, 09/21/2014

Customer asks a general question about what methods should be used to analyze RNA-seq data.

Sun, 09/21/2014 at 9:26 AM

AccuraScience LB: If the samples come from a species for which a reference genome is available (e.g., human or mouse), the standard RNA-seq analysis procedure includes (1) quality control of the sequencing data, (2) mapping the sequencing reads to the reference genome with TopHat, which allows exon-exon junctions to be detected, and (3) running the Cufflinks pipeline to quantify gene expression and, often, to perform differential expression analysis. The Cufflinks pipeline also supports analysis of alternative splicing isoforms. More "advanced" analyses of RNA-seq data include identifying novel transcripts, assessing the coding potential of those novel transcripts in order to identify long non-coding RNAs, and pathway analysis following the differential expression analysis in order to identify biological pathways enriched among the differentially expressed genes.
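As a rough illustration of steps (1) to (3), the sketch below only assembles the command lines for one paired-end sample and does not run anything. Several things here are assumptions rather than part of the reply above: FastQC as the quality-control tool (no QC tool is named), the helper names build_rnaseq_commands and build_cuffdiff_command, all file and directory names, and the choice to quantify against an existing annotation with -G. Options should be checked against the manuals of the installed TopHat and Cufflinks versions.

```python
import shlex


def build_rnaseq_commands(sample, reads_1, reads_2, genome_index,
                          annotation_gtf, threads=4):
    """Return argument lists for QC, alignment and quantification.

    Hypothetical helper: it builds commands only, so they can be
    inspected before being passed to subprocess.run or a job scheduler.
    """
    qc_dir = f"{sample}_fastqc"        # must exist before FastQC is run
    tophat_dir = f"{sample}_tophat"
    cufflinks_dir = f"{sample}_cufflinks"
    return [
        # (1) Quality control (FastQC is an assumed choice of tool).
        ["fastqc", "-o", qc_dir, reads_1, reads_2],
        # (2) Spliced alignment with TopHat; -G supplies known transcripts.
        ["tophat", "-p", str(threads), "-G", annotation_gtf,
         "-o", tophat_dir, genome_index, reads_1, reads_2],
        # (3) Expression quantification with Cufflinks on TopHat's BAM output.
        ["cufflinks", "-p", str(threads), "-G", annotation_gtf,
         "-o", cufflinks_dir, f"{tophat_dir}/accepted_hits.bam"],
    ]


def build_cuffdiff_command(annotation_gtf, bams_a, bams_b,
                           labels=("A", "B"), out_dir="cuffdiff_out",
                           threads=4):
    """Return the argument list for a two-condition Cuffdiff comparison."""
    return ["cuffdiff", "-o", out_dir, "-p", str(threads),
            "-L", ",".join(labels), annotation_gtf,
            ",".join(bams_a), ",".join(bams_b)]


if __name__ == "__main__":
    # Placeholder inputs; replace with real paths.
    for cmd in build_rnaseq_commands("ctrl1", "ctrl1_1.fastq", "ctrl1_2.fastq",
                                     "index/genome", "genes.gtf"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell-quoting problems and makes it easy to log exactly what was run for each sample.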

If the samples come from a non-model organism without a reliable reference genome, de novo assembly of the transcriptome has to be performed. Multiple tools are available for this purpose, but in our experience Trinity works best. Following de novo assembly, functional annotation of the transcribed genes can be attempted.
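For the de novo route, a minimal sketch of a Trinity invocation for paired-end FASTQ reads might look like the following. The helper name build_trinity_command, the file names, and the CPU and memory values are placeholders chosen for illustration. The --max_memory option is the spelling used by more recent Trinity releases; older releases used a different memory option, so the installed version's help output should be consulted.

```python
import shlex


def build_trinity_command(reads_1, reads_2, output_dir="trinity_out",
                          cpus=4, max_memory="20G"):
    """Return the argument list for a paired-end Trinity assembly.

    Hypothetical helper: it only builds the command; running it
    requires Trinity to be installed and on the PATH.
    """
    return [
        "Trinity",
        "--seqType", "fq",          # input reads are in FASTQ format
        "--left", reads_1,          # first mates of the read pairs
        "--right", reads_2,         # second mates of the read pairs
        "--CPU", str(cpus),
        "--max_memory", max_memory,
        "--output", output_dir,
    ]


if __name__ == "__main__":
    # Placeholder inputs; replace with real paths.
    print(shlex.join(build_trinity_command("reads_1.fq", "reads_2.fq")))
```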

If other types of data are available (e.g., ChIP-seq and resequencing data, including public-domain data not generated in your own lab), there is the potential to develop an integrated analysis of multiple data types, which could generate insights that are not possible from any single data type alone. We would need to know more about your project, including its objectives and what related data might be available, in order to suggest what other analyses could be attempted.


Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.

Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and the AccuraScience data analysis team on the specified dates and times. The editing was done to protect the customer’s privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.