RNA-seq and Exome Sequencing Analysis for Cancer Samples (9/24/2014)


Tue, 09/23/2014

Tue, 09/23/2014 at 4:10 PM

Customer: Could you do the following analysis: (1) RNA-Seq: We will sequence sarcoma tissue from 12 patients on HiSeq. We are looking for cancer-related alterations associated with response to chemoradiotherapy. Analysis: QC, mapping onto the reference genome, differential expression analysis. We are also interested in prices for alternative splicing analysis, SNP variant calling and pathway analysis. (2) Exome alignment: We will sequence the exomes of 10 prostate cancer patients on HiSeq. Analysis: QC, mapping onto the human reference genome, selection of SNPs, and CNV analysis. We want variants, read depth, and separation of variants into known, novel and potentially pathogenic classes.

Wed, 09/24/2014 at 3:22 PM

AccuraScience LB: RNA-seq analysis - up to the point of differential expression analysis - is considered Track 1, routine analysis (http://www.accurascience.com/pricing.html). SNV calling, splice variant analysis and pathway analysis (we most often use GO-defined pathways; if you have other requirements please let us know) can all be done. I am sure you are aware that RNA-seq data differ from resequencing data in that the depth of coverage varies across regions with expression level, so SNV calling for lowly expressed genes is not as accurate as for highly expressed genes.
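To illustrate the point about coverage, here is a minimal sketch of separating SNV calls by read depth so that calls in lowly covered regions can be treated with more caution. The record layout (a dict with a `depth` field), the function name and the cutoff of 10 reads are assumptions made for illustration only, not part of any specific pipeline.

```python
def split_by_depth(calls, min_depth=10):
    """Split SNV calls into well-covered and low-coverage groups.

    calls: list of dicts, each with an integer 'depth' field (hypothetical layout).
    min_depth: illustrative cutoff; a real analysis would tune this.
    """
    confident = [c for c in calls if c["depth"] >= min_depth]
    low_coverage = [c for c in calls if c["depth"] < min_depth]
    return confident, low_coverage


# Toy data: gene names and depths are invented for the example.
calls = [
    {"gene": "GENE_A", "pos": 101, "depth": 250},  # highly expressed gene
    {"gene": "GENE_B", "pos": 202, "depth": 4},    # lowly expressed gene
]
confident, low_coverage = split_by_depth(calls)
```

Calls in the low-coverage group are not necessarily wrong; they simply rest on fewer reads and may deserve separate review.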

For exome sequencing analysis, the work up to the point of SNV calling is considered Track 1 routine analysis. We have experience with some methods developed for calling CNVs from deep sequencing data (e.g., BreakDancer, VariationHunter and CNVnator), but their accuracy falls far behind that of SNV calling: typically, the results produced by two CNV calling tools overlap by only ~20%, and that is for whole-genome sequencing data. CNV calling is even more challenging for exome sequencing data, because the variation in depth of coverage across regions is much higher than for whole-genome sequencing data; the coverage distribution resembles an exponential distribution. We are aware of some recently developed methods designed to address this difficulty, but have not tried them ourselves. Thus there is uncertainty involved in this part of the work.
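The agreement between two CNV call sets mentioned above can be measured in several ways. The sketch below uses reciprocal overlap, one common convention, to count how many calls from one tool are matched by a call from another. The interval layout `(chrom, start, end)`, the function names and the 50% threshold are assumptions for illustration; this is not the specific procedure behind the ~20% figure.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals a and b.

    a, b: (chrom, start, end) tuples with half-open coordinates.
    Returns 0.0 when the intervals are on different chromosomes or disjoint.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def concordance(calls_a, calls_b, min_ro=0.5):
    """Fraction of calls in calls_a matched by at least one call in calls_b."""
    if not calls_a:
        return 0.0
    matched = sum(
        1 for a in calls_a
        if any(reciprocal_overlap(a, b) >= min_ro for b in calls_b)
    )
    return matched / len(calls_a)


# Toy call sets from two hypothetical tools.
tool_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 300)]
tool_b = [("chr1", 120, 210), ("chr2", 1000, 2000)]
fraction_matched = concordance(tool_a, tool_b)  # one of three calls matches
```

Note that this measure is asymmetric: the fraction of tool A calls matched by tool B can differ from the reverse, so both directions are usually reported.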

Finally, for the annotation of the variants, many researchers are interested in running several tools to predict the "deleteriousness" of the mutations identified. The tools we use most often include ANNOVAR, SIFT, PolyPhen2 and MutationTaster. Many people look at the results of multiple tools because each tool has its own strengths and limitations. You could take a look at these papers http://www.ncbi.nlm.nih.gov/pubmed/22604720, http://www.ncbi.nlm.nih.gov/pubmed/22495306 and http://www.ncbi.nlm.nih.gov/pubmed/23201682 to see how they use the results produced by multiple tools. As for "known" vs "novel" annotation, this is a little tricky, because recent studies suggest that for each newly sequenced individual, about 5-15% of the variants identified are known, and the remaining ones are novel - most of the latter category are rare or private mutations. Thus annotating "known" vs "novel" variants is a little like shooting at a moving target. To annotate each identified variant accurately as "known" or "novel", one would need to analyze a large population dataset (e.g., the 1000 Genomes Project data) to obtain a more complete list of currently "known" variants. Many cancer researchers are also interested in identifying "recurrent" mutations that occur frequently in cancer samples. We have a lot of experience with this as well.
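As a rough sketch of the "known" vs "novel" labeling and the search for "recurrent" mutations described above, the example below matches variants by exact chromosome, position and alleles against a set of known variants, then counts how many samples carry each variant. The record layout, function names and toy data are assumptions for illustration; a real analysis would build the known set from a population resource and would need to normalize variant representation first.

```python
from collections import Counter


def variant_key(v):
    """Identify a variant by chromosome, position, reference and alternate allele."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def label_known_novel(variants, known_keys):
    """Return 'known' or 'novel' for each variant, by exact key match."""
    return ["known" if variant_key(v) in known_keys else "novel" for v in variants]


def recurrent_variants(per_sample, min_samples=2):
    """Count samples carrying each variant; keep those seen in >= min_samples.

    per_sample: dict mapping sample name to a list of variant dicts.
    """
    counts = Counter()
    for variants in per_sample.values():
        # A set ensures each variant is counted once per sample.
        counts.update({variant_key(v) for v in variants})
    return {key: n for key, n in counts.items() if n >= min_samples}


# Toy data: sample names, positions and alleles are invented.
v1 = {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G"}
v2 = {"chrom": "chr2", "pos": 2000, "ref": "C", "alt": "T"}
per_sample = {"patient_1": [v1, v2], "patient_2": [v1]}
known_keys = {variant_key(v2)}

labels = label_known_novel([v1, v2], known_keys)
recurrent = recurrent_variants(per_sample)
```

Because the "known" list keeps growing as more populations are sequenced, the labels depend on the version of the known set used, which is worth recording alongside the results.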

Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.

Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and AccuraScience data analysis team at specified dates and times. The editing was made to protect the customer’s privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.