Wed, 11/20/2013
Customer calls and says that he has a set of Affymetrix gene expression array data for a few time points in development, and asks AccuraScience for ideas on how, in general terms, these data may be analyzed to produce useful hypotheses or results.
Wed, 11/20/2013 at 8:49 PM
AccuraScience LB: In general terms, your data may be analyzed in a few different ways:
1) Differential expression (DE) analysis. Between two consecutive time points (e.g. 2h vs. 4h after cell differentiation) or two different conditions (e.g. control vs. treatment), DE analysis detects genes whose expression differs to a statistically significant degree. This is usually the first step of microarray/RNA-seq analysis and is used in most expression studies.
2) Time series analysis. Microarray/RNA-seq experiments are also widely used for exploring temporal gene expression dynamics, for example during sporulation in budding yeast or during cardiac differentiation. Compared with simple DE analysis, time series analysis can better capture global expression dynamics and provide valuable information that pairwise comparisons usually miss. Clustering techniques such as fuzzy c-means can group genes with similar temporal expression patterns together and separate genes with distinct patterns into different groups.
3) Gene set enrichment analysis. Simple DE analysis or time series analysis can give a few lists of interesting genes. For DE analysis, the gene list contains the genes that are either up- or down-regulated between two conditions; for time series analysis, the gene lists come naturally from the clustering procedure. Gene set enrichment analysis (GSEA) can estimate whether or not a specific set of genes is statistically enriched in these gene lists. The gene set is usually predefined and could be, for example, a biological process (e.g. Wnt signaling pathway), transcription factor binding sites (e.g. Sox2 binding sites), microRNA targets (e.g. mir-13a), or lung cancer-related genes. GSEA is a powerful tool for linking raw results from microarray/RNA-seq experiments to available biological knowledge, and it helps biologists better understand and explain the experimental results.
4) Network/pathway analysis. Besides GSEA, another popular technique for analyzing temporal expression data is network-based analysis. Genes showing correlated expression patterns often regulate each other through certain signaling pathways or interact physically. A first step is to determine, from available knowledge, which of them are known to interact with each other directly or indirectly. For example, regeneration-related genes obtained from temporal expression clustering can be mapped to known signaling pathways. More advanced techniques can also be used to model dynamic regulation and to infer unknown regulatory relationships. For example, a gene network inferred from time-series microarray data of chronic lymphocytic leukemia successfully predicted the outcome of an intervention.
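To make item 1) more concrete, here is a minimal sketch of a DE comparison between two time points, using only the Python standard library. Everything in it is an assumption made for illustration: the gene names and log2 intensities are invented, and the exact permutation test stands in for the moderated statistics that a dedicated microarray package would normally provide. The Benjamini-Hochberg step corrects for testing many genes at once.

```python
from itertools import combinations
from statistics import mean


def permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    n_total, n_a = len(pooled), len(group_a)
    pooled_sum = sum(pooled)
    hits = 0
    splits = 0
    # Enumerate every way of relabelling the samples into the two groups.
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (pooled_sum - sum_a) / (n_total - n_a))
        # Small tolerance so the observed labelling always counts as a hit.
        if diff >= observed - 1e-9:
            hits += 1
        splits += 1
    return hits / splits


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted_values[i] = running_min
    return adjusted_values


# Hypothetical log2 intensities: 4 replicates at 2h, then 4 replicates at 4h.
expression = {
    "geneA": ([7.1, 7.0, 7.2, 6.9], [9.0, 9.2, 8.9, 9.1]),
    "geneB": ([8.0, 8.3, 7.9, 8.1], [8.1, 8.0, 8.2, 7.9]),
    "geneC": ([10.2, 10.0, 10.1, 10.3], [8.1, 8.0, 8.3, 7.9]),
}
genes = list(expression)
# On the log2 scale, the difference of means is the log2 fold change.
log2_fc = {g: mean(expression[g][1]) - mean(expression[g][0]) for g in genes}
raw_p = [permutation_pvalue(*expression[g]) for g in genes]
adjusted = dict(zip(genes, benjamini_hochberg(raw_p)))
significant = [g for g in genes if adjusted[g] < 0.05]

for g in genes:
    print(f"{g}: log2FC={log2_fc[g]:+.2f}  adj.p={adjusted[g]:.3f}")
print("DE genes at FDR 0.05:", significant)
```

With only a handful of replicates a permutation test has coarse resolution (the smallest possible p-value here is 2/70), which is one reason real analyses usually rely on methods that borrow information across genes.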
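The fuzzy c-means clustering mentioned in item 2) can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the procedure of any particular package: the time courses are invented, each profile is z-scored first so that genes are grouped by the shape of their pattern rather than by absolute level, and the fuzzifier `m=2.0`, the number of clusters, and the random seed are arbitrary choices.

```python
import random


def standardise(profile):
    """Z-score one gene's time course (mean 0, standard deviation 1)."""
    mu = sum(profile) / len(profile)
    sd = (sum((x - mu) ** 2 for x in profile) / len(profile)) ** 0.5
    return [(x - mu) / sd for x in profile]


def fuzzy_c_means(profiles, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length profiles."""
    rng = random.Random(seed)
    n, dim = len(profiles), len(profiles[0])
    # Random initial memberships, normalised so each gene's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Cluster centers are membership-weighted means of the profiles.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * profiles[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Memberships are updated from squared distances to each center.
        shift = 0.0
        for i in range(n):
            d2 = [
                max(sum((profiles[i][d] - c[d]) ** 2 for d in range(dim)), 1e-12)
                for c in centers
            ]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


# Hypothetical log2 expression at four time points.
time_course = {
    "up1": [5.0, 6.1, 7.0, 8.2],
    "up2": [6.0, 6.9, 8.1, 9.0],
    "up3": [4.1, 5.0, 6.2, 6.9],
    "down1": [9.0, 8.1, 6.9, 6.0],
    "down2": [8.2, 7.0, 6.1, 5.0],
    "down3": [7.1, 6.0, 4.9, 4.1],
}
names = list(time_course)
data = [standardise(time_course[g]) for g in names]
centers, memberships = fuzzy_c_means(data, n_clusters=2)
# Hard label = cluster with the highest membership for that gene.
labels = {
    g: max(range(2), key=lambda k: row[k]) for g, row in zip(names, memberships)
}

for g, row in zip(names, memberships):
    print(g, "cluster", labels[g], "memberships", [round(v, 3) for v in row])
```

Unlike hard clustering, each gene receives a graded membership in every cluster, so genes with ambiguous patterns can be flagged (low maximum membership) instead of being forced into one group.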
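The enrichment question in item 3), whether a predefined gene set is over-represented in a gene list, can be illustrated with a one-sided hypergeometric test. Note that this is a simple over-representation test, not the rank-based GSEA algorithm, and that the gene identifiers, the set names, and the 20-gene universe below are all hypothetical.

```python
from math import comb


def enrichment_pvalue(n_universe, n_in_set, n_in_list, n_overlap):
    """P(overlap >= n_overlap) if the gene list were drawn at random."""
    max_overlap = min(n_in_set, n_in_list)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_in_list - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / comb(n_universe, n_in_list)


# Hypothetical universe of 20 genes measured on the array.
universe = {f"g{i}" for i in range(1, 21)}
# Hypothetical predefined gene sets (e.g. a pathway annotation).
gene_sets = {
    "hypothetical_wnt_set": {"g1", "g2", "g3", "g4", "g5"},
    "hypothetical_other_set": {"g12", "g13", "g14", "g15", "g16"},
}
# Hypothetical gene list coming from DE analysis or from one cluster.
gene_list = {"g1", "g2", "g3", "g4", "g10", "g11"}

results = {}
for name, members in gene_sets.items():
    overlap = len(members & gene_list)
    results[name] = enrichment_pvalue(
        len(universe), len(members), len(gene_list), overlap
    )
    print(f"{name}: overlap={overlap}  p={results[name]:.4f}")

p_wnt = results["hypothetical_wnt_set"]
p_other = results["hypothetical_other_set"]
```

When many gene sets are tested, the resulting p-values need a multiple-testing correction as well, and the choice of universe (all genes on the array rather than all genes in the genome) affects the result.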
Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.
Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and the AccuraScience data analysis team at the specified dates and times. The editing was done to protect the customer’s privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.