Bacterial Sequencing Study (1/9/2014)


Wed, 01/08/2014

Customer is an FDA researcher and asks about sequencing a few strains of a bacterial species.

Wed, 01/08/2014 at 4:31 PM

AccuraScience LB: Although we do not perform sequencing experiments in our own facility, we work with several very reliable partner companies that we could recommend for sequencing your bacterial strains. The cost of sequencing will depend on the number of strains you want to sequence together. If there are only 1-2 strains, we would recommend Illumina's MiSeq sequencer, which was designed to sequence individual small, bacterium-size genomes. However, if the number of strains is larger, it would be considerably more cost-effective to sequence them in a multiplexed manner (with barcoding) on Illumina's HiSeq sequencers, whose per-base cost could be 10X lower than that of the MiSeq. If you could tell me the number of strains you want sequenced, I will help you look up the options and costs.

After the sequencing experiments are performed, AccuraScience could help you analyze the data. Since the complete genome of another strain of the bacterium is available, our analysis would include (1) quality control of the sequencing data, (2) mapping of the reads to the completed genome of the existing strain, and (3) identifying and documenting the variants (SNVs and indels) between each newly sequenced strain and the existing strain.

If, in your expert opinion, there may be larger-scale structural differences between the new strains and the strain that has already been sequenced and assembled, then the analysis of the new data may also include identification of structural variations or copy number variations. You also mentioned gene annotation: here, we could annotate novel genes based on their predicted functions.

Thu, 01/09/2014 at 2:28 PM

Customer: We have generated many different strains of this bacterial species. We are interested in finding out which changes have occurred in the genes that have made them biologically different from their wild types. We are specifically interested in changes in the gene sequence that have occurred in two of the isolates in comparison to their wild type. At this point we need to sequence only 3 different strains: one wild type and two resistant mutants.

Thu, 01/09/2014 at 8:52 PM

AccuraScience LB: The quantity of sequencing data required for your study is so small that it might be more sensible to sequence a larger number of strains together. Here is my calculation: in order to reliably identify most SNVs and indels in a strain, 50-60X coverage is recommended. The genome size for your bacterial species is about 3 Mb. If we do paired-end sequencing with 100-base reads, the total number of reads required for each strain is about 3,000,000*60/(100*2) = 0.9 million. One HiSeq lane would produce 120-150 million reads, which means >120 strains can all be sequenced in a single lane. This in turn means that the effort and cost of sample preparation could far exceed the cost of the sequencing experiment itself: one HiSeq lane would cost $3000-4000 to run, which translates to only about $30 per strain if you have 120 strains to sequence together. In contrast, the sample preparation and barcoding cost could be on the order of $300 per strain...
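
For readers who want to rerun the arithmetic in the message above with their own numbers, here is a minimal Python sketch. It is not part of the original exchange. The function names are hypothetical, and the inputs (3 Mb genome, 60X coverage, 100-base paired-end reads, 120-150 million reads and $3000-4000 per lane) are simply the figures quoted in the message. Like the message, it counts one paired-end fragment (2 x 100 bases) as one read.

```python
import math


def reads_per_strain(genome_size_bp, coverage, read_length_bp, paired=True):
    """Reads (paired-end fragments if paired=True) needed to reach the target coverage."""
    bases_per_read = read_length_bp * (2 if paired else 1)
    return genome_size_bp * coverage / bases_per_read


def strains_per_lane(lane_reads, reads_needed):
    """Whole number of strains that fit in one lane, ignoring barcoding losses."""
    return math.floor(lane_reads / reads_needed)


def sequencing_cost_per_strain(lane_cost, n_strains):
    """Lane cost split evenly across strains; excludes sample preparation."""
    return lane_cost / n_strains


# Figures quoted in the message (assumed here, not independently verified).
reads = reads_per_strain(genome_size_bp=3_000_000, coverage=60, read_length_bp=100)
low_capacity = strains_per_lane(120_000_000, reads)
high_capacity = strains_per_lane(150_000_000, reads)

print(f"Reads per strain: {reads:,.0f}")
print(f"Strains per lane: {low_capacity}-{high_capacity}")
print(f"Sequencing cost per strain at 120 strains: "
      f"${sequencing_cost_per_strain(3000, 120):.0f}-"
      f"${sequencing_cost_per_strain(4000, 120):.0f}")
```

The sketch leaves out the per-strain sample preparation and barcoding cost, which the message estimates separately, so that the sequencing share of the cost stays visible on its own.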

Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.

Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and AccuraScience data analysis team at specified dates and times. The editing was made to protect the customer’s privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.