Reasons to consider a UMI-based sequencing approach
Next-generation sequencing (NGS) has unlocked the ability to cost-effectively interrogate the human genome. NGS begins with library preparation, which usually involves PCR amplification of the targets of interest. Although necessary, this PCR step and the eventual sequencing process can both introduce bias and error. To correct for them, you can incorporate a unique molecular index (UMI, also known as a unique molecular identifier) into your sequencing workflow. During the RareSeq™️ process, 16 bp UMIs are added prior to PCR during library prep. These UMIs allow read families to be generated during analysis: all reads with the same UMI are grouped and attributed to the same original molecule. Below, we expand on the bias and errors that NGS can incur to highlight the top reasons to consider a UMI-based approach for your next sequencing experiment.
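As a rough illustration of what "read families" means, the sketch below groups reads by their UMI. It is a simplified toy, not the RareSeq™️ analysis pipeline: the `(umi, sequence)` tuple layout and the function name `group_into_families` are assumptions made for this example. Real pipelines typically also take the alignment position into account and tolerate sequencing errors within the UMI itself.

```python
from collections import defaultdict


def group_into_families(reads):
    """Group reads by UMI; each group is one read family.

    `reads` is an iterable of (umi, sequence) tuples. This layout is a
    hypothetical simplification of reads after UMI extraction.
    """
    families = defaultdict(list)
    for umi, sequence in reads:
        families[umi].append(sequence)
    return dict(families)


# Toy data with made-up 16 bp UMIs: two reads share a UMI, one does not.
reads = [
    ("ACGTACGTACGTACGT", "TTGACC"),
    ("ACGTACGTACGTACGT", "TTGACC"),
    ("GGCATTCAGGCATTCA", "TTGACC"),
]

families = group_into_families(reads)
for umi, members in families.items():
    print(umi, len(members))
```

Each key in `families` stands for one original molecule, and the list under it holds the PCR copies of that molecule.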
- UMIs Allow for More Accurate Allele Quantification: Specific sequences may be preferentially amplified during PCR, which skews read counts. Standard analysis corrects for this with a process called deduplication, in which reads that align to the exact same position in the reference genome are assumed to be PCR duplicates and are collapsed into a single molecule. Even if you've never thought about deduplication, the inherent problem with this step is easy to spot. Two unique molecules from different cells can produce the exact same reads, and deduplication would reduce them to a single molecule. Instead of receiving a count of 2 within the sample, they are counted as only 1. This issue can compound to give inaccurate estimates from NGS data. With a UMI approach like RareSeq™️, individual molecules are labeled prior to PCR, so even if they produce identical aligned sequences, the UMIs attached to those molecules still differ and 2 unique molecules are counted in the sample.
- UMIs Allow for Increased Sensitivity to Identify Rare Variants: When looking for rare variants, errors become a concern. They can be introduced during PCR as the sample is being prepared, and again during the sequencing process itself. As scientists push the limits of NGS, we are more frequently looking for a needle in a haystack. Take liquid biopsies as an example. The goal might be to identify the first sign of recurrence by sequencing circulating tumor DNA (ctDNA). The concentration of ctDNA is likely to be very low relative to normal cell-free DNA (cfDNA). If sequencing reveals a variant in a key gene, how do we know whether it is a true variant or just a sequencing artifact? The error rate for standard NGS is estimated at ~1-5%, which means that at best we can call a variant at 1:100. Any true variant below 1% cannot be distinguished from an error. This is a concern for a number of applications that require increased sensitivity, including ctDNA analysis for recurrence monitoring. Using RareSeq™️, we can push the limit of detection to 1:10,000, a 100-fold improvement in sensitivity over sequencing without a UMI-based approach.
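The two points above can be sketched with a small toy example. The first part compares position-only deduplication with UMI-aware counting. The second part collapses each read family into a consensus sequence by simple majority vote, which is one common way UMI workflows suppress isolated PCR or sequencing errors. This is an assumption made for illustration and not necessarily how RareSeq™️ data are analyzed. The read tuples, UMIs, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy reads: (aligned_position, umi, sequence).
reads = [
    (1000, "AAAACCCCGGGGTTTT", "ACGTAC"),
    (1000, "AAAACCCCGGGGTTTT", "ACGTAC"),
    (1000, "AAAACCCCGGGGTTTT", "ACGAAC"),  # one copy carries an error
    (1000, "TTTTGGGGCCCCAAAA", "ACGTAC"),  # second molecule, same position
    (1000, "TTTTGGGGCCCCAAAA", "ACGTAC"),
]


def count_by_position(reads):
    """Position-only deduplication: one molecule per aligned position."""
    return len({pos for pos, _umi, _seq in reads})


def count_by_position_and_umi(reads):
    """UMI-aware counting: one molecule per (position, UMI) pair."""
    return len({(pos, umi) for pos, umi, _seq in reads})


def consensus_by_family(reads):
    """Collapse each read family to a per-base majority-vote consensus.

    Assumes all sequences in a family have the same length.
    """
    families = defaultdict(list)
    for pos, umi, seq in reads:
        families[(pos, umi)].append(seq)
    consensus = {}
    for key, seqs in families.items():
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


print(count_by_position(reads))          # molecules seen without UMIs
print(count_by_position_and_umi(reads))  # molecules seen with UMIs
print(consensus_by_family(reads))
```

In this toy data, position-only deduplication sees a single molecule while UMI-aware counting sees two, and the lone erroneous base is outvoted by the other reads in its family. A real variant present in the original molecule would instead appear in every read of its family and survive the vote.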