The uses of the Genome Reference Consortium's human reference sequence can be roughly categorized into three related but distinct categories: as a representative species genome, as a coordinate systemfor identifying variants, and as an alignment reference for variation detection algorithms. However, the use of this reference sequence as simultaneously a representative species genome and as an alignment reference leads to unnecessary artifacts for structural variation detection algorithms and limits their accuracy.We show how decoupling these two references and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, lead to improved genotyping of disease related genes, and decrease the cost of studying polymorphismin a population.
All Science Journal Classification (ASJC) codes
- Biochemistry, Genetics and Molecular Biology(all)
- Agricultural and Biological Sciences(all)