Metalign: Efficient alignment-based metagenomic profiling via containment min hash

Nathan Lapierre, Mohammed Alser, Eleazar Eskin, David Koslicki, Serghei Mangul

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Metagenomic profiling, predicting the presence and relative abundances of microbes in a sample, is a critical first step in microbiome analysis. Alignment-based approaches are often considered accurate yet computationally infeasible. Here, we present a novel method, Metalign, that performs efficient and accurate alignment-based metagenomic profiling. We use a novel containment min hash approach to pre-filter the reference database prior to alignment and then process both uniquely aligned and multi-aligned reads to produce accurate abundance estimates. In performance evaluations on both real and simulated datasets, Metalign is the only method evaluated that maintained high performance and competitive running time across all datasets.
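The pre-filtering idea in the abstract can be illustrated with a minimal sketch. This is not Metalign's implementation: the function names, the k-mer length, the sketch size, and the cutoff are all hypothetical, and a plain Python set stands in for whatever membership structure a real tool would use. The sketch estimates what fraction of each reference genome's k-mers also appear in the sample, and keeps only references above a cutoff as alignment candidates.

```python
import hashlib

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def stable_hash(kmer):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def containment_estimate(ref_kmers, sample_kmers, sketch_size):
    """Estimate |ref ∩ sample| / |ref| from the sketch_size
    reference k-mers with the smallest hash values."""
    sketch = sorted(ref_kmers, key=stable_hash)[:sketch_size]
    if not sketch:
        return 0.0
    hits = sum(1 for km in sketch if km in sample_kmers)
    return hits / len(sketch)

def prefilter(references, sample_reads, k=5, sketch_size=50, cutoff=0.5):
    """Return names of references whose estimated containment in the
    sample meets the cutoff (all parameter values are illustrative)."""
    sample_kmers = set()
    for read in sample_reads:
        sample_kmers |= kmers(read, k)
    return [
        name
        for name, seq in references.items()
        if containment_estimate(kmers(seq, k), sample_kmers, sketch_size) >= cutoff
    ]

# Toy data: both reads were cut from genomeA, none from genomeB.
references = {
    "genomeA": "ACGTACGTTAGCCGATAGGCTA",
    "genomeB": "TTTTTTTTTTGGGGGGGGGGCC",
}
reads = ["ACGTACGTTAGC", "TAGCCGATAGGCTA"]
candidates = prefilter(references, reads)
```

In this toy case only `genomeA` survives the filter, so a subsequent alignment step would run against a much smaller database. Real data would call for longer k-mers and a cutoff chosen empirically.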

    Original language: English (US)
    Article number: 242
    Journal: Genome Biology
    Volume: 21
    Issue number: 1
    State: Published - Sep 10 2020

    All Science Journal Classification (ASJC) codes

    • Ecology, Evolution, Behavior and Systematics
    • Genetics
    • Cell Biology

