iMAP: An integrated bioinformatics and visualization pipeline for microbiome data analysis

Teresia M. Buza, Triza Tonui, Francesca Stomeo, Christian Tiambo, Robab Katani, Megan Schilling, Beatus Lyimo, Paul Gwakisa, Isabella Cattadori, Joram Buza, Vivek Kapur

Research output: Contribution to journal (Article)

Abstract

Background: One of the major challenges facing investigators in the microbiome field is turning the large numbers of reads generated by next-generation sequencing (NGS) platforms into biological knowledge. Analytical workflows that guarantee reproducibility, repeatability, and result provenance are essential to modern microbiome research. Over nearly a decade, several state-of-the-art bioinformatics tools have been developed for characterizing the microbial communities in a given sample. However, most of these tools expose many functions that require an in-depth understanding of their implementation, and users must choose additional tools to visualize the final output. Furthermore, microbiome analysis can be time-consuming and may require advanced programming skills that some investigators lack.

Results: We have developed a wrapper named iMAP (Integrated Microbiome Analysis Pipeline) to provide the microbiome research community with a user-friendly and portable tool that integrates bioinformatics analysis and data visualization. The iMAP tool wraps functionality for metadata profiling, quality control of reads, sequence processing and classification, and diversity analysis of operational taxonomic units. The pipeline can also generate web-based progress reports that support an approach referred to as review-as-you-go (RAYG). Microbial community profiling is done mostly with functionality implemented in the Mothur or QIIME2 platforms. The pipeline also uses several R packages for graphics and R Markdown for generating progress reports. We used a case study to demonstrate the application of the iMAP pipeline.

Conclusions: The iMAP pipeline integrates several functionalities for better identification of the microbial communities present in a given sample. The pipeline performs in-depth quality control that guarantees high-quality results and accurate conclusions. The vibrant visuals produced by the pipeline facilitate a better understanding of complex, multidimensional microbiome data. The integrated RAYG approach enables the generation of web-based reports, which provide investigators with intermediate output that can be reviewed progressively. The intensively analyzed case study sets a model for microbiome data analysis.

Original language: English (US)
Article number: 374
Journal: BMC Bioinformatics
Volume: 20
Issue number: 1
DOI: 10.1186/s12859-019-2965-4
State: Published - Jul 3 2019

Fingerprint

Microbiota
Bioinformatics
Computational Biology
Data analysis
Visualization
Pipelines
Quality Control
Profiling
Research Personnel
Web-based
Integrate
Provenance
Data visualization
Wrapper
Multidimensional Data
Output
Repeatability
Reproducibility

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Buza, Teresia M.; Tonui, Triza; Stomeo, Francesca; Tiambo, Christian; Katani, Robab; Schilling, Megan; Lyimo, Beatus; Gwakisa, Paul; Cattadori, Isabella; Buza, Joram; Kapur, Vivek. / iMAP: An integrated bioinformatics and visualization pipeline for microbiome data analysis. In: BMC Bioinformatics. 2019; Vol. 20, No. 1.
@article{5435c3a9d9cb46eda59e22694c27ae60,
title = "{iMAP}: An integrated bioinformatics and visualization pipeline for microbiome data analysis",
author = "Buza, {Teresia M.} and Triza Tonui and Francesca Stomeo and Christian Tiambo and Robab Katani and Megan Schilling and Beatus Lyimo and Paul Gwakisa and Isabella Cattadori and Joram Buza and Vivek Kapur",
year = "2019",
month = "7",
day = "3",
doi = "10.1186/s12859-019-2965-4",
language = "English (US)",
volume = "20",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - iMAP

T2 - An integrated bioinformatics and visualization pipeline for microbiome data analysis

AU - Buza, Teresia M.

AU - Tonui, Triza

AU - Stomeo, Francesca

AU - Tiambo, Christian

AU - Katani, Robab

AU - Schilling, Megan

AU - Lyimo, Beatus

AU - Gwakisa, Paul

AU - Cattadori, Isabella

AU - Buza, Joram

AU - Kapur, Vivek

PY - 2019/7/3

Y1 - 2019/7/3

UR - http://www.scopus.com/inward/record.url?scp=85068595842&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068595842&partnerID=8YFLogxK

U2 - 10.1186/s12859-019-2965-4

DO - 10.1186/s12859-019-2965-4

M3 - Article

C2 - 31269897

AN - SCOPUS:85068595842

VL - 20

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 1

M1 - 374

ER -