IndeCut evaluates performance of network motif discovery algorithms

Mitra Ansariola, Molly Megraw, David Koslicki

    Research output: Contribution to journal › Article

    1 Citation (Scopus)

    Abstract

    Motivation: Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no method to numerically evaluate whether any network motif discovery algorithm performs as intended on realistically sized datasets; thus it was not possible to assess the validity of resulting network motifs.

    Results: In this work, we present IndeCut, the first method to date that characterizes network motif finding algorithm performance in terms of uniform sampling on realistically sized networks. We demonstrate that it is critical to use IndeCut prior to running any network motif finder for two reasons. First, IndeCut indicates the number of samples needed for a tool to produce an outcome that is both reproducible and accurate. Second, IndeCut allows users to choose the tool that generates samples in the most independent fashion for their network of interest among many available options.

    Availability and implementation: The open source software package is available at https://github.com/megrawlab/IndeCut.

    Contact: megrawm@science.oregonstate.edu or david.koslicki@math.oregonstate.edu

    Supplementary information: Supplementary data are available at Bioinformatics online.

    Original language: English (US)
    Pages (from-to): 1514-1521
    Number of pages: 8
    Journal: Bioinformatics
    Volume: 34
    Issue number: 9
    DOI: 10.1093/bioinformatics/btx798
    State: Published - May 1 2018

    All Science Journal Classification (ASJC) codes

    • Statistics and Probability
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Computational Theory and Mathematics
    • Computational Mathematics

    Cite this

    Ansariola, Mitra ; Megraw, Molly ; Koslicki, David. / IndeCut evaluates performance of network motif discovery algorithms. In: Bioinformatics. 2018 ; Vol. 34, No. 9. pp. 1514-1521.
    @article{bacf5aaca6db405896ab570df7b3d885,
    title = "IndeCut evaluates performance of network motif discovery algorithms",
    author = "Mitra Ansariola and Molly Megraw and David Koslicki",
    year = "2018",
    month = "5",
    day = "1",
    doi = "10.1093/bioinformatics/btx798",
    language = "English (US)",
    volume = "34",
    pages = "1514--1521",
    journal = "Bioinformatics",
    issn = "1367-4803",
    publisher = "Oxford University Press",
    number = "9",

    }

    TY - JOUR

    T1 - IndeCut evaluates performance of network motif discovery algorithms

    AU - Ansariola, Mitra

    AU - Megraw, Molly

    AU - Koslicki, David

    PY - 2018/5/1

    Y1 - 2018/5/1

    UR - http://www.scopus.com/inward/record.url?scp=85047095428&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85047095428&partnerID=8YFLogxK

    U2 - 10.1093/bioinformatics/btx798

    DO - 10.1093/bioinformatics/btx798

    M3 - Article

    C2 - 29236975

    AN - SCOPUS:85047095428

    VL - 34

    SP - 1514

    EP - 1521

    JO - Bioinformatics

    JF - Bioinformatics

    SN - 1367-4803

    IS - 9

    ER -