Modes of inference for evaluating the confidence of peptide identifications

Matt Fitzgibbon, Qunhua Li, Martin McIntosh

Research output: Contribution to journal, Review article

30 Citations (Scopus)

Abstract

Several modes of inference are currently used in practice to evaluate the confidence of putative peptide identifications resulting from database scoring algorithms such as Mascot, SEQUEST, or X!Tandem. The approaches include parametric methods, such as classic PeptideProphet, and distribution-free methods, such as those based on reverse or decoy databases. Because of its parametric nature, classic PeptideProphet, although highly robust, was not highly flexible and was difficult to apply to new search algorithms or classification scores. The decoy approach, while commonly applied, has not yet been fully formalized and standardized. And although decoy methods are distribution-free, they, like other approaches, are not free of assumptions. Recent manuscripts by Käll et al., Choi and Nesvizhskii, and Choi et al. help advance these methods, specifically by formalizing an alternative formulation of decoy database approaches and extending the PeptideProphet methods to make explicit use of decoy databases, respectively. Taken together with standardized decoy database methods and expectation scores computed by search engines like X!Tandem, there exist at least four different modes of inference used to assign confidence levels to individual peptides or groups of peptides. We review and compare the assumptions of each of these approaches and summarize some interpretation issues. We also discuss some suggestions that may make the use of decoy databases more computationally efficient in practice.
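To make the decoy idea concrete, here is a minimal sketch of how a decoy search is commonly used to estimate a false discovery rate (FDR). It is not taken from the article. It assumes separate target and decoy searches against databases of equal size, so that the number of decoy hits above a score threshold approximates the number of incorrect target hits above it. The function names `decoy_fdr` and `q_values` and the example scores are hypothetical.

```python
from typing import Dict, Sequence


def decoy_fdr(target_scores: Sequence[float],
              decoy_scores: Sequence[float],
              threshold: float) -> float:
    """Estimate the FDR among target hits scoring at or above `threshold`.

    Assumption (not from the article): decoy and target databases are the
    same size, so decoy hits above the threshold stand in for false
    target hits above the threshold.
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing accepted, so nothing can be false
    return min(1.0, n_decoy / n_target)


def q_values(target_scores: Sequence[float],
             decoy_scores: Sequence[float]) -> Dict[float, float]:
    """Map each distinct target score to its q-value.

    The q-value of a score is the smallest estimated FDR over all
    thresholds that would still accept that score.
    """
    result: Dict[float, float] = {}
    running_min = 1.0
    # Walk thresholds from most permissive to strictest, keeping the
    # minimum FDR seen so far among thresholds at or below each score.
    for t in sorted(set(target_scores)):
        running_min = min(running_min, decoy_fdr(target_scores, decoy_scores, t))
        result[t] = running_min
    return result


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    targets = [9.0, 8.0, 7.0, 5.0, 4.0, 3.0]
    decoys = [6.0, 3.5, 2.0, 1.0]
    print(decoy_fdr(targets, decoys, 5.0))
    print(q_values(targets, decoys))
```

The estimate inherits the assumption the abstract alludes to: incorrect target matches must score like decoy matches. If that does not hold for a given search engine or score, the ratio above can be biased in either direction.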

Original language: English (US)
Pages (from-to): 34-39
Number of pages: 6
Journal: Journal of Proteome Research
Volume: 7
Issue number: 1
DOI: 10.1021/pr7007303
State: Published - Jan 1 2008

Fingerprint

Databases
Peptides
Search Engine
Manuscripts

All Science Journal Classification (ASJC) codes

  • Chemistry(all)
  • Biochemistry

Cite this

Fitzgibbon, Matt ; Li, Qunhua ; McIntosh, Martin. / Modes of inference for evaluating the confidence of peptide identifications. In: Journal of Proteome Research. 2008 ; Vol. 7, No. 1. pp. 34-39.
@article{8e97fcdf072941629ec1a5f843301665,
title = "Modes of inference for evaluating the confidence of peptide identifications",
abstract = "Several modes of inference are currently used in practice to evaluate the confidence of putative peptide identifications resulting from database scoring algorithms such as Mascot, SEQUEST, or X!Tandem. The approaches include parametric methods, such as classic PeptideProphet, and distribution free methods, such as methods based on reverse or decoy databases. Because of its parametric nature, classic PeptideProphet, although highly robust, was not highly flexible and was difficult to apply to new search algorithms or classification scores. While commonly applied, the decoy approach has not yet been fully formalized and standardized. And, although they are distribution-free, they like other approaches are not free of assumptions. Recent manuscripts by K{\"a}ll et al., Choi and Nesvizhskii, and Choi et al. help advance these methods, specifically by formalizing an alternative formulation of decoy databases approaches and extending the PeptideProphet methods to make explicit use of decoy databases, respectively. Taken together with standardized decoy database methods, and expectation scores computed by search engines like Tandem, there exist at least four different modes of inference used to assign confidence levels to individual peptides or groups of peptides. We overview and compare the assumptions of each of these approaches and summarize some interpretation issues. We also discuss some suggestions, which may make the use of decoy databases more computationally efficient in practice.",
author = "Matt Fitzgibbon and Qunhua Li and Martin McIntosh",
year = "2008",
month = "1",
day = "1",
doi = "10.1021/pr7007303",
language = "English (US)",
volume = "7",
pages = "34--39",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "1"
}

Modes of inference for evaluating the confidence of peptide identifications. / Fitzgibbon, Matt; Li, Qunhua; McIntosh, Martin.

In: Journal of Proteome Research, Vol. 7, No. 1, 01.01.2008, p. 34-39.

TY - JOUR

T1 - Modes of inference for evaluating the confidence of peptide identifications

AU - Fitzgibbon, Matt

AU - Li, Qunhua

AU - McIntosh, Martin

PY - 2008/1/1

Y1 - 2008/1/1

AB - Several modes of inference are currently used in practice to evaluate the confidence of putative peptide identifications resulting from database scoring algorithms such as Mascot, SEQUEST, or X!Tandem. The approaches include parametric methods, such as classic PeptideProphet, and distribution free methods, such as methods based on reverse or decoy databases. Because of its parametric nature, classic PeptideProphet, although highly robust, was not highly flexible and was difficult to apply to new search algorithms or classification scores. While commonly applied, the decoy approach has not yet been fully formalized and standardized. And, although they are distribution-free, they like other approaches are not free of assumptions. Recent manuscripts by Käll et al., Choi and Nesvizhskii, and Choi et al. help advance these methods, specifically by formalizing an alternative formulation of decoy databases approaches and extending the PeptideProphet methods to make explicit use of decoy databases, respectively. Taken together with standardized decoy database methods, and expectation scores computed by search engines like Tandem, there exist at least four different modes of inference used to assign confidence levels to individual peptides or groups of peptides. We overview and compare the assumptions of each of these approaches and summarize some interpretation issues. We also discuss some suggestions, which may make the use of decoy databases more computationally efficient in practice.

UR - http://www.scopus.com/inward/record.url?scp=38649136282&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38649136282&partnerID=8YFLogxK

U2 - 10.1021/pr7007303

DO - 10.1021/pr7007303

M3 - Review article

C2 - 18067248

AN - SCOPUS:38649136282

VL - 7

SP - 34

EP - 39

JO - Journal of Proteome Research

JF - Journal of Proteome Research

SN - 1535-3893

IS - 1

ER -