Several modes of inference are currently used in practice to evaluate the confidence of putative peptide identifications resulting from database scoring algorithms such as Mascot, SEQUEST, or X!Tandem. The approaches include parametric methods, such as classic PeptideProphet, and distribution-free methods, such as those based on reverse or decoy databases. Because of its parametric nature, classic PeptideProphet, although highly robust, was not highly flexible and was difficult to apply to new search algorithms or classification scores. The decoy approach, while commonly applied, has not yet been fully formalized and standardized, and although it is distribution-free, it is, like other approaches, not free of assumptions. Recent manuscripts by Käll et al., Choi and Nesvizhskii, and Choi et al. help advance these methods, the first by formalizing an alternative formulation of decoy database approaches and the latter two by extending the PeptideProphet methods to make explicit use of decoy databases. Together with standardized decoy database methods and expectation scores computed by search engines such as X!Tandem, this gives at least four different modes of inference used to assign confidence levels to individual peptides or groups of peptides. We review and compare the assumptions of each of these approaches and summarize some interpretation issues. We also offer suggestions that may make the use of decoy databases more computationally efficient in practice.