Adaptive-BLAST: A user-defined platform for the study of proteins

Yoojin Hong, Sree V. Chintapalli, Gaurav Bhardwaj, Zhenhai Zhang, Randen L. Patterson, Damian B. Van Rossum

Research output: Contribution to journal, Article

3 Citations (Scopus)

Abstract

Profile-based protein-sequence analysis algorithms comprise some of the most powerful and user-friendly methods for exploring protein sequences to determine their structure, function, and/or evolution (1-4). PSI-BLAST (5, 6) and rps-BLAST (7) are two of the most popular profile-based algorithms (~1,120 references to date), and have exceptional utility in the identification of homology between proteins, particularly for biological scientists who do not specialize in computational approaches. However, when the performance of these algorithms is compared with that of other methods [e.g., support-vector machine learning (SVM) (8), hidden-Markov models (HMMs) (9)], they often underperform in identifying the aforementioned protein properties (3, 9-11). We have previously demonstrated that the utility of BLAST algorithms can be significantly improved by (i) adaptations to the profile libraries employed, (ii) adjustments to output formats, and (iii) alterations to the BLAST algorithm itself (4, 6, 12-14). We present here Adaptive-BLAST (Ada-BLAST), which provides a simple user-defined platform for measuring and analyzing primary amino acid sequences. Within this platform, we developed a series of local BLAST applications (apps) that take advantage of the speed and sensitivity afforded by BLAST, while allowing for maximal user-definitions and flexible visualization. We tested the efficacy of these apps in control experiments, studying fold-recognition, in which we obtained >90% accuracy in highly divergent sequences (>25% identity). In addition, these same apps were proficient in classifying transmembrane proteins, identifying structural/functional determinants of ion-channels/receptors, and informing structural modeling algorithms. Indeed, these Ada-BLAST-informed structural models were useful in guiding our experimental research on the N-terminus of Transient Receptor Potential ion-channels (TRPs).
Taken together, we propose that Ada-BLAST provides a powerful computational tool that is accessible to bench-scientists and computational biologists alike. The code for Ada-BLAST is publicly available at http://empathy.rcc.psu.edu/.
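
The "% identity" figure quoted in the abstract describes how many aligned positions two sequences share. As a rough illustration only (this is not Ada-BLAST code; the function name `percent_identity` and the gap-handling convention are assumptions made for this sketch), percent identity over two pre-aligned, equal-length sequences could be computed as follows:

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both residues match.

    Assumption: the denominator counts only columns where neither sequence
    has a gap. Other conventions (e.g. full alignment length) exist and
    give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0  # no comparable columns at all
    return 100.0 * matches / compared
```

Because tools differ in which denominator they use, values from this sketch are not directly comparable to the figures reported in the paper.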

Original language: English (US)
Pages (from-to): 88-101
Number of pages: 14
Journal: Journal of Integrated OMICS
Volume: 1
Issue number: 1
DOI: https://doi.org/10.5584/jiomics.v1i1.33
State: Published - Feb 2011

Fingerprint

Proteins
Ion Channels
Transient Receptor Potential Channels
Social Adjustment
Structural Models
Protein Sequence Analysis
Hidden Markov models
Libraries
Support vector machines
Learning systems
Amino Acid Sequence
Visualization
Amino Acids
Research
Experiments
Support Vector Machine
Machine Learning

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Genetics

Cite this

Hong, Yoojin; Chintapalli, Sree V.; Bhardwaj, Gaurav; Zhang, Zhenhai; Patterson, Randen L.; Van Rossum, Damian B. / Adaptive-BLAST: A user-defined platform for the study of proteins. In: Journal of Integrated OMICS. 2011; Vol. 1, No. 1, pp. 88-101.

@article{a107e9f609cd411dbc0902ec7645192c,
title = "Adaptive-BLAST: A user-defined platform for the study of proteins",
abstract = "Profile-based protein-sequence analysis algorithms comprise some of the most powerful and user-friendly methods for exploring protein sequences to determine their structure, function, and/or evolution (1-4). PSI-BLAST (5, 6) and rps-BLAST (7) are two of the most popular profile-based algorithms (~1,120 references to date), and have exceptional utility in the identification of homology between proteins, particularly for biological scientists who do not specialize in computational approaches. However, when the performance of these algorithms is compared with that of other methods [e.g., support-vector machine learning (SVM) (8), hidden-Markov models (HMMs) (9)], they often underperform in identifying the aforementioned protein properties (3, 9-11). We have previously demonstrated that the utility of BLAST algorithms can be significantly improved by (i) adaptations to the profile libraries employed, (ii) adjustments to output formats, and (iii) alterations to the BLAST algorithm itself (4, 6, 12-14). We present here Adaptive-BLAST (Ada-BLAST), which provides a simple user-defined platform for measuring and analyzing primary amino acid sequences. Within this platform, we developed a series of local BLAST applications (apps) that take advantage of the speed and sensitivity afforded by BLAST, while allowing for maximal user-definitions and flexible visualization. We tested the efficacy of these apps in control experiments, studying fold-recognition, in which we obtained >90{\%} accuracy in highly divergent sequences (>25{\%} identity). In addition, these same apps were proficient in classifying transmembrane proteins, identifying structural/functional determinants of ion-channels/receptors, and informing structural modeling algorithms. Indeed, these Ada-BLAST-informed structural models were useful in guiding our experimental research on the N-terminus of Transient Receptor Potential ion-channels (TRPs).
Taken together, we propose that Ada-BLAST provides a powerful computational tool that is accessible to bench-scientists and computational biologists alike. The code for Ada-BLAST is publicly available at http://empathy.rcc.psu.edu/.",
author = "Yoojin Hong and Chintapalli, {Sree V.} and Gaurav Bhardwaj and Zhenhai Zhang and Patterson, {Randen L.} and {Van Rossum}, {Damian B.}",
year = "2011",
month = "2",
doi = "10.5584/jiomics.v1i1.33",
language = "English (US)",
volume = "1",
pages = "88--101",
journal = "Journal of Integrated OMICS",
issn = "2182-0287",
publisher = "Proteomass Scientific Society",
number = "1",
}

Hong, Y, Chintapalli, SV, Bhardwaj, G, Zhang, Z, Patterson, RL & Van Rossum, DB 2011, 'Adaptive-BLAST: A user-defined platform for the study of proteins', Journal of Integrated OMICS, vol. 1, no. 1, pp. 88-101. https://doi.org/10.5584/jiomics.v1i1.33

Adaptive-BLAST: A user-defined platform for the study of proteins. / Hong, Yoojin; Chintapalli, Sree V.; Bhardwaj, Gaurav; Zhang, Zhenhai; Patterson, Randen L.; Van Rossum, Damian B.

In: Journal of Integrated OMICS, Vol. 1, No. 1, 02.2011, p. 88-101.


TY  - JOUR
T1  - Adaptive-BLAST
T2  - A user-defined platform for the study of proteins
AU  - Hong, Yoojin
AU  - Chintapalli, Sree V.
AU  - Bhardwaj, Gaurav
AU  - Zhang, Zhenhai
AU  - Patterson, Randen L.
AU  - Van Rossum, Damian B.
PY  - 2011/2
Y1  - 2011/2
AB  - Profile-based protein-sequence analysis algorithms comprise some of the most powerful and user-friendly methods for exploring protein sequences to determine their structure, function, and/or evolution (1-4). PSI-BLAST (5, 6) and rps-BLAST (7) are two of the most popular profile-based algorithms (~1,120 references to date), and have exceptional utility in the identification of homology between proteins, particularly for biological scientists who do not specialize in computational approaches. However, when the performance of these algorithms is compared with that of other methods [e.g., support-vector machine learning (SVM) (8), hidden-Markov models (HMMs) (9)], they often underperform in identifying the aforementioned protein properties (3, 9-11). We have previously demonstrated that the utility of BLAST algorithms can be significantly improved by (i) adaptations to the profile libraries employed, (ii) adjustments to output formats, and (iii) alterations to the BLAST algorithm itself (4, 6, 12-14). We present here Adaptive-BLAST (Ada-BLAST), which provides a simple user-defined platform for measuring and analyzing primary amino acid sequences. Within this platform, we developed a series of local BLAST applications (apps) that take advantage of the speed and sensitivity afforded by BLAST, while allowing for maximal user-definitions and flexible visualization. We tested the efficacy of these apps in control experiments, studying fold-recognition, in which we obtained >90% accuracy in highly divergent sequences (>25% identity). In addition, these same apps were proficient in classifying transmembrane proteins, identifying structural/functional determinants of ion-channels/receptors, and informing structural modeling algorithms. Indeed, these Ada-BLAST-informed structural models were useful in guiding our experimental research on the N-terminus of Transient Receptor Potential ion-channels (TRPs).
Taken together, we propose that Ada-BLAST provides a powerful computational tool that is accessible to bench-scientists and computational biologists alike. The code for Ada-BLAST is publicly available at http://empathy.rcc.psu.edu/.
UR  - http://www.scopus.com/inward/record.url?scp=84977578892&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84977578892&partnerID=8YFLogxK
U2  - 10.5584/jiomics.v1i1.33
DO  - 10.5584/jiomics.v1i1.33
M3  - Article
AN  - SCOPUS:84977578892
VL  - 1
SP  - 88
EP  - 101
JO  - Journal of Integrated OMICS
JF  - Journal of Integrated OMICS
SN  - 2182-0287
IS  - 1
ER  -