A sparse Gaussian processes classification framework for fast tag suggestions

Yang Song, Lu Zhang, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Citations (Scopus)

Abstract

Tagged data is rapidly becoming more available on the World Wide Web. Web sites that populate tagging services offer a good way for Internet users to share their knowledge. An interesting problem is how to make tag suggestions when a new resource becomes available. In this paper, we address the issue of efficient tag suggestion. We first propose a multi-class sparse Gaussian process classification framework (SGPS) that is capable of classifying data with very few training instances. We suggest a novel prototype selection algorithm to select the best subset of points for model learning. The framework is then extended to a novel multi-class multi-label classification algorithm (MMSG) that transforms tag suggestion into the problem of multi-label ranking. Experiments on benchmark data sets and real-world data from Del.icio.us and BibSonomy suggest that our model can greatly improve the performance of tag suggestions when compared to the state of the art. Overall, our model requires linear time to train and constant time to predict per case. Its memory consumption is also significantly lower than that of traditional batch learning algorithms such as SVMs. In addition, results on tagging digital data demonstrate that our model is capable of recommending relevant tags to images and videos by using their surrounding textual information.
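
The idea of ranking tags with a sparse Gaussian process can be illustrated with a minimal sketch. This is not the SGPS or MMSG algorithm from the paper: it assumes a standard subset-of-regressors approximation to the GP predictive mean, fits one-vs-rest tag indicators, and picks prototypes uniformly at random in place of the paper's prototype selection algorithm. The class name `SparseTagRanker`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)


class SparseTagRanker:
    """Hypothetical helper: subset-of-regressors GP mean on one-vs-rest tag targets."""

    def __init__(self, n_prototypes=30, noise=0.1, length_scale=1.0, seed=0):
        self.n_prototypes = n_prototypes
        self.noise = noise
        self.length_scale = length_scale
        self.seed = seed

    def fit(self, X, Y):
        # X: (n, d) feature matrix; Y: (n, T) binary tag-indicator matrix.
        rng = np.random.default_rng(self.seed)
        m = min(self.n_prototypes, X.shape[0])
        # ASSUMPTION: uniform random prototypes stand in for the paper's
        # prototype selection algorithm, which is not described in the abstract.
        self.Z = X[rng.choice(X.shape[0], size=m, replace=False)]
        K_mn = rbf_kernel(self.Z, X, self.length_scale)       # (m, n)
        K_mm = rbf_kernel(self.Z, self.Z, self.length_scale)  # (m, m)
        # Subset-of-regressors system; the small jitter keeps it well conditioned.
        A = K_mn @ K_mn.T + self.noise ** 2 * K_mm + 1e-6 * np.eye(m)
        self.W = np.linalg.solve(A, K_mn @ Y)                 # (m, T)
        return self

    def scores(self, X):
        # Cost per item depends on m and T only, not on the training set size n.
        return rbf_kernel(X, self.Z, self.length_scale) @ self.W

    def suggest(self, x, k=3):
        # Rank all tags by score and return the indices of the top k.
        s = self.scores(np.asarray(x, dtype=float)[None, :])[0]
        return np.argsort(-s)[:k].tolist()


# Synthetic example: three clusters of resources and four tags.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
cluster = np.repeat(np.arange(3), 30)
X = centers[cluster] + 0.5 * rng.standard_normal((90, 2))
Y = np.zeros((90, 4))
Y[np.arange(90), cluster] = 1.0  # tag c marks cluster c
Y[cluster < 2, 3] = 1.0          # tag 3 is shared by clusters 0 and 1

ranker = SparseTagRanker().fit(X, Y)
top_tags = ranker.suggest([5.0, 0.0], k=2)
```

In this sketch, training costs O(n m^2) for n training items and m prototypes, which is linear in n for fixed m, and scoring a new item touches only the m prototypes. That mirrors the scaling behavior the abstract claims, though the paper's actual model and guarantees may differ.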

Original language: English (US)
Title of host publication: Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Pages: 93-102
Number of pages: 10
DOI: https://doi.org/10.1145/1458082.1458098
State: Published - Dec 1 2008
Event: 17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: Oct 26 2008 to Oct 30 2008

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 17th ACM Conference on Information and Knowledge Management, CIKM'08
Country: United States
City: Napa Valley, CA
Period: 10/26/08 to 10/30/08

Fingerprint

Tag
Gaussian process
Tagging
World Wide Web
Batch
Learning model
Prototype
Learning algorithm
Ranking
Experiment
Benchmark
Resources
Train
Web sites

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Song, Y., Zhang, L., & Giles, C. L. (2008). A sparse Gaussian processes classification framework for fast tag suggestions. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08 (pp. 93-102). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/1458082.1458098
@inproceedings{2f2bd5f2e7774ac78ad99ebd1b76342c,
title = "A sparse Gaussian processes classification framework for fast tag suggestions",
author = "Yang Song and Lu Zhang and Giles, {C. Lee}",
year = "2008",
month = "12",
day = "1",
doi = "10.1145/1458082.1458098",
language = "English (US)",
isbn = "9781595939913",
series = "International Conference on Information and Knowledge Management, Proceedings",
pages = "93--102",
booktitle = "Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08",

}

TY - GEN

T1 - A sparse Gaussian processes classification framework for fast tag suggestions

AU - Song, Yang

AU - Zhang, Lu

AU - Giles, C. Lee

PY - 2008/12/1

Y1 - 2008/12/1

UR - http://www.scopus.com/inward/record.url?scp=70349237857&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349237857&partnerID=8YFLogxK

U2 - 10.1145/1458082.1458098

DO - 10.1145/1458082.1458098

M3 - Conference contribution

AN - SCOPUS:70349237857

SN - 9781595939913

T3 - International Conference on Information and Knowledge Management, Proceedings

SP - 93

EP - 102

BT - Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08

ER -
