ICTNET at blog track TREC 2009

Xueke Xu, Yue Liu, Hongbo Xu, Xiaoming Yu, Linhai Song, Feng Guan, Zeying Peng, Xueqi Cheng

Research output: Contribution to journal (Conference article)

Abstract

This paper describes our participation in the blog track of TREC 2009. We submitted runs for both tasks, namely the top stories identification task and the faceted blog distillation task. The "FirteX" platform was used to index and retrieve posts. For the top stories identification task, we measure the importance of a headline by accumulating its BM25 relevance scores against the posts published on the query day. To identify diverse blog posts, we propose two approaches: a graph-based iterative approach and a sub-topic detection based approach. For the faceted blog distillation task, we adopt a very straightforward approach and measure topical relevance using only the top 10,000 posts returned by ad hoc retrieval. To identify facet inclination, we either train a centroid classifier or compute facet inclination weights of terms. Either method yields a facet inclination score, and we rerank feeds by combining the relevance score with the facet inclination score.
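The abstract names the scoring ideas without giving formulas, so the following Python sketch is only one plausible reading of them, not the authors' implementation. It assumes a textbook BM25 variant with parameters `k1` and `b`, cosine similarity to class centroids for the facet inclination score, and a linear interpolation weight `lam` for combining scores. All function names and parameter values are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against query_terms (textbook BM25; k1, b assumed)."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def headline_importance(headline_terms, posts_on_query_day):
    """Accumulate the headline's BM25 scores over all posts from the query day."""
    return sum(bm25_scores(headline_terms, posts_on_query_day))

def centroid(docs):
    """Average term-frequency vector of a set of tokenized training docs."""
    total = Counter()
    for d in docs:
        total.update(d)
    return {t: c / len(docs) for t, c in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def facet_inclination(feed_terms, pos_centroid, neg_centroid):
    """Positive when the feed is closer to the target facet's centroid (assumed form)."""
    vec = dict(Counter(feed_terms))
    return cosine(vec, pos_centroid) - cosine(vec, neg_centroid)

def rerank(feeds, lam=0.5):
    """feeds: (feed_id, relevance, facet_score) tuples; linear mix is an assumption."""
    return sorted(feeds, key=lambda f: lam * f[1] + (1 - lam) * f[2], reverse=True)

# Toy usage with made-up data.
posts = [["election", "results", "tonight"], ["election", "debate"], ["cat", "nap"]]
print(headline_importance(["election", "results"], posts))
print(rerank([("feed_a", 0.9, 0.0), ("feed_b", 0.8, 1.0)]))
```

In this sketch a headline that matches many posts from the query day accumulates a higher score than one that matches few, and `lam` controls how much topical relevance outweighs facet inclination when feeds are reordered.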

Original language: English (US)
Journal: NIST Special Publication
State: Published - Dec 1 2009
Event: 18th Text REtrieval Conference, TREC 2009 - Gaithersburg, MD, United States
Duration: Nov 17 2009 - Nov 20 2009

Fingerprint

Blogs
Distillation
Classifiers

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Xu, X., Liu, Y., Xu, H., Yu, X., Song, L., Guan, F., ... Cheng, X. (2009). ICTNET at blog track TREC 2009. NIST Special Publication.
Xu, Xueke ; Liu, Yue ; Xu, Hongbo ; Yu, Xiaoming ; Song, Linhai ; Guan, Feng ; Peng, Zeying ; Cheng, Xueqi. / ICTNET at blog track TREC 2009. In: NIST Special Publication. 2009.
@article{06999c844bdf422cabf75f69cc66ddbd,
title = "ICTNET at blog track TREC 2009",
author = "Xueke Xu and Yue Liu and Hongbo Xu and Xiaoming Yu and Linhai Song and Feng Guan and Zeying Peng and Xueqi Cheng",
year = "2009",
month = "12",
day = "1",
language = "English (US)",
journal = "NIST Special Publication",
issn = "1048-776X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

Xu, X, Liu, Y, Xu, H, Yu, X, Song, L, Guan, F, Peng, Z & Cheng, X 2009, 'ICTNET at blog track TREC 2009', NIST Special Publication.

TY  - JOUR
T1  - ICTNET at blog track TREC 2009
AU  - Xu, Xueke
AU  - Liu, Yue
AU  - Xu, Hongbo
AU  - Yu, Xiaoming
AU  - Song, Linhai
AU  - Guan, Feng
AU  - Peng, Zeying
AU  - Cheng, Xueqi
PY  - 2009/12/1
Y1  - 2009/12/1
UR  - http://www.scopus.com/inward/record.url?scp=84873464786&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84873464786&partnerID=8YFLogxK
M3  - Conference article
AN  - SCOPUS:84873464786
JO  - NIST Special Publication
JF  - NIST Special Publication
SN  - 1048-776X
ER  -

Xu X, Liu Y, Xu H, Yu X, Song L, Guan F et al. ICTNET at blog track TREC 2009. NIST Special Publication. 2009 Dec 1.