Unsupervised ranking for plagiarism source retrieval: Notebook for PAN at CLEF 2013

Kyle Williams, Hung Hsuan Chen, Sagnik Ray Choudhury, C. Lee Giles

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

The source retrieval task for plagiarism detection involves the use of a search engine to retrieve candidate sources of plagiarism for a suspicious document and provides a way to efficiently identify candidate documents so that more accurate comparisons can take place. We describe a strategy for source retrieval that makes use of an unsupervised ranking method to rank the results returned by a search engine by their similarity with the query document and that only retrieves documents that are likely to be sources of plagiarism. Evaluation shows the performance of our approach, which achieved the highest F1 score (0.47) among all task participants.
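The ranking-and-filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or cut-off the authors used, so the cosine similarity over word counts, the `min_score` threshold, and all function names below are assumptions made purely for illustration; this is not the authors' implementation.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercased word counts; a deliberately simple stand-in for real features."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_results(suspicious_text, results, min_score=0.2):
    """Rank (url, snippet) search results by similarity to the suspicious text.

    Results scoring below min_score (an assumed cut-off) are dropped, so that
    only likely sources would go on to be retrieved and compared in detail.
    """
    query_vec = term_counts(suspicious_text)
    scored = [(cosine(query_vec, term_counts(snippet)), url) for url, snippet in results]
    kept = [(score, url) for score, url in scored if score >= min_score]
    return sorted(kept, reverse=True)


if __name__ == "__main__":
    hits = [
        ("a", "quick brown fox jumps"),
        ("b", "stock market report today"),
        ("c", "lazy dog sleeps"),
    ]
    for score, url in rank_results("the quick brown fox jumps over the lazy dog", hits):
        print(f"{url}\t{score:.3f}")
```

No labelled training data is needed for a ranking of this kind, which is the sense in which it is unsupervised; raising `min_score` trades recall for precision, the two quantities the reported F1 score combines.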

Original language: English (US)
Journal: CEUR Workshop Proceedings
Volume: 1179
State: Published - 2013

Fingerprint

Search engines

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Williams, Kyle; Chen, Hung Hsuan; Choudhury, Sagnik Ray; Giles, C. Lee. / Unsupervised ranking for plagiarism source retrieval: Notebook for PAN at CLEF 2013. In: CEUR Workshop Proceedings. 2013; Vol. 1179.
@article{f0cd5c61ea3647178698c179bc9a950c,
title = "Unsupervised ranking for plagiarism source retrieval: Notebook for PAN at CLEF 2013",
abstract = "The source retrieval task for plagiarism detection involves the use of a search engine to retrieve candidate sources of plagiarism for a suspicious document and provides a way to efficiently identify candidate documents so that more accurate comparisons can take place. We describe a strategy for source retrieval that makes use of an unsupervised ranking method to rank the results returned by a search engine by their similarity with the query document and that only retrieves documents that are likely to be sources of plagiarism. Evaluation shows the performance of our approach, which achieved the highest F1 score (0.47) among all task participants.",
author = "Williams, Kyle and Chen, {Hung Hsuan} and Choudhury, {Sagnik Ray} and Giles, {C. Lee}",
year = "2013",
language = "English (US)",
volume = "1179",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "CEUR-WS",
}

TY - JOUR
T1 - Unsupervised ranking for plagiarism source retrieval
T2 - Notebook for PAN at CLEF 2013
AU - Williams, Kyle
AU - Chen, Hung Hsuan
AU - Choudhury, Sagnik Ray
AU - Giles, C. Lee
PY - 2013
Y1 - 2013
AB - The source retrieval task for plagiarism detection involves the use of a search engine to retrieve candidate sources of plagiarism for a suspicious document and provides a way to efficiently identify candidate documents so that more accurate comparisons can take place. We describe a strategy for source retrieval that makes use of an unsupervised ranking method to rank the results returned by a search engine by their similarity with the query document and that only retrieves documents that are likely to be sources of plagiarism. Evaluation shows the performance of our approach, which achieved the highest F1 score (0.47) among all task participants.
UR - http://www.scopus.com/inward/record.url?scp=84922022494&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84922022494&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84922022494
VL - 1179
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
SN - 1613-0073
ER -