On the use of similarity search to detect fake scientific papers

Kyle Williams, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Fake scientific papers have recently become of interest within the academic community as a result of the identification of fake papers in the digital libraries of major academic publishers [8]. Detecting and removing these papers is important for many reasons. We describe an investigation into the use of similarity search for detecting fake scientific papers by comparing several methods for signature construction and similarity scoring and describe a pseudo-relevance feedback technique that can be used to improve the effectiveness of these methods. Experiments on a dataset of 40,000 computer science papers show that precision, recall and MAP scores of 0.96, 0.99 and 0.99, respectively, can be achieved, thereby demonstrating the usefulness of similarity search in detecting fake scientific papers and ranking them highly.
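
The abstract names three ingredients: document signatures, a similarity score, and pseudo-relevance feedback, evaluated with ranking metrics such as average precision. The following is a minimal sketch of how those pieces can fit together. It is not the authors' implementation: the abstract does not say which signature or scoring methods were compared, so word-shingle signatures and Jaccard similarity are assumptions here, and every name (`shingles`, `search_with_feedback`, the toy corpus `index`) is hypothetical.

```python
from typing import Dict, List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Signature (assumed choice): the set of k-word shingles in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity score (assumed choice): Jaccard overlap of two signatures."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search(query_sig: Set[str], index: Dict[str, Set[str]], top_k: int) -> List[str]:
    """Rank indexed documents by similarity to the query; ties break by id."""
    ranked = sorted(index.items(), key=lambda kv: (-jaccard(query_sig, kv[1]), kv[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


def search_with_feedback(query_sig: Set[str], index: Dict[str, Set[str]],
                         top_k: int, feedback_k: int = 1) -> List[str]:
    """Pseudo-relevance feedback: treat the first-pass top results as relevant,
    fold their signatures into the query, then search again."""
    expanded = set(query_sig)
    for doc_id in search(query_sig, index, feedback_k):
        expanded |= index[doc_id]
    return search(expanded, index, top_k)


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranking; MAP is the mean of this over queries."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Toy corpus (made up for illustration): p2 and p3 play the role of fake papers.
index = {
    "p1": shingles("we prove a new bound for sorting networks"),
    "p2": shingles("the quick brown fox jumps over the lazy dog"),
    "p3": shingles("the lazy dog sleeps under the old oak tree"),
}
fakes = {"p2", "p3"}
query = shingles("quick brown fox jumps over")

plain = search(query, index, top_k=3)
with_feedback = search_with_feedback(query, index, top_k=3)
```

In this toy corpus the query overlaps only with `p2`, so the plain search cannot tell `p3` from the genuine paper. After feedback, the expanded query shares a shingle with `p3`, which moves it ahead of `p1` and raises the average precision.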

Original language: English (US)
Title of host publication: Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings
Editors: Richard Connor, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro
Publisher: Springer Verlag
Pages: 332-338
Number of pages: 7
ISBN (Print): 9783319250861
DOIs: https://doi.org/10.1007/978-3-319-25087-8_32
State: Published - Jan 1 2015
Event: 8th International Conference on Similarity Search and Applications, SISAP 2015 - Glasgow, United Kingdom
Duration: Oct 12 2015 to Oct 14 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9371
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Similarity Search and Applications, SISAP 2015
Country: United Kingdom
City: Glasgow
Period: 10/12/15 to 10/14/15

Fingerprint

Similarity Search
Digital libraries
Computer science
Pseudo-relevance Feedback
Feedback
Scoring
Ranking
Signature
Experiments
Similarity
Community

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Williams, K., & Giles, C. L. (2015). On the use of similarity search to detect fake scientific papers. In R. Connor, G. Amato, F. Falchi, & C. Gennaro (Eds.), Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings (pp. 332-338). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9371). Springer Verlag. https://doi.org/10.1007/978-3-319-25087-8_32
Williams, Kyle ; Giles, C. Lee. / On the use of similarity search to detect fake scientific papers. Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings. editor / Richard Connor ; Giuseppe Amato ; Fabrizio Falchi ; Claudio Gennaro. Springer Verlag, 2015. pp. 332-338 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{2e21c5a2aa50426fb6ee04fce157e81c,
title = "On the use of similarity search to detect fake scientific papers",
abstract = "Fake scientific papers have recently become of interest within the academic community as a result of the identification of fake papers in the digital libraries of major academic publishers [8]. Detecting and removing these papers is important for many reasons. We describe an investigation into the use of similarity search for detecting fake scientific papers by comparing several methods for signature construction and similarity scoring and describe a pseudo-relevance feedback technique that can be used to improve the effectiveness of these methods. Experiments on a dataset of 40,000 computer science papers show that precision, recall and MAP scores of 0.96, 0.99 and 0.99, respectively, can be achieved, thereby demonstrating the usefulness of similarity search in detecting fake scientific papers and ranking them highly.",
author = "Kyle Williams and Giles, {C. Lee}",
year = "2015",
month = "1",
day = "1",
doi = "10.1007/978-3-319-25087-8_32",
language = "English (US)",
isbn = "9783319250861",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "332--338",
editor = "Richard Connor and Giuseppe Amato and Fabrizio Falchi and Claudio Gennaro",
booktitle = "Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings",
address = "Germany",
}

Williams, K & Giles, CL 2015, On the use of similarity search to detect fake scientific papers. in R Connor, G Amato, F Falchi & C Gennaro (eds), Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9371, Springer Verlag, pp. 332-338, 8th International Conference on Similarity Search and Applications, SISAP 2015, Glasgow, United Kingdom, 10/12/15. https://doi.org/10.1007/978-3-319-25087-8_32

On the use of similarity search to detect fake scientific papers. / Williams, Kyle; Giles, C. Lee.

Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings. ed. / Richard Connor; Giuseppe Amato; Fabrizio Falchi; Claudio Gennaro. Springer Verlag, 2015. p. 332-338 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9371).

TY  - GEN
T1  - On the use of similarity search to detect fake scientific papers
AU  - Williams, Kyle
AU  - Giles, C. Lee
PY  - 2015/1/1
Y1  - 2015/1/1
AB  - Fake scientific papers have recently become of interest within the academic community as a result of the identification of fake papers in the digital libraries of major academic publishers [8]. Detecting and removing these papers is important for many reasons. We describe an investigation into the use of similarity search for detecting fake scientific papers by comparing several methods for signature construction and similarity scoring and describe a pseudo-relevance feedback technique that can be used to improve the effectiveness of these methods. Experiments on a dataset of 40,000 computer science papers show that precision, recall and MAP scores of 0.96, 0.99 and 0.99, respectively, can be achieved, thereby demonstrating the usefulness of similarity search in detecting fake scientific papers and ranking them highly.
UR  - http://www.scopus.com/inward/record.url?scp=84951798118&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84951798118&partnerID=8YFLogxK
U2  - 10.1007/978-3-319-25087-8_32
DO  - 10.1007/978-3-319-25087-8_32
M3  - Conference contribution
AN  - SCOPUS:84951798118
SN  - 9783319250861
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 332
EP  - 338
BT  - Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings
A2  - Connor, Richard
A2  - Amato, Giuseppe
A2  - Falchi, Fabrizio
A2  - Gennaro, Claudio
PB  - Springer Verlag
ER  - 

Williams K, Giles CL. On the use of similarity search to detect fake scientific papers. In Connor R, Amato G, Falchi F, Gennaro C, editors, Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings. Springer Verlag. 2015. p. 332-338. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-319-25087-8_32