Autonomous citation matching

Steve Lawrence, C. Lee Giles, Kurt D. Bollacker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

63 Citations (Scopus)

Abstract

Scientific literature is increasingly becoming available on the World Wide Web. This paper considers the matching of citations found in different papers in order to autonomously construct a citation index from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to autonomously determine if two citations refer to the same paper (citations can be written in many different formats). We present four algorithms for autonomous citation matching. The algorithms are based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. The word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.
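The abstract names edit-distance computation and word matching as two of the four approaches but gives no details. The following is a minimal sketch of those two general ideas, not the authors' algorithms: it assumes a plain Levenshtein distance and a Jaccard word overlap over normalized citation strings. The function names (`normalize`, `edit_distance`, `edit_similarity`, `word_similarity`) and the normalization rule are illustrative assumptions.

```python
import re


def normalize(citation: str) -> list[str]:
    """Lowercase a citation and split it into alphanumeric words.

    Assumption: punctuation and letter case carry no matching signal.
    """
    return re.findall(r"[a-z0-9]+", citation.lower())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def edit_similarity(c1: str, c2: str) -> float:
    """Edit distance scaled to [0, 1]; 1.0 means identical after normalization."""
    a = " ".join(normalize(c1))
    b = " ".join(normalize(c2))
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def word_similarity(c1: str, c2: str) -> float:
    """Jaccard overlap of the word sets; insensitive to word order."""
    w1, w2 = set(normalize(c1)), set(normalize(c2))
    if not w1 and not w2:
        return 1.0
    return len(w1 & w2) / len(w1 | w2)


if __name__ == "__main__":
    # Two renderings of the same reference in different citation styles.
    apa = "Lawrence, S., Giles, C. L., & Bollacker, K. D. (1999). Autonomous citation matching."
    vancouver = "Lawrence S, Giles CL, Bollacker KD. Autonomous citation matching. 1999."
    print("edit similarity:", round(edit_similarity(apa, vancouver), 3))
    print("word similarity:", round(word_similarity(apa, vancouver), 3))
```

A matcher built on either score would also need a decision threshold, which this sketch leaves out because the abstract does not state one. The word-based score ignores reordering of fields between citation styles, whereas the edit-distance score penalizes it.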

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Autonomous Agents
Pages: 392-393
Number of pages: 2
State: Published - 1999
Event: Proceedings of the 1999 3rd International Conference on Autonomous Agents - Seattle, WA, USA
Duration: May 1 1999 to May 5 1999


All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Lawrence, S., Giles, C. L., & Bollacker, K. D. (1999). Autonomous citation matching. In Proceedings of the International Conference on Autonomous Agents (pp. 392-393)
Lawrence, Steve ; Giles, C. Lee ; Bollacker, Kurt D. / Autonomous citation matching. Proceedings of the International Conference on Autonomous Agents. 1999. pp. 392-393
@inproceedings{1dc250368c814cceb154364240a7c30f,
title = "Autonomous citation matching",
abstract = "Scientific literature is increasingly becoming available on the World Wide Web. This paper considers the matching of citations found in different papers in order to autonomously construct a citation index from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to autonomously determine if two citations refer to the same paper (citations can be written in many different formats). We present four algorithms for autonomous citation matching. The algorithms are based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. The word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.",
author = "Steve Lawrence and Giles, {C. Lee} and Bollacker, {Kurt D.}",
year = "1999",
language = "English (US)",
pages = "392--393",
booktitle = "Proceedings of the International Conference on Autonomous Agents",

}

Lawrence, S, Giles, CL & Bollacker, KD 1999, Autonomous citation matching. in Proceedings of the International Conference on Autonomous Agents. pp. 392-393, Proceedings of the 1999 3rd International Conference on Autonomous Agents, Seattle, WA, USA, 5/1/99.

Autonomous citation matching. / Lawrence, Steve; Giles, C. Lee; Bollacker, Kurt D.

Proceedings of the International Conference on Autonomous Agents. 1999. p. 392-393.

TY - GEN

T1 - Autonomous citation matching

AU - Lawrence, Steve

AU - Giles, C. Lee

AU - Bollacker, Kurt D.

PY - 1999

Y1 - 1999

N2 - Scientific literature is increasingly becoming available on the World Wide Web. This paper considers the matching of citations found in different papers in order to autonomously construct a citation index from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to autonomously determine if two citations refer to the same paper (citations can be written in many different formats). We present four algorithms for autonomous citation matching. The algorithms are based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. The word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.

AB - Scientific literature is increasingly becoming available on the World Wide Web. This paper considers the matching of citations found in different papers in order to autonomously construct a citation index from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to autonomously determine if two citations refer to the same paper (citations can be written in many different formats). We present four algorithms for autonomous citation matching. The algorithms are based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. The word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.

UR - http://www.scopus.com/inward/record.url?scp=0032652968&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032652968&partnerID=8YFLogxK

M3 - Conference contribution

SP - 392

EP - 393

BT - Proceedings of the International Conference on Autonomous Agents

ER -

Lawrence S, Giles CL, Bollacker KD. Autonomous citation matching. In Proceedings of the International Conference on Autonomous Agents. 1999. p. 392-393