The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective

Jian Wu, Kyle Williams, Madian Khabsa, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

CiteSeerX is a crawl-based digital library search engine providing free access to more than 4 million academic papers. Since metadata in the digital library is obtained through automatic extraction, it is inevitable that errors will occur. CiteSeerX offers a feature allowing registered users to correct paper metadata including titles, authors, abstracts, publication years, venues, etc. We claim that user corrections, as a form of crowd-collaboration, provide a useful and efficient way to improve metadata quality and the impact of the digital library. As evidence to support this claim, we investigate user corrections from the last 5 years and analyze: the nature of the corrections; the quality of the corrections; and the impact of the corrections on downloads.

Original language: English (US)
Title of host publication: CollaborateCom 2014 - Proceedings of the 10th IEEE International Conference on Collaborative Computing
Subtitle of host publication: Networking, Applications and Worksharing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 171-176
Number of pages: 6
ISBN (Electronic): 9781631900433
State: Published - Jan 19 2015
Event: 10th IEEE/EAI International Conference on Collaborative Computing, CollaborateCom 2014 - Miami, United States
Duration: Oct 22 2014 to Oct 25 2014

Publication series

Name: CollaborateCom 2014 - Proceedings of the 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing

Other

Other: 10th IEEE/EAI International Conference on Collaborative Computing, CollaborateCom 2014
Country: United States
City: Miami
Period: 10/22/14 to 10/25/14

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Software
