A Web Service for Author Name Disambiguation in Scholarly Databases

Kunho Kim, Athar Sefid, Bruce A. Weinberg, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Scopus citations

Abstract

Author Name Disambiguation (AND) is the task of clustering unique author names from publication records in scholarly or related databases. Although AND has been extensively studied and has served as an important preprocessing step for several tasks (e.g., calculating bibliometrics and scientometrics for authors), there are few publicly available tools for disambiguation in large-scale scholarly databases. Furthermore, most of the disambiguated data is embedded within the search engines of the scholarly databases, and existing application programming interfaces (APIs) have limited features and are often unavailable to users for various reasons. This makes it difficult for researchers and developers to use the data for various applications (e.g., author search) or research. Here, we design a novel, web-based, RESTful API for searching disambiguated authors, using the PubMed database as a sample application. We offer two types of queries, attribute-based queries and record-based queries, which serve different purposes. Attribute-based queries retrieve authors with the attributes available in the database. We study different search engines to find the most appropriate one for processing attribute-based queries. Record-based queries retrieve authors that are most likely to have written a query publication provided by a user. To accelerate record-based queries, we develop a novel algorithm that has a fast record-to-cluster match. We show that our algorithm can accelerate the query by a factor of 4.01 compared to a baseline naive approach.
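
The difference between the two query types can be illustrated with a small in-memory sketch. It is not the paper's implementation or API: the `Record` and `AuthorCluster` schemas, the function names, and the toy similarity score are all assumptions made for illustration, and the record-based query shown here is only a naive match that compares the query against every record, not the accelerated algorithm the abstract refers to.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One publication record (hypothetical schema)."""
    author_name: str
    title: str
    coauthors: frozenset = frozenset()


@dataclass
class AuthorCluster:
    """A disambiguated author: records believed to belong to one person."""
    author_id: str
    name: str
    affiliation: str
    records: list = field(default_factory=list)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def attribute_query(clusters, **attrs):
    """Attribute-based query: keep clusters whose attributes match exactly."""
    return [c for c in clusters
            if all(getattr(c, key, None) == value for key, value in attrs.items())]


def record_similarity(a, b):
    """Toy score (an assumption): title-word overlap plus coauthor overlap."""
    title_score = jaccard(set(a.title.lower().split()), set(b.title.lower().split()))
    return title_score + jaccard(set(a.coauthors), set(b.coauthors))


def record_query(clusters, query, top_k=1):
    """Record-based query, naive version: score the query publication against
    every record of every same-name cluster and rank clusters by best score."""
    scored = []
    for cluster in clusters:
        if cluster.name != query.author_name or not cluster.records:
            continue
        best = max(record_similarity(query, r) for r in cluster.records)
        scored.append((best, cluster.author_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [author_id for _, author_id in scored[:top_k]]


# Toy data: two distinct authors who share the name "J Smith".
clusters = [
    AuthorCluster("A1", "J Smith", "University A", [
        Record("J Smith", "name disambiguation in digital libraries",
               frozenset({"L Chen"}))]),
    AuthorCluster("A2", "J Smith", "University B", [
        Record("J Smith", "labor economics of science funding",
               frozenset({"M Garcia"}))]),
]
query = Record("J Smith", "author name disambiguation web service",
               frozenset({"L Chen"}))

print([c.author_id for c in attribute_query(clusters, affiliation="University B")])
print(record_query(clusters, query))
```

The attribute-based query is a plain filter over stored fields, which is why it maps naturally onto a search engine index. The record-based query has to rank clusters by similarity to an unseen publication, and the naive loop above grows with the total number of records, which is the cost a faster record-to-cluster match aims to reduce.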

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Web Services, ICWS 2018 - Part of the 2018 IEEE World Congress on Services
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 265-273
Number of pages: 9
ISBN (Print): 9781538672471
State: Published - Sep 5 2018
Event: 25th IEEE International Conference on Web Services, ICWS 2018 - San Francisco, United States
Duration: Jul 2 2018 → Jul 7 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Web Services, ICWS 2018 - Part of the 2018 IEEE World Congress on Services

Other

Other: 25th IEEE International Conference on Web Services, ICWS 2018
Country/Territory: United States
City: San Francisco
Period: 7/2/18 → 7/7/18

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
