Accelerating Substructure Similarity Search for Formula Retrieval

Wei Zhong, Shaurya Rohatgi, Jian Wu, C. Lee Giles, Richard Zanibbi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Formula retrieval systems that use substructure matching are effective but suffer from slow retrieval times caused by the complexity of structure matching. We present a specialized inverted index and a rank-safe dynamic pruning algorithm for faster substructure retrieval. Formulas are indexed from their Operator Tree (OPT) representations. We evaluate our model on the NTCIR-12 Wikipedia Formula Browsing Task and on a new formula corpus produced from Math StackExchange posts. Our approach preserves the effectiveness of structure matching while allowing queries to be executed in real time.

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Proceedings
Editors: Joemon M. Jose, Emine Yilmaz, João Magalhães, Flávio Martins, Pablo Castells, Nicola Ferro, Mário J. Silva
Publisher: Springer
Pages: 714-727
Number of pages: 14
ISBN (Print): 9783030454388
DOIs: https://doi.org/10.1007/978-3-030-45439-5_47
State: Published - 2020
Event: 42nd European Conference on IR Research, ECIR 2020 - Lisbon, Portugal
Duration: Apr 14 2020 to Apr 17 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12035 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 42nd European Conference on IR Research, ECIR 2020
Country: Portugal
City: Lisbon
Period: 4/14/20 to 4/17/20

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Zhong, W., Rohatgi, S., Wu, J., Giles, C. L., & Zanibbi, R. (2020). Accelerating Substructure Similarity Search for Formula Retrieval. In J. M. Jose, E. Yilmaz, J. Magalhães, F. Martins, P. Castells, N. Ferro, & M. J. Silva (Eds.), Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Proceedings (pp. 714-727). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12035 LNCS). Springer. https://doi.org/10.1007/978-3-030-45439-5_47