Ranked list fusion and re-ranking with pre-trained transformers for ARQMath lab

Shaurya Rohatgi, Jian Wu, C. Lee Giles

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper elaborates on our submission to the ARQMath track at CLEF 2021. For this year's submission, we use a collection of methods to retrieve and re-rank answers from Math Stack Exchange, in addition to our two-stage model, which was comparable to last year's best model in terms of NDCG'. We also provide a detailed analysis of what the transformers are learning and why it is hard to train a math language model using transformers. This year's submission to Task-1 includes summarizing long question-answer pairs to augment and index documents, using byte-pair encoding to tokenize formulas and then re-rank them, and extracting important keywords from posts. Using an ensemble of these methods, our approach shows a 20% improvement over our ARQMath 2020 Task-1 submission.
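
As a rough illustration of what fusing ranked lists from several retrieval methods can look like, here is a minimal sketch. The abstract does not say which fusion method the authors used; reciprocal rank fusion (RRF) is swapped in here purely as a common, simple stand-in. The function name, the constant `k=60`, and the answer IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of answer IDs into one ranking.

    Assumption: RRF is used only as an illustrative fusion rule; it is
    not confirmed to be the method used in the paper.
    Each answer gets 1 / (k + rank) from every list it appears in.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, answer_id in enumerate(ranking, start=1):
            scores[answer_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))


# Hypothetical outputs of three retrieval runs for a single question.
summary_run = ["a1", "a2", "a3"]
formula_run = ["a2", "a3", "a1"]
keyword_run = ["a2", "a1", "a4"]

fused = reciprocal_rank_fusion([summary_run, formula_run, keyword_run])
print(fused)  # ['a2', 'a1', 'a3', 'a4']
```

An answer that ranks highly in several runs (here `a2`) rises to the top, while an answer seen by only one run (`a4`) falls to the bottom.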

Original language: English (US)
Pages (from-to): 125-132
Number of pages: 8
Journal: CEUR Workshop Proceedings
Volume: 2936
State: Published - 2021
Event: 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 - Virtual, Bucharest, Romania
Duration: Sep 21 2021 to Sep 24 2021

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
