TY - GEN

T1 - Õ(n+poly(k))-time Algorithm for Bounded Tree Edit Distance

AU - Das, Debarati

AU - Gilbert, Jacob

AU - Hajiaghayi, Mohammad Taghi

AU - Kociumaka, Tomasz

AU - Saha, Barna

AU - Saleh, Hamed

N1 - Funding Information:
A full version of this work is available at https://arxiv.org/abs/2209.07524. Statements marked with ♠ are proved in the full version only. Barna Saha and Tomasz Kociumaka were partly supported by NSF CCF grants 1652303 and 1909046, and a HDR TRIPODS Phase II grant 2217058. MohammadTaghi Hajiaghayi and Jacob Gilbert were partly supported by NSF CCF grants 2114269 and 2218678.
Publisher Copyright:
© 2022 IEEE.

PY - 2022

Y1 - 2022

N2 - Computing the edit distance of two strings is one of the most basic problems in computer science and combinatorial optimization. Tree edit distance is a natural generalization of edit distance in which the task is to compute a measure of dissimilarity between two (unweighted) rooted trees with node labels. Perhaps the most notable recent application of tree edit distance is in NoSQL big databases, such as MongoDB, where each row of the database is a JSON document represented as a labeled rooted tree and finding dissimilarity between two rows is a basic operation. Until recently, the fastest algorithm for tree edit distance ran in cubic time (Demaine, Mozes, Rossman, Weimann; TALG'10); however, Mao (FOCS'21) broke the cubic barrier for the tree edit distance problem using fast matrix multiplication.Given a parameter k as an upper bound on the distance, an O(n+k2)-time algorithm for edit distance has been known since the 1980s due to works of Myers (Algorithmica'86) and Landau and Vishkin (JCSS'88). The existence of an O(n+poly(k))-time algorithm for tree edit distance has been posed as open question, e.g., by Akmal and Jin (ICALP'21), who give a stateof-the-art O(nk2)-time algorithm. In this paper, we answer this question positively.

AB - Computing the edit distance of two strings is one of the most basic problems in computer science and combinatorial optimization. Tree edit distance is a natural generalization of edit distance in which the task is to compute a measure of dissimilarity between two (unweighted) rooted trees with node labels. Perhaps the most notable recent application of tree edit distance is in NoSQL big databases, such as MongoDB, where each row of the database is a JSON document represented as a labeled rooted tree and finding dissimilarity between two rows is a basic operation. Until recently, the fastest algorithm for tree edit distance ran in cubic time (Demaine, Mozes, Rossman, Weimann; TALG'10); however, Mao (FOCS'21) broke the cubic barrier for the tree edit distance problem using fast matrix multiplication.Given a parameter k as an upper bound on the distance, an O(n+k2)-time algorithm for edit distance has been known since the 1980s due to works of Myers (Algorithmica'86) and Landau and Vishkin (JCSS'88). The existence of an O(n+poly(k))-time algorithm for tree edit distance has been posed as open question, e.g., by Akmal and Jin (ICALP'21), who give a stateof-the-art O(nk2)-time algorithm. In this paper, we answer this question positively.

UR - http://www.scopus.com/inward/record.url?scp=85146356649&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85146356649&partnerID=8YFLogxK

U2 - 10.1109/FOCS54457.2022.00071

DO - 10.1109/FOCS54457.2022.00071

M3 - Conference contribution

AN - SCOPUS:85146356649

T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS

SP - 686

EP - 697

BT - Proceedings - 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science, FOCS 2022

PB - IEEE Computer Society

T2 - 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022

Y2 - 31 October 2022 through 3 November 2022

ER -