TY - GEN
T1 - Learning Clause Representation from Dependency-Anchor Graph for Connective Prediction
AU - Gao, Yanjun
AU - Huang, Ting Hao
AU - Passonneau, Rebecca J.
N1 - Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
AB - Semantic representation that supports the choice of an appropriate connective between pairs of clauses inherently addresses discourse coherence, which is important for tasks such as narrative understanding, argumentation, and discourse parsing. We propose a novel clause embedding method that applies graph learning to a data structure we refer to as a dependency-anchor graph. The dependency-anchor graph incorporates two kinds of syntactic information, constituency structure and dependency relations, to highlight the subject and verb phrase relation. This enhances coherence-related aspects of representation. We design a neural model to learn a semantic representation for clauses from graph convolution over latent representations of the subject and verb phrase. We evaluate our method on two new datasets: a subset of a large corpus where the source texts are published novels, and a new dataset collected from students' essays. The results demonstrate a significant improvement over tree-based models, confirming the importance of emphasizing the subject and verb phrase. The performance gap between the two datasets illustrates the challenges of analyzing students' written text, and points to a potential evaluation task for coherence modeling and an application for suggesting revisions to students.
UR - http://www.scopus.com/inward/record.url?scp=85109736617&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109736617&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85109736617
T3 - TextGraphs 2021 - Graph-Based Methods for Natural Language Processing, Proceedings of the 15th Workshop - in conjunction with the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2021
SP - 54
EP - 66
BT - TextGraphs 2021 - Graph-Based Methods for Natural Language Processing, Proceedings of the 15th Workshop - in conjunction with the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2021
A2 - Panchenko, Alexander
A2 - Malliaros, Fragkiskos D.
A2 - Logacheva, Varvara
A2 - Jana, Abhik
A2 - Ustalov, Dmitry
A2 - Jansen, Peter
PB - Association for Computational Linguistics (ACL)
T2 - 15th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2021
Y2 - 11 June 2021
ER -