Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents

Rabah A. Al-Zaidy, Cornelia Caragea, Clyde Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we address the keyphrase extraction problem as sequence labeling and propose a model that jointly exploits the complementary strengths of Conditional Random Fields, which capture label dependencies through a transition parameter matrix consisting of the transition probabilities from one label to the neighboring label, and Bidirectional Long Short-Term Memory networks, which capture hidden semantics in text through long-distance dependencies. Our results on three datasets of scholarly documents show that the proposed model substantially outperforms strong baselines and previous approaches for keyphrase extraction.
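
To illustrate how a transition matrix over labels can interact with per-token scores, the following is a minimal sketch of Viterbi decoding in plain Python. It is not the authors' implementation. The tag set (`O`, `B-KP`, `I-KP`), the function name `viterbi_decode`, and all numeric scores are assumptions made up for this example; in a Bi-LSTM-CRF the emission scores would come from the Bi-LSTM and the transition scores would be learned.

```python
from typing import List, Sequence

def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label index sequence.

    emissions[t][j]   : score of label j at token t (stand-in for Bi-LSTM output)
    transitions[i][j] : score of moving from label i to label j (CRF matrix)
    """
    if not emissions:
        return []
    num_labels = len(transitions)
    # Best score of any path that ends in each label at the current token.
    scores = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(num_labels):
            # Best previous label i for reaching label j at token t.
            best_i = max(range(num_labels),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j]
                              + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(range(num_labels), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical tag set: outside, begin-keyphrase, inside-keyphrase.
LABELS = ["O", "B-KP", "I-KP"]
FORBIDDEN = -1e4  # large penalty standing in for a disallowed transition
TRANSITIONS = [
    # to:  O    B-KP  I-KP
    [0.0, 0.0, FORBIDDEN],  # from O: a keyphrase cannot start with I-KP
    [0.0, 0.0, 0.5],        # from B-KP
    [0.0, 0.0, 0.5],        # from I-KP
]
# Made-up emission scores for four tokens.
EMISSIONS = [
    [2.0, 0.1, 0.0],
    [1.5, 0.3, 0.2],
    [0.2, 1.0, 1.2],  # per-token argmax alone would pick I-KP right after O
    [0.1, 0.4, 1.5],
]

decoded = [LABELS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]
greedy = [LABELS[max(range(len(LABELS)), key=lambda j: row[j])]
          for row in EMISSIONS]
print("greedy :", greedy)
print("viterbi:", decoded)
```

In this toy setup, choosing the best label for each token independently yields an `I-KP` directly after `O`, while decoding with the transition matrix yields a well-formed `B-KP`, `I-KP` span. This is the kind of label dependency the abstract attributes to the CRF layer.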

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 2551-2557
Number of pages: 7
ISBN (Electronic): 9781450366748
DOI: https://doi.org/10.1145/3308558.3313642
State: Published - May 13 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

Fingerprint

Labeling
Labels
Semantics
Long short-term memory

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Cite this

Al-Zaidy, R. A., Caragea, C., & Giles, C. L. (2019). Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 2551-2557). (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313642
@inproceedings{dc6ead84dbf04bfe896c323e1d819885,
title = "Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents",
abstract = "In this paper, we address the keyphrase extraction problem as sequence labeling and propose a model that jointly exploits the complementary strengths of Conditional Random Fields that capture label dependencies through a transition parameter matrix consisting of the transition probabilities from one label to the neighboring label, and Bidirectional Long Short Term Memory networks that capture hidden semantics in text through the long distance dependencies. Our results on three datasets of scholarly documents show that the proposed model substantially outperforms strong baselines and previous approaches for keyphrase extraction.",
author = "Al-Zaidy, {Rabah A.} and Cornelia Caragea and Giles, {Clyde Lee}",
year = "2019",
month = "5",
day = "13",
doi = "10.1145/3308558.3313642",
language = "English (US)",
series = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",
publisher = "Association for Computing Machinery, Inc",
pages = "2551--2557",
booktitle = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",

}


