Does syntax help discourse segmentation? Not so much
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Does syntax help discourse segmentation? Not so much. / Braud, Chloé Elodie; Lacroix, Ophélie; Søgaard, Anders.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2017. p. 2432–2442.
RIS
TY - GEN
T1 - Does syntax help discourse segmentation? Not so much
AU - Braud, Chloé Elodie
AU - Lacroix, Ophélie
AU - Søgaard, Anders
PY - 2017
Y1 - 2017
N2 - Discourse segmentation is the first step in building discourse parsers. Most work on discourse segmentation does not scale to real-world discourse parsing across languages, for two reasons: (i) models rely on constituent trees, and (ii) experiments have relied on gold standard identification of sentence and token boundaries. We therefore investigate to what extent constituents can be replaced with universal dependencies, or left out completely, as well as how state-of-the-art segmenters fare in the absence of sentence boundaries. Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation.
AB - Discourse segmentation is the first step in building discourse parsers. Most work on discourse segmentation does not scale to real-world discourse parsing across languages, for two reasons: (i) models rely on constituent trees, and (ii) experiments have relied on gold standard identification of sentence and token boundaries. We therefore investigate to what extent constituents can be replaced with universal dependencies, or left out completely, as well as how state-of-the-art segmenters fare in the absence of sentence boundaries. Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation.
M3 - Article in proceedings
SN - 978-1-945626-97-5
SP - 2432
EP - 2442
BT - Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
PB - Association for Computational Linguistics
T2 - 2017 Conference on Empirical Methods in Natural Language Processing
Y2 - 9 September 2017 through 11 September 2017
ER -
ID: 195014956