TY - GEN
T1 - MORSE
T2 - 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
AU - Sakakini, Tarek
AU - Bhat, Suma
AU - Viswanath, Pramod
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
AB - In this paper we present a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across different datasets and languages and present new state-of-the-art results.
UR - http://www.scopus.com/inward/record.url?scp=85040944708&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040944708&partnerID=8YFLogxK
U2 - 10.18653/v1/P17-1051
DO - 10.18653/v1/P17-1051
M3 - Conference contribution
AN - SCOPUS:85040944708
T3 - ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
SP - 552
EP - 561
BT - ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
PB - Association for Computational Linguistics (ACL)
Y2 - 30 July 2017 through 4 August 2017
ER -