Revision as of 01:43, 18 March 2021, AnomieBOT: m Dating maintenance tags: {{Citation needed}}
Revision as of 15:08, 13 August 2021, Asjkl123: m reference about multitask learning for NLI
Line 21:
[[Natural language processing]] methods are used to extract and identify language usage patterns common to speakers of an L1 group. This is done using language learner data, usually from a [[learner corpus]]. Next, [[machine learning]] is applied to train classifiers, such as [[support vector machine]]s, to predict the L1 of unseen texts.<ref>Tetreault et al., [http://anthology.aclweb.org/C/C12/C12-1158.pdf "Native Tongues, Lost and Found: Resources and Empirical Evaluations in Native Language Identification"], In Proc. International Conf. on Computational Linguistics (COLING), 2012</ref> A range of ensemble-based systems has also been applied to the task and shown to improve performance over single-classifier systems.<ref>Malmasi, Shervin, Sze-Meng Jojo Wong, and Mark Dras. [http://anthology.aclweb.org/W/W13/W13-1716.pdf "NLI Shared Task 2013: MQ submission"]. Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications. 2013.</ref><ref>Habic, Vuk, Semenov, Alexander, and Pasiliao, Eduardo. [https://www.sciencedirect.com/science/article/abs/pii/S0950705120305694 "Multitask deep learning for native language identification"] in Knowledge-Based Systems, 2020</ref> Various linguistic feature types have been applied to this task. These include syntactic features such as constituent parses, grammatical dependencies and part-of-speech tags.
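
The pipeline described above (extract features from learner texts, train classifiers, combine them in an ensemble) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not from the cited work: the training sentences are invented toy examples rather than learner-corpus data, a simple nearest-centroid classifier stands in for a support vector machine, and surface features (character trigrams and word unigrams) replace the syntactic features named above, which would require an external parser or tagger. All function and class names are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_unigrams(text):
    """Count lowercased whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidClassifier:
    """Nearest-centroid classifier over one feature type (stand-in for an SVM)."""

    def __init__(self, featurize):
        self.featurize = featurize
        self.centroids = {}

    def fit(self, texts, labels):
        # Sum the feature counts of all training texts sharing an L1 label.
        for text, label in zip(texts, labels):
            self.centroids.setdefault(label, Counter()).update(self.featurize(text))
        return self

    def predict(self, text):
        features = self.featurize(text)
        # Sorting makes tie-breaking deterministic (alphabetically first label).
        return max(sorted(self.centroids),
                   key=lambda label: cosine(features, self.centroids[label]))


def ensemble_predict(classifiers, text):
    """Majority vote over classifiers; ties go to the alphabetically first label."""
    votes = Counter(clf.predict(text) for clf in classifiers)
    best = max(votes.values())
    return min(label for label, count in votes.items() if count == best)


# Invented toy sentences with hypothetical L1 labels (not real corpus data).
TRAIN = [
    ("i have yesterday the book read", "DE"),
    ("he has since three years here lived", "DE"),
    ("i am agree with the teacher", "ES"),
    ("she explained me the problem", "ES"),
]

texts, labels = zip(*TRAIN)
classifiers = [CentroidClassifier(f).fit(texts, labels)
               for f in (char_ngrams, word_unigrams)]

print(ensemble_predict(classifiers, "i am agree with the problem"))
```

Each classifier sees a different feature type, and the ensemble combines their votes. A realistic system would train on a learner corpus with far more data, and would likely use a library SVM implementation together with part-of-speech or dependency features.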

Native-language identification: Difference between revisions