Deep linguistic processing is a natural language processing framework that draws on theoretical and descriptive linguistics. It models language predominantly by way of theoretical syntactic/semantic theories (e.g. CCG, HPSG, LFG, TAG, the Prague School). Deep linguistic processing approaches differ from shallower methods in that they yield richer, more expressive structural representations which capture long-distance dependencies or the underlying predicate-argument structure directly.[1]
Deep vs Shallow Linguistic Processing
Traditionally, deep linguistic processing has been concerned with computational grammar development (for use in both parsing and generation). These grammars were manually developed and maintained, and were computationally expensive to run. In recent years, machine learning approaches (also known as shallow linguistic processing) have fundamentally altered the field of natural language processing. Robust, wide-coverage machine learning NLP tools can be created rapidly and require substantially less manual labor. As a result, deep linguistic processing methods have received less attention. However, some computational linguists believe that detailed syntactic and semantic representation is necessary for computers to understand natural language or to perform inference.
Moreover, shallow methods may lack human language 'understanding': while humans can easily understand a sentence and its meaning, a shallow system may capture only its surface form. For example[2]:
a) Things would be different if Microsoft was located in Georgia.
b) The National Institute for Psychology in Israel was established in May 1971 as the Israel Center for Psychobiology by Prof. Joel.
In sentence (a), a shallow information extraction system might wrongly infer that Microsoft's headquarters is located in Georgia, whereas a human reader understands from the counterfactual phrasing that Microsoft is not located in Georgia.
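The failure on sentence (a) can be illustrated with a minimal Python sketch. It assumes a hypothetical surface pattern of the form "X is/was located in Y"; the names LOCATED_IN, shallow_extract and counterfactual_aware_extract are invented for this illustration and do not come from any cited system. The second function is only a crude heuristic stand-in for the kind of information a deep analysis would make available, not a deep linguistic processor.

```python
import re

# Hypothetical surface pattern: "<Entity> is/was located in <Place>".
LOCATED_IN = re.compile(r"([A-Z]\w+) (?:is|was) located in ([A-Z]\w+)")


def shallow_extract(sentence):
    """Return (entity, place) pairs matched by the surface pattern alone."""
    return LOCATED_IN.findall(sentence)


def counterfactual_aware_extract(sentence):
    """Heuristic assumption: drop matches that directly follow 'if'.

    A deep grammar would represent the conditional clause structurally;
    here it is approximated by inspecting the text before the match.
    """
    facts = []
    for match in LOCATED_IN.finditer(sentence):
        prefix = sentence[:match.start()].lower()
        if re.search(r"\bif\s*$", prefix):
            continue  # the clause is hypothetical, so assert nothing
        facts.append(match.groups())
    return facts


sentence_a = "Things would be different if Microsoft was located in Georgia."
print(shallow_extract(sentence_a))               # the pattern fires on the counterfactual clause
print(counterfactual_aware_extract(sentence_a))  # no fact is asserted
```

The surface pattern treats the conditional clause as a plain statement of fact, which is the kind of error described above. The "if" check handles only this one construction and would miss other counterfactual or negated contexts.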
References
- ^ Timothy Baldwin, Mark Dras, Julia Hockenmaier, Tracy Holloway King, and Gertjan van Noord. 2007. The impact of deep linguistic processing on parsing technology. In Proc. of the 10th International Workshop on Parsing Technologies (IWPT-2007), pages 36–8, Prague, Czech Republic.
- ^ U. Schäfer. 2007. Integrating Deep and Shallow Natural Language Processing Components – Representations and Hybrid Architectures. Ph.D. thesis, Faculty of Mathematics and Computer Science, Saarland University, Saarbrücken, Germany.