{{Infobox_Software2
|logo = [[Image:Gate.gif|83px|GATE logo]]
|name = GATE
|screenshot =[[Image:Gate-add-new.png|250px|]]
|caption = General Architecture for Text Engineering.
|developer = [[Sheffield NLP research group]]
|operating_system = [[Cross-platform]]
|language = [[English language|English]] only
|genre = [[Text mining]]
|license = [[GNU Lesser General Public License|LGPL]]
|website = [http://gate.ac.uk/ http://gate.ac.uk/]
}}
'''General Architecture for Text Engineering''' or '''GATE''' is a [[Java (programming language)|Java]] software toolkit developed at the [[University of Sheffield]] beginning in 1995. It is now used worldwide by a community of scientists, companies, teachers and students for a wide range of [[natural language processing]] tasks, including [[information extraction]] in many languages.
The GATE community and its research are involved in several European research projects, including [[Transitioning Applications to Ontologies|TAO]] and [[SEKT]].
== Features ==
GATE includes an information extraction system called ANNIE (A Nearly-New Information Extraction System), a set of modules comprising a [[Lexical analysis|tokenizer]], a [[gazetteer]], a [[Sentence boundary disambiguation|sentence splitter]], a [[Part-of-speech tagging|part-of-speech tagger]], a [[Named entity recognition|named entity]] transducer and a [[coreference]] tagger.
Languages currently handled in GATE include English, Spanish, Chinese, Arabic, French, German, Hindi, Cebuano, Romanian and Russian.
There is a large set of plugins: for [[machine learning]] with [[Weka (machine learning)|Weka]], RASP, MAXENT and SVM Light; for managing [[Ontologies|ontologies]] like [[WordNet]]; for querying [[search engines]] like [[Google]] or [[Yahoo]]; for part-of-speech tagging with [[Brill tagger|Brill]] or TreeTagger; and many more.
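
The modular design described above, in which each component adds annotations over the same text, can be illustrated with a minimal sketch. This is not GATE's API: the <code>Annotation</code> class, the stage functions, the tiny <code>GAZETTEER</code> dictionary and the naive punctuation-based sentence splitting are all hypothetical simplifications chosen for illustration, and only three of the ANNIE stages are imitated.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A stand-off annotation: a typed span over the text (hypothetical model)."""
    kind: str            # e.g. "Token", "Lookup", "Sentence"
    start: int           # character offset where the span begins
    end: int             # character offset just past the span
    features: dict = field(default_factory=dict)


def tokenizer(text, annotations):
    # Words and single punctuation marks each become a Token annotation.
    for m in re.finditer(r"\w+|[^\w\s]", text):
        annotations.append(Annotation("Token", m.start(), m.end(), {"string": m.group()}))


# Hypothetical gazetteer list: surface string -> entity type.
GAZETTEER = {"Sheffield": "location", "London": "location"}


def gazetteer(text, annotations):
    # Mark every token whose string appears in the gazetteer list.
    tokens = [a for a in annotations if a.kind == "Token"]
    for tok in tokens:
        major_type = GAZETTEER.get(tok.features["string"])
        if major_type is not None:
            annotations.append(Annotation("Lookup", tok.start, tok.end, {"majorType": major_type}))


def sentence_splitter(text, annotations):
    # Naive assumption: every '.', '!' or '?' ends a sentence.
    start = 0
    for m in re.finditer(r"[.!?]", text):
        annotations.append(Annotation("Sentence", start, m.end()))
        start = m.end()
        while start < len(text) and text[start].isspace():
            start += 1
    if start < len(text):
        annotations.append(Annotation("Sentence", start, len(text)))


def run_pipeline(text, stages):
    # Each stage reads the text and earlier annotations, then adds its own.
    annotations = []
    for stage in stages:
        stage(text, annotations)
    return annotations


if __name__ == "__main__":
    doc = "GATE was built in Sheffield. It is written in Java."
    for ann in run_pipeline(doc, [tokenizer, gazetteer, sentence_splitter]):
        print(ann.kind, ann.start, ann.end, repr(doc[ann.start:ann.end]), ann.features)
```

Because the stages share one annotation list, later stages can build on earlier ones (here the gazetteer relies on the tokenizer's output), which is why the order of modules in such a pipeline matters.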
GATE can handle input in various formats, such as [[Text file|TXT]], [[HTML]], [[XML]], [[DOC (computing)|Doc]] and [[PDF]] documents, as well as [[Serialization|Java serial]] and [[Lucene]] data stores and, with the help of RDBMS storage over [[JDBC]], [[PostgreSQL]] and [[Oracle database|Oracle]] databases.