Revision as of 15:21, 17 July 2007 by I am Thomas (edit summary: Description of the graphical user interface) → Revision as of 15:47, 17 July 2007 by I am Thomas (edit summary: paging)
{{Infobox_Software2
|logo = [[Image:Gate.gif|83px|GATE logo]]
|name = GATE
|screenshot = [[Image:Gate-add-new.png|250px|]]
|caption = General Architecture for Text Engineering.
|developer = [[Sheffield NLP research group]]
|operating_system = [[Cross-platform]]
|language = [[English language|English]] only
|genre = [[Text mining]]
|license = [[GNU Lesser General Public License|LGPL]]
|website = [http://gate.ac.uk/ http://gate.ac.uk/]
}}
'''General Architecture for Text Engineering''' or '''GATE''' is a [[Java (programming language)|Java]] software toolkit originally developed at the [[University of Sheffield]], beginning in 1995, and now used worldwide by scientists, companies, teachers and students for many [[Natural language processing|natural language processing]] tasks, including [[Information extraction|information extraction]] in many languages.

The GATE community and its research are involved in several European research projects, including [[Transitioning Applications to Ontologies|TAO]] and [[SEKT]].

== Features ==
GATE includes an information extraction system called ANNIE (A Nearly-New Information Extraction System), a set of modules comprising a [[Lexical analysis|tokenizer]], a [[Gazetteer|gazetteer]], a [[Sentence boundary disambiguation|sentence splitter]], a [[Part-of-speech tagging|part-of-speech tagger]], a [[Named entity recognition|named entities]] transducer and a [[Coreference|coreference]] tagger.

Languages currently handled in GATE include English, Spanish, Chinese, Arabic, French, German, Hindi, Cebuano, Romanian and Russian.
There is a large set of plugins: for [[machine learning]] with [[Weka (machine learning)|Weka]], RASP, MAXENT and SVM Light; for managing [[Ontologies]] like [[WordNet]]; for querying [[search engines]] like [[Google]] or [[Yahoo]]; for part-of-speech tagging with [[Brill tagger|Brill]] or TreeTagger; and many more.

GATE can handle input in various formats, such as [[Text file|TXT]], [[HTML]], [[XML]], [[DOC (computing)|Doc]] and [[PDF]] documents. It also supports [[Serialization|Java serial]] and [[Lucene]] storage, as well as [[PostgreSQL]] and [[Oracle database|Oracle]] databases with the help of RDBMS storage over [[JDBC]].
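
The ANNIE module chain described above can be pictured as a sequence of stages that each add annotations (typed character spans) to a shared document. The following is only a toy sketch of that idea in Python, not GATE's Java API: the names <code>Annotation</code>, <code>Document</code>, <code>make_gazetteer</code> and <code>run_pipeline</code> are hypothetical, the stages are deliberately simplistic, and only three of the six ANNIE modules are imitated.

```python
# Toy illustration only: these names are hypothetical and are NOT GATE's Java API.
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str    # e.g. "Token", "Lookup", "Sentence"
    start: int   # character offset where the span begins
    end: int     # character offset just past the span
    features: dict = field(default_factory=dict)


@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)


def tokenizer(doc):
    """Add one Token annotation per run of word characters or punctuation mark."""
    for m in re.finditer(r"\w+|[^\w\s]", doc.text):
        doc.annotations.append(Annotation("Token", m.start(), m.end()))


def make_gazetteer(entries):
    """Build a stage that marks every occurrence of a listed phrase as a Lookup."""
    def gazetteer(doc):
        for phrase, major_type in entries.items():
            for m in re.finditer(re.escape(phrase), doc.text):
                doc.annotations.append(
                    Annotation("Lookup", m.start(), m.end(), {"majorType": major_type})
                )
    return gazetteer


def sentence_splitter(doc):
    """Add a Sentence annotation for each stretch ending in '.', '!' or '?'."""
    for m in re.finditer(r"[^.!?]+[.!?]", doc.text):
        # Skip leading whitespace so the span starts at the first real character.
        start = m.start() + len(m.group()) - len(m.group().lstrip())
        doc.annotations.append(Annotation("Sentence", start, m.end()))


def run_pipeline(doc, stages):
    """Apply each stage in order; later stages can see earlier annotations."""
    for stage in stages:
        stage(doc)
    return doc


doc = run_pipeline(
    Document("GATE was developed in Sheffield. It is written in Java."),
    [tokenizer, make_gazetteer({"Sheffield": "location"}), sentence_splitter],
)
for ann in doc.annotations:
    if ann.kind != "Token":
        print(ann.kind, repr(doc.text[ann.start:ann.end]), ann.features)
```

Keeping annotations as offsets into the unchanged text, rather than rewriting the text, lets independent stages be added, removed or reordered without affecting one another's output, which is the property a modular pipeline relies on.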

General Architecture for Text Engineering: Difference between revisions