Revision as of 01:53, 14 June 2020 by OAbot (m, Open access bot: doi added to citation with #oabot.)
Revision as of 11:50, 20 July 2020 by Frap (MOS:HEAD)
Line 6:

[[Semi-supervised learning]] approaches that use a small number of labeled examples with many unlabeled examples are usually unreliable, as they produce an internally consistent but incorrect set of extractions. CPL solves this problem by simultaneously learning classifiers for many different categories and relations in the presence of an [[ontology]] defining constraints that couple the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009.<ref name=cbl2009>{{cite journal|last=Carlson|first=Andrew|author2=Justin Betteridge |author3=Estevam R. Hruschka Jr. |author4= Tom M. Mitchell |year=2009|title=Coupling semi-supervised learning of categories and relations|journal=Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing |publisher=Association for Computational Linguistics|location=Colorado, USA|pages=1–9|url=http://dl.acm.org/citation.cfm?id=1621829.1621830}}</ref><ref name=cpl2010>{{cite journal|last=Carlson|first=Andrew|author2=Justin Betteridge |author3=Richard C. Wang |author4=Estevam R. Hruschka Jr. |author5= Tom M. Mitchell |year=2010|title=Coupled semi-supervised learning for information extraction|journal=Proceedings of the Third ACM International Conference on Web Search and Data Mining |publisher=ACM|location=NY, USA|pages=101–110|url=http://dl.acm.org/citation.cfm?doid=1718487.1718501|doi=10.1145/1718487.1718501|isbn=9781605588896|doi-access=free}}</ref>

== CPL ~~Overview~~overview==

CPL is an approach to [[semi-supervised learning]] that yields more accurate results by coupling the training of many information extractors. The basic idea behind CPL is that semi-supervised training of a single type of extractor, such as ‘coach’, is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types.
Using prior knowledge about the relationships between these different entities and relations, CPL turns unlabeled data into a useful constraint during training. For example, ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’.

== CPL ~~Description~~description ==

=== Coupling of ~~Predicates~~predicates ===

CPL primarily relies on the notion of coupling the [[learning]] of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned function in two ways.

# Sharing among same-arity predicates according to logical relations

Line 21:

This is type-checking information used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types.

=== Algorithm ~~Description~~description ===

The following is a quick summary of the CPL algorithm.<ref name=cpl2010 />

Line 45:

* Relation Patterns

==== Candidate ~~Filtering~~filtering ====

Candidate instances and patterns are filtered to maintain high precision and to avoid extremely specific patterns. An instance is considered for assessment only if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns.

==== Candidate ~~Ranking~~ranking ====

CPL ranks candidate instances by the number of promoted patterns they co-occur with, so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern.

==== Candidate ~~Promotion~~promotion ====

CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate.
Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively.
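
The filtering, ranking and promotion steps for candidate instances can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name <code>promote_instances</code>, the dictionary layout, the pattern strings and the toy counts are all hypothetical, and "at least three times greater" is read here as "at least three times the negative count". Pattern promotion (at most 5 per predicate, ranked by estimated precision) is not shown.

```python
# Hypothetical sketch of CPL-style instance filtering, ranking and promotion.
# All names and data below are illustrative assumptions, not from the papers.

MIN_PROMOTED_PATTERNS = 2   # must co-occur with at least two promoted patterns
POS_TO_NEG_RATIO = 3        # positive count must be >= 3x the negative count
MAX_INSTANCES = 100         # at most 100 instances promoted per predicate


def promote_instances(cooccurrence, promoted_patterns, negative_patterns,
                      limit=MAX_INSTANCES):
    """Return the candidates to promote for one predicate, best first.

    cooccurrence maps each candidate noun phrase to a dict of
    {pattern: co-occurrence count} (an assumed input format).
    """
    scored = []
    for candidate, counts in cooccurrence.items():
        # Counts with patterns already promoted for this predicate.
        positive = {p: n for p, n in counts.items()
                    if p in promoted_patterns and n > 0}
        # Total count with patterns treated as negative evidence.
        negative_total = sum(n for p, n in counts.items()
                             if p in negative_patterns)

        # Filtering: enough distinct promoted patterns, and enough
        # positive evidence relative to negative evidence.
        if len(positive) < MIN_PROMOTED_PATTERNS:
            continue
        if sum(positive.values()) < POS_TO_NEG_RATIO * negative_total:
            continue

        # Ranking score: number of distinct promoted patterns.
        scored.append((len(positive), candidate))

    # Higher score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    # Promotion: keep at most `limit` candidates.
    return [candidate for _, candidate in scored[:limit]]


# Toy data for a hypothetical 'coach' predicate.
promoted_patterns = {"coach _", "_ coached the team", "head coach _"}
negative_patterns = {"sport of _"}
cooccurrence = {
    "alice smith": {"coach _": 5, "head coach _": 3},
    "bob jones": {"coach _": 2, "_ coached the team": 1, "head coach _": 1},
    "baseball": {"coach _": 1, "_ coached the team": 1, "sport of _": 4},
    "springfield": {"coach _": 7},
}

promoted = promote_instances(cooccurrence, promoted_patterns, negative_patterns)
print(promoted)
```

In this toy run, "baseball" is dropped because its negative evidence outweighs its positive evidence, and "springfield" is dropped because it co-occurs with only one promoted pattern.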

Coupled pattern learner: Difference between revisions