Coupled pattern learner

{{Orphan|date=March 2012}}
 
Coupled Pattern Learner (CPL) is a [[machine learning]] algorithm that couples the [[semi-supervised learning]] of categories and relations to forestall the problem of semantic drift associated with bootstrap learning methods.
 
== Coupled Pattern Learner ==
[[Semi-supervised learning]] approaches that use a small number of labeled examples together with many unlabeled examples are usually unreliable, as they produce an internally consistent but incorrect set of extractions. CPL addresses this problem by simultaneously learning classifiers for many different categories and relations in the presence of an [[ontology]] that defines constraints coupling the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009.<ref name=cbl2009>{{cite journal|last=Carlson|first=Andrew|coauthors=Justin Betteridge; Estevam R. Hruschka Jr.; Tom M. Mitchell|year=2009|title=Coupling semi-supervised learning of categories and relations|journal=Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing |publisher=Association for Computational Linguistics|location=Colorado, USA|pages=1–9|url=http://dl.acm.org/citation.cfm?id=1621829.1621830}}</ref><ref name=cpl2010>{{cite journal|last=Carlson|first=Andrew|coauthors=Justin Betteridge; Richard C. Wang; Estevam R. Hruschka Jr.; Tom M. Mitchell|year=2010|title=Coupled semi-supervised learning for information extraction|journal=Proceedings of the third ACM international conference on Web search and data mining |publisher=ACM|location=NY, USA|pages=101–110|url=http://dl.acm.org/citation.cfm?doid=1718487.1718501}}</ref>
 
== CPL Overview ==
 
== CPL Description ==
 
=== Coupling of Predicates ===
CPL primarily relies on the notion of coupling the [[learning]] of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned function in two ways.
 
=== Sharing among same-arity predicates ===
Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If predicate A is [[mutually exclusive]] with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, having an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ become a negative instance and a negative pattern, respectively, for ‘scientist’. Further, some categories are declared to be a subset of another category. For example, ‘athlete’ is a subset of ‘person’.
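
The mutual-exclusion rule described above can be illustrated with a minimal Python sketch. The function name <code>propagate_mutual_exclusion</code> and the dictionary layout are hypothetical choices made for this illustration and are not taken from the CPL papers; the sketch only shows how the positives of one predicate turn into negatives for the predicates it is mutually exclusive with.

```python
from collections import defaultdict


def propagate_mutual_exclusion(positives, mutex_pairs):
    """Derive negative instances and patterns from mutual-exclusion constraints.

    positives:   predicate name -> {"instances": set, "patterns": set}
    mutex_pairs: iterable of (A, B) pairs declared mutually exclusive
    (Both structures are illustrative assumptions, not CPL's own format.)
    """
    negatives = defaultdict(lambda: {"instances": set(), "patterns": set()})
    for a, b in mutex_pairs:
        # Mutual exclusion is symmetric, so propagate in both directions.
        for source, target in ((a, b), (b, a)):
            source_pos = positives.get(source, {})
            negatives[target]["instances"] |= source_pos.get("instances", set())
            negatives[target]["patterns"] |= source_pos.get("patterns", set())
    return dict(negatives)


# The 'city' / 'scientist' example from the text.
positives = {
    "city": {"instances": {"Boston"}, "patterns": {"mayor of arg1"}},
    "scientist": {"instances": set(), "patterns": set()},
}
negatives = propagate_mutual_exclusion(positives, [("city", "scientist")])
print(negatives["scientist"])
```

Here ‘Boston’ and ‘mayor of arg1’ end up as negatives for ‘scientist’, while ‘city’ receives no negatives because ‘scientist’ has no positives yet.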
 
=== Relation argument type-checking ===
 
==== Candidate Promotion ====
CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively.
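
The promotion step can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function <code>promote</code>, the tuple layout of the candidates and the choice to apply the co-occurrence filter before truncating the ranked list are all hypothetical, since the text does not specify the order of these operations. Only the limits (100 instances, 5 patterns) and the minimum support of two come from the description above.

```python
MAX_INSTANCES = 100  # per predicate, from the description above
MAX_PATTERNS = 5     # per predicate, from the description above
MIN_SUPPORT = 2      # required co-occurrences with promoted partners


def promote(candidates, promoted_partners, limit, min_support=MIN_SUPPORT):
    """Return the names of the candidates to promote, best score first.

    candidates:        list of (name, score, co_occurring_partners) tuples
    promoted_partners: already promoted patterns (when promoting instances)
                       or instances (when promoting patterns)
    (Assumed layout for illustration only.)
    """
    eligible = [
        (name, score)
        for name, score, partners in candidates
        if len(set(partners) & promoted_partners) >= min_support
    ]
    # Rank by assessment score, highest first, then keep at most `limit`.
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in eligible[:limit]]


promoted_patterns = {"mayor of arg1", "arg1 city council"}
instance_candidates = [
    ("Paris", 0.95, {"mayor of arg1"}),
    ("Boston", 0.90, {"mayor of arg1", "arg1 city council"}),
    ("Chicago", 0.70, {"mayor of arg1", "arg1 city council", "streets of arg1"}),
]
promoted_instances = promote(instance_candidates, promoted_patterns, MAX_INSTANCES)
print(promoted_instances)
```

In this toy input ‘Paris’ has the highest score but co-occurs with only one promoted pattern, so it is not promoted.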
 
== Meta-Bootstrap Learner ==
The Meta-Bootstrap Learner (MBL) was also proposed by the authors of CPL.<ref name=cpl2010 /> MBL couples the training of multiple extraction techniques with a multi-view constraint, which requires the extractors to agree. This makes it feasible to add coupling constraints on top of existing extraction algorithms while treating them as black boxes. MBL assumes that the errors made by different extraction techniques are independent. The following is a quick summary of MBL.
 
'''Input''': An ontology O, a set of extractors ε
'''end'''
 
Subordinate algorithms used with MBL do not promote any instances on their own; they report the evidence about each candidate to MBL, and MBL is responsible for promoting instances.
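
The division of labour between MBL and its subordinate extractors can be illustrated with a small Python sketch. It assumes, for illustration only, that each extractor reports the set of candidates it supports and that the agreement requirement is modelled as a plain set intersection; the function name <code>mbl_promote</code> and the extractor names are hypothetical, and the actual promotion criteria used by MBL may involve further checks.

```python
def mbl_promote(evidence_by_extractor):
    """Promote only the candidates that every extractor supports.

    evidence_by_extractor: extractor name -> collection of supported candidates
    (Assumed interface; the extractors themselves are treated as black boxes.)
    """
    views = [set(supported) for supported in evidence_by_extractor.values()]
    if not views:
        return set()
    # Multi-view agreement modelled as the intersection of all views.
    return set.intersection(*views)


evidence = {
    "pattern_extractor": {"Boston", "Paris", "Einstein"},
    "list_extractor": {"Boston", "Paris"},
}
promoted = mbl_promote(evidence)
print(sorted(promoted))
```

Because the extractors only report evidence, replacing one of them does not change the promotion logic in this sketch.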
 
== Applications ==
In their paper,<ref name=cbl2009 /> the authors present results showing the potential of CPL to contribute new facts to an existing repository of semantic knowledge, Freebase.<ref>{{cite journal|year=2009|title=Freebase data dumps|publisher=Metaweb Technologies |url=http://download.freebase.com/datadumps/}}</ref>
 
== See also ==
{{reflist}}
 
* {{cite journal|last=Liu|first=Qiuhua|coauthors=Xuejun Liao; Lawrence Carin|year=2008|title=Semi-supervised multitask learning|journal=NIPS}}

* {{cite journal|last=Shinyama|first=Yusuke|coauthors=Satoshi Sekine|year=2006|title=Preemptive information extraction using unrestricted relation discovery|journal=HLT-NAACL}}

* {{cite journal|last=Chang|first=Ming-Wei|coauthors=Lev-Arie Ratinov; Dan Roth|year=2007|title=Guiding semi-supervision with constraint driven learning|journal=ACL}}

* {{cite journal|last=Banko|first=Michele|coauthors=Michael J. Cafarella; Stephen Soderland; Matt Broadhead; Oren Etzioni|year=2007|title=Open information extraction from the web|journal=IJCAI}}

* {{cite journal|last=Blum|first=Avrim|coauthors=Tom Mitchell|year=1998|title=Combining labeled and unlabeled data with co-training|journal=COLT}}

* {{cite journal|last=Riloff|first=Ellen|coauthors=Rosie Jones|year=1999|title=Learning dictionaries for information extraction by multi-level bootstrapping|journal=AAAI}}

* {{cite journal|last=Rosenfeld|first=Benjamin|coauthors=Ronen Feldman|year=2007|title=Using corpus statistics on entities to improve semi-supervised relation extraction from the web|journal=ACL}}

* {{cite journal|last=Wang|first=Richard C.|coauthors=William W. Cohen|year=2008|title=Iterative set expansion of named entities using the web|journal=ICDM}}
 
[[Category:Machine learning]]