<math display="block">\begin{array}{llll}
\text{Completeness:}
& B \cup H
& \models
& E^+
\\
\text{Consistency: }
& B \cup H \cup E^-
& \not\models
& \textit{false}
\end{array}</math>Completeness requires any generated hypothesis ''{{mvar|h}}'' to explain all positive examples <math display="inline">E^+</math>, and consistency forbids generation of any hypothesis ''{{mvar|h}}'' that is inconsistent with the negative examples <math display="inline">E^{-}</math>, both given the background knowledge ''{{mvar|B}}''.
In Muggleton's setting of concept learning,<ref name="setting2">{{cite journal |last1=Muggleton |first1=Stephen |year=1999 |title=Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic |journal=Artificial Intelligence |volume=114 |issue=1–2 |pages=283–296 |doi=10.1016/s0004-3702(99)00067-3 |doi-access=}}; here: Sect.2.1</ref> "completeness" is referred to as "sufficiency", and "consistency" as "strong consistency". Two further conditions are added: "''Necessity''", which postulates that ''{{mvar|B}}'' does not already entail <math display="inline">E^+</math>, imposes no restriction on ''{{mvar|h}}'' itself, but forbids generating any hypothesis as long as the positive facts are explainable without it. "''Weak consistency''", which states that no contradiction can be derived from <math display="inline">B\land H</math>, forbids generation of any hypothesis ''{{mvar|h}}'' that contradicts the background knowledge ''{{mvar|B}}''. Weak consistency is implied by strong consistency; if no negative examples are given, both requirements coincide. Weak consistency is particularly important in the case of noisy data, where completeness and strong consistency cannot be guaranteed.<ref name="setting2" />
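For ground, function-free (Datalog-style) clauses, the completeness and strong-consistency conditions above can be checked directly by computing the least model of <math display="inline">B \cup H</math> with naive forward chaining. The following Python sketch uses a tuple representation of facts and rules invented purely for this illustration; the predicate names are hypothetical and this is not the interface of any actual ILP system.

```python
# Minimal sketch: completeness and strong consistency for ground
# Datalog-style clauses. Facts are tuples; rules are (head, [body...]).
# All names here (par, fem, gp) are illustrative assumptions.

def least_model(facts, rules):
    """Forward-chain ground rules to a fixpoint (the least model)."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if all(b in model for b in body) and head not in model:
                model.add(head)
                changed = True
    return model

def complete(background, hypothesis, e_pos):
    """Completeness: B ∪ H ⊨ E+ (every positive example is entailed)."""
    model = least_model(background, hypothesis)
    return all(e in model for e in e_pos)

def strongly_consistent(background, hypothesis, e_neg):
    """Strong consistency: B ∪ H ∪ E- ⊭ false (no negative example is entailed)."""
    model = least_model(background, hypothesis)
    return not any(e in model for e in e_neg)

background = {("par", "ann", "bob"), ("par", "bob", "carl"), ("fem", "ann")}
hypothesis = [(("gp", "ann", "carl"),
               [("par", "ann", "bob"), ("par", "bob", "carl")])]
print(complete(background, hypothesis, {("gp", "ann", "carl")}))        # True
print(strongly_consistent(background, hypothesis, {("gp", "bob", "ann")}))  # True
```

With an empty hypothesis, the positive example is no longer entailed, so completeness fails; this is exactly the situation in which the necessity condition permits generating a hypothesis at all.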
The common definition of the grandmother relation, viz. <math>\textit{gra}(x,z) \leftarrow \textit{fem}(x) \land \textit{par}(x,y) \land \textit{par}(y,z)</math>, cannot be learned using the above approach, since the variable {{mvar|y}} occurs in the clause body only; the corresponding literals would have been deleted in the 4th step of the approach. To overcome this flaw, that step has to be modified such that it can be parametrized with different ''literal post-selection heuristics''. Historically, the GOLEM implementation is based on the rlgg approach.
== Approaches to ILP ==
An inductive logic programming system is a program that takes as input logic theories <math>B, E^+, E^-</math> and outputs a correct hypothesis {{mvar|H}} with respect to those theories.
Search-based systems exploit that the space of possible clauses forms a [[complete lattice]] under the [[Theta-subsumption|subsumption]] relation, where one clause <math display="inline">C_1</math> subsumes another clause <math display="inline">C_2</math> if there is a [[Substitution (logic)|substitution]] <math display="inline">\theta</math> such that <math display="inline">C_1\theta</math>, the result of applying <math display="inline">\theta</math> to <math display="inline">C_1</math>, is a subset of <math display="inline">C_2</math>. This lattice can be traversed either bottom-up or top-down.
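For function-free clauses, the subsumption relation that orders this lattice can be decided by brute force: enumerate candidate substitutions from the variables of <math display="inline">C_1</math> to the terms of <math display="inline">C_2</math>. The Python sketch below assumes a representation invented for this illustration, in which a clause is a list of literals, a literal is a tuple of a predicate symbol and its arguments, and variables are strings beginning with an uppercase letter.

```python
from itertools import product

# Hedged sketch of a brute-force theta-subsumption test for function-free
# clauses. The representation (tuples, uppercase-initial variables) is an
# assumption of this example, not a general convention.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply_sub(lit, theta):
    """Apply substitution theta to a literal's arguments."""
    pred, *args = lit
    return (pred, *[theta.get(a, a) for a in args])

def subsumes(c1, c2):
    """True iff some substitution theta makes c1·theta a subset of c2."""
    vars1 = sorted({a for lit in c1 for a in lit[1:] if is_var(a)})
    terms2 = sorted({a for lit in c2 for a in lit[1:]})
    target = set(c2)
    for values in product(terms2, repeat=len(vars1)):
        theta = dict(zip(vars1, values))
        if all(apply_sub(lit, theta) in target for lit in c1):
            return True
    return False

c1 = [("par", "X", "Y")]
c2 = [("par", "ann", "bob"), ("fem", "ann")]
print(subsumes(c1, c2))  # True: theta = {X: ann, Y: bob}
```

The exponential enumeration reflects the fact that deciding θ-subsumption is NP-complete in general; practical systems rely on heuristics and clause restrictions rather than exhaustive search.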
=== Bottom-up search ===
Bottom-up methods to search the subsumption lattice have been investigated since Plotkin's first work on formalising induction in clausal logic in 1970.<ref name=":02">{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |___location=Berlin Heidelberg |pages=174–177}}</ref><ref>{{cite thesis |first=G.D. |last=Plotkin |title=Automatic Methods of Inductive Inference |date=1970 |type=PhD |publisher=University of Edinburgh |url=https://www.era.lib.ed.ac.uk/bitstream/handle/1842/6656/Plotkin1972.pdf |hdl=1842/6656}}</ref> Techniques used include least general generalisation, based on [[Anti-unification (computer science)|anti-unification]], and inverse resolution, based on inverting the [[Resolution (logic)|resolution]] inference rule.
==== Least general generalisation ====
A least general generalisation algorithm takes as input two clauses <math display="inline">C_1</math> and <math display="inline">C_2</math> and outputs the least general generalisation of <math display="inline">C_1</math> and <math display="inline">C_2</math>, that is, a clause <math display="inline">C</math> that subsumes <math display="inline">C_1</math> and <math display="inline">C_2</math>, and that is subsumed by every other clause that subsumes <math display="inline">C_1</math> and <math display="inline">C_2</math>. The least general generalisation can be computed by first computing all ''selections'' from <math display="inline">C_1</math> and <math display="inline">C_2</math>, which are pairs of literals <math>(L,M) \in C_1 \times C_2</math> sharing the same predicate symbol and negated/unnegated status. Then, the least general generalisation is obtained as the disjunction of the least general generalisations of the individual selections, which can be obtained by [[first-order syntactical anti-unification]].<ref>{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |___location=Berlin Heidelberg |page=255}}</ref>
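This construction can be sketched in a few lines of Python. The sketch assumes a simple invented representation: a clause is a list of signed literals <code>(sign, predicate, args...)</code>, and a shared anti-unification table guarantees that the same pair of differing terms is always replaced by the same fresh variable across all selections, which is what makes the result a least (rather than merely some) generalisation.

```python
# Sketch of first-order anti-unification and the lgg of two clauses.
# Terms are constants (strings) or tuples (functor, args...); signed
# literals are ("+"/"-", predicate, args...). All data is illustrative.

def anti_unify(t1, t2, table):
    """Anti-unify two terms; the shared table maps each pair of
    differing terms to one fresh variable."""
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        return (t1[0],) + tuple(anti_unify(a, b, table)
                                for a, b in zip(t1[1:], t2[1:]))
    if (t1, t2) not in table:          # same pair => same variable
        table[(t1, t2)] = f"V{len(table)}"
    return table[(t1, t2)]

def lgg(c1, c2):
    """Least general generalisation of two clauses: anti-unify every
    selection (pair of literals with equal predicate and sign)."""
    table = {}
    out = []
    for (s1, p1, *a1) in c1:
        for (s2, p2, *a2) in c2:
            if s1 == s2 and p1 == p2 and len(a1) == len(a2):
                out.append((s1, p1, *[anti_unify(x, y, table)
                                      for x, y in zip(a1, a2)]))
    return out

print(lgg([("+", "par", "ann", "bob")],
          [("+", "par", "eve", "carl")]))   # [('+', 'par', 'V0', 'V1')]
```

Because the table is shared, anti-unifying <code>p(ann, ann)</code> with <code>p(eve, eve)</code> yields <code>p(V0, V0)</code>, preserving the equality constraint between the two argument positions.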
To account for background knowledge, inductive logic programming systems employ ''relative least general generalisations'', which are defined in terms of subsumption relative to a background theory. In general, such relative least general generalisations are not guaranteed to exist; however, if the background theory ''{{mvar|B}}'' is a finite set of [[Ground expression|ground]] [[Literal (mathematical logic)|literals]], then the negation of ''{{mvar|B}}'' is itself a clause. In this case, a relative least general generalisation can be computed by disjoining the negation of ''{{mvar|B}}'' with both <math display="inline">C_1</math> and <math display="inline">C_2</math> and then computing their least general generalisation as before.<ref>{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |___location=Berlin Heidelberg |page=286}}</ref>
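The reduction described above is mechanical: negate every ground background fact, disjoin the negations into both input clauses, and take the ordinary least general generalisation. The Python sketch below, restricted to constant terms and using an invented signed-literal representation with hypothetical predicate names, illustrates the construction.

```python
# Hedged sketch of a relative lgg for a ground background theory B:
# per the text, the negations of B's facts are added to both clauses
# before taking the ordinary lgg. Terms here are constants only.

def anti_unify(t1, t2, table):
    """Anti-unify two constants via a shared fresh-variable table."""
    if t1 == t2:
        return t1
    if (t1, t2) not in table:
        table[(t1, t2)] = f"V{len(table)}"
    return table[(t1, t2)]

def lgg(c1, c2):
    """Ordinary lgg over selections of same-sign, same-predicate literals."""
    table = {}
    out = []
    for (s1, p1, *a1) in c1:
        for (s2, p2, *a2) in c2:
            if s1 == s2 and p1 == p2 and len(a1) == len(a2):
                out.append((s1, p1, *[anti_unify(x, y, table)
                                      for x, y in zip(a1, a2)]))
    return out

def rlgg(c1, c2, background):
    """Relative lgg: disjoin not-B into both clauses, then take the lgg."""
    neg_b = [("-", p, *args) for (p, *args) in background]
    return lgg(c1 + neg_b, c2 + neg_b)

background = [("par", "ann", "bob"), ("par", "eve", "carl")]
e1 = [("+", "q", "ann")]   # hypothetical positive example clauses
e2 = [("+", "q", "eve")]
print(rlgg(e1, e2, background))
```

The resulting clause contains the generalised head together with negated body literals such as <code>("-", "par", "V0", "V1")</code>; systems like Golem then prune such raw relative lggs, which can grow large, down to usable hypothesis clauses.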
Relative least general generalisations are the foundation of the bottom-up system [[Golem (ILP)|Golem]].<ref name=":12">{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |___location=Berlin Heidelberg |pages=354–358}}</ref><ref>{{Cite journal |last1=Muggleton |first1=Stephen H. |last2=Feng |first2=Cao |date=1990 |editor-last=Arikawa |editor-first=Setsuo |editor2-last=Goto |editor2-first=Shigeki |editor3-last=Ohsuga |editor3-first=Setsuo |editor4-last=Yokomori |editor4-first=Takashi |title=Efficient Induction of Logic Programs |url=https://dblp.org/rec/conf/alt/MuggletonF90.bib |journal=Algorithmic Learning Theory, First International Workshop, ALT '90, Tokyo, Japan, October 8-10, 1990, Proceedings |publisher=Springer/Ohmsha |pages=368–381}}</ref>
==== Inverse resolution ====
The ILP systems Progol<ref name=":2" /> and Hail find hypotheses using the principle of inverse entailment.
Questions arise about the completeness of the hypothesis search procedure of a specific ILP system. For example, Progol's hypothesis search procedure, based on the inverse entailment inference rule, is not complete, as shown by '''Yamamoto's example'''.
== List of implementations ==
* [http://www.cs.bris.ac.uk/Research/MachineLearning/1BC/ 1BC and 1BC2: first-order naive Bayesian classifiers]
* [http://dtai.cs.kuleuven.be/ACE/ ACE (A Combined Engine)]