Probabilistic logic programming: Difference between revisions

Browse history interactively

Content deleted Content added

VisualWikitext

Revision as of 12:27, 25 October 2023 edit Felix QW (talk \| contribs) Extended confirmed users 8,596 edits Start PLP draft Tag: 2017 wikitext editor		Latest revision as of 04:40, 25 August 2025 edit undo OAbot (talk \| contribs) Bots 643,717 edits m Open access bot: url-access=subscription updated in citation with #oabot.
(28 intermediate revisions by 8 users not shown)
Line 1: {{Short description\|Programming ~~paradigms~~paradigm}} '''Probabilistic logic programming''' is a [[programming paradigm]] that combines [[logic programming]] with probabilities. ~~Its origins lie in the distribution~~ Most approaches to probabilistic logic programming are based on the ''distribution semantics,'' which splits a program into a set of probabilistic facts and a logic program. It defines a probability distribution on interpretations of the [[Herbrand structure\|Herbrand universe]] of the program. == Languages == Most approaches to probabilistic logic programming are based on the ''distribution semantics,''<ref name=":3">{{Citation \|last1=Riguzzi \|first1=Fabrizio \|title=A survey of probabilistic logic programming \|date=2018-09-01 \|work=Declarative Logic Programming: Theory, Systems, and Applications \|pages=185–228 \|url=http://dx.doi.org/10.1145/3191315.3191319 \|access-date=2023-10-25 \|publisher=ACM \|last2=Swift \|first2=Theresa\|doi=10.1145/3191315.3191319 \|isbn=978-1-970001-99-0 \|s2cid=70180651 \|url-access=subscription }}</ref> which underlies many languages such as Probabilistic Horn Abduction, PRISM, Independent Choice Logic , probabilistic [[Datalog]], Logic Programs with Annotated Disjunctions, [[ProbLog]], P-log, and CP-logic. While the number of languages is large, many share a common approach so that there are transformations with [[Time complexity\|linear complexity]] that can translate one language into another.<ref name=":0">{{Cite journal \|last1=Riguzzi \|first1=Fabrizio \|last2=Bellodi \|first2=Elena \|last3=Zese \|first3=Riccardo \|date=2014 \|title=A History of Probabilistic Inductive Logic Programming \|journal=Frontiers in Robotics and AI \|volume=1 \|doi=10.3389/frobt.2014.00006 \|doi-access=free \|issn=2296-9144}}</ref> == Semantics == Under the distribution semantics, a probabilistic logic program is interpreted as a set of independent probabilistic facts ([[Ground expression\|ground]] [[Atomic formula\|atomic formulas]] annotated with a probability) and a [[logic program]] which can use the probabilistic facts in the bodies of its clauses. The probability of any assignment of truth values to the groundings of the formulas associated with probabilistic facts is given by the product of their probabilities; this is equivalent to assuming the choices of probabilistic facts to be [[independent random variables]].<ref name=":3" /><ref>{{Cite journal \|last1=De Raedt \|first1=Luc \|last2=Kimmig \|first2=Angelika \|date=2015-07-01 \|title=Probabilistic (logic) programming concepts \|url=https://doi.org/10.1007/s10994-015-5494-z \|journal=Machine Learning \|language=en \|volume=100 \|issue=1 \|pages=5–47 \|doi=10.1007/s10994-015-5494-z \|issn=1573-0565}}</ref> === Stratified programs === If for any choice of truth values for the probabilistic facts, the resulting logic program is [[Stratified logic program\|stratified]], it has a unique minimal [[Herbrand model]] which can be seen as the unique interpretation associated with that choice of truth values.<ref name=":3" /> Important subclasses of stratified programs are positive programs, which do not use negation, but may be recursive, and acyclic programs, which may use negation but have no recursive dependencies.<ref name=":3" /> === Answer set programs === The [[stable model semantics]] underlying [[answer set programming]] gives meaning to unstratified programs by allocating potentially more than one answer set to every truth value assignment of the probabilistic facts. This raises the question of how to distribute the probability mass across the answer sets.<ref name=":1">{{Citation \|last=Riguzzi \|first=Fabrizio \|title=Probabilistic Answer Set Programming \|date=2023-05-22 \|work=Foundations of Probabilistic Logic Programming \|pages=165–173 \|url=http://dx.doi.org/10.1201/9781003427421-6 \|access-date=2024-02-03 \|place=New York \|publisher=River Publishers \|doi=10.1201/9781003427421-6 \|isbn=978-1-003-42742-1\|url-access=subscription }}</ref><ref name=":2">{{Cite journal \|last1=Cozman \|first1=Fabio Gagliardi \|last2=Mauá \|first2=Denis Deratani \|date=2020 \|title=The joy of Probabilistic Answer Set Programming: Semantics, complexity, expressivity, inference \|url=https://linkinghub.elsevier.com/retrieve/pii/S0888613X20302012 \|journal=International Journal of Approximate Reasoning \|language=en \|volume=125 \|pages=218–239 \|doi=10.1016/j.ijar.2020.07.004\|s2cid=222233309 \|doi-access=free \|url-access=subscription }}</ref> The probabilistic logic programming language P-Log resolves this by dividing the probability mass equally between the answer sets, following the [[principle of indifference]].<ref name=":1" /><ref>{{Cite journal \|last1=Baral \|first1=Chitta \|last2=Gelfond \|first2=Michael \|last3=Rushton \|first3=Nelson \|date=2009 \|title=Probabilistic reasoning with answer sets \|url=https://www.cambridge.org/core/product/identifier/S1471068408003645/type/journal_article \|journal=Theory and Practice of Logic Programming \|language=en \|volume=9 \|issue=1 \|pages=57–144 \|doi=10.1017/S1471068408003645 \|issn=1471-0684\|url-access=subscription }}</ref> Alternatively, probabilistic answer set programming under the credal semantics allocates a ''[[credal set]]'' to every query. Its lower probability bound is defined by only considering those truth value assignments of the probabilistic facts for which the query is true in every answer set of the resulting program (cautious reasoning); its upper probability bound is defined by considering those assignments for which the query is true in some answer set (brave reasoning).<ref name=":1" /><ref name=":2" /> == Inference == Under the distribution semantics, a probabilistic logic program defines a probability distribution over [[Interpretation (logic)\|interpretations]] of its predicates on its [[Herbrand Universe\|Herbrand universe]]. The probability of a [[Ground expression\|ground]] query is then obtained from the [[Joint probability distribution\|joint distribution]] of the query and the worlds: it is the sum of the probability of the worlds where the query is true.<ref name=":0" /><ref>{{Cite journal \|last=Poole \|first=David \|date=1993 \|title=Probabilistic Horn abduction and Bayesian networks \|url=http://dx.doi.org/10.1016/0004-3702(93)90061-f \|journal=Artificial Intelligence \|volume=64 \|issue=1 \|pages=81–129 \|doi=10.1016/0004-3702(93)90061-f \|issn=0004-3702\|url-access=subscription }}</ref><ref>{{Citation \|last=Sato \|first=Taisuke \|title=A Statistical Learning Method for Logic Programs with Distribution Semantics \|date=1995 \|work=Proceedings of the 12th International Conference on Logic Programming \|pages=715–730 \|url=http://dx.doi.org/10.7551/mitpress/4298.003.0069 \|access-date=2023-10-25 \|publisher=The MIT Press\|doi=10.7551/mitpress/4298.003.0069 \|isbn=978-0-262-29143-9 \|url-access=subscription }}</ref> The problem of computing the probability of queries is called ''(marginal) inference''. Solving it by computing all the worlds and then identifying those that entail the query is impractical as the number of possible worlds is exponential in the number of ground probabilistic facts.<ref name=":0" /> In fact, already for acyclic programs and [[Atomic formula\|atomic]] queries, computing the conditional probability of a query given a conjunction of atoms as evidence is [[♯P\|#P]]-complete.<ref>{{Cite book \|last=Riguzzi \|first=Fabrizio \|title=Foundations of probabilistic logic programming: Languages, semantics, inference and learning \|publisher=[[River Publishers]] \|year=2023 \|isbn=978-87-7022-719-3 \|edition=2nd \|___location=Gistrup, Denmark \|pages=180}}</ref> === Exact inference === Usually, exact inference is performed by resorting to [[knowledge compilation]]: according to this, a [[Propositional calculus\|propositional]] theory and a query are [[Compiler\|compiled]] into a “target language”, which is then used to answer queries in [[polynomial time]]. The compilation becomes the main computational bottleneck, but considerable effort has been devoted to the development of efficient compilers. The compilation methods differ in the compactness of the target language and the class of queries and transformations that they support in polynomial time.<ref name=":0" /> === Approximate inference === Since the cost of inference may be very high, approximate algorithms have been developed. They either compute subsets of possibly incomplete explanations or use random sampling. In the first approach, a subset of the explanations provides a lower bound and the set of partially expanded explanations provides an upper bound. In the second approach, the truth of the query is repeatedly checked in an ordinary [[logic program]] sampled from the probabilistic program. The probability of the query is then given by the fraction of the successes.<ref name=":0" /><ref>{{Cite journal \|last1=Kimmig \|first1=Angelika \|last2=Demoen \|first2=Bart \|last3=Raedt \|first3=Luc De \|last4=Costa \|first4=Vítor Santos \|last5=Rocha \|first5=Ricardo \|date=2011 \|title=On the implementation of the probabilistic logic programming language ProbLog \|url=https://www.cambridge.org/core/journals/theory-and-practice-of-logic-programming/article/abs/on-the-implementation-of-the-probabilistic-logic-programming-language-problog/21037609B99F5DC8033DDF56D07BF839 \|journal=Theory and Practice of Logic Programming \|language=en \|volume=11 \|issue=2–3 \|pages=235–262 \|doi=10.1017/S1471068410000566 \|s2cid=2022299 \|issn=1475-3081\|arxiv=1006.4442 }}</ref> == Learning == {{main\|Probabilistic inductive logic programming}} [[Probabilistic inductive logic programming]] aims to learn probabilistic logic programs from data. This includes parameter learning, which estimates the probability annotations of a program while the clauses themselves are given by the user, and structure learning, in which the clauses themselves are induced by the probabilistic inductive logic programming system.<ref name=":0" /> Common approaches to parameter learning are based on [[Expectation–maximization algorithm\|expectation–maximization]] or [[gradient descent]], while structure learning can be performed by searching the space of possible clauses under a variety of heuristics.<ref name=":0" /> == See also == * [[Inductive logic programming]] * [[Probabilistic database]] * [[Probabilistic programming]] * [[ProbLog]] * [[Statistical relational learning]] == References == {{reflist}} ''As of 3 February 2024, this article is derived in whole or in part from'' {{Cite journal \|last1=Riguzzi \|first1=Fabrizio \|last2=Bellodi \|first2=Elena \|last3=Zese \|first3=Riccardo \|date=2014 \|title=A History of Probabilistic Inductive Logic Programming \|journal=Frontiers in Robotics and AI \|volume=1\|doi=10.3389/frobt.2014.00006 \|doi-access=free }} ''The copyright holder has licensed the content in a manner that permits reuse under [[Wikipedia:Text of Creative Commons Attribution-ShareAlike 3.0 Unported License\|CC BY-SA 3.0]] and [[Wikipedia:Text of the GNU Free Documentation License\|GFDL]]. All relevant terms must be followed.'' {{Programming paradigms navbox}} [[Category:Programming paradigms]] [[Category:Logic ~~Programming~~programming]] [[Category:Probabilistic models]]