{{short description|Type of artificial neural network that uses radial basis functions as activation functions}}
In the field of [[mathematical modeling]], a '''radial basis function network''' is an [[artificial neural network]] that uses [[radial basis function]]s as [[activation function]]s. The output of the network is a [[linear combination]] of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including [[function approximation]], [[time series prediction]], [[Statistical classification|classification]], and system [[Control theory|control]]. They were first formulated in a 1988 paper by Broomhead and Lowe, both researchers at the [[Royal Signals and Radar Establishment]].<ref>{{cite tech report
|last1 = Broomhead
|first1 = D. S.
|last2 = Lowe
|first2 = David
|year = 1988
|title = Radial basis functions, multi-variable functional interpolation and adaptive networks
|institution = Royal Signals and Radar Establishment
|number = 4148
|url = http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA196234
|archive-url = https://web.archive.org/web/20130409223044/http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA196234
|url-status = dead
|archive-date = April 9, 2013
}}</ref><ref>{{cite journal
|last1 = Broomhead
|first1 = D. S.
|last2 = Lowe
|first2 = David
|year = 1988
|title = Multivariable functional interpolation and adaptive networks
|journal = Complex Systems
|volume = 2
|pages = 321–355
|url = https://sci2s.ugr.es/keel/pdf/algorithm/articulo/1988-Broomhead-CS.pdf
|access-date = 2019-01-29
|archive-date = 2020-12-01
|archive-url = https://web.archive.org/web/20201201121028/https://sci2s.ugr.es/keel/pdf/algorithm/articulo/1988-Broomhead-CS.pdf
|url-status = live
}}</ref><ref name="schwenker"/>
==Network architecture==
[[Image:Radial funktion network.svg|thumb|250px|right|Figure 1: Architecture of a radial basis function network. An input vector <math>x</math> is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs from radial basis functions.]]
Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer. The input can be modeled as a vector of real numbers <math>\mathbf{x} \in \mathbb{R}^n</math>. The output of the network is then a scalar function of the input vector, <math> \varphi : \mathbb{R}^n \to \mathbb{R} </math>, and is given by
:<math>\varphi(\mathbf{x}) = \sum_{i=1}^N a_i \rho(||\mathbf{x}-\mathbf{c}_i||)</math>
where <math>N</math> is the number of neurons in the hidden layer, <math>\mathbf c_i</math> is the center vector for neuron <math>i</math>, and <math>a_i</math> is the weight of neuron <math>i</math> in the linear output neuron. Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. The [[Norm (mathematics)|norm]] is typically taken to be the [[Euclidean distance]] (although the [[Mahalanobis distance]] appears to perform better<ref>{{cite journal
|last1=Beheim|first1=Larbi
|last2=Zitouni|first2=Adel
|last3=Belloir|first3=Fabien
|date=January 2004
|title=New RBF neural network classifier with optimized hidden neurons number
|url=https://www.researchgate.net/publication/254467552
}}</ref><ref>{{cite conference
|conference=Proceedings of the Second Joint 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society
|conference-url=https://ieeexplore.ieee.org/xpl/conhome/8844528/proceeding
|___location=Houston, TX, USA
|last1=Ibrikci|first1=Turgay
|last2=Brandt|first2=M.E.
|last3=Wang|first3=Guanyu
|last4=Acikkar|first4=Mustafa
|date=23–26 October 2002
|publication-date=6 January 2003
|volume=3
|pages=2184–5
|doi=10.1109/IEMBS.2002.1053230
|title=Mahalanobis distance with radial basis function network on protein secondary structures
|isbn=0-7803-7612-9
|issn=1094-687X
}}</ref>{{Editorializing|date=May 2020}}<!-- Was previously marked with a missing-citation tag asking in what sense using Mahalanobis distance is better and why the Euclidean distance is still normally used, but I found sources to support the first part, so it's likely salvageable. -->) and the radial basis function is commonly taken to be [[Normal distribution|Gaussian]]
:<math> \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) = \exp \left[ -\beta_i \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert ^2 \right] </math>
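A minimal sketch of this forward pass, assuming [[NumPy]] (the array names and the shared-width setup are illustrative, not part of the original formulation):
<syntaxhighlight lang="python">
import numpy as np

def rbf_forward(x, centers, weights, betas):
    """Unnormalized Gaussian RBF network:
    phi(x) = sum_i a_i * exp(-beta_i * ||x - c_i||^2)."""
    sq_dists = np.sum((centers - x) ** 2, axis=1)  # ||x - c_i||^2 for each center
    activations = np.exp(-betas * sq_dists)        # Gaussian hidden-layer outputs
    return weights @ activations                   # linear output layer

# Illustrative use: N = 10 hidden neurons, n = 2 inputs.
rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 1.0, size=(10, 2))  # c_i
weights = rng.normal(size=10)                  # a_i
betas = np.full(10, 5.0)                       # beta_i (here shared by all neurons)
print(rbf_forward(np.array([0.3, 0.7]), centers, weights, betas))
</syntaxhighlight>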
The Gaussian basis functions are local to the center vector in the sense that
:<math>\lim_{\left \Vert \mathbf{x} \right \Vert \to \infty} \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) = 0</math>
i.e. changing parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.
Given certain mild conditions on the shape of the activation function, RBF networks are [[universal approximator]]s on a [[Compact space|compact]] subset of <math>\mathbb{R}^n</math>.<ref name="Park">{{cite journal|last=Park|first=J.|author2=I. W. Sandberg|s2cid=34868087|date=Summer 1991|title=Universal Approximation Using Radial-Basis-Function Networks|journal=Neural Computation|volume=3|issue=2|pages=246–257|doi=10.1162/neco.1991.3.2.246}}</ref> This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.
The parameters <math> a_i </math>, <math> \mathbf{c}_i </math>, and <math> \beta_i </math> are determined in a manner that optimizes the fit between <math> \varphi </math> and the data.
[[Image:Unnormalized radial basis functions.svg|thumb|250px|right|Figure 2: Two unnormalized radial basis functions in one input dimension.]]
===Normalized===
{{multiple images
 | align = right
 | direction = vertical
 | width = 250
 | image1 = Normalized radial basis functions.svg
 | caption1 = Figure 3: Two normalized radial basis functions in one input dimension.
 | image2 = 3 Normalized radial basis functions.svg
 | caption2 = Figure 4: Three normalized radial basis functions in one input dimension.
 | image3 = 4 Normalized radial basis functions.svg
 | caption3 = Figure 5: Four normalized radial basis functions in one input dimension.
}}
====Normalized architecture====
In addition to the above ''unnormalized'' architecture, RBF networks can be ''normalized''. In this case the mapping is
:<math> \varphi ( \mathbf{x} ) \ \stackrel{\mathrm{def}}{=}\ \frac { \sum_{i=1}^N a_i \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) } { \sum_{i=1}^N \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) } = \sum_{i=1}^N a_i u \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) </math>
where
:<math> u \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) \ \stackrel{\mathrm{def}}{=}\ \frac { \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) } { \sum_{j=1}^N \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_j \right \Vert \big ) } </math>
is known as a ''normalized radial basis function''.
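In code, normalization amounts to dividing each Gaussian activation by the sum over all hidden neurons (a sketch under the same NumPy assumptions as above):
<syntaxhighlight lang="python">
import numpy as np

def normalized_rbf(x, centers, betas):
    """Normalized radial basis functions u_i(x) = rho_i(x) / sum_j rho_j(x)."""
    rho = np.exp(-betas * np.sum((centers - x) ** 2, axis=1))
    return rho / rho.sum()  # activations now sum to one across the hidden layer
</syntaxhighlight>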
====Theoretical motivation for normalization====
There is theoretical justification for this architecture in the case of stochastic data flow. Assume a [[kernel density estimation|stochastic kernel]] approximation for the joint probability density <math> P\left ( \mathbf{x} \land y \right ) </math>, built from kernels centered on exemplars from the data. The expected value of <math>y</math> given <math>\mathbf{x}</math> is then
:<math> \varphi \left ( \mathbf{x} \right ) \ \stackrel{\mathrm{def}}{=}\ E\left ( y \mid \mathbf{x} \right ) = \int y \, P\left ( y \mid \mathbf{x} \right ) \, dy </math>
where
:<math> P\left ( y \mid \mathbf{x} \right ) </math>
is the conditional probability of y given <math> \mathbf{x} </math>.
The conditional probability is related to the joint probability through [[Bayes' theorem]]
:<math> P\left ( y \mid \mathbf{x} \right ) = \frac {P \left ( \mathbf{x} \land y \right )} {P \left ( \mathbf{x} \right )} </math>
===Local linear models===
It is sometimes convenient to expand the architecture to include local [[linear model]]s. In that case the output becomes a linear combination of the extended basis functions
:<math> v_{ij}\big ( \mathbf{x} - \mathbf{c}_i \big ) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) , & \mbox{if } i \in [1,N] \\ \left ( x_{ij} - c_{ij} \right ) \rho \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) , & \mbox{if }i \in [N+1,2N] \end{cases} </math>
in the unnormalized case and
:<math> v_{ij}\big ( \mathbf{x} - \mathbf{c}_i \big ) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} u \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) , & \mbox{if } i \in [1,N] \\ \left ( x_{ij} - c_{ij} \right ) u \big ( \left \Vert \mathbf{x} - \mathbf{c}_i \right \Vert \big ) , & \mbox{if }i \in [N+1,2N] \end{cases} </math>
in the normalized case.
Here <math> \delta_{ij} </math> is a [[Kronecker delta function]] defined as
:<math> \delta_{ij} = \begin{cases} 1, & \mbox{if }i = j \\ 0, & \mbox{if }i \ne j \end{cases} </math>.
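A sketch of the unnormalized first-order expansion (NumPy assumed; the helper name is illustrative): each input is mapped to the <math>N</math> constant terms <math>\rho_i</math> plus the linear terms <math>(x_{j} - c_{ij})\rho_i</math>, which the output layer then combines linearly.
<syntaxhighlight lang="python">
import numpy as np

def local_linear_features(x, centers, betas):
    """First-order expansion: rho_i(x) terms plus (x_j - c_ij) * rho_i(x) terms."""
    diffs = x - centers                                # shape (N, n)
    rho = np.exp(-betas * np.sum(diffs ** 2, axis=1))  # shape (N,)
    linear_terms = (diffs * rho[:, None]).ravel()      # (x_j - c_ij) * rho_i(x)
    return np.concatenate([rho, linear_terms])         # stacked feature vector
</syntaxhighlight>
Because the output is still a weighted sum of these features, the extra linear terms can be fit with the same linear methods used for the basic architecture.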
==Training==
RBF networks are typically trained from pairs of input and target values <math>\mathbf{x}(t), y(t)</math>, <math>t = 1, \dots, T</math> by a two-step algorithm.
In the first step, the center vectors <math>\mathbf c_i</math> of the RBF functions in the hidden layer are chosen. This step can be performed in several ways; centers can be randomly sampled from some set of examples, or they can be determined using [[k-means clustering]]. Note that this step is [[unsupervised learning|unsupervised]].
In the second step, a linear model with coefficients <math>w_i</math> is fit to the hidden layer's outputs with respect to some objective function, such as the least squares function. A third, optional [[backpropagation]] step can be performed to fine-tune all of the RBF network's parameters.<ref name="schwenker">{{cite journal
|last1 = Schwenker
|first1 = Friedhelm
|last2 = Kestler
|first2 = Hans A.
|last3 = Palm
|first3 = Günther
|title = Three learning phases for radial-basis-function networks
|journal = Neural Networks
|volume = 14
|issue = 4–5
|pages = 439–458
|year = 2001
|citeseerx = 10.1.1.109.312
|doi=10.1016/s0893-6080(01)00027-2
|pmid = 11411631
}}</ref>
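A minimal sketch of the two-step procedure in NumPy, using random sampling of the training inputs for the first step and a least-squares fit for the second (function and parameter names are illustrative):
<syntaxhighlight lang="python">
import numpy as np

def train_rbf(X, y, n_centers=20, beta=1.0, seed=0):
    """Two-step training: (1) choose centers, (2) fit output weights."""
    rng = np.random.default_rng(seed)
    # Step 1 (unsupervised): sample centers from the training inputs;
    # k-means clustering is a common alternative.
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    # Hidden-layer design matrix: G[t, i] = exp(-beta * ||x(t) - c_i||^2).
    sq_dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    G = np.exp(-beta * sq_dists)
    # Step 2 (supervised): least-squares fit of the linear output weights.
    weights, *_ = np.linalg.lstsq(G, y, rcond=None)
    return centers, weights
</syntaxhighlight>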
===Interpolation===
RBF networks can be used to interpolate a function <math>y: \mathbb{R}^n \to \mathbb{R}</math> when the values of that function are known on a finite number of points: <math>y(\mathbf x_i) = b_i</math>, <math>i=1, \ldots, N</math>. Taking the known points <math>\mathbf x_i</math> to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, <math>g_{ij} = \rho \big ( \left \Vert \mathbf x_j - \mathbf x_i \right \Vert \big )</math>, the weights can be solved from the equation
:<math>\left[ \begin{matrix}
g_{11} & g_{12} & \cdots & g_{1N} \\
g_{21} & g_{22} & \cdots & g_{2N} \\
\vdots & & \ddots & \vdots \\
g_{N1} & g_{N2} & \cdots & g_{NN}
\end{matrix} \right] \left[ \begin{matrix}
w_1 \\
w_2 \\
\vdots \\
w_N
\end{matrix} \right] = \left[ \begin{matrix}
b_1 \\
b_2 \\
\vdots \\
b_N
\end{matrix} \right]</math>
It can be shown that the interpolation matrix in the above equation is non-singular, if the points <math>\mathbf x_i</math> are distinct, and thus the weights <math>w</math> can be solved by simple [[linear algebra]]:
:<math>\mathbf{w} = \mathbf{G}^{-1} \mathbf{b}</math>
where <math>G = (g_{ij})</math>.
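A sketch of this solve in NumPy, using <code>numpy.linalg.solve</code> rather than forming <math>\mathbf{G}^{-1}</math> explicitly (the Gaussian basis and width value are assumed for illustration):
<syntaxhighlight lang="python">
import numpy as np

def interpolation_weights(points, values, beta=1.0):
    """Solve G w = b, where g_ij = rho(||x_j - x_i||) and the data are the centers."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=2)
    G = np.exp(-beta * sq_dists)       # interpolation matrix
    return np.linalg.solve(G, values)  # non-singular when the points are distinct
</syntaxhighlight>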
==Examples==
===Logistic map===
The basic properties of radial basis functions can be illustrated with a simple mathematical map, the [[logistic map]], which maps the [[unit interval]] onto itself. It can be used to generate a convenient prototype data stream. The logistic map can be used to explore [[function approximation]], [[time series prediction]], and [[control theory]]. The map originated from the field of [[population dynamics]] and became the prototype for [[chaos theory|chaotic]] time series. The map, in the fully chaotic regime, is given by
:<math> x(t+1)\ \stackrel{\mathrm{def}}{=}\ f\left [ x(t)\right ] = 4 x(t) \left [ 1-x(t) \right ] </math>
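Generating this prototype data stream is straightforward (a sketch; the starting value is arbitrary):
<syntaxhighlight lang="python">
def logistic_series(x0=0.3, T=100):
    """Iterate x(t+1) = 4 x(t) (1 - x(t)) to produce a chaotic time series."""
    xs = [x0]
    for _ in range(T - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs
</syntaxhighlight>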
===Control of a chaotic time series===
We assume the output of the logistic map can be manipulated through a control parameter <math> c[x(t),t] </math> such that
:<math> x(t+1) = 4 x(t) [1-x(t)] + c[x(t),t] </math>.
The goal is to choose the control parameter in such a way as to drive the time series to a desired output <math> d(t) </math>. This can be done if we choose the control parameter to be
:<math> c[x(t),t] \ \stackrel{\mathrm{def}}{=}\ -\varphi [x(t)] + d(t+1) </math>
where <math>\varphi [x(t)]</math> is an approximation to the underlying natural dynamics of the system.
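A sketch of one controlled step, assuming <code>phi</code> is a trained RBF approximation of the map (for example from the training sketch above) and <code>d</code> gives the desired trajectory:
<syntaxhighlight lang="python">
def controlled_step(x, t, phi, d):
    """x(t+1) = 4 x (1 - x) + c[x, t], with c = -phi(x) + d(t+1)."""
    c = -phi(x) + d(t + 1)          # cancel the approximated dynamics, inject target
    return 4.0 * x * (1.0 - x) + c  # equals d(t+1) exactly when phi matches the map
</syntaxhighlight>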
==See also==
* [[Radial basis function kernel]]
* [[Instance-based learning]]
* [[In Situ Adaptive Tabulation]]
* [[Predictive analytics]]
* [[Cerebellar model articulation controller]]
* [[Instantaneously trained neural networks]]
* [[Support vector machine]]
==References==
{{reflist}}

==Further reading==
* J. Moody and C. J. Darken, "Fast learning in networks of locally tuned processing units," Neural Computation, 1, 281-294 (1989). Also see [https://web.archive.org/web/20070302175857/http://www.ki.inf.tu-dresden.de/~fritzke/FuzzyPaper/node5.html Radial basis function networks according to Moody and Darken]
* T. Poggio and F. Girosi, "[http://courses.cs.tamu.edu/rgutier/cpsc636_s10/poggio1990rbf2.pdf Networks for approximation and learning]," Proc. IEEE 78(9), 1484-1487 (1990).
* {{cite book | author=Martin D. Buhmann | title=Radial Basis Functions: Theory and Implementations | publisher= Cambridge University| year=2003 | isbn=0-521-63338-9}}
* {{cite book |author1=Yee, Paul V. |author2=Haykin, Simon |title=Regularized Radial Basis Function Networks: Theory and Applications |publisher=John Wiley |year=2001}}
* {{cite book|first1=John R. |last1=Davies |first2=Stephen V. |last2=Coggeshall |first3=Roger D. |last3=Jones |first4=Daniel |last4=Schutzer |chapter=Intelligent Security Systems |editor-first=Roy S. |editor-last=Freedman |title=Artificial Intelligence in the Capital Markets |___location=Chicago |publisher=Irwin |year=1995 |isbn=1-55738-811-3}}
* {{cite book | author=Simon Haykin | title=Neural Networks: A Comprehensive Foundation | edition=2nd | ___location=Upper Saddle River, NJ | publisher=Prentice Hall| year=1999 | isbn=0-13-908385-5}}
* S. Chen, C. F. N. Cowan, and P. M. Grant, "[https://eprints.soton.ac.uk/251135/1/00080341.pdf Orthogonal Least Squares Learning Algorithm for Radial Basis Function Networks]", IEEE Transactions on Neural Networks, Vol 2, No 2 (Mar) 1991.
[[Category:Artificial neural networks]]
[[Category:Computational statistics]]
[[Category:Classification algorithms]]
[[Category:Machine learning algorithms]]
[[Category:Regression analysis]]
[[Category:1988 in artificial intelligence]]