We note that in probability theory, a product of probabilities usually assumes that evidence is independent. The expression for L contains a product over n, but it does not assume independence among the various signals '''X'''(n). There is a dependence among signals due to concept-models: each model '''M<sub>m</sub>'''('''S<sub>m</sub>''',n) predicts expected signal values in many neurons n.
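To make the structure of L concrete, here is a minimal numerical sketch. It assumes, purely for illustration, scalar signals and Gaussian conditional likelihoods l('''X'''(n)|'''M<sub>m</sub>'''); the function name and toy parameterization are hypothetical, not part of the theory:

```python
import numpy as np

def similarity_L(X, models, r):
    """L = product over n of [ sum over m of r_m * l(X(n) | M_m) ].

    X      : N observed scalar signals (illustrative stand-in for X(n))
    models : list of (mean, sigma) pairs, a toy stand-in for M_m(S_m, n)
    r      : prior weight of each model, summing to 1
    """
    L = 1.0
    for x in X:
        # The product over n does not assume independence: every factor
        # shares the same model parameters, which couples the signals.
        L *= sum(
            r_m * np.exp(-(x - mu) ** 2 / (2.0 * sig ** 2))
            / np.sqrt(2.0 * np.pi * sig ** 2)
            for r_m, (mu, sig) in zip(r, models)
        )
    return L
```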
During the learning process, concept-models are constantly modified. In this review we consider the case in which the functional forms of the models, '''M<sub>m</sub>'''('''S<sub>m</sub>''',n), are all fixed and learning-adaptation involves only the model parameters, '''S<sub>m</sub>'''. From time to time the system forms a new concept while retaining an old one as well; alternatively, old concepts are sometimes merged or eliminated. This requires a modification of the similarity measure L; the reason is that more models always result in a better fit between the models and the data. This is a well-known problem; it is addressed by reducing the similarity L with a “skeptic penalty function,” p(N,M), that grows with the number of models M, and this growth is steeper for a smaller amount of data N. For example, an asymptotically unbiased maximum likelihood estimation leads to the multiplicative penalty p(N,M) = exp(-N<sub>par</sub>/2), where N<sub>par</sub> is the total number of adaptive parameters in all models (this penalty function is known as the [[Akaike information criterion]]).
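The effect of the skeptic penalty can be shown with a short sketch. The log-likelihood values and parameter counts below are hypothetical numbers chosen only to illustrate the trade-off; the penalty exp(-N<sub>par</sub>/2) is the one quoted above:

```python
def penalized_log_similarity(log_L, n_par):
    """Apply the multiplicative skeptic penalty p = exp(-N_par / 2).
    In log form this subtracts N_par / 2 from log L (an AIC-style term)."""
    return log_L - n_par / 2.0

# Hypothetical fits: a richer model set fits the data slightly better
# (higher log L) but carries more adaptive parameters.
simple = penalized_log_similarity(log_L=-10.0, n_par=2)  # -> -11.0
rich = penalized_log_similarity(log_L=-9.5, n_par=6)     # -> -12.5
# The penalty reverses the ranking: the simpler model set wins.
```

The penalty thus counteracts the fact that adding models always improves the raw fit.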
Psychologically, satisfaction of instincts is felt as pleasant emotions. Emotions related to satisfaction of the knowledge instinct (maximization of the similarity measure L) are aesthetic emotions; they are “spiritual” in that they are related to the working of the mind-brain (whereas bodily emotions are related to bodily instincts).
==Dynamic logic==
The learning process consists of estimating model parameters S and associating signals with concepts by maximizing the similarity L. Note that all possible combinations of signals and models are accounted for in the expression for L. This can be seen by expanding the sum and multiplying all the terms; the result is M<sup>N</sup> items, a huge number. This is the number of combinations between all signals (N) and all models (M). Here is the source of the [[Combinatorial Complexity]] of many algorithms used in the past. For example, a popular multiple hypothesis testing algorithm<ref>Singer, R.A., Sea, R.G. and Housewright, R.B. (1974). Derivation and Evaluation of Improved Tracking Filters for Use in Dense Multitarget Environments, IEEE Transactions on Information Theory, IT-20, pp. 423-432.</ref> attempts to maximize the similarity L over model parameters and associations between signals and models in two steps. First, it takes one of the M<sup>N</sup> items, which is one particular association between signals and models, and maximizes it over model parameters. Second, the largest item is selected (that is, the best association for the best set of parameters). Such a program inevitably faces a wall of Combinatorial Complexity: the number of computations is on the order of M<sup>N</sup>.
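The combinatorial blow-up can be seen directly in a brute-force sketch (toy Gaussian models and hypothetical data, purely for illustration): enumerating every association of N signals with M models visits M<sup>N</sup> candidates.

```python
from itertools import product
import numpy as np

def gauss(x, mu, sig):
    """Toy conditional likelihood l(X(n) | M_m) for a scalar signal."""
    return np.exp(-(x - mu) ** 2 / (2 * sig ** 2)) / np.sqrt(2 * np.pi * sig ** 2)

def brute_force_association(X, models):
    """Exhaustively score every signal-to-model association.
    The loop below runs M**N times -- the wall of Combinatorial Complexity."""
    best_ll, best_assoc = -np.inf, None
    for assoc in product(range(len(models)), repeat=len(X)):
        ll = sum(np.log(gauss(x, *models[m])) for x, m in zip(X, assoc))
        if ll > best_ll:
            best_ll, best_assoc = ll, assoc
    return best_assoc

# Two signals, two models: only 2**2 = 4 associations here, but
# 10 models and 100 signals would already require 10**100 evaluations.
```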
NMF solves this problem by using [[dynamic logic (neural)|dynamic logic]]<ref>Perlovsky, L.I. (1996). Mathematical Concepts of Intellect. Proc. World Congress on Neural Networks, San Diego, CA; Lawrence Erlbaum Associates, NJ, pp. 1013-16.</ref><ref>Perlovsky, L.I. (1997). Physical Concepts of Intellect. Proc. Russian Academy of Sciences, 354(3), pp. 320-323.</ref>. An important aspect of dynamic logic is ''matching vagueness or fuzziness of similarity measures to the uncertainty of models''. Initially, parameter values are not known, and uncertainty of models is high; so is the fuzziness of the similarity measures. In the process of learning, models become more accurate and the similarity measure more crisp, and the value of the similarity increases. This is the mechanism of dynamic logic. The mathematics of dynamic logic is described in a separate article.
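A minimal sketch of this vague-to-crisp process, in a hypothetical, stripped-down form: one-dimensional signals, Gaussian concept-models, and an illustrative annealing schedule (the shrink factor 0.9 and the floor 0.5 are arbitrary choices for the example, not part of the theory):

```python
import numpy as np

def dynamic_logic(X, M, iters=50):
    """Stripped-down dynamic-logic sketch: vague models become crisp.

    X : 1-D array of signals; M : number of Gaussian concept-models.
    Returns the learned means (the parameters S_m) and final widths.
    """
    X = np.asarray(X, dtype=float)
    mu = np.linspace(X.min(), X.max(), M)   # rough initial parameters S_m
    sigma = np.full(M, X.std() * 5)         # start vague: high uncertainty
    for _ in range(iters):
        # Association weights f(m|n): fuzzy while sigma is large,
        # approaching crisp 0/1 assignments as sigma shrinks.
        logw = -(X[None, :] - mu[:, None]) ** 2 / (2 * sigma[:, None] ** 2)
        w = np.exp(logw) / sigma[:, None]
        f = w / w.sum(axis=0, keepdims=True)
        # Parameter update: each model adapts to its softly-assigned signals.
        mu = (f * X).sum(axis=1) / f.sum(axis=1)
        # Match model uncertainty to growing accuracy: anneal the fuzziness.
        sigma = np.maximum(sigma * 0.9, 0.5)
    return mu, sigma
```

Starting with large sigma makes the association weights nearly uniform (maximal fuzziness); as sigma shrinks, each signal associates crisply with one model, so no explicit search over the M<sup>N</sup> associations is ever performed.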