- Journal of the Japanese Society for Artificial Intelligence (人工知能学会誌) (ISSN: 0912-8085)
- vol.5, no.5, pp.595-603, 1990-09-01
Learning in connectionist models has two aspects: the first is the reproduction of the mapping from input to output patterns, and the second is the discovery of regularity in these training patterns. The backpropagation learning algorithm stresses the former aspect, as can be seen from its criterion function, the sum of squared output errors. The present paper, on the other hand, emphasizes the latter aspect. In backpropagation learning of feedforward models, the number of layers and the number of hidden units in each layer must be determined beforehand. Since this prior determination is generally difficult, a trial-and-error procedure is inevitable, and it is quite time consuming. To overcome this difficulty and to generate a small network, the present paper proposes a learning algorithm with forgetting of link weights. This forgetting is realized by adding the sum of the absolute values of the link weights to the criterion of the backpropagation algorithm. The algorithm generates a skeletal structure, in which the numbers of links and units used are kept as small as possible. As by-products, the algorithm offers several advantages: ease of interpretation of hidden units and improved generalization power of the resulting models. Used alone, however, this algorithm causes two difficulties: the emergence of distributed representations on hidden layers, which makes the interpretation of hidden units difficult, and a poor criterion value after learning, caused by the added term in the criterion function. To resolve these difficulties, a structural learning algorithm is proposed, which consists of a sequence of algorithms: the learning algorithm with forgetting, a hidden-unit clarification algorithm, and a learning algorithm with selective forgetting. This structural learning algorithm is applied to the problem of discovering a logical function from a given set of pairs of input and output logical values.
The results demonstrate that the resulting skeletal network represents the logical structure of the given problem. In contrast, the backpropagation algorithm generates a network that is far from skeletal, making the interpretation of hidden units quite difficult. The algorithm is also applied to a second problem, the classification of Fisher's iris data. The generalization power of the structural learning algorithm is compared with that of the backpropagation algorithm. The comparison clearly demonstrates that the former has greater generalization power than the latter.
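
The forgetting mechanism described in the abstract, adding the sum of absolute link weights to the squared-error criterion, can be illustrated with a minimal sketch. The abstract does not give the coefficient on the added term or the update rule, so the names `eps` (penalty coefficient), `lr` (learning rate), and both function names are assumptions made for illustration, and the update shown is ordinary gradient descent on the penalized criterion rather than the paper's exact procedure.

```python
from typing import List, Sequence


def sign(x: float) -> float:
    """Return 1.0, -1.0, or 0.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def criterion_with_forgetting(
    errors: Sequence[float], weights: Sequence[float], eps: float
) -> float:
    """Sum of squared output errors plus eps times the sum of |weight|.

    `eps` is a hypothetical coefficient; the abstract only states that the
    sum of absolute link weights is added to the backpropagation criterion.
    """
    squared_error = sum(e * e for e in errors)
    penalty = sum(abs(w) for w in weights)
    return squared_error + eps * penalty


def update_with_forgetting(
    weights: Sequence[float], error_grads: Sequence[float], lr: float, eps: float
) -> List[float]:
    """One gradient-descent step on the penalized criterion.

    `error_grads` holds the derivative of the squared-error term with respect
    to each weight (as backpropagation would supply). The penalty contributes
    eps * sign(w), a constant-size pull toward zero: the "forgetting".
    """
    return [w - lr * (g + eps * sign(w)) for w, g in zip(weights, error_grads)]


if __name__ == "__main__":
    weights = [2.0, -1.0, 0.0]
    print(criterion_with_forgetting([0.5, -0.5], weights, eps=0.1))

    # With zero error gradient, nothing counteracts the forgetting term,
    # so every nonzero weight shrinks by lr * eps per step.
    for _ in range(50):
        weights = update_with_forgetting(weights, [0.0, 0.0, 0.0], lr=0.5, eps=0.1)
    print(weights)
```

Because the pull toward zero has constant magnitude, links that the error gradient does not support decay steadily, while links needed to reduce the error are held in place by it. With a fixed step size, decayed weights in this sketch hover within `lr * eps` of zero rather than settling exactly on it.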