TY - JOUR
T1 - Multilayer Neural Networks for Reduced-Rank Approximation
AU - Diamantaras, Konstantinos I.
AU - Kung, Sun Yuan
N1 - Funding Information:
Manuscript received December 17, 1991; revised September 29, 1992. This work was supported in part by the Air Force Office of Scientific Research under Grant AFOSR-89-0501A. K. I. Diamantaras is with Siemens Corporate Research, Princeton, NJ 08540 USA. S.-Y. Kung is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA. IEEE Log Number 9205984.
PY - 1994/9
Y1 - 1994/9
AB - This paper is developed in two parts. First, we formulate the solution to the general reduced-rank linear approximation problem, relaxing the invertibility assumption on the input autocorrelation matrix made by previous authors. Our treatment unifies linear regression, Wiener filtering, full-rank approximation, auto-association networks, SVD, and Principal Component Analysis (PCA) as special cases. Our analysis also shows that two-layer linear neural networks with a reduced number of hidden units, trained under the least-squares error criterion, produce weights that correspond to the Generalized Singular Value Decomposition of the input-teacher cross-correlation matrix and the input data matrix. As a corollary, the linear two-layer backpropagation model with a reduced hidden layer extracts an arbitrary linear combination of the generalized singular vector components. Second, we investigate artificial neural network models for the solution of the related generalized eigenvalue problem. By extending the concept of deflation (originally proposed for the standard eigenvalue problem), we show that a sequential version of linear backpropagation can extract the exact generalized eigenvector components. The advantage of this approach is that the model structure is easily updated by adding one more unit, or pruning one or more units, as the application requires. An alternative approach for extracting the exact components is to use a set of lateral connections among the hidden units, trained so as to enforce orthogonality between the upper- and lower-layer weights. We call this the Lateral Orthogonalization Network (LON) and show, through theoretical analysis and simulation, that the network extracts the desired components. The advantage of the LON-based model is that it operates in parallel, so the components are extracted concurrently. Finally, we apply our results to the identification of systems whose excitation has a non-invertible autocorrelation matrix. Previous identification methods typically rely on the invertibility of the input autocorrelation and therefore cannot be applied in this case.
UR - http://www.scopus.com/inward/record.url?scp=0028497221&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0028497221&partnerID=8YFLogxK
U2 - 10.1109/72.317721
DO - 10.1109/72.317721
M3 - Article
C2 - 18267843
AN - SCOPUS:0028497221
SN - 1045-9227
VL - 5
SP - 684
EP - 697
JO - IEEE Transactions on Neural Networks
JF - IEEE Transactions on Neural Networks
IS - 5
ER -