Calculating principal components of data with VERY HIGH dimensionality

Suppose we have a data set X with dimensionality D and N data points in total. You might run into insufficient-memory problems when D is large, or simply waste time computing redundant components when D >> N. Using the singular value decomposition avoids both problems.

From the singular value decomposition, we know that a D-by-N matrix X can be decomposed as X = U S V^T, where U is composed of the eigenvectors of X X^T and is of size D-by-D, and V is composed of the eigenvectors of X^T X and is of size N-by-N. Because V^T V is an identity matrix (the column vectors of V are orthonormal), the first equation can be written as X V = U S. The first m (m <= N) columns of matrix U are the principal components of X. We can get those first m columns of U without actually computing X X^T. Here is how it works.

Compute X^T X (an N-by-N matrix).

Compute the eigenvectors and eigenvalues of X^T X.

Because X^T X = V S^T U^T U S V^T = V S^T S V^T = V Λ V^T, where Λ is the diagonal matrix of eigenvalues of X^T X, we can conclude, from the last two terms of the equation, that σ_i = sqrt(λ_i), where σ_i is the i-th singular value in S and λ_i is the i-th eigenvalue of X^T X.

The i-th column of U can be computed as u_i = X v_i / σ_i, where v_i is the i-th column of V.

Based on the derivation above, the MATLAB code is as follows.
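A minimal sketch of these steps (variable names such as m, XtX, and Lambda are illustrative, and X is assumed to have zero-mean columns):

```matlab
% X: D-by-N data matrix (columns are data points, assumed zero-mean)
% m: number of principal components wanted (m <= N)
XtX = X' * X;                          % step 1: N-by-N Gram matrix, small when N << D
[V, Lambda] = eig(XtX);                % step 2: eigenvectors/eigenvalues of X'X
[lambda, idx] = sort(diag(Lambda), 'descend');
V = V(:, idx);                         % order by decreasing eigenvalue
sigma = sqrt(lambda(1:m));             % step 3: sigma_i = sqrt(lambda_i)
U = X * V(:, 1:m) * diag(1 ./ sigma);  % step 4: u_i = X * v_i / sigma_i
```

The only large matrix ever touched is X itself; the D-by-D matrix X X^T is never formed.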

I just found out that the above algorithm can be replaced by a single MATLAB command:

[U S V] = svd(X,0);
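For example, with a tall, thin X the economy-size call keeps U small (a quick sketch; the sizes are illustrative):

```matlab
X = randn(10000, 5);    % D = 10000, N = 5
[U, S, V] = svd(X, 0);  % economy size: U is 10000-by-5, S is 5-by-5
% a full svd(X) would instead return a 10000-by-10000 U
```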

svd(X,0) produces the economy-size decomposition; that is, only the first N columns of U are returned, given that the size of X is D-by-N. Please note that the first parameter of svd is X, not X'X. Please leave a comment if this article helps! 🙂




This entry was posted on Friday, March 14th, 2008 at 4:03 am and is filed under computer vision, matlab. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

One Response to Calculating principal components of data with VERY HIGH dimensionality

This decomposition of SVD helps me lot..