UCLA Academic Technology Services HomeServicesClassesContactJobs
Help the Stat Consulting Group by giving a gift             
Loading

R Code Fragments
Examples of using singular value decomposition in R

Singular value decomposition (SVD) is a type of matrix factorization.  For more details on SVD, the Wikipedia page is a good starting point.  On this page, we provide four examples of data analysis using SVD in R.

Example 1: SVD to find a generalized inverse of a non-full-rank matrix

For a square matrix A with a non-zero determinant, there exists an inverse matrix B such that AB = I and BA = I.  For a matrix that is not square, generalized inverse matrices have some (but not all) of the properties of an inverse matrix. SVD can be used to find a generalized inverse matrix.  In the example below, we use SVD to find a generalized inverse B to the matrix A such that ABA = A. We compare our generalized inverse with the one generated by the ginv command.


library(MASS)

a <- matrix(c(1,1,1,1,1,1,1,1,1,
              1,1,1,0,0,0,0,0,0,
              0,0,0,1,1,1,0,0,0,
              0,0,0,0,0,0,1,1,1), 9, 4)

a.svd <- svd(a)
a.svd$d

3.464e+00 1.732e+00 1.732e+00 9.688e-17


ds <- diag(1 / a.svd$d[1:3])
u <- a.svd$u
v <- a.svd$v
us <- as.matrix(u[, 1:3])
vs <- as.matrix(v[, 1:3])

(a.ginv <- vs %*% ds %*% t(us))

         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]
[1,]  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333
[2,]  0.25000  0.25000  0.25000 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333
[3,] -0.08333 -0.08333 -0.08333  0.25000  0.25000  0.25000 -0.08333 -0.08333 -0.08333
[4,] -0.08333 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333  0.25000  0.25000  0.25000


# using the function ginv defined in MASS
ginv(a)

         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]
[1,]  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333  0.08333
[2,]  0.25000  0.25000  0.25000 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333
[3,] -0.08333 -0.08333 -0.08333  0.25000  0.25000  0.25000 -0.08333 -0.08333 -0.08333
[4,] -0.08333 -0.08333 -0.08333 -0.08333 -0.08333 -0.08333  0.25000  0.25000  0.25000

Example 2: Image Processing

The code below requires the rimage package. It reads in a jpeg (pansy.jpg) and plots it in R, first in color (when the image is stored as three matrices--one red, one green, one blue) and then in grayscale (when the image is stored as one matrix). Then, using SVD, we can essentially compress the image. Note that we can recover the image to varying degrees of detail as we recreate the image from different numbers of dimensions from our SVD matrices. You can see how many dimensions are needed before you have an image that cannot be differentiated from the original.


library(rimage)
x <- read.jpeg("pansy.jpg")
dim(x)

[1] 600 465   3

plot(x, useRaster = TRUE)


r <- imagematrix(x, type="grey")
dim(r)

[1] 600 465


plot(r, useRaster= TRUE)



r.svd <- svd(r)
d <- diag(r.svd$d)
dim(d)

[1] 465 465

u <- r.svd$u
v <- r.svd$v
plot(1:length(r.svd$d), r.svd$d)


# first approximation
u1 <- as.matrix(u[-1, 1])
v1 <- as.matrix(v[-1, 1])
d1 <- as.matrix(d[1, 1])
l1 <- u1 %*% d1 %*% t(v1)
l1g <- imagematrix(l1, type = "grey")
plot(l1g, useRaster = TRUE)


# more approximation
depth <- 5
us <- as.matrix(u[, 1:depth])
vs <- as.matrix(v[, 1:depth])
ds <- as.matrix(d[1:depth, 1:depth])
ls <- us %*% ds %*% t(vs)
lsg <- imagematrix(ls, type = "grey")
plot(lsg, useRaster = TRUE)

Example 3: Principal Components Analysis using SVD

This example uses the Stata dataset auto.dta. PCA can be achieved using SVD.  Below, we first use the prcomp command in R and then plot the variances of the principal components (i.e. the square roots of the eigenvalues). These values can also be found through spectral decomposition on the correlation matrix or by SVD on the variable matrix after standardizing each variable. 


llibrary(foreign)
auto <- read.dta("http://statistics.ats.ucla.edu/stat/data/auto.dta")

pca.m1 <- prcomp(~ trunk + weight + length + headroom, data = auto, scale = TRUE)

screeplot(pca.m1)


# spectral decomposition: eigen values and eigen vectors
xvars <- with(auto, cbind(trunk, weight, length, headroom))
corr <- cor(xvars)
a <- eigen(corr)
(std <- sqrt(a$values))

[1] 1.7379 0.8075 0.5264 0.2249
(rotation <- a$vectors)
        [,1]    [,2]    [,3]      [,4]
[1,] -0.5068 -0.2327  0.8249  0.092146
[2,] -0.5221  0.4536 -0.2677  0.670840
[3,] -0.5361  0.3903 -0.1370 -0.735833
[4,] -0.4280 -0.7667 -0.4786 -0.005704

# svd approach
df <- nrow(xvars) - 1
zvars <- scale(xvars)
z.svd <- svd(zvars)
z.svd$d/sqrt(df)

[1] 1.7379 0.8075 0.5264 0.2249
z.svd$v
       [,1]    [,2]    [,3]      [,4]
[1,] 0.5068 -0.2327  0.8249 -0.092146
[2,] 0.5221  0.4536 -0.2677 -0.670840
[3,] 0.5361  0.3903 -0.1370  0.735833
[4,] 0.4280 -0.7667 -0.4786  0.005704

Example 4: Metric multi-dimensional scaling with SVD

This example uses the Stata dataset cerealnut.dta. Multi-dimensional scaling can also be achieved using SVD. The plots generated using cmdscale and the coordinates generated from the SVD steps are mirrored about the x/x1 = 0 axis, but are otherwise identical.


cnut <- read.dta("http://statistics.ats.ucla.edu/stat/data/cerealnut.dta")

# centering the variables
mds.data <- as.matrix(sweep(cnut[, -1], 2, colMeans(cnut[, -1])))
dismat <- dist(mds.data)
mds.m1 <- cmdscale(dismat, k = 8, eig = TRUE)
mds.m1$eig

 [1]  1.584e+05  1.087e+05  1.056e+04  3.827e+02  6.976e+01  1.252e+01  5.756e+00
 [8]  2.224e+00  6.467e-12  5.759e-12  2.660e-12  2.376e-12  2.214e-12  1.020e-12
[15]  7.264e-13  5.392e-13  3.527e-13 -5.217e-14 -3.928e-13 -6.939e-13 -1.011e-12
[22] -1.098e-12 -1.448e-12 -3.814e-12 -1.241e-11


mds.m1 <- cmdscale(dismat, k = 2, eig = TRUE)
x <- mds.m1$points[, 1]
y <- mds.m1$points[, 2]
plot(x, y)
text(x + 20, y, label = cnut$brand)


# eigenvalues
xx <- svd(mds.data %*% t(mds.data))
xx$d

 [1] 1.584e+05 1.087e+05 1.056e+04 3.827e+02 6.976e+01 1.252e+01 5.756e+00 2.224e+00
 [9] 8.195e-12 5.658e-12 4.377e-12 3.204e-12 2.204e-12 2.142e-12 1.816e-12 1.708e-12
[17] 1.292e-12 1.239e-12 1.146e-12 8.363e-13 6.949e-13 4.491e-13 2.551e-13 1.103e-13
[25] 5.319e-14


# coordinates
xxd <- xx$v %*% sqrt(diag(xx$d))
x1 <- xxd[, 1]
y1 <- xxd[, 2]

plot(x1, y1)
text(x1 + 20, y1, label = cnut$brand)



How to cite this page

Report an error on this page or leave a comment

UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California