Subsections

2.2.1 Singular Value Decomposition and Eigenvalue Decomposition

2.2.1.1 The Singular Value Decomposition (SVD)

There is a well known algorithm to determine the singular values of a nonsquare matrix $ \mathbf{A}$, the Singular Value Decomposition (SVD).

A SVD of $ \mathbf{A}\in\mathbb{R}^{m\times n}$ is:

$ \mathbf{A}=\mathbf{U\Sigma V}^{\mathrm{T}}$, where $ \mathbf{U}$ and $ \mathbf{V}$ are orthogonal $ m\times m$ resp. $ n\times n$-matrices, i.e. $ \mathbf{UU}^{\mathrm{T}}=\mathbf{I}\in\mathbb{R}^{m\times m},$ $ \mathbf{VV}
^{\mathrm{T}}=\mathbf{I}\in\mathbb{R}^{n\times n}$, which causes $ \left\Vert
\mathbf{U}\right\Vert =\left\Vert \mathbf{V}\right\Vert =1$.

$ \mathbf{\Sigma}$ is a ``diagonal'' $ m\times n$-Matrix; with singular values $ \sigma_{i}$ in the diagonal (see below), and $ \sigma_{1}\geq\sigma_{2}\geq \dots\geq\sigma_{n}\geq0.$

Now: $ \left\Vert \mathbf{r}\right\Vert =\left\Vert \mathbf{Ax-b}\right\Vert =\left\V...
...erset{\mathbf{c}}{\underbrace{\mathbf{U}^{\mathrm{T}}\mathbf{b}}}
\right\Vert .$ The last equality is true, since we have multiplied by $ \mathbf{%%
 U }^{\mathrm{-1}}=\mathbf{U}^{\mathrm{T}},$ and $ \left\Vert \mathbf{U}^{%%
\mathrm{ T} }\right\Vert =\left\Vert \mathbf{U}\right\Vert =1$, since $ \mathbf{U}$ is orthogonal.

So minimizing $ \left\Vert \mathbf{r}\right\Vert $ is equivalent to minimizing $ \left\Vert \mathbf{\Sigma\cdot z-c}\right\Vert$. Written out in components, this means minimizing

$\displaystyle \left\Vert \left( \begin{array}{ccc} \sigma_{1} & & 0  & \ddots...
... \vdots  c_{n}  c_{n+1}  \vdots  c_{m} \end{array} \right) \right\Vert$    with $\displaystyle \mathbf{z:=V}^{\mathrm{T}}\mathbf{x}$ and  $\displaystyle \mathbf{x=Vz}.$    


We now choose $ z_{k}$ so that $ \left\Vert \mathbf{r}\right\Vert $ is minimal, or:

$\displaystyle z_{k}=\frac{c_{k}}{\sigma_{k}}$    for $\displaystyle k=1\ldots n$    

when $ \sigma_{k}>0.$

2.2.1.2 The actual calculation of the SVD

2.2.1.2.1 A desirable improvement to minimize the number of calculations

In our kind of problem we have $ n=21$ and $ m>10000.$ To avoid matrices of size 10000, it would be nice if we could calculate with $ \mathbf{A}^{\mathrm{%%
T}}\mathbf{A}\in\mathbb{R}^{n\times n}, \quad\mathbf{b}^{\mathrm{T} }
\mathbf{A}\in\mathbb{R}^{1\times n},\quad$and $ \mathbf{b}^{\mathrm{T}}
\mathbf{b}\in\mathbb{R}$.

This can be done, especially since $ \mathbf{A}^{\mathrm{T}}\mathbf{A}$, $ %%
\mathbf{b}^{\mathrm{T}}\mathbf{A}$ and $ \mathbf{b}^{\mathrm{T}}\mathbf{b}$ can be calculated in the following easy way, when moving the sliding window across the sequences:

Let e.g. $ RAADT$ be a window of AAs at position $ i,$ (i.e. the midpoint of the window is at position $ i$). Let $ \mathbf{A}_{i}$ be the vector such that $ \mathbf{A}_{i}\cdot \mathbf{x}\triangleq R+A+A+D+T+k_{0}$, i.e. $ \mathbf{A}%%
_{i}=\left( 2,0,1,\ldots ,0\right) .$ In this case $ \mathbf{A}_{i}$ is the $ %%
i^{\mathrm{th}}$ row of the matrix $ \mathbf{A}.$

Let $ b_{i}$ be 1 if the AA at position $ i$ is part of an $ \alpha$-helix, 0 if the AA at position $ i$ is not part of an $ \alpha$-helix.

Starting with a 0-matrix for $ \mathbf{A}^{\mathrm{T}}\mathbf{A}$, a 0-vector for $ \mathbf{b}^{\mathrm{T}}\mathbf{A}$ and $ \mathbf{b}^{\mathrm{T}}\mathbf{b}=0$ we compute for $ i=1$ to $ m:$

$\displaystyle \mathbf{A}^{\mathrm{T}}\mathbf{A}+\mathbf{A}_{i}^{\mathrm{T}}\cdot \mathbf{A} _{i}$ $\displaystyle \longrightarrow \mathbf{A}^{\mathrm{T}}\mathbf{A}$    
$\displaystyle \mathbf{b}^{\mathrm{T}}\mathbf{A+b}_{i}\cdot \mathbf{A}_{i}^{\mathrm{T}}$ $\displaystyle \longrightarrow \mathbf{b}^{\mathrm{T}}\mathbf{A}$    
$\displaystyle \mathbf{b}^{\mathrm{T}}\mathbf{b+b}_{i}^{2}$ $\displaystyle \longrightarrow \mathbf{b}^{ \mathrm{T}}\mathbf{b}$    

These equations hold, since:


$\displaystyle \mathbf{A}^{\mathrm{T}}\mathbf{A}$ $\displaystyle \mathbf{=}$ \begin{displaymath}\left(
\begin{array}{cccc}
a_{11} & a_{21} & \cdots & a_{m1} ...
...
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{array}\right)\end{displaymath}  
  $\displaystyle =$ \begin{displaymath}\left(
\begin{array}{cccc}
a_{11}^{2}+a_{21}^{2}+\cdots +a_{m...
..._{2n}^{2}+a_{3n}^{2}+\cdots
+a_{mn}^{2}%%
\end{array}%%
\right)\end{displaymath}  

and

$\displaystyle \sum_{i=1..m}\mathbf{A}_{i}^{\mathrm{T}}\cdot \mathbf{A}_{i}$ $\displaystyle =$ \begin{displaymath}\sum_{i=1..m}\left(
\begin{array}{c}
a_{i1} \\
a_{i2} \\
\v...
...cccc}
a_{i1} & a_{i2} & \cdots & a_{in}%%
\end{array}%%
\right)\end{displaymath}  
  $\displaystyle =$ \begin{displaymath}\sum_{i=1..m}\left(
\begin{array}{cccc}
a_{i1}^{2} & a_{i1}a_...
...1} & a_{in}a_{i2} & \cdots & a_{in}^{2}%%
\end{array}%%
\right)\end{displaymath}  
  $\displaystyle =$ \begin{displaymath}\left(
\begin{array}{cccc}
a_{11}^{2}+a_{21}^{2}+...+a_{m1}^{...
..._{1n}^{2}+a_{2n}^{2}+\cdots +a_{mn}^{2}%%
\end{array}%%
\right)\end{displaymath}  

Comparing both results shows the stated identity.

It's much easier to compare

$\displaystyle \mathbf{b}^{\mathrm{T}}\mathbf{A=}\left( \begin{array}{cccc} b_{1...
...3}  \vdots  b_{1}a_{1n}+b_{2}a_{2n}+\cdots +b_{m}a_{mn} \end{array} \right)$    

with

$\displaystyle \sum_{i=1..m}b_{i}\mathbf{A}_{i}^{\mathrm{T}}=\sum_{i=1..m}b_{i}\...
...ray}{c} b_{i}a_{i1}  b_{i}a_{i2}  \vdots  b_{i}a_{in} \end{array} \right)$    

and

$\displaystyle \mathbf{b}^{\mathrm{T}}\mathbf{b=}\left( \begin{array}{cccc} b_{1...
...d{array} \right) =b_{1}^{2}+b_{2}^{2}+\cdots +b_{m}^{2}=\sum_{i=1..m}b_{i}^{2}.$    

In the next subsection we'll show a solution to our problem, which requires $ \mathbf{A}^{\mathrm{T}}\mathbf{A,}$ $ \mathbf{b}^{\mathrm{T}}\mathbf{A}$ and $ \mathbf{b}^{\mathrm{T}}\mathbf{b}$ alone (and does not use $ \mathbf{A}$ or $ \mathbf{b).}$

2.2.1.3 Least squares using Eigenvalues

We can express $ \left\Vert \mathbf{r}\right\Vert ^{2}$ in the following way:

$\displaystyle \left\Vert \mathbf{r}\right\Vert ^{2}$ $\displaystyle =\left\Vert \mathbf{Ax-b}\right\Vert ^{2}$    
  $\displaystyle =(\mathbf{A}\mathbf{x}-\mathbf{b})^{\mathrm{T}}(\mathbf{A}\mathbf{x}- \mathbf{b})$    
  $\displaystyle =(\mathbf{x}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}-\mathbf{b}^{\mathrm{T}})( \mathbf{A}\mathbf{x}-\mathbf{b})$    
  $\displaystyle =\mathbf{x}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\mathbf{A}\mathbf{...
...mathbf{b-b}^{\mathrm{T}} \mathbf{A}\mathbf{x}+\mathbf{b}^{\mathrm{T}}\mathbf{b}$    
  $\displaystyle =\mathbf{x}^{\mathrm{T}}\underbrace{\mathbf{A}^{\mathrm{T}}\mathb...
...{A}^{\mathrm{T}} \mathbf{b}}+\underbrace{\mathbf{b}^{\mathrm{  T}}\mathbf{b}}$, since $\displaystyle \mathbf{x}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\mathbf{b=\mathbf{b}^{\mathrm{T}}\mathbf{A}\mathbf{x}\in}\mathbb{R}.$    

Using the eigenvalue decomposition of $ \mathbf{A}^{\mathrm{T}}\mathbf{A}$ (notice, that $ \mathbf{A}^{\mathrm{T}}\mathbf{A}$ is symmetric!) we see:

$ \mathbf{A}^{\mathrm{T}}\mathbf{A=W\Lambda W}^{\mathrm{T}}$, where $ \mathbf{
 W}$ is an orthonormal $ n\times n$-matrix, and $ \mathbf{\Lambda} $ is a diagonal matrix with diagonal-entries (eigenvalues) $ \lambda _{i}$ and $ \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}$, hence $ \Vert
\mathbf{r}\Vert =\mathbf{x}^{\mathrm{T}}\mathbf{W\Lambda W}^{\mathrm{T}...
...f{A}^{\mathrm{T}}
\mathbf{b}}+\underbrace{\mathbf{b}^{\mathrm{ T}}\mathbf{b}}.$

Applying again the definition $ \mathbf{z}:=\mathbf{W}^{\mathrm{T}}\mathbf{x}$ and hence $ \mathbf{x}=\mathbf{Wz}$, we get:

$ \left\Vert \mathbf{r}\right\Vert ^{2}=\mathbf{z}^{\mathrm{T}}\mathbf{\Lambda}
...
...bf{W}^{\mathrm{T}}\mathbf{ A}^{
\mathrm{T}}\mathbf{b+b}^{\mathrm{T}}\mathbf{b}$

Now let $ \mathbf{W}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\mathbf{b}=: \mathbf{d}$, then, since $ dim\left(\mathbf{d}\right) =n$, it holds:

$\displaystyle \left\Vert \mathbf{r}\right\Vert ^{2}$ $\displaystyle =\mathbf{z}^{\mathrm{T}}\mathbf{\Lambda z-2z}^{\mathrm{T}}\mathbf{d+b}^{\mathrm{T}}\mathbf{b}$    
  $\displaystyle =\sum_{i=1}^{n}\left( \lambda_{i}\mathbf{z}_{i}^{2}-2\mathbf{d}_{i}\mathbf{  z}_i +\mathbf{b}_i^2\right)$    

The minimum of this equation is achieved for the minimum of each quadratic term, i.e. the derivative of each term with respect to $ z_{i}:$

$\displaystyle 2\lambda_{i}\mathbf{z}_{i}-2\mathbf{d}_{i}$ $\displaystyle =0$    
$\displaystyle \mathbf{z}_{i}$ $\displaystyle =\frac{\mathbf{d}_{i}}{\lambda_{i}}$    

2.2.1.4 The relation between the SVD of $ \mathbf{A}$ and the eigenvalue-decomposition of $ \mathbf{A}^{\mathrm{T}}\mathbf{A}$

Let $ \mathbf{{A}={U}\Sigma {V}^{\mathrm{T}}}$ be a SVD of $ \mathbf{A}$. Then $ \mathbf{A}^{\mathrm{T}}\mathbf{A=V\Sigma }^{\mathrm{T}}\mathbf{U}^{\mathrm{T}}...
...}^{\mathrm{T}}=\mathbf{V\Sigma }^{\mathrm{T}}\mathbf{%%
\Sigma V}^{\mathrm{T}}.$

Since $ \mathbf{\Sigma }^{\mathrm{T}}\mathbf{\Sigma }$ is a diagonal matrix, with diagonal entries $ \sigma _{i}^{2}$, it's obvious, that $ \mathbf{A}^{%%
\mathrm{T}}\mathbf{A=W\Lambda W}^{\mathrm{T}}=\mathbf{V\Sigma }^{\mathrm{T}}%%
\mathbf{\Sigma V}^{\mathrm{T}}$. So we can conclude, that $ \mathbf{\Sigma }^{%%
\mathrm{T}}\mathbf{\Sigma =\Lambda ,}$ and $ \mathbf{W}=\mathbf{V}$. In other words: $ \mathbf{ \sigma }_{i}=\sqrt{\lambda _{i}}$, and $ (\mathbf{V,\Lambda
})$ is the eigenvector/eigenvalue decomposition of $ \mathbf{A}^{\mathrm{T}}%%
\mathbf{A}$.

(Remark: $ \mathbf{AA}^{\mathrm{T}}=\mathbf{U\Sigma V}^{\mathrm{T}}\mathbf{\
V\Sigma }^{...
...}^{\mathrm{T}}=\mathbf{U\Sigma \Sigma }^{%%
\mathrm{T}}\mathbf{U}^{\mathrm{T}},$ so $ (\mathbf{U,\Lambda })$ is the eigenvector/eigenvalue decomposition of $ \mathbf{AA}^{\mathrm{T}}$. This means, that if we would need $ \mathbf{U}$ we would have to compute $ \mathbf{AA}^{\mathrm{T}}$. Fortunately this is normally not needed.)

SVD: $ \mathbf{{A}={U}\Sigma {V}}^{\mathrm{T}}$         (*)
EV: $ \mathbf{A}^{\mathrm{T}}\mathbf{{A}={V}\Lambda {V}}^{\mathrm{T}}%%
\mathbf{\qquad V}$ is the same for SVD and EV - see above!
Inserting (*) for $ \mathbf{A}$ leads to $ \mathbf{{V}\Sigma }^{\mathrm{T}}%%
\mathbf{U}^{\mathrm{T}}\mathbf{{U}\Sigma {V...
...\mathrm{T}}\mathbf{={V}\Sigma
}^{\mathrm{T}}\mathbf{\Sigma { V}}^{\mathrm{T}}$
  \begin{displaymath}\rightarrow \left(
\begin{array}{ccc}
\lambda _{1} & & 0  ...
...
& \ddots & \\
0 & & \sigma _{n}^{2}%%
\end{array}%%
\right) \end{displaymath}

Compare the interactive exercise ``SVD".

Gaston Gonnet, Institute for Scientific Computing, ETH Zürich, Switzerland
2002-02-24

With assistance from SkillsOnline and Web Pearls