Session 3: LU Factorization & Markov Process

Date: 09/25/2017, Monday

In [1]:
format compact

Read Canvas - Files - lectures - linear_algebra.pdf before looking at this material.

LU=PA Factorization

Gaussian elimination consists of forward elimination and backward substitution.

The backward substitution part is easy – you already have an upper triangular matrix \(U\), and just need to solve \(Ux=b\).
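
For reference, here is a minimal back-substitution sketch (my own illustration, not lecture code; the name backsub is hypothetical):

function x = backsub(U, b)
% Solve U*x = b for an upper triangular, non-singular U
n = length(b);
x = zeros(n, 1);
for i = n:-1:1
    % subtract the contribution of the already-solved entries, then divide by the pivot
    x(i) = (b(i) - U(i, i+1:n)*x(i+1:n)) / U(i, i);
end
end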

The forward elimination part is more interesting. By doing forward elimination by hand, you transformed the original matrix \(A\) into an upper triangular matrix \(U\). In fact, during this forward elimination process you not only produced the matrix \(U\), but also constructed two other matrices \(L\) and \(P\) (even if you didn’t notice it!). They satisfy

\[LU=PA\]
  • \(L\) is a lower triangular matrix with all diagonal elements equal to 1. It contains all the multipliers used during the forward elimination.
  • \(P\) is a permutation matrix containing only 0s and 1s. It accounts for all the row swaps performed during the forward elimination.

Row operation as matrix multiplication

Basic idea

To understand how \(LU=PA\) works, you should always keep in mind that

row operation = left multiply

Or more verbosely

A row operation on matrix A = left-multiply a matrix L to A (i.e. calculate LA)

This is a crucial concept in linear algebra.

Let’s see an example:

In [2]:
A = [10 -7 0; -3 2 6 ;5 -1 5]
A =
    10    -7     0
    -3     2     6
     5    -1     5

Perform the first step of Gaussian elimination, i.e. add 0.3*row1 to row2.

In [3]:
A1 = A; % make a copy
A1(2,:) = A(2,:)+0.3*A(1,:)
A1 =
   10.0000   -7.0000         0
         0   -0.1000    6.0000
    5.0000   -1.0000    5.0000

There’s another way to perform the above row-operation: left-multiply A by an elementary matrix.

In [4]:
L1 = [1,0,0; 0.3,1,0; 0,0,1] % make our elementary matrix
L1 =
    1.0000         0         0
    0.3000    1.0000         0
         0         0    1.0000
In [5]:
L1*A
ans =
   10.0000   -7.0000         0
         0   -0.1000    6.0000
    5.0000   -1.0000    5.0000

L1*A gives the same result as the previous row-operation!

Let’s repeat this idea again:

row operation = left multiply

Find elementary matrix

How do you find the matrix L1? Just perform the row operation on an identity matrix:

In [6]:
L1 = eye(3); % 3x3 identity matrix
L1(2,:) = L1(2,:)+0.3*L1(1,:) % the row-operation you want to "encode" into this matrix
L1 =
    1.0000         0         0
    0.3000    1.0000         0
         0         0    1.0000

Then you can perform L1*A to apply this row-operation on any matrix A.
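
As a quick check that the same L1 works on any 3x3 matrix (B and magic(3) are just an arbitrary example, not from the lecture):

B = magic(3)  % an arbitrary 3x3 matrix
L1*B          % row 2 of B gets 0.3*row1 added to it, just like before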

Same for the permutation operation, as it is also an elementary row operation.

In [7]:
Ap = A; % make a copy
% swap rows 1 and 2
Ap(2,:) = A(1,:);
Ap(1,:) = A(2,:);
Ap
Ap =
    -3     2     6
    10    -7     0
     5    -1     5

You can “encode” this row-swapping operation into an elementary permutation matrix.

In [8]:
I = eye(3); % 3x3 identity matrix
P1 = I;

% swap rows 1 and 2
P1(2,:) = I(1,:);
P1(1,:) = I(2,:);
P1
P1 =
     0     1     0
     1     0     0
     0     0     1
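
As an aside, MATLAB's row indexing can build the same permutation matrix in one line (equivalent to the assignments above):

P1 = I([2 1 3], :)  % take the rows of the identity matrix in swapped order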

Left-multiplying A by P1 is equivalent to permuting A directly:

In [9]:
P1*A % same as Ap
ans =
    -3     2     6
    10    -7     0
     5    -1     5

Get L during forward elimination

For simplicity, assume you don’t need any permutation steps. Then you just transform an arbitrary (non-singular, of course) 3x3 matrix A into an upper triangular matrix U by 3 row operations. These operations are equivalent to left-multiplying A by 3 matrices \(L_1,L_2,L_3\)

\[A \rightarrow L_1 A \rightarrow L_2 L_1 A \rightarrow L_3 L_2 L_1 A = U\]

We can rewrite it as

\[A = (L_3 L_2 L_1)^{-1}U\]

Or

\[A = LU , \ \ L= (L_3 L_2 L_1)^{-1}\]

It is easy to get \(L\) as long as you know \(L_1,L_2,L_3\) from the operations you’ve performed.

In [10]:
A % show A's value again
A =
    10    -7     0
    -3     2     6
     5    -1     5
In [11]:
L1 = [1,0,0; 0.3,1,0; 0,0,1] % repeat L1 again
L1 =
    1.0000         0         0
    0.3000    1.0000         0
         0         0    1.0000
In [12]:
L1*A % row operation by left-multiply
ans =
   10.0000   -7.0000         0
         0   -0.1000    6.0000
    5.0000   -1.0000    5.0000
In [13]:
L2 = [1,0,0; 0,1,0; -0.5,0,1] % build the next elimination step
L2 =
    1.0000         0         0
         0    1.0000         0
   -0.5000         0    1.0000
In [14]:
L2*L1*A % apply the next elimination step
ans =
   10.0000   -7.0000         0
         0   -0.1000    6.0000
         0    2.5000    5.0000
In [15]:
L3 = [1,0,0; 0,1,0; 0,25,1] % build the last elimination step
L3 =
     1     0     0
     0     1     0
     0    25     1
In [16]:
U = L3*L2*L1*A % apply the last elimination step
U =
   10.0000   -7.0000         0
         0   -0.1000    6.0000
         0         0  155.0000

Now you’ve transformed \(A\) into an upper triangular matrix \(U\). And you also have \(L\):

In [16]:
L = inv(L3*L2*L1)
L =
    1.0000         0         0
   -0.3000    1.0000         0
    0.5000  -25.0000    1.0000

Or

In [17]:
L = inv(L1)*inv(L2)*inv(L3)
L =
    1.0000         0         0
   -0.3000    1.0000         0
    0.5000  -25.0000    1.0000

Calculating \(L\) is just a matter of collecting the multipliers from \(L_1,L_2,L_3\) into one matrix and negating them (the diagonal elements stay 1).

Why? Again, because

row operation = left multiply

Here we are just encoding multiple row operations \(L_1^{-1},L_2^{-1},L_3^{-1}\) into a single matrix \(L\). You get this matrix by applying all those operations to an identity matrix.
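
Here is a sketch of that construction (my own illustration, not lecture code). Since \(L = L_1^{-1} L_2^{-1} L_3^{-1} I\), the row operation encoded by \(L_3^{-1}\) is applied to the identity matrix first:

L = eye(3);
L(3,:) = L(3,:) - 25*L(2,:);   % undo L3 (applied first)
L(3,:) = L(3,:) + 0.5*L(1,:);  % undo L2
L(2,:) = L(2,:) - 0.3*L(1,:)   % undo L1; L now equals inv(L3*L2*L1)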

You can think of \(L_1^{-1}\) as “a row operation that cancels the effect of \(L_1\)”:

In [18]:
inv(L1)
ans =
    1.0000         0         0
   -0.3000    1.0000         0
         0         0    1.0000

Last, let’s verify \(A=LU\)

In [19]:
L*U
ans =
   10.0000   -7.0000         0
   -3.0000    2.0000    6.0000
    5.0000   -1.0000    5.0000

We can say that \(L\) represents all the forward elimination steps (assuming no permutations). Knowing \(L\), you can easily get \(U\) by \(U=L^{-1}A\).
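
In MATLAB, the backslash operator solves this without forming the inverse explicitly; a quick check (U2 is just an illustrative name):

U2 = L\A      % same result as inv(L)*A, but cheaper and more accurate
norm(U2 - U)  % should be zero (up to rounding error)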

Get P during forward elimination

Say you have a permutation step P somewhere, for example

\[L_3 P L_2 L_1 A = U\]

It can be shown that \(P\) can be “pushed rightward” (the multipliers inside \(L_1,L_2\) get reordered along the way, but each \(L_k\) keeps the same lower triangular structure):

\[L_3 L_2 L_1 (PA) = U\]

(The proof is too technical to give here; see p. 159 of Trefethen, L. N. and Bau, D., Numerical Linear Algebra, SIAM, 1997 if you are interested.)

Thus

\[LU=PA\]
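
MATLAB's built-in lu function returns all three factors at once. Be aware that it uses partial pivoting, so its P (and therefore its L and U) may differ from the factors we built by hand without pivoting:

[L, U, P] = lu(A)  % unit lower triangular L, upper triangular U, permutation P
norm(L*U - P*A)    % should be zero (up to rounding error)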

Markov process

There are two towns, each with 100 people (so 200 in total). Every day, 50% of the people in town 1 move to town 2, while 30% of the people in town 2 move to town 1. Eventually, how many people will each town have?

The initial condition is

\[\begin{split}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 100 \\ 100 \end{bmatrix}\end{split}\]

Each day

\[\begin{split}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \rightarrow \begin{bmatrix} x_1 - 0.5x_1 + 0.3x_2 \\ x_2 + 0.5x_1 - 0.3x_2 \end{bmatrix} = \begin{bmatrix} 0.5x_1 + 0.3x_2 \\ 0.5x_1 + 0.7x_2 \end{bmatrix} = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\end{split}\]

This is a Markov process. We can get its Markov matrix (or transition matrix)

\[\begin{split}A = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}\end{split}\]

Each column of A sums to 1, because its entries are probabilities: a person in a given town must either move or stay.

In [20]:
x = [100; 100] % initial condition
A = [0.5 0.3;0.5 0.7] % Markov matrix
x =
   100
   100
A =
    0.5000    0.3000
    0.5000    0.7000
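
A quick sanity check: sum(A) returns the sum of each column.

sum(A)  % ans = [1 1], i.e. each column sums to 1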

On the second day, the number of people will be:

In [21]:
A*x
ans =
    80
   120

Town 2 gains people. This is expected, because the town1 -> town2 move has a higher probability (0.5) than town2 -> town1 (0.3).

How about after 10 days?

In [22]:
x10 = A^10*x
x10 =
   75.0000
  125.0000

11 days?

In [23]:
x11 = A*x10
x11 =
   75.0000
  125.0000

There’s no difference between day 10 and day 11, which means the population has reached equilibrium. Repeatedly multiplying by \(A\) like this is called the power method; here it finds the steady-state vector of the transition matrix (an eigenvector of \(A\) with eigenvalue 1).
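
As a sketch of the idea (not the Pset method; x_inf is just an illustrative name), the equilibrium can also be computed directly as the eigenvalue-1 eigenvector of \(A\), rescaled to the total population:

[V, D] = eig(A);        % the eigenvalues of A are 1 and 0.2
[~, k] = max(diag(D));  % locate the eigenvalue-1 eigenvector
v = V(:, k);
x_inf = 200 * v / sum(v)  % rescale so the entries sum to 200; gives [75; 125]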

This power method is intuitive but not very efficient. For a fancier method for Pset3, see Canvas - Files - scripts_and_code - lin_algebra - pagerank_demo_template.m.