Session 3: LU Factorization & Markov Process¶
Date: 09/25/2017, Monday
In [1]:
format compact
Read Canvas - Files - lectures- linear_algebra.pdf first before looking at this material.
LU=PA Factorization¶
Gaussian elimination consists of forward elimination and backward substitution.
The backward substitution part is easy – you already have an upper triangular matrix \(U\), and just need to solve \(Ux=b\).
The forward elimination part is more interesting. By doing forward elimination by hand, you transform the original matrix \(A\) into an upper triangular matrix \(U\). In fact, during this forward elimination process, you not only produce the matrix \(U\), but also construct two other matrices \(L\) and \(P\) (even if you didn't notice it!). They satisfy \(LU = PA\), where
- \(L\) is a lower triangular matrix with all diagonal elements equal to 1. It contains all the multipliers used during the forward elimination.
- \(P\) is a permutation matrix containing only 0s and 1s. It accounts for all row swaps during the forward elimination.
Row operation as matrix multiplication¶
Basic idea¶
To understand how \(LU=PA\) works, you should always keep in mind that
row operation = left multiply
Or more verbosely
Applying a row operation to a matrix A = left-multiplying A by a matrix L (i.e. calculating LA)
This is a crucial concept in linear algebra.
Let’s see an example:
In [2]:
A = [10 -7 0; -3 2 6 ;5 -1 5]
A =
10 -7 0
-3 2 6
5 -1 5
Perform the first step of Gaussian elimination, i.e. add 0.3*row1 to row2.
In [3]:
A1 = A; % make a copy
A1(2,:) = A(2,:)+0.3*A(1,:)
A1 =
10.0000 -7.0000 0
0 -0.1000 6.0000
5.0000 -1.0000 5.0000
There’s another way to perform the above row-operation: left-multiply A by an elementary matrix.
In [4]:
L1 = [1,0,0; 0.3,1,0; 0,0,1] % make our elementary matrix
L1 =
1.0000 0 0
0.3000 1.0000 0
0 0 1.0000
In [5]:
L1*A
ans =
10.0000 -7.0000 0
0 -0.1000 6.0000
5.0000 -1.0000 5.0000
L1*A gives the same result as the previous row-operation!
Let’s repeat this idea again:
row operation = left multiply
Find elementary matrix¶
How do you find the matrix L1? Just apply the row operation to an identity matrix:
In [6]:
L1 = eye(3); % 3x3 identity matrix
L1(2,:) = L1(2,:)+0.3*L1(1,:) % the row-operation you want to "encode" into this matrix
L1 =
1.0000 0 0
0.3000 1.0000 0
0 0 1.0000
Then you can perform L1*A to apply this row-operation on any matrix A.
The same works for permutations, since swapping rows is also an elementary row operation.
In [7]:
Ap = A; % make a copy
% swap rows 1 and 2
Ap(2,:) = A(1,:);
Ap(1,:) = A(2,:);
Ap
Ap =
-3 2 6
10 -7 0
5 -1 5
You can “encode” this row-swapping operation into an elementary permutation matrix.
In [8]:
I = eye(3); % 3x3 identity matrix
P1 = I;
% swap rows 1 and 2
P1(2,:) = I(1,:);
P1(1,:) = I(2,:);
P1
P1 =
0 1 0
1 0 0
0 0 1
Left-multiplying A by P1 is equivalent to permuting A directly:
In [9]:
P1*A % same as Ap
ans =
-3 2 6
10 -7 0
5 -1 5
Get L during forward elimination¶
For simplicity, assume you don't need any permutation steps. Then you just transform an arbitrary 3x3 matrix A (non-singular, of course) to an upper triangular matrix U by 3 row operations. Such operations are equivalent to left-multiplying A by 3 matrices \(L_1,L_2,L_3\):
\[ L_3 L_2 L_1 A = U \]
We can rewrite it as
\[ A = (L_3 L_2 L_1)^{-1} U \]
Or
\[ A = LU, \qquad L = (L_3 L_2 L_1)^{-1} = L_1^{-1} L_2^{-1} L_3^{-1} \]
It is easy to get \(L\) as long as you know \(L_1,L_2,L_3\) from the operations you’ve performed.
In [10]:
A % show A's value again
A =
10 -7 0
-3 2 6
5 -1 5
In [11]:
L1 = [1,0,0; 0.3,1,0; 0,0,1] % repeat L1 again
L1 =
1.0000 0 0
0.3000 1.0000 0
0 0 1.0000
In [12]:
L1*A % row operation by left-multiply
ans =
10.0000 -7.0000 0
0 -0.1000 6.0000
5.0000 -1.0000 5.0000
In [13]:
L2 = [1,0,0; 0,1,0; -0.5,0,1] % build the next elimination step
L2 =
1.0000 0 0
0 1.0000 0
-0.5000 0 1.0000
In [14]:
L2*L1*A % apply the next elimination step
ans =
10.0000 -7.0000 0
0 -0.1000 6.0000
0 2.5000 5.0000
In [15]:
L3 = [1,0,0; 0,1,0; 0,25,1] % build the last elimination step
L3 =
1 0 0
0 1 0
0 25 1
In [16]:
U = L3*L2*L1*A % apply the last elimination step
U =
10.0000 -7.0000 0
0 -0.1000 6.0000
0 0 155.0000
Now you’ve transformed \(A\) into an upper triangular matrix \(U\). And you also have \(L\):
In [16]:
L = inv(L3*L2*L1)
L =
1.0000 0 0
-0.3000 1.0000 0
0.5000 -25.0000 1.0000
Or
In [17]:
L = inv(L1)*inv(L2)*inv(L3)
L =
1.0000 0 0
-0.3000 1.0000 0
0.5000 -25.0000 1.0000
Calculating \(L\) is just putting the multipliers from \(L_1,L_2,L_3\) together and negating them (except the diagonal elements).
Why? Again, because
row operation = left multiply
Here we are just encoding multiple row operations \(L_1^{-1},L_2^{-1},L_3^{-1}\) into a single matrix \(L\). You get this matrix by applying all those operations to an identity matrix.
You can think of \(L_1^{-1}\) as “a row operation that cancels the effect of \(L_1\)“:
In [18]:
inv(L1)
ans =
1.0000 0 0
-0.3000 1.0000 0
0 0 1.0000
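This "negate the multipliers" claim can be checked outside MATLAB. Below is a minimal pure-Python sketch (the helper name `matmul` is our own): build \(L\) directly by negating the off-diagonal multipliers of \(L_1,L_2,L_3\), then verify that \(L \cdot (L_3 L_2 L_1)\) is the identity, i.e. \(L = (L_3L_2L_1)^{-1}\).

```python
# Check: L built by negating the multipliers of L1, L2, L3 equals
# inv(L3*L2*L1). The helper name matmul is our own.

def matmul(X, Y):
    """Naive product of two square matrices stored as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

L1 = [[1, 0, 0], [0.3, 1, 0], [0, 0, 1]]
L2 = [[1, 0, 0], [0, 1, 0], [-0.5, 0, 1]]
L3 = [[1, 0, 0], [0, 1, 0], [0, 25, 1]]

# Negate the multipliers 0.3, -0.5, 25 and place them in one matrix:
L = [[1, 0, 0],
     [-0.3, 1, 0],
     [0.5, -25, 1]]

prod = matmul(L, matmul(L3, matmul(L2, L1)))
print(prod)  # identity matrix, up to rounding
```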
Last, let’s verify \(A=LU\)
In [19]:
L*U
ans =
10.0000 -7.0000 0
-3.0000 2.0000 6.0000
5.0000 -1.0000 5.0000
We can say that \(L\) represents all forward elimination steps (assuming no permutation). Knowing \(L\), you can easily get \(U\) by \(U=L^{-1}A\)
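Once \(L\) and \(U\) are known, solving \(Ax=b\) takes only two triangular solves: \(Ly=b\) (forward substitution), then \(Ux=y\) (back substitution). A minimal pure-Python sketch, using the \(L\) and \(U\) computed above and a right-hand side `b = [7, 4, 6]` of our own choosing:

```python
# Sketch: with A = LU, solve Ax = b in two triangular solves.
# L, U are the factors from the example; b is our own example vector.

L = [[1, 0, 0],
     [-0.3, 1, 0],
     [0.5, -25, 1]]
U = [[10, -7, 0],
     [0, -0.1, 6],
     [0, 0, 155]]
b = [7, 4, 6]

n = 3
# Forward substitution Ly = b: L has unit diagonal, so no division needed.
y = [0.0] * n
for i in range(n):
    y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))

# Back substitution Ux = y on the upper triangular U.
x = [0.0] * n
for i in reversed(range(n)):
    x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]

print(x)  # approximately [0, -1, 1], and indeed A*[0,-1,1] = [7,4,6]
```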
Get P during forward elimination¶
Say you have a permutation step \(P_1\) somewhere, for example
\[ L_2 P_1 L_1 A = U \]
It can be shown that \(P_1\) can be "pushed rightward" past the elimination steps:
\[ L_2 L_1' P_1 A = U, \qquad L_1' = P_1 L_1 P_1^{-1} \]
(The proof is too technical. See p. 159 of Trefethen L N, Bau III D. Numerical Linear Algebra. SIAM, 1997 if you are interested.)
Thus
\[ P_1 A = (L_2 L_1')^{-1} U = LU, \]
i.e. \(LU = PA\) with \(P = P_1\).
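The whole pivoted factorization can be sketched in pure Python: do forward elimination with partial pivoting, record the multipliers in \(L\) and the row swaps in \(P\), and check \(LU = PA\) on the example matrix. The function names (`lu_pa`, `matmul`) are our own; this is a teaching sketch, not a production routine.

```python
# Sketch of forward elimination with partial pivoting: transform A to U
# while recording the multipliers (L) and the row swaps (P), so LU = PA.

def lu_pa(A):
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        # Partial pivoting: bring the largest |entry| in column k to row k.
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if p != k:
            U[k], U[p] = U[p], U[k]
            P[k], P[p] = P[p], P[k]
            # Swap the already-stored multipliers too.
            L[k][:k], L[p][:k] = L[p][:k], L[k][:k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]        # multiplier for this row operation
            L[i][k] = m                  # store it in L
            for j in range(k, n):
                U[i][j] -= m * U[k][j]   # row_i := row_i - m * row_k
    return L, U, P

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[10, -7, 0], [-3, 2, 6], [5, -1, 5]]
L, U, P = lu_pa(A)
print(matmul(L, U))  # matches P*A
print(matmul(P, A))
```

Note that with pivoting the factors differ from the hand-computed ones above (the pivot for column 2 is 2.5, not -0.1), but \(LU = PA\) still holds.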
Markov process¶
There are two towns, each with 100 people (so 200 in total). Every day, 50% of the people in town 1 move to town 2, while 30% of the people in town 2 move to town 1. Eventually, how many people will each town have?
The initial condition is
\[ x_0 = \begin{bmatrix} 100 \\ 100 \end{bmatrix} \]
Each day
\[ x_{k+1} = A x_k \]
This is a Markov process. We can get its Markov matrix (or transition matrix)
\[ A = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix} \]
Each column of A sums to 1, because its entries are transition probabilities: every person must end up somewhere.
In [20]:
x = [100; 100] % initial condition
A = [0.5 0.3;0.5 0.7] % Markov matrix
x =
100
100
A =
0.5000 0.3000
0.5000 0.7000
On the second day, the number of people will be:
In [21]:
A*x
ans =
80
120
Town 2 gets more people. This is expected because the town1->town2 probability (0.5) is higher than the town2->town1 probability (0.3).
How about after 10 days?
In [22]:
x10 = A^10*x
x10 =
75.0000
125.0000
11 days?
In [23]:
x11 = A*x10
x11 =
75.0000
125.0000
There’s no difference between day 10 and day 11, which means the population has reached equilibrium. Repeatedly multiplying by \(A\) like this is called the power method; the equilibrium it finds is the steady-state vector of this transition matrix.
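The equilibrium can also be found directly: it is the eigenvector of \(A\) with eigenvalue 1, normalized so the populations sum to 200:
\[
Ax = x \;\Longrightarrow\; 0.5x_1 + 0.3x_2 = x_1 \;\Longrightarrow\; 0.5x_1 = 0.3x_2,
\]
and together with \(x_1 + x_2 = 200\) this gives \(x_1 = 75\), \(x_2 = 125\), matching the result above.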
This power method is intuitive but not very efficient. For a fancier method for Pset3, see Canvas - Files - scripts_and_code - lin_algebra - pagerank_demo_template.m.
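For reference, the power iteration itself fits in a few lines of pure Python (the tolerance and iteration cap below are our own choices):

```python
# Sketch of the power method for this Markov chain: repeatedly apply A
# until the population vector stops changing.

A = [[0.5, 0.3],
     [0.5, 0.7]]
x = [100.0, 100.0]  # initial populations of town 1 and town 2

x_new = x
for day in range(1000):  # iteration cap is an arbitrary safety bound
    x_new = [A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1]]
    if max(abs(a - b) for a, b in zip(x_new, x)) < 1e-9:  # converged
        break
    x = x_new

print(x_new)  # approaches the equilibrium [75, 125]
```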