Session 5: MATLAB backslash & some pitfalls¶
Date: 10/16/2017, Monday
In [1]:
format compact
Different behaviors of backslash¶
You’ve already used the backslash \
a lot to solve linear systems.
But if you look at MATLAB backslash
documentation
you will find it can do much more than solving linear systems. This
powerful feature can sometimes be very confusing if you are not careful.
Standard linear system¶
First see a standard problem: a 4x4, full rank square matrix \(A\) and a 4-element column vector \(b\). We want to solve
In [2]:
b = rand(4,1)
b =
0.8147
0.9058
0.1270
0.9134
In [3]:
A = rand(4,4)
A =
0.6324 0.9575 0.9572 0.4218
0.0975 0.9649 0.4854 0.9157
0.2785 0.1576 0.8003 0.7922
0.5469 0.9706 0.1419 0.9595
Well, not big deal…
In [4]:
x = A\b
x =
-0.0486
0.9204
-0.0630
0.0579
We can verify the result:
In [5]:
A*x - b % almost zero
ans =
1.0e-15 *
0
0
-0.0555
-0.1110
Incorrect shape¶
You should already notice that \(b\) must be a column vector.
In [6]:
x = A\b' % row vector doesn't work
Error using \
Matrix dimensions must agree.
Also, \(A\) and \(b\) must have the same number of rows, because \(A\) and \(b\) come from the following linear system with \(n\) rows (equations) and \(m\) columns (variables):
In [7]:
A = rand(3,4) % A has 3 rows but b has 4 rows
A =
0.6557 0.9340 0.7431 0.1712
0.0357 0.6787 0.3922 0.7060
0.8491 0.7577 0.6555 0.0318
In [8]:
A\b % doesn't work
Error using \
Matrix dimensions must agree.
Looks like \(A\) has to be a square matrix, but see below…
Over-detemined linear system¶
In fact, A doesn’t have to be a square matrix, just like a linear system doesn’t need to have the same number of rows (equations) and columns (variables).
Here we create an over-determined system
An over-determined system means A is tall and narrow (more rows than columns)
In [9]:
A = rand(4,3) % more equations (rows) than variables (columns)
A =
0.2769 0.6948 0.4387
0.0462 0.3171 0.3816
0.0971 0.9502 0.7655
0.8235 0.0344 0.7952
In [10]:
x = A\b % backslash works
x =
0.9798
0.2901
0.2142
MATLAB does return a set of \((x_1,x_2,x_3)\). But you know this can’t be the solution because you can’t fulfill 4 equations by just 3 degrees of freedom.
In [11]:
A*x - b % not zero, so x is not a solution
ans =
-0.2479
-0.6868
0.4078
0.0738
So what’s x
? It is actually a least-square fit, same as the result
from the normal equation
In [12]:
(A'*A)\(A'*b) % solve normal equation
ans =
0.9798
0.2901
0.2142
In MATLAB, simply using A\b
is actually more accurate than
(A'*A)\(A'*b)
for solving least squares, especially for large
matrices. That’s because the condition number of \(A^TA\) could be
very large.
In [13]:
cond(A)
cond(A'*A)
ans =
8.7242
ans =
76.1114
We find \(cond(A^TA) = cond(A)^2\). It is not quite large in this tiny case but could explode for large matrices
So how to avoid the normal equation?
Recall that, for a standard, full rank system \(Ax=b\), the code
A\b
doesn’t compute \(A^{-1}b\) at all, because \(A^{-1}\)
is often ill-conditioned. It uses LU
factorization which
is better-conditioned.
Similarly, for an over-determined \(Ax=b\), the code A\b
doesn’t
compute \(A^TA\) at all. It uses QR
factorization or
similar techniques to make the problem better-conditioned.
Under-determined linear system¶
Now let’s look at an under-determined system
An under-determined system means A is short and wide (more columns than rows)
In [14]:
A = rand(4,5) % less equations (rows) than variables (columns)
A =
0.1869 0.7094 0.6551 0.9597 0.7513
0.4898 0.7547 0.1626 0.3404 0.2551
0.4456 0.2760 0.1190 0.5853 0.5060
0.6463 0.6797 0.4984 0.2238 0.6991
In [15]:
x = A\b % backslash still works
x =
-0.1886
1.4263
0.2995
-0.3730
0
We can see x is the solution to \(Ax=b\):
In [16]:
A*x - b % almost zero
ans =
1.0e-15 *
0.4441
0.3331
0.0278
0.1110
However, we know that an under-determined system has infinite number of solutions. But MATLAB just returns one value. What’s special about this value?
It turns out that the x
we get here has the smallest norm
\(||x||_2\) among all possible solutions.
In [17]:
norm(x) % smaller than any other possible solutions
ans =
1.5162
Another multiple-behavior example¶
Because the backslash operator \
has different behaviors under
different circumstances, you have to be very careful. Sometimes your
matrix shape might be wrong, but MATLAB will still return a result. But
in that case you might be solving a least square problem instead of a
full-rank linear system!
MATLAB has quite a lot of multi-behavior (“poly-algorithm”) functions. Another example is the built-in lu() function for LU factorization.
In [18]:
A = rand(4,4) % to be LU-factorized
A =
0.8909 0.1493 0.8143 0.1966
0.9593 0.2575 0.2435 0.2511
0.5472 0.8407 0.9293 0.6160
0.1386 0.2543 0.3500 0.4733
Just calling lu()
with no return, you get a single matrix with
\(L\) and \(U\) stacked together.
In [19]:
lu(A)
ans =
0.9593 0.2575 0.2435 0.2511
0.5704 0.6938 0.7903 0.4728
0.9287 -0.1295 0.6905 0.0246
0.1445 0.3129 0.0978 0.2867
Using [L,U]
to hold the return, you get two separate matrices.
Notice that L
is not a strict lower-triangular matrix, because it
also incorporates the pivoting matrix \(P\)
In [20]:
[L,U] = lu(A)
L =
0.9287 -0.1295 1.0000 0
1.0000 0 0 0
0.5704 1.0000 0 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867
Using [L,U,P]
to hold the return, you get all three matrices. Now
L
is strictly lower-triangular.
In [21]:
[L,U,P] = lu(A)
L =
1.0000 0 0 0
0.5704 1.0000 0 0
0.9287 -0.1295 1.0000 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867
P =
0 1 0 0
0 0 1 0
1 0 0 0
0 0 0 1
You can ignore the returning P
by ~
, but keep in mind that the
returning L
is different from that in cell [20], although the code
looks very similar.
In [22]:
[L,U,~] = lu(A)
L =
1.0000 0 0 0
0.5704 1.0000 0 0
0.9287 -0.1295 1.0000 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867