# Session 5: MATLAB backslash & some pitfalls¶

Date: 10/16/2017, Monday

```
In [1]:
```

```
format compact
```

## Different behaviors of backslash¶

You’ve already used the backslash `\`

a lot to solve linear systems.
But if you look at MATLAB backslash
documentation
you will find it can do much more than solving linear systems. This
powerful feature can sometimes be very confusing if you are not careful.

### Standard linear system¶

First see a standard problem: a 4x4, full rank square matrix \(A\) and a 4-element column vector \(b\). We want to solve

```
In [2]:
```

```
b = rand(4,1)
```

```
b =
0.8147
0.9058
0.1270
0.9134
```

```
In [3]:
```

```
A = rand(4,4)
```

```
A =
0.6324 0.9575 0.9572 0.4218
0.0975 0.9649 0.4854 0.9157
0.2785 0.1576 0.8003 0.7922
0.5469 0.9706 0.1419 0.9595
```

Well, not big deal…

```
In [4]:
```

```
x = A\b
```

```
x =
-0.0486
0.9204
-0.0630
0.0579
```

We can verify the result:

```
In [5]:
```

```
A*x - b % almost zero
```

```
ans =
1.0e-15 *
0
0
-0.0555
-0.1110
```

### Incorrect shape¶

You should already notice that \(b\) must be a column vector.

```
In [6]:
```

```
x = A\b' % row vector doesn't work
```

```
Error using \
Matrix dimensions must agree.
```

Also, \(A\) and \(b\) must have the same number of rows, because \(A\) and \(b\) come from the following linear system with \(n\) rows (equations) and \(m\) columns (variables):

```
In [7]:
```

```
A = rand(3,4) % A has 3 rows but b has 4 rows
```

```
A =
0.6557 0.9340 0.7431 0.1712
0.0357 0.6787 0.3922 0.7060
0.8491 0.7577 0.6555 0.0318
```

```
In [8]:
```

```
A\b % doesn't work
```

```
Error using \
Matrix dimensions must agree.
```

Looks like \(A\) has to be a square matrix, but see below…

### Over-detemined linear system¶

In fact, A doesn’t have to be a square matrix, just like a linear system doesn’t need to have the same number of rows (equations) and columns (variables).

Here we create an over-determined system

An **over-determined** system means A is **tall** and **narrow** (more
rows than columns)

```
In [9]:
```

```
A = rand(4,3) % more equations (rows) than variables (columns)
```

```
A =
0.2769 0.6948 0.4387
0.0462 0.3171 0.3816
0.0971 0.9502 0.7655
0.8235 0.0344 0.7952
```

```
In [10]:
```

```
x = A\b % backslash works
```

```
x =
0.9798
0.2901
0.2142
```

MATLAB does return a set of \((x_1,x_2,x_3)\). But you know this can’t be the solution because you can’t fulfill 4 equations by just 3 degrees of freedom.

```
In [11]:
```

```
A*x - b % not zero, so x is not a solution
```

```
ans =
-0.2479
-0.6868
0.4078
0.0738
```

So what’s `x`

? It is actually a least-square fit, same as the result
from the normal equation

```
In [12]:
```

```
(A'*A)\(A'*b) % solve normal equation
```

```
ans =
0.9798
0.2901
0.2142
```

In MATLAB, simply using `A\b`

is actually more accurate than
`(A'*A)\(A'*b)`

for solving least squares, especially for large
matrices. That’s because the condition number of \(A^TA\) could be
very large.

```
In [13]:
```

```
cond(A)
cond(A'*A)
```

```
ans =
8.7242
ans =
76.1114
```

We find \(cond(A^TA) = cond(A)^2\). It is not quite large in this tiny case but could explode for large matrices

So how to avoid the normal equation?

Recall that, for a standard, full rank system \(Ax=b\), the code
`A\b`

doesn’t compute \(A^{-1}b\) at all, because \(A^{-1}\)
is often ill-conditioned. It uses LU
factorization which
is better-conditioned.

Similarly, for an over-determined \(Ax=b\), the code `A\b`

doesn’t
compute \(A^TA\) at all. It uses QR
factorization or
similar techniques to make the problem better-conditioned.

### Under-determined linear system¶

Now let’s look at an under-determined system

An **under-determined** system means A is **short** and **wide** (more
columns than rows)

```
In [14]:
```

```
A = rand(4,5) % less equations (rows) than variables (columns)
```

```
A =
0.1869 0.7094 0.6551 0.9597 0.7513
0.4898 0.7547 0.1626 0.3404 0.2551
0.4456 0.2760 0.1190 0.5853 0.5060
0.6463 0.6797 0.4984 0.2238 0.6991
```

```
In [15]:
```

```
x = A\b % backslash still works
```

```
x =
-0.1886
1.4263
0.2995
-0.3730
0
```

We can see x is the solution to \(Ax=b\):

```
In [16]:
```

```
A*x - b % almost zero
```

```
ans =
1.0e-15 *
0.4441
0.3331
0.0278
0.1110
```

However, we know that an under-determined system has infinite number of solutions. But MATLAB just returns one value. What’s special about this value?

It turns out that the `x`

we get here has the smallest norm
\(||x||_2\) among all possible solutions.

```
In [17]:
```

```
norm(x) % smaller than any other possible solutions
```

```
ans =
1.5162
```

## Another multiple-behavior example¶

Because the backslash operator `\`

has different behaviors under
different circumstances, you have to be **very careful**. Sometimes your
matrix shape might be wrong, but MATLAB will still return a result. But
in that case you might be solving a least square problem instead of a
full-rank linear system!

MATLAB has quite a lot of multi-behavior (“poly-algorithm”) functions. Another example is the built-in lu() function for LU factorization.

```
In [18]:
```

```
A = rand(4,4) % to be LU-factorized
```

```
A =
0.8909 0.1493 0.8143 0.1966
0.9593 0.2575 0.2435 0.2511
0.5472 0.8407 0.9293 0.6160
0.1386 0.2543 0.3500 0.4733
```

Just calling `lu()`

with no return, you get a single matrix with
\(L\) and \(U\) stacked together.

```
In [19]:
```

```
lu(A)
```

```
ans =
0.9593 0.2575 0.2435 0.2511
0.5704 0.6938 0.7903 0.4728
0.9287 -0.1295 0.6905 0.0246
0.1445 0.3129 0.0978 0.2867
```

Using `[L,U]`

to hold the return, you get two separate matrices.
Notice that `L`

is not a strict lower-triangular matrix, because it
also incorporates the pivoting matrix \(P\)

```
In [20]:
```

```
[L,U] = lu(A)
```

```
L =
0.9287 -0.1295 1.0000 0
1.0000 0 0 0
0.5704 1.0000 0 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867
```

Using `[L,U,P]`

to hold the return, you get all three matrices. Now
`L`

is strictly lower-triangular.

```
In [21]:
```

```
[L,U,P] = lu(A)
```

```
L =
1.0000 0 0 0
0.5704 1.0000 0 0
0.9287 -0.1295 1.0000 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867
P =
0 1 0 0
0 0 1 0
1 0 0 0
0 0 0 1
```

You can ignore the returning `P`

by `~`

, but keep in mind that the
returning `L`

is different from that in cell [20], although the code
looks very similar.

```
In [22]:
```

```
[L,U,~] = lu(A)
```

```
L =
1.0000 0 0 0
0.5704 1.0000 0 0
0.9287 -0.1295 1.0000 0
0.1445 0.3129 0.0978 1.0000
U =
0.9593 0.2575 0.2435 0.2511
0 0.6938 0.7903 0.4728
0 0 0.6905 0.0246
0 0 0 0.2867
```