叛徒在火星的日子: Least Squares Approximation 最小平方近似

我的技術顧問服務，是提供各種應用於通信系統、控制系統等方面的混合模式積體電路設計(mixed mode IC design)所需的信號處理演算法，與數位積體電路架構設計的技術指導。其中一部分工作，和線性代數理論與演算法相關。

最近，不同的客戶們的幾個的問題，雖然以各種不同的形式展現，但它們其實都是least square approximation的問題。高斯在二百多年前就知道它的解法。因此我把它寫在底下，給大家做為參考。

一個 over-determined system$\mathbf{U} \cdot \mathbf{a}= \mathbf{d}$的least squares (LS) approximation是
$$\hat{\mathbf{a}} = (\mathbf{U}^{T} \cdot \mathbf{U})^{-1}\mathbf{U}^T \cdot \mathbf{d}$$
例：10元硬幣重x克，5元硬幣重y克，以下四個等式是四組測量數據，請問10元硬幣和5元硬幣重量的『合理的』估計值分別是多少？
$$3x+2y=42$$ $$x+3y=24$$ $$4x+4y=60$$ $$5x+y=54$$ 例題中的量測數據，有四個等式，但只有兩個變數，並沒有一組解可以同時滿足四個等式。我們最多只能找出硬幣重量的估計值$x$與$y$，使得四個等式的誤差盡量小一點，無法讓它完全相等。

四個等式用矩陣寫成 $$ \begin{bmatrix} 3 & 2 \\ 1 & 3 \\ 4 & 4 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 42 \\ 24 \\ 60 \\ 54 \end{bmatrix} $$ 或是一般的形式 $$\mathbf{U} \cdot \mathbf{a} = \mathbf{d}$$ $m \times n$的矩陣在$m > n$時，多數的情況之下，等式往往是無解的。此時只能找到$\mathbf{U} \cdot \mathbf{a}=\hat{\mathbf{d}} \approx \mathbf{d}$。我們能做的只有讓誤差$\mathbf{e}=\mathbf{d}-\hat{\mathbf{d}}$盡量小一點。

在各種可能的近似估計$\mathbf{U} \cdot \mathbf{a}=\hat{\mathbf{d}} \approx \mathbf{d}$當中，其中一個合理的估計方式，是找到『誤差的平方和』最小的那一組當做最佳近似（least squares approximation）。亦即

$min \ \|\mathbf{e}\|^2$
$s.t. \ \ \mathbf{e}=\mathbf{d}-\hat{\mathbf{d}} = \mathbf{d} -\mathbf{U}\cdot \mathbf{a}$

幾何的觀察：

$\mathbf{d}=\hat{\mathbf{d}}+\mathbf{e}$

要讓誤差向量的平方和$ \|\mathbf{e}\|^2$最小，則：

$\hat{\mathbf{d}}$是$\mathbf{d}$在$span(\mathbf{U})$上的投影，或者說，$\hat{\mathbf{d}}$是$\mathbf{d}$在子空間$span(\mathbf{U})$的分量。

$\mathbf{e}$是$\mathbf{d}$在$span(\mathbf{U})$投影為0的分量，也就是說，$\mathbf{e}$與子空間$span(\mathbf{U})$垂直，或者寫成$\mathbf{U}^T \cdot \mathbf{e} = 0 $，此時的$\|\mathbf{e}\|^2$最小。

任何其他的估計，只要誤差向量$\tilde{\mathbf{e}}$不是垂直於$span(\mathbf{U})$，則其平方和$\|\tilde{\mathbf{e}}\|^2$將等於上述$\|\mathbf{e}\|^2$再加上誤差向量$\tilde{\mathbf{e}}$在$span(\mathbf{U})$的分量的平方和，結果必然大於$\|{\mathbf{e}}\|^2$。

因此，欲求得$\mathbf{U} \cdot \mathbf{a} \approx \mathbf{d}$的 LS approximation，可以同時在$\mathbf{U} \cdot \mathbf{a}$以及$\mathbf{d}$的左方乘上$\mathbf{U}^T$。由於$\mathbf{e}$分量與$span(\mathbf{U})$垂直，左方乘上$span(\mathbf{U}^T)$之後就消掉$\mathbf{e}$的影響了。如此，
$$\mathbf{U}^T (\mathbf{U} \mathbf{a}) = \mathbf{U}^T \mathbf{d} = \mathbf{U}^T (\hat{\mathbf{d}}+\mathbf{e}) = \mathbf{U}^T \hat{\mathbf{d}}+\mathbf{U}^T \mathbf{e} = \mathbf{U}^T \hat{\mathbf{d}}$$ 就有唯一解　
$$\hat{\mathbf{a}} = (\mathbf{U}^{T} \cdot \mathbf{U})^{-1}\mathbf{U}^T \cdot \mathbf{d}$$

例題中

$\begin{bmatrix} 3 \\ 1\\ 4 \\ 5 \end{bmatrix}$ 與 $\begin{bmatrix} 2 \\ 3\\ 4 \\ 1\end{bmatrix}$ 的所有線性組合，構成了二維子平面$span(\mathbf{U})$。然而$\mathbf{d}=\begin{bmatrix} 42 \\ 24 \\ 60 \\ 54 \end{bmatrix}$並不落在這個子平面上，因此無解。套用上述的做法：
$$ \begin{bmatrix} 3 & 1 & 4 & 5 \\ 2 & 3 & 4 &1 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 1 & 3 \\ 4 & 4 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} 3 & 1 & 4 & 5 \\ 2 & 3 & 4 &1 \end{bmatrix} \begin{bmatrix} 42 \\ 24 \\ 60 \\ 54 \end{bmatrix} $$ 也就是
$$ \begin{bmatrix} 51 & 30 \\ 30 & 30 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} 660 \\ 450 \end{bmatrix}$$ 亦即 $$ \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} 10 \\ 5 \end{bmatrix} $$
是誤差平方和最小的最佳估計。

回過頭來看：$$ \hat{\mathbf{d}} = \begin{bmatrix} 3 & 2 \\ 1 & 3 \\ 4 & 4 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} 40 \\ 25 \\ 60 \\ 55 \end{bmatrix} $$ 誤差為 $$ e= \begin{bmatrix} 42 \\ 24 \\ 60 \\ 54 \end{bmatrix} - \begin{bmatrix} 40 \\ 25 \\ 60 \\ 55 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 0 \\ -1 \end{bmatrix} $$ 原本的$\mathbf{U}^T \mathbf{U} \mathbf{a} = \mathbf{U}^T \hat{\mathbf{d}}$ 也就是 $$\begin{bmatrix} 3 & 1 & 4 & 5 \\ 2 & 3 & 4 &1 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ 1 & 3 \\ 4 & 4 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{y} \end{bmatrix} = \begin{bmatrix} 3 & 1 & 4 & 5 \\ 2 & 3 & 4 &1 \end{bmatrix} \Bigg( \begin{bmatrix} 42 \\ 25 \\ 60 \\ 55 \end{bmatrix}+ \begin{bmatrix} 2 \\ -1 \\ 0 \\ -1 \end{bmatrix} \Bigg) $$ 其中，後面的誤差項 $ e= \begin{bmatrix} 2 \\ -1 \\ 0 \\ -1 \end{bmatrix} $ 垂直於 $\begin{bmatrix} 3 \\ 1 \\ 4 \\ 5 \end{bmatrix} $ 與 $\begin{bmatrix} 2 \\ 3 \\ 4 \\ 1 \end{bmatrix} $ 可以看到：$\mathbf{U}^T \cdot \mathbf{e} = 0$ ，消去了$\mathbf{d}$當中沒有辦法從$\begin{bmatrix} 3 \\ 1\\ 4 \\ 5 \end{bmatrix}$與$\begin{bmatrix} 2 \\ 3\\ 4 \\ 1 \end{bmatrix}$組合而成的分量$\mathbf{e}=\begin{bmatrix} 2 \\ -1 \\ 0 \\ -1 \end{bmatrix}$。因此，兩個變數四個等式的無解的情況，變為兩個變數兩個等式。

叛徒在火星的日子

2019/07/26

Least Squares Approximation 最小平方近似

1 則留言:

Labels

2019/07/26

Least Squares Approximation 最小平方近似

1 則留言:

Labels

訂閱