天天看点

ICP(Iterative Closest Point)算法推導

ICP(Iterative Closest Point)算法推導

    • 前言
    • 算法流程
    • 損失函數
    • 求解t
      • 化簡
      • 令導數為0
    • 求解R
      • 回代t
      • 將f轉化為x^2^+y^2^-2xy的形式
      • 化簡並引入去除均值的點雲
      • SVD矩陣分解
      • 尋找最優的R

前言

本篇整理自深藍學院三維點雲處理課程的Lecture 9 – Registration。推導過程中大量地用到了矩陣跡(trace)及Frobenius norm的性質,建議與矩陣的跡(matrix trace)的特性以及證明及Frobenius norm的特性以及證明對照參看。

算法流程

  • 尋找(最近鄰)點對,拒絕當中距離過大者
  • 估計旋轉矩陣 R R R和平移向量 t t t,使得 轉換後的原始點雲 與 目標點雲 距離最小化(此即Procrustes Transformation Problem)
  • 檢查 損失函數 f f f是否夠小 或 R , t R,t R,t的變換量是否夠小,若非,則再次執行上述所有步驟

損失函數

兩個已配對的 m m m維空間中的點集, A A A和 B B B中各有 N N N個點:

A = [ a 1 , . . . a N ] ∈ R m × N A = [a_1,...a_N] \in \R^{m\times N} A=[a1​,...aN​]∈Rm×N

B = [ b 1 , . . . b N ] ∈ R m × N B = [b_1,...b_N] \in \R^{m\times N} B=[b1​,...bN​]∈Rm×N

並定義如下矩陣和向量:

1 = [ 1 , . . . , 1 ] T ∈ R N × 1 \bold{1} = [1,...,1]^T \in \R^{N \times 1} 1=[1,...,1]T∈RN×1

R ∈ R m × m m維空間中的旋轉矩陣 \begin{aligned}R \in \R^{m \times m} && \text{m維空間中的旋轉矩陣}\end{aligned} R∈Rm×m​​m維空間中的旋轉矩陣​

t ∈ R m × 1 m維空間中的平移向量 \begin{aligned}t \in \R^{m \times 1} && \text{m維空間中的平移向量}\end{aligned} t∈Rm×1​​m維空間中的平移向量​

損失函數為:

f ( R , t ) = 1 N ∑ i = 1 N ∥ b i − R a i − t ∥ 2 = ∥ B − ( R A + t 1 T ) ∥ F 2 f(R,t) = \frac{1}{N} \sum\limits_{i=1}^N \|b_i - Ra_i-t\|^2 = \|B-(RA+t\bold{1^T})\|_F^2 f(R,t)=N1​i=1∑N​∥bi​−Rai​−t∥2=∥B−(RA+t1T)∥F2​

(最後一項好像少了 1 N \frac{1}{N} N1​?)

這裡的目標是尋找一組 R R R和 t t t使得 f f f最小化。

求解t

化簡

f ( R , t ) = ∥ B − ( R A + t 1 T ) ∥ F 2 = ∥ ( B − R A ) + ( − t 1 T ) ∥ F 2 = ∥ B − R A ∥ F 2 + ∥ − t 1 T ∥ F 2 + 2 ⟨ B − R A , − t 1 T ⟩ F Frobenius decomposition \begin{aligned}f(R,t) &= \|B-(RA+t\bold{1^T})\|_F^2 \\&= \|(B-RA)+(-t\bold{1^T})\|^2_F \\&= \|B-RA\|^2_F + \|-t\bold{1^T}\|^2_F + 2\langle B-RA,-t\bold{1^T}\rangle_F && \text{Frobenius decomposition} \end{aligned} f(R,t)​=∥B−(RA+t1T)∥F2​=∥(B−RA)+(−t1T)∥F2​=∥B−RA∥F2​+∥−t1T∥F2​+2⟨B−RA,−t1T⟩F​​​Frobenius decomposition​

單獨化簡第二項:

∥ − t 1 T ∥ F 2 = ∥ t 1 T ∥ F 2 = ∥ t [ 1 1 . . . 1 ] T T ∥ F 2 = ∥ [ t t . . . t ] ∥ F 2 = tr ( [ t t . . . t ] T [ t t . . . t ] ) Forbenius norm: ∥ A ∥ F = tr ( A A T ) = tr ( A T A ) = tr ( [ t T t t T t . . . t T t t T t t T t . . . t T t . . . t T t t T t . . . t T t ] ) = N t T t \begin{aligned}\|-t\bold{1^T}\|^2_F &= \|t\bold{1^T}\|^2_F \\&= \|t{\begin{bmatrix}1 & 1 & ... & 1 \end{bmatrix}^T}^T\|^2_F \\&= \|\begin{bmatrix}t & t & ... & t \end{bmatrix}\|_F^2 \\&= \text{tr}(\begin{bmatrix}t & t & ... & t \end{bmatrix}^T \begin{bmatrix}t & t & ... & t \end{bmatrix}) && \text{Forbenius norm:}\|A\|_F = \sqrt{\text{tr}(A A^T)} = \sqrt{\text{tr}(A^T A)} \\&= \text{tr}(\begin{bmatrix}t^Tt & t^Tt & ... & t^Tt \\ t^Tt & t^Tt & ... & t^Tt \\ ... \\t^Tt & t^Tt & ... & t^Tt \\ \end{bmatrix})\\&= Nt^Tt\end{aligned} ∥−t1T∥F2​​=∥t1T∥F2​=∥t[1​1​...​1​]TT∥F2​=∥[t​t​...​t​]∥F2​=tr([t​t​...​t​]T[t​t​...​t​])=tr(⎣⎢⎢⎡​tTttTt...tTt​tTttTttTt​.........​tTttTttTt​⎦⎥⎥⎤​)=NtTt​​Forbenius norm:∥A∥F​=tr(AAT)

​=tr(ATA)

​​

代回損失函數:

f ( R , t ) = ∥ B − R A ∥ F 2 + ∥ − t 1 T ∥ F 2 + 2 ⟨ B − R A , − t 1 T ⟩ F = ∥ B − R A ∥ F 2 + N t T t − 2 ⟨ B − R A , t 1 T ⟩ F = ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) Frobenius inner product \begin{aligned}f(R,t) &= \|B-RA\|^2_F + \|-t\bold{1^T}\|^2_F + 2\langle B-RA,-t\bold{1^T}\rangle_F \\&= \|B-RA\|^2_F + Nt^Tt - 2\langle B-RA,t\bold{1^T}\rangle_F \\&= \|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T) && \text{Frobenius inner product} \end{aligned} f(R,t)​=∥B−RA∥F2​+∥−t1T∥F2​+2⟨B−RA,−t1T⟩F​=∥B−RA∥F2​+NtTt−2⟨B−RA,t1T⟩F​=∥B−RA∥F2​+NtTt−2tr((B−RA)1tT)​​Frobenius inner product​

令導數為0

將 f f f對 t t t微分:

∂ f ∂ t = ∂ ( ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) ) ∂ t = ∂ ∥ B − R A ∥ F 2 ∂ t + ∂ N t T t ∂ t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t = 0 + ∂ N t T t ∂ t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t = 2 N t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t ∂ x T x ∂ x = 2 x = 2 N t − 2 ( B − R A ) 1 ∂ tr ( A X T ) ∂ X = A \begin{aligned}\frac{\partial f}{\partial t} &=\frac{\partial (\|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T))}{\partial t} \\&=\frac{\partial \|B-RA\|^2_F}{\partial t} + \frac{\partial Nt^Tt}{\partial t} -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} \\&= 0 + \frac{\partial Nt^Tt}{\partial t} -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} \\&= 2Nt -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} && \frac{\partial x^Tx}{\partial x} = 2x \\&= 2Nt -2(B-RA)\bold{1} && \frac{\partial \text{tr}(\bold{A}\bold{X}^T)}{\partial \bold{X}} = \bold{A}\end{aligned} ∂t∂f​​=∂t∂(∥B−RA∥F2​+NtTt−2tr((B−RA)1tT))​=∂t∂∥B−RA∥F2​​+∂t∂NtTt​−2∂t∂tr((B−RA)1tT)​=0+∂t∂NtTt​−2∂t∂tr((B−RA)1tT)​=2Nt−2∂t∂tr((B−RA)1tT)​=2Nt−2(B−RA)1​​∂x∂xTx​=2x∂X∂tr(AXT)​=A​

令 ∂ f ∂ t = 2 N t − 2 ( B − R A ) 1 \frac{\partial f}{\partial t} = 2Nt -2(B-RA)\bold{1} ∂t∂f​=2Nt−2(B−RA)1為0。

可以得知取 t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1​(B−RA)1時能使得 f f f最小化。

註一: ∂ x T x ∂ x \frac{\partial x^Tx}{\partial x} ∂x∂xTx​可參考x^Tx對x微分公式證明。

註二: ∂ tr ( A X T ) ∂ X = A \frac{\partial \text{tr}(\bold{A}\bold{X}^T)}{\partial \bold{X}} = \bold{A} ∂X∂tr(AXT)​=A可參考兩矩陣乘積的跡對矩陣微分公式證明。

求解R

回代t

之前化簡得到 f ( R , t ) = ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) f(R,t)= \|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T) f(R,t)=∥B−RA∥F2​+NtTt−2tr((B−RA)1tT)。

將剛剛計算出來的 t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1​(B−RA)1代入第二項:

N t T t = N ( 1 N 1 T ( B − R A ) T ) ( 1 N ( B − R A ) 1 ) = 1 N 1 T ( B − R A ) T ( B − R A ) 1 = 1 N [ 1 . . . 1 ] [ ( b 1 − R a 1 ) T . . . ( b N − R a N ) T ] [ ( b 1 − R a 1 ) . . . ( b N − R a N ) ] [ 1 . . . 1 ] = 1 N ( ( b 1 − R a 1 ) T + . . . + ( b N − R a N ) T ) ( ( b 1 − R a 1 ) + . . . + ( b N − R a N ) ) 矩陣乘法的結合律 = 1 N ∑ i = 1 N ( b i − R a i ) T ∑ i = 1 N ( b i − R a i ) = 1 N N ( μ b − R μ a ) T N ( μ b − R μ a ) = N ( μ b − R μ a ) T ( μ b − R μ a ) = N ∥ μ b − R μ a ∥ 2 \begin{aligned}Nt^Tt &= N(\frac{1}{N}\bold{1^T}(B-RA)^T)(\frac{1}{N}(B-RA)\bold{1}) \\&= \frac{1}{N}\bold{1^T}(B-RA)^T(B-RA)\bold{1} \\ &= \frac{1}{N}\begin{bmatrix}1 & ... & 1\end{bmatrix}\begin{bmatrix}(b_1 - Ra_1)^T \\ ... \\ (b_N - Ra_N)^T\end{bmatrix}\begin{bmatrix}(b_1 - Ra_1) & ... & (b_N - Ra_N)\end{bmatrix}\begin{bmatrix}1 \\ ... \\ 1\end{bmatrix} \\&= \frac{1}{N}((b_1 - Ra_1)^T + ... + (b_N - Ra_N)^T)((b_1 - Ra_1) + ... + (b_N - Ra_N)) && \text{矩陣乘法的結合律} \\&= \frac{1}{N}\sum\limits_{i=1}^N(b_i-Ra_i)^T\sum\limits_{i=1}^N(b_i-Ra_i) \\&= \frac{1}{N}N(\mu_b-R\mu_a)^TN(\mu_b-R\mu_a) \\&= N(\mu_b-R\mu_a)^T(\mu_b-R\mu_a) \\&= N \|\mu_b-R\mu_a\|^2\end{aligned} NtTt​=N(N1​1T(B−RA)T)(N1​(B−RA)1)=N1​1T(B−RA)T(B−RA)1=N1​[1​...​1​]⎣⎡​(b1​−Ra1​)T...(bN​−RaN​)T​⎦⎤​[(b1​−Ra1​)​...​(bN​−RaN​)​]⎣⎡​1...1​⎦⎤​=N1​((b1​−Ra1​)T+...+(bN​−RaN​)T)((b1​−Ra1​)+...+(bN​−RaN​))=N1​i=1∑N​(bi​−Rai​)Ti=1∑N​(bi​−Rai​)=N1​N(μb​−Rμa​)TN(μb​−Rμa​)=N(μb​−Rμa​)T(μb​−Rμa​)=N∥μb​−Rμa​∥2​​矩陣乘法的結合律​

將 t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1​(B−RA)1代入第三項:

2 tr ( ( B − R A ) 1 t T ) = 2 tr ( ( B − R A ) 1 1 N 1 T ( B − R A ) T ) = 2 N tr ( [ b 1 − R a 1 . . . b N − R a N ] [ 1 . . . 1 ] [ 1 . . . 1 ] [ ( b 1 − R a 1 ) T . . . ( b N − R a N ) T ] ) = 2 N tr ( ( ( b 1 − R a 1 ) + . . . + ( b N − R a N ) ) ( ( b 1 − R a 1 ) T + . . . + ( b N − R a N ) T ) ) = 2 N tr ( ( ∑ i = 1 N b i − R a i ) ( ∑ i = 1 N b i − R a i ) T ) = 2 N tr ( N ( μ b − R μ a ) N ( μ b − R μ a ) T ) = 2 N tr ( ( μ b − R μ a ) ( μ b − R μ a ) T ) tr ( c A ) = c tr ( A ) = 2 N ∥ μ b − R μ a ∥ 2 tr ( v 1 × v 2 ) = v 1 ⋅ v 2 \begin{aligned}2\text{tr}((B-RA)\bold{1}t^T) &= 2\text{tr}((B-RA)\bold{1}\frac{1}{N}\bold{1^T}(B-RA)^T) \\&= \frac{2}{N}\text{tr}(\begin{bmatrix}b_1-Ra_1 & ... & b_N-Ra_N\end{bmatrix}\begin{bmatrix}1 \\ ... \\1\end{bmatrix}\begin{bmatrix}1 & ... &1\end{bmatrix}\begin{bmatrix}(b_1-Ra_1)^T \\ ... \\ (b_N-Ra_N)^T\end{bmatrix}) \\ &= \frac{2}{N}\text{tr}(((b_1-Ra_1)+...+(b_N-Ra_N))((b_1-Ra_1)^T+...+(b_N-Ra_N)^T)) \\&= \frac{2}{N}\text{tr}((\sum\limits_{i=1}^Nb_i-Ra_i)(\sum\limits_{i=1}^Nb_i-Ra_i)^T) \\&= \frac{2}{N}\text{tr}(N(\mu_b-R\mu_a)N(\mu_b-R\mu_a)^T) \\&= 2N\text{tr}((\mu_b-R\mu_a)(\mu_b-R\mu_a)^T) && \text{tr}(c\bold{A}) = c \text{tr}(\bold{A})\\&= 2N\|\mu_b-R\mu_a\|^2 && \text{tr}(v_1 \times v_2) = v_1 \cdot v_2\end{aligned} 2tr((B−RA)1tT)​=2tr((B−RA)1N1​1T(B−RA)T)=N2​tr([b1​−Ra1​​...​bN​−RaN​​]⎣⎡​1...1​⎦⎤​[1​...​1​]⎣⎡​(b1​−Ra1​)T...(bN​−RaN​)T​⎦⎤​)=N2​tr(((b1​−Ra1​)+...+(bN​−RaN​))((b1​−Ra1​)T+...+(bN​−RaN​)T))=N2​tr((i=1∑N​bi​−Rai​)(i=1∑N​bi​−Rai​)T)=N2​tr(N(μb​−Rμa​)N(μb​−Rμa​)T)=2Ntr((μb​−Rμa​)(μb​−Rμa​)T)=2N∥μb​−Rμa​∥2​​tr(cA)=ctr(A)tr(v1​×v2​)=v1​⋅v2​​

將f轉化為x2+y2-2xy的形式

為了接下來推導方便,將第一項轉化成如下形式:

∥ B − R A ∥ F 2 = ∑ i = 1 m ∑ j = 1 N ( B − R A ) i j 2 = ∑ j = 1 N ∑ i = 1 m ( B − R A ) i j 2 = ∑ j = 1 N ∥ b j − R a j ∥ 2 = ∑ i = 1 N ∥ b i − R a i ∥ 2 \begin{aligned}\|B-RA\|^2_F &= \sum\limits_{i=1}^m\sum\limits_{j=1}^N(B-RA)_{ij}^2 \\&= \sum\limits_{j=1}^N\sum\limits_{i=1}^m(B-RA)_{ij}^2 \\&= \sum\limits_{j=1}^N\|b_j-Ra_j\|^2 \\&= \sum\limits_{i=1}^N\|b_i-Ra_i\|^2\end{aligned} ∥B−RA∥F2​​=i=1∑m​j=1∑N​(B−RA)ij2​=j=1∑N​i=1∑m​(B−RA)ij2​=j=1∑N​∥bj​−Raj​∥2=i=1∑N​∥bi​−Rai​∥2​

第二項由 N ∥ μ b − R μ a ∥ 2 N \|\mu_b-R\mu_a\|^2 N∥μb​−Rμa​∥2化為 ∑ i = 1 N ∥ μ b − R μ a ∥ 2 \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 i=1∑N​∥μb​−Rμa​∥2。

第三項由 2 N ∥ μ b − R μ a ∥ 2 2N\|\mu_b-R\mu_a\|^2 2N∥μb​−Rμa​∥2轉化為:

2 N ∥ μ b − R μ a ∥ 2 = 2 N ( μ b − R μ a ) T ( μ b − R μ a ) = 2 ∑ i = 1 N ( b i − R a i ) T ( μ b − R μ a ) \begin{aligned}2N\|\mu_b-R\mu_a\|^2 &= 2N(\mu_b-R\mu_a)^T(\mu_b-R\mu_a) \\&= 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a)\end{aligned} 2N∥μb​−Rμa​∥2​=2N(μb​−Rμa​)T(μb​−Rμa​)=2i=1∑N​(bi​−Rai​)T(μb​−Rμa​)​

至此, f ( R , t ) = ∑ i = 1 N ∥ b i − R a i ∥ 2 + ∑ i = 1 N ∥ μ b − R μ a ∥ 2 − 2 ∑ i = 1 N ( b i − R a i ) T ( μ b − R μ a ) f(R,t) = \sum\limits_{i=1}^N\|b_i-Ra_i\|^2 + \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 - 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a) f(R,t)=i=1∑N​∥bi​−Rai​∥2+i=1∑N​∥μb​−Rμa​∥2−2i=1∑N​(bi​−Rai​)T(μb​−Rμa​)

化簡並引入去除均值的點雲

f ( R , t ) = ∑ i = 1 N ∥ b i − R a i ∥ 2 + ∑ i = 1 N ∥ μ b − R μ a ∥ 2 − 2 ∑ i = 1 N ( b i − R a i ) T ( μ b − R μ a ) = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 ( b i − R a i ) T ( μ b − R μ a ) ) = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 tr ( ( b i − R a i ) T ( μ b − R μ a ) ) ) ( b i − R a i ) T ( μ b − R μ a ) 是純量 = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 ⟨ b i − R a i , μ b − R μ a ⟩ F ) Frobenius inner product = ∑ i = 1 N ∥ ( b i − R a i ) − ( μ b − R μ a ) ∥ 2 反向套用Frobenius decomposition = ∑ i = 1 N ∥ ( b i − μ b ) − R ( a i − μ a ) ∥ 2 = ∑ i = 1 N ∥ b i ′ − R a i ′ ∥ 2 b i ′ ≜ b i − μ b , a i ′ ≜ a i − μ a = ∥ B ′ − R A ′ ∥ F 2 A ′ ≜ A ( I N − 1 N 1 1 T ) , B ′ ≜ B ( I N − 1 N 1 1 T ) \begin{aligned}f(R,t) &= \sum\limits_{i=1}^N\|b_i-Ra_i\|^2 + \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 - 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a) \\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2(b_i-Ra_i)^T(\mu_b-R\mu_a)) \\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2\text{tr}((b_i-Ra_i)^T(\mu_b-R\mu_a))) && (b_i-Ra_i)^T(\mu_b-R\mu_a)\text{是純量}\\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2\langle b_i-Ra_i,\mu_b-R\mu_a\rangle_F ) && \text{Frobenius inner product} \\&= \sum\limits_{i=1}^N\|(b_i-Ra_i)-(\mu_b-R\mu_a)\|^2 && \text{反向套用Frobenius decomposition} \\&= \sum\limits_{i=1}^N \|(b_i-\mu_b)-R(a_i-\mu_a)\|^2\\&= \sum\limits_{i=1}^N \|b_i'-Ra_i'\|^2 && b_i' \triangleq b_i-\mu_b, a_i' \triangleq a_i-\mu_a\\&= \|B'-RA'\|^2_F && A' \triangleq A(I_N - \frac{1}{N}\bold{1}\bold{1}^T), B' \triangleq B(I_N - \frac{1}{N}\bold{1}\bold{1}^T)\end{aligned} f(R,t)​=i=1∑N​∥bi​−Rai​∥2+i=1∑N​∥μb​−Rμa​∥2−2i=1∑N​(bi​−Rai​)T(μb​−Rμa​)=i=1∑N​(∥bi​−Rai​∥2+∥μb​−Rμa​∥2−2(bi​−Rai​)T(μb​−Rμa​))=i=1∑N​(∥bi​−Rai​∥2+∥μb​−Rμa​∥2−2tr((bi​−Rai​)T(μb​−Rμa​)))=i=1∑N​(∥bi​−Rai​∥2+∥μb​−Rμa​∥2−2⟨bi​−Rai​,μb​−Rμa​⟩F​)=i=1∑N​∥(bi​−Rai​)−(μb​−Rμa​)∥2=i=1∑N​∥(bi​−μb​)−R(ai​−μa​)∥2=i=1∑N​∥bi′​−Rai′​∥2=∥B′−RA′∥F2​​​(bi​−Rai​)T(μb​−Rμa​)是純量Frobenius inner product反向套用Frobenius decompositionbi′​≜bi​−μb​,ai′​≜ai​−μa​A′≜A(IN​−N1​11T),B′≜B(IN​−N1​11T)​

關於 A ′ A' A′的推導:

A ′ = A ( I n − 1 N 1 1 T ) = A − 1 N A 1 1 T = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − 1 N [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] [ 1 1 . . . 1 1 1 . . . 1 . . . 1 1 . . . 1 ] = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − 1 N [ ∑ j = 1 N A 1 j ∑ j = 1 N A 1 j . . . ∑ j = 1 N A 1 j . . . ∑ j = 1 N A m j ∑ j = 1 N A m j . . . ∑ j = 1 N A m j ] = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − [ μ 1 μ 1 . . . μ 1 . . . μ m μ m . . . μ m ] μ i ≜ A第i個row的平均值 = [ A 11 − μ 1 A 12 − μ 1 . . . A 1 N − μ 1 . . . A m 1 − μ m A m 2 − μ m . . . A m N − μ m ] \begin{aligned}A' &= A(I_n - \frac{1}{N}\bold{1}\bold{1}^T) \\&= A - \frac{1}{N}A\bold{1}\bold{1}^T \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \frac{1}{N}\begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix}\begin{bmatrix}1 & 1 & ... & 1 \\ 1 & 1 & ... & 1 \\ ... \\1 & 1 & ... & 1\end{bmatrix} \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \frac{1}{N} \begin{bmatrix}\sum\limits_{j=1}^{N}A_{1j} & \sum\limits_{j=1}^{N}A_{1j} & ... & \sum\limits_{j=1}^{N}A_{1j} \\ ... \\ \sum\limits_{j=1}^{N}A_{mj} & \sum\limits_{j=1}^{N}A_{mj} & ... & \sum\limits_{j=1}^{N}A_{mj}\end{bmatrix} \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \begin{bmatrix}\mu_1 & \mu_1 & ... & \mu_1 \\ ... \\ \mu_m & \mu_m & ... & \mu_m\end{bmatrix} && \mu_i \triangleq \text{A第i個row的平均值} \\&= \begin{bmatrix}A_{11}-\mu_1 & A_{12}-\mu_1 & ... & A_{1N}-\mu_1 \\ ... \\ A_{m1}-\mu_m & A_{m2}-\mu_m & ... & A_{mN}-\mu_m\end{bmatrix} \end{aligned} A′​=A(In​−N1​11T)=A−N1​A11T=⎣⎡​A11​...Am1​​A12​Am2​​......​A1N​AmN​​⎦⎤​−N1​⎣⎡​A11​...Am1​​A12​Am2​​......​A1N​AmN​​⎦⎤​⎣⎢⎢⎡​11...1​111​.........​111​⎦⎥⎥⎤​=⎣⎡​A11​...Am1​​A12​Am2​​......​A1N​AmN​​⎦⎤​−N1​⎣⎢⎢⎢⎢⎡​j=1∑N​A1j​...j=1∑N​Amj​​j=1∑N​A1j​j=1∑N​Amj​​......​j=1∑N​A1j​j=1∑N​Amj​​⎦⎥⎥⎥⎥⎤​=⎣⎡​A11​...Am1​​A12​Am2​​......​A1N​AmN​​⎦⎤​−⎣⎡​μ1​...μm​​μ1​μm​​......​μ1​μm​​⎦⎤​=⎣⎡​A11​−μ1​...Am1​−μm​​A12​−μ1​Am2​−μm​​......​A1N​−μ1​AmN​−μm​​⎦⎤​​​μi​≜A第i個row的平均值​

也就是在各個維度(1~m)去除均值的 A A A。

SVD矩陣分解

拆開並做SVD矩陣分解:

f ( R , t ) = ∥ B ′ − R A ′ ∥ F 2 = ∥ B ′ ∥ F 2 + ∥ R A ′ ∥ F 2 − 2 ⟨ B ′ , R A ′ ⟩ F Frobenius decomposition = ∥ B ′ ∥ F 2 + ∥ R A ′ ∥ F 2 − 2 tr ( R A ′ B ′ T ) ⟨ A , B ⟩ F = tr ( A B T ) = tr ( A T B ) = tr ( B A T ) = tr ( B T A ) = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( R A ′ B ′ T ) 旋轉矩陣R不會改變矩陣的norm = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( R V Σ U T ) SVD矩陣分解: B ′ A ′ T = U Σ V T , A ′ B ′ T = V Σ U T = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( U T R V Σ ) tr ( A B C D ) = tr ( B C D A ) = tr ( C D A B ) = tr ( D A B C ) = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( T Σ ) 在此令 T ≜ U T R V = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 ∑ i = 1 m T i i σ i Σ 是 m × m 對角矩陣, Σ = [ σ 1 0 . . . 0 0 σ 2 . . . 0 0 0 . . . σ m ] \begin{aligned}f(R,t) &= \|B'-RA'\|^2_F \\&= \|B'\|^2_F+\|RA'\|^2_F - 2\langle B',RA'\rangle_F && \text{Frobenius decomposition}\\&=\|B'\|^2_F+\|RA'\|^2_F - 2\text{tr}(RA'B'^T) && \langle A,B\rangle_F = \text{tr}(AB^T) = \text{tr}(A^TB) = \text{tr}(BA^T) = \text{tr}(B^TA) \\&=\|B'\|^2_F+\|A'\|^2_F - 2\text{tr}(RA'B'^T) && \text{旋轉矩陣R不會改變矩陣的norm} \\&= \|B'\|^2_F+\|A'\|^2_F- 2\text{tr}(RV\Sigma U^T) && \text{SVD矩陣分解:}B'A'^T = U\Sigma V^T, A'B'^T = V\Sigma U^T \\&= \|B'\|^2_F+\|A'\|^2_F- 2\text{tr}(U^T RV\Sigma ) && \text{tr}(\bold{ABCD}) = \text{tr}(\bold{BCDA}) = \text{tr}(\bold{CDAB}) = \text{tr}(\bold{DABC}) \\ &= \|B'\|^2_F+\|A'\|^2_F - 2\text{tr}(T\Sigma) && \text{在此令}T \triangleq U^TRV \\&= \|B'\|^2_F+\|A'\|^2_F-2\sum\limits_{i=1}^mT_{ii}\sigma_i && \Sigma \text{是}m \times m \text{對角矩陣,}\Sigma = \begin{bmatrix}\sigma_1 & 0 & ... & 0 \\ 0 & \sigma_2 & ... & 0 \\ 0 & 0 & ... & \sigma_m\end{bmatrix}\end{aligned} f(R,t)​=∥B′−RA′∥F2​=∥B′∥F2​+∥RA′∥F2​−2⟨B′,RA′⟩F​=∥B′∥F2​+∥RA′∥F2​−2tr(RA′B′T)=∥B′∥F2​+∥A′∥F2​−2tr(RA′B′T)=∥B′∥F2​+∥A′∥F2​−2tr(RVΣUT)=∥B′∥F2​+∥A′∥F2​−2tr(UTRVΣ)=∥B′∥F2​+∥A′∥F2​−2tr(TΣ)=∥B′∥F2​+∥A′∥F2​−2i=1∑m​Tii​σi​​​Frobenius decomposition⟨A,B⟩F​=tr(ABT)=tr(ATB)=tr(BAT)=tr(BTA)旋轉矩陣R不會改變矩陣的normSVD矩陣分解:B′A′T=UΣVT,A′B′T=VΣUTtr(ABCD)=tr(BCDA)=tr(CDAB)=tr(DABC)在此令T≜UTRVΣ是m×m對角矩陣,Σ=⎣⎡​σ1​00​0σ2​0​.........​00σm​​⎦⎤​​

尋找最優的R

欲使 f ( R , t ) = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 ∑ i = 1 m T i i σ i f(R,t) = \|B'\|^2_F+\|A'\|^2_F-2\sum\limits_{i=1}^mT_{ii}\sigma_i f(R,t)=∥B′∥F2​+∥A′∥F2​−2i=1∑m​Tii​σi​最小化,等同於求解 max ⁡ R , t ( ∑ i = 1 m T i i σ i ) \max\limits_{R,t}(\sum\limits^m_{i=1}T_{ii}\sigma_i) R,tmax​(i=1∑m​Tii​σi​)。

因為 U U U, R R R, V V V皆為正交矩陣,故 T ≜ U T R V T \triangleq U^TRV T≜UTRV亦為正交矩陣,所以有 T T T = I m TT^T = I_m TTT=Im​且 ∣ T i j ∣ < = 1 |T_{ij}| <= 1 ∣Tij​∣<=1(參考None element of orthogonal matrix can’t have unit modulus larger then 1)。

在 T = I m T=I_m T=Im​時 T i i = 1 T_{ii}=1 Tii​=1,所以 ∑ i − 1 m T i i σ i \sum\limits^m_{i-1}T_{ii}\sigma_i i−1∑m​Tii​σi​有最大值,此時 T ≜ U T R V = I m T \triangleq U^TRV = I_m T≜UTRV=Im​,這等價於取 R = U V T R = UV^T R=UVT。

所以 R = U V T R = UV^T R=UVT, t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1​(B−RA)1即為Procrustes Transformation Problem的最優解。

继续阅读