
Summary of Matrix Differentiation Rules

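1. Derivative of a matrix Y with respect to a scalar x

Differentiate each element, then transpose the result. Note that an M×N matrix becomes an N×M matrix after differentiation.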

$$Y = [y_{ij}] \;\Rightarrow\; \frac{dY}{dx} = \left[\frac{dy_{ji}}{dx}\right]$$
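As an illustration of the transposed layout described in this section, take a hypothetical 2×3 matrix (chosen here only as an example, not taken from the original notes). Under this section's convention its derivative is 3×2:

$$
Y = \begin{bmatrix} 1 & x & x^2 \\ 0 & 2x & x^3 \end{bmatrix}
\;\Rightarrow\;
\frac{dY}{dx} = \begin{bmatrix} 0 & 0 \\ 1 & 2 \\ 2x & 3x^2 \end{bmatrix}
$$

Many references instead keep the M×N shape, so it is worth checking which layout a given text uses.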

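2. Derivative of a scalar y with respect to a column vector X:

Unlike the case above, the entries inside the parentheses are partial derivatives and the result is not transposed: differentiating with respect to an N×1 vector still yields an N×1 vector.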

$$y = f(x_1, x_2, \dots, x_n) \;\Rightarrow\; dy/dX = (\partial y/\partial x_1,\ \partial y/\partial x_2,\ \dots,\ \partial y/\partial x_n)'$$
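The rule can be checked numerically. The sketch below is a minimal example in plain Python, assuming a hypothetical function $y = x_1^2 + 3x_1x_2$ and a central-difference helper named `gradient` (both introduced here for illustration only). It returns one partial derivative per component of X, matching the N×1 shape.

```python
def gradient(f, x, h=1e-6):
    """Central-difference estimate of dy/dX as a list of N partial derivatives."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function: y = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


point = [1.0, 2.0]
numeric = gradient(f, point)

# Analytic partials: dy/dx1 = 2*x1 + 3*x2, dy/dx2 = 3*x1
analytic = [2 * point[0] + 3 * point[1], 3 * point[0]]

print(numeric, analytic)
```

The numeric and analytic lists should agree up to floating-point rounding.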

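3. Derivative of a row vector Y' with respect to a column vector X:

Note that differentiating a 1×M vector with respect to an N×1 vector yields an N×M matrix.

Differentiate each column (that is, each component) of Y' with respect to X, and place the resulting N×1 columns side by side to form the matrix.

Key results: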

$$
\begin{aligned}
dX'/dX &= I \\
d(AX)'/dX &= A'
\end{aligned}
$$
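The shape convention and the result $d(AX)'/dX = A'$ can be sanity-checked with finite differences. The following is a sketch under stated assumptions: the matrix `A`, the point `X`, and the helper names `matvec` and `d_row_d_col` are all made up for this example.

```python
def matvec(A, x):
    """Return the product A x for a list-of-rows matrix A and a list x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def d_row_d_col(F, x, h=1e-6):
    """Central-difference estimate of d F(x)' / d x.

    Row i holds the partials with respect to x_i, so entry (i, j) is
    d F_j / d x_i and the result is N x M.
    """
    rows = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        rows.append([(a - b) / (2 * h) for a, b in zip(F(up), F(down))])
    return rows


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]                    # M x N = 2 x 3 (example values)
X = [0.5, -1.0, 2.0]                     # N = 3

D = d_row_d_col(lambda v: matvec(A, v), X)
A_t = [list(col) for col in zip(*A)]     # A', which is 3 x 2

max_err = max(abs(D[i][j] - A_t[i][j]) for i in range(3) for j in range(2))
print(len(D), len(D[0]), max_err)
```

The estimate has N rows and M columns, and each entry matches the corresponding entry of A' up to rounding.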

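4. Derivative of a column vector Y with respect to a row vector X':

Convert it to the derivative of the row vector Y' with respect to the column vector X, then transpose.

Note that differentiating an M×1 vector with respect to a 1×N vector yields an M×N matrix.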

$$dY/dX' = (dY'/dX)'$$

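5. Rules for differentiating vector products with respect to a column vector X:

Note that these differ slightly from the scalar product rule.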

$$
\begin{aligned}
d(UV')/dX &= (dU/dX)V' + U(dV'/dX) \\
d(U'V)/dX &= (dU'/dX)V + (dV'/dX)U
\end{aligned}
$$

Key results:

$$
\begin{aligned}
d(X'A)/dX &= (dX'/dX)A + (dA'/dX)X = IA + 0X = A \\
d(AX)/dX' &= (d(X'A')/dX)' = (A')' = A \\
d(X'AX)/dX &= (dX'/dX)AX + (d(AX)'/dX)X = AX + A'X
\end{aligned}
$$

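6. Derivative of a matrix Y with respect to a column vector X:

Differentiate Y with respect to each component of X and collect the results into a super-vector.

Note that each element of this vector is itself a matrix.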

7. Rules for differentiating matrix products with respect to a column vector:

$$
\begin{aligned}
d(uV)/dX &= (du/dX)V + u(dV/dX) \\
d(UV)/dX &= (dU/dX)V + U(dV/dX)
\end{aligned}
$$

Key result:

$$d(X'A)/dX = (dX'/dX)A + X'(dA/dX) = IA + X'0 = A$$

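8. Derivative of a scalar y with respect to a matrix X:

This is analogous to the derivative of a scalar y with respect to a column vector X: differentiate y with respect to each element of X, without transposing.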

$$dy/dX = [\,\partial y/\partial x_{ij}\,]$$

Key results:

(1) $y = U'XV = \sum_i \sum_j u_i x_{ij} v_j$,

then $dy/dX = [u_i v_j] = UV'$

(2) $y = U'X'XU$

then $dy/dX = 2XUU'$

(3) $y = (XU-V)'(XU-V)$

then $dy/dX = d(U'X'XU - 2V'XU + V'V)/dX = 2XUU' - 2VU' + 0 = 2(XU-V)U'$
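Results (1) and (3) can be checked entry by entry with finite differences; (2) is the special case of (3) with V = 0. This is a minimal sketch: the example values and the helper names (`bilinear`, `residual`, `sq_loss`, `grad_wrt_matrix`, `max_abs_diff`) are assumptions made for illustration. Because U and V have different shapes in (1) and (3), the code uses separate variables for each case.

```python
def bilinear(X, a, b):
    """y = a' X b for an m-vector a, an m x n matrix X, and an n-vector b."""
    return sum(a[i] * X[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def residual(X, U, V):
    """Return the m-vector XU - V."""
    return [sum(xij * uj for xij, uj in zip(row, U)) - vi
            for row, vi in zip(X, V)]


def sq_loss(X, U, V):
    """y = (XU - V)'(XU - V)."""
    return sum(r * r for r in residual(X, U, V))


def grad_wrt_matrix(f, X, h=1e-6):
    """Central-difference estimate of [dy/dx_ij], same shape as X."""
    G = []
    for i in range(len(X)):
        row = []
        for j in range(len(X[0])):
            up = [r[:] for r in X]
            down = [r[:] for r in X]
            up[i][j] += h
            down[i][j] -= h
            row.append((f(up) - f(down)) / (2 * h))
        G.append(row)
    return G


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]     # 3 x 2 example matrix

# Result (1): y = U'XV, with U a 3-vector and V a 2-vector
U1, V1 = [1.0, 0.5, -1.0], [2.0, -3.0]
num1 = grad_wrt_matrix(lambda M: bilinear(M, U1, V1), X)
ana1 = [[u * v for v in V1] for u in U1]      # UV'

# Result (3): y = (XU - V)'(XU - V), with U a 2-vector and V a 3-vector
U3, V3 = [1.0, -1.0], [0.5, 0.0, -2.0]
num3 = grad_wrt_matrix(lambda M: sq_loss(M, U3, V3), X)
ana3 = [[2 * r * u for u in U3] for r in residual(X, U3, V3)]  # 2(XU - V)U'

print(max_abs_diff(num1, ana1), max_abs_diff(num3, ana3))
```

Both differences should be at the level of floating-point rounding, and each gradient has the same 3×2 shape as X, consistent with the "no transpose" rule of this section.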

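9. Derivative of a matrix Y with respect to a matrix X:

Differentiate each element of Y with respect to X, then arrange the results together to form a super-matrix.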

10. Derivative of a product

$$d(f \cdot g)/dx = (df'/dx)g + (dg/dx)f'$$

Result, taking $f = x'$ and $g = Ax$:

$$d(x'Ax)/dx = (d(x'')/dx)Ax + (d(Ax)/dx)(x'') = Ax + A'x$$

(Note: $''$ denotes transposing twice.)

11. Let A be an m×n matrix and x an n-dimensional column vector (with m = n in (2) and (3), where the products are otherwise undefined); then

(1) $\frac{d(Ax)}{dx} = A'$

(2) $\frac{d(x'A)}{dx} = A$

(3) $\frac{d(x'Ax)}{dx} = (A'+A)x$
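For a concrete feel for (3), the 2×2 case can be expanded by hand. The entries a, b, c, d below are placeholder symbols introduced for this example:

$$
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad
x'Ax = a x_1^2 + (b + c)\, x_1 x_2 + d x_2^2
$$

$$
\frac{d(x'Ax)}{dx}
= \begin{bmatrix} 2a x_1 + (b + c)\, x_2 \\ (b + c)\, x_1 + 2d x_2 \end{bmatrix}
= \begin{bmatrix} 2a & b + c \\ b + c & 2d \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= (A' + A)x
$$

When A is symmetric this reduces to $2Ax$.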
