
Summary of Matrix Differentiation Rules

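1. Derivative of a matrix Y with respect to a scalar x

Differentiate every element, then transpose the result. Note that under this convention the derivative of an M×N matrix is an N×M matrix, so the $(i,j)$ entry of the result is $dy_{ji}/dx$.

$$Y = [y_{ij}] \;\Rightarrow\; \frac{dY}{dx} = \left[\frac{dy_{ji}}{dx}\right]$$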
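As a rough illustration of the shape rule above, the following sketch differentiates a 2×3 matrix-valued function element by element using central differences and then transposes, giving a 3×2 result. The function `Y`, the helper `d_matrix_d_scalar`, and the step size `h` are hypothetical names chosen for this example only.

```python
import math

def Y(x):
    """Hypothetical 2 x 3 matrix-valued function of a scalar x."""
    return [[x, x ** 2, 1.0],
            [math.sin(x), 2.0 * x, x ** 3]]

def d_matrix_d_scalar(func, x, h=1e-6):
    """Element-wise central differences, then transpose (this article's convention)."""
    Yp, Ym = func(x + h), func(x - h)
    elementwise = [[(a - b) / (2 * h) for a, b in zip(rp, rm)]
                   for rp, rm in zip(Yp, Ym)]
    # Transpose: an M x N matrix of element derivatives becomes N x M.
    return [list(col) for col in zip(*elementwise)]

dY = d_matrix_d_scalar(Y, 1.0)
print(len(dY), "x", len(dY[0]))  # number of rows x number of columns
```

Many references omit the transpose for this case; the sketch follows the convention stated in this article.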

2. Derivative of a scalar y with respect to a column vector X:

Unlike the case above, the entries here are partial derivatives and there is no transpose: the derivative with respect to an N×1 vector is again an N×1 vector.

$$y = f(x_1, x_2, \ldots, x_n) \;\Rightarrow\; dy/dX = (\partial y/\partial x_1,\ \partial y/\partial x_2,\ \ldots,\ \partial y/\partial x_n)'$$
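The definition can be checked numerically. The sketch below estimates each partial derivative by central differences and compares it with a hand-computed gradient; the example function `f`, the helper `numerical_gradient`, and the step size `h` are assumptions made for illustration.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of dy/dX, one entry per component of x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

def f(x):
    # Hypothetical example: y = x1^2 + 3*x2 + x1*x3
    return x[0] ** 2 + 3 * x[1] + x[0] * x[2]

x0 = [1.0, 2.0, 3.0]
grad = numerical_gradient(f, x0)
# Hand-computed partials: (2*x1 + x3, 3, x1)
analytic = [2 * x0[0] + x0[2], 3.0, x0[0]]
print(grad, analytic)
```

The result has one entry per component of X, matching the N×1 shape stated above.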

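3. Derivative of a row vector Y' with respect to a column vector X:

Note that differentiating a 1×M vector with respect to an N×1 vector gives an N×M matrix.

Differentiate each column of Y' (that is, each component $y_j$) with respect to X, and place the resulting N×1 columns side by side to form the matrix.

Important results:

$$dX'/dX = I$$

$$d(AX)'/dX = A'$$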
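A small numerical check of the second result, assuming the layout described above (entry $(i,j)$ of the derivative is $\partial y_j/\partial x_i$). The helper names `matvec`, `transpose`, and `d_rowvec_d_colvec`, and the sample values, are hypothetical.

```python
def matvec(A, x):
    """Product A x for a list-of-rows matrix A and a list x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def d_rowvec_d_colvec(func, x, h=1e-6):
    """N x M matrix whose (i, j) entry approximates d y_j / d x_i."""
    rows = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        yp, ym = func(xp), func(xm)
        rows.append([(a - b) / (2 * h) for a, b in zip(yp, ym)])
    return rows

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # M = 2 rows, N = 3 columns
x0 = [0.5, -1.0, 2.0]
D = d_rowvec_d_colvec(lambda x: matvec(A, x), x0)
print(D)             # numerically close to transpose(A)
print(transpose(A))
```

Because Y = AX is linear in X, the finite-difference result agrees with A' up to rounding error.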

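4. Derivative of a column vector Y with respect to a row vector $X'$:

Convert it to the derivative of the row vector Y' with respect to the column vector X, then transpose.

Note that differentiating an M×1 vector with respect to a 1×N vector gives an M×N matrix.

$$dY/dX' = (dY'/dX)'$$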

5. Product rules for differentiating vector products with respect to a column vector X:

Note that these rules differ slightly from the scalar product rule.

$$d(UV')/dX = (dU/dX)\,V' + U\,(dV'/dX)$$

$$d(U'V)/dX = (dU'/dX)\,V + (dV'/dX)\,U$$

Important results (A is constant, so its derivative is zero):

$$d(X'A)/dX = (dX'/dX)\,A + (dA'/dX)\,X = IA + 0 \cdot X = A$$

$$d(AX)/dX' = (d(X'A')/dX)' = (A')' = A$$

$$d(X'AX)/dX = (dX'/dX)\,AX + (d(AX)'/dX)\,X = AX + A'X$$
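The quadratic-form result can be sanity-checked with finite differences. In this sketch the matrix `A` is deliberately non-symmetric so that both terms AX and A'X matter; all helper names and sample values are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def quad_form(A, x):
    """Scalar X' A X."""
    return sum(xi * yi for xi, yi in zip(x, matvec(A, x)))

def numerical_gradient(f, x, h=1e-6):
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

A = [[2.0, 1.0],
     [0.0, 3.0]]               # non-symmetric on purpose
x0 = [1.0, -2.0]

numeric = numerical_gradient(lambda x: quad_form(A, x), x0)
# Analytic value from the rule: A X + A' X
analytic = [p + q for p, q in zip(matvec(A, x0), matvec(transpose(A), x0))]
print(numeric, analytic)
```

When A is symmetric the two terms coincide and the gradient reduces to 2AX.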

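6. Derivative of a matrix Y with respect to a column vector X:

Take the partial derivative of Y with respect to each component of X; the results together form a "super-vector".

Note that every element of this vector is itself a matrix.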
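To make the "super-vector" concrete, the sketch below uses the hypothetical example Y = XX' (an N×N matrix built from X) and returns a list with one matrix per component of X. The helper names and the choice of Y are assumptions for illustration.

```python
def outer(x):
    """Y = X X', an N x N matrix built from the column vector X."""
    return [[xi * xj for xj in x] for xi in x]

def d_matrix_d_vector(func, x, h=1e-6):
    """List of N matrices; entry k approximates the partial of Y w.r.t. x_k."""
    blocks = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        Yp, Ym = func(xp), func(xm)
        blocks.append([[(a - b) / (2 * h) for a, b in zip(rp, rm)]
                       for rp, rm in zip(Yp, Ym)])
    return blocks

x0 = [1.0, 2.0]
super_vector = d_matrix_d_vector(outer, x0)
for k, block in enumerate(super_vector):
    print("partial of Y with respect to x_%d:" % (k + 1), block)
```

For this Y the exact blocks are [[2·x1, x2], [x2, 0]] and [[0, x1], [x1, 2·x2]], which the finite differences reproduce up to rounding.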

7. Product rules for differentiating matrix products with respect to a column vector:

$$d(uV)/dX = (du/dX)\,V + u\,(dV/dX)$$

$$d(UV)/dX = (dU/dX)\,V + U\,(dV/dX)$$

Important result:

$$d(X'A)/dX = (dX'/dX)\,A + X'\,(dA/dX) = IA + X' \cdot 0 = A$$

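8. Derivative of a scalar y with respect to a matrix X:

This is analogous to the derivative of a scalar y with respect to a column vector X.

Take the partial derivative of y with respect to each element of X, without transposing.

$$dy/dX = \left[\partial y/\partial x_{ij}\right]$$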

Important results:

(1) $y = U'XV = \sum_i \sum_j u_i x_{ij} v_j$,

then $dy/dX = [u_i v_j] = UV'$

(2) $y = U'X'XU$

then $dy/dX = 2XUU'$

(3) $y = (XU - V)'(XU - V)$

$$dy/dX = d(U'X'XU - 2V'XU + V'V)/dX = 2XUU' - 2VU' + 0 = 2(XU - V)U'$$
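Result (3) is the gradient of a least-squares residual with respect to the matrix X, and it can be checked entry by entry. The following sketch perturbs each $x_{ij}$ in turn; the helper names and the sample values of `X0`, `U`, and `V` are hypothetical.

```python
def residual(X, U, V):
    """Vector X U - V."""
    return [sum(xij * uj for xij, uj in zip(row, U)) - vi
            for row, vi in zip(X, V)]

def loss(X, U, V):
    """Scalar y = (X U - V)' (X U - V)."""
    return sum(r * r for r in residual(X, U, V))

def d_scalar_d_matrix(f, X, h=1e-6):
    """Matrix of the same shape as X holding central-difference partials."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            Xp = [list(r) for r in X]
            Xm = [list(r) for r in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            grad[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return grad

X0 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
U = [1.0, -1.0]
V = [0.5, 0.0, -2.0]

numeric = d_scalar_d_matrix(lambda X: loss(X, U, V), X0)
# Analytic value from result (3): 2 (X U - V) U'
analytic = [[2 * r * uj for uj in U] for r in residual(X0, U, V)]
print(numeric)
print(analytic)
```

The gradient has the same shape as X, as stated at the start of this section.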

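9. Derivative of a matrix Y with respect to a matrix X:

Differentiate each element of Y with respect to X, then arrange the results together to form a "super-matrix".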

10. Derivative of a product

$$d(f * g)/dx = (df'/dx)\,g + (dg/dx)\,f'$$

Result:

$$d(x'Ax)/dx = (d(x'')/dx)\,Ax + (d(Ax)/dx)\,(x'') = Ax + A'x$$

(Note: '' denotes transposing twice.)

11. Suppose A is an m×n matrix and x is an n-dimensional column vector (in (2) and (3) the dimensions must be compatible with the product, so A is square in (3)); then

(1) $\frac{d(Ax)}{dx} = A'$

(2) $\frac{d(x'A)}{dx} = A$

(3) $\frac{d(x'Ax)}{dx} = (A' + A)x$
