高斯牛頓疊代法

轉載：http://www.tqcto.com/article/code/302336.html#

本文将詳解最小二乘法的非線性拟合，高斯牛頓疊代法。

1.原理

高斯—牛頓疊代法的基本思想是使用泰勒級數展開式去近似地代替非線性回歸模型，然後通過多次疊代，多次修正回歸系數，使回歸系數不斷逼近非線性回歸模型的最佳回歸系數，最後使原模型的殘差平方和達到最小。

①已知m個點：

高斯牛頓疊代法

②函數原型：

高斯牛頓疊代法

其中：（m>=n）

高斯牛頓疊代法

③目的是找到最優解β，使得殘差平方和最小：

高斯牛頓疊代法

殘差：

高斯牛頓疊代法

④要求最小值，即S的對β偏導數等于0：

高斯牛頓疊代法

⑤在非線性系統中，

高斯牛頓疊代法

是變量和參數的函數，沒有close解。是以，我們給定一個初始值，用疊代法逼近解：

高斯牛頓疊代法

其中k是疊代次數，

高斯牛頓疊代法

是疊代矢量。

⑥而每次疊代函數是線性的，在

高斯牛頓疊代法

處用泰勒級數展開：

高斯牛頓疊代法

其中：J是已知的矩陣，為了友善疊代，令

高斯牛頓疊代法

。

⑦此時殘差表示為：

高斯牛頓疊代法

⑧帶入公式④有：

高斯牛頓疊代法

化解得：

高斯牛頓疊代法

⑨寫成矩陣形式：

高斯牛頓疊代法

⑩是以最終疊代公式為：

高斯牛頓疊代法

其中，Jf是函數f=（x，β）對β的雅可比矩陣。

2.代碼

用java代碼實作，解維基百科的例子：

https://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm

①已知資料：

高斯牛頓疊代法

②函數模型：

高斯牛頓疊代法

③殘差公式：

高斯牛頓疊代法

④對β求偏導數：

高斯牛頓疊代法

⑤代碼如下：

public class GaussNewton {
    double[] xData = new double[]{, , , , , , };
    double[] yData = new double[]{, , , , , , };

    double[][] bMatrix = new double[][];//系數β矩陣
    int m = xData.length;
    int n = bMatrix.length;
    int iterations = ;//疊代次數

    //疊代公式求解，即1中公式⑩
    private void magic(){
        //β1,β2疊代初值
        bMatrix[][] = ;
        bMatrix[][] = ;

        //求J矩陣
        for(int k = ; k < iterations; k++) { 
            double[][] J = new double[m][n];
            double[][] JT = new double[n][m];
            for(int i = ; i < m; i++){
                for(int j = ; j < n; j++) {
                    J[i][j] = secondDerivative(xData[i], bMatrix[][], bMatrix[][], j);
                }
            }

            JT = MatrixMath.invert(J);//求轉置矩陣JT
            double[][] invertedPart = MatrixMath.mult(JT, J);//矩陣JT與J相乘

            //矩陣invertedPart行列式的值：|JT*J|
            double result = MatrixMath.mathDeterminantCalculation(invertedPart);

            //求矩陣invertedPart的逆矩陣:(JT*J)^-1
            double[][] reversedPart = MatrixMath.getInverseMatrix(invertedPart, result);

            //求方差r(β)矩陣: ri = yi - f(xi, b)
            double[][] residuals = new double[m][];
            for(int i = ; i < m; i++) {
                residuals[i][] = yData[i] - (bMatrix[][] * xData[i]) / (bMatrix[][] + xData[i]);
            }

            //求矩陣積reversedPart*JT*residuals: (JT*J)^-1*JT*r
            double[][] product = MatrixMath.mult(MatrixMath.mult(reversedPart, JT), residuals);

            //疊代公式, 即公式⑩
            double[][] newB = MatrixMath.plus(bMatrix, product);
            bMatrix = newB;        
        }        
        //顯示系數值
        System.out.println("b1: " + bMatrix[][] + "\nb2: " + bMatrix[][]);        
    }

    //2中公式④
    private static double secondDerivative(double x, double b1, double b2, int index){
        switch(index) {
            case : return x / (b2 + x);//對系數bi求導
            case : return - * (b1 * x) / Math.pow((b2+x), );//對系數b2求導
        }
        return ;
    }

    public static void main(String[] args) {
        GaussNewton app = new GaussNewton();
        app.magic();        
    }
}

運作，輸出得到：

b1: 0.3618366954234483

b2: 0.5562654497238557

高斯牛頓疊代法

⑥其中用到的矩陣運算代碼如下：

public class MatrixMath {

    /**
     * 矩陣基本運算：加、減、乘、除、轉置
     */
        public final static int OPERATION_ADD = ;  
        public final static int OPERATION_SUB = ;  
        public final static int OPERATION_MUL = ;        

        /**
         * 矩陣加法
         * @param a 加數
         * @param b 被加數
         * @return 和
         */
        public static double[][] plus(double[][] a, double[][] b){
            if(legalOperation(a, b, OPERATION_ADD)) { 
                for(int i=; i<a.length; i++) {         
                    for(int j=; j<a[].length; j++) {  
                        a[i][j] = a[i][j] + b[i][j];  
                    }
                }   
            }
            return a;
        }

        /**
         * 矩陣減法
         * @param a 減數
         * @param b 被減數
         * @return 差
         */
        public static double[][] substract(double[][] a, double[][] b){
            if(legalOperation(a, b, OPERATION_SUB)) { 
                for(int i=; i<a.length; i++) {  
                    for(int j=; j<a[].length; j++) {  
                        a[i][j] = a[i][j] - b[i][j];  
                    }
                }       
            }
            return a;
        }       

        /**
         * 判斷矩陣行列是否符合運算
         * @param a 進行運算的矩陣
         * @param b 進行運算的矩陣
         * @param type 運算類型
         * @return 符合/不符合運算
         */
        private static boolean legalOperation(double[][] a, double[][] b, int type) {  
            boolean legal = true;  
            if(type == OPERATION_ADD || type == OPERATION_SUB)  
            {  
                if(a.length != b.length || a[].length != b[].length) {  
                    legal = false;  
                }  
            }   
            else if(type == OPERATION_MUL)  
            {  
                if(a[].length != b.length) {  
                    legal = false;  
                }  
            }  
            return legal;  
        } 

        /**
         * 矩陣乘法
         * @param a 乘數
         * @param b 被乘數
         * @return 積
         */
        public static double[][] mult(double[][] a, double[][] b){
            if(legalOperation(a, b, OPERATION_MUL)) {
                double[][] result = new double[a.length][b[].length];
                for(int i=; i< a.length; i++) {  
                    for(int j=; j< b[].length; j++) {  
                        result[i][j] = calculateSingleResult(a, b, i, j);  
                    }
                }
                return result;
            }
            else
            {
                return null;
            }       
        }

        /**
         * 矩陣乘法
         * @param a 乘數
         * @param b 被乘數
         * @return 積
         */
        public static double[][] mult(double[][] a, int b) {  
            for(int i=; i<a.length; i++) {  
                for(int j=; j<a[].length; j++) {  
                    a[i][j] = a[i][j] * b;  
                }  
            }  
            return a;  
        }

        /**
         * 乘數矩陣的行元素與被乘數矩陣列元素積的和
         * @param a 乘數矩陣
         * @param b 被乘數矩陣
         * @param row 行
         * @param column 列
         * @return 值
         */
        private static double calculateSingleResult(double[][] a, double[][] b, int row, int column) {  
            double result = ;  
            for(int k = ; k< a[].length; k++) {  
                result += a[row][k] * b[k][column];  
            }  
            return result;  
        }  

        /**
         * 矩陣的轉置
         * @param a 要轉置的矩陣
         * @return 轉置後的矩陣
         */
        public static double[][] invert(double[][] a){
            double[][] result = new double[a[].length][a.length];
            for(int i=;i<a.length;i++){
                for(int j=;j<a[].length;j++){  
                      result[j][i]=a[i][j];  
                  }  
              }  
            return result;
        }   

    /** 
     * 求可逆矩陣（使用代數餘子式的形式） 
     */   
        /** 
         * 求傳入的矩陣的逆矩陣 
         * @param value 需要轉換的矩陣 
         * @return 逆矩陣 
         */  
        public static double[][] getInverseMatrix(double[][] value,double result){  
            double[][] transferMatrix = new double[value.length][value[].length];  
            //計算代數餘子式，并指派給|A|  
            for (int i = ; i < value.length; i++) {  
                for (int j = ; j < value[i].length; j++) {  
                    transferMatrix[j][i] =  mathDeterminantCalculation(getNewMatrix(i, j, value));  
                    if ((i+j)%!=) {  
                        transferMatrix[j][i] = -transferMatrix[j][i];  
                    }  
                    transferMatrix[j][i] /= result;   
                }  
            }  
            return transferMatrix;  
        }  

        /*** 
         * 求行列式的值
         * @param value 需要算的行列式 
         * @return 計算的結果 
         */  
        public static double mathDeterminantCalculation(double[][] value){  
            if (value.length == ) {  
                //當行列式為1階的時候就直接傳回本身  
                return value[][];  
            }

            if (value.length == ) {  
                //如果行列式為二階的時候直接進行計算  
                return value[][]*value[][]-value[][]*value[][];  
            }  

            //當行列式的階數大于2時  
            double result = ;  
            for (int i = ; i < value.length; i++) {         
                //檢查數組對角線位置的數值是否是0，如果是零則對該數組進行調換，查找到一行不為0的進行調換  
                if (value[i][i] == ) {  
                    value = changeDeterminantNoZero(value,i,i);  
                    result*=-;  
                }  

                for (int j = ; j <i; j++) {  
                    //讓開始處理的行的首位為0處理為三角形式  
                    //如果要處理的列為0則和自己調換一下位置，這樣就省去了計算  
                    if (value[i][j] == ) {  
                        continue;  
                    }  
                    //如果要是要處理的行是0則和上面的一行進行調換  
                    if (value[j][j]==) {  
                        double[] temp = value[i];  
                        value[i] = value[i-];  
                        value[i-] = temp;  
                        result*=-;  
                        continue;  
                    }  
                    double  ratio = -(value[i][j]/value[j][j]);  
                    value[i] = addValue(value[i],value[j],ratio);  
                }  
            }           
            return mathValue(value,result);
        }  

        /** 
         * 計算行列式的結果 
         * @param value 
         * @return 
         */  
        private static double mathValue(double[][] value,double result){  
            for (int i = ; i < value.length; i++) {  
                //如果對角線上有一個值為0則全部為0，直接傳回結果  
                if (value[i][i]==) {  
                    return ;  
                }  
                result *= value[i][i];  
            }  
            return result;  
        }  

        /*** 
         * 将i行之前的每一行乘以一個系數，使得從i行的第i列之前的數字置換為0 
         * @param currentRow 目前要處理的行 
         * @param frontRow i行之前的周遊的行 
         * @param ratio 要乘以的系數 
         * @return 将i行i列之前數字置換為0後的新的行 
         */  
        private static double[] addValue(double[] currentRow,double[] frontRow, double ratio){  
            for (int i = ; i < currentRow.length; i++) {  
                currentRow[i] += frontRow[i]*ratio;  
            }  
            return currentRow;  
        }  

        /** 
         * 指定列的位置是否為0，查找第一個不為0的位置的行進行位置調換，如果沒有則傳回原來的值 
         * @param determinant 需要處理的行列式 
         * @param line 要調換的行 
         * @param row 要判斷的列 
         */  
        private static double[][] changeDeterminantNoZero(double[][] determinant,int line,int column){  
            for (int i = line; i < determinant.length; i++) {  
                //進行行調換  
                if (determinant[i][column] != ) {  
                    double[] temp = determinant[line];  
                    determinant[line] = determinant[i];  
                    determinant[i] = temp;  
                    return determinant;  
                }  
            }  
            return determinant;  
        }         

        /** 
         * 轉換為代數餘子式 
         * @param row 行 
         * @param line 列 
         * @param matrix 要轉換的矩陣 
         * @return 轉換的代數餘子式 
         */  
        private static double[][] getNewMatrix(int row,int line,double[][] matrix){  
            double[][] newMatrix = new double[matrix.length-][matrix[].length-];  
            int rowNum = ,lineNum = ;  
            for (int i = ; i < matrix.length; i++) {  
                if (i == row){  
                    continue;  
                }  
                for (int j = ; j < matrix[i].length; j++) {  
                    if (j == line) {  
                        continue;  
                    }  
                    newMatrix[rowNum][lineNum++%(matrix[].length-)] = matrix[i][j];  
                }  
                rowNum++;  
            }  
            return newMatrix;  
        }  

        public static void main(String[] args) {  
            //double[][] test = {{0,0,0,1,2},{0,0,0,2,3},{1,1,0,0,0},{0,1,1,0,0},{0,0,1,0,0}};  
            double[][] test = {
                    {, -},  
                    {-, }
                };
            double result;  
            try {  
                double[][] temp = new double[test.length][test[].length];  
                for (int i = ; i < test.length; i++) {  
                    for (int j = ; j < test[i].length; j++) {  
                        temp[i][j] = test[i][j];  
                    }  
                }  
                //先計算矩陣的行列式的值是否等于0，如果不等于0則該矩陣是可逆的  
                result = mathDeterminantCalculation(temp);  
                if (result == ) {  
                    System.out.println("矩陣不可逆");  
                }else {  
                    System.out.println("矩陣可逆");  
                    //求出逆矩陣  
                    double[][] result11 = getInverseMatrix(test,result);  
                    //列印逆矩陣  
                    for (int i = ; i < result11.length; i++) {  
                        for (int j = ; j < result11[i].length; j++) {  
                            System.out.print(result11[i][j]+"   ");                       
                        }  
                        System.out.println();  
                    }  
                }  
            } catch (Exception e) {  
                e.printStackTrace();  
                System.out.println("不是正确的行列式！！");  
            }  
        }   
}

高斯牛頓疊代法

繼續閱讀

簡單文檔分類——樸素貝葉斯算法樸素貝葉斯算法簡單文檔分類執行個體步驟總結樸素貝葉斯分類調用(sklearn)

【分類算法】什麼是分類算法定義分類與聚類分類過程方法

分類算法的評價名額

K-近鄰算法以及圖像分類應用

weka之NB算法

使用weka的select attribute

weka中分類器算法

在weka中內建自己的算法

【多變量線性回歸】學習記錄序思路實作終

申請評分模型拒絕推斷（RI）方法申請評分模型拒絕推斷（RI）方法

【人工智能行業大師訪談1】吳恩達采訪 Geoffery Hinton

【趨高機器視覺】機器視覺技術原了解析及解決方案

吳恩達 coursera ML 第七課總結+作業答案前言目錄正文模型表示作業答案

XGBoost Plotting API以及GBDT組合特征實踐 XGBoost Plotting API以及GBDT組合特征實踐

解碼器用于語義分割：資料依賴的解碼可以實作靈活的特征聚合

2021-2025年中國運動療法（KT）帶行業市場供需與戰略研究報告