Nonlinear Regression: Logistic Regression Notes

2023-04-07 19:26:52

Logistic Regression

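1. Basic model

The input data (features) is X = (x0, x1, x2, ..., xn).

The parameters to learn are Θ = (θ0, θ1, θ2, ..., θn).

$$z = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$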

In vector form:

$$z = \Theta^T X$$

To handle binary data, introduce the Sigmoid function, which maps z onto a smooth curve between 0 and 1:

$$g(z) = \frac{1}{1 + e^{-z}}$$

This gives the prediction function of logistic regression:

$$h_\Theta(X) = g(\Theta^T X) = \frac{1}{1 + e^{-\Theta^T X}}$$
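The Sigmoid function and the prediction function can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the original notes: the names `sigmoid` and `predict_proba` and the values of `theta` and `x` are assumptions chosen for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_proba(x, theta):
    """h_theta(x) = g(theta^T x), for one instance or a matrix of instances."""
    return sigmoid(np.dot(x, theta))


# Hypothetical parameters and instances, chosen only for illustration.
theta = np.array([-1.0, 2.0])
x = np.array([[1.0, 0.0],    # first column x0 = 1 is the intercept term
              [1.0, 0.5],
              [1.0, 2.0]])

probs = predict_proba(x, theta)   # one value in (0, 1) per instance
print(probs)
```

The second instance has z = -1 + 2 * 0.5 = 0, so its output sits exactly at the midpoint 0.5 of the curve.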

It can also be expressed as a probability.

For a positive example (y=1), the probability given x and Θ is:

$$P(y = 1 \mid x; \Theta) = h_\Theta(x)$$

For a negative example (y=0), the probability given x and Θ is:

$$P(y = 0 \mid x; \Theta) = 1 - h_\Theta(x)$$
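As a small sketch of the two cases, the probabilities of the positive and negative class always sum to 1, and both can be written as the single expression h^y (1 - h)^(1 - y). The helper names and numbers below are hypothetical and serve only as an illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def class_probabilities(x, theta):
    """Return (P(y=1 | x; theta), P(y=0 | x; theta))."""
    p_pos = sigmoid(np.dot(x, theta))
    return p_pos, 1.0 - p_pos


def likelihood(x, y, theta):
    """Both cases in one expression: h^y * (1 - h)^(1 - y), for y in {0, 1}."""
    h = sigmoid(np.dot(x, theta))
    return h ** y * (1.0 - h) ** (1 - y)


# Hypothetical values, for illustration only.
theta = np.array([0.5, -1.0])
x = np.array([1.0, 2.0])          # x0 = 1 is the intercept term

p_pos, p_neg = class_probabilities(x, theta)
print(p_pos, p_neg)
```

With y = 1 the second factor becomes 1 and only h remains; with y = 0 only 1 - h remains.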

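2. Cost function

In linear regression, the cost is the squared difference between the predicted and true values, and the goal is to minimize it:

$$J(\Theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\Theta(x^{(i)}) - y^{(i)}\right)^2$$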

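In logistic regression, the two probability cases are merged into one equation; taking the logarithm simplifies it and makes its monotonicity easier to judge:

$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\Theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\Theta(x^{(i)})\right)\right]$$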
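A minimal sketch of the logistic regression cost (the averaged negative log-likelihood) might look like the following. The function name `logistic_cost`, the clipping constant `eps`, and the toy data are assumptions added for illustration; the clipping only guards against log(0).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_cost(x, y, theta, eps=1e-12):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y)
    h = sigmoid(np.dot(x, theta))
    h = np.clip(h, eps, 1.0 - eps)   # assumption: avoid log(0) for extreme z
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m


# Hypothetical toy data: first column is the intercept term x0 = 1.
x = np.array([[1.0, -2.0],
              [1.0, -1.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

cost_zero = logistic_cost(x, y, np.zeros(2))            # h = 0.5 everywhere
cost_good = logistic_cost(x, y, np.array([0.0, 3.0]))   # fits the labels better
print(cost_zero, cost_good)
```

With all parameters at zero every prediction is 0.5, so the cost equals log 2; parameters that separate the two classes give a lower cost.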

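Find the set of Θ values that minimizes the equation above, using gradient descent.

Gradient descent:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\Theta)$$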

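Given a learning rate α and this update rule, the derivative is computed and the parameters are updated in a loop.

To find the minimum, taking the partial derivative and simplifying gives the update rule for logistic regression:

$$\theta_j := \theta_j - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\Theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$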

Update all θ simultaneously, and repeat the update until convergence.
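The logistic regression update rule can be sketched as below. This is an illustrative implementation under stated assumptions, not the notes' own code: the function name, the toy data, the learning rate, and the iteration count are all hypothetical, and a fixed number of iterations stands in for a convergence check.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_gradient_descent(x, y, theta, alpha, num_iterations):
    """Apply theta := theta - alpha * (1/m) * X^T (h - y) repeatedly."""
    m = len(y)
    for _ in range(num_iterations):
        h = sigmoid(np.dot(x, theta))          # predicted probabilities
        gradient = np.dot(x.T, h - y) / m      # partial derivatives for all theta_j
        theta = theta - alpha * gradient       # simultaneous update of every theta_j
    return theta


# Hypothetical toy data: one feature 0..9, labels mostly 0 for small values
# and mostly 1 for large values (deliberately not perfectly separable).
features = np.arange(10, dtype=float)
x = np.column_stack([np.ones(10), features])   # x0 = 1 is the intercept term
y = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1], dtype=float)

theta = logistic_gradient_descent(x, y, np.zeros(2), alpha=0.1, num_iterations=5000)
p_low = sigmoid(np.dot(np.array([1.0, 0.0]), theta))    # P(y=1) at feature value 0
p_high = sigmoid(np.dot(np.array([1.0, 9.0]), theta))   # P(y=1) at feature value 9
print(theta, p_low, p_high)
```

The only difference from the linear regression update is that the hypothesis passes through the Sigmoid function before the error h - y is computed.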

3. Python example (gradient descent with a linear hypothesis and squared-error cost):

import numpy as np
import random


def gradientDescent(x, y, theta, alpha, m, numIterations):   # Gradient descent. x: instances, y: labels, theta: θ, alpha: learning rate,
                                                             # m: number of instances, numIterations: number of update steps
    xTrans = x.transpose()          # transpose of x
    for i in range(0, numIterations):   # with numIterations=1000, i runs from 0 to 999
        hypothesis = np.dot(x, theta)   # predicted y values: dot product of x and theta (linear, no sigmoid)
        loss = hypothesis - y           # prediction error for each instance

        cost = np.sum(loss ** 2) / (2 * m)   # squared-error cost

        print("Iteration %d | Cost: %f" % (i, cost))

        gradient = np.dot(xTrans, loss) / m   # gradient of the cost with respect to theta

        theta = theta - alpha * gradient      # update all θ simultaneously
    return theta


def getData(numPoints, bias, variance):    # Build the data set. Parameters: number of instances, bias, variance
    x = np.zeros(shape=(numPoints, 2))     # numPoints rows, 2 columns
    y = np.zeros(shape=numPoints)          # labels

    for i in range(0, numPoints):      # assign values for i from 0 to numPoints-1

        x[i][0] = 1      # intercept term
        x[i][1] = i

        y[i] = (i + bias) + random.uniform(0, 1) * variance      # uniform draws a random float between 0 and 1
    return x, y


x, y = getData(100, 25, 10)     # 100 rows, i.e. 100 instances

# print("x:\n", x)
# print("y:\n", y)

m, n = np.shape(x)
# y_col = np.shape(y)

# print("x shape:", str(m), str(n))
# print("y shape:", str(y_col))

numIterations = 100000
alpha = 0.0005
theta = np.ones(n)      # initialize every θ to 1
theta = gradientDescent(x, y, theta, alpha, m, numIterations)   # m = 100 instances
print(theta)

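θ is updated repeatedly until convergence, giving [29.68959795 1.01793798]. The exact values vary between runs because the data is generated randomly.

For a new input, substitute it into the model to obtain the prediction.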
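As a small sketch of that last step, the learned θ can be applied to a new input by taking the dot product with [1, x], matching the linear hypothesis used in `gradientDescent`. The helper name `predict` and the input value 50 are hypothetical, chosen only for illustration.

```python
import numpy as np

# Parameters reported above for one run of the example.
theta = np.array([29.68959795, 1.01793798])


def predict(x_value, theta):
    """Linear prediction theta0 * 1 + theta1 * x_value."""
    features = np.array([1.0, x_value])   # leading 1 is the intercept term
    return float(np.dot(features, theta))


prediction = predict(50, theta)   # hypothetical new input
print(prediction)
```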
