Algorithm Review, Advanced Task 3: Evaluation Functions (Single Variable)

[Evaluation functions (single variable)]

  1. Boston housing data (using only the number-of-rooms feature)
  2. Data splitting (train_test_split)
  3. MSE, MAE, RMSE, r2_score
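
For reference, the four metrics in item 3 can be written out directly in NumPy. This is a minimal sketch rather than scikit-learn's implementation: the helper name `regression_metrics` and the toy numbers are made up for illustration, and it assumes 1-D inputs of equal length with a non-constant `y_true` (otherwise the R^2 denominator is zero).

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)      # mean squared error
    rmse = np.sqrt(mse)                # root mean squared error
    mae = np.mean(np.abs(residuals))   # mean absolute error

    # R^2 = 1 - SS_res / SS_tot, with SS_tot measured around the mean of y_true
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Toy values, chosen only to make the arithmetic easy to follow by hand.
scores = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(scores)
```

RMSE is in the same units as the target, which makes it easier to read than MSE. R^2 compares the model against always predicting the mean of `y_true`.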

import pandas as pd

import numpy as np

from sklearn import datasets

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.linear_model import LinearRegression

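boston = datasets.load_boston() # requires scikit-learn < 1.2; load_boston was removed in version 1.2

x = boston.data # feature columns of the dataset

y = boston.target # label column of the dataset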

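df = pd.DataFrame(data=np.c_[x, y], columns=np.append(boston.feature_names, ['MEDV'])) # np.c_ joins the two arrays column-wise (side by side), so they must have the same number of rows

df = df[['RM', 'MEDV']] # keep only the room-count column and the house-price column

x = df[['RM']].values # single feature as a 2-D array of shape (n_samples, 1)

y = df['MEDV'].values

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3) # split the data: 70% train, 30% test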

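scaler = StandardScaler() # standardizes to zero mean and unit variance; it stores the training-set mean and variance so the same parameters can be used to transform the test set

x_train = scaler.fit_transform(x_train)

x_test = scaler.transform(x_test) # reuse the training-set statistics; do not refit on the test data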

from sklearn import metrics

from sklearn.metrics import r2_score

from sklearn.metrics import mean_squared_error

from sklearn.metrics import mean_absolute_error # mean absolute error

linreg = LinearRegression()

model = linreg.fit(x_train,y_train)

print("MSE (mean squared error):", mean_squared_error(y_train, model.predict(x_train)))

print("RMSE (root mean squared error):", mean_squared_error(y_train, model.predict(x_train)) ** 0.5)

print("MAE (mean absolute error):", mean_absolute_error(y_train, model.predict(x_train)))

print("r2_score (coefficient of determination):", r2_score(y_train, model.predict(x_train)))

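Results (training set, one run). The split is random, so the numbers change from run to run; these particular values are typical of a fit on the full 13-feature matrix, and a fit on RM alone scores noticeably lower.

MSE (mean squared error): 22.343758172268284

RMSE (root mean squared error): 4.72691846473665

MAE (mean absolute error): 3.3216661028196164

r2_score (coefficient of determination): 0.7459490346981561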
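
The scores above are computed on the training data, so they say little about how the model generalizes. A sketch of the same evaluation on both the training and the held-out split might look like the following. Since `load_boston` is unavailable in recent scikit-learn releases, it uses a synthetic single-feature dataset as a stand-in; the coefficients, the noise level, `random_state=42` and the helper `report` are all assumptions made for illustration, not values from the original post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a single-feature dataset (NOT the Boston data):
# a hypothetical linear relation y = 9 * x - 34 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(4.0, 9.0, size=(200, 1))
y = 9.0 * x[:, 0] - 34.0 + rng.normal(0.0, 3.0, size=200)

# A fixed random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42
)

# Fit the scaler on the training set only, then reuse it on the test set.
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

model = LinearRegression().fit(x_train_scaled, y_train)


def report(name, y_true, y_pred):
    """Print the four metrics for one data split and return them."""
    mse = mean_squared_error(y_true, y_pred)
    scores = {
        "mse": mse,
        "rmse": mse ** 0.5,
        "mae": mean_absolute_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }
    print(name, scores)
    return scores


train_scores = report("train", y_train, model.predict(x_train_scaled))
test_scores = report("test", y_test, model.predict(x_test_scaled))
```

A large gap between the two sets of scores would suggest overfitting. With a single feature and a linear model, the gap is usually small.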
