【Evaluation Metrics (Single Variable)】
- Boston housing data (using only the number-of-rooms feature)
- Data splitting (train_test_split)
- MSE, MAE, RMSE, r2_score
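As a quick reference for the four metrics listed above, here is a minimal NumPy sketch of their usual definitions, cross-checked against the corresponding `sklearn.metrics` functions. The helper names (`mse`, `rmse`, `mae`, `r2`) and the toy arrays are made up for illustration and do not come from the original post.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def mse(y_true, y_pred):
    # Mean squared error: average of the squared residuals
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    # Root mean squared error: square root of MSE, in the same units as y
    return np.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    # Mean absolute error: average of the absolute residuals
    return np.mean(np.abs(y_true - y_pred))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy values, purely illustrative (hypothetical data)
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 10.0])

print("MSE :", mse(y_true, y_pred), "sklearn:", mean_squared_error(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAE :", mae(y_true, y_pred), "sklearn:", mean_absolute_error(y_true, y_pred))
print("R2  :", r2(y_true, y_pred), "sklearn:", r2_score(y_true, y_pred))
```

MSE and RMSE penalize large residuals more heavily than MAE does, while R2 is unitless and compares the model against always predicting the mean of `y_true`.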
import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
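boston = datasets.load_boston() # note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
x = boston.data # feature columns of the dataset
y = boston.target # label column of the dataset
df = pd.DataFrame(data = np.c_[x,y],columns=np.append(boston.feature_names,['MEDV'])) # np.c_ joins the two arrays column-wise (side by side), which requires them to have the same number of rows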
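df = df[['RM','MEDV']] # keep only the room-count column and the house-price column
x = df[['RM']].values # single feature as a 2-D array of shape (n_samples, 1)
y = df['MEDV'].values # target: median house price
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.3) # split into training and test sets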
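scaler = StandardScaler() # standardizes features by removing the mean and scaling to unit variance; it stores the training-set mean and variance so the same parameters can be applied to the test set
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test) # reuse the training-set statistics; do not refit on the test set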
from sklearn import metrics
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error # mean absolute error
linreg = LinearRegression()
model = linreg.fit(x_train,y_train)
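print("MSE (mean squared error):",mean_squared_error(y_train,model.predict(x_train)))
print("RMSE (root mean squared error):",mean_squared_error(y_train,model.predict(x_train)) ** 0.5)
print("MAE (mean absolute error):",mean_absolute_error(y_train,model.predict(x_train)))
print("r2_score (coefficient of determination):",r2_score(y_train,model.predict(x_train)))
Result (training-set metrics from the author's run, which split the full boston.data matrix; values change with the random split and with the features used):
MSE (mean squared error): 22.343758172268284
RMSE (root mean squared error): 4.72691846473665
MAE (mean absolute error): 3.3216661028196164
r2_score (coefficient of determination): 0.7459490346981561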
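The metrics above are computed on the training data only. A common complement is to score the held-out test split as well, with the scaler fitted on the training split alone. The sketch below shows that workflow under stated assumptions: it uses synthetic single-feature data (not the Boston dataset, which recent scikit-learn releases no longer ship), a fixed `random_state` for reproducibility, and a hypothetical helper named `evaluate`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic single-feature data: an assumption for illustration, not the Boston data.
rng = np.random.default_rng(0)
x = rng.uniform(4.0, 9.0, size=(200, 1))                   # stand-in for a room-count column
y = 9.0 * x[:, 0] - 34.0 + rng.normal(0.0, 3.0, size=200)  # linear trend plus noise

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

scaler = StandardScaler()
x_train_s = scaler.fit_transform(x_train)  # learn mean and std from the training split only
x_test_s = scaler.transform(x_test)        # apply those same statistics to the test split

model = LinearRegression().fit(x_train_s, y_train)

def evaluate(y_true, y_pred):
    """Return the four metrics used in this post as a dict."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": mean_absolute_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
    }

train_scores = evaluate(y_train, model.predict(x_train_s))
test_scores = evaluate(y_test, model.predict(x_test_s))

for name in ("MSE", "RMSE", "MAE", "R2"):
    print(f"{name}: train={train_scores[name]:.4f}  test={test_scores[name]:.4f}")
```

A large gap between the training and test columns would suggest overfitting; for a one-feature linear model the two are usually close.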