sklearn & TensorFlow Machine Learning 01 --- Overview, Regression Model (the Relationship Between Happiness and National GDP)

2021-11-24 23:50:00

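Before learning something, be clear about what it is you are learning.

What is machine learning?

Machine learning speaks the language of data: it uses computation to fit models to data (regression, for example) and to make predictions.

It includes supervised learning, unsupervised learning, reinforcement learning, and deep learning.

Supervised learning: labelled data is run through a mathematical model to find parameters that give reasonably good accuracy, so that the model can then predict labels for data it has not seen.

Let's start with a small piece of code to get a feel for it.

We use a regression model to look at the relationship between happiness and how wealthy a country is.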

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import linear_model

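# First, process the happiness (life satisfaction) data
# Load the data, keep only the 'Total' rows, and pivot to one row per country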
oecd_bli = pd.read_csv("oecd_bli_2015.csv",thousands = ',')
oecd_bli = oecd_bli[oecd_bli['Inequality']=='Total']
oecd_bli = oecd_bli.pivot(index = 'Country', columns = 'Indicator',values = 'Value')

# Next, process the GDP data
gdp_per_capita = pd.read_csv('gdp_per_capita.csv',thousands = ',', 
                             delimiter = '\t', encoding ='latin1',na_values = 'n/a')
gdp_per_capita.rename(columns = {'2015':'GDP per captial'},inplace = True)
gdp_per_capita.set_index('Country', inplace = True)
gdp_per_capita.head(2)

# Merge the two tables on their shared Country index

full_country_stats = pd.merge(left = oecd_bli, right = gdp_per_capita, 
                              left_index = True, right_index = True)
full_country_stats.sort_values(by = 'GDP per captial', inplace = True)

# Split the data: hold out a few countries by row position
remove_indices = [0,1,6,8,33,34,35]
keep_indices = sorted(set(range(36)) - set(remove_indices))
sample_data = full_country_stats[["GDP per captial",'Life satisfaction']].iloc[keep_indices]
missing_data = full_country_stats[["GDP per captial","Life satisfaction"]].iloc[remove_indices]

# Plot the sample data and annotate a few countries
sample_data.plot(kind = 'scatter',x= 'GDP per captial',y = 'Life satisfaction', figsize = (5,3))
plt.axis([0,60000,0,10])
position_text = {
        "Hungary":(5000,1),
        "Korea":(18000,1.7),
        "France":(29000,2.4),
        "Australia":(40000,3.0),
        "United States":(52000,3.8)     
        }
for country, pos_text in position_text.items():
    pos_data_x, pos_data_y = sample_data.loc[country]
    # Use a shorter label for the United States
    country = 'U.S.' if country == "United States" else country
    plt.annotate(country, xy = (pos_data_x, pos_data_y), xytext = pos_text,
                 arrowprops = dict(facecolor = 'black', width = 0.5, shrink = 0.1, headwidth = 5))
    plt.plot(pos_data_x,pos_data_y,'ro')
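
The data preparation above (filter, pivot, merge on the index, sort, then hold out rows by position) is easier to follow on a tiny table. The following is only a sketch: the countries "A" to "D", all the values, and the column name "GDP per capita" are made up for illustration and are not taken from the real CSV files.

```python
import pandas as pd

# Made-up long-format rows shaped like the well-being CSV (illustrative values only)
bli_long = pd.DataFrame({
    "Country":    ["A", "A", "A", "B", "B", "C", "C"],
    "Indicator":  ["Life satisfaction", "Life satisfaction", "Employment rate",
                   "Life satisfaction", "Employment rate",
                   "Life satisfaction", "Employment rate"],
    "Inequality": ["Total", "Men", "Total", "Total", "Total", "Total", "Total"],
    "Value":      [6.0, 6.1, 70.0, 7.0, 75.0, 5.0, 60.0],
})

# Keep only the "Total" rows so every (Country, Indicator) pair is unique,
# then pivot: one row per country, one column per indicator
bli_total = bli_long[bli_long["Inequality"] == "Total"]
bli_wide = bli_total.pivot(index="Country", columns="Indicator", values="Value")

# Made-up GDP table indexed by country; "D" has no well-being data
gdp = pd.DataFrame(
    {"GDP per capita": [30000.0, 10000.0, 20000.0, 40000.0]},
    index=pd.Index(["A", "B", "C", "D"], name="Country"),
)

# Inner join on the index: only countries present in both tables survive
merged = pd.merge(left=bli_wide, right=gdp, left_index=True, right_index=True)
merged = merged.sort_values(by="GDP per capita")

# Hold out rows by position (here just the poorest country), keep the rest
remove_indices = [0]
keep_indices = sorted(set(range(len(merged))) - set(remove_indices))
sample = merged.iloc[keep_indices]
held_out = merged.iloc[remove_indices]

print(merged)
print(list(sample.index), list(held_out.index))
```

Without the "Total" filter, country "A" would have two "Life satisfaction" rows and `pivot` would raise an error about duplicate entries, which is why the filter comes first.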


# Prepare the input (GDP per capita) and target (life satisfaction) as column arrays
country_stats = sample_data
x = np.c_[country_stats['GDP per captial']]
y = np.c_[country_stats['Life satisfaction']]

# Visualize the data
country_stats.plot(kind='scatter', x="GDP per captial", y='Life satisfaction')
plt.show()

# Select a linear model and train it
lin_reg_model = linear_model.LinearRegression()
lin_reg_model.fit(x, y)

#Make a prediction for Cyprus
X_new = [[22587]]
print(lin_reg_model.predict(X_new))
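
To see what the fitted model has actually learned, it can help to read out the two parameters of the line and redo the prediction by hand. This is a minimal sketch on made-up numbers (four points placed exactly on a line with intercept 4.0 and slope 5e-5), not the real country data; `theta0` and `theta1` are just names chosen here for the intercept and slope.

```python
import numpy as np
from sklearn import linear_model

# Made-up (GDP per capita, life satisfaction) pairs lying exactly on a line; not real data
gdp = np.array([[10000.0], [20000.0], [30000.0], [40000.0]])
satisfaction = 4.0 + 5e-5 * gdp  # column array of shape (4, 1), like np.c_[...] above

model = linear_model.LinearRegression()
model.fit(gdp, satisfaction)

# With a 2-D target, intercept_ has shape (1,) and coef_ has shape (1, 1)
theta0 = model.intercept_[0]  # intercept of the fitted line
theta1 = model.coef_[0][0]    # slope of the fitted line

# A prediction is just theta0 + theta1 * x
x_new = 22587
by_hand = theta0 + theta1 * x_new
by_model = model.predict([[x_new]])[0][0]

print(theta0, theta1)
print(by_hand, by_model)
```

The same two attributes, `intercept_` and `coef_`, are available on `lin_reg_model` after it has been fitted on the real data.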
