
Evaluating Keras Models

Contents

Evaluating Keras Models

Automatic Evaluation

Manual Evaluation

Manually Splitting the Dataset

Using Cross-Validation

Automatic Evaluation

#!/usr/bin/env python
# -*- coding:utf-8 -*- 
# Author: Jia ShiLin

'''
Automatic evaluation: pass validation_split to fit() to hold out
a fraction of the data as a validation set.
'''

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

np.random.seed(100)

# data
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')  # 9 columns: 8 features plus a 0/1 label in the last column

x = dataset[:, 0:8]
y = dataset[:, 8]

# model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # single sigmoid unit for a binary label

# compile
model.compile(
    loss='binary_crossentropy', optimizer='adam',
    metrics=['accuracy'],
)

# train the model; fit() reports metrics on the held-out validation split
model.fit(x=x, y=y, epochs=150, batch_size=20, validation_split=0.2)
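As a minimal sketch of what validation_split does (assuming Keras's documented behavior), fit() holds out the last fraction of the provided arrays, before any shuffling of the training portion, so the held-out rows can be computed by hand:

```python
import numpy as np

# Keras holds out the LAST fraction of the data for validation_split,
# taken before any shuffling of the training portion.
n_samples = 768          # number of rows in the Pima dataset
validation_split = 0.2
split_at = int(n_samples * (1 - validation_split))

rows = np.arange(n_samples)
train_rows, val_rows = rows[:split_at], rows[split_at:]
print(len(train_rows), len(val_rows))  # 614 154
```

Because the split is positional, shuffle the dataset beforehand if the rows are ordered by class.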

           

Manual Evaluation

Manually Splitting the Dataset

#!/usr/bin/env python
# -*- coding:utf-8 -*- 
# Author: Jia ShiLin
'''
Manually splitting the dataset:
use the train_test_split() function from the scikit-learn machine learning
library to split the data into training and validation sets.
from sklearn.model_selection import train_test_split
'''

from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split

import numpy as np

seed = 7  # the seed is reused below
np.random.seed(seed)

# data
dataset = np.loadtxt('diabetes.csv', delimiter=',')

# split features and labels
x = dataset[:, 0:8]
y = dataset[:, 8]

# split into training and validation sets
x_train, x_validation, y_train, y_validation = train_test_split(x, y, test_size=0.2, random_state=seed)

# model
model = Sequential()
model.add(Dense(32, input_dim=8, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # single sigmoid unit for a binary label

# compile
model.compile(
    loss='binary_crossentropy', optimizer='adam',
    metrics=['accuracy'],
)

# train
model.fit(
    x_train, y_train,
    validation_data=(x_validation, y_validation),  # note: validation_data here, not validation_split
    epochs=150,
    batch_size=50,
)
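A quick self-contained sketch of train_test_split on toy data shaped like the Pima dataset; the stratify argument (not used in the listing above, but worth knowing) keeps the 0/1 class ratio the same in both splits:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the Pima data: 100 rows, 8 features, binary label
rng = np.random.RandomState(7)
x = rng.rand(100, 8)
y = rng.randint(0, 2, size=100)

# stratify=y preserves the 0/1 class proportions in both subsets
x_train, x_val, y_train, y_val = train_test_split(
    x, y, test_size=0.2, random_state=7, stratify=y
)
print(x_train.shape, x_val.shape)  # (80, 8) (20, 8)
```

Stratification matters when the classes are imbalanced, as in the diabetes data, otherwise a random split can leave the validation set with too few positive cases.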
           

Using Cross-Validation

Why cross-validation: https://blog.csdn.net/qq_35290785/article/details/91056784

#!/usr/bin/env python
# -*- coding:utf-8 -*- 
# Author: Jia ShiLin

'''
k-fold cross-validation: split the dataset into k subsets, use one as the
validation set and train on the rest, and repeat for each subset in turn.
The mean of the k models' scores is taken as the model's final result.
It is rarely used to evaluate deep learning models because of the
computational cost: with the usual 10 folds, training 10 models greatly
increases evaluation time.

Here the StratifiedKFold class from the scikit-learn machine learning
library splits the data into 10 subsets, which are used to build and
evaluate 10 models. Setting verbose=0 silences the detailed output of
fit() and evaluate(). Each model is evaluated as soon as it is built,
and once all models are done, the mean and standard deviation of the
collected scores estimate the accuracy and robustness of the model.
'''

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
from sklearn.model_selection import StratifiedKFold

seed = 8
np.random.seed(seed)

# data
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
x_data = dataset[:, 0:8]
y_data = dataset[:, 8]  # 1-D labels, as StratifiedKFold expects

kfold = StratifiedKFold(n_splits=10, random_state=seed, shuffle=True)
cvscores = []

for train, validation in kfold.split(x_data, y_data):
    # model
    model = Sequential()
    model.add(Dense(18, input_dim=8, activation='relu'))
    model.add(Dense(18, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))  # single sigmoid unit for a binary label

    # compile
    model.compile(
        loss='binary_crossentropy',
        metrics=['accuracy'],
        optimizer='adam',
    )
    # train
    model.fit(
        x_data[train], y_data[train],
        epochs=150,
        batch_size=20,
        verbose=0,
    )

    # evaluate
    scores = model.evaluate(x_data[validation], y_data[validation], verbose=0)

    # show
    print('%s: %.2f%%' % (model.metrics_names[1], scores[1] * 100))
    cvscores.append(scores[1] * 100)

print('%.2f%% (+/- %.2f%%)' % (np.mean(cvscores), np.std(cvscores)))
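To see what StratifiedKFold actually yields, here is a small sketch on synthetic data (the sizes and names are illustrative, not taken from the listing above): within each fold the train and validation indices are disjoint, and across all folds every row is used for validation exactly once:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in: 50 rows, 8 features, 1-D binary labels
rng = np.random.RandomState(8)
x = rng.rand(50, 8)
y = rng.randint(0, 2, size=50)

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=8)

all_val = []
for fold, (train_idx, val_idx) in enumerate(kfold.split(x, y)):
    # train/validation indices never overlap within a fold
    assert set(train_idx).isdisjoint(val_idx)
    all_val.extend(val_idx)

# across the folds, every row appears in a validation set exactly once
print(sorted(all_val) == list(range(50)))  # True
```

The "stratified" part means each fold keeps roughly the same 0/1 class proportions as the whole dataset, which is why the split takes y as an argument.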
           
