Sklearn ValueError: This solver needs samples of at least 2 classes in the data, but the data

2023-05-22 08:30:47

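sklearn error: ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0

I ran into this problem while using sklearn's learning_curve() function (sklearn.model_selection.learning_curve in current releases) with Logistic regression as the estimator. A quick search online shows that many people have hit the same error; their use cases differ, but almost all of them involve Logistic regression. An effective fix is described below.

First, here is my earlier, incorrect attempt: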

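from sklearn.model_selection import learning_curve

train_sizes, train_scores, test_scores = learning_curve(estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes, verbose=verbose)  # note X, y

Note that the call above passes X, y directly. The error is telling us that something is wrong with these two variables. What? X is just the features and y is the labels, so how can that be wrong?!

After some searching, I found a fix that works on stackoverflow: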

from sklearn.utils import shuffle

X_shuffle, y_shuffle = shuffle(X, y)

Then pass the shuffled variables (X_shuffle, y_shuffle) in place of the original ones and train again, and it works as expected!
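For reference, the corrected call can be sketched end to end as follows. The toy data, the cv=5 setting, and the train_sizes grid are assumptions made only for illustration; they are not the data or settings from my original run.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.utils import shuffle

# Hypothetical toy data, sorted by label: 50 rows of class 0, then 50 rows of class 1.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),
    rng.normal(3.0, 1.0, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

# Shuffle features and labels together so the rows stay aligned.
X_shuffle, y_shuffle = shuffle(X, y, random_state=0)

# Pass the shuffled arrays instead of the original X, y.
train_sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(),
    X_shuffle,
    y_shuffle,
    cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
)

print(train_sizes)
print(train_scores.mean(axis=1))
```

As far as I know, recent releases of learning_curve also accept shuffle=True together with random_state, which shuffles the training rows internally; I have not compared the two approaches here.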

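The reason is that before shuffling, the CV splits can produce a training subset that contains only one class, for example when the rows are ordered by label. Shuffling randomizes the row order and makes that much less likely. It does not rule it out, though: if the dataset is extremely imbalanced, the same error can still appear even after shuffling.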
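The effect can be illustrated without sklearn at all. The sketch below assumes, as I understand the behaviour, that a small training subset is taken from the front of the data; the label lists and the helper classes_in_prefix are hypothetical, and the standard library's random.shuffle stands in for sklearn.utils.shuffle.

```python
import random

# Hypothetical labels sorted by class: 50 zeros followed by 50 ones.
y_sorted = [0] * 50 + [1] * 50


def classes_in_prefix(labels, n):
    """Return the set of classes found in the first n labels."""
    return set(labels[:n])


# A small subset taken from the front of sorted data holds a single class,
# which is exactly the situation the solver complains about.
prefix_before = classes_in_prefix(y_sorted, 20)

# After shuffling, the same-sized subset very likely holds both classes.
y_shuffled = y_sorted[:]
random.Random(0).shuffle(y_shuffled)
prefix_after = classes_in_prefix(y_shuffled, 20)

# With extreme imbalance (99 zeros, a single one), shuffling is no guarantee:
# the lone positive lands in the first 20 rows only about 20% of the time.
y_imbalanced = [0] * 99 + [1]
random.Random(0).shuffle(y_imbalanced)
prefix_imbalanced = classes_in_prefix(y_imbalanced, 20)

print(prefix_before, prefix_after, prefix_imbalanced)
```

For the imbalanced case, collecting more samples of the rare class or using larger training subsets is probably the more reliable remedy.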
