[Algorithm Competitions] How to Download Data from Kaggle - Jupyter Notebook

01. Install the Kaggle API

Run the statement below in a Jupyter notebook or in a terminal.

When running it in a terminal, adjust it for your platform: you may need to run "source activate fastai" or similar first, or prefix "pip" with a path. Have a look at how "conda install" is called for your platform in the appropriate Returning to work section of https://course.fast.ai/. Depending on your environment, you may also need to append "--user" to the command.

import sys
! {sys.executable} -m pip install kaggle --upgrade
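
To confirm that the install landed in the environment the notebook is actually using, a quick check along these lines may help. It is only a sketch: the helper name kaggle_install_status is made up for this example, and it relies on nothing beyond the Python standard library. It locates the package without importing it, since the credentials from the next step are not in place yet and importing kaggle may try to read them.

```python
import importlib.util
import shutil


def kaggle_install_status():
    """Report whether the kaggle package and its command-line tool are visible."""
    return {
        # True if the current interpreter can locate the package (no import happens)
        "package_importable": importlib.util.find_spec("kaggle") is not None,
        # Full path of the `kaggle` executable on PATH, or None if it is not found
        "cli_path": shutil.which("kaggle"),
    }


print(kaggle_install_status())
```

If package_importable is True but cli_path is None, the package was probably installed for a different environment than the one on PATH, which is the situation the "source activate" advice above addresses.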
           

02. Download your Kaggle credentials

Log in to your Kaggle account, click My Account, scroll down to Create New API Token, and click it to download the kaggle.json file.

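03. Upload the kaggle.json file

Click upload to place kaggle.json in the directory where the current Jupyter notebook lives, then run the first two commands below. (On Windows, uncomment and run the last two commands instead.)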

! mkdir -p ~/.kaggle/
! mv kaggle.json ~/.kaggle/

# For Windows, uncomment these two commands
# ! mkdir %userprofile%\.kaggle
# ! move kaggle.json %userprofile%\.kaggle
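
The same step can be done in plain Python so that one cell works on both Linux/macOS and Windows. This is a minimal sketch, not part of the Kaggle tooling: the function name install_kaggle_token and its home parameter are hypothetical, and the permission change is an addition to the original commands, included on the assumption that the Kaggle CLI warns when the token is readable by other users.

```python
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token="kaggle.json", home=None):
    """Move the API token into <home>/.kaggle and return its new location."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    target = target_dir / "kaggle.json"
    shutil.move(str(token), str(target))           # same effect as mv / move
    if os.name != "nt":
        # Owner read/write only (like chmod 600); skipped on Windows
        target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target
```

Calling install_kaggle_token() from the notebook's directory, after uploading kaggle.json there, moves the file to the .kaggle folder in your home directory.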
           

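04. Accept the competition rules

On Kaggle, open the competition whose data you want to download and click to accept the competition rules; otherwise the download will fail.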

05. Create a directory for the data and download

from fastai.vision import *  # fastai v1, as in the course notebooks; provides Config

path = Config.data_path()/'planet'
path.mkdir(parents=True, exist_ok=True)
path
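
If fastai is not installed, the directory can be created with the standard library alone. The sketch below uses a hypothetical helper, make_data_dir; its default root of ~/.fastai/data is an assumption meant to mirror where fastai v1 usually keeps data, so pass root explicitly if your setup differs.

```python
from pathlib import Path


def make_data_dir(name, root=None):
    """Create <root>/<name> if needed and return it as a Path."""
    # Assumed default: ~/.fastai/data, chosen to resemble Config.data_path()
    root = Path(root) if root is not None else Path.home() / ".fastai" / "data"
    path = root / name
    path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return path
```

For example, path = make_data_dir('planet') gives a Path that can be used with the {path} placeholders in the commands that follow.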
           

The commands below use the planet competition as an example.

! kaggle competitions download -c planet-understanding-the-amazon-from-space -f train-jpg.tar.7z -p {path}  
! kaggle competitions download -c planet-understanding-the-amazon-from-space -f train_v2.csv -p {path}  
! unzip -q -n {path}/train_v2.csv.zip -d {path}
           

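Adapt these commands to your own competition. The "kaggle competitions download -c planet-understanding-the-amazon-from-space" part can be copied from the API line on the Data tab of the competition page, which also lists the individual files.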
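
When the same download has to be repeated for several files or competitions, the command can be assembled in Python rather than edited by hand. The following is a sketch under stated assumptions: kaggle_download_command is a hypothetical helper, "data/planet" is a placeholder directory, and only the -c, -f and -p flags already used above are involved.

```python
import shlex


def kaggle_download_command(competition, path, filename=None):
    """Return the argument list for `kaggle competitions download`."""
    cmd = ["kaggle", "competitions", "download", "-c", competition]
    if filename is not None:
        cmd += ["-f", filename]   # fetch one file instead of the whole competition
    cmd += ["-p", str(path)]      # directory the download is written to
    return cmd


cmd = kaggle_download_command(
    "planet-understanding-the-amazon-from-space", "data/planet", "train_v2.csv"
)
print(shlex.join(cmd))
# prints: kaggle competitions download -c planet-understanding-the-amazon-from-space -f train_v2.csv -p data/planet
```

The returned list can be passed to subprocess.run(cmd, check=True), which raises an error if the download fails, for instance because the competition rules have not been accepted yet.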

06. Extract the files

! 7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path.as_posix()}
           

This dataset is compressed with 7zip. If the matching extraction program is not installed, install it first:

! conda install --yes --prefix {sys.prefix} -c haasad eidl7zip
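
Before installing anything, it may be worth checking whether a 7-Zip executable is already available. This is a small sketch with a hypothetical helper, find_7zip; the extraction command above uses 7za, and the other two names are common alternatives included here as assumptions.

```python
import shutil


def find_7zip(candidates=("7za", "7z", "7zr")):
    """Return the path of the first 7-Zip executable found on PATH, or None."""
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None


print(find_7zip())
```

If this prints None, run the install command above. If it finds an executable other than 7za, the extraction command would need that name substituted.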
           
