Scrapy Study Notes: Scrapyd Deploy

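Deploying your project involves packaging it as a Python egg ("eggifying" it) and uploading the egg to Scrapyd through the addversion.json endpoint. You can do this manually, but the easiest way is to use the scrapyd-deploy tool provided by scrapyd-client, which does all of this for you.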

API

daemonstatus.json

Checks the load status of the service. Supported request method: GET. Example:

$ curl http://localhost:6800/daemonstatus.json

Output:

{ "status": "ok", "running": "0", "pending": "0", "finished": "0", "node_name": "node-name" }
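
As a rough illustration of consuming this response, the sketch below parses a daemonstatus.json body into a dict with integer counts. The function name parse_daemon_status is hypothetical, and the input is simply the sample output shown above rather than a live response.

```python
import json


def parse_daemon_status(body: str) -> dict:
    """Parse a daemonstatus.json response body into a dict with integer counts."""
    data = json.loads(body)
    if data.get("status") != "ok":
        raise RuntimeError(f"Scrapyd reported a problem: {data}")
    # The sample output shows the counts as strings; int() accepts both
    # strings such as "0" and real integers, so either form works here.
    return {
        "node_name": data.get("node_name"),
        "running": int(data["running"]),
        "pending": int(data["pending"]),
        "finished": int(data["finished"]),
    }


# Sample body copied from the output above (not a live response).
sample = ('{ "status": "ok", "running": "0", "pending": "0", '
          '"finished": "0", "node_name": "node-name" }')
status = parse_daemon_status(sample)
print(status)
```

Normalizing the counts to integers up front keeps later comparisons (for example, "is anything still running?") free of string handling.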

addversion.json

Adds a version to a project, creating the project if it does not exist. Parameters: project (string, required) - the project name; version (string, required) - the project version; egg (file, required) - a Python egg containing the project's code.

Example:

$ curl http://localhost:6800/addversion.json -F project=myproject -F version=r23 -F egg=@myproject.egg

Output:

{"status": "ok", "spiders": 3}

schedule.json

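Schedules a spider run (a job) and returns the job id. Supported request method: POST. Parameters: project (string, required) - the project name; spider (string, required) - the spider name; setting (string, optional) - a Scrapy setting to use when running the spider; jobid (string, optional) - a job id used to identify the job, overriding the default generated UUID; _version (string, optional) - the version of the project to use. Any other parameter is passed to the spider as a spider argument. Example:

$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider

Output:

{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}

Example that also passes a setting and a spider argument:

$ curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1

Spiders scheduled with scrapyd should accept an arbitrary number of keyword arguments, because scrapyd sends internally generated spider arguments to the spider being scheduled.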
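
The curl calls above can be approximated from Python's standard library. The sketch below only builds the POST request and does not send it; build_schedule_request is a hypothetical helper, and emitting one setting=NAME=VALUE field per setting is an assumption extrapolated from the single-setting curl example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_request(base_url, project, spider, settings=None,
                           jobid=None, version=None, **spider_args):
    """Build (but do not send) a POST request for schedule.json."""
    fields = [("project", project), ("spider", spider)]
    # Assumption: each Scrapy setting becomes its own "setting=NAME=VALUE" field.
    for name, value in (settings or {}).items():
        fields.append(("setting", f"{name}={value}"))
    if jobid is not None:
        fields.append(("jobid", jobid))
    if version is not None:
        fields.append(("_version", version))
    # Any remaining keyword is forwarded as a spider argument.
    fields.extend(spider_args.items())
    data = urlencode(fields).encode("ascii")
    return Request(f"{base_url}/schedule.json", data=data, method="POST")


req = build_schedule_request(
    "http://localhost:6800", "myproject", "somespider",
    settings={"DOWNLOAD_DELAY": 2}, arg1="val1",
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
# To actually send it, pass req to urllib.request.urlopen().
```

Note that urlencode percent-encodes the "=" inside the setting value (DOWNLOAD_DELAY%3D2), which the server decodes back when it reads the form body.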

cancel.json

Cancels a spider run (a job). If the job is pending, it is removed. If the job is running, it is terminated. Supported request method: POST. Parameters: project (string, required) - the project name; job (string, required) - the job id. Example:

$ curl http://localhost:6800/cancel.json -d project=myproject -d job=6487ec79947edab326d6db28a2d86511e8247444

Output:

{"status": "ok", "prevstate": "running"}
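
One way to act on the prevstate field is sketched below. describe_cancel is a hypothetical helper; it maps the two states described above to a message and passes any other value through unchanged, since this page does not document other values.

```python
import json


def describe_cancel(body: str) -> str:
    """Turn a cancel.json response body into a human-readable message."""
    data = json.loads(body)
    if data.get("status") != "ok":
        raise RuntimeError(f"cancel.json failed: {data}")
    prev = data.get("prevstate")
    if prev == "running":
        return "job was running and has been terminated"
    if prev == "pending":
        return "job was pending and has been removed"
    # Other values are not covered by these notes, so report them as-is.
    return f"job was neither pending nor running (prevstate={prev!r})"


message = describe_cancel('{"status": "ok", "prevstate": "running"}')
print(message)
```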

listprojects.json

Gets the list of projects uploaded to the Scrapyd server. Supported request method: GET. It takes no parameters. Example:

$ curl http://localhost:6800/listprojects.json

Output:

{"status": "ok", "projects": ["myproject", "otherproject"]}

listversions.json

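Gets the list of versions available for a project. The versions are returned in order, and the last one is the version currently in use. Supported request method: GET. Parameters: project (string, required) - the project name. Example:

$ curl http://localhost:6800/listversions.json?project=myproject

Output:

{"status": "ok", "versions": ["r99", "r156"]}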
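
Because the list is ordered with the current version last, picking it out is a one-liner. The sketch below assumes the response shape shown in the sample output; current_version is a hypothetical helper name.

```python
import json


def current_version(body: str):
    """Return the last entry of "versions", or None if the project has none."""
    data = json.loads(body)
    if data.get("status") != "ok":
        raise RuntimeError(f"listversions.json failed: {data}")
    versions = data.get("versions", [])
    # The endpoint returns versions in order, with the current one last.
    return versions[-1] if versions else None


latest = current_version('{"status": "ok", "versions": ["r99", "r156"]}')
print(latest)
```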

listspiders.json

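Gets the list of spiders available in the latest version of a project, unless another version is requested. Supported request method: GET. Parameters: project (string, required) - the project name; _version (string, optional) - the version of the project to examine. Example:

$ curl http://localhost:6800/listspiders.json?project=myproject

Output:

{"status": "ok", "spiders": ["spider1", "spider2", "spider3"]}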
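
For the GET endpoints, the parameters go in the query string. The sketch below builds the listspiders.json URL with and without the optional _version parameter; listspiders_url is a hypothetical helper and nothing is sent over the network.

```python
from urllib.parse import urlencode


def listspiders_url(base_url: str, project: str, version=None) -> str:
    """Build the listspiders.json URL, adding _version only when given."""
    params = {"project": project}
    if version is not None:
        params["_version"] = version
    # urlencode escapes characters that are unsafe in a query string.
    return f"{base_url}/listspiders.json?{urlencode(params)}"


default_url = listspiders_url("http://localhost:6800", "myproject")
pinned_url = listspiders_url("http://localhost:6800", "myproject", version="r99")
print(default_url)
print(pinned_url)
```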

listjobs.json

Gets the list of pending, running, and finished jobs of a project. Supported request method: GET. Parameters: project (string, required) - the project name. Example:

$ curl http://localhost:6800/listjobs.json?project=myproject

Output:

{"status": "ok", "pending": [{"id": "78391cc0fcaf11e1b0090800272a6d06", "spider": "spider1"}], "running": [{"id": "422e608f9f28cef127b3d5ef93fe9399", "spider": "spider2", "start_time": "2012-09-12 10:14:03.594664"}], "finished": [{"id": "2f16646cfcaf11e1b0090800272a6d06", "spider": "spider3", "start_time": "2012-09-12 10:14:03.594664", "end_time": "2012-09-12 10:24:03.594664"}]}

Note: all job data is kept in memory and is reset when the Scrapyd service restarts.
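
A listjobs.json response can be reduced to per-state counts and run times for finished jobs, as in the sketch below. summarize_jobs and TIME_FORMAT are hypothetical names, and the timestamp format is an assumption inferred from the sample output above.

```python
import json
from datetime import datetime

# Assumption: timestamps look like "2012-09-12 10:14:03.594664".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def summarize_jobs(body: str):
    """Return (count per state, run time in seconds per finished job id)."""
    data = json.loads(body)
    counts = {state: len(data.get(state, []))
              for state in ("pending", "running", "finished")}
    durations = {}
    for job in data.get("finished", []):
        start = datetime.strptime(job["start_time"], TIME_FORMAT)
        end = datetime.strptime(job["end_time"], TIME_FORMAT)
        durations[job["id"]] = (end - start).total_seconds()
    return counts, durations


# Shortened version of the sample output above.
sample = json.dumps({
    "status": "ok",
    "pending": [{"id": "78391cc0fcaf11e1b0090800272a6d06", "spider": "spider1"}],
    "running": [{"id": "422e608f9f28cef127b3d5ef93fe9399", "spider": "spider2",
                 "start_time": "2012-09-12 10:14:03.594664"}],
    "finished": [{"id": "2f16646cfcaf11e1b0090800272a6d06", "spider": "spider3",
                  "start_time": "2012-09-12 10:14:03.594664",
                  "end_time": "2012-09-12 10:24:03.594664"}],
})
counts, durations = summarize_jobs(sample)
print(counts)
print(durations)
```

Since the job data lives only in memory, a summary like this reflects jobs seen since the last Scrapyd restart, not the full history.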

delversion.json

delproject.json