Stats Collection
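Scrapy provides a convenient facility for collecting stats in the form of key/value pairs, where the values are often counters. The facility is called the Stats Collector, and it can be accessed through the stats attribute of the Crawler API, as illustrated by the examples in the Common Stats Collector uses section below.
The Stats Collector is always available, so you can always import it and use its API (to increment or set stat keys), regardless of whether stats collection is enabled. If it is disabled, the API still works but collects nothing. This keeps usage simple: collecting a stat should take no more than one line of code in your spider, Scrapy extension, or whatever code uses the Stats Collector.
The Stats Collector is also very efficient when enabled, and extremely efficient (almost unnoticeable) when disabled.
The Stats Collector keeps one stats table per open spider. The table is opened automatically when the spider is opened and closed when the spider is closed.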
Common Stats Collector uses
Access the Stats Collector through the stats attribute. Here is an example of an extension that accesses stats:
class ExtensionThatAccessStats:

    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.stats)
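To illustrate the same from_crawler pattern in another kind of component, here is a minimal sketch of an item pipeline that bumps a counter for every item it sees. The names ItemCountPipeline, custom/items_seen and _StandInStats are hypothetical. The stand-in stats class exists only so the snippet can run without a crawl; in a real project crawler.stats is supplied by Scrapy.

```python
from types import SimpleNamespace


class ItemCountPipeline:
    """Hypothetical pipeline: counts every item passed through it."""

    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Same access pattern as the extension above: take stats from the crawler.
        return cls(crawler.stats)

    def process_item(self, item, spider):
        # One line of stats code per event.
        self.stats.inc_value("custom/items_seen")
        return item


class _StandInStats:
    """Tiny stand-in for crawler.stats, for demonstration only (not Scrapy code)."""

    def __init__(self):
        self.values = {}

    def inc_value(self, key, count=1, start=0):
        self.values[key] = self.values.get(key, start) + count


# Demonstration with a fake crawler object instead of a real crawl.
fake_crawler = SimpleNamespace(stats=_StandInStats())
pipeline = ItemCountPipeline.from_crawler(fake_crawler)
for item in ({"title": "a"}, {"title": "b"}):
    pipeline.process_item(item, spider=None)
print(fake_crawler.stats.values)
```

Run as-is, the sketch records two items under the custom/items_seen key. The component only ever touches the stats object it was handed, which is why it can be exercised with a stand-in.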
Set stat value:
stats.set_value('hostname', socket.gethostname())
Increment stat value:
stats.inc_value('custom_count')
Set stat value only if greater than previous:
stats.max_value('max_items_scraped', value)
Set stat value only if lower than previous:
stats.min_value('min_free_memory_percent', value)
Get stat value:
>>> stats.get_value('custom_count')
1
Get all stats:
>>> stats.get_stats()
{'custom_count': 1, 'start_time': datetime.datetime(2009, 7, 14, 21, 47, 28, 977139)}
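The semantics of these calls can be summarized with a small stand-in class. This is a sketch written for illustration, assuming the behavior described above; it is not Scrapy's implementation, and the name SketchStats is hypothetical.

```python
class SketchStats:
    """Illustrative stand-in mirroring the stats calls shown above."""

    def __init__(self):
        self._stats = {}

    def set_value(self, key, value):
        self._stats[key] = value

    def inc_value(self, key, count=1, start=0):
        # A missing key starts from `start` before being incremented.
        self._stats[key] = self._stats.get(key, start) + count

    def max_value(self, key, value):
        # Keep the larger of the stored value and the new one.
        self._stats[key] = max(self._stats.get(key, value), value)

    def min_value(self, key, value):
        # Keep the smaller of the stored value and the new one.
        self._stats[key] = min(self._stats.get(key, value), value)

    def get_value(self, key, default=None):
        return self._stats.get(key, default)

    def get_stats(self):
        return self._stats


stats = SketchStats()
stats.inc_value("custom_count")
stats.max_value("max_items_scraped", 10)
stats.max_value("max_items_scraped", 7)         # ignored: 7 is not greater than 10
stats.min_value("min_free_memory_percent", 40)
stats.min_value("min_free_memory_percent", 25)  # kept: 25 is lower than 40
print(stats.get_stats())
```

In this sketch the first max_value or min_value call on a key simply stores the value, since there is nothing to compare it against yet.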
Available Stats Collectors
Besides the basic StatsCollector, Scrapy ships other Stats Collectors that extend it. You can select which one to use through the STATS_CLASS setting. The default Stats Collector is the MemoryStatsCollector.
MemoryStatsCollector
class scrapy.statscollectors.MemoryStatsCollector
A simple Stats Collector that keeps the stats of the last scraping run (for each spider) in memory after the spider is closed. The stats can be accessed through the spider_stats attribute, which is a dict keyed by spider name. This is the default Stats Collector.
spider_stats
A dict of dicts (keyed by spider name) containing the stats of the last scraping run for each spider.
DummyStatsCollector
class scrapy.statscollectors.DummyStatsCollector
A Stats Collector that does nothing, and is very efficient because of it. It can be selected through the STATS_CLASS setting to disable stats collection and improve performance. However, the cost of stats collection is usually marginal compared to the rest of the Scrapy workload, such as parsing pages.
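As a configuration sketch, disabling stats collection amounts to pointing STATS_CLASS at the class path given above. The file name settings.py assumes the usual project settings module.

```python
# settings.py (project settings module)
# Swap the default collector for the no-op one to turn stats collection off.
STATS_CLASS = "scrapy.statscollectors.DummyStatsCollector"
```

Code that calls the stats API does not need to change when this setting is switched, since the API keeps working and simply collects nothing.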
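Sending e-mail
Although Python makes sending e-mail relatively easy through the smtplib library, Scrapy provides its own facility for sending e-mail. It is easy to use and is implemented with Twisted non-blocking IO, so it does not interfere with the non-blocking IO of the crawler. It also provides a simple API for sending attachments and is easy to configure with a few settings.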
Quick example
There are two ways to instantiate the mail sender. You can instantiate it with the standard __init__ method:
from scrapy.mail import MailSender
mailer = MailSender()
Or you can instantiate it by passing a Scrapy settings object, so that the mail settings are respected:
mailer = MailSender.from_settings(settings)
And here is how to use it to send an e-mail (without attachments):
mailer.send(to=["someone@example.com"], subject="Some subject", body="Some body", cc=["another@example.com"])
MailSender class reference
MailSender is the preferred class for sending e-mail from Scrapy, as it uses Twisted non-blocking IO, like the rest of the framework.
class scrapy.mail.MailSender(smtphost=None, mailfrom=None, smtpuser=None, smtppass=None, smtpport=None)
Parameters:
- smtphost (str or bytes) – the SMTP host to use for sending e-mail. If omitted, the MAIL_HOST setting is used.
- mailfrom (str) – the address used to send e-mail (in the From: header). If omitted, the MAIL_FROM setting is used.
- smtpuser – the SMTP user. If omitted, the MAIL_USER setting is used. If not given, no SMTP authentication is performed.
- smtppass (str or bytes) – the SMTP password for authentication.
- smtpport (int) – the SMTP port to connect to.
- smtptls (boolean) – enforce using SMTP STARTTLS.
- smtpssl (boolean) – enforce using a secure SSL connection.
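classmethod from_settings(settings)
Instantiate using a Scrapy settings object, which will respect the settings listed in the Mail settings section.
Parameters: settings (scrapy.settings.Settings object) – the Scrapy settings object from which the mail settings are read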
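send(to, subject, body, cc=None, attachs=(), mimetype='text/plain', charset=None)
Send an e-mail to the given recipients.
Parameters:
- to (str or list of str) – the e-mail recipients
- subject (str) – the subject of the e-mail
- cc (str or list of str) – the addresses to CC
- body (str) – the e-mail body
- attachs (iterable) – an iterable of tuples (attach_name, mimetype, file_object), where attach_name is a string with the name that will appear on the e-mail's attachment, mimetype is the MIME type of the attachment, and file_object is a readable file object with the contents of the attachment
- mimetype (str) – the MIME type of the e-mail
- charset (str) – the character encoding to use for the e-mail contents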
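The shape of the attachs argument is easier to see in code. The following is a minimal sketch, assuming a mailer object with the send() signature documented above. The helper send_report, the file name report.csv, and the _RecordingMailer stand-in are hypothetical; the stand-in only records the call so the snippet runs without an SMTP server.

```python
import io


def send_report(mailer, recipients, csv_bytes):
    """Send csv_bytes as an attachment through a MailSender-like object."""
    # Each attachment is an (attach_name, mimetype, file_object) tuple.
    attachment = ("report.csv", "text/csv", io.BytesIO(csv_bytes))
    return mailer.send(
        to=recipients,
        subject="Crawl report",
        body="The report is attached.",
        attachs=[attachment],
    )


class _RecordingMailer:
    """Stand-in for MailSender that records calls (demonstration only)."""

    def __init__(self):
        self.calls = []

    def send(self, to, subject, body, cc=None, attachs=(),
             mimetype="text/plain", charset=None):
        self.calls.append({"to": to, "subject": subject, "attachs": list(attachs)})


mailer = _RecordingMailer()
send_report(mailer, ["someone@example.com"], b"url,status\nhttps://example.com,200\n")
name, mimetype, file_object = mailer.calls[0]["attachs"][0]
print(name, mimetype, file_object.read())
```

With a real MailSender instance in place of the stand-in, the same helper would hand the in-memory file to send() unchanged; any readable file object should work in place of io.BytesIO.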
Mail settings
These settings define the default __init__ values of the MailSender class. They can be used to configure e-mail notifications in your project without writing any code (for those extensions and code that use MailSender).
MAIL_FROM
Default: 'scrapy@localhost'
Sender e-mail address to use (From: header) for sending e-mail.
MAIL_HOST
Default: 'localhost'
SMTP host to use for sending e-mail.
MAIL_PORT
Default: 25
SMTP port to use for sending e-mail.
MAIL_USER
Default: None
User to use for SMTP authentication. If disabled, no SMTP authentication is performed.
MAIL_PASS
Default: None
Password to use for SMTP authentication, along with MAIL_USER.
MAIL_TLS
Default: False
Enforce using STARTTLS. STARTTLS is a way to take an existing insecure connection and upgrade it to a secure connection using SSL/TLS.
MAIL_SSL
Default: False
Enforce connecting using an SSL encrypted connection.