I have this script that processes some URLs in parallel:
import multiprocessing
import time

# Build the ordered list of URLs for pages 1..999.
list_of_urls = []
for i in range(1, 1000):
    list_of_urls.append('http://example.com/page=' + str(i))

def process_url(url):
    page_processed = url.split('=')[1]
    print('Processing page %s' % page_processed)
    time.sleep(5)  # simulate slow work

pool = multiprocessing.Pool(processes=4)
pool.map(process_url, list_of_urls)
The list is ordered, but when I run the script it does not pick URLs from the list in order:
Processing page 1
Processing page 64
Processing page 127
Processing page 190
Processing page 65
Processing page 2
Processing page 128
Processing page 191
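Instead, I want it to process pages 1, 2, 3, 4 first and then continue in list order. Is there an option to do this?
Solution:
If you do not pass the chunksize argument, map computes the chunk size with this algorithm: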
chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
if extra:
    chunksize += 1
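map cuts your iterable into task batches of that size and runs each batch in a separate process. Each worker therefore starts at the head of a different batch, which is why the pages are not processed in list order. The solution is to set chunksize to 1, so that items are handed to the workers one at a time in list order (the workers still run concurrently, so neighbouring lines can occasionally swap):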
import multiprocessing
import time

list_test = range(10)

def proces(task):
    print("task: %s" % task)
    time.sleep(1)

pool = multiprocessing.Pool(processes=3)
# chunksize=1 hands out one item per task, in list order.
pool.map(proces, list_test, chunksize=1)
task: 0
task: 1
task: 2
task: 3
task: 4
task: 5
task: 6
task: 7
task: 8
task: 9
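As a rough check of the explanation above, the chunk-size formula quoted in the answer can be applied to the question's numbers (999 URLs, 4 workers). The sketch below is a simplified model, not the Pool implementation: the helper names default_chunksize and split_into_chunks are made up for illustration, and it assumes the first four batches go to the four workers, whereas the real pool hands batches out through a queue.

```python
def default_chunksize(n_items, n_workers):
    """Chunk size from the formula quoted in the answer (no chunksize given)."""
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, chunksize):
    """Cut items into consecutive batches of at most chunksize elements."""
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


pages = list(range(1, 1000))  # the 999 page numbers from the question
workers = 4

size = default_chunksize(len(pages), workers)
default_chunks = split_into_chunks(pages, size)
single_chunks = split_into_chunks(pages, 1)

# First page of each of the first four batches (one per worker in this model).
default_first = [chunk[0] for chunk in default_chunks[:workers]]
single_first = [chunk[0] for chunk in single_chunks[:workers]]

print("default chunksize:", size)                  # 63
print("first pages, default:", default_first)      # [1, 64, 127, 190]
print("first pages, chunksize=1:", single_first)   # [1, 2, 3, 4]
```

The default case reproduces the first four lines of the output in the question (pages 1, 64, 127, 190), and with a chunk size of 1 the first four batches are pages 1, 2, 3, 4, which is the order the question asks for.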
Tags: python, multiprocessing, python-multiprocessing
Source: https://codeday.me/bug/20190627/1309220.html