天天看點

Python 對基數排序、計數排序、桶排序的比較

前面對比了各種比較排序算法在面對百萬級資料時所表現的性能,結果是連差強人意都說不上。是以又比較了内排序中的一些

非比較排序算法,來比較下它們面對百萬級資料時的性能

#coding=utf-8
import random,time
'''Author:Zhanglongqi
   Date:2018-3-24
   email:[email protected]
   Environment:Python 3,Win8.1
   Descrip:比較桶排序、基數排序、計數排序處理百萬級資料量時的時間性能
'''
def counting_sort(A,B,k):
    C=[]
    for i in range(k):
        C.append(0)
    for j in range(len(A)):
        C[A[j]]=C[A[j]]+1
    for i in range(1,k):
        C[i]=C[i]+C[i-1]
    for i in range(len(A)-1,-1,-1):
        B[C[A[i]]]=A[i]
        C[A[i]]=C[A[i]]-1
    B.pop(0)
    return B
def radix_sort(array):
    bucket, digit = [[]], 0
    while len(bucket[0]) != len(array):
        bucket = [[], [], [], [], [], [], [], [], [], []]
        for i in range(len(array)):
            num = (array[i] // 10 ** digit) % 10
            bucket[num].append(array[i])
        array.clear()
        for i in range(len(bucket)):
            array += bucket[i]
        digit += 1
    return array
class bucketSort(object):
    def _max(self,oldlist):
        _max=oldlist[0]
        for i in oldlist:
            if i>_max:
                _max=i
        return _max
    def _min(self,oldlist):
        _min=oldlist[0]
        for i in oldlist:
            if i<_min:
                _min=i
        return _min
    def sort(self,oldlist):
        _max=self._max(oldlist)
        _min=self._min(oldlist)
        s=[0 for i in range(_min,_max+1)]
        for i in oldlist:
            s[i-_min]+=1
        current=_min
        n=0
        for i in s:
            while i>0:
                oldlist[n]=current
                i-=1
                n+=1
            current+=1
    def __call__(self,oldlist):
        self.sort(oldlist)
        return oldlist
if __name__=='__main__':
    nums=[]
    Lis=[]
    C=[]
    D=[]
    MAX=10000000
    N=1000000
    for _ in range(N):
        n=random.randint(0,MAX)
        nums.append(n)
    C=nums.copy()
    D=nums.copy()
    for _ in range(N+1):
        Lis.append(0)
    start1=time.clock()
    C=radix_sort(C)
    end1=time.clock()
    print('radix time is',end1-start1)

    start2=time.clock()
    Lis=counting_sort(nums,Lis,MAX)
    end2=time.clock()
    print('counting time is',end2-start2)
    
    start=time.clock()
    bucketSort()(D)
    end=time.clock()
    print('bucket time is',end-start)

    if Lis==C and C==D:
        print('True')
    else:
        print('False')
           

Demon:

Python 對基數排序、計數排序、桶排序的比較

可以發現桶排序最快,而且桶排序在單跑百萬資料時隻要不到2秒:

Python 對基數排序、計數排序、桶排序的比較

但是,實際使用時還得根據各種排序算法的時空開銷和穩定性特點來選擇。

Python 對基數排序、計數排序、桶排序的比較