利用堆排序(heapify)和快排(partition)找到top-k個數 java實作

給出一個整數數組m，找到n個最大的數，其中m>=n.

方法一：借助堆排序思想

優點：借助一個n的大小的數組空間，不需要一次性将m全部加載到記憶體

缺點：每個進入堆的資料都需要經過調整

package com.jimmy.lib;

import java.util.Arrays;

public class MyClass {

    public static void main(String[] args) {
        int m[] = {3, 55, 2, 7, 8, 5, 43, 98, 1, 66, 35, 46, 234, 65};
        int n[] = new int[5];
        System.arraycopy(m, 0, n, 0, n.length);

        // 1. 建立堆
        // 由于是要找到最大的數,我們則需要建立一個最小堆
        for (int i = n.length / 2 - 1; i >= 0; i--) {
            heapify(n, i);
        }

        for (int i = n.length; i < m.length; i++) {
        	// 每次隻要比較最小堆頂部的數值
        	if (n[0] < v) {
            	// 如果比頂部的數值大，則替換頂部數值，然後重新調整堆
            	n[0] = v;
            	heapify(n, 0);
            	System.out.println("adjust " + v + " -> " + Arrays.toString(n));
        	}
        }

        System.out.println(Arrays.toString(n));
    }

	/**
 	* 堆調整
 	*/
    private static void heapify(int n[], int i) {
        // 2. 記住左子樹和右子樹的計算方式
        int left = 2 * i + 1;
        int right = left + 1;
        int root = n[i];

        // 3. 判斷左子樹是否存在，判斷右子樹是否存在，比較兩者大小，和根的大小比較
        // 4. 如果左邊和右邊都是正常的，那就直接break循環即可
        while (left < n.length) {
            if (right < n.length && n[right] < n[left] && n[right] < root) {
                n[i] = n[right];
                i = right;
            } else if (n[left] < root) {
                n[i] = n[left];
                i = left;
            } else {
                break;
            }
            n[i] = root;
            left = 2 * i + 1;
            right = left + 1;
        }

    }
}

輸出結果

adjust 5 -> [3, 7, 5, 55, 8]
adjust 43 -> [5, 7, 43, 55, 8]
adjust 98 -> [7, 8, 43, 55, 98]
adjust 66 -> [8, 55, 43, 66, 98]
adjust 35 -> [35, 55, 43, 66, 98]
adjust 46 -> [43, 55, 46, 66, 98]
adjust 234 -> [46, 55, 234, 66, 98]
adjust 65 -> [55, 65, 234, 66, 98]

[55, 65, 234, 66, 98]

主要思路

先将m的數值填充到n
把n建立成最小堆
周遊m剩下的數值，放入最小堆n中
每次進行調整即可

這個不是完整的堆排序，隻是借助了堆排序的思想和方法。

方法二：借助快排思想

優點：比堆排序的速度更快，隻需要對部分的資料進行互動即可

缺點：需要将所有的資料都一次性加載到記憶體，有時也要對所有的資料進行排序

package com.jimmy.lib;

import java.util.Arrays;

public class TopKQuick {

    public static void main(String[] args) {
        int m[] = {3, 55, 2, 7, 8, 5, 43, 98, 1, 66, 35, 46, 234, 65};
        int n = 3;

        quickSortForTopK(m, n, 0, m.length - 1);

        System.out.println(Arrays.toString(m));
    }

    private static void quickSortForTopK(int[] m, int n, int low, int high) {
        int pivot;
        if (low < high) {
            pivot = partition(m, low, high);
            if (n == pivot || n - 1 == pivot) {
                System.out.println(pivot);
                return;
            }
            quickSortForTopK(m, n, low, pivot - 1);
            quickSortForTopK(m, n, pivot + 1, high);
        }
    }
    
	/**
	 * 從大到小的partition
	 */
    private static int partition(int[] m, int low, int high) {
        int pivotKey = m[low];
        while (low < high) {
            while (low < high && pivotKey >= m[high])
                high--;
            m[low] = m[high];
            
            while (low < high && pivotKey <= m[low])
                low++;
            m[high] = m[low];
        }
        m[low] = pivotKey;
        return low;
    }
}

輸出結果

3
[66, 98, 234, 65, 8, 5, 43, 7, 46, 55, 35, 3, 2, 1]

主要思路

完整的從大到小的快排
增加判斷 pivot 和 n 是否相等的目的是為了減少排序次數

因為當n或者n-1為pivot的時候，那麼就意味着前面的n或n-1個數已經是最大的了，接下來不需要在進行排序了

利用堆排序(heapify)和快排(partition)找到top-k個數 java實作

方法一：借助堆排序思想

方法二：借助快排思想

繼續閱讀

資料結構與算法（27）——排序（二）

27 Best Free Eclipse Plug-ins for Java Developer to be ProductiveCode Quality PluginsText Editor PluginsDependency ManagementVersion Control Integration PluginsFramework Development Continuous Integration Related PluginsOther Utility Plugins

Java String.format方法的簡單使用

neo4j之cypher使用文檔

Dijkstra--簡易版（最短路徑）

GitHub連夜封殺！這份阿裡 10W 字内部 Java 字面試手冊到底有多強？

spark/scala關于【資源檔案】加載方法概述外部檔案加載方案測試資源檔案打包入jar包中小結

mybatis_入門程式Mybatis入門

AOP程式設計_Android優雅權限架構(1)概念基礎，2021金三銀四前言正文大綱正文

Effective Java 8:通用程式設計

OOM三種類型

工廠模式-三種類型

【遞歸】高效率求2的n次幂

win10本地scala和spark安裝安裝scala安裝spark

scala (3) Function 和 Method

hdu7108哈希