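A few days ago I wrote a post about merging the results of concurrent tasks, and that reminded me of fork/join: as I see it, merging concurrent results is really just the join step of fork/join. So today I am writing down my own, fairly basic, understanding of fork/join.
1. What is fork/join
Oracle's official definition is: the fork/join framework is an implementation of the ExecutorService interface that helps you take advantage of multiple processors. It splits a large task into a number of smaller tasks that run concurrently, making full use of the available resources and thereby improving the application's performance.
Because fork/join implements ExecutorService, its tasks also run in a thread pool. What sets it apart is its work-stealing algorithm: idle threads can steal tasks from busy threads and help execute them. (My own understanding of work stealing, roughly: every worker thread in the pool has its own task queue, and the threads do not interfere with one another. Each thread keeps taking tasks from its own queue and running them. If at some point one thread's queue is empty while other threads still have tasks queued, that thread takes a task from another thread's queue and runs it, as if it had stolen someone else's work. In ForkJoinPool these queues are double-ended: a worker takes tasks from one end of its own queue, while a thief takes them from the opposite end.)
The core of the fork/join framework is the ForkJoinPool class, which extends AbstractExecutorService. It implements the work-stealing algorithm and executes ForkJoinTask instances.
Below is the original text of Oracle's official definition: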
The fork/join framework is an implementation of the ExecutorService
interface that helps you take advantage of multiple processors. It is
designed for work that can be broken into smaller pieces recursively.
The goal is to use all the available processing power to enhance the
performance of your application.
As with any ExecutorService implementation, the fork/join framework
distributes tasks to worker threads in a thread pool. The fork/join
framework is distinct because it uses a work-stealing algorithm. Worker
threads that run out of things to do can steal tasks from other
threads that are still busy.
The center of the fork/join framework is the ForkJoinPool class, an
extension of the AbstractExecutorService class. ForkJoinPool implements
the core work-stealing algorithm and can execute ForkJoinTask
processes.
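To make the work-stealing description above a little more concrete, here is a minimal sketch that runs a recursive task and then reads the pool's steal counter. The class name RangeSum, the threshold of 1,000 and the pool size of 4 are assumptions chosen for illustration; ForkJoinPool.getStealCount() is a real method, but the value it reports is only an estimate and will differ from run to run.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class: sums the integers in [from, to) by recursive splitting.
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cut-off, not a tuned value
    private final int from;
    private final int to;

    RangeSum(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: do the work directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        int middle = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, middle);
        RangeSum right = new RangeSum(middle, to);
        left.fork();                     // queue the left half; an idle worker may steal it
        long rightSum = right.compute(); // keep working on the right half in this thread
        return left.join() + rightSum;   // wait for the left half, whoever ran it
    }

    // Runs the whole computation on the given pool and returns the result.
    static long sum(int from, int to, ForkJoinPool pool) {
        return pool.invoke(new RangeSum(from, to));
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);
        System.out.println("sum = " + sum(0, 1_000_000, pool));
        // The steal count is an estimate and varies between runs.
        System.out.println("steals = " + pool.getStealCount());
        pool.shutdown();
    }
}
```

On a multi-core machine the steal count is usually greater than zero, which suggests that workers with empty queues picked up forked halves from busier workers.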
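2. Basic usage of fork/join
(1) The fork/join task class
As mentioned above, fork/join splits a large task into a number of smaller ones, so the first step is naturally to split the task. The general pattern is: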
if (the task is small enough) {
    do the work directly
} else {
    split the task into two smaller parts
    run the two parts and wait for their results
}
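To implement a ForkJoinTask we need a task class that extends RecursiveTask or RecursiveAction, with the logic above placed in its compute method and adapted to our own business needs. RecursiveTask and RecursiveAction both extend ForkJoinTask; the difference between them is that RecursiveTask returns a result and RecursiveAction does not. Below is a demo I wrote that selects the elements containing "a" from a list of strings: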
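@Override
protected List<String> compute() {
    // When the range between start and end is smaller than the threshold, do the actual filtering
    if (end - start < threshold) {
        List<String> temp = list.subList(start, end);
        return temp.parallelStream().filter(s -> s.contains("a")).collect(Collectors.toList());
    } else {
        // Otherwise split the big task into two smaller tasks
        int middle = (start + end) / 2;
        ForkJoinTest left = new ForkJoinTest(list, start, middle, threshold);
        ForkJoinTest right = new ForkJoinTest(list, middle, end, threshold);
        // Fork the left "small task" and compute the right one in the current thread,
        // so this worker stays busy instead of only waiting
        left.fork();
        List<String> rightResult = right.compute();
        // Merge the results of the two "small tasks" into a new list
        // (Collectors.toList() does not guarantee a mutable list)
        List<String> result = new ArrayList<>(left.join());
        result.addAll(rightResult);
        return result;
    }
}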
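The demo above uses RecursiveTask because it returns a list. For the other case mentioned earlier, a task with no return value, a minimal RecursiveAction sketch might look like the following. The class name UpperCaseAction, its helper method and the threshold of 4 are hypothetical and exist only for illustration; invokeAll is the static ForkJoinTask method that forks the given tasks and waits for all of them.

```java
import java.util.Locale;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example: upper-cases every element of a String array in place.
class UpperCaseAction extends RecursiveAction {
    private static final int THRESHOLD = 4; // illustrative cut-off
    private final String[] data;
    private final int start;
    private final int end;

    UpperCaseAction(String[] data, int start, int end) {
        this.data = data;
        this.start = start;
        this.end = end;
    }

    @Override
    protected void compute() {
        if (end - start <= THRESHOLD) {
            // Small enough: modify the slice directly; nothing is returned.
            for (int i = start; i < end; i++) {
                data[i] = data[i].toUpperCase(Locale.ROOT);
            }
            return;
        }
        int middle = start + (end - start) / 2;
        // Run both halves and wait until both have finished.
        invokeAll(new UpperCaseAction(data, start, middle),
                  new UpperCaseAction(data, middle, end));
    }

    // Convenience entry point that runs the action on the common pool.
    static void upperCaseAll(String[] data) {
        ForkJoinPool.commonPool().invoke(new UpperCaseAction(data, 0, data.length));
    }
}
```

Because the two halves write to disjoint index ranges of the same array, no extra synchronization is needed in this sketch.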
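(2) The caller
With the task class in place we can start calling it. First we need a fork/join thread pool, ForkJoinPool; then we submit a ForkJoinTask to the pool and obtain the result. The submit method of ForkJoinPool takes a ForkJoinTask and also returns a ForkJoinTask, whose get method returns the result of the computation.
The code is as follows: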
ForkJoinPool pool = new ForkJoinPool();
// Submit the splittable ForkJoinTask
ForkJoinTask<List<String>> future = pool.submit(forkJoinService);
System.out.println(future.get());
// Shut down the thread pool
pool.shutdown();
And with that we have completed a simple fork/join program.
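As a side note on the calling step, submit followed by get is not the only way to run a task. The sketch below compares it with ForkJoinPool.invoke, which submits the task and waits for the result in a single call, here on the shared ForkJoinPool.commonPool(). The classes CountWithA and SubmitVersusInvoke, the threshold of 4 and the choice to wrap checked exceptions are assumptions made for this illustration, not part of the original demo.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: counts the elements of list[start, end) that contain "a".
class CountWithA extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 4; // illustrative cut-off
    private final List<String> list;
    private final int start;
    private final int end;

    CountWithA(List<String> list, int start, int end) {
        this.list = list;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Integer compute() {
        if (end - start <= THRESHOLD) {
            int count = 0;
            for (int i = start; i < end; i++) {
                if (list.get(i).contains("a")) {
                    count++;
                }
            }
            return count;
        }
        int middle = start + (end - start) / 2;
        CountWithA left = new CountWithA(list, start, middle);
        CountWithA right = new CountWithA(list, middle, end);
        left.fork();
        int rightCount = right.compute();
        return left.join() + rightCount;
    }
}

class SubmitVersusInvoke {
    // submit() returns at once; get() blocks and reports failures as checked exceptions.
    static int viaSubmit(List<String> words) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            ForkJoinTask<Integer> future = pool.submit(new CountWithA(words, 0, words.size()));
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // invoke() submits the task and waits for its result in one call.
    static int viaInvoke(List<String> words) {
        return ForkJoinPool.commonPool().invoke(new CountWithA(words, 0, words.size()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "ba", "cd", "xa", "yy", "zz", "aa", "q");
        System.out.println(viaSubmit(words));
        System.out.println(viaInvoke(words));
    }
}
```

Each call creates a fresh task object, since a ForkJoinTask records the outcome of a single run. The common pool is shared across the JVM and does not need to be shut down by the caller.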
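Note: in Java 8, the parallelSort() methods of java.util.Arrays and the methods of the java.util.stream package are also built on fork/join. (Attentive readers may have noticed that I use a stream inside the fork/join task as well. Strictly speaking that makes the fork/join part redundant, because parallel streams already use fork/join; but this is only a demo with no practical use, so it does not matter.)
Quoting the official text: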
One such implementation, introduced in Java SE 8, is used by the
java.util.Arrays class for its parallelSort() methods. These methods are
similar to sort(), but leverage concurrency via the fork/join
framework. Parallel sorting of large arrays is faster than sequential
sorting when run on multiprocessor systems.
Another implementation of the fork/join framework is used by methods
in the java.util.streams package, which is part of Project Lambda
scheduled for the Java SE 8 release.
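The two built-in uses mentioned above can be tried without writing any task class. The following is a small sketch; the class name BuiltInForkJoin and its helper methods are hypothetical, while Arrays.parallelSort and Collection.parallelStream are standard JDK APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class showing the two JDK features that use fork/join internally.
class BuiltInForkJoin {
    // Returns a sorted copy; parallelSort may fall back to a sequential sort for small arrays.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    // Same filter as the demo above, expressed directly as a parallel stream.
    static List<String> withA(List<String> words) {
        return words.parallelStream()
                .filter(s -> s.contains("a"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {5, 3, 1, 4, 2})));
        System.out.println(withA(Arrays.asList("a", "ah", "b", "ba", "sd")));
    }
}
```

Because the source list is ordered, the parallel stream keeps the elements in their original order in the collected result.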
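The complete code is attached below for future reference:
1. Define an abstract class (intended for extension; it serves no real purpose in this example and can be omitted):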
import java.util.concurrent.RecursiveTask;

/**
 * Description: ForkJoin interface
 * Designer: jack
 * Date: 2017/8/3
 * Version: 1.0.0
 */
public abstract class ForkJoinService<T> extends RecursiveTask<T> {
    @Override
    protected abstract T compute();
}
2. Define the task class
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Description: ForkJoin task class
 */
public class ForkJoinTest extends ForkJoinService<List<String>> {
    private final int threshold;     // threshold
    private final List<String> list; // list to be split

    private ForkJoinTest(List<String> list, int threshold) {
        this.list = list;
        this.threshold = threshold;
    }

    @Override
    protected List<String> compute() {
        // When the list is smaller than the threshold (or cannot be split further), do the actual filtering
        if (list.size() < threshold || list.size() <= 1) {
            return list.parallelStream().filter(s -> s.contains("a")).collect(Collectors.toList());
        } else {
            // Otherwise split the big task into two smaller tasks
            int middle = list.size() / 2;
            List<String> leftList = list.subList(0, middle);
            List<String> rightList = list.subList(middle, list.size());
            ForkJoinTest left = new ForkJoinTest(leftList, threshold);
            ForkJoinTest right = new ForkJoinTest(rightList, threshold);
            // Fork the left "small task" and compute the right one in the current thread
            left.fork();
            List<String> rightResult = right.compute();
            // Merge the results of the two "small tasks" into a new list
            List<String> result = new ArrayList<>(left.join());
            result.addAll(rightResult);
            return result;
        }
    }

    /**
     * Get a ForkJoinTest instance
     * @param list the list to process
     * @param threshold the threshold
     * @return a ForkJoinTest instance
     */
    public static ForkJoinService<List<String>> getInstance(List<String> list, int threshold) {
        // A ForkJoinTask holds the state of a single run, so a fresh task is returned on every call
        // instead of a cached singleton that would ignore later arguments
        return new ForkJoinTest(list, threshold);
    }
}
3. The caller
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

/**
 * Description: fork/join caller
 */
public class Test {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        String[] strings = {"a", "ah", "b", "ba", "ab", "ac", "sd", "fd", "ar", "te", "se", "te",
                "sdr", "gdf", "df", "fg", "gh", "oa", "ah", "qwe", "re", "ty", "ui"};
        List<String> stringList = new ArrayList<>(Arrays.asList(strings));
        ForkJoinPool pool = new ForkJoinPool();
        ForkJoinService<List<String>> forkJoinService = ForkJoinTest.getInstance(stringList, 20);
        // Submit the splittable ForkJoinTask
        ForkJoinTask<List<String>> future = pool.submit(forkJoinService);
        System.out.println(future.get());
        // Shut down the thread pool
        pool.shutdown();
    }
}
Author: 珂jack
Source: 51CTO