How fork/join is implemented

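Time to reheat an old topic again. Fork/join is a concurrency framework introduced in JDK 1.7. It splits a large task into multiple subtasks that run in parallel, then merges the subtask results into the final result, so that work is spread across threads and processed efficiently.

1. Some thoughts on fork/join

Put that way it may not mean much, but another framing makes it more tangible. Overall it amounts to a map phase that splits the data and a reduce phase that collects it: a MapReduce process, with a hint of big-data thinking. The differences are that in fork/join the splitting is explicit and harder (you split by hand, whereas a MapReduce framework partitions and shuffles the data for you), and that fork/join runs on a single machine while big-data frameworks run on a distributed system.

Seen from that angle, studying fork/join seems worthwhile.

Still, by its own semantics fork/join splits a task, processes the pieces, and then merges the results. Without the merge step it would be equivalent to a thread pool, which is why people ask how the two differ. And even when results do need collecting, we could run the tasks on a thread pool, call get() on each future, and merge the results outside.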
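To make the "thread pool plus get()" alternative concrete, here is a minimal sketch. The class name PoolSplitSum and the chunk size are made up for illustration; it sums an array instead of sorting it to keep the merge step trivial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSplitSum {
    /** Splits data into chunks (chunk must be > 0), sums each chunk in the pool, merges in the caller. */
    static long sum(int[] data, int chunk, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        List<Future<Long>> parts = new ArrayList<>();
        for (int from = 0; from < data.length; from += chunk) {
            final int lo = from;
            final int hi = Math.min(from + chunk, data.length);
            // "fork": one pool task per chunk
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int i = lo; i < hi; i++) s += data[i];
                return s;
            }));
        }
        long total = 0;
        // "join": collect and merge outside the pool
        for (Future<Long> f : parts) total += f.get();
        return total;
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(sum(data, 128, pool)); // 500500
        } finally {
            pool.shutdown();
        }
    }
}
```

This works because the split is flat and done by the caller. If the pool tasks themselves split recursively and blocked on get(), a fixed-size pool could run out of threads, which is the situation fork/join's join is designed for.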

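2. The core classes of fork/join

Fork/join is called an execution framework, so naturally it is more than a single component.

First there is ForkJoinPool, the counterpart of a thread pool. All tasks are submitted through it, and it schedules them centrally.

Next, every task shares a lot of boilerplate and differs only in its business logic, so there is a base class: RecursiveTask. There is also a variant that returns no result, RecursiveAction, although when no result is needed an ordinary thread pool can often do the job instead (not always).

ForkJoinWorkerThreadFactory is the framework's thread factory. Its purpose is the same as that of an ordinary thread factory, but its argument is a ForkJoinPool rather than a Runnable, because the context the threads run in is different.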
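As an illustration of that factory signature, here is a sketch of a custom factory that only renames its workers. NamedWorkerFactory, its counter, and the "sorter" prefix are hypothetical; the ForkJoinWorkerThread constructor is protected, hence the anonymous subclass.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;

class NamedWorkerFactory implements ForkJoinPool.ForkJoinWorkerThreadFactory {
    private final String prefix;
    private final AtomicInteger seq = new AtomicInteger();

    NamedWorkerFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
        // The factory receives the pool, not a Runnable: the worker pulls its tasks from the pool.
        ForkJoinWorkerThread t = new ForkJoinWorkerThread(pool) { };
        t.setName(prefix + "-" + seq.incrementAndGet());
        return t;
    }

    /** Runs one task in a pool built with this factory and returns the worker's name. */
    static String runOnce(String prefix) {
        ForkJoinPool pool = new ForkJoinPool(2, new NamedWorkerFactory(prefix), null, false);
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce("sorter"));
    }
}
```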

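ForkJoinWorkerThread is the thread that actually executes fork/join tasks, and it may push further tasks while running. Each worker owns a queue, and its main job is to take tasks from that queue and run them. When its own queue is empty, it can steal tasks from other threads' queues using the work-stealing algorithm. This is the core of fork/join.

sun.misc.Unsafe is worth mentioning because the framework does not write to its queues with ordinary list or array operations. The queue is still an array, but slots are written with U.putOrderedObject(a, j, task). The effect is the same as an array store, with ordered-write semantics that make thread-safe publication cheaper. The code is also full of bit manipulation, which is instructive but makes it harder to read.

3. A fork/join usage example

Let's see how to use the framework by sorting an array. It is genuinely useful for large arrays, which are usually sorted with quicksort or merge sort. Handling this with fork/join matches the structure of merge sort.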

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

/**
 * Fork/join framework test
 */
public class TestForkJoinFramework {

    public static void main(String[] args) {
        long beginTime = System.currentTimeMillis();
        ForkJoinPool pool = new ForkJoinPool();
        int mockArrLen = 10_000_000;
        int[] arr = new int[mockArrLen];
        Random r = new Random();
        for (int index = 1; index <= mockArrLen; index++) {
            arr[index - 1] = r.nextInt(10_000_000);
        }
        FJOrderTask task = new FJOrderTask(arr);
        ForkJoinTask<int[]> taskResult = pool.submit(task);
        try {
            // wait for the result
            taskResult.get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
        long endTime = System.currentTimeMillis();
        System.out.println("elapsed ms=" + (endTime - beginTime));
    }

    /**
     * A single sorting subtask
     */
    private static class FJOrderTask extends RecursiveTask<int[]> {

        /**
         * The array this task sorts
         */
        private final int[] source;

        public FJOrderTask(int[] source) {
            this.source = source;
        }

        /**
         * The actual business logic
         *
         * @see java.util.concurrent.RecursiveTask#compute()
         */
        @Override
        protected int[] compute() {
            int sourceLen = source.length;
            // printed once per task, to observe which worker runs it (slow for large inputs)
            System.out.println(Thread.currentThread());
            // if this holds, the array to sort is not small enough yet
            if (sourceLen > 2) {
                int midIndex = sourceLen / 2;
                // split into two subtasks: [0, mid) and [mid, len)
                FJOrderTask task1 = new FJOrderTask(
                        Arrays.copyOf(source, midIndex));
                task1.fork();
                FJOrderTask task2 = new FJOrderTask(
                        Arrays.copyOfRange(source, midIndex, sourceLen));
                task2.fork();
                // merge the two sorted arrays into one sorted array
                int[] result1 = task1.join();
                int[] result2 = task2.join();
                return insertMerge(result1, result2);
            }
            // otherwise there are only one or two elements, which can be compared directly
            else {
                // a single element, or two elements already in order
                if (sourceLen == 1
                        || source[0] <= source[1]) {
                    return source;
                } else {
                    int[] orderedArr = new int[sourceLen];
                    orderedArr[0] = source[1];
                    orderedArr[1] = source[0];
                    return orderedArr;
                }
            }
        }

        /**
         * Merges two sorted arrays (a two-way merge, despite the method name)
         *
         * @param arr1 sorted array 1
         * @param arr2 sorted array 2
         * @return the merged sorted array
         */
        private int[] insertMerge(int[] arr1, int[] arr2) {
            int[] result = new int[arr1.length + arr2.length];
            int arr1Len = arr1.length;
            int arr2Len = arr2.length;
            int destLen = result.length;
            // two-pointer merge; an exhausted side yields Integer.MAX_VALUE as a sentinel
            for (int i = 0, array1Index = 0, array2Index = 0; i < destLen; i++) {
                int value1 = array1Index >= arr1Len
                        ? Integer.MAX_VALUE : arr1[array1Index];
                int value2 = array2Index >= arr2Len
                        ? Integer.MAX_VALUE : arr2[array2Index];
                if (value1 < value2) {
                    array1Index++;
                    result[i] = value1;
                }
                else {
                    array2Index++;
                    result[i] = value2;
                }
            }
            return result;
        }
    }
}

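The idea is simple: keep splitting the array until one or two elements remain, sort at the bottom, then work back up, merging the sorted halves with a two-way merge (the method is called insertMerge, but it is a merge rather than an insertion sort) until the fully sorted array is returned. Leaving aside the cost of splitting, the time complexity is a good O(n log n). The splitting, however, copies subarrays at every level and creates a task for every one or two elements, so it uses a lot of extra memory, which is not ideal. We will let that pass.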
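One common way to cut the splitting overhead is to stop forking below a size threshold and sort in place. The following is a sketch under that assumption, not the author's code; the class name ThresholdMergeSort and the cutoff value are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ThresholdMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 8_192; // hypothetical cutoff; tune by measurement
    private final int[] a;
    private final int[] buf; // shared scratch array; each task touches only [lo, hi)
    private final int lo;
    private final int hi;

    ThresholdMergeSort(int[] a, int[] buf, int lo, int hi) {
        this.a = a;
        this.buf = buf;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(a, lo, hi); // small enough: sort sequentially, no more tasks
            return;
        }
        int mid = (lo + hi) >>> 1;
        // invokeAll forks one half, runs the other in this thread, then joins both
        invokeAll(new ThresholdMergeSort(a, buf, lo, mid),
                  new ThresholdMergeSort(a, buf, mid, hi));
        merge(mid);
    }

    /** Two-way merge of the sorted ranges [lo, mid) and [mid, hi). */
    private void merge(int mid) {
        System.arraycopy(a, lo, buf, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) a[k++] = (buf[i] <= buf[j]) ? buf[i++] : buf[j++];
        while (i < mid) a[k++] = buf[i++];
        while (j < hi) a[k++] = buf[j++];
    }

    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new ThresholdMergeSort(a, new int[a.length], 0, a.length));
    }
}
```

Calling ThresholdMergeSort.sort(arr) sorts arr in place with a single scratch array instead of one copy per level.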

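4. How the fork/join framework is implemented

Taking the demo above as the starting point and following what fork/join does, we may not learn 100% of it, but we will get close. The demo performs three visible actions: it instantiates a ForkJoinPool, submits a task, and calls get() to wait for the final result. Those are easy to follow; the core probably lies behind them.

4.1. The ForkJoinPool constructor

Any application that uses the framework must first create a pool instance. The no-argument constructor used above simply applies the framework's defaults, which cover most scenarios and make the framework easy to use.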

// java.util.concurrent.ForkJoinPool#ForkJoinPool()
    /**
     * Creates a {@code ForkJoinPool} with parallelism equal to {@link
     * java.lang.Runtime#availableProcessors}, using the {@linkplain
     * #defaultForkJoinWorkerThreadFactory default thread factory},
     * no UncaughtExceptionHandler, and non-async LIFO processing mode.
     *
     * @throws SecurityException if a security manager exists and
     *         the caller is not permitted to modify threads
     *         because it does not hold {@link
     *         java.lang.RuntimePermission}{@code ("modifyThread")}
     */
    public ForkJoinPool() {
        // parallelism defaults to the number of CPU cores
        this(Math.min(MAX_CAP, Runtime.getRuntime().availableProcessors()),
             defaultForkJoinWorkerThreadFactory, null, false);
    }

    /**
     * Creates a {@code ForkJoinPool} with the given parameters.
     *
     * @param parallelism the parallelism level. For default value,
     * use {@link java.lang.Runtime#availableProcessors}.
     * @param factory the factory for creating new threads. For default value,
     * use {@link #defaultForkJoinWorkerThreadFactory}.
     * @param handler the handler for internal worker threads that
     * terminate due to unrecoverable errors encountered while executing
     * tasks. For default value, use {@code null}.
     * @param asyncMode if true,
     * establishes local first-in-first-out scheduling mode for forked
     * tasks that are never joined. This mode may be more appropriate
     * than default locally stack-based mode in applications in which
     * worker threads only process event-style asynchronous tasks.
     * For default value, use {@code false}.
     * @throws IllegalArgumentException if parallelism less than or
     *         equal to zero, or greater than implementation limit
     * @throws NullPointerException if the factory is null
     * @throws SecurityException if a security manager exists and
     *         the caller is not permitted to modify threads
     *         because it does not hold {@link
     *         java.lang.RuntimePermission}{@code ("modifyThread")}
     */
    public ForkJoinPool(int parallelism,
                        ForkJoinWorkerThreadFactory factory,
                        UncaughtExceptionHandler handler,
                        boolean asyncMode) {
        this(checkParallelism(parallelism),
             checkFactory(factory),
             handler,
             // FIFO_QUEUE = 1 << 16, LIFO_QUEUE = 0
             asyncMode ? FIFO_QUEUE : LIFO_QUEUE,
             "ForkJoinPool-" + nextPoolId() + "-worker-");
        checkPermission();
    }

    /**
     * Creates a {@code ForkJoinPool} with the given parameters, without
     * any security checks or parameter validation.  Invoked directly by
     * makeCommonPool.
     */
    private ForkJoinPool(int parallelism,
                         ForkJoinWorkerThreadFactory factory,
                         UncaughtExceptionHandler handler,
                         int mode,
                         String workerNamePrefix) {
        this.workerNamePrefix = workerNamePrefix;
        this.factory = factory;
        this.ueh = handler;
        this.config = (parallelism & SMASK) | mode;
        long np = (long)(-parallelism); // offset ctl counts
        this.ctl = ((np << AC_SHIFT) & AC_MASK) | ((np << TC_SHIFT) & TC_MASK);
    }

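There is not much to say about the constructor: it sets the parallelism, the thread factory, the mode flag, and so on, in preparation for what follows.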
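The one line worth a closer look is the ctl initialization. The sketch below replays it outside the JDK. The constant values are copied from my reading of the JDK 8 ForkJoinPool source and should be treated as assumptions; CtlLayout and its helper methods are made-up names.

```java
class CtlLayout {
    // Assumed JDK 8 values: active count in bits 48..63, total count in bits 32..47.
    static final int  AC_SHIFT   = 48;
    static final long AC_MASK    = 0xffffL << AC_SHIFT;
    static final int  TC_SHIFT   = 32;
    static final long TC_MASK    = 0xffffL << TC_SHIFT;
    static final long ADD_WORKER = 0x0001L << (TC_SHIFT + 15); // sign bit of the total count

    /** Same arithmetic as the private constructor: both counts start at -parallelism. */
    static long initialCtl(int parallelism) {
        long np = (long) (-parallelism);
        return ((np << AC_SHIFT) & AC_MASK) | ((np << TC_SHIFT) & TC_MASK);
    }

    static int activeCount(long ctl) { return (short) (ctl >> AC_SHIFT); }
    static int totalCount(long ctl)  { return (short) (ctl >>> TC_SHIFT); }

    public static void main(String[] args) {
        long ctl = initialCtl(4);
        System.out.println(Long.toHexString(ctl));     // fffcfffc00000000
        System.out.println(activeCount(ctl));          // -4
        System.out.println(totalCount(ctl));           // -4
        System.out.println(ctl < 0L);                  // true: too few active workers
        System.out.println((ctl & ADD_WORKER) != 0L);  // true: too few workers in total
        System.out.println((int) ctl == 0);            // true: no idle workers recorded
    }
}
```

Storing the counts as offsets from -parallelism lets later code test "too few active workers" with a plain sign check, ctl < 0.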

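4.2. The task submit process

The example calls submit only once, which will not always be the case in real applications. Even so, a single submit triggers a lot of activity, because that one task spawns many more.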

// java.util.concurrent.ForkJoinPool#submit
    /**
     * Submits a ForkJoinTask for execution.
     *
     * @param task the task to submit
     * @param <T> the type of the task's result
     * @return the task
     * @throws NullPointerException if the task is null
     * @throws RejectedExecutionException if the task cannot be
     *         scheduled for execution
     */
    public <T> ForkJoinTask<T> submit(ForkJoinTask<T> task) {
        if (task == null)
            throw new NullPointerException();
        // submit mainly adds the task to one of the pool's queues
        externalPush(task);
        return task;
    }

    /**
     * Tries to add the given task to a submission queue at
     * submitter's current queue. Only the (vastly) most common path
     * is directly handled in this method, while screening for need
     * for externalSubmit.
     *
     * @param task the task. Caller must ensure non-null.
     */
    final void externalPush(ForkJoinTask<?> task) {
        WorkQueue[] ws; WorkQueue q; int m;
        int r = ThreadLocalRandom.getProbe();
        int rs = runState;
        // fast path: this thread's submission queue already exists and its lock
        // is acquired, so push directly; otherwise fall through to externalSubmit
        if ((ws = workQueues) != null && (m = (ws.length - 1)) >= 0 &&
            (q = ws[m & r & SQMASK]) != null && r != 0 && rs > 0 &&
            U.compareAndSwapInt(q, QLOCK, 0, 1)) {
            ForkJoinTask<?>[] a; int am, n, s;
            if ((a = q.array) != null &&
                (am = a.length - 1) > (n = (s = q.top) - q.base)) {
                int j = ((am & s) << ASHIFT) + ABASE;
                // store the task into the queue with putOrderedObject
                U.putOrderedObject(a, j, task);
                U.putOrderedInt(q, QTOP, s + 1);
                U.putIntVolatile(q, QLOCK, 0);
                if (n <= 1)
                    signalWork(ws, q);
                return;
            }
            U.compareAndSwapInt(q, QLOCK, 1, 0);
        }
        // first submission (initialization) or the general slow path
        externalSubmit(task);
    }

    /**
     * Full version of externalPush, handling uncommon cases, as well
     * as performing secondary initialization upon the first
     * submission of the first task to the pool.  It also detects
     * first submission by an external thread and creates a new shared
     * queue if the one at index if empty or contended.
     *
     * @param task the task. Caller must ensure non-null.
     */
    private void externalSubmit(ForkJoinTask<?> task) {
        int r;                                    // initialize caller's probe
        if ((r = ThreadLocalRandom.getProbe()) == 0) {
            ThreadLocalRandom.localInit();
            r = ThreadLocalRandom.getProbe();
        }
        for (;;) {
            WorkQueue[] ws; WorkQueue q; int rs, m, k;
            boolean move = false;
            // the pool has been shut down
            if ((rs = runState) < 0) {
                tryTerminate(false, false);     // help terminate
                throw new RejectedExecutionException();
            }
            // not initialized yet: initialize first
            else if ((rs & STARTED) == 0 ||     // initialize
                     ((ws = workQueues) == null || (m = ws.length - 1) < 0)) {
                int ns = 0;
                // initialize under the runState lock
                rs = lockRunState();
                try {
                    if ((rs & STARTED) == 0) {
                        U.compareAndSwapObject(this, STEALCOUNTER, null,
                                               new AtomicLong());
                        // create workQueues array with size a power of two
                        int p = config & SMASK; // ensure at least 2 slots
                        int n = (p > 1) ? p - 1 : 1;
                        n |= n >>> 1; n |= n >>> 2;  n |= n >>> 4;
                        n |= n >>> 8; n |= n >>> 16; n = (n + 1) << 1;
                        // allocate the array of queues
                        workQueues = new WorkQueue[n];
                        ns = STARTED;
                    }
                } finally {
                    unlockRunState(rs, (rs & ~RSLOCK) | ns);
                }
            }
            // a submission queue already exists at this thread's slot
            else if ((q = ws[k = r & m & SQMASK]) != null) {
                // lock the queue and add the task
                if (q.qlock == 0 && U.compareAndSwapInt(q, QLOCK, 0, 1)) {
                    ForkJoinTask<?>[] a = q.array;
                    // read the top index and store the task there
                    int s = q.top;
                    boolean submitted = false; // initial submission or resizing
                    try {                      // locked version of push
                        if ((a != null && a.length > s + 1 - q.base) ||
                            (a = q.growArray()) != null) {
                            int j = (((a.length - 1) & s) << ASHIFT) + ABASE;
                            U.putOrderedObject(a, j, task);
                            U.putOrderedInt(q, QTOP, s + 1);
                            submitted = true;
                        }
                    } finally {
                        U.compareAndSwapInt(q, QLOCK, 1, 0);
                    }
                    // if the task was queued, signal a worker and return;
                    // otherwise go around the loop and try again
                    if (submitted) {
                        signalWork(ws, q);
                        return;
                    }
                }
                move = true;                   // move on failure
            }
            else if (((rs = runState) & RSLOCK) == 0) { // create new queue
                q = new WorkQueue(this, null);
                q.hint = r;
                q.config = k | SHARED_QUEUE;
                q.scanState = INACTIVE;
                rs = lockRunState();           // publish index
                if (rs > 0 &&  (ws = workQueues) != null &&
                    k < ws.length && ws[k] == null)
                    ws[k] = q;                 // else terminated
                unlockRunState(rs, rs & ~RSLOCK);
            }
            else
                move = true;                   // move if busy
            // if necessary, generate a new probe for the current thread
            if (move)
                r = ThreadLocalRandom.advanceProbe(r);
        }
    }

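So submit mainly initializes the queues, adds the task to a queue, and signals a worker to process it. But we have not yet seen any worker thread being started, only the workQueues array being initialized. Where are the workers initialized? It can only be in signalWork.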
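The bit-smearing that sizes the workQueues array is easier to follow with numbers. This sketch repeats the same arithmetic in a standalone method. QueueArraySize is a made-up name, and SMASK = 0xffff is assumed from the JDK 8 source.

```java
class QueueArraySize {
    static final int SMASK = 0xffff; // assumed JDK 8 value: parallelism lives in the low 16 bits of config

    /** Same steps as externalSubmit: round (p - 1) up to 2^k - 1, add one, then double. */
    static int size(int parallelism) {
        int p = parallelism & SMASK;
        int n = (p > 1) ? p - 1 : 1;
        n |= n >>> 1; n |= n >>> 2; n |= n >>> 4;
        n |= n >>> 8; n |= n >>> 16;
        return (n + 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(size(1)); // 4
        System.out.println(size(4)); // 8
        System.out.println(size(5)); // 16
        System.out.println(size(8)); // 16
    }
}
```

The result is a power of two of at least twice the parallelism. That fits the code above and below: external submission queues are looked up through SQMASK, while registerWorker places worker queues at odd-numbered indices, so the array has to hold both kinds.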

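4.3. Worker initialization

A worker is the thread that actually runs tasks. So far we have only seen tasks being queued and a worker being signaled, not a worker being created. Workers are in fact created inside the signal logic.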

// java.util.concurrent.ForkJoinPool#signalWork
    /**
     * Tries to create or activate a worker if too few are active.
     *
     * @param ws the worker array to use to find signallees
     * @param q a WorkQueue --if non-null, don't retry if now empty
     */
    final void signalWork(WorkQueue[] ws, WorkQueue q) {
        long c; int sp, i; WorkQueue v; Thread p;
        // ctl is one field used in two halves: counts in the high 32 bits,
        // the idle-worker stack in the low 32 bits
        while ((c = ctl) < 0L) {                       // too few active
            if ((sp = (int)c) == 0) {                  // no idle workers
                if ((c & ADD_WORKER) != 0L)            // too few workers
                    tryAddWorker(c);
                break;
            }
            if (ws == null)                            // unstarted/terminated
                break;
            if (ws.length <= (i = sp & SMASK))         // terminated
                break;
            if ((v = ws[i]) == null)                   // terminating
                break;
            int vs = (sp + SS_SEQ) & ~INACTIVE;        // next scanState
            int d = sp - v.scanState;                  // screen CAS
            long nc = (UC_MASK & (c + AC_UNIT)) | (SP_MASK & v.stackPred);
            if (d == 0 && U.compareAndSwapLong(this, CTL, c, nc)) {
                v.scanState = vs;                      // activate v
                if ((p = v.parker) != null)
                    U.unpark(p);
                break;
            }
            if (q != null && q.base == q.top)          // no more work
                break;
        }
    }

    /**
     * Tries to add one worker, incrementing ctl counts before doing
     * so, relying on createWorker to back out on failure.
     *
     * @param c incoming ctl value, with total count negative and no
     * idle workers.  On CAS failure, c is refreshed and retried if
     * this holds (otherwise, a new worker is not needed).
     */
    private void tryAddWorker(long c) {
        boolean add = false;
        do {
            long nc = ((AC_MASK & (c + AC_UNIT)) |
                       (TC_MASK & (c + TC_UNIT)));
            if (ctl == c) {
                int rs, stop;                 // check if terminating
                if ((stop = (rs = lockRunState()) & STOP) == 0)
                    add = U.compareAndSwapLong(this, CTL, c, nc);
                unlockRunState(rs, rs & ~RSLOCK);
                if (stop != 0)
                    break;
                // the counts were updated successfully; now create the worker
                if (add) {
                    createWorker();
                    break;
                }
            }
        } while (((c = ctl) & ADD_WORKER) != 0L && (int)c == 0);
    }

    /**
     * Tries to construct and start one worker. Assumes that total
     * count has already been incremented as a reservation.  Invokes
     * deregisterWorker on any failure.
     *
     * @return true if successful
     */
    private boolean createWorker() {
        ForkJoinWorkerThreadFactory fac = factory;
        Throwable ex = null;
        ForkJoinWorkerThread wt = null;
        try {
            // create a new worker through the thread factory and start it immediately
            if (fac != null && (wt = fac.newThread(this)) != null) {
                wt.start();
                return true;
            }
        } catch (Throwable rex) {
            ex = rex;
        }
        // creation failed: handle the exception
        deregisterWorker(wt, ex);
        return false;
    }

    /**
     * Default ForkJoinWorkerThreadFactory implementation; creates a
     * new ForkJoinWorkerThread.
     */
    static final class DefaultForkJoinWorkerThreadFactory
        implements ForkJoinWorkerThreadFactory {
        public final ForkJoinWorkerThread newThread(ForkJoinPool pool) {
            return new ForkJoinWorkerThread(pool);
        }
    }

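As expected, workers are created in signalWork. Note that, to add workers safely, it first updates ctl with a successful CAS and only then performs the actual creation, which prevents creating too many workers.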
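The "reserve first, create afterwards" idea is not specific to ForkJoinPool. Here is a much simplified sketch of it with an AtomicInteger in place of the packed ctl field. ReserveThenCreate and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ReserveThenCreate {
    private final int max;
    private final AtomicInteger reserved = new AtomicInteger();

    ReserveThenCreate(int max) {
        this.max = max;
    }

    /** Reserves a slot with CAS, then creates the thread; gives the slot back if creation fails. */
    boolean tryAddWorker(Runnable body) {
        for (;;) {
            int c = reserved.get();
            if (c >= max)
                return false;                    // enough workers: nothing to create
            if (reserved.compareAndSet(c, c + 1))
                break;                           // reservation won; only this caller may create
        }
        try {
            Thread t = new Thread(body);
            t.setDaemon(true);
            t.start();
            return true;
        } catch (Throwable ex) {
            reserved.decrementAndGet();          // back out, as deregisterWorker does for the pool
            return false;
        }
    }

    int reservedCount() {
        return reserved.get();
    }
}
```

Because the count is bumped before any thread exists, two racing callers can never both create the "last" worker.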

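4.4. How a worker works

We saw that worker creation receives the pool instance, so the whole pool context is visible to the worker and it can reuse the pool's configuration. The worker is itself an asynchronous thread and is started immediately after creation, so it must then try to take tasks from a queue and run them. How exactly?

1. The worker thread constructor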

// java.util.concurrent.ForkJoinWorkerThread#ForkJoinWorkerThread
    /**
     * Creates a ForkJoinWorkerThread operating in the given pool.
     *
     * @param pool the pool this thread works in
     * @throws NullPointerException if pool is null
     */
    protected ForkJoinWorkerThread(ForkJoinPool pool) {
        // Use a placeholder until a useful name can be set in registerWorker
        super("aForkJoinWorkerThread");
        this.pool = pool;
        // the workQueue is obtained by registering with the pool
        this.workQueue = pool.registerWorker(this);
    }

    /**
     * Callback from ForkJoinWorkerThread constructor to establish and
     * record its WorkQueue.
     *
     * @param wt the worker thread
     * @return the worker's queue
     */
    final WorkQueue registerWorker(ForkJoinWorkerThread wt) {
        UncaughtExceptionHandler handler;
        wt.setDaemon(true);                           // configure thread
        if ((handler = ueh) != null)
            wt.setUncaughtExceptionHandler(handler);
        WorkQueue w = new WorkQueue(this, wt);
        int i = 0;                                    // assign a pool index
        int mode = config & MODE_MASK;
        int rs = lockRunState();
        try {
            WorkQueue[] ws; int n;                    // skip if no array
            if ((ws = workQueues) != null && (n = ws.length) > 0) {
                int s = indexSeed += SEED_INCREMENT;  // unlikely to collide
                int m = n - 1;
                i = ((s << 1) | 1) & m;               // odd-numbered indices
                if (ws[i] != null) {                  // collision
                    int probes = 0;                   // step by approx half n
                    int step = (n <= 4) ? 2 : ((n >>> 1) & EVENMASK) + 2;
                    while (ws[i = (i + step) & m] != null) {
                        if (++probes >= n) {
                            workQueues = ws = Arrays.copyOf(ws, n <<= 1);
                            m = n - 1;
                            probes = 0;
                        }
                    }
                }
                w.hint = s;                           // use as random seed
                w.config = i | mode;
                w.scanState = i;                      // publication fence
                ws[i] = w;
            }
        } finally {
            unlockRunState(rs, rs & ~RSLOCK);
        }
        wt.setName(workerNamePrefix.concat(Integer.toString(i >>> 1)));
        return w;
    }

The key point is that the thread registers itself with the pool and gets a WorkQueue back. The actual work happens in the run method.

// java.util.concurrent.ForkJoinWorkerThread#run
    /**
     * This method is required to be public, but should never be
     * called explicitly. It performs the main run loop to execute
     * {@link ForkJoinTask}s.
     */
    public void run() {
        if (workQueue.array == null) { // only run once
            Throwable exception = null;
            try {
                onStart();
                pool.runWorker(workQueue);
            } catch (Throwable ex) {
                exception = ex;
            } finally {
                try {
                    onTermination(exception);
                } catch (Throwable ex) {
                    if (exception == null)
                        exception = ex;
                } finally {
                    pool.deregisterWorker(this, exception);
                }
            }
        }
    }

    // java.util.concurrent.ForkJoinPool#runWorker
    /**
     * Top-level runloop for workers, called by ForkJoinWorkerThread.run.
     */
    final void runWorker(WorkQueue w) {
        w.growArray();                   // allocate queue
        int seed = w.hint;               // initially holds randomization hint
        int r = (seed == 0) ? 1 : seed;  // avoid 0 for xorShift
        for (ForkJoinTask<?> t;;) {
            // take a task and run it
            if ((t = scan(w, r)) != null)
                w.runTask(t);
            else if (!awaitWork(w, r))
                break;
            r ^= r << 13; r ^= r >>> 17; r ^= r << 5; // xorshift
        }
    }

        // java.util.concurrent.ForkJoinPool.WorkQueue#runTask
        /**
         * Executes the given task and any remaining local tasks.
         */
        final void runTask(ForkJoinTask<?> task) {
            if (task != null) {
                scanState &= ~SCANNING; // mark as busy
                (currentSteal = task).doExec();
                U.putOrderedObject(this, QCURRENTSTEAL, null); // release for GC
                execLocalTasks();
                ForkJoinWorkerThread thread = owner;
                if (++nsteals < 0)      // collect on overflow
                    transferStealCount(pool);
                scanState |= SCANNING;
                if (thread != null)
                    thread.afterTopLevelExec();
            }
        }

    // java.util.concurrent.ForkJoinTask#doExec
    /**
     * Primary execution method for stolen tasks. Unless done, calls
     * exec and records status if completed, but doesn't wait for
     * completion otherwise.
     *
     * @return status on exit from this method
     */
    final int doExec() {
        int s; boolean completed;
        if ((s = status) >= 0) {
            try {
                completed = exec();
            } catch (Throwable rex) {
                return setExceptionalCompletion(rex);
            }
            if (completed)
                s = setCompletion(NORMAL);
        }
        return s;
    }

    // java.util.concurrent.RecursiveTask#exec
    /**
     * Implements execution conventions for RecursiveTask.
     */
    protected final boolean exec() {
        // this calls the compute method of the concrete task class
        result = compute();
        return true;
    }

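That was a quick look at how a worker runs tasks. It differs little from a thread pool: tasks are taken from a queue and the business method compute is executed. We have skipped how tasks are obtained and how work stealing happens; see the next section.

4.5. How tasks are obtained

This is handled mainly by scan.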

// java.util.concurrent.ForkJoinPool#scan
    /**
     * Scans for and tries to steal a top-level task. Scans start at a
     * random location, randomly moving on apparent contention,
     * otherwise continuing linearly until reaching two consecutive
     * empty passes over all queues with the same checksum (summing
     * each base index of each queue, that moves on each steal), at
     * which point the worker tries to inactivate and then re-scans,
     * attempting to re-activate (itself or some other worker) if
     * finding a task; otherwise returning null to await work.  Scans
     * otherwise touch as little memory as possible, to reduce
     * disruption on other scanning threads.
     *
     * @param w the worker (via its WorkQueue)
     * @param r a random seed
     * @return a task, or null if none found
     */
    private ForkJoinTask<?> scan(WorkQueue w, int r) {
        WorkQueue[] ws; int m;
        if ((ws = workQueues) != null && (m = ws.length - 1) > 0 && w != null) {
            int ss = w.scanState;                     // initially non-negative
            for (int origin = r & m, k = origin, oldSum = 0, checkSum = 0;;) {
                WorkQueue q; ForkJoinTask<?>[] a; ForkJoinTask<?> t;
                int b, n; long c;
                // the scan starts at a random slot (origin = r & m);
                // a task is taken from the base of any non-empty queue found
                if ((q = ws[k]) != null) {
                    if ((n = (b = q.base) - q.top) < 0 &&
                        (a = q.array) != null) {      // non-empty
                        long i = (((a.length - 1) & b) << ASHIFT) + ABASE;
                        if ((t = ((ForkJoinTask<?>)
                                  U.getObjectVolatile(a, i))) != null &&
                            q.base == b) {
                            if (ss >= 0) {
                                if (U.compareAndSwapObject(a, i, t, null)) {
                                    q.base = b + 1;
                                    if (n < -1)       // signal others
                                        signalWork(ws, q);
                                    return t;
                                }
                            }
                            else if (oldSum == 0 &&   // try to activate
                                     w.scanState < 0)
                                tryRelease(c = ctl, ws[m & (int)c], AC_UNIT);
                        }
                        if (ss < 0)                   // refresh
                            ss = w.scanState;
                        r ^= r << 1; r ^= r >>> 3; r ^= r << 10;
                        origin = k = r & m;           // move and rescan
                        oldSum = checkSum = 0;
                        continue;
                    }
                    checkSum += b;
                }
                if ((k = (k + 1) & m) == origin) {    // continue until stable
                    if ((ss >= 0 || (ss == (ss = w.scanState))) &&
                        oldSum == (oldSum = checkSum)) {
                        if (ss < 0 || w.qlock < 0)    // already inactive
                            break;
                        int ns = ss | INACTIVE;       // try to inactivate
                        long nc = ((SP_MASK & ns) |
                                   (UC_MASK & ((c = ctl) - AC_UNIT)));
                        w.stackPred = (int)c;         // hold prev stack top
                        U.putInt(w, QSCANSTATE, ns);
                        if (U.compareAndSwapLong(this, CTL, c, nc))
                            ss = ns;
                        else
                            w.scanState = ss;         // back out
                    }
                    checkSum = 0;
                }
            }
        }
        return null;
    }

Taking tasks from the queues in a way that is both safe and efficient is no easy feat.
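Stripped of the lock-free details, the two ends of a work queue behave as follows. This toy sketch uses a ConcurrentLinkedDeque in place of the JDK's array plus base/top indices. ToyWorkQueue and its method names are invented for illustration and say nothing about how the real WorkQueue is coded.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

class ToyWorkQueue<T> {
    private final ConcurrentLinkedDeque<T> deque = new ConcurrentLinkedDeque<>();

    /** Owner side: push a freshly forked task at the top. */
    void push(T task) {
        deque.addLast(task);
    }

    /** Owner side: take the most recently pushed task (LIFO, the default mode). */
    T pop() {
        return deque.pollLast();
    }

    /** Thief side: take the oldest task from the base (FIFO), as scan does through q.base. */
    T steal() {
        return deque.pollFirst();
    }

    public static void main(String[] args) {
        ToyWorkQueue<String> q = new ToyWorkQueue<>();
        q.push("big");
        q.push("medium");
        q.push("small");
        System.out.println(q.steal()); // big
        System.out.println(q.pop());   // small
    }
}
```

In a divide-and-conquer workload the oldest task in a queue is usually the largest, so a thief that takes from the base gets a big chunk of work in one steal, while the owner keeps working on the small, recently forked tasks at the top.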

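4.6. How task.fork is implemented

The word fork usually implies a copy of the current environment. Is that the case here? Does it start a new thread? If not, how does it find the queue the task should go to?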

// java.util.concurrent.ForkJoinTask#fork
    /**
     * Arranges to asynchronously execute this task in the pool the
     * current task is running in, if applicable, or using the {@link
     * ForkJoinPool#commonPool()} if not {@link #inForkJoinPool}.  While
     * it is not necessarily enforced, it is a usage error to fork a
     * task more than once unless it has completed and been
     * reinitialized.  Subsequent modifications to the state of this
     * task or any data it operates on are not necessarily
     * consistently observable by any thread other than the one
     * executing it unless preceded by a call to {@link #join} or
     * related methods, or a call to {@link #isDone} returning {@code
     * true}.
     *
     * @return {@code this}, to simplify usage
     */
    public final ForkJoinTask<V> fork() {
        Thread t;
        // a ForkJoinWorkerThread holds its own workQueue, so the task is pushed there directly
        if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
            ((ForkJoinWorkerThread)t).workQueue.push(this);
        else
            // an external thread pushes to the shared common pool, whose idle workers pick it up
            ForkJoinPool.common.externalPush(this);
        return this;
    }

        // java.util.concurrent.ForkJoinPool.WorkQueue#push
        /**
         * Pushes a task. Call only by owner in unshared queues.  (The
         * shared-queue version is embedded in method externalPush.)
         *
         * @param task the task. Caller must ensure non-null.
         * @throws RejectedExecutionException if array cannot be resized
         */
        final void push(ForkJoinTask<?> task) {
            ForkJoinTask<?>[] a; ForkJoinPool p;
            int b = base, s = top, n;
            if ((a = array) != null) {    // ignore if queue removed
                int m = a.length - 1;     // fenced write for task visibility
                U.putOrderedObject(a, ((m & s) << ASHIFT) + ABASE, task);
                U.putOrderedInt(this, QTOP, s + 1);
                if ((n = s - b) <= 1) {
                    if ((p = pool) != null)
                        p.signalWork(p.workQueues, this);
                }
                else if (n >= m)
                    growArray();
            }
        }

/**
 * A thread managed by a {@link ForkJoinPool}, which executes
 * {@link ForkJoinTask}s.
 * This class is subclassable solely for the sake of adding
 * functionality -- there are no overridable methods dealing with
 * scheduling or execution.  However, you can override initialization
 * and termination methods surrounding the main task processing loop.
 * If you do create such a subclass, you will also need to supply a
 * custom {@link ForkJoinPool.ForkJoinWorkerThreadFactory} to
 * {@linkplain ForkJoinPool#ForkJoinPool use it} in a {@code ForkJoinPool}.
 *
 * @since 1.7
 * @author Doug Lea
 */
public class ForkJoinWorkerThread extends Thread {
    /*
     * ForkJoinWorkerThreads are managed by ForkJoinPools and perform
     * ForkJoinTasks. For explanation, see the internal documentation
     * of class ForkJoinPool.
     *
     * This class just maintains links to its pool and WorkQueue.  The
     * pool field is set immediately upon construction, but the
     * workQueue field is not set until a call to registerWorker
     * completes. This leads to a visibility race, that is tolerated
     * by requiring that the workQueue field is only accessed by the
     * owning thread.
     *
     * Support for (non-public) subclass InnocuousForkJoinWorkerThread
     * requires that we break quite a lot of encapsulation (via Unsafe)
     * both here and in the subclass to access and set Thread fields.
     */

    final ForkJoinPool pool;                // the pool this thread works in
    final ForkJoinPool.WorkQueue workQueue; // work-stealing mechanics
    ...
}

As we can see, fork merely pushes the task onto the work queue of the current worker thread. There is no so-called context copy involved.
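The claim that fork only enqueues can be observed from the public API. The following is a minimal sketch, not part of the original walkthrough: the class names are made up for illustration, and it assumes a pool with parallelism 1 so that no other worker can steal the child in between. Under that assumption the queue length of the worker should grow by one after fork(), and the child should only run once join() is called.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

class ForkOnlyEnqueuesDemo {

    /** Trivial child task (hypothetical); it just returns a constant. */
    static class Child extends RecursiveTask<Integer> {
        @Override
        protected Integer compute() {
            return 42;
        }
    }

    /** Parent task that records what it observes right after fork(). */
    static class Parent extends RecursiveTask<int[]> {
        @Override
        protected int[] compute() {
            Child child = new Child();
            int queuedBefore = ForkJoinTask.getQueuedTaskCount();
            child.fork();                                   // only enqueues, then returns
            int queuedAfter = ForkJoinTask.getQueuedTaskCount();
            int doneRightAfterFork = child.isDone() ? 1 : 0;
            int result = child.join();                      // the child is executed here
            return new int[] {queuedBefore, queuedAfter, doneRightAfterFork, result};
        }
    }

    /** Runs the parent on a single-worker pool and returns its observations. */
    static int[] observe() {
        // Assumption: parallelism 1, so there is no second worker that could steal the child.
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            return pool.submit(new Parent()).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = observe();
        System.out.println("queued before fork = " + r[0]);
        System.out.println("queued after fork  = " + r[1]);
        System.out.println("done after fork    = " + (r[2] == 1));
        System.out.println("join result        = " + r[3]);
    }
}
```

With more than one worker the numbers are no longer predictable, because an idle worker may steal the child between the two calls to getQueuedTaskCount().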

4.7. Implementation of task.join

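The semantics of join are to wait until the task has completed and then return, in the same spirit as Thread.join(). There is one problem, though: if a thread blocks while waiting for a result, that thread can no longer be put to use, so how would any subsequent tasks get run? Recursion seems to be the only way out. But recursion normally runs inside a single thread, so wouldn't that throw away the benefit of concurrency?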

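In fact, this is exactly why the work is split into two steps, fork and join. As the previous section showed, fork pushes the task onto the queue and returns right away. If the current worker is busy at that point (still splitting tasks), the queued tasks get stolen and processed by other workers. Each of those workers then runs into its own recursion while processing, which turns a single-threaded recursion into a multi-threaded one.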
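To make the effect of stealing visible, here is a small sketch that counts how many leaf tasks each worker thread ends up running. It is only an illustration: the class names, the threshold of 1,000 and the range sum are assumptions chosen for the demo, not anything taken from the JDK source. How the leaves are distributed depends on timing and will differ between runs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicInteger;

class StealingSpreadDemo {
    static final int THRESHOLD = 1_000;   // hypothetical leaf size
    static final Map<String, AtomicInteger> LEAVES_PER_THREAD = new ConcurrentHashMap<>();

    /** Sums the integers in [from, to) by recursive splitting. */
    static class SumTask extends RecursiveTask<Long> {
        final int from, to;

        SumTask(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                // Record which thread executed this leaf.
                LEAVES_PER_THREAD
                    .computeIfAbsent(Thread.currentThread().getName(), k -> new AtomicInteger())
                    .incrementAndGet();
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += i;
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(from, mid);
            SumTask right = new SumTask(mid, to);
            left.fork();                      // queued; an idle worker may steal it
            long rightSum = right.compute();  // keep recursing in the current thread
            return left.join() + rightSum;
        }
    }

    static long sum(int n, int parallelism) {
        LEAVES_PER_THREAD.clear();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.submit(new SumTask(0, n)).join();
        } finally {
            pool.shutdown();
        }
    }

    static int totalLeaves() {
        int total = 0;
        for (AtomicInteger c : LEAVES_PER_THREAD.values()) {
            total += c.get();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("sum = " + sum(1_000_000, 4));
        LEAVES_PER_THREAD.forEach((thread, count) ->
            System.out.println(thread + " ran " + count + " leaf tasks"));
    }
}
```

Forking one half and computing the other half directly is a common idiom: the current thread keeps descending while the forked half sits in its queue, available to any worker that runs out of work.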

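Below we mainly follow the recursion inside a single thread. By itself, join only means waiting for the current task to finish. But the current task can only finish once its subtasks have been joined, and those in turn wait on their own subtasks, so a recursion forms. What the return of join() really signals is that compute() has finished, so the whole process is interleaved with the execution of compute().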
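The worry raised earlier, that a waiting thread is a wasted thread, can be made concrete with a contrast. The sketch below is an illustration under stated assumptions, not the author's code: a single-thread ExecutorService is used as the "ordinary pool", the 200 ms timeout is arbitrary, and the Fibonacci task is a made-up workload. In the ordinary pool a nested blocking wait cannot make progress, while a ForkJoinPool with a single worker still finishes, because the joining worker executes the subtasks itself.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingVsJoinDemo {

    /**
     * On a one-thread executor, the outer task submits an inner task to the
     * same executor and waits for it. Returns true if that wait timed out.
     */
    static boolean nestedWaitTimesOut() {
        ExecutorService single = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = single.submit(() -> {
                Future<Integer> inner = single.submit(() -> 1);
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;   // inner finished in time
                } catch (TimeoutException e) {
                    return true;    // the only thread was busy waiting
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            single.shutdownNow();
        }
    }

    /** Naive Fibonacci (hypothetical workload) using fork/join. */
    static class Fib extends RecursiveTask<Integer> {
        final int n;

        Fib(int n) {
            this.n = n;
        }

        @Override
        protected Integer compute() {
            if (n < 2) {
                return n;
            }
            Fib a = new Fib(n - 1);
            a.fork();
            int b = new Fib(n - 2).compute();
            return a.join() + b;    // join runs the forked task if nobody took it
        }
    }

    /** The same nesting pattern, but on a ForkJoinPool with a single worker. */
    static int fibOnOneWorker(int n) {
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            return pool.submit(new Fib(n)).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("nested wait timed out: " + nestedWaitTimesOut());
        System.out.println("fib(15) on one worker: " + fibOnOneWorker(15));
    }
}
```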

    // java.util.concurrent.ForkJoinTask#join
    /**
     * Returns the result of the computation when it {@link #isDone is
     * done}.  This method differs from {@link #get()} in that
     * abnormal completion results in {@code RuntimeException} or
     * {@code Error}, not {@code ExecutionException}, and that
     * interrupts of the calling thread do <em>not</em> cause the
     * method to abruptly return by throwing {@code
     * InterruptedException}.
     *
     * @return the computed result
     */
    public final V join() {
        int s;
        if ((s = doJoin() & DONE_MASK) != NORMAL)
            reportException(s);
        // Once the task has completed, fetch the result
        return getRawResult();
    }

    /**
     * Throws exception, if any, associated with the given status.
     */
    private void reportException(int s) {
        if (s == CANCELLED)
            throw new CancellationException();
        if (s == EXCEPTIONAL)
            rethrow(getThrowableException());
    }

    // java.util.concurrent.RecursiveTask#getRawResult
    public final V getRawResult() {
        return result;
    }
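The Javadoc above states that join() and get() report failures differently. A short sketch of that difference follows; the failing task and the class names are invented for the demo. Only the exception types are inspected, since the exception object seen by the caller may be a copy of the one thrown inside the worker.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

class JoinVsGetDemo {

    /** A task that always fails (hypothetical, for illustration). */
    static class Failing extends RecursiveTask<Integer> {
        @Override
        protected Integer compute() {
            throw new IllegalStateException("boom");
        }
    }

    /** join() rethrows the task's unchecked exception directly. */
    static String viaJoin(ForkJoinPool pool) {
        ForkJoinTask<Integer> task = pool.submit(new Failing());
        try {
            task.join();
            return "no exception";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** get() wraps the failure in an ExecutionException. */
    static String viaGet(ForkJoinPool pool) {
        ForkJoinTask<Integer> task = pool.submit(new Failing());
        try {
            task.get();
            return "no exception";
        } catch (ExecutionException e) {
            return "ExecutionException caused by " + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    /** Runs both variants on a fresh pool and returns {join outcome, get outcome}. */
    static String[] run() {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            return new String[] {viaJoin(pool), viaGet(pool)};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("join(): " + r[0]);
        System.out.println("get():  " + r[1]);
    }
}
```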

    /**
     * Implementation for join, get, quietlyJoin. Directly handles
     * only cases of already-completed, external wait, and
     * unfork+exec.  Others are relayed to ForkJoinPool.awaitJoin.
     *
     * @return status upon completion
     */
    private int doJoin() {
        int s; Thread t; ForkJoinWorkerThread wt; ForkJoinPool.WorkQueue w;
        return (s = status) < 0 ? s :
            ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
            // Take this task back off the queue and run it: doExec runs the task,
            // awaitJoin waits for it to complete
            (w = (wt = (ForkJoinWorkerThread)t).workQueue).
            tryUnpush(this) && (s = doExec()) < 0 ? s :
            wt.pool.awaitJoin(w, this, 0L) :
            externalAwaitDone();
    }
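The nested conditional expression in doJoin() is dense, so it may help to restate its decision order as plain if statements. The sketch below is a toy model only: the enum, the method and its boolean parameters are hypothetical stand-ins for the real checks (status < 0, the instanceof test, tryUnpush and the result of doExec), and nothing here touches an actual pool.

```java
class DoJoinPathsDemo {

    /** The four outcomes that doJoin() distinguishes (names are illustrative). */
    enum Path { ALREADY_DONE, RAN_LOCALLY, AWAIT_JOIN, EXTERNAL_WAIT }

    /**
     * Mirrors the order of the checks in doJoin().
     *
     * @param done            stands for "status < 0" on entry
     * @param callerIsWorker  stands for "current thread is a ForkJoinWorkerThread"
     * @param onTopOfOwnQueue stands for "tryUnpush(this) succeeded"
     * @param doneAfterExec   stands for "doExec() returned a status < 0"
     */
    static Path choose(boolean done, boolean callerIsWorker,
                       boolean onTopOfOwnQueue, boolean doneAfterExec) {
        if (done) {
            return Path.ALREADY_DONE;      // nothing to wait for
        }
        if (!callerIsWorker) {
            return Path.EXTERNAL_WAIT;     // externalAwaitDone()
        }
        if (onTopOfOwnQueue && doneAfterExec) {
            return Path.RAN_LOCALLY;       // unfork + exec in the joining thread
        }
        return Path.AWAIT_JOIN;            // relayed to ForkJoinPool.awaitJoin
    }

    public static void main(String[] args) {
        System.out.println(choose(false, true, true, true));
        System.out.println(choose(false, true, false, false));
        System.out.println(choose(false, false, false, false));
    }
}
```

The RAN_LOCALLY branch is the common case when a task is forked and then joined shortly afterwards by the same worker: the task is still on top of that worker's queue, so it is simply popped and executed in place.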
    // java.util.concurrent.ForkJoinPool#awaitJoin
    /**
     * Helps and/or blocks until the given task is done or timeout.
     *
     * @param w caller
     * @param task the task
     * @param deadline for timed waits, if nonzero
     * @return task status on exit
     */
    final int awaitJoin(WorkQueue w, ForkJoinTask<?> task, long deadline) {
        int s = 0;
        if (task != null && w != null) {
            ForkJoinTask<?> prevJoin = w.currentJoin;
            U.putOrderedObject(w, QCURRENTJOIN, task);
            CountedCompleter<?> cc = (task instanceof CountedCompleter) ?
                (CountedCompleter<?>)task : null;
            for (;;) {
                if ((s = task.status) < 0)
                    break;
                if (cc != null)
                    helpComplete(w, cc, 0);
                // Try to run the joined task from the local queue; if the queue is
                // empty, help the stealer instead. This is where the recursion continues.
                else if (w.base == w.top || w.tryRemoveAndExec(task))
                    helpStealer(w, task);
                if ((s = task.status) < 0)
                    break;
                long ms, ns;
                if (deadline == 0L)
                    ms = 0L;
                else if ((ns = deadline - System.nanoTime()) <= 0L)
                    break;
                else if ((ms = TimeUnit.NANOSECONDS.toMillis(ns)) <= 0L)
                    ms = 1L;
                if (tryCompensate(w)) {
                    task.internalWait(ms);
                    U.getAndAddLong(this, CTL, AC_UNIT);
                }
            }
            U.putOrderedObject(w, QCURRENTJOIN, prevJoin);
        }
        return s;
    }

        // java.util.concurrent.ForkJoinPool.WorkQueue#tryRemoveAndExec
        /**
         * If present, removes from queue and executes the given task,
         * or any other cancelled task. Used only by awaitJoin.
         *
         * @return true if queue empty and task not known to be done
         */
        final boolean tryRemoveAndExec(ForkJoinTask<?> task) {
            ForkJoinTask<?>[] a; int m, s, b, n;
            if ((a = array) != null && (m = a.length - 1) >= 0 &&
                task != null) {
                while ((n = (s = top) - (b = base)) > 0) {
                    for (ForkJoinTask<?> t;;) {      // traverse from s to b
                        long j = ((--s & m) << ASHIFT) + ABASE;
                        if ((t = (ForkJoinTask<?>)U.getObject(a, j)) == null)
                            return s + 1 == top;     // shorter than expected
                        else if (t == task) {
                            boolean removed = false;
                            if (s + 1 == top) {      // pop
                                if (U.compareAndSwapObject(a, j, task, null)) {
                                    U.putOrderedInt(this, QTOP, s);
                                    removed = true;
                                }
                            }
                            else if (base == b)      // replace with proxy
                                removed = U.compareAndSwapObject(
                                    a, j, task, new EmptyTask());
                            // Run the subtask
                            if (removed)
                                task.doExec();
                            break;
                        }
                        else if (t.status < 0 && s + 1 == top) {
                            if (U.compareAndSwapObject(a, j, t, null))
                                U.putOrderedInt(this, QTOP, s);
                            break;                  // was cancelled
                        }
                        if (--n == 0)
                            return false;
                    }
                    if (task.status < 0)
                        return false;
                }
            }
            return true;
        }
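Both push and tryRemoveAndExec address the task array through a masked index ((m & s), (--s & m)) between base and top. The sketch below is a deliberately simplified, single-threaded model of that layout, meant only to show the index arithmetic. It is an assumption-laden toy: it uses a plain Object[] instead of Unsafe, has no CAS or memory fences, and the class and method names are invented, so it must not be read as how the JDK synchronizes the queue.

```java
class ToyWorkQueue {
    private Object[] array = new Object[8];   // length is always a power of two
    private int base;                          // index of the oldest task (thieves take here)
    private int top;                           // index of the next free slot (owner pushes here)

    /** Owner side: store at (top & mask), then advance top; grow once the array is full. */
    void push(Object task) {
        int m = array.length - 1;
        int n = top - base;                    // size before this push
        array[top & m] = task;
        top++;
        if (n >= m) {
            grow();
        }
    }

    /** Owner side: take the most recently pushed task (LIFO). */
    Object pop() {
        if (top == base) {
            return null;
        }
        int i = --top & (array.length - 1);
        Object t = array[i];
        array[i] = null;
        return t;
    }

    /** Thief side: take the oldest task (FIFO). */
    Object poll() {
        if (top == base) {
            return null;
        }
        int i = base++ & (array.length - 1);
        Object t = array[i];
        array[i] = null;
        return t;
    }

    int size() {
        return top - base;
    }

    /** Doubles the array, re-placing each task at its index under the new mask. */
    private void grow() {
        Object[] old = array;
        int oldMask = old.length - 1;
        Object[] bigger = new Object[old.length << 1];
        int mask = bigger.length - 1;
        for (int i = base; i != top; i++) {
            bigger[i & mask] = old[i & oldMask];
        }
        array = bigger;
    }

    public static void main(String[] args) {
        ToyWorkQueue q = new ToyWorkQueue();
        for (int i = 1; i <= 20; i++) {
            q.push(i);
        }
        System.out.println("owner pops  " + q.pop());
        System.out.println("thief polls " + q.poll());
        System.out.println("remaining   " + q.size());
    }
}
```

Because base and top only ever increase and are masked on access, the array behaves as a ring: the owner works at one end while thieves take from the other, which keeps the two sides apart most of the time.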

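As we can see, fork/join ultimately still relies on recursion to wait for joined tasks. The difference is that it exploits multiple threads to run many tasks at the same time. This has two benefits: it lightens the processing load on any single thread, and it spreads the recursion depth across several threads, which makes an early stack overflow less likely.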

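How many tasks each thread ends up with, and how many results each join has to wait for, is hard to say. For example, if the tasks of the topmost thread are all stolen by other threads, it only has to keep waiting for results, whereas the bottommost thread has to carry the deepest recursion in order to guarantee that the program eventually terminates. From this we can make a guess of our own: if nothing were controlled and threads were allowed to execute tasks arbitrarily, could a deadlock arise? That may well be a problem.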


Author: 等你归去来

Source: https://www.cnblogs.com/yougewe/p/14943418.html

