不正经

这个屌丝很懒，什么也没留下！

热门标签

java list.sort_Java面试-List中的sort详细解读

作者：不正经 | 2024-02-15 15:05:15

踩

java list.sort

最近看了一些排序相关的文章，因此比较好奇，Java中的排序是如何做的。本片文章介绍的是JDK1.8，List中的sort方法。

先来看看List中的sort是怎么写的：

@SuppressWarnings({"unchecked", "rawtypes"})

default void sort(Comparator super E> c) {

Object[] a = this.toArray();

Arrays.sort(a, (Comparator) c);

ListIterator i = this.listIterator();

for (Object e : a) {

i.next();

i.set((E) e);

}

首先，你需要传入一个比较器作为参数，这个好理解，毕竟你肯定要定一个比较标准。然后就是将list转换成一个数组，再对这个数组进行排序，排序完之后，再利用iterator重新改变list。

接着，我们再来看看Arrays.sort：

public static void sort(T[] a, Comparator super T> c) {

if (c == null) {

sort(a);

} else {

if (LegacyMergeSort.userRequested)

legacyMergeSort(a, c);

else

TimSort.sort(a, 0, a.length, c, null, 0, 0);

}

public static void sort(Object[] a) {

if (LegacyMergeSort.userRequested)

legacyMergeSort(a);

else

ComparableTimSort.sort(a, 0, a.length, null, 0, 0);

}

static final class LegacyMergeSort {

private static final boolean userRequested =

java.security.AccessController.doPrivileged(

new sun.security.action.GetBooleanAction(

"java.util.Arrays.useLegacyMergeSort")).booleanValue();

}

这样可以看出，其实排序的核心就是TimSort，LegacyMergeSort大致意思是表明如果版本很旧的话，就用这个，新版本是不会采用这种排序方式的。

我们再来看看TimSort的实现：

private static final int MIN_MERGE = 32;

static void sort(T[] a, int lo, int hi, Comparator super T> c,

T[] work, int workBase, int workLen) {

assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;

int nRemaining = hi - lo;

if (nRemaining < 2)

return; // Arrays of size 0 and 1 are always sorted

// If array is small, do a "mini-TimSort" with no merges

if (nRemaining < MIN_MERGE) {

// 获得最长的递增序列

int initRunLen = countRunAndMakeAscending(a, lo, hi, c);

binarySort(a, lo, hi, lo + initRunLen, c);

return;

}

/**

* March over the array once, left to right, finding natural runs,

* extending short natural runs to minRun elements, and merging runs

* to maintain stack invariant.

TimSort ts = new TimSort<>(a, c, work, workBase, workLen);

int minRun = minRunLength(nRemaining);

do {

// Identify next run

int runLen = countRunAndMakeAscending(a, lo, hi, c);

// If run is short, extend to min(minRun, nRemaining)

if (runLen < minRun) {

int force = nRemaining <= minRun ? nRemaining : minRun;

binarySort(a, lo, lo + force, lo + runLen, c);

runLen = force;

}

// Push run onto pending-run stack, and maybe merge

ts.pushRun(lo, runLen);

ts.mergeCollapse();

// Advance to find next run

lo += runLen;

nRemaining -= runLen;

} while (nRemaining != 0);

// Merge all remaining runs to complete sort

assert lo == hi;

ts.mergeForceCollapse();

assert ts.stackSize == 1;

}

如果小于2个，代表不再不需要排序；如果小于32个，则采用优化的二分排序。怎么优化的呢？首先获得最长的递增序列：

private static int countRunAndMakeAscending(T[] a, int lo, int hi,

Comparator super T> c) {

assert lo < hi;

int runHi = lo + 1;

if (runHi == hi)

return 1;

// Find end of run, and reverse range if descending

if (c.compare(a[runHi++], a[lo]) < 0) { // Descending

// 一开始是递减序列，就找出最长递减序列的最后一个下标

while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)

runHi++;

// 逆转前面的递减序列

reverseRange(a, lo, runHi);

} else { // Ascending

while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)

runHi++;

}

return runHi - lo;

}

接着进行二分排序：

private static void binarySort(T[] a, int lo, int hi, int start,

Comparator super T> c) {

assert lo <= start && start <= hi;

if (start == lo)

start++;

for ( ; start < hi; start++) {

T pivot = a[start];

// Set left (and right) to the index where a[start] (pivot) belongs

int left = lo;

int right = start;

assert left <= right;

* Invariants:

* pivot >= all in [lo, left).

* pivot < all in [right, start).

// start位置是递增序列后的第一个数的位置

// 从前面的递增序列中找出start位置的数应该处于的位置

while (left < right) {

// >>> 无符号右移

int mid = (left + right) >>> 1;

if (c.compare(pivot, a[mid]) < 0)

right = mid;

else

left = mid + 1;

}

assert left == right;

* The invariants still hold: pivot >= all in [lo, left) and

* pivot < all in [left, start), so pivot belongs at left. Note

* that if there are elements equal to pivot, left points to the

* first slot after them -- that's why this sort is stable.

* Slide elements over to make room for pivot.

int n = start - left; // The number of elements to move

// Switch is just an optimization for arraycopy in default case

// 比pivot大的数往后移动一位

switch (n) {

case 2: a[left + 2] = a[left + 1];

case 1: a[left + 1] = a[left];

break;

default: System.arraycopy(a, left, a, left + 1, n);

}

a[left] = pivot;

}

好了，待排序数量小于32个的讲完了，现在来说说大于等于32个情况。首先，获得一个叫minRun的东西，这是个啥含义呢：

int minRun = minRunLength(nRemaining);

private static int minRunLength(int n) {

assert n >= 0;

int r = 0; // Becomes 1 if any 1 bits are shifted off

while (n >= MIN_MERGE) {

// 这里我没搞懂的是为什么不直接将(n & 1)赋值给r，而要做一次逻辑或。

r |= (n & 1);

n >>= 1;

}

return n + r;

}

各种位运算符，MIN_MERGE默认为32，如果n小于此值，那么返回n本身。否则会将n不断地右移，直到小于MIN_MERGE，同时记录一个r值，r代表最后一次移位n时，n最低位是0还是1。

其实看注释比较容易理解：

Returns the minimum acceptable run length for an array of the specified length. Natural runs shorter than this will be extended with binarySort.

Roughly speaking, the computation is: If n < MIN_MERGE, return n (it's too small to bother with fancy stuff).

Else if n is an exact power of 2, return MIN_MERGE/2.

Else return an int k, MIN_MERGE/2 <= k <= MIN_MERGE, such that n/k is close to, but strictly less than, an exact power of 2. For the rationale, see listsort.txt.

返回结果其实就是用于接下来的合并排序中。

接下来就是一个while循环

do {

// Identify next run

// 获得一个最长递增序列

int runLen = countRunAndMakeAscending(a, lo, hi, c);

// If run is short, extend to min(minRun, nRemaining)

// 如果最长递增序列

if (runLen < minRun) {

int force = nRemaining <= minRun ? nRemaining : minRun;

binarySort(a, lo, lo + force, lo + runLen, c);

runLen = force;

}

// Push run onto pending-run stack, and maybe merge

// lo——runLen为将要被归并的范围

ts.pushRun(lo, runLen);

// 归并

ts.mergeCollapse();

// Advance to find next run

lo += runLen;

nRemaining -= runLen;

} while (nRemaining != 0);

这样，假设你的每次归并排序的两个序列为r1和r2，r1肯定是有序的，r2也已经被排成递增序列了，因此这样的归并排序就比较特殊了。

为什么要用归并排序呢，因为归并排序的时间复杂度永远为O(nlogn)，空间复杂度为O(n)，以空间换取时间。

好了，以上就是针对Java中的排序做的一次总结，但具体的归并代码还没有分析，其实我自己也没有完全研究透，为什么minRun的取值是这样的，这也和TimSort中的stackLen有关，有兴趣的小伙伴可以在下方留言，我们可以一起探讨。

有兴趣的话可以关注我的公众号，说不定会有意外的惊喜。

声明：本文内容由网友自发贡献，不代表【wpsshop博客】立场，版权归原作者所有，本站不承担相应法律责任。如您发现有侵权的内容，请联系我们。转载请注明出处：https://www.wpsshop.cn/w/不正经/article/detail/85214