sort : Java Glossary

go to home page S words local find full screen, hide local find menu Google search web for more information on this topic jump to foot of page translate this page with Babelfish 2008-07-23 by Roedy Green ©1996-2009 Canadian Mind Products
index page for letter ⇒ punctuation 0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z (all)
sort
Putting items in ascending or descending order by key fields. The English language meaning of sort is simply to categorise, e.g. to put all your bills into folders by category, e.g. all the electric bills together, all the phone bills together. An internal sort is done strictly in RAM. An external sort uses hard disk temporary files. A sort is stable if it naturally preserves existing order when you have duplicate keys. If you want existing order preserved and you are using an unstable sort, you must add an extra sequence number key to your records and use that as a secondary tie-breaking key.
Internal Sort Bad Sorts
Gotchas Ordered Collections
Comparators Descending/Inverse/Reverse Order with reverseOrder
Comparables Ascending/Descending Indicators
Sorting Speed Generic Sorts
External Sort Complete Example
Merge Sort Learning More
Tag Sort Links

Internal Sort

Here is James Goslings’s QuickSort.

In the JDK 1.2 there are built-in sorts such as Arrays.sort and Collections.sort See java.util.Comparator and java.util.Comparable for different sorting orders. For String sorting in various international orders see java.text.Collator, java.text.RuleBasedCollator, and java.text.CollationKey.

In the simplest case, you can sort an array like this:

import java.util.Arrays;
//...
// works for:
// int[] myArray;
// Object[] myArray;
// Dog[] myArray;
// byte[] myArray;
// etc.
Arrays.sort( myArray );
or an ArrayList (or any other Collection) like this:
import java.util.ArrayList;
import java.util.Collections;
...
Collections.sort( myArrayList );

Note that there is no such method as ArrayList.sort.

Gotchas

Comparators

For more complex sorts you need a class that implements Comparator. Often you can piggyback that code on some other class. The following example does the same as the code above, but could be extended to sort by several fields.
To sort in descending order, simply write a Comparator with a compare that returns the negative of the value it does now i.e.
return -( (String )a ).compareTo( (String)b );

Comparables

If you write a class, you can define its natural sort order, by having that class implement the Comparable interface. Then you can sort arrays or ArrayLists of your Objects without needing to specify a Comparator. Instead of a compare method, an Comparable interface has a compareTo method.

Sorting Speed

There is considerable overhead to set up a sort, of even a couple of elements. So don’t sort every time you add or change an element. Wait till all your changes are done, then do the sort. You can bypass the sort altogether if the size < 2.

In the following table, speed is measured relative to RadixSort on a 100,000 item set of 5-character random keys of the form A9999 where SunSort is arbitrarily assigned 100. Tests done under Java 1.4.1 java.exe on a 233 MHz Pentium 128 MB RAM. Bigger is better.

Sort speed
(bigger is better)
stable? notes
SunSort (Java’s Built-In Sort) 100 yes This is a most excellent sort. It uses a modified QuickSort for primitives and a MergeSort (sometimes called a tournament sort for objects). The merge sort makes a clone of the array (but not the objects) before it starts, so be aware of the memory usage. There is no reason to use any but this sort except for very large sorts with keys of 4 characters or less where RadixSort might beat it out. Of course you can’t use this in the older Java versions before it was available. Invoke with Arrays.sort and Collections.sort. Beware when using the versions with startIndex and endIndex. The endIndex points one past the last element.
RadixSort 86 yes Good for very large sorts, takes same time per item no matter how many items. Ordinary sorts take more time per item as the number of items goes up. More complicated Comparator routine to write. Likes short keys. If you double the length of the key the sort time doubles.
HeapSort 48 no Works best when data are almost already sorted.
ShellSort 34 no Small and quick-loading. Good for small sorts under 2000 items.
QuickSort 14 no Likes random data to sort. Order in the data can cause it to take a very long time to sort. Not good on very small sorts. Varies wildly in time to sort, sometimes a speed demon, sometimes a pig. It just depends on the luck of the draw. I’m told the version I have could be made much better with a few tweaks.

External Sort

If you have too much data to sort to fit in RAM at once, you will need to use an external sort such as OptTech neé Opt-Tech Data Processing. It works by sorting RAM-fulls of records, then doing an N-way merge on the sorted batches, to create fewer larger sorted batches. Eventually it ends up with one large sorted batch.

Abundance uses Op-Tech sort. It rolls itself out to disk, calls in Opt-Tech sort, giving it all of RAM, then rolls itself back in.

Merge Sort

A merge sort is used primarily in large sorts where the data are to big to fit in RAM. You sort bundles and write them to disk, using any of the usual internal sorting techniques, but not a merge sort. Then you merge N bundles, reading them sequentially and interleaving them into one bundle N times as long. You repeat until you have one big bundle. To get maximum speed, to eliminate head motion, your scratch intermediate disk files should be on separate physical drives. You tweak N to use RAM/virtual RAM most efficiently. Merge sorts are not particularly good for internal sorts.

You might experiment with a merge sort to use on a dual core CPU. You split the data in two. Sort each on a separate thread then merge the results in a single pass.

Tag Sort

One problem you often run into is having two arrays you want to keep in sync when you sort one. There are three approaches.
  1. Combine the two arrays into one, with objects having two or more fields. Write a Comparator or implement Comparable for these new combined objects.
  2. Sorts don’t actually sort the records, they just order an array of references. The objects themselves don’t move. If the things you are sorting are not collected nicely together into objects you can use a tag sort. Tag records are short records consisting of the sorting key fields plus a reference back to the original record. This reference may be an array index or a pointer to the original object. The tag records need not contain any data at all, just a way to find it when the comparator is called.
  3. Use the cern.colt.GenericSorting class.

Bad Sorts

There are a few simple sorts you should almost never use since the take time o(n²). In order words they turn into utter pigs when the number of elements to sort gets large.

One is a selection sort:

  1. Select the largest of the N keys and move its record to the output area.
  2. Select the largest of the remaining N-1 keys and move its record to the output area. etc.
Another is the bubble sort, where you pairwise swap elements in a pass through all the elements to percolate the biggest element to the top. Then repeat for the list not containing the biggest elements so far. etc.

A variant of the bubble sort, that is a tiny bit faster is the bi-directional sort that percolates the smallest element to the bottom and the biggest element to the top in alternating direction passes, gradually working toward the middle. Sun has built an

to let you watch bubble sort and bidirectional BubbleSort race against QuickSort. It becomes obvious why BubbleSort should not be used.

Another is the insertion sort, where you create a new list and gradually add to it, always keeping the new list in sorted order by inserting the new element in the place where it belongs. You find the place by binary search, and slide all the subsequent elements down to make room for it. You sometimes have to use this type of sort when you don’t have all the elements at the start of the sort. They are just given to you one at a time, and you want a sorted list immediately output after each element is added.

Ordered Collections

Instead of sorting, you can maintain a set in order with TreeSet or its brother TreeMap. Whether you go the TreeSet route or the export with toArray and sort route depends on whether you need sorted order only on occasion. If you do, the export method is better. If you need the sorted order maintained up to the second, then the TreeSet method is better. The TreeSet method consumes more memory and more CPU time, and is perhaps a trifle more complex to code. With either technique, it is possible view the same data in several sorted orders by sorting the exported array in different orders or by maintaining multiple TreeSets on the same data objects.

Descending/Inverse/Reverse Order with reverseOrder

Sorting ascending order means sorting with the small elements first then the big. This is usual ordering. Descending order means sorting with the big elements first then the small.

If you have a Comparator or Comparable of some kind, you can convert it into one that sorts into the reverse of the usual order. E.g. if the original sorts alphabetically, the new one will sort in reverse alphabetical order. Here is how you use it:

If you don't have a suitable base Comparator, just write an ordinary Comparator from scratch and reverse the operands to each compare inside it, or return - result instead of result.

Ascending/Descending Arrow Indicators

If you do a GUI, you will want some way of indicating which way your data are sorted ascending or descending. The problem is, no matter what you do you are going to confuse some of you users. Why is this? What does the direction mean to various people?

Perhaps some day the association and icon choices will be a configurable part of look an feel so users will have consistent conventions across all the apps they use. Consistency is the problem.

I have found the majority up application use an upward arrow to indicate that items are displayed in ascending order.

My convention is to use an arrow pointing up and to the right to indicate the data are sorted in ascending order. This helps break the connection with reading direction, and suggests the climbing association.

No matter what you do, you are going do it “backwards” for some people, and irritate them. So you might use indicators that don’t have any arrows or at least no confusing up/down arrows. Let the user figure out what your colours or letters A/D mean for example.

Generic Sorts

If you write a sort for a List, e.g. ArrayList, with proper generics, it will work on collections of any type that supports Comparable or Comparator. To see how to pull it off, have a look at the source for any of my sorts, or Sun’s sort.

However, because of Java’s lack of orthogonality, your List sort won’t work for arrays of such Objects. You need to write very similar code to do that. Even that array version won’t sort an array of primitives such as long, int or byte. You have to write yet another slightly different version of the sort to handle each type of primitive.

Complete Example

Learning More

Sun’s Javadoc on Collections.sort : available:
Sun’s Javadoc on Arrays.sort : available:
Sun’s Javadoc on the Comparable class : available:
Note java.lang.Comparable but java.util.Comparator.
Sun’s Javadoc on the Comparator class : available:
Sun’s Javadoc on Collections.reverseOrder : available:
Sun’s Javadoc on Collections.reverseOrder() : available:
Sun’s Javadoc on Collections.reverseOrder(Comparator) : available:

CMP homejump to top
CMP logo
feedback Please email your feedback for publication, errors, omissions, broken/redirected link reports
and suggestions to improve this page to Roedy Green : feedback email
made with CSS
HTML Checked!
ICRA ratings logo
mindprod.com IP:[65.110.21.43]
Your face IP:[38.103.63.62] The information on this page is for non-military use only.
You are visitor number 140,334. Military use includes use by defence contractors.
You can get a fresh copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/jgloss/sort.html J:\mindprod\jgloss\sort.html