Monkey-timsort: FAST array sorting

Monkey Programming Forums/User Modules/Monkey-timsort: FAST array sorting

Nobuyuki(Posted 2013) [#1]
This module provides an implementation of TimSort, a new and fast sorting algorithm. Under testing, I've found this algorithm to be much faster in the most common use scenarios than all of the other implementations currently available for Monkey, including Diddy's QuickSort(), and most likely the default List<T>.Sort(). It's my hope that this may one day become the default sorting algorithm for Monkey arrays and containers.



The code is adapted mostly from the default Java implementation written by Josh Bloch (formerly of Google), and is licensed under the GPL.

Q: What can I use this for?
A: Exactly what it says in the thread title -- fast sorting. The most common usage scenarios for games would be Y-Order and Z-Order sorting of sprites or tiles. Isometric tile renderers and 3d-vectorball routines can gain a large speed advantage with this sort -- since the data inside the tile/object arrays doesn't change often, the sort can skip the vast majority of the work, leaving more precious CPU cycles available to work with on mobile (and probably increasing your framerate as a result!)

Q: Can I use this in my commercial projects?
A: Generally speaking, yes, you should be able to. The original license was GPLv2, but is part of the Java classpath and is covered by their classpath linking exception. This means that the Monkey code is covered under the linking exception, too -- you may static link the module in commercial projects, and only modifications to the module itself need their source to be released. Monkey-timsort's license was changed to reflect this.

Download: (Fork, Clone, or click the ZIP button)
https://github.com/nobuyukinyuu/monkey-timsort

Be sure to read the wiki for help! Thanks, and enjoy!


ziggy(Posted 2013) [#2]
why GPL? We won't be able to use it on any commercial project. Is it GPL becouse the original algo is also GPL?


Nobuyuki(Posted 2013) [#3]
Yes, that's correct, it's GPL because the original code was GPL. I'm pretty sure there's an exception in the GPL which covers code in projects that simply static-link to this, otherwise it wouldn't be a part of Java's default library for arraysort (as of Java SE 7), or Python's default sort, either. If the code to monkey-timsort itself is modified, the code should probably be released, but I would shy away from the notion that the implementation can't be used in a commercial project...


ziggy(Posted 2013) [#4]
No, there's no exception. The exception was added to LGPL to allow static linking, but AFAIK, GPL forces any software using a GPL licensed code to be also open-source, wich makes the GPL license highly incompatible with most other existing licenses.


skid(Posted 2013) [#5]
It's a pity the port is not based on the original implementation

http://svn.python.org/projects/python/trunk/Objects/listobject.c

which is not infected by GPL.


Nobuyuki(Posted 2013) [#6]
Ziggy: Okay, I did some asking around, and here's what I found. Apparently, the original Java SE classpath was released under GPL v2, with a classpath linking exception, which should allow people to release their commercial projects without releasing source. Since I adapted this code from there, I will update my license to reflect this exception.


Soap(Posted 2013) [#7]
Report from the original implementation? :P


muddy_shoes(Posted 2013) [#8]
The Java code uses the Classpath exception: http://www.gnu.org/software/classpath/license.html and software that uses it doesn't need to be GPL'd as a whole.

If you're going to keep it GPL then you should include the same exception if you want it to be usable.

Edit: Oops, wrote the post and then went and did something else before hitting the button. I see you've got there already.


Nobuyuki(Posted 2013) [#9]
thanks a lot, muddy. Almost lost my crap seeing ziggy's post after 3 days of work! Glad to know that this can be used in commercial projects. I really never ran into licensing problems before (never having released a library covered under GPL), so I didn't know how much it's treated as radioactive/poison by commercial entities!


Samah(Posted 2013) [#10]
You gonna put a reference to Diddy in that Arrays class as per the MIT license?

Edit: I've just added the MIT header to all of Diddy. Please copy it into your version.


muddy_shoes(Posted 2013) [#11]
I've had a quick look at this and other than the "already sorted" case it looks like the advantage you're seeing is mostly down to diddy's use of object arrays. If I change the quicksort implementation to use generics and the Int array directly then it is generally faster than TimSort.

There's a worthwhile speed-up to be had by cutting some of the flab from the Array copy operations but it doesn't quite close the gap.


Samah(Posted 2013) [#12]
I'm already working on primitives for Arrays.Sort/QuickSort, but it's messy trying to mix primitives and objects while avoiding boxing.

Edit: Optimally I'd like to do something native with the array copying like Java's System.arrayCopy() method which actually does a raw memory transfer via JNI.


Nobuyuki(Posted 2013) [#13]
muddy_shoes: I changed the code sometime before you wrote up about the boxing/unboxing so as to try and measure only the time spent in Quicksort() itself, and not boxing/unboxing the array in order to try and make the comparison more fair.

That being said, the purely random numbers that are generated for these tests represent a worst-case scenario for most of the data that's fed to a sort in our games. Things like sprite order sorts are going to be mostly-ordered already from frame to frame, and in those usage scenarios, TimSort has been shown (in tests done by people a lot smarter than me!) to be quite a bit more optimized for that purpose.

Samah: Sorry about that! Completely slipped my mind... I'll get the new headers up for Arrays.monkey right away. I probably could've foregone the usage of Arrays<T> for array copying, but I figured that Diddy probably did it the "fastest way" possible without going native, and its use of generics fit nicely into the existing style of the original (and matched Java's arraycopy arguments order, to boot).

What would really be nice is if these native libs you mention would eventually extend Monkey's built-in Array class! That's something I think many people would definitely appreciate having built-in and supported on all platforms. There's definitely room for speed gains on the raw array copying, but I'm afraid it's a bit above my pay grade. But anyway, thanks again to the Diddy devs for providing functions to manipulate arrays.


Samah(Posted 2013) [#14]
Sorry about that!

Yeah I figured it'd just slipped your mind. ;)

I might take a look at the native stuff at some point.

Edit: Just made a change to Arrays.Copy to remove the need for a temp array (I'm an idiot for not thinking of it in the first place).


muddy_shoes(Posted 2013) [#15]
@Nobuyuki

I changed the code sometime before you wrote up about the boxing/unboxing so as to try and measure only the time spent in Quicksort() itself, and not boxing/unboxing the array in order to try and make the comparison more fair.


That's not really the issue. The problem is that the Object array based quicksort incurs a large overhead from all the casting that goes on in the comparator.

As for TimSort's comparative performance for any given real-world dataset, of course it may well be a better choice. You'd have to test to see if that is the case. I'm just pointing out that your headline numbers are a bit misleading as it isn't faster for the actual random data test you're using if the quicksort is implemented using the same generic style as your TimSort code.


@Samah: You might also want to wrap the various sanity checks at the top of the function in an #If CONFIG = "debug".


Nobuyuki(Posted 2013) [#16]
>I'm just pointing out that your headline numbers are a bit misleading as it isn't faster for the actual random data test you're using if the quicksort is implemented using the same generic style as your TimSort code.


Point taken. I wouldn't mind a Quicksort() that operated on the same generic style, myself. Voodoo casting scares me.. oAo;


Samah(Posted 2013) [#17]
Just refactored QuickSort into a generic class.
[monkeycode]SortUtil<Int>.QuickSort(Int[])[/monkeycode]