jueves, 30 de junio de 2011

Skip Lists compared with Treaps and Red-Black Trees

I implemented the skip list, which is a linked-list based data structure that has an average performance comparable to BSTs (O(log n) for insertion/search/removal). What I find amazing about this data structure is that it is REALLY REALLY easy to implement (insertion took... 20 lines?) for such good performance. Furthermore, it is a data structure that can be made concurrency-friendly without major complications as the underlying structure is just a linked-list.

References:

- http://en.wikipedia.org/wiki/Skip_list
- Lecture from MIT
- Article on MSDN
- C# implementation by Leslie Sanford

I also wanted to compare the performance of the Skip List against the Treap and the Red-Black Tree. For that, I used my own implementation of the first two and I used a TreeMap (in Java Collections) for the third.

The Skip List was initialized with log(n) levels, where n is the amount of keys we'll be inserting into the structures. The random number of levels for each inserted key is calculated by generating a new level with a probability of 0.50 each time (equivalent to tossing a fair coin each time).

The test was done by executing the attached application several times and more or less memorizing the execution times (yeah, not very scientific... but I just wanted to have a quick look) for inputs of <100000 and >300000 pre-generated random numbers. The results, in descending order (fastest to slowest) were:

Insertion:
1. Treap
2. Skip List /RB tree (depending on input size: RB-tree faster with smaller input, skip list faster with larger input)

Search:
1. Treap
2. RB-tree
3. Skip List

Removal:
1. Treap
2. Skip List
3. RB-tree

The Treap was consistently faster (actually MUCH faster, I have to say) than the other two for all three operations, which is also the result obtained by Heger (pages 67-75). The skip list's performance was 'pretty good' but didn't seem too stable (could also be that my implementation was flawed in some way), while the RB-tree looked very stable. However, the skip list was incredibly easy to implement, which is also a great factor to take into account.

As for the memory footprint, I couldn't really compare them fairly but the general impression after running a profiler was that both the Treap and the RB-Tree used less space, which is quite logical as the Skip List uses quite a bit space for the pointers.

Note: if you want to run the test yourself you should remove the calls to System.out.println, as they affect the runtimes. The Treap package is also within the attached NetBeans project so that the comparisons can be done.

Source code:

Download

martes, 28 de junio de 2011

Treaps

A Treap is a data structure that combines a binary search tree with the heap invariant, which means that any root node of any subtree has a higher priority than its children (with a max-heap invariant, that is).

The idea is that BSTs perform really well when the insertions are done randomly, as it tends to be balanced, so the Treap tries to simulate this randomness by assigning random priorities to the inserted nodes.

I quickly implemented insertion/removal only for the purpose of learning and haven't refactored it, so the code could probably be further optimized/simplified.

Quick notes:

- Insertion: insert key X with a random priority. Insert works as in BSTs. Once the node is inserted, 'heapify' the node with the corresponding rotations.

- Removal: the opposite of the insertion. Find key X to remove, rotate the node down (preserving the heap invariant) until it becomes a leaf or has a null child and then remove it appropriately.

Further reference:
- http://en.wikipedia.org/wiki/Treap
- C implementation by Farooq Mela: https://github.com/fmela/libdict/blob/master/src/tr_tree.c

The Treap might perform better than BSTs, Skip Lists or other balanced trees such as Red-Black Trees, AVLs, AA trees or Splay Trees. Reference:

A Disquisition on The Performance Behavior of Binary Search Tree Data Structures (pages 67-75)

Source code (netbeans project):

Download