Categories
Uncategorized

Useful data structure of the day: BPlus tree

In RakNet, when I sent a message I put it on a queue. When an acknowlegment arrives, I look at the head of the queue and assuming that acknowlegement matches what is there, I pop the queue. Most of the time that works great – O(1) both ways. But there is a problem – with […]

In RakNet, when I sent a message I put it on a queue. When an acknowlegment arrives, I look at the head of the queue and assuming that acknowlegement matches what is there, I pop the queue. Most of the time that works great – O(1) both ways. But there is a problem – with large blocks lost, you get the folloowing bug:

(assuming I missed out on elements 0-1000)
Search through a 1000 element list for element # 1001
Search through a 1000 element list for element # 1002
Search through a 1000 element list for element # 1003
Search through a 1000 element list for … 2000

Assuming the list went from 0 to 1000, this is 1 million iterations because in the worst case I did a linear search through the queue.

This rarely happened because usually acknowlegements came in the order I did sends, so 99.99% of the time there wasn’t a problem and it was O(1). But I needed a data structure that, given input not in the set, would still do a relatively fast find. However, it also had to do fast insertions and deletions (which was why I used a queue to begin with).

I could have used my AVL Balanced binary tree. It’s a binary tree that automatically rebalances after every insertion and deletion. It’s pretty good for unsorted data – finds, insertions, and deletions are all Log2(n). However, I wrote it back in college so it wasn’t as efficient as it could be. The main problem was that there was a lot of node creation and destruction.

I also wrote a B Tree and a BPlus tree in college. A BPlus tree is an n-way tree of order o where each node contains o keys and o+1 children. Each leaf contains an ordered list of key/data pairs. The leftmost child of a node points to the subtree where all keys are less than node key 0, the child o+1 is the value of the lowest key of that subtree, and so on. All leaves are linked in a linked list. In essence, it’s an ordered list divided up among leaves with the branches of the tree pointing into that list.

I thought it would be a one day at most project since that’s about how long it took me in college. However, I did two things differently:
1. I didn’t remember how to do it. Thinking it would be easy, I started coding without full understanding of the data structure.
2. I wanted something very optimal so went to extreme lengths to make it so.

So it took a week 🙁

Anyway it’s done now and it’s a good data structure. If you need an ordered list where insertions, deletions, and searches are all fast, and/or you want to linearly traverse the list, a B+ is the way to go. If you need fast insertions, deletions, and searches but your data doesn’t have to be ordered and you don’t need linear traversals very often, an AVL tree is slightly more efficient.

It’s in RakNet as the file DS_BPlusTree.h.

Leave a Reply

Your email address will not be published. Required fields are marked *