I removed a couple of conditions, inlined and unrolled a function, and replaced the generic memory pool with a stack of pointers.
RakString vs. std::string
Creation: RakString roughly fixed cost. std::string improves over time. Depending on when the test occurs, RakString ranges from 3X faster to roughly equal.
Remove at head (shifting a large list): RakString is consistently 4.25X faster
Deletion: Both are fast enough to not register one millisecond.
5000 iterations
Reference is new / delete and strcpy with a straight char*
Insertion 1 Ref=5 Rak=13, Std=48
RemoveHead Ref=18 Rak=150, Std=674
Insertion 2 Ref=5 Rak=6, Std=13
RemoveTail Ref=0 Rak=0, Std=0
Insertion 1 Ref=6 Rak=7, Std=5
RemoveHead Ref=30 Rak=153, Std=673
Insertion 2 Ref=8 Rak=6, Std=5
RemoveTail Ref=0 Rak=0, Std=0
Insertion 1 Ref=6 Rak=7, Std=8
RemoveHead Ref=62 Rak=151, Std=666
Insertion 2 Ref=7 Rak=7, Std=8
RemoveTail Ref=0 Rak=0, Std=0
Insertion 1 Ref=6 Rak=7, Std=13
RemoveHead Ref=61 Rak=156, Std=669
Insertion 2 Ref=11 Rak=7, Std=12
RemoveTail Ref=0 Rak=0, Std=0
10000 iterations
Insertion 1 Ref=13 Rak=26, Std=141
RemoveHead Ref=58 Rak=1167, Std=3425
Insertion 2 Ref=12 Rak=13, Std=7
RemoveTail Ref=0 Rak=0, Std=0
Insertion 1 Ref=13 Rak=13, Std=30
RemoveHead Ref=108 Rak=1122, Std=3334
Insertion 2 Ref=12 Rak=14, Std=29
RemoveTail Ref=0 Rak=0, Std=0
Insertion 1 Ref=10 Rak=14, Std=22
RemoveHead Ref=878 Rak=1116, Std=3363
Insertion 2 Ref=12 Rak=13, Std=23
RemoveTail Ref=1 Rak=0, Std=0
Insertion 1 Ref=13 Rak=13, Std=14
RemoveHead Ref=886 Rak=1128, Std=3349
Insertion 2 Ref=14 Rak=14, Std=14
RemoveTail Ref=0 Rak=0, Std=0
Summary
Reference is much faster in shifting large lists because it is just copying a single variable internally. Neither std::string nor RakString know that contextually they don’t need to do intermediate copies, so there is more overhead and thus they are slower. However, RakString uses a reference counted pointer, so the shifting is limited to a few variables vs. std::string which actually does a full copy. RakString is about 20% slower than reference, but std::string is over 350% slower.
Reference is much faster the first iteration, because RakString has to create the memory pools. Presumably std::string does this as well. After the first iteration, RakString varies from equally fast to 50% slower. std::string varies from equally fast to 250% slower. However, oddly enough in one test std::string was faster than both – this was probably a fluke due to thread context switching.
While it doesn’t show in these numbers, in some later tests reference got slower the more tests I ran, until it was actually far slower than both RakString and std::string. This was probably due to memory fragmentation.
If you are going to do a lot of shifting around in lists, just use char* that you allocated yourself. Otherwise, use RakString. RakString is faster than std::string, and doesn’t cause linker errors and export warnings when linking as a DLL, a big plus. The constructor and Set function is varidic. It complies with camelCase conventions. And as I use it more I can add inherent functionality that std::string is missing, such as splitpath.
Here’s the code. It’s stand-alone except for DS_List, and SimpleMutex. For the export and assertions you can just replace those with your own versions. SimpleMutex could be removed if this was only used in a single thread at a single time.
2 replies on “Additional RakString optimizations”
Isn’t the STL a implementation independent definition, of what headers have to look like? So you should propably state, which implementation you’re compiling with which compiler. Maybe a comparison would be interesting.
It’s whatever comes with Visual Studio 2008