Sunday, July 27, 2014

A Red-Black Tree - slightly unusual

The Red-Black Tree is not something I can implement without referring to a standard textbook - its implementation is just way too intricate. Even so, when you read about it, everything clicks conceptually, yet you wonder how someone ever got around to inventing this beast!

The Red-Black Tree is the data structure of choice when worst-case O(log N) complexity is desired. AVL trees preceded it and make the same guarantee, but they seem to have lost out to Red-Black Trees in recent implementations. The AVL tree seems to be too eager to re-balance the heights of its sub-trees, resulting in a larger constant factor in typical usage. Red-Black Trees tolerate more imbalance: they only guarantee that, from any node, the longest path down to a leaf contains at most twice as many nodes as the shortest such path.
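To make that balance guarantee concrete, here is a minimal sketch that measures the shortest and longest downward paths in a small, hand-built tree. It is my own illustration, not code from the implementation described below; the `Node` layout, the `NIL` sentinel and the function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical layout: children are offsets into a vector; NIL means "no child".
const std::uint32_t NIL = 0xFFFFFFFFu;

struct Node {
    bool red;
    std::uint32_t left;
    std::uint32_t right;
};

// Returns {shortest, longest} path length, counted in nodes,
// from node n down to a missing child.
std::pair<int, int> pathBounds(const std::vector<Node>& nodes, std::uint32_t n) {
    if (n == NIL) return std::make_pair(0, 0);
    assert(n < nodes.size());
    std::pair<int, int> l = pathBounds(nodes, nodes[n].left);
    std::pair<int, int> r = pathBounds(nodes, nodes[n].right);
    return std::make_pair(1 + std::min(l.first, r.first),
                          1 + std::max(l.second, r.second));
}

// A small valid red-black tree: black root (0), black left child (1),
// red right child (2) with two black children (3 and 4).
std::vector<Node> exampleTree() {
    std::vector<Node> nodes;
    nodes.push_back(Node{false, 1, 2});
    nodes.push_back(Node{false, NIL, NIL});
    nodes.push_back(Node{true, 3, 4});
    nodes.push_back(Node{false, NIL, NIL});
    nodes.push_back(Node{false, NIL, NIL});
    return nodes;
}
```

Calling `pathBounds(exampleTree(), 0)` yields a shortest path of 2 nodes and a longest path of 3 nodes, which is within the factor-of-two bound.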

So, we see the Red-Black Tree at work inside some very commonly used C++ STL containers, such as std::map in the common standard library implementations.

I have now implemented the tree. It is a standard implementation, but with a few twists in how it is used. See my GitHub source shared here...

The Red-Black tree in my implementation has the following features :

1. It has a C++ interface, but there are no pointers in the interface or in the internal code. There are only 4-byte unsigned integers serving as offsets into a linear structure, such as a vector or array.

2. Tree operations, even insertions, do not require copying an element or compound object. The objects are maintained in a vector by the client code and the tree only maintains offsets into the client's vector. It goes without saying that the objects to be inserted need not have a default constructor either.

3. Comparison function (actually a class) is supplied by the client code and invoked by the tree. Nothing new here though ...

4. Removing elements from the tree is complemented by a free-list feature. The free-list holds a list of freed elements so that subsequent insertions can re-use memory from removed elements.
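The features above can be sketched roughly as follows. This is a hedged illustration of the idea only, not the interface of the implementation on GitHub: `Offset`, `NodePool`, `LessByOffset` and all member names are hypothetical, and the rebalancing logic is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical, chosen for the example.
typedef std::uint32_t Offset;          // 4-byte offset instead of a pointer
const Offset NIL = 0xFFFFFFFFu;        // sentinel meaning "no node"

struct Node {
    Offset left, right, parent;        // offsets into the node vector
    Offset element;                    // offset of the object in the client's vector
    bool red;
};

class NodePool {
public:
    NodePool() : freeHead_(NIL) {}

    // Reuse a freed slot if one exists, otherwise grow the vector.
    Offset allocate(Offset element) {
        Offset slot;
        if (freeHead_ != NIL) {
            slot = freeHead_;
            freeHead_ = nodes_[slot].right;   // free slots are chained through 'right'
        } else {
            slot = static_cast<Offset>(nodes_.size());
            nodes_.push_back(Node());
        }
        Node& n = nodes_[slot];
        n.left = n.right = n.parent = NIL;
        n.element = element;
        n.red = true;                         // freshly inserted nodes start out red
        return slot;
    }

    // Put a removed node's slot at the head of the free-list.
    void release(Offset slot) {
        assert(slot < nodes_.size());
        nodes_[slot].right = freeHead_;
        freeHead_ = slot;
    }

    std::size_t capacity() const { return nodes_.size(); }

private:
    std::vector<Node> nodes_;
    Offset freeHead_;
};

// Client-supplied comparison class: it compares the client's objects
// through their offsets, so the tree never copies the objects themselves.
struct LessByOffset {
    const std::vector<int>& items;
    explicit LessByOffset(const std::vector<int>& v) : items(v) {}
    bool operator()(Offset a, Offset b) const { return items[a] < items[b]; }
};
```

Chaining the free slots through an existing field keeps the free-list itself allocation-free, which is one plausible way to get the reuse described in point 4.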

I hope to derive these advantages from these characteristics :

1. Interleaved insertions-deletions should behave better, with reduced need to allocate-release memory, maybe to the OS or to a shared memory pool.

2. Not using any pointers presents multiple benefits - the memory consumed by the tree will not change just by changing between a 32-bit and a 64-bit OS ! Since the 4 byte unsigned integral offsets all refer to memory within a single contiguously allocated array, the memory references should be more compact/localized. Cache hits should be better than what could have been if memory for nodes was allocated individually.

3. No copying of objects is involved. A default constructor on the elements to be inserted is not required. Only 4-byte offsets referring to the container are stored. Unfortunately, such references mean that the offsets of existing elements must not change, which could happen if elements were deleted from the container.

This is a work in progress too - I am planning to implement a companion container for this tree that can support deletions as well. Also to come are a few other data structures that are not commonly found in libraries...

Saturday, July 26, 2014

About API design and copy semantics

This is a tale of how a combination of counter-intuitive copy semantics and API design kept me debugging a piece of code, late into the night, for several hours... Late night coding might be fun, but there is never any fun in working in crisis mode, under time pressure and against a bug.

I suspect that the reasons behind my ordeal are also the reasons why most programmers tend to vastly under-estimate the time it will take them to finish a programming task.

Sequences, or contiguously allocated arrays, usually offer some facility to copy their internal representation that is efficient and shallow. It is certainly more efficient than iterating over the entire sequence and copying each object by value. Typically such a sequence has a raw buffer where the data is managed, a 4 or 8 byte 'length' giving the size of that buffer, and a flag recording whether the sequence 'owns' the buffer or not.

Ownership of the raw buffer is the "hot potato": the sequence which ends up with it should be the one releasing the buffer's memory. The consequences are ghastly if both sequences release it, or if neither does.

When a sequence is to be copied into another sequence, the pointer to the raw buffer is assigned, the 'length' is copied, and the original sequence relinquishes ownership or keeps it, depending on what the caller wants.

Consider such a feature in a CORBA sequence, as implemented in 'Orbacus':
typedef OB::FixSeq< ::CORBA::ULong, int > MSeq; /* CORBA's fixed sequence of integers */

MSeq source;
source.length(2); /* Fill some data to test copy */
source[0] = 1;
source[1] = 2;

MSeq target; /* Empty sequence */

bool takeOwnership = true; /* Let's have target sequence acquire ownership, and source sequence relinquish it too ! */

/* Now actual call to copy sequence into target  */
target.replace(source.maximum(), 
               source.length(), 
               source.get_buffer(takeOwnership), 
               takeOwnership);


I noted the following painful aspects of the 'replace()' call, which is supposed to do the efficient copy:

1. The name 'replace' suggests an action that 'changes the guts' of the target sequence, but implies nothing about the fact that the source sequence can also be totally 'gutted' by passing a certain choice of ownership options.

2. How much grunt work is expected from the caller! The function takes 4 arguments, including the length, the maximum allocation and the buffer of the source sequence, all of them internal details of the source... Why can it not make do with just a reference to the source sequence and an ownership option, and read all such internal detail from the source sequence itself? After all, 'replace()' is a member function.

3. The boolean option 'takeOwnership' is accepted twice in the same call - first to tell the source that it should relinquish ownership and second to have the target sequence acquire it. I can't think of a situation where the caller would want to pass 'true' in one place and 'false' in the other. In fact, using different values for the two caused the crash that I debugged. The failing code was deep inside the implementation and the results were totally counter-intuitive.
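To illustrate the alternative suggested in points 2 and 3, here is a minimal sketch of a sequence whose shallow copy takes only a reference to the source and a single ownership flag. `IntSeq` and `replaceFrom()` are hypothetical names invented for this example; none of this is Orbacus or CORBA code, and a real sequence would need far more care.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sequence type, written only to show the API shape.
class IntSeq {
public:
    IntSeq() : buf_(0), len_(0), max_(0), owns_(false) {}
    ~IntSeq() { if (owns_) delete[] buf_; }

    std::size_t length() const { return len_; }
    bool owns() const { return owns_; }

    // Grow (or shrink) the logical length, reallocating when needed.
    void length(std::size_t n) {
        if (n > max_) {
            int* fresh = new int[n]();
            for (std::size_t i = 0; i < len_; ++i) fresh[i] = buf_[i];
            if (owns_) delete[] buf_;
            buf_ = fresh;
            max_ = n;
            owns_ = true;
        }
        len_ = n;
    }

    int& operator[](std::size_t i) { assert(i < len_); return buf_[i]; }

    // Shallow copy. All internal details come from 'source' itself, and
    // ONE flag decides both sides, so the two can never disagree.
    // With takeOwnership == false the target is only a view and the
    // caller must keep 'source' alive for as long as the target is used.
    void replaceFrom(IntSeq& source, bool takeOwnership) {
        if (this == &source) return;
        if (owns_) delete[] buf_;
        buf_ = source.buf_;
        len_ = source.len_;
        max_ = source.max_;
        owns_ = takeOwnership && source.owns_;
        if (owns_) {
            // The source is emptied completely, so it stays consistent.
            source.buf_ = 0;
            source.len_ = 0;
            source.max_ = 0;
            source.owns_ = false;
        }
    }

private:
    IntSeq(const IntSeq&);             // copying by value is disabled
    IntSeq& operator=(const IntSeq&);

    int* buf_;
    std::size_t len_;
    std::size_t max_;
    bool owns_;
};
```

With this shape, chaining the copy through an intermediate sequence is just two calls to `replaceFrom(previous, true)`, and a sequence that has given up its buffer reports a length of 0 instead of claiming elements it no longer has.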

/* This code crashes  */
/* We are copying between 3 sequences : original -> intermediate -> final */
#include "OB/CORBA.h"
#include <iostream>

int main(){

typedef OB::FixSeq< ::CORBA::ULong, int > MSeq;

MSeq original;
original.length(2);
original[0] = 1;
original[1] = 2;


MSeq intermediate; 
intermediate.replace(original.maximum(), 
                     original.length(), 
                     original.get_buffer(false), 
                     true);

std::cout << intermediate.length() << " " << intermediate[0] << " " << intermediate[1] << "\n";


MSeq final;
final.replace(intermediate.maximum(), 
              intermediate.length(), 
              intermediate.get_buffer(false), 
              true);

std::cout << final.length() << "\n"; /* Length is as expected ==> 2 */

std::cout << final[0] << " " << final[1] << "\n"; // Still ... Crashes here...

return 0;

}

/*
To build :
g++ -I  -L  -lOB -lJTC -ldl -lpthread 

*/


What is certainly bad is that the length of the 'final' sequence is 2, yet the internal buffer is just not there and any access causes a crash. The resulting sequence is simply not internally consistent.

The 'replace()' API is designed to accept too many arguments that should have been deduced from a reference to the source sequence, and that makes it easy for a caller to misuse. In that sense, it fails to have certain characteristics mentioned in an excellent piece from Joshua Bloch here !