Sunday, August 31, 2014

Code optimization patterns - my quick notes

These patterns are based on what I read about here. I am still reproducing (dumping !) it here, with as high fidelity as I can manage, primarily because these are patterns I actually use in assorted situations, and because I want to stay in close touch with all of them and keep applying them wherever I can. These are just my notes.

Some of these ideas directly conflict with object-orientation principles, breaking encapsulation and discarding inheritance in favor of composition. Knowing the appropriate situations in which to apply them really is a "black art".

Here they are :

1/ Data locality -
a/ Replacing a list of pointers with a list of offsets (4-byte integers) into contiguously allocated arrays. Iterating over an array of pointers is bad - it makes memory accesses to random locations, so an array of pointers "thrashes the cache" (too bad, it rhymes !). Allocating the objects themselves in a contiguous array makes it likely that successive objects will be "found" in the cache (see the first sketch after this list).
b/ Hot-cold separation - Perhaps the object is large and the array of objects will not fit in the cache anyway, or perhaps most iterations touch only a few specific attributes of each object. Splitting a larger object into smaller pieces keeps the more frequently accessed parts together, and might even let the entire array of "hot" parts fit in the cache.
c/ Reducing branch mis-prediction by
c1/ sorting a list of objects on some criterion so that iteration happens only over the relevant subset, without an enclosed 'if' condition
c2/ creating separate lists of objects based on the same criterion, so that sorting an entire list is avoided
c3/ just placing the iteration loop inside the 'if' condition, not the other way around - simple enough.
c4/ not mixing different types of objects together in an array if they are to be iterated over, even if they share a base class - keeping a separate array per type instead (see the second sketch after this list).
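
First sketch, for a/ and b/ - a minimal C++ illustration of my own, with hypothetical names (ParticleHot, ParticleCold), assuming a per-frame update that only needs position and velocity:

#include <cstdint>
#include <string>
#include <vector>

// Hot data: touched every update, kept small and packed contiguously.
struct ParticleHot {
    float x, y;    // position
    float vx, vy;  // velocity
};

// Cold data: rarely touched, split out so it does not pollute the cache.
struct ParticleCold {
    std::string debugName;
    std::uint32_t spawnFrame;
};

struct ParticleSystem {
    // Parallel arrays: element i of 'hot' and 'cold' describes the same
    // particle, so a plain integer index replaces a pointer between them.
    std::vector<ParticleHot> hot;
    std::vector<ParticleCold> cold;

    void update(float dt) {
        // The loop walks one tightly packed array in order, so successive
        // particles are very likely already in the cache.
        for (ParticleHot& p : hot) {
            p.x += p.vx * dt;
            p.y += p.vy * dt;
        }
    }
};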
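
Second sketch, for c3/ and c4/ (again hypothetical types, just to show the shape): one homogeneous array per concrete type, with the condition hoisted outside the loop instead of being re-evaluated (and possibly mis-predicted) on every iteration.

#include <vector>

struct Monster    { float health; };
struct Projectile { float x, speed; };

struct World {
    // c4/: one homogeneous array per type instead of one mixed array of
    // base-class pointers.
    std::vector<Monster> monsters;
    std::vector<Projectile> projectiles;

    void update(float dt, bool simulateProjectiles) {
        // c3/: the condition sits outside the loop and is evaluated once,
        // instead of an 'if' inside the loop body on every iteration.
        if (simulateProjectiles) {
            for (Projectile& p : projectiles) {
                p.x += p.speed * dt;
            }
        }
        for (Monster& m : monsters) {
            m.health -= dt;  // some per-frame decay, purely for illustration
        }
    }
};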

2/ Dirty flag - I think of it as the poor man's lazy evaluation. Basic dirty flagging adds a boolean flag that tells whether the object / cache is stale and requires refreshing. It helps avoid computation until it is required, and only if it is required. Another flavor of the pattern is time-stamping or sequence numbering - we keep a number or time-stamp that only increments, without any danger of repeating, and use it to mark the 'version' of the object. Whenever the sequence number is less than some other 'master' version, we refresh the cache.
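
A minimal sketch of the boolean flavor, using a hypothetical Transform whose derived value is expensive to recompute (the names and the "expensive" computation are stand-ins, not from the original write-up):

// Dirty flag: the derived value is recomputed lazily, and only when the
// inputs it depends on have actually changed since the last query.
struct Transform {
    void setPosition(float x) {
        localX = x;
        dirty = true;  // invalidate the cached value, but do no work yet
    }

    float worldX() {
        if (dirty) {
            // Stand-in for an expensive recomputation (e.g. walking parent
            // transforms and multiplying matrices).
            cachedWorldX = parentOffsetX + localX;
            dirty = false;
        }
        return cachedWorldX;
    }

private:
    float localX = 0;
    float parentOffsetX = 10;
    float cachedWorldX = 10;
    bool dirty = false;
};

The sequence-number flavor would replace the bool with a monotonically increasing version that is compared against the 'master' version on each query.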

3/ Object pooling - This is basically re-using allocations of some sort - resources like database connections or threads, simple objects, or even plain memory. We want to avoid repeatedly creating and destroying objects when we know they will be required again in the future, in other contexts. Low and high watermarks can be set to enforce a minimum and a maximum number of objects in the pool, so that the pool does not consume too much memory yet stays ready for bursts of requests. In multi-threaded scenarios, having each thread dynamically allocate memory from the operating system is known to be slow, due to the locking required around the common heap. If each thread can use its own pool of memory, no locking is required, re-use is achieved, and a thread-specific memory pool should not end up fragmented.
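
A minimal sketch of a fixed-capacity pool (hypothetical names; watermarks and per-thread pools are left out, this only shows the acquire/release cycle that avoids touching the heap in steady state):

#include <cstddef>
#include <vector>

// A tiny fixed-capacity object pool: all objects are allocated up front and
// handed out / taken back via a free list of indices, so steady-state use
// never allocates or frees memory.
template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity) : objects(capacity) {
        freeList.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) freeList.push_back(i);
    }

    // Returns nullptr when the pool is exhausted (the "high watermark" here
    // is simply the fixed capacity).
    T* acquire() {
        if (freeList.empty()) return nullptr;
        std::size_t index = freeList.back();
        freeList.pop_back();
        return &objects[index];
    }

    void release(T* obj) {
        freeList.push_back(static_cast<std::size_t>(obj - objects.data()));
    }

private:
    std::vector<T> objects;            // contiguous storage, allocated once
    std::vector<std::size_t> freeList; // indices of currently unused slots
};

Usage then becomes acquire() at the start of an object's lifetime and release() instead of delete at the end, with low/high watermarks and per-thread pools layered on top as needed.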

4/ Spatial partitioning - Organize objects that are "close" to each other so that, given an object, its "neighbors" can be efficiently located and processed. This is obviously helpful in real geometric spaces, such as checking nearby particles for collisions. However, the distance metric can be extended in other ways - given a tree-like structure, the distance between any two nodes may be expressed in terms of the number of common ancestors they share. This helps us partition the nodes better, so that queries like finding the number of child nodes under any given node can be made more efficient. The partitioning may also be limited to nodes at the same level in the tree. The pattern does present difficulties when nodes can actually change their distances to other nodes, and we have to come up with efficient ways of updating the partitioning depending on the problem context.
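
A minimal sketch of the geometric case, using a hypothetical uniform grid (the cell size, types and names are my own choices): each particle is bucketed by its integer cell coordinates, so a neighbor query looks only at the 3x3 block of cells around a point rather than at every particle.

#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Uniform grid partition: objects are bucketed by integer cell coordinates,
// so a neighbor query only inspects a handful of nearby cells.
struct Particle {
    float x, y;
};

class Grid {
public:
    explicit Grid(float cellSize) : cellSize(cellSize) {}

    void insert(const Particle& p) {
        cells[keyFor(p.x, p.y)].push_back(p);
    }

    // Collect everything in the 3x3 block of cells around (x, y); a real
    // implementation would then run exact distance checks on this short list.
    std::vector<Particle> neighbors(float x, float y) const {
        std::vector<Particle> result;
        int cx = cellOf(x), cy = cellOf(y);
        for (int dx = -1; dx <= 1; ++dx) {
            for (int dy = -1; dy <= 1; ++dy) {
                auto it = cells.find(pack(cx + dx, cy + dy));
                if (it != cells.end()) {
                    result.insert(result.end(), it->second.begin(), it->second.end());
                }
            }
        }
        return result;
    }

private:
    int cellOf(float v) const { return static_cast<int>(std::floor(v / cellSize)); }

    // Pack the two cell coordinates into one 64-bit hash-map key.
    std::uint64_t pack(int cx, int cy) const {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32)
             | static_cast<std::uint32_t>(cy);
    }
    std::uint64_t keyFor(float x, float y) const { return pack(cellOf(x), cellOf(y)); }

    float cellSize;
    std::unordered_map<std::uint64_t, std::vector<Particle>> cells;
};

Insertions are cheap, but when objects move their cell membership has to be updated, which is exactly the difficulty mentioned above.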
