An Introduction to C++ Programming - Part 12/13

Reference Counting Pointer

Written by Björn Fahller

Introduction

Last month's ``auto_ptr<T>'' class template documents and implements ownership transfer of dynamically allocated memory. Often, however, we do not want to be bothered with ownership. We want several places in the code to be able to access the memory, but we also want to be sure that the memory is deallocated when it is no longer needed. The general solution to this is called automatic garbage collection (something you can read several theses on, and also buy a few commercially available libraries for). The less general solution is reference counting; no owner, but last one out locks the door.

The idea is that a counter is attached to every object allocated. When allocated it is set to 0. When the first smart pointer attaches to it, the count is incremented to 1. Every smart pointer attaching to the resource increments the reference count, and every smart pointer detaching from a resource (because the smart pointer is destroyed or assigned another value) decrements it. If the counter reaches zero, no one is referring to the resource anymore, so it must be deallocated. The weakness of this compared to automatic garbage collection is that it does not work with circular data structures (the count never goes below 1).

The problems to solve

Many of the problems with a reference counting pointer are the same as for the auto pointer. The list is actually a bit shorter, since there's no need to worry about ownership.
This might also be the place to mention a problem not to solve: that of how to stop reference counting a resource. Adding this functionality is not difficult, but it quickly leads to user code that is extremely hard to maintain. This is exactly what we want to avoid, so it is better not to have the functionality.

Here is how it is supposed to work when we are done:

    counting_ptr<int> P1(new int(value));

After creating a counting pointer ``P1'', the reference count for the value pointed to is set to one.

    counting_ptr<int> P2(P1);

When a second counting pointer ``P2'' is created from ``P1'', the object pointed to is not duplicated, but the reference count is incremented.

    counting_ptr<int> P3(P2);

When three counting pointers refer to the same object, the value of the counter is three.

    P1.manage(new int(other));

As one of the pointers referring to the first object created is reinitialized to another object, the old reference count is decremented (there are only two references to it now) and the new object is assigned a reference count of 1.

    P2=P1;

When yet another of the pointers moves its attention from the old object to the new one, the counter for the old one is decremented again, and the counter for the new one is incremented.

    P3=P2;

Now that the last counting pointer referring to the old object moves its attention away from it, the old object's reference count goes to zero, and the object is deallocated. The new object now has a reference count of 3, since there are three reference counting pointers referring to it.

Interface outline

The interface of the reference counting smart pointer will, for obvious
reasons, share much with the auto pointer. The differences lie in
accessing the raw pointer and giving the pointer a new value. While these
aspects could use the same interface as does the auto pointer, giving the
reference counting pointer an identical interface, this would be very
unfortunate since their semantics differ dramatically. Here is a suggested
interface.
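
A declaration along the following lines would match the description. It is only a sketch: the class name ``counting_ptr'' and the member names ``peek'' and ``manage'' come from the text, while the exact signatures, the ``explicit'' constructor and the omission of member templates are assumptions made here. The ``static_assert'' lines (which need a C++11 compiler) merely record two intended properties.

```cpp
#include <type_traits>

// Interface only: the member functions are declared here, not defined.
template <typename T>
class counting_ptr {
public:
    explicit counting_ptr(T* p = 0);           // start managing p (may be 0)
    counting_ptr(const counting_ptr& other);   // share other's object
    ~counting_ptr();                           // detach; delete object at count 0
    counting_ptr& operator=(const counting_ptr& other);

    T& operator*() const;
    T* operator->() const;

    T* peek() const;     // look at the raw pointer; nothing is given up
    void manage(T* p);   // detach from the current object, start managing p
};

// A raw pointer must not turn into a counting pointer behind our back,
// but counting pointers themselves are freely copyable.
static_assert(!std::is_convertible<int*, counting_ptr<int> >::value,
              "construction from a raw pointer is explicit");
static_assert(std::is_copy_constructible<counting_ptr<int> >::value,
              "copying shares the object");
```

Using names other than those of the auto pointer for raw pointer access and for giving the pointer a new value keeps the two classes from looking interchangeable when their semantics are not.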

Where to store the reference count

Before we can dive into implementation, we must figure out where to store
the reference count. It is obvious that the counter belongs to the object
referred to, so it cannot reside in the smart pointer object. A solution
that easily springs to mind is to use a struct with a counter, and the
type referred to, like this:
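
A minimal sketch of such a struct follows. The names ``representation'', ``count'' and ``value'' are chosen here for illustration and are not taken from the article. The small demo function points at one limitation of holding the object by value: the struct can only take a copy of an object the user has already allocated, and it cannot hold an object of a derived type at all.

```cpp
#include <cassert>

// Counter and object in one struct; the object is held by value.
template <typename T>
struct representation {
    unsigned count;   // number of smart pointers attached
    T value;          // the object itself
};

inline void by_value_demo() {
    int* existing = new int(42);             // an object the user already allocated
    representation<int>* rep = new representation<int>();
    rep->count = 1;
    rep->value = *existing;                  // the struct can only take a copy
    assert(rep->value == 42);
    assert(&rep->value != existing);         // two distinct objects now exist
    delete rep;
    delete existing;
}
```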
The workaround is simple; use a ``T*'' instead. All we need to do to make this work is to allocate the representation struct on the heap in the constructor and ``manage'' member functions (and of course to deallocate the struct when we're done with it). There is a performance disadvantage with this, however. Whenever we access the object referred to, we must follow two pointers: the pointer to the representation, and the pointer from the representation to the object.

The best solution I have seen is to decouple the representation from
the object and instead allocate an ``unsigned''
and in every counting pointer object keep both a pointer to the counter
and to the object referred to. This gives the following data section of
our counting pointer class template:
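
The two layouts can be compared in a small sketch. All names below are hypothetical, and the data section is shown as a plain struct so it can be inspected; in the class template itself the two pointers would presumably be private members.

```cpp
#include <cassert>

// Heap-allocated representation holding a T*: counter and pointer travel together.
template <typename T>
struct representation {
    unsigned count;
    T* object;
};

// Decoupled data section: each counting pointer keeps both pointers itself.
template <typename T>
struct counting_ptr_data {
    T* object;         // reached in one step
    unsigned* count;   // shared counter, allocated separately on the heap
};

inline void layout_demo() {
    int* shared = new int(7);

    representation<int>* rep = new representation<int>();
    rep->count = 1;
    rep->object = shared;
    assert(*rep->object == 7);          // two steps: rep, then object

    counting_ptr_data<int> data = { shared, new unsigned(1) };
    assert(*data.object == 7);          // one step
    assert(*data.count == 1);

    delete data.count;                  // manual cleanup; this demo owns everything
    delete rep;
    delete shared;
}
```

The decoupled layout makes each counting pointer one pointer larger, in exchange for reaching the object without going through the representation.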

Accessibility

The solution outlined above is so good it almost works. For compilers that do not support member templates, so that assignment and construction from a counting pointer of another type are impossible, it is all we need. However, if we want the ability to assign a ``counting_ptr<T>'' object from a value of type ``counting_ptr<Y>'' whenever a ``T*'' can be assigned from a ``Y*'', we must think of something. The problem is that ``counting_ptr<T>'' and ``counting_ptr<Y>'' are two distinct types, so the member variables of one are private and thus inaccessible to the other. The value of the raw pointer member can be accessed through the public ``peek'' member function, but we need a solution for accessing the counter without making it publicly available.

This kind of problem is exactly what ``friend'' declarations are for, but there is a two-fold problem with that. To begin with, extremely few compilers support template friends, a rather new addition to the language. Second, member templates open up holes in the type system you can only dream of. For the curious, please read Scott Meyers' paper on the topic.

One step on the way towards a solution is to see that the management of
the counter is independent of the type T, so we can implement the counter
managing code in a separate class. A reference counting class may look
like:
The default constructor allows us to choose whether we want a reference counter or not. If our reference counting pointer is initialized with the 0 pointer, there is no need to waste time and memory by allocating a reference counter for it. The member functions ``release'' and ``reinit'' return 1 if the old counter is discarded (and hence, we should delete the object we refer to) or 0 if it was just decremented.

Base implementation

It is fairly easy to implement, and makes life easier later on, even when
member templates are not available.
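
The helper described above might be sketched as follows. The class name ``counting_base'' and the member names ``release'' and ``reinit'' come from the text. The signatures, the use of ``bool'' where the text says 1 and 0, the member name ``pcount'' and the ``count'' observer are assumptions made for this sketch.

```cpp
#include <cassert>

class counting_base {
public:
    // create == false means "no counter", for a pointer initialised with 0.
    explicit counting_base(bool create = true)
        : pcount(create ? new unsigned(1) : 0) {}

    // Copying attaches to the same counter.
    counting_base(const counting_base& other) : pcount(other.pcount) {
        if (pcount) ++*pcount;
    }

    ~counting_base() { release(); }

    // Detach from the counter. Returns true (the article's 1) when the counter
    // was discarded, meaning the caller should delete the object referred to.
    bool release() {
        if (pcount == 0) return false;
        bool last = (--*pcount == 0);
        if (last) delete pcount;
        pcount = 0;
        return last;
    }

    // Detach, then start over with a fresh counter (or with none).
    bool reinit(bool create = true) {
        unsigned* fresh = create ? new unsigned(1) : 0;  // allocate before letting go
        bool last = release();
        pcount = fresh;
        return last;
    }

    // Detach, then attach to other's counter.
    bool reinit(const counting_base& other) {
        unsigned* next = other.pcount;
        if (next) ++*next;               // attach first, so other == *this is safe
        bool last = release();
        pcount = next;
        return last;
    }

    // Observer added for this sketch only; 0 means "no counter".
    unsigned count() const { return pcount ? *pcount : 0; }

private:
    counting_base& operator=(const counting_base&);  // declared only; use reinit()
    unsigned* pcount;
};

inline void base_demo() {
    counting_base first;                 // fresh counter, count is 1
    counting_base second(first);         // shares it, count is 2
    assert(first.count() == 2);
    assert(!second.release());           // just decremented
    assert(first.release());             // last one out: counter discarded
}
```

Allocating the new counter before releasing the old one means that a failed allocation leaves the object in its previous state.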
As an exercise, prove to yourself that this reference counting base class does not have any memory handling errors (i.e. it always deallocates what it allocates, never deallocates the same area twice, never accesses uninitialized or just deallocated memory, and never dereferences the 0 pointer).

Accessibility again

As nice and convenient as the above helper class is, it really does not solve the accessibility problem. It does make the implementation a bit easier, though. The problem remains: ``counting_ptr<T>'' and ``counting_ptr<Y>'' are different classes and because of this are not allowed to see each other's private sections.

The easy way out is to use public inheritance, and say that every counting pointer is-a counting base. That is a solution that is simple, sweet and dead wrong. There is no is-a relationship here, but an is-implemented-in-terms-of relationship. Such relationships are normally implemented through private member data or private inheritance. Public inheritance can still be made to work, though. This is bordering on abuse, but it works fine. Instead of having the member functions of the ``counting_base'' class public, they can be declared protected. As such they do not become part of the public interface, and the is-a relationship is mostly imaginary.

Implementation of a reference counting pointer

Finally we can get to work and write the reference counting pointer,
and since the helper class does most of the dirty work, the implementation
is not too convoluted.
The access operators are all trivial and need no further explanation:
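
Putting the pieces together, the whole class might look like the sketch below, access operators included. It is one possible reading of the design rather than the author's listing: the helper is repeated in compact form with protected members so that the example compiles on its own, and ``references'', ``assign'', ``pt'' and ``walkthrough'' are names invented here. The conversions from ``counting_ptr<Y>'' work because a ``counting_ptr<Y>'' converts to ``const counting_base&'' through the public base, and the protected functions are then called on ``this''.

```cpp
#include <cassert>

// Counter helper, repeated in compact form so this example compiles on its own.
class counting_base {
protected:
    explicit counting_base(bool create = true)
        : pcount(create ? new unsigned(1) : 0) {}
    counting_base(const counting_base& o) : pcount(o.pcount) { if (pcount) ++*pcount; }
    ~counting_base() { release(); }
    bool release() {                       // true: we were the last reference
        if (pcount == 0) return false;
        bool last = (--*pcount == 0);
        if (last) delete pcount;
        pcount = 0;
        return last;
    }
    bool reinit(bool create) {
        unsigned* fresh = create ? new unsigned(1) : 0;  // allocate before letting go
        bool last = release();
        pcount = fresh;
        return last;
    }
    bool reinit(const counting_base& o) {
        unsigned* next = o.pcount;
        if (next) ++*next;                 // attach first, so self-assignment is safe
        bool last = release();
        pcount = next;
        return last;
    }
    unsigned count() const { return pcount ? *pcount : 0; }
private:
    counting_base& operator=(const counting_base&);  // declared only; use reinit()
    unsigned* pcount;
};

template <typename T>
class counting_ptr : public counting_base {
public:
    // If allocating the counter throws, p is deleted before the exception leaves.
    explicit counting_ptr(T* p = 0)
    try : counting_base(p != 0), pt(p) {} catch (...) { delete p; }

    counting_ptr(const counting_ptr& o) : counting_base(o), pt(o.pt) {}
    template <typename Y>                  // compiles whenever Y* converts to T*
    counting_ptr(const counting_ptr<Y>& o) : counting_base(o), pt(o.peek()) {}

    ~counting_ptr() { if (release()) delete pt; }

    counting_ptr& operator=(const counting_ptr& o) { return assign(o, o.pt); }
    template <typename Y>
    counting_ptr& operator=(const counting_ptr<Y>& o) { return assign(o, o.peek()); }

    void manage(T* p) {
        if (p == pt) return;               // already managing this object
        bool last;
        try { last = reinit(p != 0); } catch (...) { delete p; throw; }
        if (last) delete pt;
        pt = p;
    }

    T& operator*() const { return *pt; }
    T* operator->() const { return pt; }
    T* peek() const { return pt; }
    unsigned references() const { return count(); }  // demo only, not in the article

private:
    counting_ptr& assign(const counting_base& o, T* p) {
        if (reinit(o)) delete pt;          // last reference to the old object
        pt = p;
        return *this;
    }
    T* pt;
};

// The sequence from the walkthrough earlier in the article, with 1 and 2 as sample values.
inline void walkthrough() {
    counting_ptr<int> P1(new int(1));
    counting_ptr<int> P2(P1);
    counting_ptr<int> P3(P2);
    assert(P1.references() == 3);
    P1.manage(new int(2));
    assert(P2.references() == 2 && P1.references() == 1);
    P2 = P1;
    assert(P3.references() == 1 && P1.references() == 2);
    P3 = P2;                               // the first int is deleted here
    assert(P1.references() == 3 && *P3 == 2);
}
```

Note that when the last pointer to go is a ``counting_ptr<Base>'' referring to a derived object, the object is deleted through a ``Base*'', so ``Base'' needs a virtual destructor.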

Efficiency

There is no question about it; there is a cost involved in using reference counting pointers. Every time an object is allocated, a counter is also allocated, and vice versa for deallocation. Depending on how efficient your compiler's memory manager is for small objects, this cost may be negligible, or it may be severe. Every time a counting pointer is assigned, constructed, destroyed or copied, counter manipulation is done, and that costs a few CPU cycles.

Exercises

Coming up

If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles. Next month is devoted to Run Time Type Identification (RTTI), one of the new functionalities in C++.

/Björn.