Appendix A. C++ Templates by Example
This chapter is meant to be a brief introduction to the basics of C++ templates using examples. It also shows some of the advanced template features that ATL uses, but without any ATL code cluttering up things. For a more thorough introduction to templates, see Stan Lippman’s C++ Primer (Addison-Wesley, 2005).
The Need for Templates
Imagine a simple bounds-checked array class:
1#define MAX_ELEMS 8
2
3class Array {
4public:
5 long& operator[](size_t n) {
6 if( n < 0 || n >= MAX_ELEMS ) throw "out of bounds!";
7 return m_rg[n];
8 }
9
10protected:
11 long m_rg[MAX_ELEMS];
12};
This class makes quiet, hard-to-find errors loud and easy to find:
1void main(int argc, char* argv[]) {
2 long rg[8]; // Built in array type
3 rg[8] = 1; // will corrupt the stack, but quietly
4
5 Array array; // Bounds-checked array type
6 array[8] = 1; // will complain loudly
7}
The bounds-checking part of the Array
class really has nothing to do with the data being managed; using a
little bit of C++ trickery, this is even easier to spot:
1typedef long T;
2class Array {
3public:
4 T& operator[](size_t n) {
5 if( n < 0 || n >= MAX_ELEMS ) throw "out of bounds!";
6 return m_rg[n];
7 }
8
9protected:
10 T m_rg[MAX_ELEMS];
11};
Notice that we’ve replaced the use of
long
with a generic type T
. Unfortunately, this
trick doesn’t allow us to reuse the Array
class with
different types of T
. When the compiler sees the
Array
class, it won’t let us change T
and compile
it again with another type T
. To get any reuse out of the
Array
class as it is, we have to do some cut-and-paste
work and create different Array
classes, one for each type
we’re interested in managing:
1class ArrayOfChar {
2public:
3 char& operator[](size_t n);
4
5protected:
6 char m_rg[MAX_ELEMS];
7};
8
9class ArrayOfLong {
10public:
11 long& operator[](size_t n);
12
13protected:
14 long m_rg[MAX_ELEMS];
15};
Besides the tedium involved with this technique,
the developer of the Array
family of classes would have to
build an Array
class for every type that the user of the
class would want to manage. Because some of these types can be
defined after the Array
class is built, this is an especially difficult task. We’d like the
compiler to step in and help us. And so it will, with
templates.
Template Basics
Using template syntax,
we can create an Array
class that is parameterized on any
number of parameters, including the type of data to manage and how
large the internal buffer should be: [1]
1template <typename T, size_t MAX_ELEMS>
2class Array {
3public:
4 T& operator[](size_t n) {
5 if( n < 0 || n >= MAX_ELEMS ) throw "out of bounds!";
6 return m_rg[n];
7 }
8
9protected:
10 T m_rg[MAX_ELEMS];
11};
Notice that the only difference in this code is
the use of the template
statement before the class
declaration. When the compiler sees a template, it knows to store
the class declaration in its internal data structures, but not to
generate any code. The compiler can’t generate the code until it
sees how the client would like to use it:
1struct Point { long x; long y; };
2
3void main(int argc, char* argv[]) {
4 Array<long, 8> a1; // Array of 8 longs
5 Array<char, 256> a2; // Array of 256 chars
6 Array<Point, 16> a3; // Array of 16 Points
7 ...
8}
The compiler uses the client’s template
parameters to generate the code for the class on demand,
effectively creating a new member of the Array
family of
classes with each use. [2] Because the compiler is using the
template parameters to generate the code, only parameters whose
values are known at compile time are allowed. However, that includes built-in types, user-defined
types, constants, and even function pointers. To make the template
even more convenient for the client to use, you’re allowed to
declare default values for template parameters, just as you would
for functions:
The linker makes sure that only one expansion per set of template parameters actually makes in into the final image.
1template <typename T, size_t MAX_ELEMS = 8>
2class Array {...};
3
4void main(int argc, char* argv[]) {
5 Array<long> a1; // Array of 8 longs
6 Array<char, 256> a2; // Array of 256 chars
7 ...
8}
Template Specialization
You might decide that for a specific combination
of template parameters, the generic template expansion isn’t good
enough. For example, if you decide that an Array
of 256
characters should have an equality operator, you might decide to
override the Array
general template using the template
specialization syntax:
1template <> // No template arguments here
2class Array<char, 256> { // Template argument values here
3public:
4 // You are not required to provide the same functionality
5 // as the general template (although it's a good idea)
6 char& operator[](size_t n) {
7 if( n < 0 || n >= 256 ) throw "out of bounds!";
8 return m_sz[n];
9 }
10
11 // You may add functionality not in the general template
12 bool operator==(const Array<char, 256>& rhs) {
13 return strcmp(m_sz, rhs.m_sz);
14 }
15
16protected:
17 char m_sz[256];
18};
The client doesn’t have to do anything new to
use the specialized version of the template. When the compiler sees
Array<char, 256>
, the client automatically gets the
specialized version.
Templates as Base Classes
Template specialization allows the addition of new
functionality and optimized implementations based on specific
template parameters. For example, the specialization I just showed
specializes on all the parameters, both the type and the size of
the array. It would probably be more useful to be able to
specialize on the type of data held, but to expose additional
functions for character strings of any size. This can be
accomplished by using the Array
as a base class:
1template <size_t MAX_LEN>
2class String : public Array<char, MAX_LEN+1> {
3public:
4 // Additional functionality
5 bool operator==(const String<MAX_LEN>& rhs) {
6 return strcmp(m_rg, rhs.m_rg);
7 }
8};
Notice that the String
is still
parameterized on length and that it passes arguments to the base
Array
class. The type is fixed because we’re building a
string, but the number of elements to store is based on the
String
template argument. In effect, this achieves a
partial specialization, although the client code must use the
String
template instead of the Array
template to
make use of it.
A Different Kind of Polymorphism
Templates give us a different kind of reuse mechanism than inheritance. Inheritance means that we’d like to reuse the implementation and signature of a base class and extend it in a derived class. Inheritance is type-centric. The derived class is type compatible with the base class. Type compatibility means that an instance of the derived class can be used where an instance of the base class is required.
On the other hand, templates enable you to reuse
the behavior of a class, but to divorce it from the types involved.
If the type can fit where it’s being used, the compiler doesn’t
care about the type. For example, all that was required of the type
being passed to the Array
template was that it had a
default constructor so that it could be created in arrays. If the
type couldn’t live up to that requirement, the compiler would
complain.
Because of the way the compiler generates code using template arguments, templates give us a different kind of polymorphism. Instead of polymorphism based on type compatibility, we’ve got polymorphism based on signature compatibility. [3] If the type has the appropriate function signatures available, the compiler is perfectly happy. This kind of polymorphism has some interesting properties, of which ATL makes heavy use.
This is often called structural typing these days.
Using Behavior Classes
Imagine modifying the Array
class so
that it does bounds checking only if you so desire:
1template <typename T, size_t MAX_ELEMS = 8,
2 bool bCheckBounds = true>
3class Array {
4public:
5 T& operator[](size_t n) {
6 if( bCheckBounds && (n < 0 || n >= MAX_ELEMS) )
7 throw "out of bounds!";
8 return m_rg[n];
9 }
10
11protected:
12 T m_rg[MAX_ELEMS];
13};
This allows a client to turn bounds checking on or off, based on its own requirements:
1void main(int argc, char* argv[]) {
2#ifdef _DEBUG
3 bool bCheckBounds = true;
4#else
5 bool bCheckBounds = false;
6#endif
7
8 Array<long, 256, bCheckBounds> array;
9 array[256] = 1;
10}
The intent here is that we skip the bounds
checks in release mode because we have hopefully caught the
out-of-bounds errors during development. However, we’re still doing
a check against bCheckBounds
every time through the
operator[]
member function. We could hope for a good
compiler that would optimize away the line because
bCheckBounds
is known at compile time, or we could remove
all doubt and make use of a behavior
class.
Behavior Classes Can Group Functions
A behavior class (also known as a trait class or a trait) is a class that contains some static member functions or some type definitions. Behavior classes are never meant to be created. Instead, by putting the functions and type definitions into a class, we’ve grouped them and can refer to them as a unit.
To solve our bounds-checking problem, imagine two behavior classes, one that checks the bounds and one that does not:
1struct DebugBoundsChecker {
2 static void CheckBounds(size_t n, size_t nMax) {
3 if( n < 0 || n >= nMax ) throw "out of bounds!";
4 }
5};
6
7struct ReleaseBoundsChecker {
8 static void CheckBounds(size_t n, size_t nMax) {}
9};
Notice that both the behavior classes have the
same function with the same signature. The debug version does the
bounds checking, and the release version doesn’t do anything. And
because both implementations are inline functions, the compiler can
optimize easily. Given only the knowledge of the signature of the
bounds-checking function, I can rewrite the Array
class to
use the CheckBounds
function scoped by a type name passed
as a template parameter:
1template <typename T, size_t MAX_ELEMS = 8,
2 typename BoundsChecker = DebugBoundsChecker>
3class Array {
4public:
5 T& operator[](size_t n) {
6 BoundsChecker::CheckBounds(n, MAX_ELEMS);
7 return m_rg[n];
8 }
9
10protected:
11 T m_rg[MAX_ELEMS];
12};
The client can now take advantage of the bounds checkingor not, as decided at compile time:
1void main(int argc, char* argv[]) {
2#ifdef _DEBUG
3 typedef DebugBoundsChecker BoundsChecker;
4#else
5 typedef ReleaseBoundsChecker BoundsChecker;
6#endif
7
8 Array<long, 256, BoundsChecker> array;
9 array[256] = 1;
10}
In this case, I’ve used
signature compatibility to make my decisions at compile time,
resulting in potentially more efficient code. And if I want to add
another kind of bounds-checking behavior class in the future, I
can, if it has a CheckBounds
function that lives up to the
signature requirements of the caller.
Simulating Dynamic Binding
One of the benefits of inheritance is dynamic binding. A virtual function in the base class can be overridden in the derived class to extend or replace base class functionality. One especially powerful expression of this idea is the pure virtual member function. A pure virtual member function is a virtual member function that must be overridden in the deriving class. In fact, any class with a pure virtual member function is thought of as an abstract base class (ABC), and no instance of that class can be created. To use the functionality of an ABC, it must be used as a base class, and the deriving class must implement all the pure virtual member functions. This is useful because it allows the base class to define functionality required by the deriving class:
1template <typename T>
2class Array {
3public:
4 ...
5 virtual int Compare(const Array<T>& rhs) =0;
6
7 bool operator< (const Array<T>& rhs)
8 { return this->Compare(rhs) < 0; }
9
10 bool operator> (const Array<T>& rhs)
11 { return this->Compare(rhs) > 0; }
12
13 bool operator== (const Array<T>& rhs)
14 { return this->Compare(rhs) == 0; }
15
16 T m_rg[1024];
17};
By defining the Compare
function as
pure virtual, the Array
class designer has decreed that
Array
can be used only as a base class and that the
deriving class must provide some way of comparing two arrays of the
same type so that the comparison operators can be implemented. For
example, a String
class must implement the
Compare
function:
1class String : public Array<char> {
2public:
3 // the compiler requires this method:
4 int Compare(const Array<char>& rhs)
5 { return strcmp(m_rg, rhs.m_rg); }
6};
The compiler uses the implementation of the pure virtual member function provided in the derived class to fill in the virtual function table entry for that member function (as it does with all virtual member function pointers). Using function pointers to invoke member functions is how C++ provides dynamic bindingthat is, binding to the specific implementation at runtime based on the type of the variable.
However, we’re paying the price of dynamic
binding for our String
class. The price of dynamic binding
is a virtual function table, shared among all instances, an extra
4-bye virtual function pointer per object, and at least two extra
pointer indirections to invoke the appropriate implementation. This
seems a high price to pay, given perfect knowledge at compile time.
We know how to compare two String
objects. Why not make
the compiler do the work and save us the overhead of dynamic
binding? We can do that by using simulated dynamic binding.
Simulated dynamic binding enables us to provide
the name of the deriving class to the base class. The base class
casts itself to the deriving class to call the required function.
Revising the Array
to use simulated dynamic binding looks
like this:
1template <typename T, typename Deriving>
2class Array {
3public:
4 ...
5 bool operator< (const Array<T, Deriving>& rhs)
6 { return static_cast<Deriving*>(this)->Compare(rhs) < 0; }
7
8 bool operator> (const Array<T, Deriving>& rhs)
9 { return static_cast<Deriving*>(this)->Compare(rhs) > 0; }
10
11 bool operator== (const Array<T, Deriving>& rhs)
12 { return static_cast<Deriving*>(this)->Compare(rhs) == 0; }
13 T m_rg[1024];
14};
Notice that the Array
template takes an
additional parameter: the name of the deriving class. It uses that
class name to perform a static cast on itself. Because the compiler
expands the code of the base class while instantiating the deriving
class, the static cast performs a perfectly safe downcast (normally
a contradiction in terms). The compiler uses the member functions
of the deriving cast to resolve the address of the required
function at compile time, saving us the cost of dynamic binding.
The deriving class must implement the appropriate member function
and pass its own name as a parameter to the base class
template:
1class String : public Array<char, String> {
2public:
3 // the compiler requires this method:
4 int Compare(const Array<char, String>& rhs)
5 { return strcmp(m_rg, rhs.m_rg); }
6};
This technique gives us the look and feel of dynamic binding without using virtual member functions. The base class can require functionality of the deriving class. The base class cannot be instantiated by itself. The major difference is that because no virtual functions are required, we don’t have the overhead of dynamic binding.
The Discovery of Simulated Dynamic Binding
It’s my understanding that the ATL team members discovered simulated dynamic binding accidentally. When they did, they went immediately to the compiler team members, who claimed that the C++ Standard does not mandate or prohibit such behavior, but they promised to keep it working for ATL. Does that mean you should use simulated dynamic binding when you’re writing your most portable code? Probably not. Should you still understand it because ATL is rife with it? Absolutely. Should you sneak it into your own bag of tricks? It’s in mine.
Function Templates
Classes are not the only thing that can be
parameterized; functions can be, too. The canonical example is the
min
function:
1inline long min(long a, long b) { return (a < b ? a : b); }
2inline float min(float a, float b) { return (a < b ? a : b); }
Because the code is the same for both overloaded
min
implementations, there’s no reason not to make it into
a template:
1template <typename T>
2inline T min(T a, T b) { return (a < b ? a : b); }
When the template is instantiated, the compiler generates the code on demand based on the use:
1void main() {
2 // produces: long min(long, long)
3 long a = 1, b = 2, c = min(a, b);
4
5 // produces: float min(float, float)
6 float x = 1.1, y = 2.2, z = min(x, y);
7
8 // produces: char *min(char *, char *)
9 char *r = "hello", *s = "world", *t = min(r, s);
10}
Notice that the compiler can figure out how to
insatiate the min
template implementation without any
fancy angle brackets. However, also notice that sometimes, just as
with class templates, function templates don’t expand the way we’d
like. For example, the min
that takes two char*
compares two pointer values, not the contents of the strings. A
function template can be specialized by merely providing an
implementation that takes the types involved:
1inline char* min(const char* a, const char* b) {
2 return (strcmp(a, b) < 0 ? a : b);
3}
In this case, because a version of the function
that takes two character pointers already exists, the compiler
binds to that one instead of generating another implementation
based on the min
function template.
Member Function Templates
The fun of templates does not stop at global
functions. Oh, no. You can create member function templates as
well. As of Visual C++ 6.0, the definition of IUnknown
has
been augmented with a member function template for
QueryInterface
:
1struct IUnknown {
2...
3 template <class Q>
4 HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
5 { return QueryInterface(__uuidof(Q), (void**)pp); }
6}
Before the member function template, all calls
to QueryInterface
had to be sure to match up the interface
type and the interface identifier:
1void Fly(IUnknown* punk) {
2 IBird* pbird = 0;
3 punk->QueryInterface(IID_ICat, (void**)&pbird); // Oops!
4 punk->QueryInterface(IID_IBird, (void**)pbird); // Double oops!
5 punk->QueryInterface(IID_IBird, (void**)&pbird); // OK
6 pbird->Fly();
7}
On the other hand, with the
QueryInterface
member function template, the type of the
interface suffices:
1void Fly(IUnknown* punk) {
2 IBird* pbird = 0;
3 punk->QueryInterface(&pbird); // __uuidof uses to determine IID
4 pbird->Fly();
5}
In effect, member function templates add new member functions of a class, just as function templates add new global functions based on use.
Summary
Templates provide two services. The first is code reuse, but in a different way than inheritance. Inheritance requires a certain type relationship. Templates require only that the compiler be capable of finding the functionality required of the specified type. Inheritance makes type requirements, whereas templates take what types we give them.
The other service that templates provide is efficiency. By mixing in types at compile time, we can use static binding to make decisions at runtime instead of at compile time. This is the key to the promise of smaller, faster code that templates make.