Chapter 4. Objects in ATL

ATL’s fundamental support for COM can be split into two pieces: objects and servers. This chapter covers classes and concentrates on how IUnknown is implemented as related to threading and various COM identity issues, such as standalone versus aggregated objects. The next chapter focuses on how to expose classes from COM servers.

Implementing `IUnknown`

A COM object has one responsibility: to implement the methods of IUnknown. Those methods perform two services, lifetime management and runtime type discovery, as follows:

interface IUnknown {
  // runtime type discovery
  HRESULT QueryInterface([in] REFIID riid,
                         [out, iid_is(riid)] void **ppv);

  // lifetime management
  ULONG AddRef();
  ULONG Release();
}

COM allows every object to implement these methods as it chooses (within certain restrictions, as described in Chapter 5, “COM Servers”). The canonical implementation is as follows:

// Server lifetime management
extern void ServerLock();
extern void ServerUnlock();

class CPenguin : public IBird, public ISnappyDresser {
public:
  CPenguin() : m_cRef(0) { ServerLock(); }
  virtual ~CPenguin()    { ServerUnlock(); }

  // IUnknown methods
  STDMETHODIMP QueryInterface(REFIID riid, void **ppv) {
      if( riid == IID_IBird || riid == IID_IUnknown )
          *ppv = static_cast<IBird*>(this);
      else if( riid == IID_ISnappyDresser )
          *ppv = static_cast<ISnappyDresser*>(this);
      else *ppv = 0;

      if( *ppv ) {
          reinterpret_cast<IUnknown*>(*ppv)->AddRef();
          return S_OK;
      }

      return E_NOINTERFACE;
  }

  STDMETHODIMP_(ULONG) AddRef()
  { return InterlockedIncrement(&m_cRef); }

  STDMETHODIMP_(ULONG) Release() {
      ULONG l = InterlockedDecrement(&m_cRef);
      if( l == 0 ) delete this;
      return l;
  }

  // IBird and ISnappyDresser methods...
private:
    LONG m_cRef; // LONG is the type the Interlocked functions expect
};

This implementation of IUnknown is based on several assumptions:

The object is heap-based because it removes itself using the delete operator. Furthermore, the object’s outstanding references completely govern its lifetime. When it has no more references, it deletes itself.
The object is capable of living in a multithreaded apartment because it manipulates the reference count in a thread-safe manner. Of course, the other methods must be implemented in a thread-safe manner as well for the object to be fully thread safe.
The object is standalone and cannot be aggregated because it does not cache a reference to a controlling outer, nor does it forward the methods of IUnknown to a controlling outer.
The object exposes its interfaces using multiple inheritance.
The existence of the object keeps the server running. The constructor and the destructor are used to lock and unlock the server, respectively.

These common assumptions are not the only possibilities. Common variations include the following:

An object can be global and live for the life of the server. Such objects do not need a reference count because they never delete themselves.
An object might not need to be thread safe because it might be meant to live only in a single-threaded apartment.
An object can choose to allow itself to be aggregated as well as, or instead of, supporting standalone activation.
An object can expose interfaces using other techniques besides multiple inheritance, including nested composition, tear-offs, and aggregation.
You might not want the existence of an object to force the server to keep running. This is common for global objects because their mere existence prohibits the server from unloading.

Changing any of these assumptions results in a different implementation of IUnknown, although the rest of the object’s implementation is unlikely to change much (with the notable exception of thread safety). These implementation details of IUnknown tend to take a very regular form and can be encapsulated into C++ classes. Frankly, we’d really like to use someone else’s tested code and be able to change our minds later without a great deal of effort. We’d also like this boilerplate code to be easily separated from the actual behavior of our objects so that we can focus on our domain-specific implementation. ATL was designed from the ground up to provide just this kind of functionality and flexibility.

The Layers of ATL

ATL’s support for building COM objects is separated into several layers, as shown in Figure 4.1.

Figure 4.1. The layers of ATL

These layers break down into services exposed by ATL for building objects:

CComObjectRootEx uses a CComXxxThreadModel to provide “just thread-safe enough” object lifetime management and object locking.
CComObjectRootBase and CComObjectRootEx provide helper functions used in implementing IUnknown.
Your class, which derives from CComObjectRootEx, also derives from any interfaces it wants to implement, as well as providing the method implementations. You or one of the ATL IXxxImpl classes can provide method implementations.
CComObject et al. provide the actual implementation of the methods of IUnknown in a way consistent with your requirements for object and server lifetime management. This final layer actually derives from your class.

Your choice of base classes and the most derived class determines the way the methods of IUnknown are implemented. If your choices change, using different classes at compile time (or runtime) will change how ATL implements IUnknown, independently of the rest of the behavior of your object. The following sections explore each layer of ATL.

Threading Model Support

Just Enough Thread Safety

The thread-safe implementation of AddRef and Release shown previously might be overkill for your COM objects. For example, if instances of a specific class will live in only a single-threaded apartment, there’s no reason to use the thread-safe Win32 functions InterlockedIncrement and InterlockedDecrement. For single-threaded objects, the following implementation of AddRef and Release is more efficient:

class CPenguin {
...
  ULONG AddRef()
  { return ++m_cRef; }

  ULONG Release() {
      ULONG l = --m_cRef;
      if( l == 0 ) delete this;
      return l;
  }
...
};

Using the thread-safe Win32 functions also works for single-threaded objects, but unnecessary thread safety requires extra overhead. For this reason, ATL provides three classes, CComSingleThreadModel, CComMultiThreadModel, and CComMultiThreadModelNoCS. These classes provide two static member functions, Increment and Decrement, for abstracting away the differences between managing an object’s lifetime count in a multithreaded manner versus a single-threaded one. The two versions of these functions are as follows (notice that both CComMultiThreadModel and CComMultiThreadModelNoCS have identical implementations of these functions):

class CComSingleThreadModel {
  static ULONG WINAPI Increment(LPLONG p) { return ++(*p); }
  static ULONG WINAPI Decrement(LPLONG p) { return --(*p); }
  ...
};

class CComMultiThreadModel {
  static ULONG WINAPI Increment(LPLONG p) { return InterlockedIncrement(p); }
  static ULONG WINAPI Decrement(LPLONG p) { return InterlockedDecrement(p); }
  ...
};
class CComMultiThreadModelNoCS {
  static ULONG WINAPI Increment(LPLONG p) { return InterlockedIncrement(p); }
  static ULONG WINAPI Decrement(LPLONG p) { return InterlockedDecrement(p); }
  ...
};

Using these classes, you can parameterize [1] the class to give a “just thread-safe enough” AddRef and Release implementation:

template <typename ThreadModel>
class CPenguin {
...
  ULONG AddRef()
  { return ThreadModel::Increment(&m_cRef); }

  ULONG Release() {
      ULONG l = ThreadModel::Decrement(&m_cRef);
      if( l == 0 ) delete this;
      return l;
  }
...
};

Now, based on our requirements for the CPenguin class, we can make it just thread-safe enough by supplying the threading model class as a template parameter:

// Let's make a thread-safe CPenguin
CPenguin<CComMultiThreadModel>* pobj = new CPenguin<CComMultiThreadModel>();
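
The same policy idea can be tried outside Windows. The following is a minimal sketch, not ATL code: SingleThreadModel, MultiThreadModel, and RefCounted are hypothetical stand-ins, and std::atomic is assumed as a portable substitute for InterlockedIncrement and InterlockedDecrement.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for CComSingleThreadModel: plain arithmetic.
struct SingleThreadModel {
    typedef long Counter;
    static long Increment(Counter* p) { return ++(*p); }
    static long Decrement(Counter* p) { return --(*p); }
};

// Hypothetical stand-in for CComMultiThreadModel: std::atomic plays the
// role of the Win32 Interlocked functions.
struct MultiThreadModel {
    typedef std::atomic<long> Counter;
    static long Increment(Counter* p) { return ++(*p); }
    static long Decrement(Counter* p) { return --(*p); }
};

// The counting code is written once; the policy decides what it costs.
template <typename ThreadModel>
class RefCounted {
public:
    // destroyedFlag is set by the destructor so callers can observe it.
    explicit RefCounted(bool* destroyedFlag)
        : m_cRef(0), m_destroyed(destroyedFlag) {}
    ~RefCounted() { *m_destroyed = true; }

    long AddRef() { return ThreadModel::Increment(&m_cRef); }

    long Release() {
        long l = ThreadModel::Decrement(&m_cRef);
        assert(l >= 0);            // more Releases than AddRefs is a bug
        if (l == 0) delete this;   // heap-based, as in the canonical case
        return l;
    }

private:
    typename ThreadModel::Counter m_cRef;
    bool* m_destroyed;
};
```

Switching an object from one model to the other is then a one-word change at the point of instantiation, which is the flexibility the threading model classes are meant to provide.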

Instance Data Synchronization

When you create a thread-safe object, protecting the object’s reference count isn’t enough. You also have to protect the member data from multithreaded access. One popular method for protecting data that multiple threads can access is to use a Win32 critical section object, as shown here:

template <typename ThreadModel>
class CPenguin {
public:
  CPenguin() {
    ServerLock();
    InitializeCriticalSection(&m_cs);
  }

  ~CPenguin() { ServerUnlock(); DeleteCriticalSection(&m_cs); }

  // IBird
  STDMETHODIMP get_Wingspan(long* pnWingspan) {
    Lock(); // Lock out other threads during data read
    *pnWingspan = m_nWingspan;
    Unlock();
    return S_OK;
  }

  STDMETHODIMP put_Wingspan(long nWingspan) {
    Lock(); // Lock out other threads during data write
    m_nWingspan = nWingspan;
    Unlock();
    return S_OK;
  }
  ...
private:
  CRITICAL_SECTION m_cs;

  void Lock() { EnterCriticalSection(&m_cs); }
  void Unlock() { LeaveCriticalSection(&m_cs); }
};

Notice that before reading or writing any member data, the CPenguin object enters the critical section, locking out access by other threads. This coarse-grained, object-level locking keeps the scheduler from swapping in another thread that could corrupt the data members during a read or a write on the original thread. However, object-level locking doesn’t give you as much concurrency as you might like. If you have only one critical section per object, one thread might be blocked trying to increment the reference count while another is updating an unrelated member variable. A greater degree of concurrency requires more critical sections, allowing one thread to access one data member while a second thread accesses another. Be careful using this kind of finer-grained synchronization – it often leads to deadlock:

class CZax : public IZax {
public:
  ...
  // IZax
  STDMETHODIMP GoNorth() {
      EnterCriticalSection(&m_cs1); // Enter cs1...
      EnterCriticalSection(&m_cs2); // ...then enter cs2
      // Go north...
      LeaveCriticalSection(&m_cs2);
      LeaveCriticalSection(&m_cs1);
      return S_OK;
  }

  STDMETHODIMP GoSouth() {
      EnterCriticalSection(&m_cs2); // Enter cs2...
      EnterCriticalSection(&m_cs1); // ...then enter cs1
      // Go south...
      LeaveCriticalSection(&m_cs1);
      LeaveCriticalSection(&m_cs2);
      return S_OK;
  }
  ...
private:
    CRITICAL_SECTION m_cs1;
    CRITICAL_SECTION m_cs2;
};

Imagine that the scheduler let the northbound Zax [2] thread enter the first critical section and then swapped in the southbound Zax thread to enter the second critical section. If this happened, neither Zax could enter the other critical section; therefore, neither Zax thread would be able to proceed. This would leave them deadlocked while the world went on without them. Try to avoid this. [3]
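
One common remedy, which is general practice rather than anything ATL provides, is to make sure every code path acquires the locks through a single deadlock-free mechanism. The sketch below assumes standard C++ instead of Win32: std::mutex stands in for CRITICAL_SECTION, and the Zax class and its Steps method are hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical portable analogue of CZax.
class Zax {
public:
    Zax() : m_steps(0) {}

    void GoNorth() {
        // std::lock acquires both mutexes using a deadlock-avoidance
        // algorithm, so the order in which they are named is irrelevant.
        std::lock(m_cs1, m_cs2);
        std::lock_guard<std::mutex> g1(m_cs1, std::adopt_lock);
        std::lock_guard<std::mutex> g2(m_cs2, std::adopt_lock);
        ++m_steps;   // Go north...
    }

    void GoSouth() {
        std::lock(m_cs2, m_cs1);   // reversed order, still safe
        std::lock_guard<std::mutex> g2(m_cs2, std::adopt_lock);
        std::lock_guard<std::mutex> g1(m_cs1, std::adopt_lock);
        --m_steps;   // Go south...
    }

    int Steps() {
        // Both mutexes guard m_steps for writers, so holding one suffices.
        std::lock_guard<std::mutex> g(m_cs1);
        return m_steps;
    }

private:
    std::mutex m_cs1;
    std::mutex m_cs2;
    int m_steps;
};

// Usage sketch: one step each way returns to the starting point.
inline void DemoZax() {
    Zax z;
    z.GoNorth();
    z.GoSouth();
    assert(z.Steps() == 0);
}
```

With raw critical sections, the equivalent discipline is to pick one order (say, m_cs1 before m_cs2) and enter them in that order in every method.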

Whether you decide to use object-level locking or finer-grained locking, critical sections are handy. ATL provides four class wrappers that simplify their use: CComCriticalSection, CComAutoCriticalSection, CComSafeDeleteCriticalSection, and CComAutoDeleteCriticalSection.

class CComCriticalSection {
public:
    CComCriticalSection() {
        memset(&m_sec, 0, sizeof(CRITICAL_SECTION));
    }
    ~CComCriticalSection() { }
    HRESULT Lock() {
        EnterCriticalSection(&m_sec);
        return S_OK;
    }

    HRESULT Unlock() {
        LeaveCriticalSection(&m_sec);
        return S_OK;
    }

    HRESULT Init() {
        HRESULT hRes = E_FAIL;
        __try {
            InitializeCriticalSection(&m_sec);
            hRes = S_OK;
        }
        // structured exception may be raised in
        // low memory situations
        __except(STATUS_NO_MEMORY == GetExceptionCode()) {
            hRes = E_OUTOFMEMORY;
        }

        return hRes;
    }

    HRESULT Term() {
        DeleteCriticalSection(&m_sec);
        return S_OK;
    }
    CRITICAL_SECTION m_sec;
};

class CComAutoCriticalSection : public CComCriticalSection {
public:
    CComAutoCriticalSection() {
        HRESULT hr = CComCriticalSection::Init();
        if (FAILED(hr))
            AtlThrow(hr);
    }
    ~CComAutoCriticalSection() {
        CComCriticalSection::Term();
    }
private:
    // Not implemented. CComAutoCriticalSection::Init
    // should never be called
    HRESULT Init();
    // Not implemented. CComAutoCriticalSection::Term
    // should never be called
    HRESULT Term();
};

class CComSafeDeleteCriticalSection
    : public CComCriticalSection {
public:
    CComSafeDeleteCriticalSection(): m_bInitialized(false) { }

    ~CComSafeDeleteCriticalSection() {
        if (!m_bInitialized) { return; }
        m_bInitialized = false;
        CComCriticalSection::Term();
    }

    HRESULT Init() {
        ATLASSERT( !m_bInitialized );
        HRESULT hr = CComCriticalSection::Init();
        if (SUCCEEDED(hr)) {
            m_bInitialized = true;
        }
        return hr;
    }

    HRESULT Term() {
        if (!m_bInitialized) { return S_OK; }
        m_bInitialized = false;
        return CComCriticalSection::Term();
    }

    HRESULT Lock() {
        ATLASSUME(m_bInitialized);
        return CComCriticalSection::Lock();
    }

private:
    bool m_bInitialized;
};

class CComAutoDeleteCriticalSection : public CComSafeDeleteCriticalSection {
private:
    // CComAutoDeleteCriticalSection::Term should never be called
    HRESULT Term() ;
};

Notice that CComCriticalSection does not use its constructor or destructor to initialize and delete the contained critical section. Instead, it contains Init and Term functions for this purpose. CComAutoCriticalSection, on the other hand, is easier to use because it automatically creates the critical section in its constructor and destroys it in the destructor.

CComSafeDeleteCriticalSection does half that job; it doesn’t create the critical section until the Init method is called, but it always deletes the critical section (if it exists) in the destructor. You also have the option of manually calling Term if you want to explicitly delete the critical section ahead of the object’s destruction. CComAutoDeleteCriticalSection, on the other hand, blocks the Term method by simply declaring it but never defining it; calling CComAutoDeleteCriticalSection::Term gives you a linker error. These classes were useful before ATL was consistent about supporting construction for global and static variables, but these classes are largely around for historical reasons at this point; you should prefer CComAutoCriticalSection.

Using a CComAutoCriticalSection in our CPenguin class simplifies the code a bit:

template <typename ThreadModel>
class CPenguin {
public:
  // IBird methods Lock() and Unlock() as before...
...
private:
  CComAutoCriticalSection m_cs;

  void Lock() { m_cs.Lock(); }
  void Unlock() { m_cs.Unlock(); }
};

Note that with both CComAutoCriticalSection and CComCriticalSection, the user must take care to explicitly call Unlock before leaving a section of code that has been protected by a call to Lock. In the presence of code that might throw exceptions (which a great deal of ATL framework code now does), this can be difficult to do because each piece of code that can throw an exception represents a possible exit point from the function. CComCritSecLock addresses this issue by automatically locking and unlocking in its constructor and destructor. CComCritSecLock is parameterized by the lock type so that it can serve as a wrapper for CComCriticalSection or CComAutoCriticalSection.

template< class TLock >
class CComCritSecLock {
public:
      CComCritSecLock( TLock& cs, bool bInitialLock = true );
      ~CComCritSecLock() ;

      HRESULT Lock() ;
      void Unlock() ;

// Implementation
private:
      TLock& m_cs;
      bool m_bLocked;
...
};

template< class TLock >
inline CComCritSecLock< TLock >::CComCritSecLock(
    TLock& cs,bool bInitialLock )
    : m_cs( cs ), m_bLocked( false ) {
      if( bInitialLock ) {
            HRESULT hr;
            hr = Lock();
            if( FAILED( hr ) ) { AtlThrow( hr ); }
      }
}

template< class TLock >
inline CComCritSecLock< TLock >::~CComCritSecLock() {
      if( m_bLocked ) { Unlock(); }
}

template< class TLock >
inline HRESULT CComCritSecLock< TLock >::Lock() {
      HRESULT hr;
      ATLASSERT( !m_bLocked );
      hr = m_cs.Lock();
      if( FAILED( hr ) ) { return( hr ); }
      m_bLocked = true;
      return( S_OK );
}

template< class TLock >
inline void CComCritSecLock< TLock >::Unlock() {
      ATLASSERT( m_bLocked );
      m_cs.Unlock();
      m_bLocked = false;
}

If the bInitialLock parameter to the constructor is true, the contained critical section is locked upon construction. In normal use on the stack, this is exactly what you want, which is why true is the default. However, as usual with constructors, if something goes wrong, you don’t have an easy way to return the failure code. If you need to know whether the lock failed, you can pass false instead and then call Lock explicitly. Lock returns the HRESULT from the lock operation. This class ensures that the contained critical section is unlocked whenever an instance of this class leaves scope because the destructor automatically attempts to call Unlock if it detects that the instance is currently locked.
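
To see why a scoped lock matters when exceptions are in play, consider the following portable sketch. It is not ATL code: CountingLock, ScopedLock, and ReadGuarded are hypothetical names, and the lock only counts calls so that the effect can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical lock with the same Lock/Unlock shape as the ATL wrappers.
struct CountingLock {
    int depth;
    CountingLock() : depth(0) {}
    int Lock()   { ++depth; return 0; }   // 0 plays the role of S_OK
    int Unlock() { assert(depth > 0); --depth; return 0; }
};

// Hypothetical analogue of CComCritSecLock: lock in the constructor,
// unlock in the destructor.
template <class TLock>
class ScopedLock {
public:
    explicit ScopedLock(TLock& cs) : m_cs(cs) { m_cs.Lock(); }
    ~ScopedLock() { m_cs.Unlock(); }
private:
    ScopedLock(const ScopedLock&);             // not copyable
    ScopedLock& operator=(const ScopedLock&);  // not assignable
    TLock& m_cs;
};

// A function with two exit paths: a normal return and an exception.
inline int ReadGuarded(CountingLock& cs, const int* value) {
    ScopedLock<CountingLock> lock(cs);
    if (!value) throw std::invalid_argument("null pointer");
    return *value;   // the destructor unlocks on this path as well
}
```

With ATL itself, the corresponding line inside a method would presumably be a local such as `CComCritSecLock<CComAutoCriticalSection> lock(m_cs);`, based on the constructor signature shown above.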

Notice that our CPenguin class is still parameterized by the threading model. There’s no sense in protecting our member variables in the single-threaded case. Instead, it would be handy to have another critical section class that could be used in place of CComCriticalSection or CComAutoCriticalSection. ATL provides the CComFakeCriticalSection class for this purpose:

class CComFakeCriticalSection {
public:
    HRESULT Lock() { return S_OK; }
    HRESULT Unlock() { return S_OK; }
    HRESULT Init() { return S_OK; }
    HRESULT Term() { return S_OK; }
};

Given CComFakeCriticalSection, we could further parameterize the CPenguin class by adding another template parameter, but this is unnecessary. The ATL threading model classes already contain type definitions that map to a real or fake critical section, based on whether you’re doing single or multithreading:

class CComSingleThreadModel {
public:
    static ULONG WINAPI Increment(LPLONG p) {return ++(*p);}
    static ULONG WINAPI Decrement(LPLONG p) {return --(*p);}
    typedef CComFakeCriticalSection AutoCriticalSection;
    typedef CComFakeCriticalSection AutoDeleteCriticalSection;
    typedef CComFakeCriticalSection CriticalSection;
    typedef CComSingleThreadModel ThreadModelNoCS;
};

class CComMultiThreadModel {
public:
    static ULONG WINAPI Increment(LPLONG p) {return InterlockedIncrement(p);}
    static ULONG WINAPI Decrement(LPLONG p) {return InterlockedDecrement(p);}
    typedef CComAutoCriticalSection AutoCriticalSection;
    typedef CComAutoDeleteCriticalSection
        AutoDeleteCriticalSection;
    typedef CComCriticalSection CriticalSection;
    typedef CComMultiThreadModelNoCS ThreadModelNoCS;
};

class CComMultiThreadModelNoCS {
public:
    static ULONG WINAPI Increment(LPLONG p) {return InterlockedIncrement(p);}
    static ULONG WINAPI Decrement(LPLONG p) {return InterlockedDecrement(p);}
    typedef CComFakeCriticalSection AutoCriticalSection;
    typedef CComFakeCriticalSection AutoDeleteCriticalSection;
    typedef CComFakeCriticalSection CriticalSection;
    typedef CComMultiThreadModelNoCS ThreadModelNoCS;
};

These type definitions enable us to make the CPenguin class just thread safe enough for both the object’s reference count and coarse-grained object synchronization:

template <typename ThreadingModel>
class CPenguin {
public:
    // IBird methods as before...
...
private:
    typename ThreadingModel::AutoCriticalSection m_cs;

    void Lock() { m_cs.Lock(); }
    void Unlock() { m_cs.Unlock(); }
};

This technique enables you to provide the compiler with operations that are just thread safe enough. If the threading model is CComSingleThreadModel, the calls to Increment and Decrement resolve to operator++ and operator--, and the Lock and Unlock calls resolve to empty inline functions.

If the threading model is CComMultiThreadModel, the calls to Increment and Decrement resolve to calls to InterlockedIncrement and InterlockedDecrement. The Lock and Unlock calls resolve to calls to EnterCriticalSection and LeaveCriticalSection.

Finally, if the model is CComMultiThreadModelNoCS, the calls to Increment and Decrement are thread safe, but the critical section is fake, just as with CComSingleThreadModel. CComMultiThreadModelNoCS is designed for multithreaded objects that eschew object-level locking in favor of a more fine-grained scheme. Table 4.1 shows how the code is expanded based on the threading model class you use:

Table 4.1. Expanded Code Based on Threading Model Class

	`CComSingleThreadModel`	`CComMultiThreadModel`	`CComMultiThreadModelNoCS`
`TM::Increment`	`++`	`InterlockedIncrement`	`InterlockedIncrement`
`TM::Decrement`	`--`	`InterlockedDecrement`	`InterlockedDecrement`
`TM::AutoCriticalSection::Lock`	`(Nothing)`	`EnterCriticalSection`	`(Nothing)`
`TM::AutoCriticalSection::Unlock`	`(Nothing)`	`LeaveCriticalSection`	`(Nothing)`
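
The selection mechanism behind the table can be sketched in portable C++. Everything here is an assumption made for illustration: FakeLock, RealLock, the two model structs, and this Penguin are hypothetical stand-ins, and std::mutex replaces the Win32 critical section.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Hypothetical counterpart of CComFakeCriticalSection: does nothing.
struct FakeLock {
    void Lock() {}
    void Unlock() {}
};

// Hypothetical counterpart of CComAutoCriticalSection.
struct RealLock {
    void Lock()   { m_mutex.lock(); }
    void Unlock() { m_mutex.unlock(); }
private:
    std::mutex m_mutex;
};

// Each model names the lock type that suits it.
struct SingleThreadModel { typedef FakeLock AutoCriticalSection; };
struct MultiThreadModel  { typedef RealLock AutoCriticalSection; };

template <typename ThreadingModel>
class Penguin {
public:
    // Exposed only so the selected lock type can be inspected.
    typedef typename ThreadingModel::AutoCriticalSection LockType;

    Penguin() : m_nWingspan(0) {}

    long GetWingspan() {
        m_cs.Lock();
        long n = m_nWingspan;
        m_cs.Unlock();
        return n;
    }

    void SetWingspan(long n) {
        m_cs.Lock();
        m_nWingspan = n;
        m_cs.Unlock();
    }

private:
    LockType m_cs;
    long m_nWingspan;
};

// The choice is made entirely at compile time.
static_assert(std::is_same<Penguin<SingleThreadModel>::LockType, FakeLock>::value,
              "single-threaded model should pick the no-op lock");
static_assert(std::is_same<Penguin<MultiThreadModel>::LockType, RealLock>::value,
              "multithreaded model should pick the real lock");

// Usage sketch: identical source under either model.
inline void DemoThreadModels() {
    Penguin<SingleThreadModel> a;
    Penguin<MultiThreadModel> b;
    a.SetWingspan(3);
    b.SetWingspan(3);
    assert(a.GetWingspan() == b.GetWingspan());
}
```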

The Server’s Default Threading Model

ATL-based servers have a concept of a “default” threading model for things that you don’t specify directly. To set the server’s default threading model, you define one of the following symbols: _ATL_SINGLE_THREADED, _ATL_APARTMENT_THREADED, or _ATL_FREE_THREADED. If you don’t specify one of these symbols, ATL assumes _ATL_FREE_THREADED. However, the ATL Project Wizard defines _ATL_APARTMENT_THREADED in the generated stdafx.h file. ATL uses these symbols to define two type definitions:

#if defined(_ATL_SINGLE_THREADED)
...
    typedef CComSingleThreadModel CComObjectThreadModel;
    typedef CComSingleThreadModel CComGlobalsThreadModel;

#elif defined(_ATL_APARTMENT_THREADED)
...
    typedef CComSingleThreadModel CComObjectThreadModel;
    typedef CComMultiThreadModel CComGlobalsThreadModel;

#elif defined(_ATL_FREE_THREADED)
...
    typedef CComMultiThreadModel CComObjectThreadModel;
    typedef CComMultiThreadModel CComGlobalsThreadModel;
...
#endif

Internally, ATL uses CComObjectThreadModel to protect instance data and CComGlobalsThreadModel to protect global and static data. Because the usage is difficult to override in some cases, you should make sure that ATL is compiled using the most protective threading model of any of the classes in your server. In practice, this means you should change the wizard-generated _ATL_APARTMENT_THREADED symbol to _ATL_FREE_THREADED if you have even one multithreaded class in your server.

The Core of IUnknown

Standalone Reference Counting

To encapsulate the Lock and Unlock methods as well as the “just thread-safe enough” reference counting, ATL provides the CComObjectRootEx base class, parameterized by the desired threading model [4]:

template <class ThreadModel>
class CComObjectRootEx : public CComObjectRootBase {
public:
    typedef ThreadModel _ThreadModel;
    typedef typename _ThreadModel::AutoCriticalSection _CritSec;
    typedef typename _ThreadModel::AutoDeleteCriticalSection _AutoDelCritSec;
    typedef CComObjectLockT<_ThreadModel> ObjectLock;

    ~CComObjectRootEx() {}

    ULONG InternalAddRef() {
        ATLASSERT(m_dwRef != -1L);
        return _ThreadModel::Increment(&m_dwRef);
    }
    ULONG InternalRelease() {
#ifdef _DEBUG
        LONG nRef = _ThreadModel::Decrement(&m_dwRef);
        if (nRef < -(LONG_MAX / 2)) {
            ATLASSERT(0 &&
            _T("Release called on a pointer that has"
               " already been released"));
        }
        return nRef;
#else
        return _ThreadModel::Decrement(&m_dwRef);
#endif
    }

    HRESULT _AtlInitialConstruct() { return m_critsec.Init(); }
    void Lock() {m_critsec.Lock();}
    void Unlock() {m_critsec.Unlock();}
private:
    _AutoDelCritSec m_critsec;
};

template <>
class CComObjectRootEx<CComSingleThreadModel>
    : public CComObjectRootBase {
public:
    typedef CComSingleThreadModel _ThreadModel;
    typedef _ThreadModel::AutoCriticalSection _CritSec;
    typedef _ThreadModel::AutoDeleteCriticalSection
        _AutoDelCritSec;
    typedef CComObjectLockT<_ThreadModel> ObjectLock;

    ~CComObjectRootEx() {}

    ULONG InternalAddRef() {
        ATLASSERT(m_dwRef != -1L);
        return _ThreadModel::Increment(&m_dwRef);
    }
    ULONG InternalRelease() {
#ifdef _DEBUG
        long nRef = _ThreadModel::Decrement(&m_dwRef);
        if (nRef < -(LONG_MAX / 2)) {
            ATLASSERT(0 && _T("Release called on a pointer "
                      "that has already been released"));
        }
        return nRef;
#else
        return _ThreadModel::Decrement(&m_dwRef);
#endif
    }

    HRESULT _AtlInitialConstruct() { return S_OK; }

    void Lock() {}
    void Unlock() {}
};

ATL classes derive from CComObjectRootEx and forward AddRef and Release calls to the InternalAddRef and InternalRelease methods when the object is created standalone (that is, not aggregated). Note that InternalRelease checks the decremented reference count against the somewhat odd-looking value -(LONG_MAX / 2). The destructor of CComObject (or one of its alternatives, discussed a bit later) sets the reference count to this value. The ATL designers could have used a different value here, but basing the value on LONG_MAX makes it unlikely that such a reference count could be reached under normal circumstances. Dividing LONG_MAX by 2 ensures that the resulting value can’t mistakenly be reached by wrapping around from 0. InternalRelease simply checks whether the decremented count has dropped below this value, which indicates that you’re calling Release on an object that has already been destroyed. If so, an assert is issued in debug builds.
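
The arithmetic of that check is easy to try in isolation. In the following sketch, DebugRefCount and MarkDestroyed are hypothetical names; the counter is kept in a plain struct so that it can still be inspected after the simulated destruction, which real code could not safely do.

```cpp
#include <cassert>
#include <climits>

// Hypothetical miniature of the debug-build bookkeeping.
struct DebugRefCount {
    long m_dwRef;
    DebugRefCount() : m_dwRef(0) {}

    long AddRef() {
        assert(m_dwRef != -1L);   // mirrors the ATLASSERT in InternalAddRef
        return ++m_dwRef;
    }

    // Returns the new count. *pOverReleased is set when the count has
    // fallen below the sentinel, which happens only after MarkDestroyed().
    long Release(bool* pOverReleased) {
        long nRef = --m_dwRef;
        *pOverReleased = (nRef < -(LONG_MAX / 2));
        return nRef;
    }

    // Stands in for what the most derived class's destructor does.
    void MarkDestroyed() { m_dwRef = -(LONG_MAX / 2); }
};
```

An ordinary underflow (Release on a count of 0) yields -1, far above the sentinel, so only a Release after destruction trips the check.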

The template specialization for CComSingleThreadModel demonstrates the “just safe enough” multithreading. When used in a single-threaded object, the Lock and Unlock methods do nothing, and no critical section object is created.

With the Lock and Unlock methods so readily available in the base class, you might be tempted to write the following incorrect code:

class CPenguin
    : public CComObjectRootEx<CComMultiThreadModel>, ... {
    STDMETHODIMP get_Wingspan(long* pnWingspan) {
      Lock();
      if( !pnWingspan ) return E_POINTER; // Forgot to Unlock
      *pnWingspan = m_nWingspan;
      Unlock();
      return S_OK;
    }
    ...
};

To help you avoid this kind of mistake, CComObjectRootEx provides a type definition for a class called ObjectLock, based on CComObjectLockT parameterized by the threading model:

template <class ThreadModel>
class CComObjectLockT {
public:
    CComObjectLockT(CComObjectRootEx<ThreadModel>* p) {
        if (p)
            p->Lock();
        m_p = p;
    }

    ~CComObjectLockT() {
        if (m_p)
            m_p->Unlock();
    }
    CComObjectRootEx<ThreadModel>* m_p;
};

template <>
class CComObjectLockT<CComSingleThreadModel> {
public:
    CComObjectLockT(CComObjectRootEx<CComSingleThreadModel>*) {}
    ~CComObjectLockT() {}
};

Instances of CComObjectLockT Lock the object passed to the constructor and Unlock it upon destruction. The ObjectLock type definition provides a convenient way to write code that will properly release the lock regardless of the return path:

class CPenguin
    : public CComObjectRootEx<CComMultiThreadModel>, ... {
    STDMETHODIMP get_Wingspan(long* pnWingspan) {
      ObjectLock lock(this);
      if( !pnWingspan ) return E_POINTER; // Unlock happens as
                                          // stack unwinds
      *pnWingspan = m_nWingspan;
      return S_OK;
    }
    ...
};

Of course, the specialization for CComSingleThreadModel ensures that in the single-threaded object, no locking is done. This is useful when you’ve changed your threading model; you don’t pay a performance penalty for using an ObjectLock if you don’t actually need one.
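
The effect of the specialization can be observed with a small portable sketch. Root, ObjectLockT, and this Penguin are hypothetical stand-ins rather than the ATL classes, and Root merely counts calls instead of entering a critical section.

```cpp
#include <cassert>

struct SingleThreadModel {};
struct MultiThreadModel {};

// Hypothetical root class that records lock activity.
template <class ThreadModel>
struct Root {
    int lockCalls;
    int depth;
    Root() : lockCalls(0), depth(0) {}
    void Lock()   { ++lockCalls; ++depth; }
    void Unlock() { assert(depth > 0); --depth; }
};

// General case: lock on construction, unlock on destruction.
template <class ThreadModel>
class ObjectLockT {
public:
    explicit ObjectLockT(Root<ThreadModel>* p) : m_p(p) { if (m_p) m_p->Lock(); }
    ~ObjectLockT() { if (m_p) m_p->Unlock(); }
private:
    Root<ThreadModel>* m_p;
};

// Single-threaded specialization: no state and no calls.
template <>
class ObjectLockT<SingleThreadModel> {
public:
    explicit ObjectLockT(Root<SingleThreadModel>*) {}
    ~ObjectLockT() {}
};

template <class ThreadModel>
class Penguin : public Root<ThreadModel> {
public:
    typedef ObjectLockT<ThreadModel> ObjectLock;

    Penguin() : m_nWingspan(12) {}

    // Returns false for a null argument; the lock is released either way.
    bool GetWingspan(long* pnWingspan) {
        ObjectLock lock(this);
        if (!pnWingspan) return false;   // early return, no leaked lock
        *pnWingspan = m_nWingspan;
        return true;
    }

private:
    long m_nWingspan;
};
```

The method body is identical under both models; only the template argument decides whether any locking work happens.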

Table-Driven QueryInterface

In addition to “just thread-safe enough” implementations of AddRef and Release for standalone COM objects, CComObjectRootEx (via its base class, CComObjectRootBase) provides a static, table-driven implementation of QueryInterface called InternalQueryInterface:

static HRESULT WINAPI
CComObjectRootBase::InternalQueryInterface(
    void*                    pThis,
    const _ATL_INTMAP_ENTRY* pEntries,
    REFIID                   iid,
    void**                   ppvObject);

This function’s job is to use the this pointer of the object, provided as the pThis parameter, and the requested interface to fill the ppvObject parameter with a pointer to the appropriate virtual function table pointer (vptr). It does this using the pEntries parameter, a zero-terminated array of _ATL_INTMAP_ENTRY structures:

struct _ATL_INTMAP_ENTRY {
    const IID*           piid;
    DWORD                dw;
    _ATL_CREATORARGFUNC* pFunc;
};

Each interface exposed from a COM object is one entry in the interface map, which is a class static array of _ATL_INTMAP_ENTRY structures. Each entry consists of an interface identifier, a function pointer, and an argument for the function represented as a DWORD. This provides a flexible, extensible mechanism for implementing QueryInterface that supports multiple inheritance, aggregation, tear-offs, nested composition, debugging, chaining, and just about any other wacky COM identity tricks C++ programmers currently use. [5] However, because most interfaces are implemented using multiple inheritance, you don’t often need this much flexibility. For example, consider one possible object layout for instances of the CPenguin class, shown in Figure 4.2.

Figure 4.2. CPenguin object layout, including vptrs to vtbls

class CPenguin : public IBird, public ISnappyDresser {...};

The typical implementation of QueryInterface for a class using multiple inheritance consists of a series of if statements and static_cast operations; the purpose is to adjust the this pointer by some fixed offset to point to the appropriate vptr. Because the offsets are known at compile time, a table matching interface identifiers to offsets would provide an appropriate data structure for adjusting the this pointer at runtime. To support this common case, the InternalQueryInterface function treats the _ATL_INTMAP_ENTRY as a simple IID/offset pair if the pFunc member has the special value _ATL_SIMPLEMAPENTRY:

#define _ATL_SIMPLEMAPENTRY ((ATL::_ATL_CREATORARGFUNC*)1)
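
The IID/offset idea can be illustrated without COM. The sketch below is a simplified, portable model under several assumptions: enum values replace IIDs, MapEntry and TableQueryInterface are hypothetical names, reference counting is left out, and the offsets are measured at runtime from a probe instance rather than computed at compile time as ATL does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical interface identifiers; ID_IFish is deliberately unmapped.
enum InterfaceId { ID_End = 0, ID_IBird, ID_ISnappyDresser, ID_IFish };

struct IBird          { virtual int Fly() = 0;   virtual ~IBird() {} };
struct ISnappyDresser { virtual int Dress() = 0; virtual ~ISnappyDresser() {} };

// One table row: an identifier plus the byte offset of the matching
// base-class subobject within the implementing class.
struct MapEntry {
    InterfaceId    id;
    std::ptrdiff_t offset;
};

// Walks a table terminated by ID_End and adjusts pThis by the stored offset.
inline void* TableQueryInterface(void* pThis, const MapEntry* entries,
                                 InterfaceId id) {
    assert(pThis && entries);
    for (; entries->id != ID_End; ++entries) {
        if (entries->id == id)
            return static_cast<char*>(pThis) + entries->offset;
    }
    return 0;   // no such interface
}

class Penguin : public IBird, public ISnappyDresser {
public:
    int Fly() override   { return 0; }   // penguins stay on the ground
    int Dress() override { return 1; }

    void* QueryInterface(InterfaceId id) {
        return TableQueryInterface(this, GetEntries(), id);
    }

    static const MapEntry* GetEntries() {
        // The probe exists only so base-class offsets can be measured.
        static Penguin probe;
        static char* const base = reinterpret_cast<char*>(&probe);
        static const MapEntry entries[] = {
            { ID_IBird,
              reinterpret_cast<char*>(static_cast<IBird*>(&probe)) - base },
            { ID_ISnappyDresser,
              reinterpret_cast<char*>(static_cast<ISnappyDresser*>(&probe)) - base },
            { ID_End, 0 }
        };
        return entries;
    }
};
```

The lookup returns the same address that a static_cast to the requested base would produce, which is all the if/static_cast chain in a hand-written QueryInterface achieves.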

To be able to use the InternalQueryInterface function, each implementation populates a static interface map. To facilitate populating this data structure, and to provide some other methods used internally, ATL provides the following macros (as well as others described in Chapter 6, “Interface Maps”):

#define BEGIN_COM_MAP(class) ...
#define COM_INTERFACE_ENTRY(itf) ...
#define END_COM_MAP() ...

For example, our CPenguin class would declare its interface map like this:

class CPenguin :
    public CComObjectRootEx<CComMultiThreadModel>,
    public IBird,
    public ISnappyDresser {
...
public:
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
END_COM_MAP()
...
};

In an abbreviated form, this would expand to the following:

class CPenguin :
    public CComObjectRootEx<CComMultiThreadModel>,
    public IBird,
    public ISnappyDresser {
...
public:
  IUnknown* GetUnknown() {
      ATLASSERT(_GetEntries()[0].pFunc == _ATL_SIMPLEMAPENTRY);
      return (IUnknown*)((INT_PTR)this+_GetEntries()->dw);
  }
  HRESULT _InternalQueryInterface(REFIID iid, void** ppvObject) {
      return InternalQueryInterface(this, _GetEntries(), iid, ppvObject);
  }
  const static _ATL_INTMAP_ENTRY* WINAPI _GetEntries() {
     static const _ATL_INTMAP_ENTRY _entries[] = {
         { &_ATL_IIDOF(IBird),          0, _ATL_SIMPLEMAPENTRY },
         { &_ATL_IIDOF(ISnappyDresser), 4, _ATL_SIMPLEMAPENTRY },
         { 0, 0, 0 }
     };
     return _entries;
  }
...
};

The _ATL_IIDOF macro expands as follows:

#ifndef _ATL_NO_UUIDOF
#define _ATL_IIDOF(x) __uuidof(x)
#else
#define _ATL_IIDOF(x) IID_##x
#endif

This macro lets you choose, for the entire project, whether to use the __uuidof operator or the standard naming convention to specify the IID for the interface in question.

Figure 4.3 shows how this interface map relates to an instance of a CPenguin object in memory.

Figure 4.3. ``CPenguin`` interface map, ``CPenguin`` object, and vtbls

Something else worth mentioning is the GetUnknown member function that the BEGIN_COM_MAP macro provides. Although ATL uses this internally, it’s also useful when passing your this pointer to a function that requires an IUnknown*. Because your class derives from potentially more than one interface, each of which derives from IUnknown, the compiler considers passing your own this pointer as an IUnknown* to be ambiguous.

HRESULT FlyInAnAirplane(IUnknown* punkPassenger);

// Penguin.cpp
STDMETHODIMP CPenguin::Fly() {
    return FlyInAnAirplane(this); // ambiguous
}

For these situations, GetUnknown is your friend:

STDMETHODIMP CPenguin::Fly() {
    return FlyInAnAirplane(this->GetUnknown()); // unambiguous
}

As you’ll see later, GetUnknown is implemented by handing out the first entry in the interface map.
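
To see the ambiguity outside of COM, here is a small portable sketch. It assumes nothing about ATL: IUnknownLike, the "Like" interfaces, Board, and this GetUnknown are hypothetical stand-ins that only mimic the shape of the problem, namely one root interface reached through two inheritance branches.

```cpp
// Hypothetical root interface standing in for IUnknown.
struct IUnknownLike {
    virtual int Id() = 0;
    virtual ~IUnknownLike() = default;
};
struct IBirdLike    : IUnknownLike {};
struct IDresserLike : IUnknownLike {};

class Penguin : public IBirdLike, public IDresserLike {
public:
    int Id() override { return 42; }

    // Mimics the role of GetUnknown: always walk up through the
    // first interface so the conversion has exactly one path.
    IUnknownLike* GetUnknown() { return static_cast<IBirdLike*>(this); }
};

// A function that, like FlyInAnAirplane, wants the root interface.
int Board(IUnknownLike* passenger) { return passenger->Id(); }

int FlyInAnAirplane(Penguin& p) {
    // return Board(&p);          // error: ambiguous conversion
    return Board(p.GetUnknown()); // unambiguous
}
```

The commented-out line fails to compile for the same reason as the CPenguin::Fly example: Penguin contains two IUnknownLike subobjects, and the compiler refuses to pick one for you.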

Support for Aggregation: The Controlled Inner

So far, we’ve discussed the implementation of IUnknown for standalone COM objects. However, if our object is to participate in aggregation as a controlled inner, our job is not to think for ourselves, but rather to be subsumed by the thoughts and prayers of another. A controlled inner does this by blindly forwarding all calls on the publicly available implementation of IUnknown to the controlling outer’s implementation. The controlling outer’s implementation is provided as the pUnkOuter argument to the CreateInstance method of IClassFactory. If our ATL-based COM object is used as a controlled inner, it simply forwards all calls to IUnknown methods to the OuterQueryInterface, OuterAddRef, and OuterRelease functions provided in CComObjectRootBase; these, in turn, forward to the controlling outer. The relevant functions of CComObjectRootBase are shown here:

class CComObjectRootBase {
public:
    CComObjectRootBase() { m_dwRef = 0L; }
    ...
    ULONG OuterAddRef() {
        return m_pOuterUnknown->AddRef();
    }
    ULONG OuterRelease() {
        return m_pOuterUnknown->Release();
    }
    HRESULT OuterQueryInterface(REFIID iid, void ** ppvObject) {
        return m_pOuterUnknown->QueryInterface(iid, ppvObject);
    }
    ...
    union {
        long      m_dwRef;
        IUnknown* m_pOuterUnknown;
    };
};

Notice that CComObjectRootBase keeps the object’s reference count and a pointer to a controlling unknown as a union. This seems to imply that an object can either maintain its own reference count or be aggregated, but not both at the same time. That implication is false: an aggregated object must maintain both a reference count and a pointer to a controlling unknown. In this case, discussed more later, ATL keeps m_pOuterUnknown in one instance of CComObjectRootBase and derives from CComObjectRootBase a second time to keep the object’s reference count.

More to Come

Although it’s possible to implement the methods of IUnknown directly in your class using the methods of the base class CComObjectRootEx, most ATL classes don’t. Instead, the actual implementations of the IUnknown methods are left to a class that derives from your class, as in CComObject. We discuss this after we talk about the responsibilities of your class.

Your Class

Because ATL provides the behavior for IUnknown in the CComObjectRootEx class and provides the actual implementation in the CComObject (and friends) classes, the job your class performs is pretty simple: derive from interfaces and implement their methods. Besides making sure that the interface map lists all the interfaces you’re implementing, you can pretty much leave implementing IUnknown to ATL and concentrate on your custom functionality. This is, after all, the whole point of ATL in the first place.

ATL’s Implementation Classes

Many standard interfaces have common implementations. ATL provides implementation classes of many standard interfaces. For example, IPersistImpl, IConnectionPointContainerImpl, and IViewObjectExImpl implement IPersist, IConnectionPointContainer, and IViewObjectEx, respectively. Some of these interfaces are common enough that many objects can implement them; for example, persistence, eventing, and enumeration. Some are more special purpose and related only to a particular framework, as with controls, Internet-enabled components, and Microsoft Management Console extensions. Most of the general-purpose interface implementations are discussed in Chapters 7, “Persistence in ATL”; 8, “Collections and Enumerators”; and 9, “Connection Points.” The interface implementations related to the controls framework are discussed in Chapters 10, “Windowing,” and 11, “ActiveX Controls.” One implementation is general purpose enough to discuss right here: IDispatchImpl.

Scripting Support

For a scripting environment to access functionality from a COM object, the COM object must implement IDispatch:

interface IDispatch : IUnknown {
    HRESULT GetTypeInfoCount([out] UINT * pctinfo);

    HRESULT GetTypeInfo([in] UINT iTInfo,
        [in] LCID lcid,
        [out] ITypeInfo ** ppTInfo);

    HRESULT GetIDsOfNames([in] REFIID riid,
        [in, size_is(cNames)] LPOLESTR * rgszNames,
        [in] UINT cNames,
        [in] LCID lcid,
        [out, size_is(cNames)] DISPID * rgDispId);

    HRESULT Invoke([in] DISPID dispIdMember,
        [in] REFIID riid,
        [in] LCID lcid,
        [in] WORD wFlags,
        [in, out] DISPPARAMS * pDispParams,
        [out] VARIANT * pVarResult,
        [out] EXCEPINFO * pExcepInfo,
        [out] UINT * puArgErr);
}

The most important methods of IDispatch are GetIDsOfNames and Invoke. Imagine the following line of scripting code:

penguin.wingspan = 102

This translates into two calls on IDispatch. The first is GetIDsOfNames, which asks the object if it supports the wingspan property. If the answer is yes, the second call to IDispatch is to Invoke. This call includes an identifier (called a DISPID) that uniquely identifies the property or method the client is interested in (as retrieved from GetIDsOfNames), the type of operation to perform (calling a method, or getting or setting a property), a list of arguments, and a place to put the result (if any).

The object’s implementation of Invoke is then required to interpret the request the scripting client made. This typically involves unpacking the list of arguments (which is passed as an array of VARIANT structures), converting them to the appropriate types (if possible), pushing them onto the stack, and calling some other method that deals in real data types, not VARIANTs. In theory, the object’s implementation could take any number of interesting, dynamic steps to parse and interpret the client’s request. In practice, most objects forward the request to a helper, whose job it is to build a stack and call a method on an interface implemented by the object to do the real work.

The helper makes use of type information held in a type library typically bundled with the server. COM type libraries hold just enough information to allow an instance of a TypeInfo object – that is, an object that implements ITypeInfo – to perform this service. The TypeInfo object used to implement IDispatch is usually based on a dual interface, defined in IDL like this:

[ object, dual, uuid(44EBF74E-116D-11D2-9828-00600823CFFB) ]
interface IPenguin : IDispatch {
    [propput] HRESULT Wingspan([in] long nWingspan);
    [propget] HRESULT Wingspan([out, retval] long* pnWingspan);
              HRESULT Fly();
}

Using a TypeInfo object as a helper allows an object to implement IDispatch like this:

class CPenguin :
    public CComObjectRootEx<CComSingleThreadModel>,
    public IBird,
    public ISnappyDresser,
    public IPenguin {
public:
    CPenguin() : m_pTypeInfo(0) {
        const IID*  pIID   = &IID_IPenguin;
        const GUID* pLIBID = &LIBID_BIRDSERVERLib;
        WORD      wMajor = 1;
        WORD      wMinor = 0;
        ITypeLib* ptl = 0;
        HRESULT hr = LoadRegTypeLib(*pLIBID, wMajor, wMinor,
            0, &ptl);
        if( SUCCEEDED(hr) ) {
            hr = ptl->GetTypeInfoOfGuid(*pIID, &m_pTypeInfo);
            ptl->Release();
        }
    }

    virtual ~CPenguin() {
        if( m_pTypeInfo ) m_pTypeInfo->Release();
    }

BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
    COM_INTERFACE_ENTRY(IDispatch)
    COM_INTERFACE_ENTRY(IPenguin)
END_COM_MAP()

    // IDispatch methods
    STDMETHODIMP GetTypeInfoCount(UINT *pctinfo) {
        return (*pctinfo = 1), S_OK;
    }
    STDMETHODIMP GetTypeInfo(UINT ctinfo, LCID lcid,
    ITypeInfo **ppti) {
        if( ctinfo != 0 ) return (*ppti = 0), DISP_E_BADINDEX;
        return (*ppti = m_pTypeInfo)->AddRef(), S_OK;
    }

    STDMETHODIMP GetIDsOfNames(REFIID riid, OLECHAR **rgszNames,
        UINT cNames, LCID lcid, DISPID *rgdispid) {
        return m_pTypeInfo->GetIDsOfNames(rgszNames, cNames,
            rgdispid);
    }

    STDMETHODIMP Invoke(DISPID dispidMember,
                        REFIID riid,
                        LCID lcid,
                        WORD wFlags,
                        DISPPARAMS *pdispparams,
                        VARIANT *pvarResult,
                        EXCEPINFO *pexcepinfo,
                        UINT *puArgErr) {
        return m_pTypeInfo->Invoke(static_cast<IPenguin*>(this),
                                   dispidMember, wFlags,
                                   pdispparams, pvarResult,
                                   pexcepinfo, puArgErr);
    }
    // IBird, ISnappyDresser and IPenguin methods...
private:
    ITypeInfo* m_pTypeInfo;
};

Because this implementation is so boilerplate (it varies only by the dual interface type, the interface identifier, the type library identifier, and the major and minor version numbers), it can be easily implemented in a template base class. ATL’s parameterized implementation of IDispatch is IDispatchImpl:

template <class T,
          const IID* piid = &__uuidof(T),
          const GUID* plibid = &CAtlModule::m_libid,
          WORD wMajor = 1,
          WORD wMinor = 0,
          class tihclass = CComTypeInfoHolder>
class ATL_NO_VTABLE IDispatchImpl : public T {...};

Given IDispatchImpl, our IPenguin implementation gets quite a bit simpler:

class CPenguin :
    public CComObjectRootEx<CComMultiThreadModel>,
    public IBird,
    public ISnappyDresser,
    public IDispatchImpl<IPenguin, &IID_IPenguin> {
public:
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
    COM_INTERFACE_ENTRY(IDispatch)
    COM_INTERFACE_ENTRY(IPenguin)
END_COM_MAP()
    // IBird, ISnappyDresser and IPenguin methods...
};
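
The two-step conversation a scripting client has with IDispatch (look up a name, then invoke by identifier) can be modeled without any COM machinery. The following is a rough sketch, not the real interface: DispId, CallKind, PenguinDispatch, and ScriptSetWingspan are hypothetical names, arguments are plain long values instead of VARIANTs, and names are matched exactly.

```cpp
#include <map>
#include <string>
#include <vector>

using DispId = int;                  // stand-in for DISPID
constexpr DispId kUnknownId = -1;    // stand-in for "name not found"
enum class CallKind { Method, PropertyGet, PropertyPut };

// The strongly typed object that does the real work.
class Penguin {
public:
    void put_Wingspan(long n) { wingspan_ = n; }
    long get_Wingspan() const { return wingspan_; }
    void Fly() { ++flights_; }
    int  flights() const { return flights_; }
private:
    long wingspan_ = 0;
    int  flights_  = 0;
};

// Plays the role of the helper: maps names to ids, then ids to typed calls.
class PenguinDispatch {
public:
    explicit PenguinDispatch(Penguin& target) : target_(target) {}

    // Step 1: does the object support this name?
    DispId GetIdOfName(const std::string& name) const {
        static const std::map<std::string, DispId> ids = {
            {"wingspan", 1}, {"fly", 2}};
        auto it = ids.find(name);
        return it == ids.end() ? kUnknownId : it->second;
    }

    // Step 2: unpack the arguments and call the typed member.
    bool Invoke(DispId id, CallKind kind,
                const std::vector<long>& args, long* result) {
        if (id == 1 && kind == CallKind::PropertyPut && args.size() == 1) {
            target_.put_Wingspan(args[0]);
            return true;
        }
        if (id == 1 && kind == CallKind::PropertyGet && args.empty() && result) {
            *result = target_.get_Wingspan();
            return true;
        }
        if (id == 2 && kind == CallKind::Method && args.empty()) {
            target_.Fly();
            return true;
        }
        return false;  // unknown id, wrong kind, or wrong argument count
    }
private:
    Penguin& target_;
};

// What a script engine would do for: penguin.wingspan = 102
bool ScriptSetWingspan(PenguinDispatch& disp, long value) {
    DispId id = disp.GetIdOfName("wingspan");
    if (id == kUnknownId) return false;
    return disp.Invoke(id, CallKind::PropertyPut, {value}, nullptr);
}
```

The hand-written if chain in Invoke is exactly the tedium that a type-information-driven helper removes; the sketch is only meant to make the protocol visible.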

Supporting Multiple Dual Interfaces

I wish it wouldn’t, but this question always comes up: “How do I support multiple dual interfaces in my COM objects?” My answer is always, “Why would you want to?”

The problem is, of the scripting environments I’m familiar with that require an object to implement IDispatch, not one supports QueryInterface. So although it’s possible to use ATL to implement multiple dual interfaces, you have to choose which implementation to hand out as the “default” – that is, the one the client gets when asking for IDispatch. For example, let’s say that instead of having a special IPenguin interface that represents the full functionality of my object to scripting clients, I decided to make all the interfaces dual interfaces.

[ dual, uuid(...) ] interface IBird : IDispatch {...}
[ dual, uuid(...) ] interface ISnappyDresser : IDispatch {...}

You can implement both of these dual interfaces using ATL’s IDispatchImpl:

class CPenguin :
    public CComObjectRootEx<CComSingleThreadModel>,
    public IDispatchImpl<IBird, &IID_IBird>,
    public IDispatchImpl<ISnappyDresser, &IID_ISnappyDresser> {
public:
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
    COM_INTERFACE_ENTRY(IDispatch) // ambiguous
END_COM_MAP()
...
};

However, when you fill in the interface map in this way, the compiler gets upset. Remember that the COM_INTERFACE_ENTRY macro essentially boils down to a static_cast to the interface in question. Because two different interfaces derive from IDispatch, the compiler cannot resolve the one to which you’re trying to cast. To resolve this difficulty, ATL provides another macro:

#define COM_INTERFACE_ENTRY2(itf, branch)

This macro enables you to tell the compiler which branch to follow up the inheritance hierarchy to the IDispatch base. Using this macro allows you to choose the default IDispatch interface:

class CPenguin :
    public CComObjectRootEx<CComSingleThreadModel>,
    public IDispatchImpl<IBird, &IID_IBird>,
    public IDispatchImpl<ISnappyDresser, &IID_ISnappyDresser> {
public:
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
    COM_INTERFACE_ENTRY2(IDispatch, IBird) // Compiles
                                           // (unfortunately)
END_COM_MAP()
...
};

That brings me to my objection. Just because ATL and the compiler conspire to allow this usage doesn’t mean that it’s a good one. There is no good reason to support multiple dual interfaces on a single implementation. Any client that supports QueryInterface will not need to use GetIDsOfNames or Invoke. These kinds of clients are perfectly happy using a custom interface, as long as it matches their argument type requirements. On the other hand, scripting clients that don’t support QueryInterface can get to methods and properties on only the default dual interface. For example, the following will not work:

// Since IBird is the default, its operations are available
penguin.fly
// Since ISnappyDresser is not the default, its operations
// aren't available
penguin.straightenTie // runtime error
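
The same effect can be reproduced in a few lines of portable C++. This is a sketch under loose assumptions: IDispatchLike and its Knows method are hypothetical stand-ins for IDispatch and GetIDsOfNames, and BirdDispatch and DresserDispatch stand in for the two IDispatchImpl instantiations, each of which knows only the names of its own interface.

```cpp
#include <string>

// Hypothetical stand-in for IDispatch: answers only "do you know this name?"
struct IDispatchLike {
    virtual bool Knows(const std::string& name) = 0;
    virtual ~IDispatchLike() = default;
};
struct IBirdLike    : IDispatchLike { virtual void Fly() = 0; };
struct IDresserLike : IDispatchLike { virtual void StraightenTie() = 0; };

// Each helper plays the role of one dispatch implementation and
// recognizes only the members of its own interface.
struct BirdDispatch : IBirdLike {
    bool Knows(const std::string& n) override { return n == "fly"; }
};
struct DresserDispatch : IDresserLike {
    bool Knows(const std::string& n) override { return n == "straightenTie"; }
};

class Penguin : public BirdDispatch, public DresserDispatch {
public:
    void Fly() override {}
    void StraightenTie() override {}

    // static_cast<IDispatchLike*>(this) would be ambiguous, so a branch
    // must be named, which is the job COM_INTERFACE_ENTRY2 does.
    IDispatchLike* DefaultDispatch() { return static_cast<IBirdLike*>(this); }
};
```

A client holding only the pointer returned by DefaultDispatch gets true for "fly" and false for "straightenTie", even though the object implements both methods.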

So, here’s my advice: Don’t design your reusable, polymorphic COM interfaces as dual interfaces. Instead, if you’re going to support scripting clients, define a single dual interface that exposes the entire functionality of the class, as I did when defining IPenguin in the first place. As an added benefit, this means that you have to define only one interface that supports scripting clients instead of mandating that all of them do.

Having said that, sometimes you don’t have a choice. For example, when building Visual Studio add-ins, you need to implement two interfaces: _IDTExtensibility2 and IDTCommandTarget. Both of these are defined as dual interfaces, so the environment forces you to deal with this problem. [6] You’ll need to look at the documentation and do some experimentation to figure out which of your IDispatch implementations should be the default.

CComObject Et Al

Consider the following C++ class:

class CPenguin :
  public CComObjectRootEx<CComMultiThreadModel>,
  public IBird,
  public ISnappyDresser {
public:
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
END_COM_MAP()
    // IBird and ISnappyDresser methods...
    // IUnknown methods not implemented here
};

Because this class doesn’t implement the methods of IUnknown, the following will fail at compile time:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter,
    REFIID riid, void** ppv) {
    ...
    CPenguin* pobj = new CPenguin; // IUnknown not implemented
    ...
}

Given CComObjectRootBase, you can easily implement the methods of IUnknown:

// Server lifetime management
extern void ServerLock();
extern void ServerUnlock();

class CPenguin :
  public CComObjectRootEx<CComMultiThreadModel>,
  public IBird,
  public ISnappyDresser {
public:
    CPenguin() { ServerLock(); }
    ~CPenguin() { ServerUnlock(); }
BEGIN_COM_MAP(CPenguin)
    COM_INTERFACE_ENTRY(IBird)
    COM_INTERFACE_ENTRY(ISnappyDresser)
END_COM_MAP()
    // IBird and ISnappyDresser methods...
    // IUnknown methods for standalone, heap-based objects
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    { return _InternalQueryInterface(riid, ppv); }

    STDMETHODIMP_(ULONG) AddRef()
    { return InternalAddRef(); }

    STDMETHODIMP_(ULONG) Release() {
        ULONG l = InternalRelease();
        if( l == 0 ) delete this;
        return l;
    }
};

Unfortunately, although this implementation does leverage the base class behavior, it has hard-coded assumptions about the lifetime and identity of our objects. For example, instances of this class can’t be created as an aggregate. Just as we’re able to encapsulate decisions about thread safety into the base class, we would like to encapsulate decisions about lifetime and identity. However, unlike thread-safety decisions, which are made on a per-class basis and are, therefore, safe to encode into a base class, lifetime and identity decisions can be made on a per-instance basis. Therefore, we’ll want to encapsulate lifetime and identity behavior into classes meant to derive from our class.

Standalone Activation

To encapsulate the standalone, heap-based object implementation of IUnknown I just showed you, ATL provides CComObject, shown in a slightly abbreviated form here:

template <class Base>
class CComObject : public Base {
public:
    typedef Base _BaseClass;
    CComObject(void* = NULL)
    { _pAtlModule->Lock(); }    // Keeps server loaded

    // Set refcount to -(LONG_MAX/2) to protect destruction and
    // also catch mismatched Release in debug builds
    ~CComObject() {
        m_dwRef = -(LONG_MAX/2);
        FinalRelease();
#ifdef _ATL_DEBUG_INTERFACES
        _AtlDebugInterfacesModule.DeleteNonAddRefThunk(
            _GetRawUnknown());
#endif
        _pAtlModule->Unlock();   // Allows server to unload
    }
    STDMETHOD_(ULONG, AddRef)() {return InternalAddRef();}
    STDMETHOD_(ULONG, Release)() {
        ULONG l = InternalRelease();
        if (l == 0) delete this;
        return l;
    }

    STDMETHOD(QueryInterface)(REFIID iid, void ** ppvObject)
    {return _InternalQueryInterface(iid, ppvObject);}

    template <class Q>
    HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
    { return QueryInterface(__uuidof(Q), (void**)pp); }

    static HRESULT WINAPI CreateInstance(CComObject<Base>** pp) ;
};

Notice that CComObject takes a template parameter called Base. This is the base class from which CComObject derives to obtain the functionality of CComObjectRootEx, as well as whatever custom functionality we’d like to include in our objects. Given the implementation of CPenguin that did not include the implementation of the IUnknown methods, the compiler would be happy with CComObject used as follows (although I describe later why new shouldn’t be used directly when creating ATL-based COM objects):

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    if( pUnkOuter ) return CLASS_E_NOAGGREGATION;
    // Read on for why not to use new like this!
    CComObject<CPenguin>* pobj = new CComObject<CPenguin>;
    if( pobj ) {
        pobj->AddRef();
        HRESULT hr = pobj->QueryInterface(riid, ppv);
        pobj->Release();
        return hr;
    }
    return E_OUTOFMEMORY;
}
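
The trick of deriving from your class to finish it off is plain C++ and can be shown without ATL. The following is a minimal sketch, assuming hypothetical names throughout (RefCountedRoot, HeapObject, g_moduleLocks, and the simplified IBirdLike); it models only reference counting and the server lock, not QueryInterface.

```cpp
#include <atomic>
#include <type_traits>

// Stand-in for the module lock count that keeps a server loaded.
std::atomic<long> g_moduleLocks{0};

// Plays the role of the root class: it offers counting helpers
// but implements no interface methods itself.
class RefCountedRoot {
public:
    virtual ~RefCountedRoot() = default;
protected:
    long InternalAddRef()  { return ++ref_; }
    long InternalRelease() { return --ref_; }
private:
    std::atomic<long> ref_{0};
};

struct IBirdLike {
    virtual long AddRef() = 0;
    virtual long Release() = 0;
    virtual int  Fly() = 0;
    virtual ~IBirdLike() = default;
};

// "Your class": custom behavior only. AddRef and Release are left pure.
class Penguin : public RefCountedRoot, public IBirdLike {
public:
    int Fly() override { return 1; }
};
static_assert(std::is_abstract<Penguin>::value,
              "Penguin cannot be instantiated on its own");

// The most derived class decides lifetime: heap-based, standalone,
// and holding a module lock for as long as the object exists.
template <class Base>
class HeapObject final : public Base {
public:
    HeapObject()           { ++g_moduleLocks; }
    ~HeapObject() override { --g_moduleLocks; }

    long AddRef() override { return this->InternalAddRef(); }
    long Release() override {
        long l = this->InternalRelease();
        if (l == 0) delete this;
        return l;
    }
};
```

Because the lifetime policy lives in the wrapper rather than in Penguin, a different wrapper could give the very same Penguin a different policy, which is the design point of this section.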

Besides the call to FinalRelease and the static member function CreateInstance (which are both described in the “Creators” section of this chapter), CComObject provides one additional item of note, the QueryInterface member function template [7]:

template <class Q>
HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
{ return QueryInterface(__uuidof(Q), (void**)pp); }

This member function template uses the capability of the VC++ compiler to tag a type with a universally unique identifier (UUID). This capability has been available since VC++ 5.0 and takes the form of a declarative specifier (declspecs):

struct __declspec(uuid("00000000-0000-0000-C000-000000000046")) IUnknown
{...};

These declspec specifiers are output by the Microsoft IDL compiler and are available for both standard and custom interfaces. You can retrieve the UUID of a type using the __uuidof operator, allowing the following syntax:

void TryToFly(IUnknown* punk) {
    IBird* pbird = 0;
    if( SUCCEEDED(punk->QueryInterface(__uuidof(pbird),
        (void**)&pbird)) ) {
        pbird->Fly();
        pbird->Release();
    }
}

Using the QueryInterface member function template provided in CComObject offers a bit more syntactic convenience, given a CComObject-based object reference:

void TryToFly(CComObject<CPenguin>* pPenguin) {
    IBird* pbird = 0;
    if( SUCCEEDED(pPenguin->QueryInterface(&pbird)) ) {
      pbird->Fly();
      pbird->Release();
    }
}
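
For readers without a compiler that supports __uuidof, the idea of keying an identifier off a type can be approximated with a traits template. The sketch below is an assumption-laden model, not ATL: IidOf is a hypothetical stand-in for __uuidof, identifiers are strings, QueryInterface returns bool instead of HRESULT, and no reference counting is performed.

```cpp
#include <cstring>

struct IBirdLike    { virtual int Fly() = 0;           virtual ~IBirdLike() = default; };
struct IDresserLike { virtual int StraightenTie() = 0; virtual ~IDresserLike() = default; };

// Hypothetical stand-in for __uuidof: maps a type to its identifier.
template <class T> struct IidOf;
template <> struct IidOf<IBirdLike> {
    static const char* value() { return "IBird"; }
};
template <> struct IidOf<IDresserLike> {
    static const char* value() { return "ISnappyDresser"; }
};

class Penguin : public IBirdLike, public IDresserLike {
public:
    int Fly() override { return 1; }
    int StraightenTie() override { return 2; }

    // Untyped form: the caller supplies an identifier and a void**.
    bool QueryInterface(const char* iid, void** ppv) {
        if (std::strcmp(iid, IidOf<IBirdLike>::value()) == 0)
            *ppv = static_cast<IBirdLike*>(this);
        else if (std::strcmp(iid, IidOf<IDresserLike>::value()) == 0)
            *ppv = static_cast<IDresserLike*>(this);
        else
            *ppv = nullptr;
        return *ppv != nullptr;
    }

    // Typed form: the identifier is deduced from the pointer type, so
    // the caller cannot pass a mismatched identifier and pointer.
    template <class Q>
    bool QueryInterface(Q** pp) {
        void* pv = nullptr;
        bool ok = QueryInterface(IidOf<Q>::value(), &pv);
        *pp = static_cast<Q*>(pv);
        return ok;
    }
};
```

Besides saving a cast, the typed overload removes a classic bug: asking for one interface while storing the result in a pointer to another.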

Aggregated Activation

Notice that the CPenguin class object implementation shown previously disallowed aggregation by checking for a nonzero pUnkOuter and returning CLASS_E_NOAGGREGATION. If we want to support aggregation as well as – or instead of – standalone activation, we need another class to implement the forwarding behavior of aggregated instances. For this, ATL provides CComAggObject.

CComAggObject performs the chief service of being a controlled inner – that is, providing two implementations of IUnknown. One implementation forwards calls to the controlling outer, subsumed by its lifetime and identity. The other implementation is for private use of the controlling outer for actually maintaining the lifetime of and querying interfaces from the inner. To obtain the two implementations of IUnknown, CComAggObject derives from CComObjectRootEx twice, once directly and once indirectly via a contained instance of your class derived from CComContainedObject, as shown here:

template <class contained>
class CComAggObject :
    public IUnknown,
    public CComObjectRootEx<
        contained::_ThreadModel::ThreadModelNoCS> {
public:
    typedef contained _BaseClass;
    CComAggObject(void* pv) : m_contained(pv)
    { _pAtlModule->Lock(); }

    ~CComAggObject() {
        m_dwRef = -(LONG_MAX/2);
        FinalRelease();
        _pAtlModule->Unlock();
    }

    STDMETHOD(QueryInterface)(REFIID iid, void ** ppvObject) {
        ATLASSERT(ppvObject != NULL);
        if (ppvObject == NULL)
            return E_POINTER;
        *ppvObject = NULL;

        HRESULT hRes = S_OK;
        if (InlineIsEqualUnknown(iid)) {
            *ppvObject = (void*)(IUnknown*)this;
            AddRef();
        }
        else
            hRes = m_contained._InternalQueryInterface(iid,
                ppvObject);
        return hRes;
    }

    STDMETHOD_(ULONG, AddRef)()
    { return InternalAddRef(); }

    STDMETHOD_(ULONG, Release)() {
        ULONG l = InternalRelease();
        if (l == 0) delete this;
        return l;
    }

    template <class Q>
    HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
    { return QueryInterface(__uuidof(Q), (void**)pp); }
    static HRESULT WINAPI CreateInstance(LPUNKNOWN pUnkOuter,
        CComAggObject<contained>** pp);

    CComContainedObject<contained> m_contained;
};

You can see that instead of deriving from your class (passed as the template argument), CComAggObject derives directly from CComObjectRootEx. Its implementation of QueryInterface relies on the interface map you’ve built in your class, but its implementations of AddRef and Release rely on the second instance of CComObjectRootBase it gets by deriving from CComObjectRootEx. This second instance of CComObjectRootBase uses the m_dwRef member of the union.

The first instance of CComObjectRootBase, the one that manages the m_pOuterUnknown member of the union, is the one CComAggObject gets by creating an instance of your class derived from CComContainedObject as the m_contained data member. CComContainedObject implements QueryInterface, AddRef, and Release by delegating to the m_pOuterUnknown passed to the constructor:

template <class Base>
class CComContainedObject : public Base {
public:
    typedef Base _BaseClass;
    CComContainedObject(void* pv) {
        m_pOuterUnknown = (IUnknown*)pv;
    }

    STDMETHOD(QueryInterface)(REFIID iid, void ** ppvObject) {
      return OuterQueryInterface(iid, ppvObject);
    }

    STDMETHOD_(ULONG, AddRef)()
    { return OuterAddRef(); }

    STDMETHOD_(ULONG, Release)()
    { return OuterRelease(); }

    template <class Q>
    HRESULT STDMETHODCALLTYPE QueryInterface(Q** pp)
    { return QueryInterface(__uuidof(Q), (void**)pp); }

    IUnknown* GetControllingUnknown()
    { return m_pOuterUnknown; }
};
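
How the two implementations of IUnknown cooperate is easier to follow in a stripped-down model. The following sketch is my own simplification, not ATL's code: IUnknownLike, Contained, AggObject, and Outer are hypothetical names, interface identifiers are small integers, and Query returns the interface pointer (or nullptr) after adding a reference.

```cpp
enum { kIidUnknown = 0, kIidBird = 1 };

struct IUnknownLike {
    virtual void* Query(int iid) = 0;  // adds a reference on success
    virtual long  AddRef() = 0;
    virtual long  Release() = 0;
    virtual ~IUnknownLike() = default;
};
struct IBirdLike : IUnknownLike { virtual int Fly() = 0; };

// "Your class": the bird behavior plus an internal lookup (the map).
class PenguinImpl : public IBirdLike {
public:
    int Fly() override { return 7; }
    void* InternalQuery(int iid) {
        return iid == kIidBird ? static_cast<IBirdLike*>(this) : nullptr;
    }
};

// Delegating unknown: every call is forwarded to the controlling outer.
class Contained : public PenguinImpl {
public:
    explicit Contained(IUnknownLike* outer) : outer_(outer) {}
    void* Query(int iid) override { return outer_->Query(iid); }
    long  AddRef() override       { return outer_->AddRef(); }
    long  Release() override      { return outer_->Release(); }
private:
    IUnknownLike* outer_;
};

// Non-delegating unknown: private to the outer, owns the real count.
class AggObject : public IUnknownLike {
public:
    explicit AggObject(IUnknownLike* outer) : contained_(outer) {}
    void* Query(int iid) override {
        if (iid == kIidUnknown) { AddRef(); return static_cast<IUnknownLike*>(this); }
        void* pv = contained_.InternalQuery(iid);
        if (pv) contained_.AddRef();   // counted on the outer
        return pv;
    }
    long AddRef() override  { return ++ref_; }
    long Release() override { long l = --ref_; if (l == 0) delete this; return l; }
private:
    long ref_ = 0;
    Contained contained_;
};

// A controlling outer that exposes the inner's IBird as its own.
class Outer : public IUnknownLike {
public:
    Outer() : inner_(new AggObject(this)) { inner_->AddRef(); }
    ~Outer() override { inner_->Release(); }
    void* Query(int iid) override {
        if (iid == kIidUnknown) { AddRef(); return static_cast<IUnknownLike*>(this); }
        return inner_->Query(iid);
    }
    long AddRef() override  { return ++ref_; }
    long Release() override { long l = --ref_; if (l == 0) delete this; return l; }
    long refs() const { return ref_; }
private:
    long ref_ = 0;
    IUnknownLike* inner_;
};
```

Notice that a client holding the bird interface never sees the inner's own count: asking the bird for the unknown returns the outer, and every AddRef or Release on the bird lands on the outer.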

Being the Controlled Inner

Using CComAggObject and its two implementations of IUnknown, our CPenguin class object implementation can support either standalone or aggregated activation without touching the CPenguin source:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    if( pUnkOuter ) {
        CComAggObject<CPenguin>* pobj =
                       new CComAggObject<CPenguin>(pUnkOuter);
        ...
    }
    else {
        CComObject<CPenguin>* pobj = new CComObject<CPenguin>;
        ...
    }
}

This usage provides the most efficient runtime decision making. If the object is standalone, it pays the price of one reference count and one implementation of IUnknown. If it is aggregated, it pays the price of one reference count, one pointer to the controlling outer, and two implementations of IUnknown. However, one additional price we’re paying is one extra set of vtbls. By using both CComAggObject<CPenguin> and CComObject<CPenguin>, we’ve created two classes and, therefore, two sets of vtbls. If you’ve got a small number of instances or nearly all your instances are aggregated, you might want a single class that can handle both aggregated and standalone activation, thereby eliminating one set of vtbls. You do this by using CComPolyObject in place of both CComObject and CComAggObject:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    CComPolyObject<CPenguin>* pobj =
        new CComPolyObject<CPenguin>(pUnkOuter);
    ...
}

CComPolyObject is nearly identical to CComAggObject, except that, in its constructor, if the pUnkOuter is zero, it uses its second implementation of IUnknown as the outer for the first to forward to, as shown:

template <class contained>
class CComPolyObject :
    public IUnknown,
    public CComObjectRootEx<
        contained::_ThreadModel::ThreadModelNoCS> {
public:
    ...
    CComPolyObject(void* pv) : m_contained(pv ? pv : this) {...}
    ...
};

The use of CComPolyObject saves a set of vtbls, so the module size is smaller, but the price you pay for standalone objects is getting an extra implementation of IUnknown as well as an extra pointer to that implementation.

Alternative Activation Techniques

Besides standalone operation, CComObject makes certain assumptions about where the object’s memory has been allocated from (the heap) and whether the existence of the object should keep the server loaded (it does). For other needs, ATL provides five more classes meant to be the most derived class in your implementation hierarchy: CComObjectCached, CComObjectNoLock, CComObjectGlobal, CComObjectStack, and CComObjectStackEx.

CComObjectCached

CComObjectCached objects implement reference counting, assuming that you’re going to create an instance and then hold it for the life of the server, handing out references to it as requested. To avoid keeping the server running forever after the cached instance is created, the boundary for keeping the server running is a reference count of one, although the lifetime of the object is still managed on a boundary of zero:

template <class Base>
class CComObjectCached : public Base {
public:
    ...
    STDMETHOD_(ULONG, AddRef)() {
        ULONG l = InternalAddRef();
        if (l == 2)
            _pAtlModule->Lock();
        return l;
    }
    STDMETHOD_(ULONG, Release)() {
        ULONG l = InternalRelease();
        if (l == 0)
            delete this;
        else if (l == 1)
            _pAtlModule->Unlock();
        return l;
    }
    ...
};

Cached objects are useful for in-process class objects:

static CComObjectCached<CPenguinCO>* g_pPenguinCO = 0;

BOOL WINAPI DllMain(HINSTANCE, DWORD dwReason, void*) {
  switch( dwReason ) {
  case DLL_PROCESS_ATTACH:
      g_pPenguinCO = new CComObjectCached<CPenguinCO>();

      // 1st ref. **doesn't** keep server alive
      if( g_pPenguinCO ) g_pPenguinCO->AddRef();
  break;

  case DLL_PROCESS_DETACH:
      if( g_pPenguinCO ) g_pPenguinCO->Release();
  break;
  }
  return TRUE;
}

STDAPI DllGetClassObject(REFCLSID clsid, REFIID riid,
    void** ppv) {
    // Subsequent references do keep server alive
    if( clsid == CLSID_Penguin && g_pPenguinCO )
      return g_pPenguinCO->QueryInterface(riid, ppv);
    return CLASS_E_CLASSNOTAVAILABLE;
}
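
The counting rule behind cached objects is small enough to model on its own. This is a sketch with hypothetical names (CachedObject, g_moduleLocks), single-threaded for brevity, showing only how the lock boundary sits at a count of one while the lifetime boundary stays at zero.

```cpp
// Stand-in for the module lock count that keeps a server loaded.
long g_moduleLocks = 0;

class CachedObject {
public:
    long AddRef() {
        long l = ++ref_;
        if (l == 2) ++g_moduleLocks;    // first reference beyond the cache's own
        return l;
    }
    long Release() {
        long l = --ref_;
        if (l == 0) delete this;        // lifetime still ends at zero
        else if (l == 1) --g_moduleLocks;  // only the cache's reference remains
        return l;
    }
private:
    ~CachedObject() = default;          // heap-only: destroyed via Release
    long ref_ = 0;
};
```

With this rule, the reference held by the server's own cache never locks the module, any number of client references lock it exactly once, and the lock is dropped when the last client lets go.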

CComObjectNoLock

Sometimes you don’t want outstanding references on your object to keep the server alive. For example, class objects in an out-of-process server are cached in a table maintained by ole32.dll some number of times (it might not be one). For this reason, COM itself manages how the lifetime of a class object affects the lifetime of its out-of-process server using the LockServer method of the IClassFactory interface. For this use, ATL provides CComObjectNoLock, whose implementation does not affect the lifetime of the server:

template <class Base>
class CComObjectNoLock : public Base {
public:
    ...
    STDMETHOD_(ULONG, AddRef)()
    { return InternalAddRef(); }

    STDMETHOD_(ULONG, Release)() {
        ULONG l = InternalRelease();
        if (l == 0) delete this;
        return l;
    }
  ...
};

No-lock objects are useful for out-of-process class objects:

int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int) {
    CoInitialize(0);

    CComObjectNoLock<CPenguinCO>* pPenguinCO =
        new CComObjectNoLock<CPenguinCO>();
    if( !pPenguinCO ) return E_OUTOFMEMORY;
    pPenguinCO->AddRef();

    DWORD   dwReg;
    HRESULT hr;

    // Reference(s) cached by ole32.dll won't keep server
    // from shutting down
    hr = CoRegisterClassObject(CLSID_Penguin, pPenguinCO, ...,
        &dwReg);
    if( SUCCEEDED(hr) ) {
        MSG msg; while( GetMessage(&msg, 0, 0, 0) ) DispatchMessage(&msg);
        CoRevokeClassObject(dwReg);
        pPenguinCO->Release();
    }

    CoUninitialize();
    return hr;
}

CComObjectGlobal

Just as it’s handy to have an object whose existence or outstanding references don’t keep the server alive, sometimes it’s handy to have an object whose lifetime matches that of the server. For example, a global or static object is constructed once when the server is loaded and is not destroyed until after WinMain or DllMain has completed. Clearly, the mere existence of a global object cannot keep the server running, or the server could never be shut down. On the other hand, we’d like to be able to keep the server running if there are outstanding references to a global object. For this, we have CComObjectGlobal:

template <class Base>
class CComObjectGlobal : public Base {
public:
  ...
  STDMETHOD_(ULONG, AddRef )() { return _pAtlModule->Lock(); }
  STDMETHOD_(ULONG, Release)() { return _pAtlModule->Unlock(); }
  ...
};

Global objects can be used instead of cached objects for implementing in-process class objects, but they’re useful for any global or static object:

// No references yet, so server not forced to stay alive
static CComObjectGlobal<CPenguinCO> g_penguinCO;

STDAPI DllGetClassObject(REFCLSID clsid, REFIID riid,
    void** ppv) {
    // All references keep the server alive
    if( clsid == CLSID_Penguin )
        return g_penguinCO.QueryInterface(riid, ppv);
    return CLASS_E_CLASSNOTAVAILABLE;
}
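
The idea behind CComObjectGlobal can be sketched without any ATL or Windows headers. In the following minimal sketch, Module, TheModule, PenguinClassObject, and GlobalObject are hypothetical stand-ins (not ATL names), and the lock count is a plain long rather than an interlocked counter, so it assumes single-threaded use:

```cpp
#include <cassert>

// Hypothetical stand-in for the ATL module's lock count (_pAtlModule).
struct Module {
    long lockCount = 0;
    long Lock()   { return ++lockCount; }
    long Unlock() { return --lockCount; }
    bool CanUnload() const { return lockCount == 0; }
};

inline Module& TheModule() {
    static Module m;
    return m;
}

// Stand-in for a class object; real code would derive from IClassFactory.
struct PenguinClassObject {
    virtual ~PenguinClassObject() = default;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

// Mirrors the CComObjectGlobal idea: the object keeps no count of its own.
// Every reference is forwarded to the module, so references (but not mere
// existence) keep the server alive.
template <class Base>
struct GlobalObject : Base {
    unsigned long AddRef() override {
        return static_cast<unsigned long>(TheModule().Lock());
    }
    unsigned long Release() override {
        return static_cast<unsigned long>(TheModule().Unlock());
    }
};

// Constructing the global does not touch the module's lock count.
static GlobalObject<PenguinClassObject> g_penguinCO;
```

Before anyone calls AddRef, TheModule().CanUnload() reports true even though g_penguinCO exists. It turns false as soon as one reference is outstanding and true again after the matching Release.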

CComObjectStack and CComObjectStackEx

Instead of using a global or static object, you might find yourself with the urge to allocate a COM object on the stack. ATL supports this technique with CComObjectStack:

template <class Base>
class CComObjectStack : public Base {
public:
    ...
    STDMETHOD_(ULONG, AddRef)()
    { ATLASSERT(FALSE); return 0; }

    STDMETHOD_(ULONG, Release)()
    { ATLASSERT(FALSE); return 0; }

    STDMETHOD(QueryInterface)(REFIID iid, void** ppvObject)
    { ATLASSERT(FALSE); return E_NOINTERFACE; }
    ...
};

Based on the implementation, it should be clear that you’re no longer doing COM. CComObjectStack shuts up the compiler, but you still cannot use any methods of IUnknown, which means that you cannot pass out an interface reference from an object on the stack. This is good because, as with a reference to anything on the stack, as soon as the stack goes away, the reference points at garbage. The nice thing about ATL’s implementation of CComObjectStack is that it warns you at runtime that you’re doing something bad:

void DoABadThing(IBird** ppbird) {
    CComObjectStack<CPenguin> penguin;
    penguin.Fly();           // Using IBird method is OK
    penguin.StraightenTie(); // Using ISnappyDresser method
                             // also OK

    // This will trigger an assert at runtime
    penguin.QueryInterface(IID_IBird, (void**)ppbird);
}

CComObjectStackEx addresses the limitations of CComObjectStack by providing a more useful implementation of IUnknown:

template <class Base>
class CComObjectStackEx : public Base {
public:
    typedef Base _BaseClass;

    CComObjectStackEx(void* = NULL) {
#ifdef _DEBUG
        m_dwRef = 0;
#endif
        m_hResFinalConstruct = _AtlInitialConstruct();
        if (SUCCEEDED(m_hResFinalConstruct))
            m_hResFinalConstruct = FinalConstruct();
    }

    virtual ~CComObjectStackEx() {
        // This assert indicates mismatched ref counts.
        //
        // The ref count has no control over the
        // lifetime of this object, so you must ensure
        // by some other means that the object remains
        // alive while clients have references to its interfaces.
        ATLASSUME(m_dwRef == 0);
        FinalRelease();
#ifdef _ATL_DEBUG_INTERFACES
        _AtlDebugInterfacesModule.DeleteNonAddRefThunk(
            _GetRawUnknown());
#endif
    }

    STDMETHOD_(ULONG, AddRef)() {
#ifdef _DEBUG
        return InternalAddRef();
#else
        return 0;
#endif
    }

    STDMETHOD_(ULONG, Release)() {
#ifdef _DEBUG
        return InternalRelease();
#else
        return 0;
#endif
    }

    STDMETHOD(QueryInterface)(REFIID iid, void ** ppvObject) {
        return _InternalQueryInterface(iid, ppvObject);
    }

    HRESULT m_hResFinalConstruct;
};

As you can see, CComObjectStackEx permits the use of the IUnknown methods, as long as they are called within the scope of the CComObjectStackEx instance. This allows methods called from within the instance scope to treat the object as if it were a typical heap-based COM object, as in the following:

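void DoBirdTricks(IBird* pBird);  // defined below

void PlayWithBird() {
    CComObjectStackEx<CPenguin> penguin;
    IBird* pBird = NULL;
    penguin.QueryInterface(IID_IBird,
        (void**)&pBird);          // OK -> no assert
    DoBirdTricks(pBird);
    pBird->Release();             // balance QueryInterface so the
                                  // destructor's ref-count check passes
}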

void DoBirdTricks(IBird* pBird) {
    pBird->Fly();                 // IBird methods OK
    ISnappyDresser* pPenguin = NULL;
    pBird->QueryInterface(IID_ISnappyDresser,
        (void**)&pPenguin);       // OK
    pPenguin->StraightenTie();    // ISnappyDresser methods OK
    pPenguin->Release();          // OK -> no assert
}

One from Column A, Two from Column B…

Table 4.2 shows the various identity and lifetime options ATL provides.

Table 4.2. ATL’s Identity and Lifetime Options

Class	Standalone or Aggregated	Heap or Stack	Existence Keeps Server Alive	Extent Refs Keep Server Alive	Useful IUnknown Methods
`CComObject`	Standalone	Heap	Yes	Yes	Yes
`CComAggObject`	Aggregated	Heap	Yes	Yes	Yes
`CComPolyObject`	Standalone or aggregated	Heap	Yes	Yes	Yes
`CComObjectCached`	Standalone	Heap	No	Second Reference	Yes
`CComObjectNoLock`	Standalone	Heap	No	No	Yes
`CComObjectGlobal`	Standalone	Data seg.	No	Yes	Yes
`CComObjectStack`	Standalone	Stack	No	No	No
`CComObjectStackEx`	Standalone	Stack	No	No	Yes

ATL Creators

Multiphase Construction

As I’ve mentioned, ATL servers might not necessarily link with the CRT. However, living without the CRT can be a pain. Among other things, if you don’t have the CRT, you also don’t get C++ exceptions. That doesn’t leave you much to do in the following scenario:

// CPenguin constructor
CPenguin::CPenguin() {
  HRESULT hr = CoCreateInstance(CLSID_EarthAtmosphere, 0,
    CLSCTX_ALL, IID_IAir, (void**)&m_pAir);
  if( FAILED(hr) ) {
    // Can't return an error from a ctor
    return hr;
    // Can't throw an error without the CRT
    throw hr;
    // This won't help
    OutputDebugString(__T("Help! Can't bre...\n"));
  }
}

The OutputDebugString isn’t going to notify the client that the object it just created doesn’t have the resources it needs to survive; there’s no way to return the failure result back to the client. This hardly seems fair because the IClassFactory method CreateInstance that’s creating our objects certainly can return an HRESULT. The problem is having a way to hand a failure from the instance to the class object so that it can be returned to the client. By convention, ATL classes provide a public member function called FinalConstruct for objects to participate in multiphase construction:

HRESULT FinalConstruct();

An empty implementation of the FinalConstruct member function is provided in CComObjectRootBase, so all ATL objects have one. Because FinalConstruct returns an HRESULT, now you have a clean way to obtain the result of any nontrivial construction:

HRESULT CPenguin::FinalConstruct() {
    return CoCreateInstance(CLSID_EarthAtmosphere, 0, CLSCTX_ALL,
                            IID_IAir, (void**)&m_pAir);
}

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    if( !pUnkOuter ) {
        CComObject<CPenguin>* pobj = new CComObject<CPenguin>;
        if( !pobj ) return E_OUTOFMEMORY;
        HRESULT hr = pobj->FinalConstruct();
        if( SUCCEEDED(hr) ) ...
        return hr;
    }
    ...
}

You do have something else to consider, though. Notice that when CreateInstance calls FinalConstruct, it has not yet increased the reference count of the object. This causes a problem if, during the FinalConstruct implementation, the object hands a reference to itself to another object. If you think this is uncommon, remember the pUnkOuter parameter to the IClassFactory method CreateInstance. However, even without aggregation, it’s possible to run into this problem. Imagine the following somewhat contrived but perfectly legal code:

// CPenguin implementation
HRESULT CPenguin::FinalConstruct() {
    HRESULT hr;
    hr = CoCreateInstance(CLSID_EarthAtmosphere, 0, CLSCTX_ALL,
                          IID_IAir, (void**)&m_pAir);
    if( SUCCEEDED(hr) ) {
        // Pass reference to object with reference count of 0
        hr = m_pAir->CheckSuitability(GetUnknown());
    }
    return hr;
}

// CEarthAtmosphere implementation in separate server
STDMETHODIMP CEarthAtmosphere::CheckSuitability(IUnknown* punk) {
    IBreatheO2* pbo2 = 0;
    HRESULT hr = E_FAIL;

    // CPenguin's lifetime increased to 1 via QI
    hr = punk->QueryInterface(IID_IBreatheO2, (void**)&pbo2);
    if( SUCCEEDED(hr) ) {
        pbo2->Release(); // During this call, lifetime decreases
                         // to 0 and destruction sequence begins...
    }

    return (SUCCEEDED(hr) ? S_OK : E_FAIL);
}

To avoid the problem of premature destruction, you need to artificially increase the object’s reference count before FinalConstruct is called and then decrease its reference count afterward:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    if( !pUnkOuter ) {
        CComObject<CPenguin>* pobj = new CComObject<CPenguin>;
        if( !pobj ) return E_OUTOFMEMORY;

        // Protect object from premature destruction
        pobj->InternalAddRef();
        HRESULT hr = pobj->FinalConstruct();
        pobj->InternalRelease();

        if( SUCCEEDED(hr) ) ...
        return hr;
    }
    ...
}
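
To see the effect of the bracketing in isolation, here is a minimal portable sketch. Penguin, Inspect, CreateGuarded, and UnguardedCreationSelfDestructs are hypothetical names invented for the illustration. InternalAddRef and InternalRelease are modeled as plain increments and decrements that never delete, which is an assumption of the sketch, and the counter is not thread-safe:

```cpp
#include <cassert>

static int g_destroyed = 0;   // counts destructor runs

class Penguin {
public:
    ~Penguin() { ++g_destroyed; }

    // Client-visible counting: the last Release deletes the object.
    unsigned long AddRef() { return ++m_ref; }
    unsigned long Release() {
        unsigned long r = --m_ref;
        if (r == 0) delete this;
        return r;
    }

    // Bookkeeping-only counting, modeled on the bracketing calls:
    // these adjust the count but never delete.
    void InternalAddRef()  { ++m_ref; }
    void InternalRelease() { --m_ref; }

    // Hands a reference to itself to someone else, who takes and then
    // drops it. Nothing touches the object after Inspect returns.
    bool FinalConstruct() {
        Inspect(this);
        return true;
    }

private:
    static void Inspect(Penguin* p) { p->AddRef(); p->Release(); }
    unsigned long m_ref = 0;
};

// Unprotected creation: the count goes 0 -> 1 -> 0 inside FinalConstruct,
// so the object deletes itself before the creator is done with it.
bool UnguardedCreationSelfDestructs() {
    int before = g_destroyed;
    Penguin* p = new Penguin;
    p->FinalConstruct();          // p dangles after this call
    return g_destroyed == before + 1;
}

// Protected creation: the artificial reference keeps the count above zero.
Penguin* CreateGuarded() {
    Penguin* p = new Penguin;
    p->InternalAddRef();          // count: 1
    bool ok = p->FinalConstruct();// count: 1 -> 2 -> 1
    p->InternalRelease();         // count: 0, object still alive
    if (!ok) { delete p; return nullptr; }
    p->AddRef();                  // the caller's reference
    return p;
}
```

UnguardedCreationSelfDestructs returns true because the destructor runs during FinalConstruct. CreateGuarded hands back a live object that is destroyed only when the caller releases it.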

Just Enough Reference Count Safety

Arguably, not all objects need their reference count artificially managed in the way just described. In fact, for multithreaded objects that don’t require this kind of protection, extra calls to InterlockedIncrement and InterlockedDecrement represent unnecessary overhead. Toward that end, CComObjectRootBase provides a pair of functions just for bracketing the call to FinalConstruct in a “just reference count safe enough” way:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    *ppv = 0;
    if( !pUnkOuter ) {
        CComObject<CPenguin>* pobj = new CComObject<CPenguin>;
        if( !pobj ) return E_OUTOFMEMORY;

        // Protect object from premature destruction (maybe)
        pobj->InternalFinalConstructAddRef();
        HRESULT hr = pobj->FinalConstruct();
        pobj->InternalFinalConstructRelease();

        if( SUCCEEDED(hr) ) ...
        return hr;
    }
    ...
}

By default, InternalFinalConstructAddRef and InternalFinalConstructRelease incur no release build runtime overhead:

class CComObjectRootBase {
public:
  ...
  void InternalFinalConstructAddRef() {}
  void InternalFinalConstructRelease() {
    ATLASSERT(m_dwRef == 0);
  }
  ...
};

To change the implementation of InternalFinalConstructAddRef and InternalFinalConstructRelease to provide reference count safety, ATL provides the following macro:

#define DECLARE_PROTECT_FINAL_CONSTRUCT() \
  void InternalFinalConstructAddRef() { InternalAddRef(); } \
  void InternalFinalConstructRelease() { InternalRelease(); }

The DECLARE_PROTECT_FINAL_CONSTRUCT macro is used on a per-class basis to turn on reference count safety as required. Our CPenguin would use it like this:

class CPenguin : ... {
public:
  HRESULT FinalConstruct();
  DECLARE_PROTECT_FINAL_CONSTRUCT()
  ...
};

In my opinion, DECLARE_PROTECT_FINAL_CONSTRUCT is one ATL optimization too many. Using it requires not only a great deal of knowledge of COM and ATL internals, but also a great deal of knowledge of how to implement the objects you create in FinalConstruct methods. Because you often don’t have that knowledge, the only safe thing to do is to always use DECLARE_PROTECT_FINAL_CONSTRUCT if you’re handing out references to your instances in your FinalConstruct calls. And because that rule is too complicated, most folks will probably forget it. So here’s a simpler one:

Every class that implements the FinalConstruct member function should also have a DECLARE_PROTECT_FINAL_CONSTRUCT macro instantiation.

Luckily, the wizard generates DECLARE_PROTECT_FINAL_CONSTRUCT when it generates a new class, so your FinalConstruct code will be safe by default. If you decide you don’t want it, you can remove it. [8]

Another Reason for Multiphase Construction

Imagine a plain-vanilla C++ class that wants to call a virtual member function during its construction, and another C++ class that overrides that function:

class Base {
public:
    Base() { Init(); }
    virtual void Init() {}
};

class Derived : public Base {
public:
    virtual void Init() {}
};

Because it’s fairly uncommon to call virtual member functions as part of the construction sequence, it’s not widely known that the Init function called during the constructor for Base will not be Derived::Init, but Base::Init. This might seem counterintuitive, but the reason it works this way is a good one: It doesn’t make sense to call a virtual member function in a derived class until the derived class has been properly constructed. However, the derived class isn’t properly constructed until after the base class has been constructed. To make sure that only functions of properly constructed classes are called during construction, the C++ compiler lays out two vtbls, one for Base and one for Derived. The C++ runtime then adjusts the vptr to point to the appropriate vtbl during the construction sequence.
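
The dispatch rule is easy to observe in standard C++. The following small sketch adds a data member, initializedBy (a name made up for the illustration), to record which override actually ran:

```cpp
#include <string>

struct Base {
    std::string initializedBy;

    // During Base's constructor the object is still a Base, so this
    // call resolves to Base::Init even when constructing a Derived.
    Base() { Init(); }
    virtual ~Base() = default;

    virtual void Init() { initializedBy = "Base"; }
};

struct Derived : Base {
    void Init() override { initializedBy = "Derived"; }
};
```

Right after a Derived is constructed, its initializedBy member holds "Base". Calling Init again on the finished object dispatches to Derived::Init, which is the behavior a second construction phase relies on.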

Although this is all part of the official C++ standard, it’s not exactly intuitive, especially because it is so rarely used (or maybe it’s so rarely used because it’s unintuitive). Because it’s rarely used, beginning with Visual C++ 5.0, Microsoft introduced __declspec(novtable) to turn off the adjustment of vptrs during construction. If the base class is an abstract base class, this often results in vtbls that are generated by the compiler but not used, so the linker can remove them from the final image.

This optimization is used in ATL whenever a class is declared using the ATL_NO_VTABLE macro:

#ifdef _ATL_DISABLE_NO_VTABLE
#define ATL_NO_VTABLE
#else
#define ATL_NO_VTABLE __declspec(novtable)
#endif

Unless _ATL_DISABLE_NO_VTABLE is defined, a class defined using ATL_NO_VTABLE has its constructor behavior adjusted with __declspec(novtable):

class ATL_NO_VTABLE CPenguin ... {};

This is a good and true optimization, but classes that use it must not call virtual member functions in their constructors. [9] If virtual member functions need to be called during construction, leave them until the call to FinalConstruct, which is called after the most derived class’s constructor and after the vptrs are adjusted to the correct values.

One last thing should be mentioned about __declspec(novtable). Just as it turns off the adjustment of vptrs during construction, it turns off the adjustment of vptrs during destruction. Therefore, avoid calling virtual functions in the destructor as well; instead, call them in the object’s FinalRelease member function.

FinalRelease

ATL calls the object’s FinalRelease function after the object’s final interface reference is released and before your ATL-based object’s destructor is called:

void FinalRelease();

The FinalRelease member function is useful for calling virtual member functions and releasing interfaces to other objects that also have pointers back to you. Because those other objects might want to query for an interface during their shutdown sequence, it’s just as important to protect the object against double destruction as it was to protect it against premature destruction in FinalConstruct. Even though the FinalRelease member function is called when the object’s reference count has been decreased to zero (which is why the object is being destroyed), the caller of FinalRelease artificially sets the reference count to -(LONG_MAX/2) to avoid double deletion. The caller of FinalRelease is the destructor of the most derived class:

CComObject::~CComObject() {
    m_dwRef = -(LONG_MAX/2);
    FinalRelease();
    _pAtlModule->Unlock();
}
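
The sentinel trick can be sketched in portable C++. Here Penguin, Partner, Shutdown, and RunShutdownDemo are hypothetical names for the illustration. The sketch folds the most derived class and the user class into one, and it uses a plain long rather than an interlocked counter:

```cpp
#include <cassert>
#include <climits>

static int g_destructorRuns = 0;

class Penguin;

// A partner that keeps an uncounted back pointer to its owner and
// touches it while shutting down.
struct Partner {
    Penguin* owner = nullptr;
    void Shutdown();
};

class Penguin {
public:
    explicit Penguin(Partner* partner) : m_partner(partner) {
        partner->owner = this;
    }

    ~Penguin() {
        // The count is zero here. Park it far from zero so that any
        // AddRef/Release pair during shutdown cannot trigger a
        // second "delete this".
        m_ref = -(LONG_MAX / 2);
        FinalRelease();
        ++g_destructorRuns;
    }

    long AddRef() { return ++m_ref; }
    long Release() {
        long r = --m_ref;
        if (r == 0) delete this;
        return r;
    }

    void FinalRelease() { m_partner->Shutdown(); }

private:
    long m_ref = 0;
    Partner* m_partner;
};

inline void Partner::Shutdown() {
    owner->AddRef();
    owner->Release();   // count returns to the sentinel, not to zero
    owner = nullptr;
}

// Creates a penguin, drops its only reference, and reports how many
// times the destructor ran.
int RunShutdownDemo() {
    g_destructorRuns = 0;
    Partner partner;
    Penguin* p = new Penguin(&partner);
    p->AddRef();
    p->Release();       // final release: destruction begins
    return g_destructorRuns;
}
```

RunShutdownDemo reports a single destructor run. If the sentinel assignment were removed, the Release inside Shutdown would bring the count back to zero and delete the object a second time while it was already being destroyed.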

Under the Hood

Just as two-phase construction applies to code you need to call to set up your objects, the ATL framework itself often needs to do operations at construction time that might fail. For example, creation of a lock object could fail for some reason. To handle this, ATL and CComObjectRootBase define a couple other entry points:

class CComObjectRootBase {
public:
    ...
    // For library initialization only
    HRESULT _AtlFinalConstruct() {
        return S_OK;
    }
    ...
    void _AtlFinalRelease() {}      // temp
};

These methods exist so that ATL has a place to put framework-initialization functions that aren’t affected by your work in FinalConstruct. In addition to these methods, CComObjectRootEx defines this setup method:

template <class ThreadModel>
class CComObjectRootEx : public CComObjectRootBase {
public:
    ...
    HRESULT _AtlInitialConstruct() {
        return m_critsec.Init();
    }
};

CComAggObject, CComPolyObject, etc. all define their own implementation of _AtlInitialConstruct. At this time, nothing in the framework overrides _AtlFinalConstruct or _AtlFinalRelease. However, _AtlInitialConstruct is used; when you’re creating objects, make sure that it gets called or your objects won’t get initialized properly.

Creators

Because the extra steps to manage the multiphase construction process are easy to forget, ATL encapsulates this algorithm into several C++ classes called Creators. Each performs the appropriate multiphase construction. Each Creator class is actually just a way to wrap a scope around a single static member function called CreateInstance:

static HRESULT WINAPI CreateInstance(void* pv, REFIID riid, LPVOID* ppv);

The name of the Creator class is used in a type definition associated with the class; this is discussed in the next section.

CComCreator

CComCreator is a Creator class that creates either standalone or aggregated instances. It is parameterized by the C++ class being created; for example, CComObject<CPenguin>. CComCreator is declared like this:

template <class T1>
class CComCreator {
public:
    static HRESULT WINAPI CreateInstance(void* pv, REFIID riid,
        LPVOID* ppv) {
        ATLASSERT(ppv != NULL);
        if (ppv == NULL)
            return E_POINTER;
        *ppv = NULL;

        HRESULT hRes = E_OUTOFMEMORY;
        T1* p = NULL;
        ATLTRY(p = new T1(pv))
        if (p != NULL) {
            p->SetVoid(pv);
            p->InternalFinalConstructAddRef();
            hRes = p->_AtlInitialConstruct();
            if (SUCCEEDED(hRes))
                hRes = p->FinalConstruct();
            if (SUCCEEDED(hRes))
                hRes = p->_AtlFinalConstruct();
            p->InternalFinalConstructRelease();
            if (hRes == S_OK)
                hRes = p->QueryInterface(riid, ppv);
            if (hRes != S_OK)
                delete p;
        }
        return hRes;
    }
};

Using CComCreator simplifies our class object implementation quite a bit:

STDMETHODIMP
CPenguinCO::CreateInstance(IUnknown* pUnkOuter, REFIID riid,
    void** ppv) {
    typedef CComCreator<
        CComPolyObject<CPenguin> > PenguinPolyCreator;
    return PenguinPolyCreator::CreateInstance(pUnkOuter,
        riid, ppv);
}

Notice the use of the type definition to define a new Creator type. If we were to create penguins other places in our server, we would have to rebuild the type definition:

STDMETHODIMP CAviary::CreatePenguin(IBird** ppbird) {
    typedef CComCreator< CComObject<CPenguin> > PenguinCreator;
    return PenguinCreator::CreateInstance(0, IID_IBird, (void**)ppbird);
}

Defining a Creator like this outside the class being created has two problems. First, it duplicates the type-definition code. Second, and more important, we’ve taken away the right of the CPenguin class to decide for itself whether it wants to support aggregation; the type definition is making this decision now. To reduce code and let the class designer make the decision about standalone versus aggregate activation, by convention in ATL, you place the type definition inside the class declaration and give it the well-known name _CreatorClass:

class CPenguin : ... {
public:
    ...
    typedef CComCreator<
        CComPolyObject<CPenguin> > _CreatorClass;
};

Using the Creator type definition, creating an instance and obtaining an initial interface actually involves fewer lines of code than operator new and QueryInterface:

STDMETHODIMP CAviary::CreatePenguin(IBird** ppbird) {
    return CPenguin::_CreatorClass::CreateInstance(0,
        IID_IBird,
        (void**)ppbird);
}
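
Stripped of COM details, the Creator convention looks roughly like the following sketch. Result, Creator, Penguin, Dodo, and Create are hypothetical names. The nested type is spelled CreatorClass here because standard C++ reserves identifiers that start with an underscore followed by a capital letter, whereas ATL's own name is _CreatorClass:

```cpp
#include <new>

// Simplified result codes; hypothetical stand-ins for HRESULT values.
enum Result { Ok, OutOfMemory, ConstructFailed };

// A Creator wraps two-phase construction behind one static function.
template <class T>
struct Creator {
    static Result CreateInstance(T** pp) {
        *pp = nullptr;
        T* p = new (std::nothrow) T;
        if (!p) return OutOfMemory;
        Result r = p->FinalConstruct();       // second phase may fail
        if (r != Ok) { delete p; return r; }
        *pp = p;
        return Ok;
    }
};

// Each class picks its own Creator through a well-known nested type.
struct Penguin {
    bool ready = false;
    Result FinalConstruct() { ready = true; return Ok; }
    typedef Creator<Penguin> CreatorClass;
};

struct Dodo {
    Result FinalConstruct() { return ConstructFailed; }
    typedef Creator<Dodo> CreatorClass;
};

// Client code names only T::CreatorClass, never a concrete Creator.
template <class T>
Result Create(T** pp) {
    return T::CreatorClass::CreateInstance(pp);
}
```

Create succeeds for Penguin and yields an object whose second phase has run. For Dodo it returns ConstructFailed and leaves the out pointer null. Neither call site changes if a class later switches to a different Creator.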

Chapter 5, “COM Servers,” discusses one other base class that your class will often derive from, CComCoClass.

class CPenguin : ...,
public CComCoClass<CPenguin, &CLSID_Penguin>, ... {...};

CComCoClass provides two static member functions, each called CreateInstance, that make use of the class’s creators:

template <class T, const CLSID* pclsid = &CLSID_NULL>
class CComCoClass {
public:
    ...
    template <class Q>
    static HRESULT CreateInstance(IUnknown* punkOuter, Q** pp) {
        return T::_CreatorClass::CreateInstance(punkOuter,
            __uuidof(Q), (void**) pp);
    }
    template <class Q>
    static HRESULT CreateInstance(Q** pp) {
        return T::_CreatorClass::CreateInstance(NULL,
            __uuidof(Q), (void**) pp);
    }
};

This simplifies the creation code still further:

STDMETHODIMP CAviary::CreatePenguin(IBird** ppbird) {
    return CPenguin::CreateInstance(ppbird);
}

CComCreator2

You might like to support both standalone and aggregate activation using CComObject and CComAggObject instead of CComPolyObject because of the overhead associated with CComPolyObject in the standalone case. The decision can be made with a simple if statement, but then you lose the predefined CreateInstance code in CComCoClass. ATL provides CComCreator2 to make this logic fit within the existing Creator machinery:

template <class T1, class T2> class CComCreator2 {
public:
    static HRESULT WINAPI CreateInstance(void* pv, REFIID riid,
        LPVOID* ppv) {
        ATLASSERT(*ppv == NULL);
        return (pv == NULL) ? T1::CreateInstance(NULL, riid, ppv)
                            : T2::CreateInstance(pv, riid, ppv);
    }
};

Notice that CComCreator2 is parameterized by the types of two other Creators. All CComCreator2 does is check for a NULL pUnkOuter and forward the call to one of two other Creators. So, if you’d like to use CComObject and CComAggObject instead of CComPolyObject, you can do so like this:

class CPenguin : ... {
public:
    ...
    typedef CComCreator2< CComCreator< CComObject<CPenguin> >,
        CComCreator< CComAggObject<CPenguin> > >
        _CreatorClass;
};

Of course, the beauty of this scheme is that all the Creators have the same function, CreateInstance, and are exposed via a type definition of the same name, _CreatorClass. Thus, none of the server code that creates penguins needs to change if the designer of the class changes his mind about how penguins should be created.

CComFailCreator

One of the changes you might want to make to your creation scheme is to support either standalone or aggregate activation only, not both. To make this happen, you need a special Creator to return an error code to use in place of one of the Creators passed as template arguments to CComCreator2. That’s what CComFailCreator is for:

template <HRESULT hr> class CComFailCreator {
public:
    static HRESULT WINAPI CreateInstance(void*, REFIID, LPVOID*)
    { return hr; }
};

If you’d like standalone activation only, you can use CComFailCreator as the aggregation creator template parameter:

class CPenguin : ... {
public:
    ...
    typedef CComCreator2< CComCreator< CComObject<CPenguin> >,
        CComFailCreator<CLASS_E_NOAGGREGATION> >
        _CreatorClass;
};

If you’d like aggregate activation only, you can use CComFailCreator as the standalone creator parameter:

class CPenguin : ... {
public:
    ...
    typedef CComCreator2< CComFailCreator<E_FAIL>,
        CComCreator< CComAggObject<CPenguin> > >
        _CreatorClass;
};
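
The way CComCreator2 and CComFailCreator compose can be mimicked in a few lines of portable C++. In this sketch, Result, Penguin, PenguinCreator, FailCreator, Creator2, BothModes, and StandaloneOnly are hypothetical names, the outer pointer is only tested for null, and the failing creator also clears the out parameter, which is a choice of the sketch rather than something the ATL code shown above does:

```cpp
// Simplified result codes; hypothetical stand-ins for HRESULT values.
enum Result { Ok, NoAggregation, Fail };

struct Penguin { bool aggregated = false; };

// One creator per activation mode.
template <bool Aggregated>
struct PenguinCreator {
    static Result CreateInstance(void* /*outer*/, Penguin** pp) {
        *pp = new Penguin;
        (*pp)->aggregated = Aggregated;
        return Ok;
    }
};

// A creator that always fails with a fixed code.
template <Result R>
struct FailCreator {
    static Result CreateInstance(void*, Penguin** pp) {
        *pp = nullptr;
        return R;
    }
};

// Chooses between two creators based on whether an outer was supplied.
template <class Standalone, class Aggregate>
struct Creator2 {
    static Result CreateInstance(void* outer, Penguin** pp) {
        return outer == nullptr
            ? Standalone::CreateInstance(nullptr, pp)
            : Aggregate::CreateInstance(outer, pp);
    }
};

typedef Creator2<PenguinCreator<false>, PenguinCreator<true> > BothModes;
typedef Creator2<PenguinCreator<false>,
                 FailCreator<NoAggregation> > StandaloneOnly;
```

With BothModes, a null outer yields a standalone penguin and a non-null outer yields an aggregated one. With StandaloneOnly, a non-null outer produces NoAggregation and no object.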

Convenience Macros

As a convenience, ATL provides the following macros in place of manually specifying the _CreatorClass type definition for each class:

#define DECLARE_POLY_AGGREGATABLE(x) public:\
  typedef ATL::CComCreator< \
  ATL::CComPolyObject< x > > _CreatorClass;

#define DECLARE_AGGREGATABLE(x) public: \
  typedef ATL::CComCreator2< \
    ATL::CComCreator< ATL::CComObject< x > >, \
    ATL::CComCreator< ATL::CComAggObject< x > > > \
    _CreatorClass;

#define DECLARE_NOT_AGGREGATABLE(x) public:\
  typedef ATL::CComCreator2< \
    ATL::CComCreator< ATL::CComObject< x > >, \
    ATL::CComFailCreator<CLASS_E_NOAGGREGATION> > \
    _CreatorClass;

#define DECLARE_ONLY_AGGREGATABLE(x) public:\
  typedef ATL::CComCreator2< \
    ATL::CComFailCreator<E_FAIL>, \
    ATL::CComCreator< ATL::CComAggObject< x > > > \
    _CreatorClass;

Using these macros, you can declare that CPenguin can be activated both standalone and aggregated like this:

class CPenguin : ... {
public:
    ...
    DECLARE_AGGREGATABLE(CPenguin)
};

Table 4.3 summarizes the classes the Creators use to derive from your class.

Table 4.3. Creator Type-Definition Macros

Macro	Standalone	Aggregation
`DECLARE_AGGREGATABLE`	`CComObject`	`CComAggObject`
`DECLARE_NOT_AGGREGATABLE`	`CComObject`	–
`DECLARE_ONLY_AGGREGATABLE`	–	`CComAggObject`
`DECLARE_POLY_AGGREGATABLE`	`CComPolyObject`	`CComPolyObject`

Private Initialization

Creators are handy because they follow the multiphase construction sequence ATL-based objects use. However, Creators return only an interface pointer, not a pointer to the implementing class (as in IBird* instead of CPenguin*). This can be a problem if the class exposes public member functions or member data that is not available via a COM interface. Your first instinct as a former C programmer might be to simply cast the resultant interface pointer to the type you’d like:

STDMETHODIMP
CAviary::CreatePenguin(BSTR bstrName, long nWingspan,
    IBird** ppbird) {
    HRESULT hr;
    hr = CPenguin::_CreatorClass::CreateInstance(0,
        IID_IBird, (void**)ppbird);
    if( SUCCEEDED(hr) ) {
        // Resist this instinct!
        CPenguin* pPenguin = (CPenguin*)(*ppbird);
        pPenguin->Init(bstrName, nWingspan);
    }
    return hr;
}

Unfortunately, because QueryInterface allows interfaces of a single COM identity to be implemented on multiple C++ objects or even multiple COM objects, in many cases a cast won’t work. Instead, you should use the CreateInstance static member functions of CComObject, CComAggObject, and CComPolyObject:

static HRESULT WINAPI
CComObject::CreateInstance(CComObject<Base>** pp);

static HRESULT WINAPI
CComAggObject::CreateInstance(IUnknown* puo,
    CComAggObject<contained>** pp);

static HRESULT WINAPI
CComPolyObject::CreateInstance(IUnknown* puo,
    CComPolyObject<contained>** pp);

These static member functions do not make Creators out of CComObject, CComAggObject, or CComPolyObject, but they each perform the additional work required to call the object’s FinalConstruct (and _AtlInitialConstruct, and so on) member functions. The reason to use them, however, is that each of them returns a pointer to the most derived class:

STDMETHODIMP
CAviary::CreatePenguin(BSTR bstrName, long nWingspan,
    IBird** ppbird) {
    HRESULT hr;
    CComObject<CPenguin>* pPenguin = 0;
    hr = CComObject<CPenguin>::CreateInstance(&pPenguin);
    if( SUCCEEDED(hr) ) {
        pPenguin->AddRef();
        pPenguin->Init(bstrName, nWingspan);
        hr = pPenguin->QueryInterface(IID_IBird, (void**)ppbird);
        pPenguin->Release();
    }
    return hr;
}

The class you use for creation in this manner depends on the kind of activation you want. For standalone activation, use CComObject::CreateInstance. For aggregated activation, use CComAggObject::CreateInstance. For either standalone or aggregated activation that saves a set of vtbls at the expense of per-instance overhead, use CComPolyObject::CreateInstance.

Multiphase Construction on the Stack

When creating an instance of an ATL-based COM object, you should always use a Creator (or the static CreateInstance member function of CComObject, et al.) instead of the C++ operator new. However, if you’ve got a global or a static object, or an object that’s allocated on the stack, you can’t use a Creator because you’re not calling new. As discussed earlier, ATL provides two classes for creating instances that aren’t on the heap: CComObjectGlobal and CComObjectStack. However, instead of requiring you to call FinalConstruct (and FinalRelease) manually, both of these classes perform the proper initialization and shutdown in their constructors and destructors, as shown here in CComObjectGlobal:

template <class Base>
class CComObjectGlobal : public Base {
public:
    typedef Base _BaseClass;
    CComObjectGlobal(void* = NULL) {
        m_hResFinalConstruct = S_OK;
        __if_exists(FinalConstruct) {
            __if_exists(InternalFinalConstructAddRef) {
                InternalFinalConstructAddRef();
            }
            m_hResFinalConstruct = _AtlInitialConstruct();
            if (SUCCEEDED(m_hResFinalConstruct))
                m_hResFinalConstruct = FinalConstruct();
            __if_exists(InternalFinalConstructRelease) {
                InternalFinalConstructRelease();
            }
        }
    }
    ~CComObjectGlobal() {
        __if_exists(FinalRelease) {
            FinalRelease();
        }
    }
    ...
    HRESULT m_hResFinalConstruct;
};

Because there is no return code from a constructor, if you’re interested in the result from FinalConstruct, you must check the cached result in the public member variable m_hResFinalConstruct.

Note in the previous code the use of the __if_exists keyword, a Microsoft-specific extension to C++. This keyword allows for conditional compilation based on the presence of a symbol or member function. Derived classes, for instance, can check for the existence of particular members of a base class. Alternatively, the __if_not_exists keyword can be used to conditionally compile code based on the absence of a specific symbol. These keywords are analogous to the #ifdef and #ifndef preprocessor directives, except that they operate on symbols that are not removed during the preprocessing stage.
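
Compilers other than Visual C++ do not provide __if_exists, but a rough analog of the pattern can be built from standard template machinery. The following sketch is not how ATL does it. HasFinalConstruct, GlobalObject, Plain, and Picky are hypothetical names, the result is an int with 0 meaning success, and the check is an overload-based detection idiom rather than a keyword:

```cpp
#include <type_traits>
#include <utility>

// Detects whether T has a FinalConstruct() member callable with no arguments.
template <class T, class = void>
struct HasFinalConstruct : std::false_type {};

template <class T>
struct HasFinalConstruct<
    T, decltype(void(std::declval<T&>().FinalConstruct()))>
    : std::true_type {};

// Runs the second construction phase in the constructor, if the base
// class has one, and caches the result in a public member.
template <class Base>
class GlobalObject : public Base {
public:
    GlobalObject() {
        m_resFinalConstruct = Construct(HasFinalConstruct<Base>());
    }
    int m_resFinalConstruct;   // 0 means success in this sketch

private:
    // Only the overload that is actually called gets instantiated, so
    // the FinalConstruct call is never compiled for a Base lacking it.
    int Construct(std::true_type)  { return this->FinalConstruct(); }
    int Construct(std::false_type) { return 0; }
};

struct Plain {};                                      // no second phase
struct Picky { int FinalConstruct() { return -1; } }; // second phase fails
```

A GlobalObject<Plain> ends up with a cached result of 0, while a GlobalObject<Picky> records the -1 that its FinalConstruct returned. As with m_hResFinalConstruct, the caller has to inspect the member because the constructor itself cannot report failure.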

Debugging

ATL provides a number of helpful debugging facilities, including both a normal and a categorized wrapper for producing debug output, a macro for making assertions, and debug output for tracing calls to QueryInterface, AddRef, and Release on an interface-by-interface basis. Of course, during a release build, all these debugging facilities fall away to produce the smallest, fastest binary image possible.

Making Assertions

Potentially the best debugging technique is to use assertions, which enable you to make assumptions in your code and, if those assumptions are invalidated, to be notified immediately. Although ATL doesn’t exactly support assertions, it does provide the ATLASSERT macro. However, it’s actually just another name for the Microsoft CRT macro _ASSERTE:

#ifndef ATLASSERT
#define ATLASSERT(expr) _ASSERTE(expr)
#endif

Flexible Debug Output

OutputDebugString is handy as the Win32 equivalent of printf, but it takes only a single string argument. We want a printf that outputs to debug output instead of standard output. ATL provides the AtlTrace function to do exactly that:

inline void _cdecl AtlTrace(LPCSTR pszFormat, ...)
inline void _cdecl AtlTrace(LPCWSTR pszFormat, ...)

Instead of calling the function directly, use the macro ATLTRACE. The macro calls the underlying function, but also adds file and line number information to the trace output. The macro expands to either a call to AtlTrace or nothing, depending on whether the _DEBUG symbol is defined. Typical usage is as follows:

HRESULT CPenguin::FinalConstruct() {
  ATLTRACE(__TEXT("%d+%d= %d\n"), 2, 2, 2+2);
  return S_OK;
}
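
The split between a formatting function and a macro that supplies the call site can be sketched portably. FormatTrace and MY_TRACE below are hypothetical names, the "file(line) : " prefix is a layout chosen for the sketch, output goes to stderr instead of the debugger, and the standard NDEBUG symbol stands in for the _DEBUG check:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Formats like printf and prefixes the text with the call site.
// Messages longer than the buffer are truncated.
std::string FormatTrace(const char* file, int line, const char* fmt, ...) {
    char body[512];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(body, sizeof body, fmt, args);
    va_end(args);
    return std::string(file) + "(" + std::to_string(line) + ") : " + body;
}

// The macro captures __FILE__ and __LINE__ at the point of use and
// compiles to nothing when tracing is disabled.
#ifdef NDEBUG
#define MY_TRACE(...) ((void)0)
#else
#define MY_TRACE(...) \
    std::fputs(FormatTrace(__FILE__, __LINE__, __VA_ARGS__).c_str(), stderr)
#endif
```

A call such as MY_TRACE("%d+%d= %d\n", 2, 2, 2+2) writes the formatted message, prefixed with the calling file and line, to stderr in a build without NDEBUG and disappears entirely otherwise.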

ATLTRACE always generates output to the debug window. If you’d like to be even more selective about what makes it to debug output, ATL provides a second trace function, AtlTrace2, also with its own macro, ATLTRACE2:

void AtlTrace2(DWORD_PTR dwCategory, UINT nLevel,
    LPCSTR pszFormat, ...)
void AtlTrace2(DWORD_PTR dwCategory, UINT nLevel,
    LPCWSTR pszFormat, ...)

In addition to the format string and the variable arguments, AtlTrace2 takes a trace category and a trace level. The trace category is defined as an instance of the CTraceCategory class. ATL includes the following trace categories, already defined:

#ifdef _DEBUG
#define DECLARE_TRACE_CATEGORY( name ) \
    extern ATL::CTraceCategory name;
#else
#define DECLARE_TRACE_CATEGORY( name ) const DWORD_PTR name = 0;
#endif

DECLARE_TRACE_CATEGORY( atlTraceGeneral )
DECLARE_TRACE_CATEGORY( atlTraceCOM )
DECLARE_TRACE_CATEGORY( atlTraceQI )
DECLARE_TRACE_CATEGORY( atlTraceRegistrar )
DECLARE_TRACE_CATEGORY( atlTraceRefcount )
DECLARE_TRACE_CATEGORY( atlTraceWindowing )
DECLARE_TRACE_CATEGORY( atlTraceControls )
DECLARE_TRACE_CATEGORY( atlTraceHosting )
DECLARE_TRACE_CATEGORY( atlTraceDBClient )
DECLARE_TRACE_CATEGORY( atlTraceDBProvider )
DECLARE_TRACE_CATEGORY( atlTraceSnapin )
DECLARE_TRACE_CATEGORY( atlTraceNotImpl )
DECLARE_TRACE_CATEGORY( atlTraceAllocation )
DECLARE_TRACE_CATEGORY( atlTraceException )
DECLARE_TRACE_CATEGORY( atlTraceTime )
DECLARE_TRACE_CATEGORY( atlTraceCache )
DECLARE_TRACE_CATEGORY( atlTraceStencil )
DECLARE_TRACE_CATEGORY( atlTraceString )
DECLARE_TRACE_CATEGORY( atlTraceMap )
DECLARE_TRACE_CATEGORY( atlTraceUtil )
DECLARE_TRACE_CATEGORY( atlTraceSecurity )
DECLARE_TRACE_CATEGORY( atlTraceSync )
DECLARE_TRACE_CATEGORY( atlTraceISAPI )

// atlTraceUser categories are no longer needed.
// Just declare your own trace category using CTraceCategory.
DECLARE_TRACE_CATEGORY( atlTraceUser )
DECLARE_TRACE_CATEGORY( atlTraceUser2 )
DECLARE_TRACE_CATEGORY( atlTraceUser3 )
DECLARE_TRACE_CATEGORY( atlTraceUser4 )

#pragma deprecated( atlTraceUser )
#pragma deprecated( atlTraceUser2 )
#pragma deprecated( atlTraceUser3 )
#pragma deprecated( atlTraceUser4 )

The CTraceCategory class associates the category name with the underlying value so that it appears in the trace listing. The four atlTraceUserX categories exist for backward compatibility; ATL versions 3 and earlier had no means of defining custom trace categories. For new code, you simply create a global instance of CTraceCategory like this:

CTraceCategory PenguinTraces( "CPenguin trace", 1 );
...
STDMETHODIMP CPenguin::Fly() {
    ATLTRACE2(PenguinTraces,   2,
        _T("IBird::Fly\n"));
    ATLTRACE2(PenguinTraces,   42,
        _T("Hmmm... Penguins can't fly...\n"));
    ATLTRACE2(atlTraceNotImpl, 0,
        _T("IBird::Fly not implemented!\n"));
    return E_NOTIMPL;
}

The trace level is a measure of severity, with 0 the most severe. ATL itself uses only levels 0 and 2. The documentation recommends that you stay between 0 and 4, but you can use any level up to 4,294,967,295 (although that might be a little too fine-grained to be useful).
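To make the role of the level concrete, here is a small portable sketch of level-based filtering. It rests on an assumption that the text does not spell out: a message is emitted only when its level does not exceed a per-category threshold. SketchCategory, Trace2, and g_lines are hypothetical names and are not part of ATL.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical category: a display name plus the highest level it lets through.
struct SketchCategory {
    const char* name;
    unsigned    maxLevel;
};

static std::vector<std::string> g_lines;  // captured output, one entry per message

// Assumption: a message is emitted only when its level does not exceed the
// category's threshold, so level 0 (most severe) always passes.
inline bool Trace2(const SketchCategory& category, unsigned level,
                   const char* format, ...) {
    if (level > category.maxLevel) return false;
    char text[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(text, sizeof(text), format, args);
    va_end(args);
    // Prefix the category name so it appears in the listing.
    g_lines.push_back(std::string(category.name) + ": " + text);
    return true;
}
```

With a threshold of 2, messages at levels 0 and 2 are captured and a level-42 message is dropped, which is the kind of selectivity the category and level arguments are meant to provide.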

Also, because ATL uses atlTraceNotImpl so often, there’s even a special macro for it:

#define ATLTRACENOTIMPL(funcname) \
  ATLTRACE2(atlTraceNotImpl, 2, \
    _T("ATL: %s not implemented.\n"), funcname); \
  return E_NOTIMPL

ATL uses this macro heavily in its implementations of the OLE interfaces:

STDMETHOD(SetMoniker)(DWORD, IMoniker*) {
    ATLTRACENOTIMPL(_T("IOleObjectImpl::SetMoniker"));
}
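One consequence of the macro's shape is worth noting: it expands to two statements, a trace followed by a return, so it behaves as intended only as the body of a function or inside braces. The sketch below reproduces that shape with portable stand-ins; TRACE_NOT_IMPL_SKETCH, TraceNotImpl, kOk, and kNotImpl are hypothetical names, and the integer constants merely stand in for S_OK and E_NOTIMPL.

```cpp
#include <cassert>

// Portable stand-ins for S_OK and E_NOTIMPL; the values are arbitrary.
const int kOk      = 0;
const int kNotImpl = -1;

static int g_traceCount = 0;
inline void TraceNotImpl(const char*) { ++g_traceCount; }

// Same two-statement shape as ATLTRACENOTIMPL: trace, then return.
// The caller supplies the final semicolon.
#define TRACE_NOT_IMPL_SKETCH(funcname) \
    TraceNotImpl(funcname); \
    return kNotImpl

// Braces keep both statements under the condition.
inline int Braced(bool supported) {
    if (!supported) {
        TRACE_NOT_IMPL_SKETCH("Braced");
    }
    return kOk;
}

// Without braces only the trace is conditional; the return always runs.
inline int Unbraced(bool supported) {
    if (!supported)
        TRACE_NOT_IMPL_SKETCH("Unbraced");
    return kOk;  // never reached
}
```

Used as the entire body of a method, as in SetMoniker above, the macro has no such pitfall.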

Tracing Calls to QueryInterface

ATL’s implementation of QueryInterface is especially well instrumented for debugging. If you define the _ATL_DEBUG_QI symbol before compiling, your objects will output their class name, the interface being queried for (by name [10], if available), and whether the query succeeded or failed. This is extremely useful for reverse engineering clients’ interface requirements. For example, here’s a sample of the _ATL_DEBUG_QI output when hosting a control in IE6:

CComClassFactory - IUnknown
CComClassFactory - IClassFactory
CComClassFactory - IClassFactory
CComClassFactory -  - failed
CPenguin - IUnknown
CPenguin -  - failed
CPenguin - IOleControl
CPenguin - IClientSecurity - failed
CPenguin - IQuickActivate
CPenguin - IOleObject
CPenguin - IViewObjectEx
CPenguin - IPointerInactive - failed
CPenguin - IProvideClassInfo2
CPenguin - IConnectionPointContainer - failed
CPenguin - IPersistPropertyBag2 - failed
CPenguin - IPersistPropertyBag - failed
CPenguin - IPersistStreamInit
CPenguin - IViewObjectEx
CPenguin - IActiveScript - failed
CPenguin -  - failed
CPenguin - IOleControl
CPenguin - IOleCommandTarget - failed
CPenguin - IDispatchEx - failed
CPenguin - IDispatch
CPenguin - IOleControl
CPenguin - IOleObject
CPenguin - IOleObject
CPenguin - IRunnableObject - failed
CPenguin - IOleObject
CPenguin - IOleInPlaceObject
CPenguin - IOleInPlaceObjectWindowless
CPenguin - IOleInPlaceActiveObject
CPenguin - IOleControl
CPenguin - IClientSecurity - failed
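The format of this listing can be mimicked with a small table-driven lookup that logs every request. The following is a portable sketch only, not ATL's QueryInterface: MapEntry, TracedQuery, and g_qiLog are hypothetical names, and interface names stand in for the GUIDs that real COM uses to identify interfaces.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical interface-map entry: the interface name doubles as its ID here.
// Real COM identifies interfaces by GUID and looks names up separately.
struct MapEntry {
    const char* interfaceName;
};

static std::vector<std::string> g_qiLog;  // stand-in for debug output

// Walk the table, log "<class> - <interface>", and append " - failed" on a miss.
inline bool TracedQuery(const char* className, const MapEntry* map,
                        std::size_t count, const char* requested) {
    bool found = false;
    for (std::size_t i = 0; i < count && !found; ++i)
        found = std::strcmp(map[i].interfaceName, requested) == 0;

    std::string line = std::string(className) + " - " + requested;
    if (!found) line += " - failed";
    g_qiLog.push_back(line);
    return found;
}
```

Reading such a log top to bottom shows which interfaces a client probes for and in what order, which is what makes the real output useful for reverse engineering a container's expectations.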

Tracing Calls to AddRef and Release

The only calls more heavily instrumented for debugging than QueryInterface are AddRef and Release. ATL provides an elaborate scheme for tracking calls to AddRef and Release on individual interfaces. It is elaborate because each ATL-based C++ class has a single implementation of AddRef and Release, implemented in the most derived class (for example, CComObject), so the object alone cannot tell which interface a given call came through. To overcome this limitation, when _ATL_DEBUG_INTERFACES is defined, ATL wraps each new interface [11] handed out via QueryInterface in another C++ object that implements a single interface. Each of these “thunk objects” keeps track of the real interface pointer, as well as the name of the interface and the name of the class that has implemented the interface. Each thunk object also keeps an interface-pointer-specific reference count that its implementation of AddRef and Release manages along with the object’s reference count. As calls to AddRef and Release are made, each thunk object knows exactly which interface is being used and dumps reference count information to debug output. For example, here’s the same interaction between a control and IE6, but using _ATL_DEBUG_INTERFACES instead of _ATL_DEBUG_QI:

QIThunk-1   AddRef:   Object=0x021c2c88   Refcount=1   CComClassFactory-IUnknown
QIThunk-2   AddRef:   Object=0x021c2c88   Refcount=1   CComClassFactory-IClassFactory
QIThunk-2   AddRef:   Object=0x021c2c88   Refcount=2   CComClassFactory-IClassFactory
QIThunk-2   Release:  Object=0x021c2c88   Refcount=1   CComClassFactory-IClassFactory
QIThunk-3   AddRef:   Object=0x021c2c88   Refcount=1   CComClassFactory-IClassFactory
QIThunk-2   Release:  Object=0x021c2c88   Refcount=0   CComClassFactory-IClassFactory
QIThunk-4   AddRef:   Object=0x021c2e38   Refcount=1   CPenguin-IUnknown
QIThunk-5   AddRef:   Object=0x021c2e40   Refcount=1   CPenguin-IOleControl
QIThunk-5   Release:  Object=0x021c2e40   Refcount=0   CPenguin-IOleControl
QIThunk-6   AddRef:   Object=0x021c2e60   Refcount=1   CPenguin-IQuickActivate
QIThunk-7   AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-8   AddRef:   Object=0x021c2e4c   Refcount=1   CPenguin-IViewObjectEx
QIThunk-9   AddRef:   Object=0x021c2e68   Refcount=1   CPenguin-IProvideClassInfo2
QIThunk-9   Release:  Object=0x021c2e68   Refcount=0   CPenguin-IProvideClassInfo2
QIThunk-8   Release:  Object=0x021c2e4c   Refcount=0   CPenguin-IViewObjectEx
QIThunk-7   Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-6   Release:  Object=0x021c2e60   Refcount=0   CPenguin-IQuickActivate
QIThunk-10  AddRef:   Object=0x021c2e3c   Refcount=1   CPenguin-IPersistStreamInit
QIThunk-10  Release:  Object=0x021c2e3c   Refcount=0   CPenguin-IPersistStreamInit
QIThunk-11  AddRef:   Object=0x021c2e4c   Refcount=1   CPenguin-IViewObjectEx
QIThunk-12  AddRef:   Object=0x021c2e40   Refcount=1   CPenguin-IOleControl
QIThunk-12  Release:  Object=0x021c2e40   Refcount=0   CPenguin-IOleControl
QIThunk-13  AddRef:   Object=0x021c2e38   Refcount=1   CPenguin-IDispatch
QIThunk-14  AddRef:   Object=0x021c2e40   Refcount=1   CPenguin-IOleControl
QIThunk-14  Release:  Object=0x021c2e40   Refcount=0   CPenguin-IOleControl
QIThunk-3   Release:  Object=0x021c2c88   Refcount=0   CComClassFactory-IClassFactory
QIThunk-15  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-16  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-16  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-15  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-17  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-18  AddRef:   Object=0x021c2e50   Refcount=1   CPenguin-IOleInPlaceObject
QIThunk-19  AddRef:   Object=0x021c2e50   Refcount=1   CPenguin-IOleInPlaceObjectWindowless
QIThunk-20  AddRef:   Object=0x021c2e48   Refcount=1   CPenguin-IOleInPlaceActiveObject
QIThunk-20  Release:  Object=0x021c2e48   Refcount=0   CPenguin-IOleInPlaceActiveObject
QIThunk-18  Release:  Object=0x021c2e50   Refcount=0   CPenguin-IOleInPlaceObject
QIThunk-19  AddRef:   Object=0x021c2e50   Refcount=2   CPenguin-IOleInPlaceObjectWindowless
QIThunk-17  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-19  Release:  Object=0x021c2e50   Refcount=1   CPenguin-IOleInPlaceObjectWindowless
QIThunk-21  AddRef:   Object=0x021c2e40   Refcount=1   CPenguin-IOleControl
QIThunk-21  Release:  Object=0x021c2e40   Refcount=0   CPenguin-IOleControl
QIThunk-22  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-23  AddRef:   Object=0x021c2e50   Refcount=1   CPenguin-IOleInPlaceObject
QIThunk-19  Release:  Object=0x021c2e50   Refcount=0   CPenguin-IOleInPlaceObjectWindowless
QIThunk-23  Release:  Object=0x021c2e50   Refcount=0   CPenguin-IOleInPlaceObject
QIThunk-22  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-24  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-25  AddRef:   Object=0x021c2e50   Refcount=1   CPenguin-IOleInPlaceObject
QIThunk-25  Release:  Object=0x021c2e50   Refcount=0   CPenguin-IOleInPlaceObject
QIThunk-24  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-13  Release:  Object=0x021c2e38   Refcount=0   CPenguin-IDispatch
QIThunk-11  Release:  Object=0x021c2e4c   Refcount=0   CPenguin-IViewObjectEx
QIThunk-26  AddRef:   Object=0x021c2e44   Refcount=1   CPenguin-IOleObject
QIThunk-26  Release:  Object=0x021c2e44   Refcount=0   CPenguin-IOleObject
QIThunk-4   Release:  Object=0x021c2e38   Refcount=0   CPenguin-IUnknown
QIThunk-1   Release:  Object=0x021c2c88   Refcount=0   CComClassFactory-IUnknown
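The thunk idea can be illustrated with a small portable sketch: each thunk forwards AddRef and Release to the shared object while keeping a count of its own for one interface. This is not ATL's thunk code; SketchObject and SketchThunk are hypothetical names, and the log line only approximates the format shown above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical object with a single, shared reference count, as in the text.
struct SketchObject {
    unsigned long refs = 0;
    unsigned long AddRef()  { return ++refs; }
    unsigned long Release() { return --refs; }
};

// Hypothetical thunk: one per interface pointer handed out. It forwards to the
// object's count and also keeps a count of its own for that one interface.
class SketchThunk {
public:
    SketchThunk(int index, SketchObject* object, const char* label)
        : m_index(index), m_object(object), m_label(label), m_refs(0) {}

    unsigned long AddRef()  { m_object->AddRef();  return Log("AddRef", ++m_refs); }
    unsigned long Release() { m_object->Release(); return Log("Release", --m_refs); }

    unsigned long InterfaceRefs() const { return m_refs; }
    const std::string& LastLine() const { return m_lastLine; }

private:
    // Record which thunk was used and its per-interface count.
    unsigned long Log(const char* what, unsigned long count) {
        char line[160];
        std::snprintf(line, sizeof(line), "QIThunk-%d %s: Refcount=%lu %s",
                      m_index, what, count, m_label);
        m_lastLine = line;
        return count;
    }

    int           m_index;
    SketchObject* m_object;
    const char*   m_label;
    unsigned long m_refs;
    std::string   m_lastLine;
};
```

Two thunks over the same object each report their own count while the object's shared count reflects both, which is why the listing above can show Refcount=0 for one interface while the object is still alive.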

ATL maintains a list of outstanding thunk objects. This list is used at server shutdown to detect any leaks; that is, any interfaces that the client has not released. When using _ATL_DEBUG_INTERFACES, watch your debug output for the string LEAK, which is an indication that someone has mismanaged an interface reference:

ATL: QIThunk - 4 LEAK: Object = 0x00962920 Refcount = 4
  MaxRefCount = 4 CCalc - ICalc

The most useful part of this notification is the index of the QI thunk object. You can use this index to track when the leaked interface is acquired by using the CAtlDebugInterfacesModule class. This is the class that manages the thunk objects during debug builds, and a global instance of this class called _AtlDebugInterfacesModule is automatically included in your server when the _ATL_DEBUG_INTERFACES symbol is defined. You can instruct the debugger to break at the appropriate time by setting the m_nIndexBreakAt member of the CAtlDebugInterfacesModule at server start-up time:

extern "C"
BOOL WINAPI DllMain(HINSTANCE hInstance, DWORD dwReason,
    LPVOID lpReserved) {
    hInstance; // reference the parameter to avoid an unused-parameter warning
    BOOL b = _AtlModule.DllMain(dwReason, lpReserved);
    // Trace down interface leaks
#ifdef _ATL_DEBUG_INTERFACES
    _AtlDebugInterfacesModule.m_nIndexBreakAt = 4;
#endif
    return b;
}

When that interface thunk is allocated, _AtlDebugInterfacesModule calls DebugBreak, handing control over to the debugger and allowing you to examine the call stack and plug the leak.
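The bookkeeping behind leak detection and the break-at index can be sketched portably as follows. This is an illustration under stated assumptions, not the CAtlDebugInterfacesModule implementation: ThunkRegistry and its members are hypothetical names, a flag stands in for the call to DebugBreak, and the leak message only approximates the real format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry of thunks; a portable stand-in for the bookkeeping
// the text attributes to CAtlDebugInterfacesModule.
class ThunkRegistry {
public:
    struct Entry { int index; std::string label; unsigned long refs; };

    int  breakAtIndex = 0;      // 0 means "never break"
    bool breakHit     = false;  // stand-in for calling DebugBreak

    // Hand out the next index; flag the break when the watched index appears.
    int Add(const std::string& label) {
        int index = static_cast<int>(m_entries.size()) + 1;
        m_entries.push_back(Entry{index, label, 0});
        if (index == breakAtIndex) breakHit = true;
        return index;
    }

    void AddRef(int index)  { ++m_entries[index - 1].refs; }
    void Release(int index) { --m_entries[index - 1].refs; }

    // At shutdown, any thunk still holding references is reported as a leak.
    std::vector<std::string> DumpLeaks() const {
        std::vector<std::string> leaks;
        for (const Entry& e : m_entries)
            if (e.refs != 0)
                leaks.push_back("QIThunk - " + std::to_string(e.index) +
                                " LEAK: Refcount = " + std::to_string(e.refs) +
                                " " + e.label);
        return leaks;
    }

private:
    std::vector<Entry> m_entries;
};
```

The workflow mirrors the one described above: run once to learn the index of the leaking thunk from the dump, then set the break index and run again to stop at the moment that thunk is created.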

_ATL_DEBUG_REFCOUNT

Versions of ATL earlier than version 3 used the _ATL_DEBUG_REFCOUNT symbol to track interface reference counts for ATL IXxxImpl classes only. Because _ATL_DEBUG_INTERFACES is much more general, it has replaced _ATL_DEBUG_REFCOUNT. The older symbol is still supported for backward compatibility; defining it simply turns on _ATL_DEBUG_INTERFACES:

#ifdef _ATL_DEBUG_REFCOUNT
#ifndef _ATL_DEBUG_INTERFACES
#define _ATL_DEBUG_INTERFACES
#endif
#endif

Summary

ATL provides a layered approach to implementing IUnknown. The top layer, represented by the CComXxxThreadModel classes, provides helper functions and type definitions for synchronization required of both STAs and MTAs. The second level, CComObjectRootEx, uses the threading model classes to support “just thread-safe enough” AddRef and Release implementations and object-level locking. CComObjectRootEx also provides a table-driven implementation of QueryInterface, using an interface map provided by your class. Your class derives from CComObjectRootEx and any number of interfaces, providing the interface member function implementations. The final level is provided by CComObject and friends, which provide the implementation of QueryInterface, AddRef, and Release based on the lifetime and identity requirements of the object.

To allow each class to define its own lifetime and identity requirements, each class defines its own _CreatorClass, which defines the appropriate Creator. The Creator is responsible for properly creating an instance of your ATL-based class and should be used in place of the C++ operator new.

Finally, to debug your objects, ATL provides a number of debugging facilities, including tracing and interface usage and leak tracking.