Chapter 14. ATL Server Internals

ATL Server provides a robust implementation of an ISAPI extension right out of the box. It manages threading and IIS resources so you don’t have to. You’ve already seen how to use ATL Server in Chapter 13, “Hello, ATL Server”; now let’s take a look under the hood and see how it works.

Implementing ISAPI in ATL Server

The CIsapiExtension class is the heart of ATL’s implementation of the ISAPI interface.

template <class ThreadPoolClass=CThreadPool<CIsapiWorker>,
  class CRequestStatClass=CNoRequestStats,
  class HttpUserErrorTextProvider=CDefaultErrorProvider,
  class WorkerThreadTraits=DefaultThreadTraits,
  class CPageCacheStats=CNoStatClass,
  class CStencilCacheStats=CNoStatClass>
class CIsapiExtension :
  public IServiceProvider,
  public IIsapiExtension,
  public IRequestStats {
protected:
  CIsapiExtension();

  DWORD HttpExtensionProc(LPEXTENSION_CONTROL_BLOCK lpECB) ;
  BOOL GetExtensionVersion(__out HSE_VERSION_INFO* pVer) ;
  BOOL TerminateExtension(DWORD /*dwFlags*/) ;

  // ...
};

As you can see, this class is heavily templated. Three of the template parameters (CRequestStatClass, CPageCacheStats, and CStencilCacheStats) are used for performance tracking and logging. The default template parameters result in no logging or performance counters being used; ATL Server provides other implementations that gather statistics for you, but because that logging can have a significant performance impact, it’s turned off by default.
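To make the "statistics as a template parameter" idea concrete, here is a minimal sketch in portable C++. The names NoStats, CountingStats, and Extension are hypothetical stand-ins invented for this illustration; they are not the ATL Server classes, and the real statistics interfaces have more methods than this.

```cpp
#include <atomic>
#include <string>

// Hypothetical no-op policy: every call is empty, so it costs nothing.
struct NoStats {
    void OnRequestReceived() {}
    long TotalRequests() const { return 0; }
};

// Hypothetical counting policy: thread-safe request counter.
struct CountingStats {
    void OnRequestReceived() { ++count_; }
    long TotalRequests() const { return count_.load(); }
private:
    std::atomic<long> count_{0};
};

// The extension takes its statistics policy as a template parameter
// and defaults to the no-op version.
template <class RequestStats = NoStats>
class Extension {
public:
    void HandleRequest(const std::string& /*url*/) {
        stats_.OnRequestReceived();
    }
    long TotalRequests() const { return stats_.TotalRequests(); }
private:
    RequestStats stats_;
};
```

With this shape, Extension<> pays nothing for statistics, while Extension<CountingStats> opts in to counting without any change to the request-handling code.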

The three CIsapiExtension methods contain the actual implementations of the three ISAPI functions. The GetExtensionVersion method is long but fairly straightforward. Because this is the method called when the ISAPI extension is first loaded, the class does most of its initialization here:

BOOL GetExtensionVersion( HSE_VERSION_INFO* pVer) {
  // allocate a Tls slot for storing per thread data
  m_dwTlsIndex = TlsAlloc();

  // create a private heap for request data
  // this heap has to be thread safe to allow for
  // async processing of requests
  m_hRequestHeap = HeapCreate(0, 0, 0);
  if (!m_hRequestHeap) {
    m_hRequestHeap = GetProcessHeap();
    if (!m_hRequestHeap) {
      return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_HEAPCREATEFAILED);
    }
  }

  // create a private heap (synchronized) for
  // allocations. This reduces fragmentation overhead
  // as opposed to the process heap
  HANDLE hHeap = HeapCreate(0, 0, 0);
  if (!hHeap) {
    hHeap = GetProcessHeap();
    m_heap.Attach(hHeap, false);
  } else {
    m_heap.Attach(hHeap, true);
  }
  hHeap = NULL;

  if (S_OK != m_WorkerThread.Initialize()) {
    return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_WORKERINITFAILED);
  }

  if (m_critSec.Init() != S_OK) {
    HRESULT hrIgnore=m_WorkerThread.Shutdown();
    return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_CRITSECINITFAILED);
  }
  if (S_OK != m_ThreadPool.Initialize(
    static_cast<IIsapiExtension*>(this), GetNumPoolThreads(),
    GetPoolStackSize(), GetIOCompletionHandle())) {
    HRESULT hrIgnore=m_WorkerThread.Shutdown();
    m_critSec.Term();
    return SetCriticalIsapiError(
      IDS_ATLSRV_CRITICAL_THREADPOOLFAILED);
  }

  if (FAILED(m_DllCache.Initialize(&m_WorkerThread,
    GetDllCacheTimeout()))) {
    HRESULT hrIgnore=m_WorkerThread.Shutdown();
    m_ThreadPool.Shutdown();
    m_critSec.Term();
    return SetCriticalIsapiError(
      IDS_ATLSRV_CRITICAL_DLLCACHEFAILED);
  }

  if (FAILED(m_PageCache.Initialize(&m_WorkerThread))) {
    HRESULT hrIgnore=m_WorkerThread.Shutdown();
    m_ThreadPool.Shutdown();
    m_DllCache.Uninitialize();
    m_critSec.Term();
    return SetCriticalIsapiError(
      IDS_ATLSRV_CRITICAL_PAGECACHEFAILED);
  }

  if (S_OK != m_StencilCache.Initialize(
    static_cast<IServiceProvider*>(this),
    &m_WorkerThread,
    GetStencilCacheTimeout(),
    GetStencilLifespan())) {
    HRESULT hrIgnore=m_WorkerThread.Shutdown();
    m_ThreadPool.Shutdown();
    m_DllCache.Uninitialize();
    m_PageCache.Uninitialize();
    m_critSec.Term();
    return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_STENCILCACHEFAILED);
  }

  pVer->dwExtensionVersion = HSE_VERSION;
  Checked::strncpy_s(pVer->lpszExtensionDesc,
    HSE_MAX_EXT_DLL_NAME_LEN, GetExtensionDesc(), _TRUNCATE);
  pVer->lpszExtensionDesc[HSE_MAX_EXT_DLL_NAME_LEN - 1] = '\0';

  return TRUE;
}

This method allocates two Win32 heaps for use during request processing, sets up a thread pool, and initializes various caches. Each step that fails shuts down whatever was already started before reporting a critical error.
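The pattern in that listing, where each later stage has to undo the earlier ones when it fails, can be sketched generically. This is only an illustration under stated assumptions: Stage and InitializeAll are hypothetical names, and the sketch undoes stages in reverse order, which is a common convention rather than the exact hand-written order the ATL Server code uses.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One initialization stage: how to start it and how to undo it.
struct Stage {
    std::string name;
    std::function<bool()> init;
    std::function<void()> shutdown;
};

// Runs the stages in order. If one fails, shuts down the stages that
// already succeeded, most recent first, and reports failure.
// The log records what happened so the behavior is easy to inspect.
bool InitializeAll(const std::vector<Stage>& stages,
                   std::vector<std::string>& log) {
    for (std::size_t i = 0; i < stages.size(); ++i) {
        if (stages[i].init()) {
            log.push_back("init " + stages[i].name);
            continue;
        }
        log.push_back("fail " + stages[i].name);
        for (std::size_t j = i; j-- > 0;) {
            stages[j].shutdown();
            log.push_back("undo " + stages[j].name);
        }
        return false;
    }
    return true;
}
```

Driving the teardown from a table keeps every failure path consistent, at the cost of some indirection compared with the explicit calls in GetExtensionVersion.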

The real action takes place in the HttpExtensionProc method. This is called for every HTTP request that IIS routes to our extension DLL. Before we look at the implementation of this method, we need to look at how to achieve high performance in a server environment.

Performance and Multithreading

Any production web server needs to handle many simultaneous network requests. In the original web extension platform, the Common Gateway Interface (CGI), each request was handled by spawning a new process. This process handled that one request and then exited. This worked acceptably on UNIX for small sites, but process creation overhead soon limited the number of simultaneous requests that could be processed.

This process-creation model was made even worse on Windows, where creating processes is much more expensive. However, there’s a fairly obvious alternative in Win32: use a thread per request instead of a process. Threads are much, much cheaper to start. Unfortunately, the obvious solution is somewhat less obviously wrong in large systems. Threads might be cheap, but they’re not free. As the number of threads increases, the CPU spends more time on thread management and less time actually doing the work of serving your web site.

The solution comes from the stateless nature of HTTP. Because each request is independent, it doesn’t matter which specific thread processes a request. More usefully, when a thread is done processing a request, instead of dying, it can be reused to process another request. This design is called a thread pool.

IIS uses a thread pool internally to handle incoming traffic. Each request is handed off to a thread in the pool. The thread services the request (by either returning static content off the disk or executing the HttpExtensionProc of the appropriate ISAPI extension DLL). In general, this works well, but the thread has to finish its processing quickly. If all the threads in the IIS pool are busy, new requests start getting dropped. Serving static content is a low-overhead process. But when you start executing arbitrary code (to generate dynamic HTML, for example), suddenly the time it takes for the thread to return to the pool is much less predictable, and it could be much longer.

So, we need to return the IIS thread back to the pool as soon as possible. But we also need to actually perform our processing to handle the request. Instead of forcing every developer to micro-optimize every statement of the ISAPI extension to get the thread back to the pool, ATL Server provides its own thread pool. On a request, the HttpExtensionProc (which is running on the IIS thread) places the request into the internal thread pool. The IIS thread then returns, ready to process another request. The code follows:

DWORD HttpExtensionProc(LPEXTENSION_CONTROL_BLOCK lpECB) {
  AtlServerRequest *pRequestInfo = NULL;
  _ATLTRY {
    pRequestInfo = CreateRequest();
    if (pRequestInfo == NULL)
      return HSE_STATUS_ERROR;

    CServerContext *pServerContext = NULL;
    ATLTRY(pServerContext = CreateServerContext(m_hRequestHeap));
    if (pServerContext == NULL) {
      FreeRequest(pRequestInfo);
      return HSE_STATUS_ERROR;
    }

    pServerContext->Initialize(lpECB);
    pServerContext->AddRef();

    pRequestInfo->pServerContext = pServerContext;
    pRequestInfo->dwRequestType = ATLSRV_REQUEST_UNKNOWN;
    pRequestInfo->dwRequestState = ATLSRV_STATE_BEGIN;
    pRequestInfo->pExtension =
      static_cast<IIsapiExtension *>(this);
    pRequestInfo->pDllCache =
      static_cast<IDllCache *>(&m_DllCache);
#ifndef ATL_NO_MMSYS
    pRequestInfo->dwStartTicks = timeGetTime();
#else
    pRequestInfo->dwStartTicks = GetTickCount();
#endif
    pRequestInfo->pECB = lpECB;

    m_reqStats.OnRequestReceived();

    if (m_ThreadPool.QueueRequest(pRequestInfo))
      return HSE_STATUS_PENDING;

    if (pRequestInfo != NULL) {
      FreeRequest(pRequestInfo);
    }
  }
  _ATLCATCHALL() { }
  return HSE_STATUS_ERROR;
}

The CreateRequest method simply allocates a chunk of memory from the request heap to store the information about the request:

struct AtlServerRequest {
  // For future compatibility
  DWORD cbSize;

  // Necessary because it wraps the ECB
  IHttpServerContext *pServerContext;

  // Indicates whether it was called through an .srf file or
  // through a .dll file
  ATLSRV_REQUESTTYPE dwRequestType;
  // Indicates what state of completion the request is in
  ATLSRV_STATE dwRequestState;
  // Necessary because the callback (for async calls) must
  // know where to route the request
  IRequestHandler *pHandler;
  // Necessary in order to release the dll properly
  // (for async calls)
  HINSTANCE hInstDll;
  // Necessary to requeue the request (for async calls)
  IIsapiExtension *pExtension;
  // Necessary to release the dll in async callback
  IDllCache* pDllCache;

  HANDLE hFile;
  HCACHEITEM hEntry;
  IFileCache* pFileCache;

  // necessary to synchronize calls to HandleRequest
  // if HandleRequest could potentially make an
  // async call before returning. only used
  // if indicated with ATLSRV_INIT_USEASYNC_EX
  HANDLE m_hMutex;
  // Tick count when the request was received
  DWORD dwStartTicks;
  EXTENSION_CONTROL_BLOCK *pECB;
  PFnHandleRequest pfnHandleRequest;
  PFnAsyncComplete pfnAsyncComplete;
  // buffer to be flushed asynchronously
  LPCSTR pszBuffer;
  // length of data in pszBuffer
  DWORD dwBufferLen;
  // value that can be used to pass user data between
  // parent and child handlers
  void* pUserData;
};

AtlServerRequest *CreateRequest() {
    // Allocate a fixed block size to avoid fragmentation
    AtlServerRequest *pRequest = (AtlServerRequest *) HeapAlloc(
      m_hRequestHeap, HEAP_ZERO_MEMORY,
      __max(sizeof(AtlServerRequest),
        sizeof(_CComObjectHeapNoLock<CServerContext>)));
    if (!pRequest) return NULL;

    pRequest->cbSize = sizeof(AtlServerRequest);
    return pRequest;
}

As you can see, there’s all the information that IIS supplies about the request (the ECB pointer), plus a whole lot more.
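CreateRequest always allocates a block large enough for either of two types, so every block on the request heap has the same size. A minimal portable sketch of that idea follows; RequestRecord, ContextRecord, CreateRecord, and FreeRecord are hypothetical names, and std::calloc stands in for HeapAlloc with HEAP_ZERO_MEMORY.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical record types standing in for AtlServerRequest and the
// server context object; the real ones are much larger.
struct RequestRecord {
    unsigned long cbSize;
    void* context;
    unsigned long startTicks;
};
struct ContextRecord {
    char scratch[64];
};

// Every block handed out has the same size: the larger of the two.
constexpr std::size_t kBlockSize =
    std::max(sizeof(RequestRecord), sizeof(ContextRecord));

// Allocates one zero-filled block and stamps the record size into it,
// much as CreateRequest does with cbSize. Returns nullptr on failure.
RequestRecord* CreateRecord() {
    void* block = std::calloc(1, kBlockSize);
    if (!block) return nullptr;
    RequestRecord* record = new (block) RequestRecord{};
    record->cbSize = sizeof(RequestRecord);
    return record;
}

void FreeRecord(RequestRecord* record) { std::free(record); }
```

Because every allocation is the same size, a freed block can always satisfy the next request, which is the fragmentation argument made in the listing's comment.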

The ATL Server Thread Pool

ATL Server provides a thread pool implementation in the CThreadPool class:

template <class Worker,
  class ThreadTraits=DefaultThreadTraits,
  class WaitTraits=DefaultWaitTraits>
class CThreadPool : public IThreadPoolConfig {
    // ...
};

The template parameters give you control over how threads are created and what they do. The Worker template parameter lets you specify what class will actually do the processing of the request. The ThreadTraits class controls how a thread is created. Depending on the ATL_MIN_CRT setting, DefaultThreadTraits is a typedef to one of two other classes:

class CRTThreadTraits {
public:
  static HANDLE CreateThread(LPSECURITY_ATTRIBUTES lpsa,
      DWORD dwStackSize, LPTHREAD_START_ROUTINE pfnThreadProc,
      void *pvParam, DWORD dwCreationFlags, DWORD *pdwThreadId) {
    // _beginthreadex calls CreateThread
    // which will set the last error value
    // before it returns.
    return (HANDLE) _beginthreadex(lpsa, dwStackSize,
      (unsigned int (__stdcall *)(void *)) pfnThreadProc,
      pvParam, dwCreationFlags, (unsigned int *) pdwThreadId);
  }
};

class Win32ThreadTraits {
public:
  static HANDLE CreateThread(LPSECURITY_ATTRIBUTES lpsa,
      DWORD dwStackSize, LPTHREAD_START_ROUTINE pfnThreadProc,
      void *pvParam, DWORD dwCreationFlags, DWORD *pdwThreadId) {
    return ::CreateThread(lpsa, dwStackSize, pfnThreadProc,
      pvParam, dwCreationFlags, pdwThreadId);
  }
};

#if !defined(_ATL_MIN_CRT) && defined(_MT)
    typedef CRTThreadTraits DefaultThreadTraits;
#else
    typedef Win32ThreadTraits DefaultThreadTraits;
#endif

As part of initialization, the CThreadPool class uses the ThreadTraits class to create the initial set of threads. The threads in the pool all run this thread proc:

DWORD ThreadProc() {
  DWORD dwBytesTransfered;
  ULONG_PTR dwCompletionKey;

  OVERLAPPED* pOverlapped;

  // this block is to ensure theWorker gets destructed before the
  // thread handle is closed
  {
    // We instantiate an instance of the worker class on the
    // stack for the life time of the thread.
    Worker theWorker;
    if (theWorker.Initialize(m_pvWorkerParam) == FALSE) {
      return 1;
    }

    SetEvent(m_hThreadEvent);
    // Get the request from the IO completion port
    while (GetQueuedCompletionStatus(m_hRequestQueue,
      &dwBytesTransfered, &dwCompletionKey, &pOverlapped,
      INFINITE)) {
      if (pOverlapped == ATLS_POOL_SHUTDOWN) { // Shut down
        LONG bResult = InterlockedExchange(&m_bShutdown, FALSE);
        if (bResult) // Shutdown has not been cancelled
          break;

        // else, shutdown has been cancelled; continue as before
      }
      else {
        // Do work
        Worker::RequestType request =
          (Worker::RequestType) dwCompletionKey;

        // Process the request. Notice the following:
        // (1) It is the worker's responsibility to free any
        // memory associated with the request if the request is
        // complete
        // (2) If the request still requires some more processing
        // the worker should queue the request again for
        // dispatching
        theWorker.Execute(request, m_pvWorkerParam, pOverlapped);
      }
    }

    theWorker.Terminate(m_pvWorkerParam);
  }

  m_dwThreadEventId = GetCurrentThreadId();
  SetEvent(m_hThreadEvent);

  return 0;
}

The overall logic is fairly common in a thread pool. The thread sits waiting on the I/O completion port for requests to come in. A special value is used to tell the thread to shut down; if it’s not a shutdown, the request is passed off to the worker object to do the actual work.

The worker class can be anything with a RequestType typedef and the appropriate Execute method.
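To show the shape of the pool and worker without any Win32 dependencies, here is a rough sketch in standard C++. It is not CThreadPool: a mutex-protected queue with a condition variable stands in for the I/O completion port, a flag on each queued item stands in for ATLS_POOL_SHUTDOWN, and SimplePool, SummingWorker, and SharedTotal are hypothetical names.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Shared state that the worker updates; passed as the worker parameter.
struct SharedTotal {
    std::mutex guard;
    long total = 0;
};

// A worker in the shape the text describes: a RequestType typedef and
// an Execute method. Here a request is just a number to add up.
class SummingWorker {
public:
    typedef int RequestType;
    void Execute(RequestType request, void* param) {
        SharedTotal* shared = static_cast<SharedTotal*>(param);
        std::lock_guard<std::mutex> lock(shared->guard);
        shared->total += request;
    }
};

template <class Worker>
class SimplePool {
public:
    SimplePool(int threadCount, void* workerParam) : param_(workerParam) {
        for (int i = 0; i < threadCount; ++i)
            threads_.emplace_back([this] { ThreadProc(); });
    }
    ~SimplePool() { Shutdown(); }

    void QueueRequest(typename Worker::RequestType request) {
        Push(Item{false, request});
    }

    // Queues one shutdown marker per thread behind any pending
    // requests, then waits for every thread to exit.
    void Shutdown() {
        for (std::size_t i = 0; i < threads_.size(); ++i)
            Push(Item{true, typename Worker::RequestType()});
        for (std::thread& t : threads_) t.join();
        threads_.clear();
    }

private:
    struct Item {
        bool shutdown;
        typename Worker::RequestType request;
    };

    void Push(const Item& item) {
        {
            std::lock_guard<std::mutex> lock(guard_);
            queue_.push_back(item);
        }
        ready_.notify_one();
    }

    void ThreadProc() {
        Worker worker;  // one worker per thread, for the thread's lifetime
        for (;;) {
            std::unique_lock<std::mutex> lock(guard_);
            ready_.wait(lock, [this] { return !queue_.empty(); });
            Item item = queue_.front();
            queue_.pop_front();
            lock.unlock();
            if (item.shutdown) break;  // the special "stop" value
            worker.Execute(item.request, param_);
        }
    }

    void* param_;
    std::mutex guard_;
    std::condition_variable ready_;
    std::deque<Item> queue_;
    std::vector<std::thread> threads_;
};
```

A caller would construct SimplePool<SummingWorker> with a thread count and a pointer to a SharedTotal, call QueueRequest for each piece of work, and then call Shutdown. Because the queue is first-in, first-out, the shutdown markers are only reached after every earlier request has been picked up.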

At this point, ATL Server has already provided a greatly improved ISAPI development experience. The hard work to maintain the performance of the server has been done; all you need to do is write a worker class and implement your logic in the Execute method. This still leaves you with the job of generating the HTML to send to the client. This isn’t too hard in C++, [1] but it is tedious, and building HTML in code means that you have to recompile to change a spelling error. What’s really needed is some way to generate the HTML based on a template. ATL Server does this via Server Response Files.

Server Response Files

ATL Server provides a text-replacement system called Server Response Files (referred to in the ATL Server code and documentation occasionally as Stencil Files). An .srf file is an HTML file with some replacement markers. Consider this example, which is used to display the lyrics to a classic song:

<html>
{{handler Beverage.dll/Default}}
<head>
  <title>The Beverage Song</title>
</head>
<body>
{{if InputValid}}
{{while MoreDrinks}}
<p/>
{{DrinkNumber}} bottles of {{Beverage}} on the wall, <br />
{{DrinkNumber}} bottles of {{Beverage}}.<br />
Take one down, pass it around,<br />
{{NextDrink}} bottles of {{Beverage}} on the wall.<br />
{{endwhile}}
{{else}}
<h1>You must specify a beverage and the number of them
 in the query string.</h1>
{{endif}}
</body>
</html>

As we discussed in Chapter 13, the items within {{ }} are used for one of three purposes. They can be directives to the stencil processor (the handler directive), they can work as flow control (if and while), or they can be replaced at runtime by the request handler class. Any text outside the markers is simply passed straight to the output.

The actual replacements are handled by a class referred to as a request handler. An example request handler for the song follows:

class CBeverageHandler
  : public CRequestHandlerT<CBeverageHandler> {

public:
  BEGIN_REPLACEMENT_METHOD_MAP(CBeverageHandler)
    REPLACEMENT_METHOD_ENTRY("InputValid", OnInputValid)
    REPLACEMENT_METHOD_ENTRY("MoreDrinks", OnMoreDrinks)
    REPLACEMENT_METHOD_ENTRY("DrinkNumber", OnDrinkNumber)
    REPLACEMENT_METHOD_ENTRY("Beverage", OnBeverage)
    REPLACEMENT_METHOD_ENTRY("NextDrink", OnNextDrink)
  END_REPLACEMENT_METHOD_MAP()

  HTTP_CODE ValidateAndExchange() {
    m_numDrinks = 0;
    m_HttpRequest.GetQueryParams().Exchange( "numdrinks",
      &m_numDrinks );
    m_beverage =
      m_HttpRequest.GetQueryParams().Lookup("beverage");

    m_HttpResponse.SetContentType("text/html");

    return HTTP_SUCCESS;
  }

protected:
  HTTP_CODE OnInputValid( ) {
    if( m_numDrinks == 0 || m_beverage.IsEmpty() ) {
      return HTTP_S_FALSE;
    }
    return HTTP_SUCCESS;
  }

  HTTP_CODE OnMoreDrinks( ) {
    if( m_numDrinks > 0 ) {
      return HTTP_SUCCESS;
    }
    return HTTP_S_FALSE;
  }

  HTTP_CODE OnDrinkNumber( ) {
    m_HttpResponse << m_numDrinks;
    return HTTP_SUCCESS;
  }

  HTTP_CODE OnBeverage( ) {
    m_HttpResponse << m_beverage;
    return HTTP_SUCCESS;
  }

  HTTP_CODE OnNextDrink( ) {
    --m_numDrinks;
    if( m_numDrinks > 0 ) {
      m_HttpResponse << m_numDrinks;
    } else {
      m_HttpResponse << "No more";
    }
    return HTTP_SUCCESS;
  }

private:
  long m_numDrinks;
  CStringA m_beverage;
};

Request handlers inherit from the CRequestHandlerT base class. A request handler needs to implement the ValidateAndExchange method, which gets called at the start of processing the HTTP request. In processing a form post, this is where you would process the submitted form fields. If this function returns HTTP_FAIL, the request is aborted and IIS sends back an HTTP 500 error to the client.

If, as you would usually prefer, ValidateAndExchange returns HTTP_SUCCESS, the stencil processor starts rendering the SRF file. Each time a replacement occurs, the processor calls back into the request-handler object.

The REPLACEMENT_METHOD_MAP() macros in the request-handler class are used to specify which methods should be called for which replacement. In the previous code, this line says that when the {{Beverage}} replacement is found in the .srf file, the OnBeverage method should be called:

REPLACEMENT_METHOD_ENTRY("Beverage", OnBeverage)
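One way to picture what such a map boils down to is a table of tag names paired with pointers to member functions. The sketch below is an assumption made for illustration, not the actual expansion of the ATL Server macros; SongHandler, Entry, and Replace are hypothetical names, and a plain std::ostream stands in for the response object.

```cpp
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical handler; the table inside Replace plays the role that
// the replacement method map plays in a real request handler.
class SongHandler {
public:
    typedef bool (SongHandler::*Method)(std::ostream&);
    struct Entry {
        const char* name;
        Method method;
    };

    SongHandler(long drinks, std::string beverage)
        : drinks_(drinks), beverage_(std::move(beverage)) {}

    // Looks up the tag name and calls the matching method.
    // Returns false if the tag is unknown.
    bool Replace(const char* tag, std::ostream& out) {
        static const Entry map[] = {
            {"DrinkNumber", &SongHandler::OnDrinkNumber},
            {"Beverage", &SongHandler::OnBeverage},
        };
        for (const Entry& entry : map) {
            if (std::strcmp(entry.name, tag) == 0)
                return (this->*entry.method)(out);
        }
        return false;
    }

private:
    bool OnDrinkNumber(std::ostream& out) { out << drinks_; return true; }
    bool OnBeverage(std::ostream& out) { out << beverage_; return true; }

    long drinks_;
    std::string beverage_;
};
```

Given a SongHandler constructed with 99 and "root beer", calling Replace("DrinkNumber", out) writes 99 to the stream, and an unknown tag name simply returns false.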

Actually generating the output is fairly simple using the m_HttpResponse member, which is inherited from the CRequestHandlerT base class. This is an instance of the CHttpResponse class, already initialized and ready to use. Figure 14.1 shows the result of this page running.

Figure 14.1. Some tasty beverages to sing about


Request-Handler Routing

How does the stencil processor know which request-handler class to use? In the .srf file itself, you might have noticed this line:

{{handler Beverage.dll/Default}}

The handler directive says which DLL the handler is in (Beverage.dll, in this case) and what the name of the handler is (Default). This might seem strange because the name of our handler class isn’t Default; it’s CBeverageHandler. ATL Server isn’t reading anybody’s mind here. Instead, a global map in the request-handler DLL provides the mapping between the name you use in the handler directive and the actual class. If you look in your request handler project’s .cpp file, you’ll see something like this at global scope:

// Beverage.cpp
...
BEGIN_HANDLER_MAP()
  HANDLER_ENTRY("Default", CBeverageHandler)
END_HANDLER_MAP()

This is one way to get your handler into the map: Simply add a new HANDLER_ENTRY macro to the map every time you add a new request-handler class. However, this global map is difficult to maintain over time. It sure would be nice to have the handler name with the class that handles it.

Much like the COM_OBJECT_ENTRY_AUTO macro for ATL COM classes, there’s a macro that you can put in your .h file instead: DECLARE_REQUEST_HANDLER. You use it like this:

class CBeverageHandler : ... { ... };

DECLARE_REQUEST_HANDLER( "Default", CBeverageHandler,
  ::CBeverageHandler )

This macro uses similar linker tricks to the COM_OBJECT_ENTRY_AUTO macro to stitch together the tables at link time. The default project generated by the ATL Server project template uses HANDLER_ENTRY; for your own request-handler classes, I would recommend using DECLARE_REQUEST_HANDLER instead. Unfortunately, DECLARE_REQUEST_HANDLER is undocumented at this time. The parameters to the macro are, in order, the handler name, the name of the request-handler class without any namespaces, and the name of the request handler including the namespaces.
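The idea of keeping the handler name next to the class can be sketched in portable C++ with self-registering objects. Note the swap in technique: this sketch registers factories at runtime through static constructors, whereas the text says the real macro stitches its table together at link time. Handler, Registry, Registrar, REGISTER_HANDLER, and CreateHandler are all hypothetical names.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Handler {
    virtual ~Handler() {}
    virtual std::string Name() const = 0;
};

typedef std::function<std::unique_ptr<Handler>()> Factory;

// Name-to-factory table. A function-local static avoids depending on
// the initialization order of globals in different source files.
inline std::map<std::string, Factory>& Registry() {
    static std::map<std::string, Factory> table;
    return table;
}

// Constructing one of these at namespace scope adds an entry.
struct Registrar {
    Registrar(const std::string& name, Factory make) {
        Registry()[name] = std::move(make);
    }
};

// Hypothetical macro, used next to the class much as the text
// describes DECLARE_REQUEST_HANDLER being used.
#define REGISTER_HANDLER(name, cls)                                  \
    static Registrar registrar_##cls(                                \
        name, [] { return std::unique_ptr<Handler>(new cls); });

class BeverageHandler : public Handler {
public:
    std::string Name() const override { return "BeverageHandler"; }
};
REGISTER_HANDLER("Default", BeverageHandler)

// Creates the handler registered under `name`, or returns nullptr.
std::unique_ptr<Handler> CreateHandler(const std::string& name) {
    std::map<std::string, Factory>::iterator it = Registry().find(name);
    if (it == Registry().end()) return nullptr;
    return it->second();
}
```

CreateHandler("Default") then yields a BeverageHandler even though nothing outside the class's own file mentions that class by name, which is the maintenance benefit the text is after.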

Now that you’ve seen the various pieces, let’s look at the .srf-processing pipeline. The first stop for the HTTP request is IIS. IIS checks its configuration and finds that, for this virtual directory, it should route the request to our ATL Server ISAPI Extension DLL.

So IIS loads (on the first request) the extension DLL and calls the HttpExtensionProc method. This immediately calls into the global instance of CIsapiExtension.

CIsapiExtension takes the request, builds a CServerContext object, places the request onto its internal thread pool, and releases the IIS thread back to handle another incoming request.

Meanwhile, the extension DLL’s thread-pool threads are hungrily waiting for work to come in. The first one available pulls the request off the internal queue and hands it to the worker class (which is, by default, CIsapiWorker).

The actual work is done in the Execute() method:

void CIsapiWorker::Execute(AtlServerRequest *pRequestInfo,
  void *pvParam, OVERLAPPED *pOverlapped) {
  _ATLTRY {
    (static_cast<IIsapiExtension*>(pvParam))->
      DispatchStencilCall(pRequestInfo);
  } _ATLCATCHALL() {
    ATLASSERT(FALSE);
  }
}

A pointer to the CIsapiExtension object is passed in via the pvParam parameter. The worker object then turns around and calls back into the CIsapiExtension via the DispatchStencilCall method. Why go back to the CIsapiExtension instead of doing the work within the worker class? The following chunk of the DispatchStencilCall method reveals the answer:

BOOL DispatchStencilCall(AtlServerRequest *pRequestInfo) {
  ...
      HTTP_CODE hcErr = HTTP_SUCCESS;
      if (pRequestInfo->dwRequestState == ATLSRV_STATE_BEGIN) {
        BOOL bAllowCaching = TRUE;
        if (TransmitFromCache(pRequestInfo, &bAllowCaching)) {
          return TRUE;
        }
      ...
      }
  ...
}

The results of processing the SRF file are stored in a cache and are regenerated only when needed. The cache is stored in the ISAPI extension object so that it is available to all the worker threads.
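A shared cache of this kind can be sketched as a map guarded by a mutex, owned by one object that every worker thread can reach. This is a simplified illustration only: PageCache and Get are hypothetical names, and the sketch ignores the expiry and timeouts that the real caches are initialized with.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Hypothetical page cache shared by all worker threads.
class PageCache {
public:
    // Returns the cached page for `key`, rendering and storing it
    // first if it is not there yet. If `rendered` is non-null, it
    // reports whether the render callback had to run.
    std::string Get(const std::string& key,
                    const std::function<std::string()>& render,
                    bool* rendered = nullptr) {
        std::lock_guard<std::mutex> lock(guard_);
        std::map<std::string, std::string>::iterator it = pages_.find(key);
        if (rendered) *rendered = (it == pages_.end());
        if (it == pages_.end())
            it = pages_.emplace(key, render()).first;
        return it->second;
    }

private:
    std::mutex guard_;
    std::map<std::string, std::string> pages_;
};
```

Holding the lock while rendering keeps the sketch short but serializes all cache misses; a production cache would likely render outside the lock.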

The DispatchStencilCall method takes care of the details of the various states in which a request can be. The request eventually ends up at a new instance of your request-handler object, and that’s where we go next.

Request Handlers

All request handlers derive from the CRequestHandlerT template:

template < class THandler,
           class ThreadModel=CComSingleThreadModel,
           class TagReplacerType=CHtmlTagReplacer<THandler> >
class CRequestHandlerT :
    public TagReplacerType,
    public CComObjectRootEx<ThreadModel>,
    public IRequestHandlerImpl<THandler> {
public:
    // public CRequestHandlerT members
    CHttpResponse m_HttpResponse;
    CHttpRequest m_HttpRequest;
    ATLSRV_REQUESTTYPE m_dwRequestType;
    AtlServerRequest* m_pRequestInfo;

    CRequestHandlerT() ;
    ~CRequestHandlerT() ;

    void ClearResponse() ;

    // Where user initialization should take place
    HTTP_CODE ValidateAndExchange();

    // Where user Uninitialization should take place
    HTTP_CODE Uninitialize(HTTP_CODE hcError);

    // HandleRequest is called to perform default processing
    // of HTTP requests. Users can override this function in
    // their derived classes if they need to perform specific
    // initialization prior to processing this request or
    // want to change the way the request is processed.
    HTTP_CODE HandleRequest(
        AtlServerRequest *pRequestInfo,
        IServiceProvider* /*pServiceProvider*/);

    HTTP_CODE ServerTransferRequest(LPCSTR szRequest,
        bool bContinueAfterTransfer=false,
        WORD nCodePage = 0, CStencilState *pState = NULL);

    ...
}

The CRequestHandlerT class provides the m_HttpRequest object as a way of accessing the request data, and the m_HttpResponse object that is used to build the response to go back to the client. The previous code block shows some of the more useful methods of this class. Some, such as ServerTransferRequest, are available for you to call from your request handler. Others, such as ValidateAndExchange, exist to be overridden in your derived class.

The actual processing of the stencil file is handled via the TagReplacerType template parameter, which defaults to CHtmlTagReplacer. This class is itself a template:

template <class THandler, class StencilType=CHtmlStencil>
class CHtmlTagReplacer :
    public ITagReplacerImpl<THandler>
{ ... }

There’s also a second layer of templates here. The CHtmlTagReplacer actually exists to manage the stencil cache. For each .srf file, a stencil object is created the first time. The .srf file is then parsed into a series of StencilToken objects, which are stored in an array in the stencil object. Rendering the HTML is done by walking the array and rendering each token. That stencil object is then stored in the cache for later use. This way the parsing is done only once.
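The parse-once, render-many idea can be sketched in a few lines of portable C++. This is a toy under stated assumptions, not the ATL Server parser: Token, Parse, and Render are hypothetical names, and the sketch knows only literal text and simple {{name}} replacements, with no directives or flow control.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical token: either literal text or the name of a replacement.
struct Token {
    bool isReplacement;
    std::string text;
};

// Splits the source into literal and {{name}} tokens. An unterminated
// "{{" is kept as literal text.
std::vector<Token> Parse(const std::string& source) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < source.size()) {
        std::size_t open = source.find("{{", pos);
        std::size_t close = (open == std::string::npos)
            ? std::string::npos
            : source.find("}}", open + 2);
        if (close == std::string::npos) {
            tokens.push_back({false, source.substr(pos)});
            break;
        }
        if (open > pos)
            tokens.push_back({false, source.substr(pos, open - pos)});
        tokens.push_back(
            {true, source.substr(open + 2, close - open - 2)});
        pos = close + 2;
    }
    return tokens;
}

// Renders by walking the token array; `replace` supplies tag values.
std::string Render(
    const std::vector<Token>& tokens,
    const std::function<std::string(const std::string&)>& replace) {
    std::string out;
    for (const Token& token : tokens)
        out += token.isReplacement ? replace(token.text) : token.text;
    return out;
}
```

The token vector returned by Parse is the piece worth caching: Render can be called on it for every request without touching the original text again.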

By default, the type of stencil object created is CHtmlStencil. This class knows about all the replacement tags that can occur in .srf files. However, it is a template parameter and, as such, can be overridden to add new replacement tags. This is your opportunity to customize the stencil replacement system: Create a new stencil class (which should derive from CStencil) and override the parsing methods to add new tags to the processing.

An Example Request Handler

Let’s see how this comes together. Here’s an example .srf file that’s part of a simple online forum [2] system, to provide a list of forums available:

<html>
{{handler SimpleForums.dll/ForumList}}
<head>
    <title>Forums</title>
</head>
<body>
<h1>ATL Server Simple Forums</h1>
<p>There are {{NumForums}} forums on this system.</p>
{{while MoreForums}}
    <h2><a href="{{LinkToForum}}">{{ForumName}}</a></h2>
    <p>{{ForumDescription}}</p>
    <p><a href="{{LinkToEditForum}}">Edit Forum Settings</a></p>
    <br />
{{NextForum}}
{{endwhile}}
</body>
</html>

This file uses not only the {{handler}} directive, but also textual replacements and the {{while}} loop.

So, we need a forum list handler. The handler class looks like this: [3]

class ForumListHandler :
  public CRequestHandlerT<ForumListHandler> {
public:
  ForumListHandler(void);
public:
  virtual ~ForumListHandler(void);

public:
BEGIN_REPLACEMENT_METHOD_MAP(ForumListHandler)
  REPLACEMENT_METHOD_ENTRY("NumForums", OnNumForums)
  REPLACEMENT_METHOD_ENTRY("MoreForums", OnMoreForums)
  REPLACEMENT_METHOD_ENTRY("NextForum", OnNextForum)
  REPLACEMENT_METHOD_ENTRY("ForumName", OnForumName)
  REPLACEMENT_METHOD_ENTRY("ForumDescription",
    OnForumDescription)
  REPLACEMENT_METHOD_ENTRY("LinkToForum", OnLinkToForum)
  REPLACEMENT_METHOD_ENTRY("LinkToEditForum",
    OnLinkToEditForum)
END_REPLACEMENT_METHOD_MAP()

  HTTP_CODE ValidateAndExchange();

private:

  HTTP_CODE OnNumForums( );
  HTTP_CODE OnMoreForums( );
  HTTP_CODE OnNextForum( );
  HTTP_CODE OnForumName( );
  HTTP_CODE OnLinkToForum( );
  HTTP_CODE OnLinkToEditForum( );
  HTTP_CODE OnForumDescription( );

private:

  ForumList m_forums;
  CComPtr< _Recordset > m_forumsRecordSet;
};

The action starts for this class in the ValidateAndExchange method, which is called at the start of processing after the m_HttpRequest variable has been created.

#define AS_HR(ex) { \
  HRESULT _hr = ex; if(FAILED(_hr)) { return HTTP_FAIL; } }
HTTP_CODE ForumListHandler::ValidateAndExchange() {
    // Set the content-type
    m_HttpResponse.SetContentType("text/html");

    AS_HR( m_forums.Open( ) );
    AS_HR( m_forums.ReadAllForums( &m_forumsRecordSet ) );

    return HTTP_SUCCESS;
}

The return value, HTTP_CODE, is used to signal what HTTP return code to send back to the client. If this function returns HTTP_SUCCESS, the processing continues. On the other hand, if something is wrong, you can return a different value (such as HTTP_FAIL) to abort the processing and send an HTTP failure code back to the browser.

The HTTP_CODE type is actually a typedef for a DWORD, and it packs multiple data items into those 32 bits (much like HRESULT does). The high 16 bits contain the HTTP status code that should be returned. The lower 16 bits specify a code to tell IIS what to do with the rest of the request. Take a look at MSDN for the set of predefined HTTP_CODE macros.

In this example, we use the data layer object in the m_forums variable to go out to our forums database and read the list of forums. Assuming that this worked, [4] we store the list (an ADO recordset) as a member variable.

The replacement functions come in two varieties: textual replacement and flow control. The OnForumName method is an example of the former. When the {{ForumName}} token is found in the SRF file, this code is run:

HTTP_CODE ForumListHandler::OnForumName( ) {
    CComBSTR name;
    AS_HR( m_forums.GetCurrentForumName( m_forumsRecordSet,
        &name ) );
    m_HttpResponse << CW2A( name );
    return HTTP_SUCCESS;
}

Here, the m_HttpResponse member is used like a C++ stream class to output the name of the current forum. The CW2A conversion class is used because our data layer is returning Unicode, but the SRF file defaults to 8-bit characters.

The flow-control tokens use the same replacement map but work very differently. Within the replacement method, the return value is the important thing:

HTTP_CODE ForumListHandler::OnMoreForums( ) {
    VARIANT_BOOL endOfRecordSet;
    AS_HR( m_forumsRecordSet->get_adoEOF( &endOfRecordSet ) );
    if( endOfRecordSet == VARIANT_TRUE ) {
        return HTTP_S_FALSE;
    }
    return HTTP_SUCCESS;
}

Here, we’re checking to see if we have any more records in our recordset. If so, we return HTTP_SUCCESS. If not, we return HTTP_S_FALSE. Much as S_FALSE is the “Succeeded, but false” HRESULT, HTTP_S_FALSE signals the stencil processor that the Boolean expression being evaluated is false, but the processing completed. In this case, the false return value causes the while loop to exit.
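The three possible outcomes of a flow-control method (true, false, or failure) drive the loop roughly as in the following sketch. It is a simplified model, not the stencil processor's code: Code and RenderWhile are hypothetical names, and the enum values merely mirror the roles of HTTP_SUCCESS, HTTP_S_FALSE, and HTTP_FAIL.

```cpp
#include <functional>
#include <string>

// Hypothetical result codes mirroring the three outcomes in the text.
enum class Code { Success, False, Fail };

// Renders a while block: call `condition` before each pass, run
// `body` while it reports Success, stop cleanly on False, and abort
// on Fail. Returns Success unless something failed.
Code RenderWhile(const std::function<Code()>& condition,
                 const std::function<Code(std::string&)>& body,
                 std::string& out) {
    for (;;) {
        Code more = condition();
        if (more == Code::False) return Code::Success;  // loop finished
        if (more == Code::Fail) return Code::Fail;      // real error
        if (body(out) == Code::Fail) return Code::Fail;
    }
}
```

The key point is that a False result from the condition ends the loop but still counts as success for the page as a whole, which is why a separate "succeeded, but false" code is needed.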

Handling Input

Let’s get a little further into our example and look at how to process input. Consider this HTML form used to create or edit a forum in our system:

<html>
{{handler SimpleForums.dll/EditForum}}
  <head>
    <title>Edit Forum</title>
  </head>
  <body>
    <h1>Edit Forum Information</h1>
    {{if ValidForumId}}
    <form action="editforum.srf?forumid={{ForumId}}"
     method="post">
      <table border="0" cellpadding="0">
        <tr>
          <td>
            Forum Name:
          </td>
          <td>
            <input type="text" name="forumName" id="forumName"
            maxlength="63" value="{{ForumName}}" />
          </td>
        </tr>
        <tr>
          <td>
            Forum Description:
          </td>
          <td>
            <textarea cols="50" rows="10" wrap="soft"
             name="forumDescription" id="forumDescription">
              {{ForumDescription}}
            </textarea>
          </td>
        </tr>
      </table>
      <input type="submit" />
      <a href="forumlist.srf">Return to Forum List</a>
    </form>
    {{else}}
    <p><b>You have given an invalid forum ID.
      Shame on you!</b></p>
    {{endif}}
  </body>
</html>

Here we’re using both of the standard ways to do input in HTML: The browser query string contains the forum ID that we’re editing, and the post variables contain the new name and description. ATL Server provides access to both of these via the m_HttpRequest object. This object is of the class CHttpRequest and provides a variety of ways to get access to server, query string, and form variables:

class CHttpRequest : public IHttpRequestLookup {
public:

  // Access to Query String parameters as a collection
  const CHttpRequestParams& GetQueryParams() const;

  // Access to Query String parameters via an iterator
  POSITION GetFirstQueryParam(LPCSTR *ppszName,
      LPCSTR *ppszValue);
  POSITION GetNextQueryParam(POSITION pos,
      LPCSTR *ppszName, LPCSTR *ppszValue);

  // Get the entire raw query string
  LPCSTR GetQueryString();

  // Access to form variables as a collection
  const CHttpRequestParams& GetFormVars() const;

  // Access to form variables via an iterator
  POSITION GetFirstFormVar(LPCSTR *ppszName,
      LPCSTR *ppszValue);
  POSITION GetNextFormVar(POSITION pos,
      LPCSTR *ppszName, LPCSTR *ppszValue);

  // Access to uploaded files
  POSITION GetFirstFile(LPCSTR *ppszName,
      IHttpFile **ppFile);
  POSITION GetNextFile(POSITION pos,
      LPCSTR *ppszName, IHttpFile **ppFile);

  // Get all cookies as a string
  BOOL GetCookies(LPSTR szBuf,LPDWORD pdwSize);
  BOOL GetCookies(CStringA& strBuff);

  // Get a single cookie by name
  const CCookie& Cookies(LPCSTR szName);

  // Access cookies via an iterator
  POSITION GetFirstCookie(LPCSTR *ppszName,
      const CCookie **ppCookie);
  POSITION GetNextCookie(POSITION pos,
      LPCSTR *ppszName, const CCookie **ppCookie);

  // Get the session cookie
  const CCookie& GetSessionCookie();

  // Get the HTTP method used for this request
  LPCSTR GetMethodString();
  HTTP_METHOD GetMethod();

  // Access to various server variables and HTTP Headers
  LPCSTR GetContentType();

  BOOL GetAuthUserName(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetAuthUserName(CStringA &str);

  BOOL GetPhysicalPath(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetPhysicalPath(CStringA &str);

  BOOL GetAuthUserPassword(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetAuthUserPassword(CStringA &str);

  BOOL GetUrl(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUrl(CStringA &str);

  BOOL GetUserHostName(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUserHostName(CStringA &str);

  BOOL GetUserHostAddress(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUserHostAddress(CStringA &str);

  LPCSTR GetScriptPathTranslated();
  LPCSTR GetPathTranslated();
  LPCSTR GetPathInfo();

  BOOL GetAuthenticated();

  BOOL GetAuthenticationType(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetAuthenticationType(CStringA &str);

  BOOL GetUserName(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUserName(CStringA &str);

  BOOL GetUserAgent(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUserAgent(CStringA &str);

  BOOL GetUserLanguages(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUserLanguages(CStringA &str);
  BOOL GetAcceptTypes(LPSTR szBuff,DWORD *pdwSize);
  BOOL GetAcceptTypes(CStringA &str);

  BOOL GetAcceptEncodings(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetAcceptEncodings(CStringA& str);

  BOOL GetUrlReferer(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetUrlReferer(CStringA &str);

  BOOL GetScriptName(LPSTR szBuff, DWORD *pdwSize);
  BOOL GetScriptName(CStringA &str);

  // Raw access to server variables
  BOOL GetServerVariable(LPCSTR szVariable, CStringA &str) const;

}; // class CHttpRequest

For methods that return strings (that is, almost all of them), there are two overloads. The first one is the traditional “pass in a buffer and a DWORD containing the buffer length” style used so often in the Win32 API. The second overload lets you pass in a CStringA reference and stores the resulting string in the CString. The latter overload is much more convenient; the former gives you complete control over memory allocation if you need it for performance.
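The difference between the two calling styles is easy to see in a standalone sketch. The GetValue functions below are hypothetical stand-ins, not CHttpRequest methods, and the convention of writing the required size back through the length parameter is an assumption borrowed from common Win32 practice rather than something taken from the ATL Server source:

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for a server variable; not part of ATL Server.
static const char kValue[] = "Mozilla/5.0";

// Buffer style: the caller owns the memory and passes its size in *pSize.
// On return, *pSize holds the number of bytes needed, including the
// terminating null (an assumed convention for this sketch).
bool GetValue(char* buffer, unsigned long* pSize) {
  if (pSize == nullptr) return false;
  const unsigned long needed = sizeof(kValue);
  const bool fits = buffer != nullptr && *pSize >= needed;
  if (fits) {
    std::memcpy(buffer, kValue, needed);
  }
  *pSize = needed;
  return fits;
}

// String style: the string object manages allocation for the caller.
bool GetValue(std::string& value) {
  value = kValue;
  return true;
}
```

With the buffer style, a caller can reuse one fixed buffer across many requests; with the string style, the caller trades that control for not having to guess a size up front.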

The query string and form variable access methods give you a variety of ways to get at the contents of these two collections of variables. If you know which query string parameters to expect, the easiest approach is the GetQueryParams() method. This returns a reference to a CHttpRequestParams object. This object basically maps name/value pairs and is used to access the contents of the query string. Usage is quite simple:

const CHttpRequestParams& queryParams =
    m_HttpRequest.GetQueryParams( );
CStringA cstrForumId = queryParams.Lookup( "forumid" );

If the query parameter you’re looking for isn’t present, you get back an empty string.

The CHttpRequestParams object also supports an iterator interface to walk the list of name/value pairs in the collection. Unfortunately, this is an MFC-style iterator rather than a standard C++ iterator. Here’s an example that walks the list of form variables submitted in a post:

HTTP_CODE EditForumHandler::OnFormFields( ) {
  if( m_HttpRequest.GetMethod( ) ==
    CHttpRequest::HTTP_METHOD_POST ) {
    const CHttpRequestParams &formFields =
      m_HttpRequest.GetFormVars( );
    POSITION pos = formFields.GetStartPosition( );
    m_HttpResponse << "Form fields:<br>" << "<ul>";

    const CHttpRequestParams::CPair *pField;
    for( pField = formFields.GetNext( pos );
      pField != 0;
      pField = formFields.GetNext( pos ) ) {
      m_HttpResponse << "<li>" << pField->m_key <<
        ": " << pField->m_value << "</li>";
    }
    m_HttpResponse << "</ul>";
  }
  return HTTP_SUCCESS;
}

To use the iterator interface, you call the GetStartPosition( ) method on the collection to get back a POSITION object. This acts as a pointer into the collection and is initialized to one before the first element in the collection. The GetNext( ) method increments the POSITION to point to the next item in the collection and returns a pointer to the object at the new POSITION. When you get to the end, GetNext( ) returns 0.

Because the CHttpRequestParams class stores name/value pairs, it makes sense that the GetNext() call returns a CPair object; this is a nested type defined within the map class. It has two fields: m_key and m_value, which should be self-explanatory.

It’s up to you to choose which way to access your inputs. The Lookup method is much more convenient when you know in advance what form fields or query string parameters you’re expecting. The iterator versions are useful if you can have a wide variety of inputs and don’t know in advance what you’re going to get (for example, some blog systems enable you to pass a variety of different parameters to bring up a single post, all posts in a month, or posts from a start/end date).

One thing to consider is what to do about parameters you don’t expect and don’t support. The easiest thing to do is simply ignore them. However, if somebody is sending you unexpected junk, it might be somebody trying to hack your system, so you might want to at least loop through the query string and form variables to check if there’s anything in there you don’t expect. Your response to these values is up to you: This could range from ignoring them to logging the invalid parameters or failing the request outright.
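One way to picture the “check for anything unexpected” approach is an allow-list comparison. The sketch below is portable C++ rather than ATL Server code: ParamMap is a hypothetical stand-in for a collection such as CHttpRequestParams, and FindUnexpectedParams is a name I made up for illustration:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Stand-in for a name/value collection such as the query string.
using ParamMap = std::map<std::string, std::string>;

// Returns the names in 'params' that are not on the allow-list.
std::vector<std::string> FindUnexpectedParams(
    const ParamMap& params, const std::set<std::string>& allowed) {
  std::vector<std::string> unexpected;
  for (const auto& entry : params) {
    if (allowed.count(entry.first) == 0) {
      unexpected.push_back(entry.first);
    }
  }
  return unexpected;
}
```

What you do with the returned names (ignore them, log them, or fail the request) is the policy decision described above; the sketch only finds them.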

Data Exchange and Validation

So we have easy access to our query string and form variables, but that access is less than convenient. We need to check for empty strings when calling the Lookup() method to verify that the variable exists at all. We need to do data type conversions: In our example, the forum ID is an integer, but in the query string it’s stored as a string. And when we’ve got the value, we need to do validation on it: Faulty input validation is the single biggest security flaw in web sites today. [5]

ATL Server includes some common validation and data-conversion functions to make life easier for the web developer. This is implemented via the CValidateObject< > template and the CValidateContext class.

The CValidateObject< > template is designed to be used as a base class; the CHttpRequestParams class derives from CValidateObject< >. It provides numerous overloads of two methods: Exchange and Validate:

template <class TLookupClass, class TValidator = CAtlValidator>
class CValidateObject {
public:
   template <class T>
    DWORD Exchange(
        LPCSTR szParam,
        T* pValue,
        CValidateContext *pContext = NULL) const;

    template<>
    DWORD Exchange(
        LPCSTR szParam,
        CString* pstrValue,
        CValidateContext *pContext) const;

    template<>
    DWORD Exchange(
        LPCSTR szParam,
        LPCSTR* ppszValue,
        CValidateContext *pContext) const;

    template<>
    DWORD Exchange(
        LPCSTR szParam,
        GUID* pValue,
        CValidateContext *pContext) const;

    template<>
    DWORD Exchange(
        LPCSTR szParam,
        bool* pbValue,
        CValidateContext *pContext) const;


    template <class T, class TCompType>
    DWORD Validate(
        LPCSTR Param,
        T *pValue,
        TCompType nMinValue,
        TCompType nMaxValue,
        CValidateContext *pContext = NULL) const;

    template<>
    DWORD Validate(
        LPCSTR Param,
        LPCSTR* ppszValue,
        int nMinChars,
        int nMaxChars,
        CValidateContext *pContext) const;

    template<>
    DWORD Validate(
        LPCSTR Param,
        CString* pstrValue,
        int nMinChars,
        int nMaxChars,
        CValidateContext *pContext) const;

    template<>
    DWORD Validate(
        LPCSTR Param,
        double* pdblValue,
        double dblMinValue,
        double dblMaxValue,
        CValidateContext *pContext) const;
};

The Exchange( ) method takes in the name of a variable. If that variable exists in the collection you’re using, it converts the string to the correct type (based on the type T you use) and stores the result in the requested pointer. The return value tells you whether the parameter was present:

HTTP_CODE ValidateAndExchange( ) {
  ...
  int forumId;
  m_HttpRequest.GetQueryParams().Exchange( "forumid",
    &forumId, NULL );
  ...
}

Thanks to the wonder of template type inference, by passing in the address of a variable of type int, the Exchange method knows that I want the string converted to type int. The Exchange( ) method properly works with these types: ULONGLONG, LONGLONG, double, int, unsigned int, long, unsigned long, short, and unsigned short. In addition, there are specializations for CString and LPCSTR, GUID, and bool.
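If the type inference seems like magic, a standalone sketch might help. This is not the ATL Server implementation: ParamMap, ExchangeResult, and this free-function Exchange are hypothetical, and the stream-based conversion is simply a convenient way to show the idea in portable C++:

```cpp
#include <map>
#include <sstream>
#include <string>

// Stand-in for a name/value collection such as the query string.
using ParamMap = std::map<std::string, std::string>;

// Hypothetical result codes; the real VALIDATION_* constants differ.
enum class ExchangeResult { Ok, Empty, NotFound, Invalid };

// T is deduced from the pointer argument, so the caller never names it.
template <class T>
ExchangeResult Exchange(const ParamMap& params, const std::string& name,
                        T* pValue) {
  auto it = params.find(name);
  if (it == params.end()) return ExchangeResult::NotFound;
  if (it->second.empty()) return ExchangeResult::Empty;

  std::istringstream in(it->second);
  T parsed{};
  // Reject values that do not parse at all ("abc").
  if (!(in >> parsed)) return ExchangeResult::Invalid;
  // Reject values with trailing characters ("12abc").
  if (in.peek() != std::char_traits<char>::eof()) {
    return ExchangeResult::Invalid;
  }
  if (pValue != nullptr) *pValue = parsed;
  return ExchangeResult::Ok;
}
```

Calling Exchange(params, "forumid", &forumId) with an int forumId instantiates the int version; passing the address of a double would pick the double version, with no change at the call site beyond the variable's type.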

This is a convenient way to check whether a parameter exists, copy it, and do data conversion all in one fell swoop. But that’s usually not enough. You generally need to do more checking than “Is it an int?” The Validate( ) method and the various overloads give you some more checking. Specifically, when working with a numeric value, Validate lets you check that a parameter is within a particular numeric range. When validating strings, the Validate method can check for minimum and maximum string lengths (very helpful to avoid buffer overflows). For example, here’s some code from the ValidateAndExchange method that checks the results of our form post:

HTTP_CODE ValidateAndExchange( ) {
  ...
  if( m_HttpRequest.GetMethod( ) ==
    CHttpRequest::HTTP_METHOD_POST ) {
    const CHttpRequestParams& formFields =
      m_HttpRequest.GetFormVars( );
    formFields.Validate( "forumName", &m_forumName,
      1, 50, &m_validationContext );
    formFields.Validate( "forumDescription",
      &m_forumDescription, 1, 255, &m_validationContext );
  }
  ...
}
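The checks that Validate layers on top of the conversion boil down to a range comparison. Here is a rough sketch in portable C++, assuming the bounds are inclusive; CheckResult, CheckRange, and CheckLength are hypothetical names for this example and not part of ATL Server:

```cpp
#include <cstddef>
#include <string>

// Hypothetical result codes for this sketch only.
enum class CheckResult { Ok, TooSmall, TooLarge };

// Numeric check: the value must lie in [minValue, maxValue].
template <class T>
CheckResult CheckRange(const T& value, const T& minValue,
                       const T& maxValue) {
  if (value < minValue) return CheckResult::TooSmall;
  if (value > maxValue) return CheckResult::TooLarge;
  return CheckResult::Ok;
}

// String check: the length must lie in [minChars, maxChars].
CheckResult CheckLength(const std::string& value, std::size_t minChars,
                        std::size_t maxChars) {
  return CheckRange(value.size(), minChars, maxChars);
}
```

Note that the string version checks the length, not the contents; restricting which characters are allowed is a separate job.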

Notice that I’m not actually checking the return values from the Validate method. Checking them is one way to get the results of the Validate call, but having to do this repeatedly for every field gets tedious (and hard to maintain) quickly:

if( VALIDATION_SUCCEEDED( formFields.Validate(
   "forumName", &m_forumName, 1, 50, &m_validationContext ) ) )
{ ... }

Instead, we take advantage of another class: CValidateContext. The last parameter for the Exchange() and Validate() methods is an optional pointer to a CValidateContext object. This object acts as a collection: specifically, a collection of validation errors. If the Exchange() or Validate() call fails, an entry in the CValidateContext object is made. Using the validation context, you can do all your validation checks and not have to worry about the results until the end.

The easiest thing to do is check whether there were any validation failures, via the ParamsOK() method on the CValidateContext object. You can also walk the list of errors, like this:

HTTP_CODE EditForumHandler::OnValidationErrors( ) {
  if( m_validationContext.ParamsOK( ) ) {
    m_HttpResponse << "No validation errors occurred";
  }
  else {
    int numValidationFailures =
      m_validationContext.GetResultCount( );
    m_HttpResponse << "<ol>";
    for( int i = 0; i < numValidationFailures; ++i ) {
      CStringA faultName;
      DWORD faultCode;
      m_validationContext.GetResultAt( i, faultName,
        faultCode );
      m_HttpResponse << "<li>" << faultName << ": " <<
        faultCode << "</li>";
    }
    m_HttpResponse << "</ol>";
  }
  return HTTP_SUCCESS;
}

Here we’re just printing the fault codes as integers. These are the possible fault codes:

VALIDATION_S_OK. The named value was found and could be converted successfully.
VALIDATION_S_EMPTY. The name was present, but the value was empty.
VALIDATION_E_PARAMNOTFOUND. The named value was not found.
VALIDATION_E_INVALIDPARAM. The name was present, but the value could not be converted to the requested data type.
VALIDATION_E_LENGTHMIN. The name was present and could be converted to the requested data type, but the value was too small.
VALIDATION_E_LENGTHMAX. The name was present and could be converted to the requested data type, but the value was too large.
VALIDATION_E_FAIL. An unspecified error occurred.

It would have been nice if these were just custom HRESULT values, but, unfortunately, they’re not. Luckily, there’s also a VALIDATION_SUCCEEDED macro that tells you whether a particular error code is a success.

When validation for a particular variable fails, the Validate (or Exchange) method adds a name/value pair to the validation context. The name is the name of the variable that failed. The value is the fault code. These can be retrieved using the GetResultAt method, as shown earlier. You are also free to add your own error records to the validation context via the AddResult method. For example, we use the Exchange method to find out whether there’s a forumid, but we still need to see if it’s valid:

void EditForumHandler::ValidateLegalForumId( ){
  if( m_forumId != -1 ) {
    if( SUCCEEDED( m_forumList.ReadOneForum(
      m_forumId, &m_forumRecordset ) ) ) {
      bool containsData;
      if( SUCCEEDED( m_forumList.ContainsForumData(
        m_forumRecordset, &containsData ) ) ) {
        if( !containsData ) {
            m_validationContext.AddResult(
              "forumid", VALIDATION_E_FAIL );
            m_forumId = -1;
        }
      }
    }
  }
}

In this case, I’m using a generic VALIDATION_E_FAIL code, but there’s no reason you can’t make up your own DWORD error-validation codes.

If you have multiple records with the same name, only the last one in is recorded. So, if you check the same value multiple times, as we do with forumid, be aware that later validation failures could overwrite earlier records in the context.

The CValidateContext class gives you several options when adding records to the collection:

class CValidateContext {
public:
  enum { ATL_EMPTY_PARAMS_ARE_FAILURES = 0x00000001 };

  CValidateContext(DWORD dwFlags=0);
  bool AddResult(LPCSTR szName, DWORD type,
    bool bOnlyFailures = true);

  ...
};

When constructing the CValidateContext object, by default, empty parameters (ones that were in the request but have no data) are not considered an error by the CValidateContext. If you specify the ATL_EMPTY_PARAMS_ARE_FAILURES flag when constructing the context, empty parameters are treated as errors. In addition, you can pass a third, optional parameter to the AddResult method. If true (the default), the context ignores records that have the fault code VALIDATION_S_OK or VALIDATION_S_EMPTY (although the latter is ignored only if empty parameters are not errors). This optional parameter is useful when you call AddResult yourself; Validate and Exchange never pass false for this parameter.

When validation fails, you generally want to display something to the user. Nothing is built into ATL Server, but it’s easy enough to display errors on your own. Here’s the .srf file for my “edit forum” page:

<html>
{{handler SimpleForums.dll/EditForum}}
  <head>
    <title>Edit Forum</title>
  </head>
  <body>
    <h1>Edit Forum Information</h1>
    {{if ValidForumId}}
    <form action="editforum.srf?forumid={{ForumId}}"
      method="post">
      <table border="0" cellpadding="0">
        <tr>
          <td>
            Forum Name:
          </td>
          <td>
            <input type="text" name="forumName"
              id="forumName" maxlength="63"
              value="{{ForumName}}" />
          </td>
        </tr>
        <tr>
        <td>
          Forum Description:
        </td>
        <td>
          <textarea cols="50" rows="10"
            wrap="soft" name="forumDescription"
            id="forumDescription">
            {{ForumDescription}}
          </textarea>
        </td>
        </tr>
      </table>
      <input type="submit" />
      <a href="forumlist.srf">Return to Forum List</a>
    </form>
    {{else}}
    <p><b>You have given an invalid forum ID. Shame on you!</b></p>
    {{endif}}
    {{FormFields}}
    {{ValidationErrors}}
  </body>
</html>

The ValidationErrors substitution is handled by the OnValidationErrors method, which walks the validation context and outputs both the fields that have errors and the error code:

HTTP_CODE EditForumHandler::OnValidationErrors( ) {
  if( m_validationContext.ParamsOK( ) ) {
    m_HttpResponse << "No validation errors occurred";
  }
  else {
    m_HttpResponse << "Validation Errors:";
    int numValidationFailures =
      m_validationContext.GetResultCount( );
    m_HttpResponse << "<ol>";
    for( int i = 0; i < numValidationFailures; ++i ) {
      CStringA faultName;
      DWORD faultCode;
      m_validationContext.GetResultAt( i, faultName,
        faultCode );
      m_HttpResponse << "<li>" << faultName <<
        ": " << FaultCodeToString(faultCode) << "</li>";
    }
    m_HttpResponse << "</ol>";
  }
  return HTTP_SUCCESS;
}

CStringA EditForumHandler::FaultCodeToString(DWORD faultCode) {
  switch(faultCode) {
    case VALIDATION_S_OK:
      return "Validation succeeded";

    case VALIDATION_S_EMPTY:
      return "Name present but contents were empty";

    case VALIDATION_E_PARAMNOTFOUND:
      return "The named value was not found";

    case VALIDATION_E_LENGTHMIN:
      return "Value was present and converted, but too small";
    case VALIDATION_E_LENGTHMAX:
      return "Value was present and converted, but too large";

    case VALIDATION_E_INVALIDLENGTH:
      return "(Unused error code)";

    case VALIDATION_E_INVALIDPARAM:
      return "The value was present but could not be "
        "converted to the given data type";

    case VALIDATION_E_FAIL:
      return "Validation failed";

    default:
      return "Unknown validation failure code";
  }
}

This code simply walks through the validation context and displays the names of the failures (usually the field names) and the failure code, converted to a string. Figure 14.2 shows the results of validation failures. The fields in question weren’t long enough to pass validation (because they need to be at least 1 character).

Figure 14.2. Results of validation failure

A small bug in the validation functions makes the ATL_EMPTY_PARAMS_ARE_FAILURES flag essential. The problem comes in when you have a post variable with an empty string. For example, Figure 14.3 shows our forum edit form; I cleared the forum name before clicking Submit.

Figure 14.3. Edit Forum page with no forum name

When I click the Submit Query button, the forumName text field gets sent back in the HTTP post, but with no value. In the ValidateAndExchange method, we make use of ATL Server’s validation functions to check our input:

HTTP_CODE EditForumHandler::ValidatePost( ) {
  ...
  if( m_HttpRequest.GetMethod( ) ==
    CHttpRequest::HTTP_METHOD_POST ) {
    const CHttpRequestParams& formFields =
      m_HttpRequest.GetFormVars( );
    formFields.Validate( "forumName", &m_forumName,
      1, 50, &m_validationContext );
  }

  return HTTP_SUCCESS;
}

The intention here is to require that the forumName variable exists and that it be from 1 to 50 characters in length. If we call the ParamsOK method, it correctly returns false: The forumName variable is not between 1 and 50 characters in length. However, if we walk the list of errors in the validation context, there will be no record for the forumName field. What’s going on here?

Let’s take a look at the code for CValidateObject::Validate for strings:

template<>
DWORD Validate(
  LPCSTR Param,
  LPCSTR* ppszValue,
  int nMinChars,
  int nMaxChars,
  CValidateContext *pContext) const {
  LPCSTR pszValue = NULL;
  DWORD dwRet = Exchange(Param, &pszValue, pContext);

  if (dwRet == VALIDATION_S_OK ) {
    if (ppszValue)
      *ppszValue = pszValue;
    dwRet = TValidator::Validate(pszValue, nMinChars, nMaxChars);
    if (pContext && dwRet != VALIDATION_S_OK)
      pContext->AddResult(Param, dwRet);
  }
  else if (dwRet == VALIDATION_S_EMPTY && nMinChars > 0) {
    dwRet = VALIDATION_E_LENGTHMIN;
    if (pContext) {
      pContext->SetResultAt(Param, VALIDATION_E_LENGTHMIN);
    }
  }
  return dwRet;
}

Two calls in this listing add a record to the validation context: pContext->AddResult and pContext->SetResultAt. The first, AddResult, runs after the length check and records any validation failure. Notice the second one: This code executes if the validation result is VALIDATION_S_EMPTY and there’s a minimum character length on the string. In this case, it calls the SetResultAt method on the validation context instead, using the name of the parameter.

Here’s where the bug comes in. Let’s look at the SetResultAt implementation:

class CValidateContext {
public:

  bool SetResultAt(__in LPCSTR szName, __in DWORD type) {
    _ATLTRY {
      if (!VALIDATION_SUCCEEDED(type) ||
        (type == VALIDATION_S_EMPTY &&
          (m_dwFlags & ATL_EMPTY_PARAMS_ARE_FAILURES))) {
        m_bFailures = true;
      }

      return TRUE == m_results.SetAt(szName,type);
    }
    _ATLCATCHALL() { }

    return false;
  }

  // Returns true if there are no validation failures
  // in the collection, returns false otherwise.
  __checkReturn bool ParamsOK() {
    return !m_bFailures;
  }

protected:
  CSimpleMap<CStringA, DWORD> m_results;
  bool m_bFailures;
}; // CValidateContext

The SetResultAt call sets the m_bFailures flag, which is used by the ParamsOK method, and then calls m_results.SetAt. And here’s the source of the problem: CSimpleMap::SetAt sets the value only if the name you’re using is already in the map. If the key isn’t in the map, SetAt silently fails.

So what happens here is that, because an empty parameter isn’t an error by default, it doesn’t get added to the context in the AddResult call. Then, when the minimum-length validation fails, the call to SetResultAt tries to add the record using the SetAt call. But that fails because the parameter isn’t already in the m_results map. As a result, the m_bFailures flag is set, but there’s no actual record of the specific failure.

You can work around this bug in two ways. The first is to set the ATL_EMPTY_PARAMS_ARE_FAILURES flag when you create your validation-context object. This is best if you absolutely must have a value in the parameter in question. The other option is best used if the parameter is actually optional. In this case, be sure to set the minimum length in the Validate call to 0 instead of 1, as I did earlier.
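To see the bug and the first workaround side by side without spinning up IIS, here’s a small model of the behavior just described. It is portable C++, not the ATL source: ContextModel, Fault, and ValidateEmptyString are hypothetical names, and the model follows only what the text above says about AddResult, SetResultAt, and SetAt:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical codes; the real VALIDATION_* constants have other values.
enum class Fault { Ok, Empty, LengthMin };

class ContextModel {
 public:
  explicit ContextModel(bool emptyParamsAreFailures = false)
      : emptyAreFailures_(emptyParamsAreFailures) {}

  // Models AddResult with bOnlyFailures == true: non-failures are dropped.
  void AddResult(const std::string& name, Fault code) {
    if (!IsFailure(code)) return;
    failures_ = true;
    results_[name] = code;
  }

  // Models SetResultAt: flags the failure, but only updates a record
  // that already exists, mirroring the SetAt behavior described above.
  bool SetResultAt(const std::string& name, Fault code) {
    if (IsFailure(code)) failures_ = true;
    auto it = results_.find(name);
    if (it == results_.end()) return false;
    it->second = code;
    return true;
  }

  bool ParamsOK() const { return !failures_; }
  std::size_t GetResultCount() const { return results_.size(); }

 private:
  bool IsFailure(Fault code) const {
    if (code == Fault::Empty) return emptyAreFailures_;
    return code != Fault::Ok;
  }

  bool emptyAreFailures_;
  bool failures_ = false;
  std::map<std::string, Fault> results_;
};

// Models Validate for a string that arrived empty with nMinChars > 0:
// the exchange step reports Empty, then the length check calls SetResultAt.
void ValidateEmptyString(ContextModel& ctx, const std::string& name) {
  ctx.AddResult(name, Fault::Empty);
  ctx.SetResultAt(name, Fault::LengthMin);
}
```

With a default-constructed ContextModel, ValidateEmptyString leaves ParamsOK() false and GetResultCount() at zero, which is the symptom. Constructing the model with true (the analog of ATL_EMPTY_PARAMS_ARE_FAILURES) lets the Empty record in first, so the later SetResultAt has something to overwrite.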

Regular Expressions

Dealing with numeric values is made quite easy by the Validate() method, but for strings, you often need to do a lot more than just check for the maximum length. It’s good security practice to enforce that your input contains only a known set of good characters, for example. Or what if you need to receive dates in a particular format? None of the Validate overloads helps you there.

The typical tool used in these kinds of string validation is the regular expression. UNIX programmers have been using them for years; one could argue that the popularity of the Perl programming language is mainly because of the ease of regular expression matching. Luckily, ATL Server provides a regular expression engine that we can use from the comfort of good old C++.

Unfortunately, a discussion of regular expression syntax and how to use regular expressions is beyond the scope of this book; see the documentation for details. [6]

Regular expressions are done in ATL Server via the CAtlRegExp class:

template <class CharTraits /* =CAtlRECharTraits */>
class CAtlRegExp {
public:
  CAtlRegExp();

  typedef typename CharTraits::RECHARTYPE RECHAR;

  REParseError Parse(const RECHAR *szRE,
    BOOL bCaseSensitive=TRUE);

  BOOL Match(const RECHAR *szIn,
    CAtlREMatchContext<CharTraits> *pContext,
    const RECHAR **ppszEnd=NULL);
};

The usage is fairly simple. For example, suppose we wanted to ensure that the forum name contains only alphabetical characters, spaces, and commas. The following does the trick:

void EditForumHandler::ValidateLegalForumName( ) {
  CAtlRegExp< CAtlRECharTraitsW > re;
  CAtlREMatchContext< CAtlRECharTraitsW > match;

  ATLVERIFY( re.Parse( L"^[a-zA-Z ,]*$" ) ==
    REPARSE_ERROR_OK );
  if( !re.Match( m_forumName.GetBuffer( ), &match ) ) {
    m_validationContext.AddResult( "forumName",
      VALIDATION_E_FAIL );
  }
}

First, you create the CAtlRegExp object. The template parameter is a traits class that defines various properties of the character set that the regular expression engine will be searching. ATL defines three of these traits classes: CAtlRECharTraitsA (for ANSI characters), CAtlRECharTraitsMB (for multibyte strings) and CAtlRECharTraitsW (for wide character strings). These traits classes are used much like the traits classes are in the CString class as discussed in Chapter 2, “Strings and Text.”

After you’ve created the regex object, you need to feed in a regular expression by calling the Parse method. This method returns a value of type REParseError. REPARSE_ERROR_OK means that everything was fine; any other return code indicates a syntax error in the regular expression. The documentation for CAtlRegExp::Parse gives the complete list of possible error codes.

Next, you create an object of type CAtlREMatchContext, which takes the same character traits template parameter as the regexp object did. Then, you call the Match method on the regular expression object, passing in the string to search and the match context object. Match returns true if the regular expression matched the string, and false if it did not. In some cases, this is all we need to know. In others, we might want to know more about what specifically matched. This information is stored in the match context object. The documentation and sample code give many examples on how to use the match context and more information about what you can do with regular expressions.
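If you want to experiment with the pattern itself outside of an ATL Server project, the same rule (letters, spaces, and commas only) can be tried with the standard library. To be clear, this sketch swaps in std::regex for CAtlRegExp; the two engines have different syntaxes in general, and IsLegalForumName is a hypothetical helper, although this simple character class should read the same way in both:

```cpp
#include <regex>
#include <string>

// Returns true when the name contains only letters, spaces, and commas.
// An empty name also matches; length limits are checked elsewhere.
bool IsLegalForumName(const std::string& name) {
  static const std::regex pattern("^[a-zA-Z ,]*$");
  return std::regex_match(name, pattern);
}
```

Because the pattern is an allow-list of characters, anything not explicitly permitted (digits, punctuation, angle brackets) causes the match to fail.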

Session Management

The scalability of the Web comes directly from its stateless nature. As far as the web server is concerned, every HTTP request is independent. The stateless architecture means that server farms and load balancing are easy, caching can be added at many different places, and it’s easy to add hardware to an existing system.

It also makes writing a shopping cart a real pain in the neck.

Nearly every web application needs to deal with state management. State can be on a per-session basis, a per-application basis, or a per-page basis. When thinking about state management, some standard questions need to be answered:

What’s the scope? From where is the state data available, and what’s its lifetime?
Where’s the data stored? In memory? In a database? In a disk file? In a hidden form field?
How do we find the state when processing a particular request?

One of the more difficult state-management pieces to build by hand is session state: per-user data that persists across HTTP requests. Luckily for us, ATL Server, like all other serious web frameworks, provides a session-state service so we don’t have to roll our own.

Using Session State

Before diving into the internals, let’s take a quick look at how to use session state. In our ongoing forum example, I want to add a hit counter to each forum’s page, so I can see how often I’ve gone to the page. It looks something like Figure 14.4.

Figure 14.4. Forum page with a hit counter

The .srf file for this page is pretty simple:

<html>
{{handler SimpleForums.dll/ShowPosts}}
<head>
<title>{{ForumName}}</title>
</head>
<body>
<h1>{{ForumName}}</h1>
<p>You have visited this forum {{HitCount}} times in the
current session.</p>
<div>
<!-- ... Post List content removed for clarity -->
</div>
<a href="newpost.srf?forumid={{ForumId}}">New Post</a>
<a href="forumlist.srf">Return to forum list</a>
{{endif}}
</body>
</html>

The trick is, how do we implement the HitCount replacement? We want the hit counter to stick around between page views; as the user moves from forum to forum on the site, we want each page’s hit count to be independent and persistent.

Unlike classic ASP and ASP.NET, ATL Server does not automatically create a session for you. In the C++ tradition of “don’t pay for what you don’t use,” you must explicitly create a session object when you need it.

Getting the Session Service

The first thing you need to do is get hold of an ISessionStateService interface pointer. This interface provides the capability to create and retrieve sessions. The object is available in your request handler via the m_spServiceProvider member that is inherited from CRequestHandlerT< >. In your ValidateAndExchange function, do something like this:

ShowPostsHandler.h:

class ShowPostsHandler :
    public CRequestHandlerT< ShowPostsHandler > {
...
private:
...
    CComPtr< ISessionStateService > m_spSessionStateSvc;
    CComPtr< ISession > m_spSession;
};

ShowPostsHandler.cpp:

HTTP_CODE ShowPostsHandler::ValidateAndExchange( ) {
  if( FAILED( m_spServiceProvider->QueryService(
    __uuidof(ISessionStateService), &m_spSessionStateSvc ) ) ) {
    return HTTP_FAIL;
  }

  // Do rest of validation
  ...

  // Retrieve session data
  if( FAILED( RetrieveOrCreateSession( ) ) ) {
    return HTTP_FAIL;
  }

  if( FAILED( UpdateHitCount( ) ) ) {
    return HTTP_FAIL;
  }

  m_HttpResponse.SetContentType( "text/html" );
  return HTTP_SUCCESS;
}

The QueryService call at the top of ValidateAndExchange is the magic call that gets us the ISessionStateService interface pointer we need.

An Aside: The IServiceProvider Interface

The IServiceProvider interface is actually a standard interface that was introduced back in the IE4 days. It hasn’t gotten a whole lot of attention, but implementing it can give you a surprisingly powerful system. The definition is actually quite simple:

interface IServiceProvider : IUnknown {
    HRESULT QueryService(
        [in] REFGUID guidService,
        [in] REFIID riid,
        [out, iid_is(riid)] IUnknown ** ppvObject);
};

The parameters of QueryService are essentially identical to those of QueryInterface, and QueryService acts a lot like QueryInterface: You ask for a particular IID, and you get back an interface pointer. There’s a major difference, though: QueryInterface is required to return an interface pointer on the same object and obey all the rules of COM identity. QueryService, on the other hand, can (and usually does) return an interface pointer on a different COM object.

This explains the guidService parameter to the QueryService call: It’s specifying which particular object we want to get the interface pointer to. This GUID doesn’t need to be a CLSID, or an IID, or a CATID, or anything else. It’s simply a predefined GUID that the developer chooses to represent that particular service.
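The “look up an object by service ID, then ask it for an interface” idea doesn’t depend on COM. Here is a loose analogy in portable C++, with strings standing in for GUIDs and dynamic_cast standing in for QueryInterface; IService, CounterService, and ServiceProvider are all hypothetical names for this sketch, and none of this is how ATL Server implements it:

```cpp
#include <map>
#include <string>

// Base type for anything that can be handed out as a service.
struct IService {
  virtual ~IService() = default;
};

// A hypothetical service, standing in for something like session state.
struct CounterService : IService {
  int hits = 0;
};

class ServiceProvider {
 public:
  // Registers an object under a service identifier. The provider does
  // not own the object; the caller keeps it alive.
  void Register(const std::string& serviceId, IService* service) {
    services_[serviceId] = service;
  }

  // Finds the object by service identifier, then asks for the requested
  // type. Returns nullptr if either step fails.
  template <class T>
  T* QueryService(const std::string& serviceId) const {
    auto it = services_.find(serviceId);
    if (it == services_.end()) return nullptr;
    return dynamic_cast<T*>(it->second);
  }

 private:
  std::map<std::string, IService*> services_;
};
```

The point to notice is that the object returned is a different object from the provider itself, which is exactly what QueryInterface is not allowed to do.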

The IServiceProvider interface is how ATL Server provides, well, services to the request handlers. When you create your project via the ATL Server Project Wizard and you choose session support, these lines get added to your ISAPI extension class:

// session state support
typedef CSessionStateService<WorkerThreadClass,
  CMemSessionServiceImpl> sessionSvcType;
CComObjectGlobal<sessionSvcType> m_SessionStateSvc;

public:

BOOL GetExtensionVersion(HSE_VERSION_INFO* pVer) {
  // ...
  if (S_OK != m_SessionStateSvc.Initialize(&m_WorkerThread,
    static_cast<IServiceProvider*>(this))) {
    TerminateExtension(0);
    return SetCriticalIsapiError(
      IDS_ATLSRV_CRITICAL_SESSIONSTATEFAILED);
  }
  return TRUE;
}

BOOL TerminateExtension(DWORD dwFlags) {
  m_SessionStateSvc.Shutdown();
  BOOL bRet = baseISAPI::TerminateExtension(dwFlags);
  return bRet;
}

HRESULT STDMETHODCALLTYPE QueryService(
  REFGUID guidService, REFIID riid, void** ppvObject) {
  if (InlineIsEqualGUID(guidService,
    __uuidof(ISessionStateService)))
    return m_SessionStateSvc.QueryInterface(riid, ppvObject);
  return baseISAPI::QueryService(guidService, riid,
    ppvObject);
}

The ISAPI extension creates a session-state service object as a “global” object; you might remember CComObjectGlobal from Chapter 4, “Objects in ATL.” This object lives as long as the ISAPI extension object does and basically ignores AddRef and Release counts. The QueryService implementation checks to see if the guidService parameter is equal to the GUID of the ISessionStateService interface; if so, it simply calls QueryInterface on the member session-state service object. Otherwise, it forwards the request to the base class.

ATL Server uses this technique to provide several kinds of services to the request handlers. If you have your own services that you want to provide across the application, this is a good way to do it.

Creating and Retrieving Sessions

So, we now have an ISessionStateService pointer. The next step is to use that pointer to look up our session, and to create one if it doesn’t exist.

The first question is, how do we know which session to grab? ATL Server has built-in support for the standard approach (a session cookie) and the flexibility to let you do your own session identification, if you need to.

Here’s how you retrieve a session using a session cookie:

HRESULT ShowPostsHandler::RetrieveOrCreateSession( ) {
  HRESULT hr;
  CStringA sessionId;
  m_HttpRequest.GetSessionCookie( ).GetValue( sessionId );
  if( sessionId.GetLength( ) == 0 ) {
    // No session yet, create one
    const size_t nCharacters = 64;
    CHAR szID[nCharacters + 1];
    szID[0] = 0;
    DWORD dwCharacters = nCharacters;
    hr = m_spSessionStateSvc->CreateNewSession(szID,
      &dwCharacters, &m_spSession);
    if( FAILED( hr ) ) return hr;

    CSessionCookie theSessionCookie( szID );
    m_HttpResponse.AppendCookie( &theSessionCookie );
  }
  else {
    // Retrieve existing session
    hr = m_spSessionStateSvc->GetSession(sessionId,
      &m_spSession );
    if( FAILED( hr ) ) return hr;
  }
  return S_OK;
}

First, we grab the value of the cookie. This gives us our session ID. If there isn’t a value, we create the session via the ISessionStateService::CreateNewSession method. This both creates the session and returns the ID for the session created. We then create a new session cookie and add it to the response. This step is important, and you can easily forget it if you’re used to other web frameworks that create sessions for you automatically.

If there is a cookie value, we use the ISessionStateService::GetSession method to get an ISession interface pointer and connect back up to the session.
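The retrieve-or-create flow can be modeled without any ATL Server headers. The following is a minimal sketch in portable C++; ToySession, ToySessionStore, and the free function RetrieveOrCreateSession are hypothetical names invented for illustration, and the ID format is an arbitrary assumption rather than what ATL Server generates.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory session: names mapped to integer values.
struct ToySession {
  std::map<std::string, int> variables;
};

// Hypothetical session store keyed by session ID.
class ToySessionStore {
public:
  // Creates a session and returns its freshly generated ID.
  std::string CreateNewSession() {
    std::string id = "session-" + std::to_string(++m_nextId);
    m_sessions[id];  // default-constructs an empty session
    return id;
  }

  // Returns the session for id, or nullptr if the ID is unknown.
  ToySession* GetSession(const std::string& id) {
    auto it = m_sessions.find(id);
    return it == m_sessions.end() ? nullptr : &it->second;
  }

private:
  int m_nextId = 0;
  std::map<std::string, ToySession> m_sessions;
};

// Mirrors the handler logic: an empty cookie means "no session yet".
// On creation, cookieOut receives the ID that must be sent back to the
// client; otherwise cookieOut is left untouched.
ToySession* RetrieveOrCreateSession(ToySessionStore& store,
                                    const std::string& cookieIn,
                                    std::string& cookieOut) {
  if (cookieIn.empty()) {
    cookieOut = store.CreateNewSession();
    return store.GetSession(cookieOut);
  }
  return store.GetSession(cookieIn);
}
```

Note that a stale or forged cookie yields a null session in this model; a real handler has to decide whether to fail the request or fall back to creating a fresh session.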

Storing and Retrieving Session Data

When we have our ISession pointer, we can store and retrieve values. ISession maps names (as ANSI strings) to VARIANTs. Usage is pretty much what you’d expect:

HRESULT ShowPostsHandler::UpdateHitCount() {
  CStringA sessionVarName = "mySessionVariable";
  CComVariant hits;
  if( FAILED(
    m_spSession->GetVariable( sessionVarName, &hits ) ) ) {
    // If no such session variable, GetVariable returns E_FAIL.
    // Gotta love nice specific HRESULTs
    hits = CComVariant( 0, VT_I4 );
  }
  m_hits = ++hits.lVal;
  return m_spSession->SetVariable( sessionVarName, hits );
}

The ISession interface provides the GetVariable and SetVariable methods to get and save a single variable. There are also methods to enumerate the session variables and control session timeouts.

Session State Implementations

One question about session management hasn’t been answered yet: Where is session data stored? The answer, as usual for ATL, depends on which template arguments you use.

Let’s look back at that type declaration in the ISAPI extension:

typedef CSessionStateService<WorkerThreadClass,
  CMemSessionServiceImpl> sessionSvcType;

The CSessionStateService template takes two parameters: The first is the worker thread class for the ISAPI extension. The second is the class that implements the ISessionStateService interface. In this case, we use CMemSessionServiceImpl, which provides in-memory session storage. In-memory session-state storage has the advantage of being very fast, but because it lives only in the memory of a single server, it doesn’t work in a server farm.

ATL Server provides the CDBSessionServiceImpl as an alternative. This stores session state in a database instead. The access to a session is slower, but it can be shared across multiple machines in a farm. Choose the appropriate service implementation based on your requirements.
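The trade-off between the two implementations can be illustrated with a small policy-template sketch in portable C++. Everything here is hypothetical: InMemoryStorage, SharedStorage, and ToySessionService are illustrative names, and a function-local static map merely stands in for a database shared by several machines.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical storage policy: each instance keeps its own values,
// like a server holding session state in its own memory.
class InMemoryStorage {
public:
  void Save(const std::string& key, int value) { m_data[key] = value; }
  bool Load(const std::string& key, int& value) const {
    auto it = m_data.find(key);
    if (it == m_data.end()) return false;
    value = it->second;
    return true;
  }
private:
  std::map<std::string, int> m_data;
};

// Hypothetical storage policy: all instances share one table, which
// plays the role of a database reachable from every server.
class SharedStorage {
public:
  void Save(const std::string& key, int value) { Table()[key] = value; }
  bool Load(const std::string& key, int& value) const {
    auto it = Table().find(key);
    if (it == Table().end()) return false;
    value = it->second;
    return true;
  }
private:
  static std::map<std::string, int>& Table() {
    static std::map<std::string, int> table;
    return table;
  }
};

// The service is parameterized on its storage, in the same spirit as
// CSessionStateService being parameterized on its implementation class.
template <class StoragePolicy>
class ToySessionService {
public:
  void SetVariable(const std::string& name, int value) {
    m_storage.Save(name, value);
  }
  bool GetVariable(const std::string& name, int& value) const {
    return m_storage.Load(name, value);
  }
private:
  StoragePolicy m_storage;
};
```

Two ToySessionService<InMemoryStorage> objects behave like two servers in a farm that cannot see each other's sessions, whereas two ToySessionService<SharedStorage> objects see the same data.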

Data Caching

A smart caching strategy is often the difference between a site that comes up quickly and one that leaves the users staring at the little spinning globe for the traditional 25 seconds before they go to another site. ATL Server offers data-caching services to help you get below that magic time threshold.

Caching Raw Memory

The most basic caching service is the BLOB cache. No, this isn’t a gelatinous alien that will try to digest your hometown. This cache handles raw chunks of memory. Getting hold of the cache is done just as when using the session service – that is, you use the IServiceProvider interface:

HTTP_CODE ShowPostsHandler::ValidateAndExchange( ) {
  ...
  HRESULT hr = m_spServiceProvider->QueryService(
    __uuidof(IMemoryCache), &m_spMemoryCache );
  if( FAILED( hr ) ) return HTTP_FAIL;
  ...
}

When you have an IMemoryCache interface pointer, you can stuff items into and pull them out of the cache. The cache items are stored as name/value pairs, just like session-state items. Instead of storing VARIANTs, however, the BLOB cache stores void pointers.

Retrieving an item requires two steps. First, you must get a cache item handle:

HRESULT ShowPostsHandler::GetWordOfDay(CStringA &result) {
  HRESULT hr;
  HCACHEITEM hItem;
  hr = m_spMemoryCache->LookupEntry( "WordOfDay", &hItem );
  if( SUCCEEDED( hr ) ) {
    // Found it, pull out the entry
    ...
  }
  else if( hr == E_FAIL ) {
    // Not in cache
    ...
  }
}

The LookupEntry method returns S_OK if it found the item, and E_FAIL if it didn’t. [7]

When we have the item handle, we can retrieve the data from the cache. This is done via the GetData method, which returns the void* that was stored in the cache, along with a DWORD giving you the length of the item:

HRESULT ShowPostsHandler::GetWordOfDay(CStringA &result) {
  ...
  // Found it, pull out the entry
  void *pData;
  DWORD dataLength;

  hr = m_spMemoryCache->GetData( hItem, &pData, &dataLength );
  if( SUCCEEDED( hr ) ) {
    result = CStringA( static_cast< char * >( pData ),
      dataLength );
  }
  m_spMemoryCache->ReleaseEntry( hItem );
  ...
}

The pointer that is returned from the GetData call actually points to the data that’s stored inside the cache’s data structure; it’s not a copy. Because we don’t want the cache to delete the item out from under us, we copy it into our result variable.

The final call to ReleaseEntry is essential for proper cache management. The BLOB cache actually does reference counting on the items stored in the cache. Every time you call LookupEntry, the refcount for the entry you found gets incremented. ReleaseEntry decrements the refcount. Entries with a refcount greater than zero are guaranteed to remain in the cache. Because the whole point of using the cache is to pitch infrequently used data, properly releasing entries when you are finished with them is just as important as properly managing COM reference counts. Unfortunately, there’s no CCacheItemPtr smart pointer template to help. [8]
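Writing a small guard of your own is straightforward. The sketch below is one possible shape, not something ATL Server ships: CacheEntryGuard is a hypothetical name, and CountingCache is a stand-in used only to show the reference count moving. The guard assumes nothing about the cache beyond a ReleaseEntry member that accepts the handle, which is how ReleaseEntry is called in the listing above.

```cpp
#include <cassert>

// Hypothetical RAII guard: releases the cache entry when it goes out
// of scope, so early returns cannot leak a reference.
template <class CacheT, class HandleT>
class CacheEntryGuard {
public:
  CacheEntryGuard(CacheT& cache, HandleT handle)
    : m_cache(cache), m_handle(handle) {}
  ~CacheEntryGuard() { m_cache.ReleaseEntry(m_handle); }

  // Copying would release the same entry twice, so forbid it.
  CacheEntryGuard(const CacheEntryGuard&) = delete;
  CacheEntryGuard& operator=(const CacheEntryGuard&) = delete;

  HandleT get() const { return m_handle; }

private:
  CacheT& m_cache;
  HandleT m_handle;
};

// Stand-in cache that only counts references, for demonstration.
struct CountingCache {
  int refCount = 0;
  int LookupEntry() { ++refCount; return 1; }  // returns a fake handle
  void ReleaseEntry(int) { --refCount; }
};
```

Construct the guard only after a lookup has succeeded; a failed lookup yields no reference to release.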

If you get that E_FAIL error code, you typically want to load the cache with the necessary data for next time. Doing so is fairly easy; you just call the Add method:

HRESULT ShowPostsHandler::GetWordOfDay(CStringA &result) {
  ...
  // Not in cache
  char *wordOfTheDay = new char[ 6 ];
  memcpy( wordOfTheDay, "apple", 6 );
  FILETIME ft = { 0 };
  hr = m_spMemoryCache->Add( "WordOfDay", wordOfTheDay,
    6 * sizeof( char ), &ft, 0, 0, 0 );
  ...
}

This code allocates the memory for the item, specifies an expiration time (via the FILETIME value, where 0 means that it doesn’t expire), and places it into the cache. The block of memory is now safely stored until the cache gets flushed or scavenged; at that point, we have a memory leak.

Why the memory leak? The cache is storing only void pointers; it knows nothing about how the memory it has been handed should be freed. It doesn’t run destructors, either. To prevent memory leaks, there is a hook to provide a deallocator, and it’s done on a per-entry basis. The last parameter in the call to the Add method is an optional pointer to an implementation of the IMemoryCacheClient interface, which has a single method:

interface IMemoryCacheClient : IUnknown {
    HRESULT Free([ in ] const void *pvData);
};

When an item is about to be removed from the cache, if you provided an IMemoryCacheClient implementation in the Add call, the cache calls the Free method to clean up. In this example, Free would just need to cast the void pointer back to a char pointer and call delete[] on it, because the buffer was allocated with new char[]. Unfortunately, there’s no standard implementation of this interface for use in the BLOB cache.
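The shape of such a deallocator can be sketched in portable C++. This is a simplified model, not the COM interface: ToyCacheClient drops the IUnknown plumbing and returns void instead of HRESULT, and ToyCacheSlot is a hypothetical stand-in for whatever bookkeeping the real cache does when it flushes or scavenges an entry.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for IMemoryCacheClient (no IUnknown, no HRESULT).
struct ToyCacheClient {
  virtual ~ToyCacheClient() = default;
  virtual void Free(const void* pvData) = 0;
};

// Deallocator for entries that were allocated with new char[].
class CharArrayFreer : public ToyCacheClient {
public:
  void Free(const void* pvData) override {
    // Match the allocation: new char[] must be paired with delete[].
    delete[] static_cast<const char*>(pvData);
    ++m_freed;
  }
  int FreedCount() const { return m_freed; }

private:
  int m_freed = 0;
};

// Stand-in for a single cache slot that remembers its client.
struct ToyCacheSlot {
  const void* data = nullptr;
  ToyCacheClient* client = nullptr;

  // Models the cache removing the entry: hand the data back to the
  // client for cleanup, then forget the pointer.
  void Evict() {
    if (client != nullptr && data != nullptr) client->Free(data);
    data = nullptr;
  }
};
```

Because the client is supplied per entry, each entry can carry the deallocator that matches how its memory was obtained.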

Caching Files

The BLOB cache is useful for storing small chunks of arbitrary data, but sometimes you need to store large chunks. The file-caching service lets you create temporary files on disk; when the cache item expires, it automatically deletes the disk file.

The file cache operates much like the BLOB cache. You use IServiceProvider to get an IFileCache interface pointer. The file cache uses handles, just like the BLOB cache. The only major difference is that the file cache stores filenames instead of chunks of memory.

Summary

ATL Server is a set of classes to build ISAPI applications in C++. A typical ATL Server project is made up of three major components. The first is an ISAPI extension DLL that implements the required ISAPI methods. In addition, the ISAPI extension provides a thread pool and request-dispatching service to keep the web server responsive.

The second component is the .srf or stencil file. This is a file that contains replacements marked in {{ }} pairs. The .srf file is processed by a request-handler class, usually in a separate DLL. The third component, the request handler, actually processes the HTTP request and implements the replacements used in the .srf files.

ATL Server also provides utility functions to make web development easier. Input validation is supported for numeric ranges and string length, and a regular expression engine is included for sophisticated string analysis. Caching services are also provided to help improve performance in heavily loaded systems.