Complete Guide to C++ and Threads under NT
John M. Dlugosz
3/7/2016
Prelude .......................................................................................................................................... 4
Starting A Thread ........................................................................................................................ 5
Win32 CreateThread API ...................................................................................................... 5
Run-Time Library Issues ....................................................................................................... 8
Cutting it Down to Size ....................................................................................................... 10
 Simplify the Signature ............................................................................................. 10
 Passing Parameters .................................................................................................. 10
 Returning Results — In brief ................................................................................ 23
 Synchronize the Startup ......................................................................................... 23
 Reuse It ..................................................................................................................... 29
A higher-level C++ model .................................................................................................. 31
 The Launch Pad Model .......................................................................................... 31
 The Launch Pad — Starts a New Thread ........................................................... 32
 Mission Control — Controls The Background Process ................................... 33
 The Rocket — Is The Background Process ........................................................ 33
Ending A Thread ....................................................................................................................... 34
Software Models ........................................................................................................................ 35
Classification of Threaded Code ........................................................................................ 35
 OTO .......................................................................................................................... 35
 OTT .......................................................................................................................... 36
 R/W .......................................................................................................................... 36
 Grp ............................................................................................................................ 37
 FT .............................................................................................................................. 38
Applying Thread-Safety Classifications ............................................................................. 38
Serialization ............................................................................................................................ 39
Servers .................................................................................................................................... 39
Worker Threads .................................................................................................................... 39
GUI Threads ......................................................................................................................... 39
Synchronization Issues ............................................................................................................. 40
Race Conditions .................................................................................................................... 40
Deadlocks............................................................................................................................... 40
System and Library Details .................................................................................................. 40
 Handles ..................................................................................................................... 40
Atomic Operations .................................................................................................................... 41
Simple Instructions ............................................................................................................... 41
The "Interlocked" suite ........................................................................................................ 41
Kernel Synchronization Objects ............................................................................................. 42
Common Info on Kernel Objects ...................................................................................... 42
Mutex ...................................................................................................................................... 43
 Mutex Semantics ..................................................................................................... 44
 Win32 Mutex API Summary ................................................................................. 45
 Using Mutexes in C++ ........................................................................................... 46
Semaphore ............................................................................................................................. 50
 Semaphore Semantics ............................................................................................. 50
 Win32 Semaphore API Summary ......................................................................... 51
 Using Semaphores in C++ .................................................................................... 52
Event....................................................................................................................................... 52
 Event Semantics ...................................................................................................... 53
 Win32 Event API Summary .................................................................................. 53
 Using Events in C++ ............................................................................................. 54
Timer....................................................................................................................................... 55
 Timer Semantics ...................................................................................................... 55
 Win32 Timer API Summary .................................................................................. 55
 Using Timer in C++ ............................................................................................... 58
Change Notification ............................................................................................................. 58
 Semantics of File Change Notification Objects.................................................. 59
 An alternative: ReadDirectoryChangesW ............................................................ 59
 Semantics of Printer Change Notification Objects ............................................ 59
 Change Object API Summary ............................................................................... 60
Other Kernel Handles .......................................................................................................... 62
 Process and Thread ................................................................................................. 62
 File ............................................................................................................................. 62
 Console-input........................................................................................................... 63
Other Synchronization Primitives........................................................................................... 64
Win32 Critical Section .......................................................................................................... 64
Monitors ................................................................................................................................. 64
Condition Variables (used with Monitors) ........................................................................ 64
Spin Locks.............................................................................................................................. 64
Reader/Writer and Group Locks ....................................................................................... 64
Rendezvous ............................................................................................................................ 64
Conditional Semaphore........................................................................................................ 64
Waiting and Blocking ................................................................................................................ 65
Alertable States and APC's .................................................................................................. 65
Windows Messages ............................................................................................................... 65
A Survey of Wait Primitives ................................................................................................ 65
 WaitForSingleObject and WaitForSingleObjectEx............................................ 65
 WaitForMultipleObjects and WaitForMultipleObjectsEx ................................ 65
 SignalObjectAndWait ............................................................................................. 66
 MsgWaitForMultipleObjects and MsgWaitForMultipleObjectsEx ................. 66
Thread Priorities.................................................................................................................... 66
Communicating Between Threads and Processes ................................................................ 67
Anonymous Pipes ................................................................................................................. 67
Named Pipes.......................................................................................................................... 67
Mailslots.................................................................................................................................. 67
Sockets .................................................................................................................................... 67
Shared Memory ..................................................................................................................... 67
APC's ...................................................................................................................................... 67
Windows Messages ............................................................................................................... 67
Thread-Specific Data ................................................................................................................ 68
Overlapped I/O ........................................................................................................................ 69
Dynamic Link Libraries (DLL's) ............................................................................................. 69
Fibers ........................................................................................................................................... 71
Processes ..................................................................................................................................... 72
A C++ Threading Library........................................................................................................ 73
Prelude
Include info on why use threads.
Starting A Thread
The best place to start, I suppose, is at the beginning. The beginning of a thread I
should say.
There are three Win32 API functions that create a new thread, and we will look into
them in detail. There are also library functions (the compiler’s, my own, and hopefully
soon, your own) that start a thread by eventually calling one of the API functions. We
will discuss their purpose for existing and when to use them.
Once you know the raw API calls, more important advice concerns creating threads in a
higher level meaning of the word. Besides just calling the actual function to start a new
thread, you need to have code before and after that point to set things up and deal with
the results. Furthermore, the newly born thread needs to cooperate in the process, too.
We will go over all these details using C++ code, point out all the pitfalls, and
present sound advice on the proper way to start threads in your program.
Win32 CreateThread API
The most primitive raw function is actually CreateRemoteThread. In NT4,
CreateThread simply calls CreateRemoteThread with the current process as the first
argument. Another way threads are created is with CreateProcess. Starting a whole
new program also creates a single thread to run that program. We will concentrate on
CreateThread.
HANDLE CreateThread(
    SECURITY_ATTRIBUTES*,
    ulong stack_commit_size,
    THREAD_START_ROUTINE* thread_start,
    void* parameter,   // passed to thread_start
    ulong flags,       // flag, anyway.
    ulong* lpThreadId
    );
Where the thread_start parameter is:
ulong __stdcall THREAD_START_ROUTINE (void* parameter);
The first parameter specifies security attributes, and is the same as you find on any
kernel object creation function. You only need to supply this if you need to give the
new thread non-default attributes; namely, allow other users to obtain a handle to your
thread. This process (and any other process owned by the same user) can access the
thread by default, so normally NULL will do here.
You can also use the security attributes argument to make the thread's handle
inheritable. However, there is also a function (SetHandleInformation) to change the
handle's attributes, so you don't need to mess with the complex security structures if
this is all you need. Security issues are covered under Named Pipes, page xx. Unless
you are told otherwise, just use NULL for security attributes in all the functions
presented.
The second parameter is commonly called the stack size. However, its exact meaning
is seldom understood.
As it turns out, this value has nothing to do with the size of the stack, as far as your
program logic is concerned. That is, changing this value won't affect how deep you can
recurse without running out. Under Win32, the memory system allows reserved but
uncommitted memory. A megabyte is reserved for the stack, and this parameter
specifies how much is initially committed. As the stack grows deeper, more memory is
automatically committed.
Memory commitment is done in units of 1 page, where a page is typically 4K. If you
specify a small value, even zero, you still get one page initially set up for you. If you
overflow that one, the memory management system automatically adds another. Using
a larger value for the stack parameter simply gives you a larger initial size. This can save
the work of growing the stack later. But, under normal conditions, it takes an awful lot
of function calls to chew up 4K, so the work of adding another page is insignificant
compared to what it takes to need another page.
Unless you have special needs, don't worry about this value—just use zero. Besides a
potential efficiency concern, there is another case where this may be useful. If you precommit memory for your stack, you'll know you have enough virtual memory ahead of
time, before your thread starts doing work. If you let it grow as it goes, you may run
out of memory when the memory manager tries to expand the stack. There may be
cases where the extra degree of robustness is necessary.
Now, back to the total reserved capacity of one megabyte. This cannot be changed
within the program. All thread stacks used by a process use the same size stack,
specified as part of the process. You can override this value at link time.
The start address is where the action is. Literally, that's where action takes place. This
function is to a thread what main is to a process. Throughout this book, this is called the
thread-start function. The next parameter is a value that is passed as the parameter to the
thread-start function.
So, it's not surprising that the thread-start function takes an argument. This single
argument is declared as a void*, with the intention that you can pass a pointer to
anything, presumably a structure containing as much information as you need. The
return value is an unsigned long, with the idea that the thread exit value will be a status
code of some kind.
Actually, any 32-bit value will do in both places. That's what the __stdcall modifier is
for: This tells the compiler to use a certain calling convention to pass the parameter,
expect the return value, and clean up after the call. The calling convention in C++
programs can be changed on the command line or with pragmas or project options, so
it's an excellent idea to always declare the thread-start function with this modifier. The
generated code works just fine no matter what is passed and returned, as long as they
are both 32 bits. But the C++ compiler is pickier, requiring that declarations on
function pointers match exactly. So, don't do this:
struct S;  // not shown in the example

unsigned long __stdcall foo (const S* param)
{
    param->blah();
    //…
}

//… later
S value;
unsigned long ID;  // an output parameter
::CreateThread (0, 0, &foo, &value, 0, &ID);
It may work in C, but C++ is much stricter about type checking. The compiler will
object that &foo is the wrong type. Instead, write it this way:
unsigned long __stdcall foo (void* raw)
{
    const S* param= static_cast<S*>(raw);
    param->blah();
    //…
}
That is, always declare the thread-start function as taking a void* and returning a ulong.
You can't even differ by a const keyword. Then, inside the function, declare what you
really wanted and initialize from the void* raw parameter.
The creation flags can be 0 or CREATE_SUSPENDED, which is 4. That's it; there is only
one available flag. So why 4 instead of 1, or simply making it a bool instead? The same
set of flags is used in the CreateProcess API function. It's just that only this one flag
has any meaning when creating just a thread, not a whole new process.
If the CREATE_SUSPENDED flag is given, then the thread is created but never scheduled
to run. It's as if the thread-start function began with a call to SuspendThread. You can
make it go with a call to ResumeThread. These functions are covered on page xx.
If the CREATE_SUSPENDED flag is not used, then the new thread can start at any time.
It may run for a while before the thread that called CreateThread (called the parent thread
throughout this book, for simplicity) gets a time slice again.
The last parameter is an "out" parameter which receives the thread ID of the created
thread. The thread ID is a unique value that identifies this thread on the system. All
threads, regardless of which process they are running in, have unique ID's. The thread
ID is rarely used for anything, so don't worry about it.
The return value is a HANDLE to the thread. If the function fails, 0 is returned and
additional information is available via GetLastError.
Remember to always close the handle (with CloseHandle) when you don't need it
anymore. Closing the handle does not make the thread stop. The handle is used to call
other functions that deal with the thread, such as getting the exit value, waiting for the
thread to finish, performing some kinds of inter-thread communication, and suspending
the thread. If you don't plan on doing any of that stuff, it's perfectly OK for the parent
to close the handle immediately after calling CreateThread.
Even when the thread finishes (normally or abnormally), there is still a kernel object
representing the thread. This object does not go away until the last handle to it is
closed. Why would you want a handle to a valid object that represents a finished
thread? To get the thread's exit value, to see if it's finished yet, to prevent the thread ID
from being reused right away, or just because you like to leak memory by not closing all
your handles.
Run-Time Library Issues
The ::CreateThread API call is how a thread is ultimately created. However, the
compiler’s run-time library has its own functions you are supposed to use for this
purpose.
In Microsoft VC++ 4.2

    _beginthread (startaddress, stack_size, argument)
        returns -1, not NULL, on error
    _beginthreadex
        called just like CreateThread

and Borland C++ 5.0,

    _beginthread (startaddress, stack_size, argument)
    _beginthreadNT (startaddress, stack_size, argument, security, flags, id)
Both compilers have a _beginthread function, which is a simplified form. It is fine for
those typical cases when you don’t need the other three parameters. Note however that
Microsoft’s form returns –1 for an error, rather than a 0 handle as with all the other
functions under discussion.
Microsoft’s _beginthreadex is the simplest, because it is called exactly like
::CreateThread. Borland’s version, _beginthreadNT, has the parameters in a different
order.
Borland also has a declaration for _beginthreadex, which is an inline function that
rearranges the parameters, when the preprocessor symbol __MFC_COMPAT__ is defined.
These functions are provided as wrappers around the simple CreateThread API call so
that the run-time library can do some work of its own every time a thread is created. In
a future article, we’ll show you the proper way to do this in your own libraries, so the
user is not required to call your special function when creating threads.
The run-time library code does it automatically to some extent, so the perils of using
::CreateThread directly have been more than a little exaggerated. However, the
automatic behavior is not perfect. Here’s the scoop:
Under both compilers, the reason for this code is to allocate memory for thread-local
copies of variables that are “global” in the run-time library. For example, the standard
function strtok() needs to hold an internal state between calls. Under the multithreaded library, strtok() can be used on different threads at the same time without
confusion, since each thread maintains a different internal variable.
Under the Microsoft compiler, if the DLL version of the run-time library is used, there
is absolutely no problem with using CreateThread directly. The thread-specific data is
allocated the first time it is needed (if ever), and the DLL thread-detach code frees it
again. On the other hand, if you link with the LIB version of the run-time library, the
cleanup is never done so there is a small memory leak.
The situation with Borland is similar. The thread-specific data structure is allocated the
first time it is needed. However, the initialization is incomplete. If you call
::CreateThread directly and allow the thread data to be created on first use, a call to
_ExceptInit is missed. If you throw something, you get an infinite recursion loop.
Other than that, it works.
So most of the time, you are better off using the run-time library’s version. One nit is
that these functions return unsigned long instead of HANDLE, so you always have to
use an explicit cast!
Although similar in nature, there is one interesting difference between the two
compilers’ implementations. The Microsoft version starts the thread suspended, then
resumes it after the handle and id are stored in their variables. With Borland’s version,
it is possible for the thread to check its internal value (as recorded in the run-time
library’s variable) and see a wrong value for its handle or id, as the new thread may do
that before the parent thread continues. Also, Microsoft’s code puts a try block around
the thread-start function, so an unhandled exception¹ will cause just the thread to
terminate. With Borland, an unhandled exception in a thread causes the entire program
to abort.

¹ C++ exception, that is. The operating system's SEH is set up by the CreateThread function. In general, when I say “exception”, I mean a C++ exception.
Cutting it Down to Size
We've seen the most basic functions for creating a new thread. Now, just how is it best
used in C++?
 Simplify the Signature
There are six parameters to CreateThread, most of which are seldom used. In the vast
majority of uses, one or two parameters are enough. So, I present an overloaded form
that is easier to use:
HANDLE CreateThread (THREAD_START_ROUTINE* start, void* parameter= 0)
{
    ulong id;
    HANDLE retval= ::CreateThread (0, 0, start, parameter, 0, &id);
    if (!retval) throw win_error (__FILE__, __LINE__, GetLastError());
    return retval;
}
This requires one or two parameters only, and also automates error checking. In
general, you can craft your own simpler form that does just what you need.
 Passing Parameters
The thread-start function takes a single void* argument. This means that whatever
information you really need to pass needs to be squirted through a single 32-bit value.
Simple Casting (everything that fits, except pointers)
If you only need to pass a single value, and that value fits in 32 bits, you can simply use
a cast. To convert non-pointer types to/from a void*, use a reinterpret_cast. Old-style
casts are deprecated, so I'll set a proper example and use the keyword casts. If you are
unfamiliar with them, I'll tell you which kind of cast to use in which situation, cookbook style.
ulong __stdcall thread_start (void* raw)
{
    int value= reinterpret_cast<int>(raw);
    cout << "Thread got: " << value << endl;
    return 0;
}
void test1()
{
    int value= 42;
    HANDLE h= CreateThread (thread_start, reinterpret_cast<void*>(value));
    waiton (h);
    CloseHandle(h);
}
Here, the simplified CreateThread is as mentioned above: it fills in all the arguments I
don't care about, and does error checking. The waiton function waits for the thread to
terminate. It simply calls WaitForSingleObject, but could be more elaborate. For
example, it could allow the user to cancel the operation, or it could time-out after a
while and deal with a hung thread. These issues will be covered in full in "Rejoining
Threads", on page xx. For now, think top-down-design and trust the waiton function
to do what it's supposed to.
You can see the approach commonly used in Windows: a word is a word, and any
integer or pointer will work. That's fine in assembly language, and cool enough in C,
but in modern C++ it's something of a faux pas.
In C++, it takes a leap of faith to know that an int value can be stuffed in a void* and
subsequently retrieved unharmed. But we're not exactly talking about portable
programs here—this code is specific to Win32 at the very least, and often specific to
Windows NT. On these platforms, we know that this is a valid assumption. If it were
not, a lot of things would break in the Windows header files, and that goes against
Win32's creed of source level compatibility between supported platforms.
void test2()
{
    double value= 3.14159;
    void* r= (void*)value;                    // rejected by the compiler
    void* p= reinterpret_cast<void*>(value);  // also rejected
}
The function test2 will not compile. The compiler will reject the idea of casting a
double to a void*, by either syntax. However, the compiler is not objecting because I
asked to stuff an 8-byte value into 4 bytes. Rather, C++ doesn't allow casting between
pointers and floating point values. Casting a float, which is in fact 4 bytes, won't work
either. Conversely, I can write the following:
ulong __stdcall thread_start_3 (void* raw)
{
    __int64 value= reinterpret_cast<__int64>(raw);
    cout << "Thread got: " << value << endl;
    return 0;
}

void test3()
{
    __int64 value= 1;
    value <<= 60;
    value += 42;
    cout << "original value is: " << value << endl;
    HANDLE h= CreateThread (thread_start_3, reinterpret_cast<void*>(value));
    waiton (h);
    CloseHandle(h);
}
This code also tries to cast an 8-byte value into a void*, and the compiler takes it
without complaint². Run it, and I get this result:
original value is: 1152921504606847018
Thread got: 42
Clearly, I did not get the same value out that I put in. Because of the way the value was
constructed, I know I ended up with the low-order bytes only, with the high-order bytes
discarded.
So, beware of casting things into a void*. Do so only for integral types, bools, and
enumerations, and only when they are 4 bytes or smaller. For other things that are
small enough to fit, you will need to resort to trickery other than casting. A union is a
handy way to go. For example,
union trix4 {
    void* p;
    float f;
    char s[4];
};

ulong __stdcall thread_start_4a (void* raw)
// floating point value -- rejected in test 2.
{
    trix4 yipes= {raw};
    cout << "Thread got: " << yipes.f << endl;
    return 0;
}

ulong __stdcall thread_start_4b (void* raw)
// an array of 4 bytes packed as one value.
{
    trix4 yipes= {raw};
    cout << "Thread got: " << yipes.s << endl;
    return 0;
}
void test4()
{
    trix4 yipes;
    yipes.f= 3.14159;
    cout << "original value is: " << yipes.f << endl;
    HANDLE h= CreateThread (thread_start_4a, yipes.p);
    waiton (h);
    CloseHandle(h);

    char short_message[4]= "hi!";
    strcpy (yipes.s, short_message);
    cout << "original value is: " << short_message << endl;
    h= CreateThread (thread_start_4b, yipes.p);
    waiton (h);
    CloseHandle(h);
}

² Actually, the compiler did complain, though not because of the casting: Microsoft's <iostream.h> doesn't have an output operator for the __int64 type, so I added my own to the test code.
}
This program demonstrates that a floating point value was indeed squeezed through the
void* interface. It also shows how to do the same thing with an array of one-byte
values. Basically, anything that does in fact fit into 4 bytes will work with this technique.
But never do anything this ugly in public³. It could be just the technique you want for
avoiding dynamic memory management or synchronization issues. That is, passing the
needed data directly as the parameter has definite advantages over passing a pointer to
the real data, due to lifetime issues of the thing being pointed to. But, don't write code
like this—only write wrappers like this. Keep the actual ugly mechanism hidden inside
nice functions. More on abstracting the marshalling of data later (page 16), and more
on lifetime issues after that (page 23).
Simple casting on pointers
For pointers, anything can be implicitly converted to a void*. Unless it's const or
volatile, that is, in which case it still needs another step. To get the value out again, use
the static_cast keyword. To remove const (or volatile), use a const_cast.
ulong __stdcall thread_start_5a (void* raw)
{
    double* value= static_cast<double*>(raw);
    cout << "Thread got: " << *value << endl;
    return 0;
}

ulong __stdcall thread_start_5b (void* raw)
{
    const char* s= static_cast<char*>(raw);
    cout << "Thread got: " << s << endl;
    return 0;
}
void test5()
{
double pi= 3.14159;
// Footnote: any more than you would pass a kidney stone in public;
// another case of restrictive interfaces you just have to deal with.
const char message[]= "Hello World!";
HANDLE h= CreateThread (thread_start_5a, &pi);
waiton (h);
CloseHandle(h);
h= CreateThread (thread_start_5b, const_cast<char*>(message));
waiton (h);
CloseHandle(h);
}
For the double* in case 5a, the address of the variable was passed without any sort of
casting.
CreateThread (thread_start_5a, &pi)
On the other side, the function uses a static_cast to reverse the process. Although
both static_cast and reinterpret_cast are accepted by the compiler to do this, the
static_cast is correct in this case. That's because a static_cast is used to perform any
implicit conversion explicitly, and also to reverse one. The implicit conversion of
double* to void* is really a static_cast that goes unmentioned. So you also should use a
static_cast to reverse the process.
For the const char* in case 5b, the compiler can't do it implicitly, objecting because of
the const. const char* to const void* would be just fine, but const char* to plain void*
is a no-no. Note that the current formal specification of C++ indicates that character
string literals (stuff in double-quotes, such as "hello") are of type "array of const char",
when they used to be "array of char", as they still are in C. Compilers will vary for some
time, and old compilers will eventually be updated. So if you write
static_cast<void*>("hello") your compiler may not catch it as an error, but upon
upgrading your code will suddenly give warnings. So in the example I used a declared
array (where I specify the type I want) rather than a string literal to avoid any confusion.
This is the same technique you'll need on string literals, though contemporary compilers
may not require it.
Back to the point. The const_cast is used to strip off the const, and then the resulting
(non-const) char* is allowed to implicitly turn into a void*.
CreateThread (thread_start_5b, const_cast<char*>(message))
To get it back out, use static_cast, and leave off the const. The result of the cast (type
char*) will be implicitly turned into a const char* when assigned to s. Now, the
compiler will not complain if I declare s as a plain char* rather than a const char*. But,
the calling function will be most annoyed if the thread modifies message! When
squirting through the void* interface, you lose type information including any
const/volatile attributes. It's up to you to correctly take out exactly what you put in.
const char* s= static_cast<char*>(raw)
With pointers to class types (which includes structures and unions), there is another
problem to watch out for. The following code looks pretty much like &pi from case 5a
above: implicit conversion to send, and static_cast to recover. Case 6a works as
expected, but case 6b prints the wrong answer. Depending on the exact code, it may
give the right answer after all, crash, or call the wrong function!
class A {
int x;
public:
virtual void print() const { cout << x; }
A (int x) : x(x) {}
};
class B {
int x;
public:
virtual void print() const { cout << x; }
B (int x) : x(x) {}
};
class C : public A, public B {
public:
C (int a, int b) : A(a), B(b) {}
void print() const
{ A::print(); cout << ','; B::print(); }
};
ulong __stdcall thread_start_6a (void* raw)
{
A* value= static_cast<A*>(raw);
cout << "Thread got: ";
value->print();
cout << endl;
return 0;
}
ulong __stdcall thread_start_6b (void* raw)
{
B* value= static_cast<B*>(raw);
cout << "Thread got: ";
value->print();
cout << endl;
return 0;
}
void test6()
{
C value (42,66);
HANDLE h= CreateThread (thread_start_6a, &value);
waiton (h);
CloseHandle(h);
h= CreateThread (thread_start_6b, &value);
waiton (h);
CloseHandle(h);
}
See what's wrong? Let me reiterate a point: When squirting through the void*
interface, you lose all type information. It's up to you to correctly take out exactly what
you put in.
In this example, the test code is putting in a C*, and taking out an A* or a B*. That is
not the same type. If you've used C++ for any length of time, you might not think
anything of it. After all, thanks to the "isa" rule, a pointer to a derived class can be used
anywhere a pointer to the base is expected. In an ordinary function call, you can pass a
C* to a function expecting a B*.
But this is not an ordinary function call. By casting through the void*, type information
is lost and the compiler can't figure out what was meant. For a normal function call, the
compiler sees that a C* is standing in where a B* was expected, and generates proper
code to adjust the pointer. Specifically, the two base class subobjects cannot both be at
the same address within the complete object of type C. So, at least one of those test
functions is going to get the wrong pointer.
Creating structures just for passing
Unlike the reinterpret_cast or other tricks for passing non-pointer types, the static_cast
on pointers is guaranteed to be correct. That is, any pointer can be converted into a
void* and back again and you'll get out the same value you put in. The C++
specification says that this must be so.
So, as a matter of principle, passing pointers to the real data is much less distasteful than
coercing actual values into a void* representation.
No matter what you need to pass, you can pass a pointer to it without problems. No
worry about the value fitting into 4 bytes, or doing an end-run around the compiler's
sense of duty to prevent you from making certain casts.
If you really want to pass more than one value to the thread-start function, why not
create a structure just for the purpose? This gets into the concept of marshalling, which
means recognizing that this passing problem is a job in itself, and can be tackled
separately from the real point of the function.
Keep the marshalling code separate from the real functions
Here is an example that demonstrates a couple new things at once. The function count
counts from x to y stepping by z. This is written as an ordinary function, without
worrying about threads at all. Write it, run it, test it, and then come back for more.
void count (int x, int y, int z)
{
for (int loop= x; loop <= y; loop+=z) {
cout << loop << '\t' << flush;
Sleep (250); //delay a quarter second
}
cout << endl;
}
void test7()
{
cout << "before call to count" << endl;
count (1, 30, 2);
cout << "after call to count" << endl;
}
Notice that count is a perfectly normal function, with none of that void pointer crud.
The test7 function calls it, and you can see by running it that it blocks for about three
and three-quarter seconds before continuing.
What we really want is to count in the background, so that the calling function
continues right away while counting proceeds asynchronously. The count function is
no good as a thread-start function, since it doesn't have the right signature. To remedy
this, write another function to wrap it. Introduce a structure for the express purpose of
passing the various arguments.
struct count_args {
int low;
int high;
int step;
};
ulong __stdcall count_thread (void* raw)
{
count_args* args= static_cast<count_args*>(raw);
count (args->low, args->high, args->step);
return 0;
}
void test8()
{
cout << "before call to count" << endl;
count_args args= {1, 30, 2};
HANDLE h= CreateThread (count_thread, &args);
cout << "after call to count" << endl;
waiton (h);
}
The output from this version indicates that counting continues in the background:
before call to count
after call to count
1    3    5    7    9    11    13    15
17   19   21   23   25   27   29
The function test8 does the now familiar work to force the arguments to fit. But, look
again at count. It was not changed at all! The new function, count_thread, contains all
the extra work required to recover the arguments, and no real code concerning the
point of the function. Meanwhile, count contains the actual algorithm, and none of the
extra work needed for threading.
Sounds like a good idea: separate the parameter passing stuff from the algorithm logic.
This suggests that the parameter passing stuff is a problem in itself. Indeed, once
recognized as such, it deserves a name. This process of transporting parameters across
some kind of boundary is called marshalling. So, abstract the marshalling code.
With this new insight, we can see that test8 only comes halfway to meeting this new
criterion. The unpacking of the parameters is separated, but the packing up is not. Well,
easy enough to fix. Or is it? Introducing another function to accomplish the other half
gives us:
HANDLE detached_count (int x, int y, int z)
{
// this doesn't work right.
count_args args= {x,y,z};
return CreateThread (count_thread, &args);
}
void test9()
{
cout << "before call to count" << endl;
HANDLE h= detached_count (1, 30, 2);
cout << "after call to count" << endl;
waiton (h);
}
Looks good at first inspection. The test9 function is about as simple as the original
non-threaded version, with only a minimum of extra work needed to remember the
handle and wait on the background thread. The only problem is, it doesn't work.
The problem is our first introduction to lifetime issues, a general problem with
asynchronous functions. The args variable is local to detached_count, so it vanishes
when detached_count returns. Meanwhile, count is executing in the background after
detached_count does its stuff. Oops. The next section will revisit this issue in depth.
A simple way to make this work is to use the heap, not the stack. The parent thread
allocates the structure, and the new thread gets rid of it only when it's finished. All the
relevant code is within the marshalling functions, and neither count nor test10 care
about this little implementation detail.
struct count_args_10 {
int low;
int high;
int step;
count_args_10 (int x, int y, int z) : low(x), high(y), step(z) {}
};
ulong __stdcall count_thread_10 (void* raw)
{
count_args_10* args= static_cast<count_args_10*>(raw);
count (args->low, args->high, args->step);
delete args;
return 0;
}
HANDLE detached_count_10 (int x, int y, int z)
{
count_args_10* args= new count_args_10 (x,y,z);
return CreateThread (count_thread_10, args);
}
void test10()
{
cout << "before call to count" << endl;
HANDLE h= detached_count_10 (1, 30, 2);
cout << "after call to count" << endl;
waiton (h);
}
You'll notice that the marshalling implementation consists of a structure and two
functions. Hmm… shouldn't that ring a few bells for object-oriented programmers? It
sure sounds like a class to me. Both functions could be member functions, so there is
now only one "thing" out there, a class representing a detached counter. Let's take that
idea and run with it. Given such an object, can we logically do other things with it?
Waiting for it to finish ought to be a member as well, and that generalizes to a means of
getting back results. I'll illustrate by changing the count function to report the
number of loop iterations performed.
int count (int x, int y, int z)
// revised — demonstrates return value too.
{
int looped= 0;
for (int loop= x; loop <= y; loop+=z) {
cout << loop << '\t' << flush;
Sleep (250); //delay a quarter second
++looped;
}
cout << endl;
return looped;
}
class detached_counter_11 {
int low;
int high;
int step;
int result;
static ulong __stdcall thread_start (void* raw);
HANDLE h;
public:
detached_counter_11 (int x, int y, int z);
int wait_for_result() const;
};
ulong __stdcall detached_counter_11::thread_start (void* raw)
{
detached_counter_11* args= static_cast<detached_counter_11*>(raw);
args->result= count (args->low, args->high, args->step);
return 0;
}
detached_counter_11::detached_counter_11 (int x, int y, int z)
: low(x), high(y), step(z)
{
h= CreateThread (thread_start, this);
}
int detached_counter_11::wait_for_result() const
{
waiton(h);
return result;
}
void test11()
{
cout << "before call to count" << endl;
detached_counter_11 backgrounder (1, 30, 2);
cout << "after call to count" << endl;
int result= backgrounder.wait_for_result();
cout << "thread finished. Result is " << result << endl;
}
In test11, an object represents the background job, and the private implementation of
that object contains the marshalling mechanism and the thread creation logic. It uses
count, which is still an ordinary function that can be used independently of this class.
An advantage of this method is that the parameter packing mechanism is encapsulated,
without the lifetime problems seen in test9. To complete the concept, the destructor
should automatically wait if the background computation has not finished yet.
That's one way to package up the pieces we've already presented. Here is another way.
Instead of making it an object, make it a single function. The other pieces can be
hidden from the user. This example requires multiple source files to demonstrate.
Example 12 — header file
void count (int x, int y, int z);
HANDLE count_in_background (int x, int y, int z);
Example 12 — implementation .cpp file
#include "chapter_1_12.h"
#include <iostream.h>
namespace { //internal stuff
struct thread_args {
int low;
int high;
int step;
thread_args (int x, int y, int z) : low(x), high(y), step(z) {}
};
ulong __stdcall thread_start (void* raw)
{
thread_args* args= static_cast<thread_args*>(raw);
count (args->low, args->high, args->step);
delete args;
return 0;
}
} // end of unnamed namespace
/* /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ */
void count (int x, int y, int z)
{
for (int loop= x; loop <= y; loop+=z) {
cout << loop << '\t' << flush;
Sleep (250); //delay a quarter second
}
cout << endl;
}
HANDLE count_in_background (int x, int y, int z)
{
thread_args* args= new thread_args (x,y,z);
return CreateThread (thread_start, args);
}
Example 12 — main program file
#include "chapter_1_12.h"
#include <iostream.h>
int main()
{
cout << "before call to count" << endl;
HANDLE h= count_in_background (1, 30, 2);
cout << "after call to count" << endl;
waiton (h);
cout << "thread finished." << endl;
return 0;
}
end of Example 12
Here, the header file gives two functions—the regular one and a background version.
In the implementation file, the background version uses the concepts presented in this
chapter to launch the first function in its own thread. The fact that the second function
is implemented in terms of the first, and how it manages to do so, is not present in the
header.
C++ Feature
You'll notice that the pieces of the marshalling implementation, a function and a
structure, are defined inside an unnamed namespace. This is because those items
are used inside this source file only. Declaring the thread_start function as static
would accomplish the same thing for the function, but there is no such thing as a
static class. The names of all classes (including simple structures) have external
linkage and must be unique in the program. Using an unnamed namespace is the
only way to keep thread_args local to this translation unit. In practice, if you didn't
protect it this way and then used the name thread_args in another file, the
compiler+linker probably won't notice this violation of the "one definition rule".
Nothing bad will happen for plain structures with no member functions if that
struct is never used to specialize a template. Well, it might still mess up in really
contrived situations, which is why the rule was changed so that all
structure/class/union names have external linkage, even if they are simple and have
no member functions.
 Returning Results — In brief
The previous section was all about passing parameters. But what about return values?
A brief survey is in order before continuing. The concepts mentioned here will be
presented in full later in the book.
The background counting example is given information on what to do, and then has no
need to communicate results back to the parent thread. Often, threads are not so
autonomous, but require communication with other threads, including the parent thread.
There are several ways to return results from a thread to the parent. In test11 one of
the parameters was an out-parameter. That is, the structure used to pass in the three
counting parameters also was used to receive the result. Meanwhile, the parent has
some way of knowing when the result is ready. In this case, when the thread is done,
the result is ready. The background-counter object must live as long as the background
thread.
This concept can be generalized in two different directions. The first, the method used by
overlapped I/O in NT, is to specify where you want the result as one of the arguments,
along with some mechanism to signal "done". The second is to use a future object,
which represents the eventual value; reading from it blocks only if the computation
hasn't finished yet.
Alternatively, a thread can use its exit value to communicate information. This
works best when communicating simple status information, such as "print job finished
OK" vs. "print job terminated abnormally."
In a more general case, a thread giving information to its parent is just another case of
inter-thread communications. This can be done with callbacks (as seen with another
form of overlapped I/O in NT), or data queues. This is the best mechanism for things
modeled as a server loop.
 Synchronize the Startup
In the test9 example program, we ran into a problem with the lifetime of the arguments
passed into the new thread. This was worked around in test10, but the general
problem keeps popping up again and again in different guises.
For example, consider a thread that creates a window. The parent thread later tries to
post a message to that window. Most of the time it works, but sometimes it doesn't.
Seems that the new thread might be slow, and the other thread tries to use the window
before it's been created!
In general, you are faced with a critical initialization phase problem. The parent creates a
service or background object of some kind, and later tries to use it. How can it be sure
that the newly created object is ready for use? In the case of objects, it's a good idea to
let the constructor finish (running in the new thread) before the parent continues. To
take a more abstract position, you want to specify a critical initialization phase in the
new thread, so the parent thread waits for that to finish before continuing. The critical
initialization is synchronous with the parent's call to create the thread. The two threads
are not asynchronous until after this critical point is reached.
In many cases, making the parent wait right away is overkill. But, it's simple, and
addresses a wide variety of issues. Basically, the new thread's setup code is much easier
to write if it doesn't have to consider multi-threading issues until it's all set up.
Here is the general concept in pseudocode.
void function (a,b,c);
void function_in_background (a,b,c)
{
1. pack up {a,b,c} into structure.
2. launch thread_start function in its own thread.
3. wait for "ready" event from new thread.
4. return.
}
ulong __stdcall thread_start (void* raw)
{
1. unpack parameters.
2. signal "ready" to parent.
3. function (a,b,c);
}
The "ready" event is naturally modeled as an event flag, a synchronization primitive. They
are discussed starting on page xx. Here, test13, is a demonstration using the
background counter. Comparing this to test9, you'll find that it's the same principle,
only without the bug. Simply adding the "ready" flag to the logic already present in
test9 (which you recall is a straightforward attempt to move all the marshalling code out
of the algorithm code), overcomes the synchronization problems.
struct thread_args {
int low;
int high;
int step;
event_flag ready;
thread_args (int x, int y, int z);
};
thread_args::thread_args (int x, int y, int z)
: low(x), high(y), step(z),
ready(event_flag::manual_reset, false)
{}
C++ Feature
Some examples use an aggregate initializer for thread_args. That is, a list of values
in braces. Here I use a constructor because the event_flag object needs constructor
parameters. The aggregate form is a short cut so I don't have to write a constructor
for a simple structure. Since thread_args now has a member that's a full-blown
class, it's not so simple anymore.
ulong __stdcall thread_start (void* raw)
{
// 1. unpack the arguments.
thread_args* args= static_cast<thread_args*>(raw);
const int low= args->low;
const int high= args->high;
const int step= args->step;
// 2. signal "ready" to parent
args->ready.set();
// 3. call the function
count (low, high, step);
return 0;
}
HANDLE count_in_background (int x, int y, int z)
{
thread_args args (x,y,z);
HANDLE h= CreateThread (thread_start, &args);
tasking::WaitForSingleObject (args.ready.h());
return h;
}
void test13()
{
cout << "before call to count" << endl;
HANDLE h= count_in_background (1, 30, 2);
cout << "after call to count" << endl;
waiton (h);
}
The only reason I need critical initialization in test13 is to handle the lifetime of the
arguments being passed in. However, in more complex cases, handling the critical
initialization phase like this really does make a difference, and passing the arguments
easily is just another side benefit. Since many threads can benefit from this technique,
why not use it as a standard model, even in simple cases?
Launching threads around an existing normal function is fine for the case where you let
that function complete in the background, and don't interact with that thread until it's
finished. But what about more complex cases? Specifically, this function model doesn't
lend itself well to marking a critical initialization phase.
The function needs to be built with this "ready" indication in mind. So, it's not really a
lonesome old function being endowed with its own thread of execution by totally
independent code. But, the concept still holds that the function should be written as a
normal function, with normal C++ parameters. The marshalling code (which squirts
everything through the void*) and the mechanics of creating the thread should be
isolated into other functions. The algorithm is still in a function that just deals with the
algorithm, but the algorithm knows to signal "ready" at a certain point.
This next example, test14, demonstrates the concept. Here, memory allocation
represents the critical initialization phase, because it's the simplest resource to
manipulate. Also, error checking and cleanup are not shown, to keep the example
focused on the topic.
void alg14 (int x, int y, event_flag& ready)
{
// phase 1 — setup
int* array= new int[x];
ready.set();
// phase 2 — sustained background activity
for (int loop= 0; loop < x; loop++) {
array[loop]= y;
cout << '.' << flush;
Sleep (250);
}
}
It might seem silly to protect the memory allocation in such a manner, but for other
resources it could be an issue. For example, the new thread might open a file, and the
parent thread would work incorrectly if it proceeded before the file existed. Sometimes
this work must be done by the thread needing the resources, and cannot be done in
advance before starting the thread. We will see examples of this later.
The point is that the background process can be separated into two distinct phases.
The critical initialization phase is completed before the parent thread proceeds, and the
second phase then executes concurrently with the parent thread. Why you would need
to put stuff in phase one will be apparent in real code.
In order to signal the end of phase one, the function takes an event_flag as an
argument. So, unlike the counter example, alg14 is aware that it's designed for
background operation. But, there is no hint of the clunky marshalling code here.
The rest of the example should be familiar by now.
struct thread_args_14 {
int size;
int val;
event_flag ready;
thread_args_14 (int x, int y);
};
thread_args_14::thread_args_14 (int x, int y)
: size(x), val(y),
ready(event_flag::manual_reset, false)
{}
ulong __stdcall thread_start_14 (void* raw)
{
thread_args_14* args= static_cast<thread_args_14*>(raw);
alg14 (args->size, args->val, args->ready);
return 0;
}
HANDLE background_14 (int x, int y)
{
thread_args_14 args (x,y);
HANDLE h= CreateThread (thread_start_14, &args);
tasking::WaitForSingleObject (args.ready.h());
return h;
}
void test14()
{
cout << "before call to background task" << endl;
HANDLE h= background_14 (30,'A');
cout << "after call to to background" << endl;
waiton (h);
}
But look how much simpler the thread_start_14 function is. The critical initialization
phase of alg14 subsumes any need for thread_start_14 to protect the critical
initialization of its own. Specifically, there is no need to copy the arguments into local
variables. Instead, the event flag is passed in, and the function itself signals when it's
good and ready.
Here is another variation on the idea. Instead of a single function split into two phases,
why not have two different functions? A group of functions cooperating on a set of
data is an object. So, we should be looking at a class with member functions.
Initialization in C++ is the job of constructors, so phase one can be done by the
constructor.
Here is the general idea, in pseudocode:
void start_server (a,b,c)
{
1. pack up {a,b,c} into structure.
2. launch thread_start function in its own thread.
3. wait for "ready" event from new thread.
}
ulong __stdcall thread_start (void* raw)
{
1. unpack parameters.
2. server* p= new server (a,b,c);
3. signal "ready" to parent.
4. p->serving_loop(); //e.g. a message pump
}
This works best with servers, where the background thread gets orders from other
threads rather than just performing a single task and ending. We'll see this in more
detail starting on page xx. But for the sake of not being satisfied with just pseudocode,
here is the same example coded with this technique.
class server15 {
const int size;
int* data;
public:
server15 (int size);
~server15() { delete[] data; }
void go (int value);
};
server15::server15 (int size)
: size(size), data(0)
{
data= new int[size];
}
void server15::go (int value)
{
// phase 2 - sustained background activity
for (int loop= 0; loop < size; loop++) {
data[loop]= value;
cout << '.' << flush;
Sleep (250);
}
}
typedef thread_args_14 thread_args_15; //no change
ulong __stdcall thread_start_15 (void* raw)
{
thread_args_15* args= static_cast<thread_args_15*>(raw);
server15* server= new server15 (args->size);
const int val= args->val; //must save this
args->ready.set();
server->go (val);
delete server;
return 0;
}
HANDLE background_15 (int x, int y)
{
thread_args_15 args (x,y);
HANDLE h= CreateThread (thread_start_15, &args);
tasking::WaitForSingleObject (args.ready.h());
return h;
}
void test15()
{
cout << "before call to background task" << endl;
HANDLE h= background_15 (30,'A');
cout << "after call to to background" << endl;
waiton (h);
}
Notice that the go member is phase two only, while the whole constructor is assumed
to be phase one. The thread-start function must save args->val in a local variable
before signaling done, as once the signal is sent, the code must assume that args is no
longer a valid object. If the value is to be passed to go, it must be saved earlier. The
args->size passed to the constructor had no such problem, and this is the fundamental
difference between phase one and phase two. Phase two is asynchronous with the
parent thread, and phase one is not. Never underestimate the ramifications of this.
Once the "ready" signal is sent, the code executes in a more hostile environment.
 Reuse It
You may have noticed a strong similarity among the examples in this chapter. After all,
the means to start a thread is being taught as a higher-level idiom, or "pattern" if you
prefer. The point is to learn it in detail and then apply it over and over. So, can the
compiler allow you to build reusable components to fill this role?
Yes and no. The marshalling code is different every time, due to the different
signatures in whatever background function you are trying to start. But the two piece
solution of doing the marshalling separate from the caller and thread-start function sure
looks like it ought to be reusable somehow.
If thread-start functions always took the same arguments, it would be a simple matter,
either using templates or function pointers.
So, what we need is an element of uniformity. The previous sample, test15, offers a
hint. The test11 sample offers hints, too. We want to separate out all the parameter
packing and unpacking from the code that does the thread launching. So, put that in an
object, which is supplied to the reusable launching code.
Here is a solution that does the simplest case (that is, no critical-initialization phase).
template <typename T>
struct dummy {
static ulong __stdcall launch_thread_helper (void* raw)
{
T* p= static_cast<T*>(raw);
p->start();
return 0;
}
};
template <typename T>
ratwin::types::HANDLE launch_thread (T& x)
{
ulong id;
THREAD_START_ROUTINE start= &dummy<T>::launch_thread_helper;
return CreateThread (id, start, &x);
}
C++ Feature
The launch_thread_helper function is a static member of a dummy class, rather
than simply being a template function. That is because the compilers don't support
template functions that don't use all the template parameters in the function's
argument list.
Also, the address of the function is assigned to a temporary variable start rather
than being used directly in the parameter list to CreateThread. This makes the lines
smaller, but wasn't done in the name of clarity. Rather, Microsoft C++ couldn't
handle it without the temporary.
To use it, supply an object that has a start member. The data needed for the new thread
will already be inside the object.
class detached_counter_16 {
int low;
int high;
int step;
public:
detached_counter_16 (int x, int y, int z) : low(x), high(y), step(z) {}
void start() { count (low, high, step); }
};
int main()
{
cout << "before call to count" << endl;
detached_counter_16 backgrounder (1, 30, 2);
HANDLE h= launch_thread (backgrounder);
cout << "after call to count" << endl;
waiton (h);
return 0;
}
The launch_thread facility ends up calling start() on the supplied object. So, the code
needs to provide such a function in a suitable object. It's pretty simple: start just calls
the function I really wanted started.
But, in order for start to do that, it has to know what data to pass to count. To
accomplish this, the values are stored in members. To build that object, a constructor is
used.
Thanks to the use of templates, you have full creativity in exactly how you arrange this.
As long as backgrounder->start(); is a legal statement, it works.
Alternatively, you could do without templates if you had a starter abstract base class that
declared start as a virtual function. Then, all starters would have to be derived from that
base class, and that base is what launch_thread would take. The lifetime issues are the
same as in test11. If you want backgrounder to represent the background task, it
needs to outlive the thread. Or, if you want backgrounder to represent a launcher, you
need to deal with the critical-initialization phase so the arguments can be saved within
the thread before the launcher is destroyed.
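To make that alternative concrete, here is a minimal sketch of the starter base class. This is my illustration, not the book's code; std::thread from the modern standard library stands in for the CreateThread wrapper so the fragment is self-contained:

```cpp
#include <thread>

// Hypothetical sketch: the non-template design. Every launchable
// object derives from this abstract base.
class starter {
public:
    virtual void start() = 0;
    virtual ~starter() {}
};

// launch_thread no longer needs to be a template; it takes the base
// class by reference and the new thread calls the virtual start().
std::thread launch_thread (starter& s)
{
    return std::thread ([&s] { s.start(); });
}

// An example payload, in the shape of detached_counter_16.
class counter_task : public starter {
    int low, high, step;
public:
    int total;   // result, kept here for demonstration
    counter_task (int l, int h, int s) : low(l), high(h), step(s), total(0) {}
    void start() { for (int i = low; i < high; i += step) total += i; }
};
```

The trade-off is the one named above: every starter must derive from the base class, and the lifetime rules (the object must outlive the thread, or the critical-initialization phase must be handled) are unchanged.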
A higher-level C++ model
The preceding discussions demonstrate a single general idea on how to start a thread in
C++, with a large number of variations on the theme. It is possible to produce some
reusable code to implement this model. With that as initial input, I have developed a
higher level C++ abstraction of threads, complete with reusable support code.
 The Launch Pad Model
I've always thought that the "obvious" approach was to model a background process as
a C++ object. In my early home-brew multitasking systems, I used an object to
represent the thread itself. Today with operating system support for threads, that object
exists inside the kernel. It might seem that wrapping the HANDLE in a C++ object is
the natural way to proceed. After all, that works fine with files, windows, and other
resources.
But this model has inherent flaws, and that prompted me to find a better approach.
The result is my launch pad model, which uses three different abstractions to represent
different aspects of the dynamic system.
Consider the launch_thread template function above. Just what is the backgrounder
object representing? It does not meaningfully represent the background process. In
fact, the object does not have to continue to exist once the new thread finishes its
critical initialization phase. Clearly, this is something used to start an activity, not the
activity itself.
Contrast this with test11. There, the detached_counter_11 object represented the
background activity. The object had to live as long as the thread, and other code could
use the object in order to operate on the thread (in that example, getting the result of
the calculation).
But reading this abstraction into test11 has its flaws. The object has more data in it than
the thread actually needs. Some of the values are only needed to start the thread, not to
sustain it. An object that controls a thread should not have to carry around that extra
baggage. Ah, what's that there? The object controls the thread. It is not the thread itself, it's
a controller. The extra baggage indicates that two objects are needed—each fills a
different role, and can have distinct lifetimes. But neither represents the background
activity itself, and there are cases where such an abstraction is indeed useful.
So that gives a grand total of three different objects, each representing a different
concept. You can think of these roles as the launch pad, mission control, and the
rocket.
cute picture here.
 The Launch Pad — Starts a New Thread
Let's revisit test16 and the launch_thread function. The implementation of the
backgrounder object bears a striking resemblance to a more general idiom called the
Command Pattern4.
The command object lets code pass around instructions to be executed by other code
by making all such commands self-contained. This has other uses in programming, and
will in fact find much use later in this book. So, let's adopt a standard command object,
and rewrite test16 to use it. Here is the class under discussion:
class command {
public:
    virtual void execute() =0;
    void operator()() { execute(); }
    virtual ~command() {}
};
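For a concrete feel of the pattern, here is a small hypothetical command, add_command, which I am inventing for illustration. All the data it needs is stored inside the object, which is exactly what makes a command self-contained:

```cpp
// the standard command object from the text
class command {
public:
    virtual void execute() = 0;
    void operator()() { execute(); }
    virtual ~command() {}
};

// Hypothetical concrete command (my example): an instruction to add
// an amount to a counter. The target and the amount are captured at
// construction time, so the command can be handed to other code and
// executed later with no further context.
class add_command : public command {
    int& target;
    int amount;
public:
    add_command (int& t, int a) : target(t), amount(a) {}
    void execute() { target += amount; }
};
```

Code holding an add_command through a command* or command& can invoke it with `cmd();` without knowing what it does internally.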
Another flaw with the earlier example was that, as a reusable mechanism, it's somewhat
lacking in customizability. It assumes common defaults for many of the thread-creation
settings, and in some cases you may want to change these. Instead of adding more
4 Put reference to the gang-of-four Patterns Book here.
forms of launch_thread that take various other arguments, think objects. After all, we
already decided that the launch pad should be an object, not a single function. If
launching the thread is behavior of an instantiated launch pad, then it stands to reason
that the launch pad can be configured (perhaps extensively) before use.
Here is the general idea to give a feel for what is wanted. Assume that backgrounder is
an instance of something derived from command, and is similar to its namesake in the
earlier example.
// simple case (as before)
launch_pad launcher;                 //create with defaults
launcher.launch (backgrounder);      //use it.

// more subtle use
launch_pad slow_launcher;
slow_launcher.priority (low);
// … change other settings as desired.
// … later …
slow_launcher.launch (backgrounder);
// … later still …
slow_launcher.launch (backgrounder_2);
A launch pad object can be created and configured. Then that launch pad can be used
to launch any number of background activities. The launch pad is not the background
activity—rather, it is something which starts the background activity. There is a clear
separation of roles between the background process itself and the launch pad.
A launch pad can be a complex thing in itself. In a rocket launch, look at all the support
structures, gantries, ground vehicles, and personnel involved. That's all to support the
launch process, and is baggage that is left behind on the ground, never becoming part of
the actual mission in space. Meanwhile, the launch pad can be used again as soon as the
launch is finished. Many rockets in orbit could potentially trace their start back to the
same launch pad.
Here is a first cut of code to implement a launch pad, with our good old counter
example as the payload.
 Mission Control — Controls The Background Process
 The Rocket — Is The Background Process
Ending A Thread
Software Models
Intro goes here
Classification of Threaded Code
I've designed a classification system to help document and understand threading issues
associated with an object, class, or other system. These models act as mental shorthand,
so that once a class is pegged it is well understood. Why go to the trouble of explaining
(and hopefully documenting) the same basic thing over and over again? Instead, learn
these common classifications and just refer to them.
First, we need to specify what level of organization we are describing. Choose one of
the following:
    I    Instance
    G    Group
    C    Class
    S    Subsystem
Then use one of the following ranks:
    1 (most restrictive)    OTO    One Thread Only
    2                       OTT    One Thread at a Time
    3                       Grp    Grouped members
                            R/W    Reader/Writer Grouped
    4 (least restrictive)   FT     Free Threaded
 OTO
The most restrictive ranking, One Thread Only (OTO) generally covers things that were
not designed with any thread safety in mind. It means that only one specific thread may
manipulate the object (or system or whatever).
If a class is documented as being I-OTO (read: Instance, One Thread Only), it means
that whichever thread created the object is the only thread that can subsequently
use the object. All member function calls (which include the constructor and
destructor) on a particular instance must use the same thread.
However, different instances can use different threads without any issue. For example,
if worker thread one creates object p1 and thread two creates object p2, then worker
thread one can call p1->foo and worker thread two can call p2->bar. It would be a
violation of the specification for thread one to call p2->bar, or to otherwise use p2 in
any manner.
Note that the ranking documents the way a class is supposed to be used. If p1 and p2
automatically detect use in the wrong thread and handle them correctly, then the class
would no longer be OTO (it would be FT). If p1 and p2 automatically detect the
misuse in a debug compilation and throw an exception, then the OTO label still applies.
Now I-OTO is pretty simple to achieve. Since a class's state is kept in member data and
is separate from any other instance, each instance is independent in how it is used.
And since OTO is the most restrictive rank, anything else the class's implementation
calls will satisfy it.
But, what if different instances share data? Consider the presence of a static data
member. Two different instances use the same underlying value, so there is contention
there. This would be classified as C-OTO, meaning the OTO rank applies across all
instances of the class, rather than individually to each instance.
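To make that shared-state hazard concrete, here is a hypothetical class (my example, not from the book) whose static scratch buffer forces the C-OTO classification:

```cpp
#include <cstdio>

// Hypothetical C-OTO class: format() writes into a buffer shared by
// every instance, so two threads using *different* instances still
// collide. The OTO restriction therefore applies class-wide, not
// per-instance.
class formatter {
    static char buffer[32];   // one buffer for all instances
public:
    const char* format (int value)
    {
        std::snprintf (buffer, sizeof buffer, "%d", value);
        return buffer;        // points into the shared state
    }
};
char formatter::buffer[32];
```

Two different formatter objects hand back the very same pointer, which is exactly the contention described above.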
But sometimes that is overkill. Suppose instances point to each other, such as in a
linked list. If two nodes are deleted from the same list at the same time, the link
pointers could be corrupted. But deleting nodes from different lists at the same time is
not a problem. So, this gets a G prefix, for Group. You must specify what the group
is. In this example, the group is all instances in the same collection.
 OTT
A step up from OTO is OTT. All uses of an object (I-OTT) or class (C-OTT) must
still be serialized, but it doesn't matter which thread makes the call. As long as only one
thread of execution at a time exists inside the component, it works fine. The
component doesn't have any special affinity for a particular thread.
Again, this is fairly easy to achieve. Classes written using only pure C++ features would
all be OTT, not OTO. A class fails being OTT, being demoted to OTO, only if it uses
a component that is OTO. So where did the first such class come from? Through the
use of system objects (such as window handles) that are OTO, and from the explicit use
of thread-local storage.
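The thread-local-storage case can be sketched with the modern thread_local keyword standing in for the TLS API of the time; the class below is my hypothetical example, not the book's:

```cpp
#include <thread>

// Hypothetical class demoted to OTO by thread-local storage: the
// running total lives in per-thread state, so a call from a second
// thread updates a different total, even when the calls are
// perfectly serialized.
class tally {
    static int& total()
    {
        thread_local int t = 0;   // one copy per thread!
        return t;
    }
public:
    void add (int x) { total() += x; }
    int get() const  { return total(); }
};
```

Calls made from another thread silently go to that thread's own copy, so the object only behaves correctly when a single specific thread uses it.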
 R/W
The Read/Write rank is a special case of Grouped. But it's so common that it's worth
having its own classification.
This means that the component can be used in two different ways: If one thread is
writing to the component, then no other thread can use the component. But if there
are no writers, then any number of threads can read from the component at the same
time.
For a ranking of I-R/W, the component is any instance. But all instances are
independent, so writing to one object doesn't put any restrictions on what can be done
with other objects. Rather, it just disallows any simultaneous access to the same object.
For C-R/W, the component is the set of all instances of the class. Writing to one object
means you can't use any other object at the same time.
Most C++ code can qualify as R/W rank with just a little bit of attention during
implementation. First of all, you can't use any components that are ranked as more
restrictive. As far as C++ primitives go, reference counting and other data sharing
among objects will cause a class to fail at being R/W safe.
 Grp
More generally, you can describe groups of members and then describe the restrictions
as resource limits.
Here is how a R/W is actually a special case of Grp. Pretend we are describing a simple
stack class. Start by describing the groups, and formally define a group by listing every
operation that is a member of that group.
Group A — Readers
    peek() const
    count() const

Group B — Writers
    push (const T& value)
    pop()
For readers and writers, it's pretty simple to tell which members go where. Except for a
few special cases, every const member is a reader, and all the others are writers.
But sometimes the simple readers/writers division is too limited. Formally defined
groups fall in a comfortable zone between total unorganizable chaos and the simplicity
of treating all operations the same. The key to making Grp meaningful and
understandable is to have groups that correspond to the high-level semantics of the
class.
For example, consider a double-ended queue (also known as a deque). It may be
designed so that operations on one end can be performed independently from
operations on the other end. You can do two things at once, provided the threads
manipulate different sides. This would give us the following groups:
Group A — one end
    push_front (const T& value)
    pop_front()
    clear()

Group B — the other end
    push_back (const T& value)
    pop_back()
    clear()
Notice that it's perfectly correct to have one operation appear in multiple groups.
Now that the groups have been identified, you state the restrictions in terms of how
many operations in each group can be executed at the same time. In this example, I
simply have:
{ 1, 1 }
Meaning that at any one time, I can have at most one member from group A and one
member from group B going on at the same time.
In the readers/writers example, this would be expressed as
    { n, 0 }    any number of readers but no writers
    { 0, 1 }    one writer alone
That is, give multiple cardinality lines. If a particular mix of operations can satisfy any
of the lines, the situation is legal. It is a good idea to give an informal description of
what each line represents, too.
The key to making Grp useful is to have informal group descriptions that are easily
understood and are meaningful for the semantics of the class, plus have formal and
rigorous (yet simple to produce) documentation on exactly what is being specified.
 FT
The Free Threading rank is the most relaxed. Basically, anything goes. This is what
many people think of when they say "thread safe". You can call any member at any
time on any thread. That's not to say that it won't block, but rather that the class won't
malfunction if it's called this way.
There is no meaningful distinction between I-FT and C-FT, so this can simply be called
"Free Threaded" without further qualifications.
Applying Thread-Safety Classifications
Serialization
Servers
Worker Threads
Contrast test16 with test11: there the object represented the background process.
GUI Threads
"Future" values
Producer/Consumer and Pipelines
Job threads (background tasks)
Job threads are more independent than worker threads. They don't have to be
"background". A job thread isn't taking orders from someone the way a server does; it
is doing its own job, as an independent program would.
Synchronization Issues
Race Conditions
Deadlocks
System and Library Details
 Handles
Explain how Kernel, USER, and GDI handles are all different, and how each relates to
threads.
Atomic Operations
Simple Instructions
The "Interlocked" suite
Kernel Synchronization Objects
A synchronization object is something that allows a thread to suspend itself and be
awakened when something interesting happens. The core of the operating system,
which is responsible for scheduling threads, has built-in support for this via kernel
objects. Other synchronization techniques can be built on top of these, but ultimately
one of these primitives is used to communicate the intent to wait to the scheduler deep
inside the operating system.
The kernel synchronization objects provided by NT are known as Mutex, Mutant,
Semaphore, Event, and Waitable Timer. The kernel object called a mutex is only used
in kernel mode. What the Win32 API calls a Mutex is actually a Mutant kernel object.
In this book, Mutex means the Win32 API Mutex.
Mutex, Semaphore, and Event are a pretty diverse group, in the sense that each can do
something the others can't. Mutexes remember which thread owns them and allow
recursive locks; Semaphores can count; and Events allow multiple threads to resume on
the same cue, or provide a way to gate a set of waiting threads without counting them.
Common Info on Kernel Objects
In general, kernel objects are managed the same way, so I'm only going to explain it
once. A kernel object is system-wide, and may be used by any process. They are
accessed by HANDLEs, which are reference-counted. A HANDLE is local to a process,
but different HANDLEs in different processes can refer to the same kernel object. The
object is destroyed when the last HANDLE is closed (via CloseHandle or implicitly by
terminating the process).
To acquire a HANDLE to a kernel object, you can duplicate an existing handle implicitly
by inheriting it in a child process, or explicitly by calling DuplicateHandle (a function
that has too many parameters for its own good).
Besides that, you can call a function that begins with Create… to access a named object
or create it if it doesn't already exist. If you want to access an existing one only without
the possibility of creating it instead, call the corresponding function that begins with
Open….
The security-attributes structure is the first parameter. If you don't plan on using the
object to synchronize across processes, just pass null.
The name is a nul-terminated string. The designation tchar* here means it's either a
char* or a wchar_t*, depending on whether you compiled for ANSI or UNICODE
character sets.
The name can be up to 260 characters long, and may contain any characters except
backslash ('\\'). Presumably it can't contain a nul ('\0') either, since that terminates the
input string. Alternatively, you can supply a null pointer for the name, and the mutex is
created without a name. [[[ so what's a name of "" (empty string) do? ]]]
If the Create… function finds an existing object with the same name, the existing object
is used and no object is created. If the existing object is of the wrong type, Create…
returns null and GetLastError returns ERROR_INVALID_HANDLE. You can't have
different kernel objects with the same name (e.g. a mutex and a semaphore).
If the existing object is the proper kind of object, parameters in the Create… call that
specify characteristics of the object are ignored. After all, the object already exists and it
is taken as-is. In the security attributes parameter, the actual security information is
ignored, but the inheritable flag is significant, as that applies to the new handle only
rather than to the object itself.
When Create… accesses an existing object, all rights are requested. If you want to
specify access rights in detail, or if you want an existing object only and not a new
object, use the corresponding Open… call.
The Open… functions take three parameters. The first specifies the desired access to
the object. Details of security are not covered in this book. If you are not interested in
fine-tuning what can be done with the handle, use MUTEX_ALL_ACCESS,
SEMAPHORE_ALL_ACCESS, etc. for the type of object you are opening.
The middle parameter to Open… specifies whether you want the handle to be
inheritable.
The last parameter is the name of the object. If it is not found, or it is the wrong type,
the function indicates failure by returning null.
Mutex
A mutex models a resource that can be acquired by a single thread. Semaphores model
resources that have multiple instances, but a mutex is more than just a binary
semaphore.
In general, a thread acquires a mutex before manipulating a shared resource. If the
mutex is unused, the thread proceeds, and the mutex is marked as being in-use. If the
mutex is already in use, the thread blocks. When the mutex is released, the waiting
thread (or one waiting thread, if there are several) wakes up and is allowed to acquire the
mutex. By writing your code so that a thread only manipulates a resource when it "has"
the mutex, you will be assured that only one thread at a time manipulates the resource.
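In code, the acquire/use/release discipline looks like this. A sketch with std::mutex standing in for the Win32 mutex, so the fragment is self-contained and portable:

```cpp
#include <mutex>
#include <thread>

// std::mutex stands in for the Win32 mutex. Each thread touches the
// shared counter only while it "has" the lock, so the increments
// from the two threads never interleave destructively.
std::mutex mx;
long counter = 0;

void worker()
{
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> hold (mx);   // acquire the mutex
        ++counter;                               // manipulate the shared resource
    }                                            // release as hold is destroyed
}
```

Without the lock, two threads running worker() concurrently would lose increments; with it, the final count is exact.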
 Mutex Semantics
Besides only allowing one lock at a time, the mutex remembers which thread locked it,
and considers itself affiliated with that thread. There are several ramifications to this
feature:
 The same thread must do the unlocking.
 The same thread can harmlessly acquire the same mutex again.
 Other threads can sense "abandonment".
The most significant purpose for having mutexes remember who locked them is a
feature known as recursive locks. For an illustration, consider an object that has a
mutex among its member data. The member functions acquire the mutex before
manipulating the object's state, to provide for thread safety in the object's use. That
sounds simple, but consider what happens when one member calls another:
void C::foo()
{
    1. acquire the mutex
    2. do some calculations
    3. calculations include a call to bar()
    7. release the mutex
}

void C::bar()
{
    4. acquire the mutex
    5. do some calculations
    6. release the mutex
}
Now consider what happens when foo is called. In step 1, the mutex is acquired. But
later, in step 4, the mutex is acquired again. If mutexes were equivalent to binary
semaphores, the thread would block, since, strictly speaking, it's trying to acquire a
mutex that is already in use. The thread would deadlock on itself!
This is a common enough issue that a way to deal with it was supplied as a primitive in
the OS. Because the mutex remembers which thread got it, it can realize that the same
thread is asking again. So, in step 4, the line proceeds without blocking. When the
thread asks for the mutex, instead of the system thinking "it's in use; gotta wait for it" it
realizes "he already has it; proceed".
Every lock has to be balanced with an unlock. That is, you don't want step 6 to indicate
that this thread is done with the object. Instead, the mutex has to remember how many
locks are pending and not release until the last lock has been removed. So, step 7 really
releases the mutex.
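The foo/bar scenario can be reproduced with std::recursive_mutex, a modern analogue of the Win32 mutex's recursive-lock feature. The numbered comments track the steps above:

```cpp
#include <mutex>

// Modern analogue of the recursive-lock behavior: bar() locks the
// same mutex the calling foo() already holds, on the same thread,
// and proceeds instead of deadlocking. The mutex is only truly
// released when the outermost guard (step 7) is destroyed.
class C {
    std::recursive_mutex mx;
    int state;
public:
    C() : state(0) {}
    void bar()
    {
        std::lock_guard<std::recursive_mutex> g (mx);  // step 4: same thread, proceeds
        state += 1;                                    // step 5
    }                                                  // step 6: lock count drops
    void foo()
    {
        std::lock_guard<std::recursive_mutex> g (mx);  // step 1: acquire
        state += 10;                                   // step 2
        bar();                                         // step 3
    }                                                  // step 7: really released
    int value() const { return state; }
};
```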
Win32 does not document the range limit of the mutex's lock counter, nor document
what might happen if you lock too many times for the implementation to handle. I
investigated this through empirical testing in NT4 and an examination of the structures
found in the DDK (device-driver development kit) headers. The signaled state of any
kernel object is stored as a signed long. Besides being signaled or not signaled, the
actual value means different things to different objects. This value is used for the
recursive acquisition count in mutexes. So, the count is limited to 31 bits, the positive
values representable in a long. If you acquire a mutex something over two billion times,
you get the blue screen of death, as the kernel throws an unhandled exception.
As a secondary benefit to keeping track of which thread locked a mutex, the system can
perform error checking on the unlocking. The system insists that a mutex is released by
the same thread that acquired it.
So what happens if a thread terminates without releasing a mutex that it has acquired?
For other synchronization objects, nothing in particular happens. But for mutexes, the
system knows that the only thread that could have released the mutex didn't and never
can. What happens is that the mutex is forcefully released, and a different code, to
indicate that the mutex was abandoned, is returned to the waiting thread that acquires it.
 Win32 Mutex API Summary
A mutex is created using CreateMutex or OpenMutex, and destroyed using
CloseHandle.
A mutex is acquired by using any of the Wait functions, described on page 65. A mutex
is released using the ReleaseMutex function.
The CreateMutex function
HANDLE CreateMutex (
    SECURITY_ATTRIBUTES*,
    bool grab_right_away,
    const tchar* name );
This returns a handle to a new or existing mutex, as explained on page 42. On error, it
returns null and GetLastError provides more information.
The grab_right_away parameter allows the mutex to be acquired immediately upon
creation. It has the same result as:
HANDLE h= CreateMutex (security, false, name);
WaitForSingleObject (h, 0);
but has the benefit of being an atomic operation. That is, no other thread is allowed to
grab the mutex between the time it is created and the time WaitForSingleObject is
executed. If the mutex has a name, that could otherwise happen.
The ReleaseMutex function
BOOL ReleaseMutex (HANDLE mutex);
Nothing to it—call this some time after a mutex is acquired in order to release it.
 Using Mutexes in C++
Mutexes are used to guard usage of a shared resource, or to otherwise make sure that
one thread at a time does something. Their proper use is to acquire the mutex, perform
the operation that is to be protected, and then release the mutex again. Sounds simple?
Well, look again.
void do_something()
{
    WaitForSingleObject (MX, INFINITE);
    // OK, do my work.
    int y= foo(3);
    cout << y << endl;
    // done with my work, punch out.
    ReleaseMutex (MX);
}
What's wrong with this simple code? The very nature of C++. It looks innocent
enough until you discover that MX is never being released and your threads are hanging.
What happens if foo throws an exception?
What happens is the stack is unwound. Presumably a caller of do_something
eventually handles the error, but ReleaseMutex is never called. Execution jumps from
foo all the way to some catch block, and the remainder of do_something is skipped!
Don't write code like this in C++. Remember that resource acquisition is best modeled
as initialization. The region where MX is held should be tracked by using the lifetime of
an object.
A Simple Solution
Here is a minimal annotated class to accomplish this.
class locker {
    HANDLE h;
public:
    explicit locker (HANDLE h);   //locks
    ~locker();                    //unlocks
};
locker::locker (HANDLE h)
 : h(h)
{
    WaitForSingleObject (h, INFINITE);
    // >> in real code, error checking would go here.
    cout << "acquired mutex " << h << endl;
}

locker::~locker()
{
    ReleaseMutex (h);
    // >> in real code, error checking would go here.
    cout << "released mutex " << h << endl;
}
C++ Feature
The explicit keyword on the constructor is used to prevent this constructor from
providing an implicit conversion. That is, I can't accidentally use a HANDLE where a
locker was expected.
Here is the do_something example using this locker class.
HANDLE MX= CreateMutex (0, false, 0);

int foo (int x)
{
    if (x==3) throw "invalid argument to foo";
    return 1000 / (x-3);
}

void do_something()
{
    locker _ (MX);
    int y= foo(3);
    cout << y << endl;
    // releasing MX here is implicit.
}

int main()
{
    try {
        do_something();
    }
    catch (const char* message) {
        cout << "exception caught: " << message << endl;
    }
    return 0;
}
A locker variable is declared, and while it is in scope, the thread holds MX. What is the
name of the variable? I don't use it except to construct it, so what do I need with a
name? C++ doesn't provide anonymous variables, so I just used a throw-away name.
An underscore by itself is just as valid a name as an x by itself. However, don't use
names that contain two consecutive underscores, as those are reserved to the
implementation.
Eliminating the need to explicitly release the mutex is not just cosmetic, and not just a
convenience to encourage multiple returns from the middle of a function. Due to
exception handling, it's necessary. Don't even think about using mutexes without
destructor semantics.
This program outputs (the handle value will vary from run to run):

    acquired mutex 0x0000004C
    released mutex 0x0000004C
    exception caught: invalid argument to foo
So you can see that the mutex was properly released even though do_something did
not complete normally. There is no need for special clean-up code, as it is written in a
proper exception-safe manner.
Limitations of the Locker Class
Windows provides the ability to wait on more than one thing at a time. This is a very
powerful feature to have built into the operating system at a low level, simply because
it's difficult to do well using simpler primitives.
Waiting for the first available of several synchronization objects, some of which are
mutexes, is not an elegant thing to do in C++. Consider the pseudocode:
mutex A;
mutex B;
wait for A or B
if (A was acquired) {
    do something
    release A
}
else {
    do something else
    release B
}
If you want to handle situations like this, you need to separate the multiple-wait ability
from the creation of the locker. That is, one branch needs a locker on A, and the other
branch needs a locker on B. At least it's not spaghetti—there is still some organization
in that the structure matches the scope of two distinct lockers. I'm sure you realize that
it could be far worse. Don't use mutexes as events. Use them only with resource
acquisition semantics.
Waiting on multiple mutexes and then switching on the result is something that is better
done using other means. Specifically, look at a server loop using callback routines
instead. See page xx for an example using background I/O.
On the other hand, it does make sense to wait for multiple things when exactly one of
them is a mutex. For example, code may need to use some resource. But while it waits
for access, the user can click on a cancel button, or the wait can time-out. This is easy
to model in C++ by using exceptions. Constructing a locker represents resource
acquisition. Failing to acquire the resource (because the user clicked cancel, or because
the code got tired of waiting) is failure to construct. The locker's constructor doesn't
return normally, but throws an exception instead.
try {
    locker _ (construct on mutex A,
              or abort if user clicks cancel before A is available);
    // if code got this far, _ is constructed, resource is acquired
    do something
    // implicit release of A as locker goes out of scope
}
catch (cancelled) {
    recovery code
}
Often, such code does not need an explicit try block. Rather, a single try block is used
around the entire operation. Any time a resource is acquired, it waits on the cancel
button as well. That way, the user can always quit when the operation is blocked.
Another complex wait scenario is to wait on multiple mutexes at the same time, where
you want all of them. This can be done by having the locker object take an array of
mutexes. Construction succeeds only when all are acquired, and the destructor knows
to free all of them. This is semantically different from acquiring one resource and then
another. Consider:
void foo()
{
    locker L1 (A);
    locker L2 (B);
    // do stuff…
}

void bar()
{
    locker L2 (B);
    locker L1 (A);
    // do stuff
}
Here, foo and bar can deadlock if called at the same time. This particular case can be
fixed by using a consistent ordering to resource acquisition, as explained on page xx
under deadlocks. But, there are still issues involved with sequential acquisition of
resources. In general, you should avoid holding one thing while waiting on something
else. If you could instead write:
void foo()
{
    locker L (A,B);
    // do stuff…
}

void bar()
{
    locker L (B,A);
    // do stuff
}
such that A and B are simultaneously acquired, you will be better off. Windows
supports a primitive to do this: either A and B are both acquired at the same time, or
neither is held while the thread is waiting. It's all or nothing, not a piecemeal acquisition.
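For comparison, standard C++ attacks the same problem with std::scoped_lock, which acquires its whole list of mutexes using a deadlock-avoidance algorithm (try-and-back-off rather than a single kernel call, but foo and bar still cannot deadlock):

```cpp
#include <mutex>
#include <thread>

std::mutex A, B;
int total = 0;

// foo and bar name the mutexes in opposite orders, yet cannot
// deadlock: scoped_lock acquires A and B as a unit, backing off and
// retrying rather than holding one while blocking on the other.
void foo() { std::scoped_lock hold (A, B); ++total; }
void bar() { std::scoped_lock hold (B, A); ++total; }
```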
So, reasonable and proper uses of mutexes can be structured using a locker class, where
the class supports a list of mutexes (a single mutex being the degenerate case), as well as
other non-mutex synchronization objects that abort the construction of the locker.
A full blown solution, in the C++ Threading Library, is presented on page xx.
Semaphore
Semaphores can model resources that have multiple instances. For example, a public
restroom can handle three patrons at a time. The people lined up outside the door are
the ones "blocked" on acquiring the resource. Rather than modeling each seat as a
separate resource, the entire group of three is controlled by a single semaphore.
But as pointed out earlier, a mutex is not just a binary semaphore. The mutex has to be
released by the same thread that acquired it. The semaphore has no such restriction, so
something can be acquired by one thread and released by another. No, that's not
necessarily a symptom of spaghetti in the design. Think of "consume" instead of
"acquire", and "produce" instead of "release", and you can see another role for
semaphores in modern C++ programs.
 Semaphore Semantics
When used to model resource acquisition, like a mutex, acquiring the semaphore means
that the resource may be used. However, it does not track the individual identity of a
pool of resources. Think of a turnstile on the door of the public restroom. It is
preset with the capacity, three in this case. Passing through the door turns the crank
and decrements the counter. If the counter shows zero, the gate will not turn at all. So,
we get lines outside the door.
On the way out, the gate is turned in the other direction so the counter is incremented.
That allows the gate to unlock and another person to enter. The mechanism in the gate
doesn't keep track of which seat each person takes, or which people even turned the
gate.
In comparison, the mutex is more like the key to the restroom at a road-side filling
station. One person acquires the key, and that person possesses the key which
symbolizes the acquisition of the resource. That token (the key) must be turned back in
afterwards. The only way to free the resource is for the key holder to give it up.
Unlike a mutex, a semaphore doesn't keep track of who "has" it. It's just a counter. If
the same thread waits on a semaphore twice, the semaphore is decremented twice.
Call it enough times, and the thread blocks.
The other way to use a semaphore is to model a renewable and consumable resource.
Instead of acquiring something and then having to give it back, you acquire something
and use it up. Think of a bakery kiosk that sells cakes. The patron goes in, buys a cake,
and presumably takes it home and eats it. The counter indicates the number of cakes
that may be acquired. When the counter hits zero, the line forms outside the door. But
customers never bring the cakes back! How does the counter ever increase again? A
different person (the baker) delivers new cakes.
So, one thread (the consumer) decrements the counter, and a different thread (the
producer) increments the counter. The counter represents a buffer so that the producer
and consumer can run asynchronously (though in the long run, they need the same
average rate). Producer/consumer problems are discussed in more depth on page xx,
and simple examples are given later in this section, on page xx.
 Win32 Semaphore API Summary
A semaphore is created using CreateSemaphore or OpenSemaphore, and destroyed
using CloseHandle.
A semaphore is acquired by using any of the Wait functions, described on page 65. A
semaphore is released using the ReleaseSemaphore function.
The CreateSemaphore function
HANDLE CreateSemaphore (
    SECURITY_ATTRIBUTES*,
    long initial_count,
    long max_count,
    const tchar* name );
This returns a handle to a new or existing semaphore, as explained on page 42. On
error, it returns null and GetLastError provides more information.
The initial_count is the initial counter value given when the semaphore is created. It
must be between 0 and max_count, inclusive.
The max_count is an upper limit to the counter. This provides for some error checking
on releasing a semaphore—it is an error to release (increment the counter) past the
maximum.
The ReleaseSemaphore function
BOOL ReleaseSemaphore (
    HANDLE semaphore,
    long release_count, //must be >0
    long* prev_count );
This function increments the counter of the semaphore by the value of release_count.
That is, it is not limited to simply incrementing by one.
If the current count plus release_count would be over the maximum allowed for this
semaphore, the function returns false and the counter is not changed. Specifically, note
that changing the counter by any amount is an atomic all-or-nothing operation. By
analogy, if the baker tries to deliver a dozen cakes at once and the display case only has
room for ten, none of them are delivered, as a dozen won't fit. If you want partial
delivery, call ReleaseSemaphore in a loop (deliver one cake a dozen times) instead.
The prev_count may be null, or may point to an area to receive the previous count.
[[ verify that prev_count works even on error ]] There is no function to get the current
value of a semaphore. However, you can obtain the value by calling ReleaseSemaphore
with a release_count so large that it will always fail. The counter will be unchanged,
and *prev_count will indicate the old (and still current5) value.
 Using Semaphores in C++
Event
An event synchronization object models "events" that you can define. That is, the
event's state is explicitly controlled. One thread can wait for an event, where that event
represents anything you want (e.g. "buffer is not full"). Meanwhile, another thread sets
5 Current as of the failed ReleaseSemaphore call, that is. The value might change between the time
ReleaseSemaphore is called and the time that *prev_count is checked by the caller.
and clears the event to implement the event's meaning (e.g. set it when items are
removed from the buffer, clear it when inserting an item which reaches capacity).
 Event Semantics
Events come in two flavors: manual-reset and auto-reset. The difference is that an
auto-reset event is automatically reset to not-signaled when a thread waiting on the
event continues. A manual-reset event, on the other hand, is only reset when you tell it
to.
A noticeable difference in function between the two concerns multiple threads waiting
on the same event. When you signal an auto-reset event, only one thread continues and
the event is reset. When you signal a manual-reset event, all waiting threads proceed,
and the event remains set to signaled.
When there are no waiting threads, and the event is signaled, any number of threads
may proceed on a manual-reset event. Any thread that tries to wait on it will see it as
signaled, and not block. The event remains set. On an auto-reset event, as soon as one
thread waits (and continues right away), the event is reset and subsequent threads will
block if they attempt to wait on the event.
 Win32 Event API Summary
An event is created using CreateEvent or OpenEvent, and destroyed using CloseHandle.
A thread waits for an event to be in the signaled state by using any of the Wait
functions, described on page 65. It can be set or reset by using the SetEvent,
ResetEvent, and PulseEvent functions.
The CreateEvent function
HANDLE CreateEvent (
    SECURITY_ATTRIBUTES*,
    bool manual_reset,
    bool initial_state,
    const tchar* name );
This returns a handle to a new or existing event, as explained on page 42. On error, it
returns null and GetLastError provides more information.
The manual_reset parameter, if true, indicates that a manual-reset event is to be
created. If false, then an auto-reset event is created.
The initial_state parameter indicates the initial state of the event: signaled (or set) if
true, not signaled (cleared, or reset) if false.
The SetEvent function
BOOL SetEvent (HANDLE event);
This sets the state of the event to signaled. This can make waiting threads stop waiting.
For a manual-reset event, all waiting threads are released. Any threads that subsequently
wait on the event (until it is reset by a call to ResetEvent or PulseEvent) also proceed
without blocking.
For an auto-reset event, one waiting thread is released and the others keep waiting, and
the event stays non-signaled (reset, or cleared). If there are no waiting threads, then the
event is set to signaled and a thread which subsequently waits on the event will set the
event back to not-signaled and not block. That is, one thread goes, regardless of
whether there are already waiting threads.
The event only has two states: set (signaled) and reset (non-signaled). It does not
count or otherwise save up signals. If a call to SetEvent has no immediate effect
(changing the state to signaled and/or releasing a waiting thread), then it is essentially
lost. For example, if an event is signaled and has nothing waiting, then another call to
SetEvent has no effect, as the event is still signaled.
The ResetEvent function
BOOL ResetEvent (HANDLE event);
This sets the state of the event to non-signaled. If the event was already non-signaled,
there is no effect.
The PulseEvent function
BOOL PulseEvent (HANDLE event);
This is like a SetEvent immediately followed by a ResetEvent as one atomic operation.
For auto-reset events, one waiting thread is released. This can be understood as
follows: the releasing of the first waiting thread also causes the event to reset, as is the
fundamental characteristic of auto-reset events. The ResetEvent built into the
PulseEvent is then redundant.
For manual-reset events, all waiting threads are released and the event is left reset.
 Using Events in C++
Timer
The kernel timer object is called a "Waitable Timer" in the API, so as not to confuse it
with older features in Windows. CreateWaitableTimer, new in NT4, creates a kernel
synchronization object, while SetTimer dates back to 16-bit Windows and works
differently. In this discussion, a timer object is assumed to refer to the kernel
synchronization object.
A timer, when used as a synchronization object, is signaled when the set time is reached.
A timer can also perform callbacks. In this respect it's like overlapped I/O (see page
69), in that you have your choice of waiting for a result or being called back with an
asynchronous procedure call.
 Timer Semantics
Timers can have their behavior tuned in a few different respects:
•  wait to be signaled, or be called back
•  manual-reset timer or synchronization timer
•  periodic timer or one-shot timer
A timer object is signaled when the time expires. You don't have to wait on the timer
object (you could specify a callback function in SetWaitableTimer instead), but it is still
signaled. How long it stays that way can be specified.
A manual-reset timer stays signaled until SetWaitableTimer is called again. It's kind of
silly to define a manual-reset periodic timer. A synchronization timer is like an
auto-reset event: when a thread completes a wait on this object, it is reset to non-signaled.
A periodic timer is signaled (and any callback performed) repeatedly using some
specified interval. A one-shot timer expires once, and then doesn't do anything else
until SetWaitableTimer is called again. For example, setting "noon on Tuesday" is a
one-shot. But "noon on every Tuesday" is periodic, with a period of one week.
 Win32 Timer API Summary
A timer is created using CreateWaitableTimer or OpenWaitableTimer, and destroyed
using CloseHandle.
A thread waits for the timer to be in the signaled state by using any of the Wait
functions, described on page 65. Alternatively, a callback function can be specified.
The timer is programmed to signal (and call back) at a specified time with
SetWaitableTimer, and an active timer can be deactivated using CancelWaitableTimer.
The CreateWaitableTimer function
HANDLE CreateWaitableTimer (
    SECURITY_ATTRIBUTES*,
    bool manual_reset,
    const tchar* name );
This returns a handle to a new or existing timer, as explained on page 42. On error, it
returns null and GetLastError provides more information.
The manual_reset parameter, if true, indicates that a manual-reset timer is to be
created. If false, then a synchronization timer is created.
The SetWaitableTimer function
bool SetWaitableTimer (
    HANDLE timer,
    const LARGE_INTEGER* due_time,
    long period, //zero for one-shot
    TIMERAPCROUTINE* callback_function,
    void* callback_argument, //passed to callback_function
    bool power_resume );
typedef void __stdcall TIMERAPCROUTINE (
    void* callback_argument,
    unsigned long time_low,
    unsigned long time_high );
Is that enough parameters for you? Improving upon this function is covered under
Using Timer in C++ on page 58.
The first parameter, timer, is the handle of the timer object you are setting.
The second parameter, due_time, specifies when the timer will expire. However, that's
easier said than done. Just what is a LARGE_INTEGER anyhow, and how is that used to
specify a time?
A LARGE_INTEGER is a union holding a LONGLONG member (QuadPart) alongside a
struct of two 32-bit halves. A LONGLONG is 8 bytes. It's typedef'ed as an __int64 if
64-bit integers are supported by the compiler, or as a double otherwise. Clearly, this is
meant to be an opaque type just defined to take up 8 bytes and force proper alignment,
since assigning a number to it could give surprising results if it turns out to be a double.
The value of these 64 bits is documented to be that of the FILETIME structure. A
FILETIME structure is two unsigned long integers holding the low and high 32 bits of a
single 64-bit number. There are API functions available for producing values of type
FILETIME. For example, SystemTimeToFileTime:
const SYSTEMTIME systime= {
    1997, 10, -1/*ignored*/, 27, //October 27th 1997
    13, 10, 30, // 1:10:30 PM
    500 // milliseconds. Make that 13:10:30.5
};
union {
    FILETIME ftime;
    LARGE_INTEGER due_time;
};
bool OK= SystemTimeToFileTime (&systime, &ftime);
if (!OK) throw "something is wrong";
// finally!
OK= SetWaitableTimer (timer, &due_time, …
The union is used rather than casting &ftime into a LARGE_INTEGER* because of the
alignment restrictions. The LARGE_INTEGER will be aligned on an 8-byte address, while
the FILETIME structure only demands 4-byte alignment. This doesn't matter on the x86
architecture (you get a speed penalty but not an error), but on other CPUs it could be a
fatal error.
The due_time can also be a relative time, indicated by using a negative number. For
example: regardless of what time it is now, set the timer to go off in one hour, two
minutes, and 3.13159 seconds6.
To specify a relative time, express it in terms of 100-nanosecond (tenth-of-a-microsecond)
intervals, and multiply by -1. Note that this unit base is different from
most other places in Win32, including the next parameter to the SetWaitableTimer
function.
__int64 value= (__int64)-10000000 * (62*60) - 31315900;
OK= SetWaitableTimer (timer, reinterpret_cast<LARGE_INTEGER*>(&value), …
Note that this time I do use a cast, because the __int64 is aligned properly. If the
compiler doesn't have any native 64-bit integer type, make a class to support 64-bit
arithmetic (use the double trick, like LONGLONG, to force the right alignment).
The arithmetic to compute value begins with a cast so that all the arithmetic is done
using 64-bit values. The numeric literals are of type int, and arithmetic with ints as
input gives int as results. Since 64-bit integers are non-standard, there is no suffix to
indicate that a literal should be extra long, analogous to the L suffix that specifies a long
(as opposed to int) literal.
The third parameter to SetWaitableTimer, period, indicates a repeat length. If zero, the
timer is one-shot. Otherwise, this value (in milliseconds) is added to the previous set
time and the timer is ready to go again.
6 The actual accuracy of the waitable timer is unspecified. On the Intel platform, NT4 appears to keep
track of time in hundredth-of-a-second intervals. However, that doesn't mean that the kernel triggers the
timer on any arbitrary tick. Furthermore, just because a timer is signaled doesn't mean that the waiting
thread will get scheduled immediately.
For example, to set a timer to go off the same time as before, but then go off again
every five minutes and twenty seconds, the parameters would be:
__int64 start_time= (__int64)-10000000 * (62*60) - 31315900;
long period= (5*60+20)*1000;
OK= SetWaitableTimer (timer, reinterpret_cast<LARGE_INTEGER*>(&start_time), period, …
Suppose you were just interested in the period, not in any specific start time. That is, go
off the first time one period length from now. In this case, use the period itself as a
relative starting time. Remember to convert milliseconds to decimicroseconds,
changing the sign and being careful not to overflow 32-bit arithmetic.
bool SetPeriodic (HANDLE timer, long period)
{
    const __int64 conversion_factor= -10000;
    union {
        __int64 start_time_value;
        LARGE_INTEGER due_time;
    };
    start_time_value= conversion_factor * period;
    return SetWaitableTimer (timer, &due_time, period, 0,0,false);
}
The callback_function, if not null, is queued as an APC (see page 65) when the timer
goes off. The callback_argument, along with the current time, is passed to the callback
function. If the callback_function is null, callback_argument is ignored.
The last argument, power_resume, if true, will cause the computer to resume from
power-conservation mode when the timer goes off. If the computer doesn't support
such a thing, the timer is set normally (assuming nothing else is wrong with the
arguments) and GetLastError returns ERROR_NOT_SUPPORTED.
The CancelWaitableTimer function
bool CancelWaitableTimer (HANDLE timer);
This deactivates a timer. It does not change the signaled state of the timer object.
 Using Timer in C++
Clean up the parameters and the 64-bit time values!
Change Notification
A timer can be thought of as a pre-defined "event". That is, it's simply a flag that's
signaled when something interesting happens, and the "it happened" code is also
available as part of the operating system. So it is with change notifications. You can
use a change notification object to wait for something to happen in a directory, or to a
printer.
 Semantics of File Change Notification Objects
A file change notification object is signaled when something changes in a specified
directory. You can set it up to watch a single directory or a whole directory tree rooted
at a specified directory.
The things that can be monitored for changes are:
•  Any filename change, including adding, removing, or renaming files.
•  Any directory name change. Same as filename changes, only concerning names
   of subdirectories rather than ordinary files.
•  Change in attributes, such as the read-only attribute, archive attribute, etc.
•  The size of a file changed.
•  Change to the last-write time of a file.
•  Security attributes changed.
Suppose you write a viewer program, and include a screen to display all files in a
directory as thumbnails. You can set up a change notification object to know when
something changes in that directory, so the display won't be out of date. Don't you hate
it when a program doesn't automatically detect changes to files?
A change notification object applies to a whole directory, so it's easy to watch that
directory for changes. However, when a change occurs, all you know is that a change
occurred. The object becomes signaled. No other information is present! This is fine if
all you plan to do on such a notification is refresh the whole display. But if you
wanted to know which file or files changed, you would have to scan the directory
information and compare it to what you had before.
[[[ find my experiments with change notifications, give details here ]]]
 An alternative: ReadDirectoryChangesW
 Semantics of Printer Change Notification Objects
 Change Object API Summary
[[[ Quick summary goes here ]]]
The FindFirstChangeNotification function
HANDLE FindFirstChangeNotification (
    const tchar* directory_name,
    bool subtree,
    ulong filter );
The first parameter is the directory to watch for changes. If subtree is true, this is the
root of the subtree to watch, as all subdirectories are included. If subtree is false, then
only directory_name itself is watched.
The filter is made of one or more of the following bits:
FILE_NOTIFY_CHANGE_FILE_NAME     1
FILE_NOTIFY_CHANGE_DIR_NAME      2
FILE_NOTIFY_CHANGE_ATTRIBUTES    4
FILE_NOTIFY_CHANGE_SIZE          8
FILE_NOTIFY_CHANGE_LAST_WRITE    0x10
FILE_NOTIFY_CHANGE_LAST_ACCESS   0x20    not documented
FILE_NOTIFY_CHANGE_CREATION      0x40    not documented
FILE_NOTIFY_CHANGE_SECURITY      0x100
The LAST_ACCESS and CREATION values are not documented in the WIN32 API
documentation, but can be found in WINNT.H with the other values.
[[[ give details on what all the bits do, with empirical testing ]]]
The function returns INVALID_HANDLE_VALUE if it fails, or a HANDLE to a change object
which becomes signaled when the indicated change takes place. To reset the object after
waiting completes, use FindNextChangeNotification.
The FindFirstPrinterChangeNotification function
HANDLE FindFirstPrinterChangeNotification (
    HANDLE printer, //from call to OpenPrinter
    ulong filter, //what to watch for
    ulong reserved, //must be zero
    void* options );
The first parameter specifies the printer (or print server) to watch for changes.
The second parameter, filter, is a set of bit flags which can be combined to watch for
multiple things. The definitions are arranged hierarchically, meaning that categories are
defined using all the bits from individual items. The chart below reflects this
organization. Naturally, what counts is the actual bits that get OR'ed together, and you
don't have to use (or limit yourself to) these category symbols.
PRINTER_CHANGE_ALL                                0x7777FFFF
  PRINTER_CHANGE_PRINTER                          0x000000FF
    PRINTER_CHANGE_ADD_PRINTER                    0x00000001
    PRINTER_CHANGE_SET_PRINTER                    0x00000002
    PRINTER_CHANGE_DELETE_PRINTER                 0x00000004
    PRINTER_CHANGE_FAILED_CONNECTION_PRINTER      0x00000008
  PRINTER_CHANGE_JOB                              0x0000FF00
    PRINTER_CHANGE_ADD_JOB                        0x00000100
    PRINTER_CHANGE_SET_JOB                        0x00000200
    PRINTER_CHANGE_DELETE_JOB                     0x00000400
    PRINTER_CHANGE_WRITE_JOB                      0x00000800
  PRINTER_CHANGE_FORM                             0x00070000
    PRINTER_CHANGE_ADD_FORM                       0x00010000
    PRINTER_CHANGE_SET_FORM                       0x00020000
    PRINTER_CHANGE_DELETE_FORM                    0x00040000
  PRINTER_CHANGE_PORT                             0x00700000
    PRINTER_CHANGE_ADD_PORT                       0x00100000
    PRINTER_CHANGE_CONFIGURE_PORT                 0x00200000
    PRINTER_CHANGE_DELETE_PORT                    0x00400000
  PRINTER_CHANGE_PRINT_PROCESSOR                  0x07000000
    PRINTER_CHANGE_ADD_PRINT_PROCESSOR            0x01000000
    PRINTER_CHANGE_DELETE_PRINT_PROCESSOR         0x04000000
  PRINTER_CHANGE_PRINTER_DRIVER                   0x70000000
    PRINTER_CHANGE_ADD_PRINTER_DRIVER             0x10000000
    PRINTER_CHANGE_SET_PRINTER_DRIVER             0x20000000
    PRINTER_CHANGE_DELETE_PRINTER_DRIVER          0x40000000
PRINTER_CHANGE_TIMEOUT                            0x80000000
The options pointer is declared as void* in the header, but should really be a
PRINTER_NOTIFY_OPTIONS*. This structure is declared in WINSPOOL.H. It is
basically an encapsulated array of PRINTER_NOTIFY_OPTIONS_TYPE structures, each of
which describes a printer field. You can get change notifications based on these fields
in addition to the bits in the filter parameter. The options parameter can be null if filter
is non-zero. Likewise, filter can be zero if the options pointer is used. They can't
both specify no notification events.
The FindNextChangeNotification function
BOOL FindNextChangeNotification (HANDLE notifier);
After a file notification object has been used, meaning that it has been signaled by some
change, this function makes it reusable. The object is reset to non-signaled, and a
subsequent or pending change will signal the object again. By pending change, I mean a
change that occurred after the wait on the previous change completed. The object
doesn't function properly if you issue FindNextChangeNotification without first
waiting for a change (from the FindFirstChangeNotification or the previous
FindNextChangeNotification). [[[ give details from empirical testing, and show an
example program ]]]
The FindNextPrinterChangeNotification function
BOOL FindNextPrinterChangeNotification (
    HANDLE notifier,
    ulong* changed, //same bits as filter in FindFirst.
    void* options,
    void** info );
This is a bit fancier than the equivalent for directories. This function not only resets the
state of the notification object (so it will watch for changes again), but also gives details
on the current change it noticed. The structures used with printer notifications are
rather complex, and concern printing more than synchronization. This is not a book
about Windows printers. For our purposes, suffice to know that you can treat it like an
event object, waiting for "something interesting" to signal the object.
The FindCloseChangeNotification and FindClosePrinterChangeNotification
functions
BOOL FindCloseChangeNotification (HANDLE notifier);
BOOL FindClosePrinterChangeNotification (HANDLE notifier);
This closes the handle to the change notification object. Unlike most kernel objects,
which are closed with plain CloseHandle, handles to notification objects are closed in a
special way. Why that is so is an intriguing question.
Other Kernel Handles
Kernel objects have a "signaled" state, which implies that the wait functions should
operate on any kind of object. In addition to the synchronization objects described
above, here are the other object types and what "signaled" means to them.
 Process and Thread
A thread object becomes signaled when the thread terminates. This makes it easy to
wait for a thread to terminate, as it uses the same Wait functions as any other kernel
object. The same applies to process objects.
 File
A kernel file object is signaled every time an I/O operation completes. It is reset every
time an I/O operation is initiated. However, this behavior is not well documented, so
don't count on knowing exactly when the object is reset. Perhaps it even varies
depending on the kind of special file (e.g. pipe, socket) that is being used (see
Console-input below).
The use of file handles for synchronization is not recommended. Although you can be
sure the object is signaled when an operation completes, what happens if there is more
than one outstanding request (perhaps on different threads) on the same kernel file
object?
For this purpose, use the event object in the OVERLAPPED structure instead. See page
69 for more information. [[[ note: change the xref to a more specific subsection after
that chapter is written ]]]
 Console-input
Console input is treated as a special file. You can obtain a handle to a console-input
object by calling CreateFile with "CONIN$" as the file name. Consoles are the TTY
terminal windows that are used by character-mode programs.
The object is signaled when there is unread input in the console's input buffer.
Other Synchronization Primitives
Win32 Critical Section
Monitors
Condition Variables (used with Monitors)
Spin Locks
Reader/Writer and Group Locks
Rendezvous
Conditional Semaphore
Waiting and Blocking
Alertable States and APC's
Windows Messages
Discuss per-thread message queues and how SendMessage et al. implement complex
blocking behavior.
A Survey of Wait Primitives
In general, the wait functions watch one or more kernel synchronization objects and
return when the object or objects are signaled. Meanwhile, a time-out can be specified.
 WaitForSingleObject and WaitForSingleObjectEx
ulong WaitForSingleObject (HANDLE object, ulong timeout);
ulong WaitForSingleObjectEx (HANDLE object, ulong timeout, bool alertable);
I prefer to simplify this by overloading the name WaitForSingleObject instead of having
two different names, and also providing a default timeout of INFINITE. See page 73
under A C++ Threading Library. [[[ replace page number with a more specific subsection
later ]]]
This works exactly like WaitForMultipleObjects given an array of one item. So, the only
return values you can expect are WAIT_OBJECT_0, WAIT_ABANDONED,
WAIT_IO_COMPLETION, or WAIT_TIMEOUT.
 WaitForMultipleObjects and WaitForMultipleObjectsEx
ulong WaitForMultipleObjects (
    ulong count,
    const HANDLE* array,
    bool wait_for_all, //wait for "any", not "all", if false
    dword timeout //in milliseconds
    );
ulong WaitForMultipleObjectsEx (
    ulong count,
    const HANDLE* array,
    bool wait_for_all, //wait for "any", not "all", if false
    dword timeout, //in milliseconds
    bool alertable
    );
MAXIMUM_WAIT_OBJECTS is 64.
 SignalObjectAndWait
 MsgWaitForMultipleObjects and
MsgWaitForMultipleObjectsEx
Thread Priorities
Communicating Between Threads and
Processes
Anonymous Pipes
Named Pipes
Mailslots
Sockets
Shared Memory
APC's
Windows Messages
Thread-Specific Data
Overlapped I/O
Don't forget to cover the CancelIo function.
future-value model
callback model
completion ports
Dynamic Link Libraries (DLL's)
Fibers
Processes
C++ issues
compiler issues
compiler and linker switches, etc.
volatile data
thread safety
catalog language constructs that are thread safe or unsafe.
A C++ Threading Library