Duke Systems
Servers and Threads
Jeff Chase
Duke University
Processes and threads
• Each process has a virtual address space (VAS): a private name space for the virtual memory it uses.
• The VAS is both a “sandbox” and a “lockbox”: it limits what the process can see/do, and protects its data from others.
• Each process has a thread bound to the VAS, with stacks (user and kernel).
• If we say a process does something, we really mean its thread does it.
• The kernel can suspend/restart the thread wherever and whenever it wants.
• From now on, we suppose that a process could have additional threads.
• We are not concerned with how to implement them, but we presume that they can all make system calls and block independently.
[Figure: process = virtual address space + main thread (with its stack) + other threads (optional).]
Threads: a familiar metaphor
1. Page links and the back button navigate a “stack” of pages in each tab.
2. Each tab has its own stack. One tab is active at any given time. You create/destroy tabs as needed. You switch between tabs at your whim.
3. Similarly, each thread has a separate stack. The OS switches between threads at its whim. One thread is active per CPU core at any given time.
Threads
• A thread is a stream of control.
– defined by CPU register context (PC, SP, …)
– Note: process “context” is thread context plus
protected registers defining current VAS, e.g.,
ASID or “page table base register(s)”.
– Generally “context” is the register values and
referenced memory state (stack, page tables)
• Multiple threads can execute independently:
– They can run in parallel on multiple CPUs...
• physical concurrency
– …or arbitrarily interleaved on a single CPU.
• logical concurrency
– Each thread must have its own stack.
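A minimal sketch of two independent threads, assuming a POSIX system with pthreads. The names `sum_to` and `run_two_threads` are hypothetical, chosen only for illustration. Each thread keeps its loop variables in locals, which live on that thread's own stack, so the two computations cannot disturb each other however the scheduler interleaves them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: sums 1..n using only local variables,
 * which live on the stack of whichever thread is running it. */
static void *sum_to(void *arg) {
    long n = (long)(intptr_t)arg;   /* private copy on this thread's stack */
    long total = 0;
    for (long i = 1; i <= n; i++)
        total += i;
    return (void *)(intptr_t)total;
}

/* Start two threads, wait for both, and report their results.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_two_threads(long n1, long n2, long *out1, long *out2) {
    pthread_t t1, t2;
    void *r1, *r2;

    assert(out1 != NULL && out2 != NULL);
    if (pthread_create(&t1, NULL, sum_to, (void *)(intptr_t)n1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, sum_to, (void *)(intptr_t)n2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    /* The two threads may run in parallel or interleaved;
     * the caller blocks here until each one finishes. */
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    *out1 = (long)(intptr_t)r1;
    *out2 = (long)(intptr_t)r2;
    return 0;
}
```

On most systems this builds with `cc -pthread`. The results do not depend on the interleaving because the threads share no mutable data.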
Two threads sharing a CPU
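[Figure: two threads sharing a CPU, concept vs. reality; in reality the threads alternate on the CPU, separated by context switches.]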
Two threads: closer look
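[Figure: one address space (from 0 to high) holding program code, library, common runtime, data, and one stack per thread. The running thread’s registers (R0…Rn, PC, SP) are loaded in the CPU core; the other thread is “on deck” and ready to run, with its register values saved.]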
Thread context switch
[Figure: thread context switch within one address space. To switch one thread out and another in, the CPU core (1) saves the outgoing thread’s registers (R0…Rn, PC, SP) and (2) loads the incoming thread’s registers. Each thread keeps its own stack.]
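The save/load steps can be tried at user level. The sketch below assumes a system that still provides the `<ucontext.h>` calls `getcontext`, `makecontext`, and `swapcontext` (glibc does; newer POSIX editions mark them obsolescent). It is not how a kernel switches threads, only an illustration of the same idea: `swapcontext` saves the current register context and loads another one. The names `co_body` and `run_switch_demo` are hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;   /* saved register contexts */
static char co_stack[64 * 1024];      /* the second context's own stack */
static int trace[4], ntrace;          /* order in which the steps ran */

static void co_body(void) {
    trace[ntrace++] = 2;              /* now running on co_stack */
    swapcontext(&co_ctx, &main_ctx);  /* 1. save our registers, 2. load main's */
    trace[ntrace++] = 4;              /* resumed exactly where we left off */
}

/* Switches back and forth twice; returns the step order as a number. */
int run_switch_demo(void) {
    ntrace = 0;
    if (getcontext(&co_ctx) != 0)
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;       /* context to resume when co_body returns */
    makecontext(&co_ctx, co_body, 0);

    trace[ntrace++] = 1;
    swapcontext(&main_ctx, &co_ctx);  /* switch out main, switch in co_body */
    trace[ntrace++] = 3;
    swapcontext(&main_ctx, &co_ctx);  /* switch in again; co_body finishes */
    assert(ntrace == 4);
    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

The two flows of control alternate, and each resumes at the point where it was switched out, because its PC and SP were saved along with the other registers.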
Thread states and transitions
• The kernel process/thread scheduler governs these transitions.
• Sleep and wakeup are internal primitives. Wakeup adds a thread to the scheduler’s ready pool: a set of threads in the ready state.
[Figure: thread states running, blocked, ready, and exited. A running thread sleeps (becomes blocked) on wait, STOP, read, write, listen, receive, etc.; wakeup moves a blocked thread to ready; exit moves a running thread to exited.]
CPU Scheduling 101
The OS scheduler makes a sequence of “moves”.
– Next move: if a CPU core is idle, pick a ready thread t from
the ready pool and dispatch it (run it).
– Scheduler’s choice is “nondeterministic”
– Scheduler’s choice determines interleaving of execution
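[Figure: Wakeup moves blocked threads into the ready pool. GetNextToRun picks a ready thread and SWITCH() dispatches it. A running thread leaves the CPU if its timer expires, or on wait/yield/terminate.]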
Event-driven programming
• Some of the goals of threads can be met by using an event-driven
programming model.
• An event-driven program executes a sequence of events. The
program consists of a set of handlers for those events.
– e.g., Unix signals
• The program executes sequentially (no concurrency). But the
interleaving of handler executions is determined by the event order.
• Pure event-driven programming can simplify management of
inherently concurrent activities.
– E.g., I/O, user interaction, children, client requests
• Some of these needs can be met using either threads or event-driven programming. But often we need both.
Event-driven programming vs. threads
• Often we can choose among event-driven or threaded structures.
• So it has been common for academics and developers to argue the
relative merits of “event-driven programming vs. threads”.
• But they are not mutually exclusive.
• In practice we need both: to get real parallelism on real systems (e.g., multicore), we need some kind of threads underneath.
• We often use event-driven programming built above threads and/or
combined with threads in a hybrid model.
• For example, each thread may be event-driven, or multiple threads
may rendezvous on a shared event queue.
• We illustrate the continuum by looking first at Android and then at
concurrency management in servers (e.g., the Apache Web server).
Android app: main event loop
• The main thread of an Android app is
called the Activity Thread.
• It receives a sequence of events and
invokes their handlers.
• Also called the “UI thread” because it
receives all User Interface events.
– screen taps, clicks, swipes, etc.
– All UI calls must be made by the UI
thread: the UI lib is not thread-safe.
– MS-Windows apps are similar.
• The UI thread must not block!
– If it blocks, then the app becomes
unresponsive to user input: bad.
Android event loop: a closer look
• The main thread delivers UI events and intents to Activity components.
• It also delivers events (broadcast intents) to Receiver components.
• Handlers defined for these components must not block.
• The handlers execute serially in event arrival order.
• Note: Service and ContentProvider components receive invocations from other apps (i.e., they are servers).
• These invocations run on different threads…more on that later.
[Figure: the main event loop takes UI clicks and intents and dispatches them to Activity and Receiver components by invoking component-defined handlers.]
Event-driven programming
• This “design pattern” is called event-driven (event-based) programming.
• In its pure form the thread never blocks, except to wait for the next event, whatever it is.
• We can think of the program as a set of handlers: the system upcalls a handler to dispatch each event.
• Note: here we are using the term “event” to refer to any notification:
– arriving input
– asynchronous I/O completion
– subscribed events
– child stop/exit, “signals”, etc.
[Figure: a stream of events; the loop dispatches events by invoking handlers (upcalls).]
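A minimal sketch of the pattern in plain C. Everything here is hypothetical: the event kinds, the handlers, and the integer "state" they update are made up for illustration. A real loop would block waiting for the next event, whereas this one walks a fixed array so that it can run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds, loosely following the list above. */
enum { EV_INPUT, EV_IO_DONE, EV_CHILD_EXIT, EV_KINDS };

struct event { int kind; int value; };
typedef void (*handler_fn)(const struct event *ev, int *state);

/* Handlers are short and never block. */
static void on_input(const struct event *ev, int *state)   { *state += ev->value; }
static void on_io_done(const struct event *ev, int *state) { *state *= ev->value; }
static void on_child(const struct event *ev, int *state)   { *state -= ev->value; }

/* The "program" is this table of handlers, indexed by event kind. */
static const handler_fn handlers[EV_KINDS] = { on_input, on_io_done, on_child };

/* Dispatch events one at a time, in arrival order; return the final state. */
int run_event_loop(const struct event *queue, size_t n) {
    int state = 0;
    for (size_t i = 0; i < n; i++) {
        const struct event *ev = &queue[i];
        assert(ev->kind >= 0 && ev->kind < EV_KINDS);
        handlers[ev->kind](ev, &state);   /* upcall the handler */
    }
    return state;
}
```

Feeding the same three events in a different order gives a different final state, which illustrates that the program is sequential but its behavior is determined by the event order.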
Android event classes: some details
• Android defines a set of classes for event-driven programming in conjunction with threads.
• A thread may have at most one Looper bound to a MessageQueue.
• Each Looper has exactly one thread and exactly one MessageQueue.
• The Looper has an interface to register Handlers.
• There may be any number of Handlers registered per Looper.
• These classes are used for the UI thread, but have other uses as well.
[Figure: Looper, MessageQueue holding Messages, Handler.]
[These Android details are provided for completeness.]
Android: adding services (simplified)
[Figure: the main/UI thread runs the main event loop, dispatching UI clicks and intents to Activity and Receiver components; a binder thread pool dispatches incoming binder messages to Service and Provider components.]
Pool of event-driven threads
• Android Binder receives a sequence of events (intents) in each
process.
• They include incoming intents on provider and service
components.
• Handlers for these intents may block. Therefore the app lib uses a
pool of threads to invoke the Handlers for these incoming events.
• Many Android apps don’t have these kinds of components: those
apps can use a simple event-driven programming model and don’t
need to know about threads at all.
• But apps having these component types use a different design
pattern: pool of event-driven threads.
• This pattern is also common in multi-threaded servers, which poll
socket descriptors listening for new requests. Let’s take a look.
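The pattern can be sketched with pthreads: several worker threads rendezvous on one shared event queue, and each runs handlers for the events it takes. This is an illustration only, not Android's Binder implementation. The names (`post`, `worker`, `run_pool`), the fixed-size linear queue, and the squaring "handler" are all assumptions.

```c
#include <assert.h>
#include <pthread.h>

#define QCAP 64        /* the demo posts at most QCAP events per run */
#define NWORKERS 4

static int queue[QCAP];
static int head, tail, closed;
static long total;     /* shared result updated by the handlers */
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

static void post(int ev) {
    pthread_mutex_lock(&mu);
    assert(tail < QCAP);
    queue[tail++] = ev;
    pthread_cond_signal(&nonempty);              /* wake one idle worker */
    pthread_mutex_unlock(&mu);
}

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&mu);
        while (head == tail && !closed)
            pthread_cond_wait(&nonempty, &mu);   /* idle until an event arrives */
        if (head == tail) {                      /* closed and fully drained */
            pthread_mutex_unlock(&mu);
            return NULL;
        }
        int ev = queue[head++];                  /* each event is taken once */
        pthread_mutex_unlock(&mu);

        long result = (long)ev * ev;   /* "handler": runs unlocked, may block */

        pthread_mutex_lock(&mu);
        total += result;
        pthread_mutex_unlock(&mu);
    }
}

/* Post events 1..nevents to the pool; return the sum of the handler results. */
long run_pool(int nevents) {
    pthread_t tid[NWORKERS];
    assert(nevents >= 0 && nevents <= QCAP);
    head = tail = closed = 0;
    total = 0;
    for (int i = 0; i < NWORKERS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int ev = 1; ev <= nevents; ev++)
        post(ev);
    pthread_mutex_lock(&mu);
    closed = 1;                                  /* no more events will arrive */
    pthread_cond_broadcast(&nonempty);
    pthread_mutex_unlock(&mu);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    return total;
}
```

Because a handler runs outside the lock, one handler that blocks does not stop the other workers from taking events, which is the reason for using a pool rather than a single event-driven thread.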
Multi-threaded RPC server
[OpenGroup, late 1980s]
Ideal event poll API
Poll()
1. Delivers: returns exactly one event (message or
notification), in its entirety, ready for service (dispatch).
2. Idles: Blocks iff there is no event ready for dispatch.
3. Consumes: returns each posted event at most once.
4. Combines: any of many kinds of events (a poll set) may
be returned through a single call to poll.
5. Synchronizes: may be shared by multiple processes or
threads (handlers are thread-safe as well).
A look ahead
• Various systems use various combinations of
threaded/blocking and event-driven models.
• Unix made some choices, and then more choices.
• These choices failed for networked servers, which
require effective concurrent handling of requests.
• They failed because they violate each of the five
properties for “ideal” event handling.
• There is a large body of work addressing the
resulting problems. Servers mostly work now.
– More about server performance and Unix/Linux later.
• The Android Binder model is closer to the ideal.
Classic Unix
• Single-threaded processes
• Blocking system calls
– Synchronous I/O: calling process blocks until each I/O
request is “complete”.
• Each blocking call waits for only a single kind of
event on a single object.
– Process or file descriptor (e.g., file or socket)
• Add signals when that model does not work.
• With sockets: add select system call to monitor I/O
on sets of sockets or other file descriptors.
– select was slow for large poll sets. Now we have several
variants: poll, epoll (with its edge-triggered EPOLLET mode), kqueue. None is ideal.
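As a small illustration of waiting on a set of descriptors, the sketch below uses the POSIX `poll` call on two pipes. The helper names `wait_for_either` and `poll_demo` are hypothetical, and pipes stand in for sockets so that the example needs no network.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Block until fd0 or fd1 is readable. Returns the index (0 or 1) of a
 * readable descriptor, or -1 on error or timeout. */
int wait_for_either(int fd0, int fd1, int timeout_ms) {
    assert(fd0 >= 0 && fd1 >= 0);
    struct pollfd fds[2] = {
        { .fd = fd0, .events = POLLIN },
        { .fd = fd1, .events = POLLIN },
    };
    int n = poll(fds, 2, timeout_ms);   /* one call covers the whole set */
    if (n <= 0)
        return -1;
    for (int i = 0; i < 2; i++)
        if (fds[i].revents & POLLIN)
            return i;
    return -1;
}

/* Demo: create two pipes, write to the second only, and ask which is ready. */
int poll_demo(void) {
    int a[2], b[2];
    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    ssize_t w = write(b[1], "x", 1);
    int ready = (w == 1) ? wait_for_either(a[0], b[0], 1000) : -1;
    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return ready;
}
```

Note that `poll` only reports which descriptor is ready. The caller must still issue a separate read to obtain the data, so it does not deliver a complete event the way the ideal API's first property asks.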
Inside your Web server
[Figure: the server application (Apache, Tomcat/Java, etc.) sits above kernel queues: accept queue, listen queue, packet queues, disk queue.]
Server operations
create socket(s)
bind to port number(s)
listen to advertise port
wait for client to arrive on port
(select/poll/epoll of ports)
accept client connection
read or recv request
write or send response
close client socket
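The first three operations (create, bind, listen) might look like the following with the BSD sockets API. This is a sketch under assumptions: IPv4 TCP only, all local interfaces, and a backlog of 16 chosen arbitrarily. The helper names `make_listener` and `listener_port` are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, bind it to the given port on all local interfaces,
 * and start listening. Port 0 asks the kernel to pick a free port.
 * Returns the listening socket, or -1 on error. */
int make_listener(unsigned short port) {
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    int yes = 1;   /* allow quick restart on the same port */
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(sock, 16) != 0) {
        close(sock);
        return -1;
    }
    return sock;   /* ready to be passed to accept() */
}

/* Report the port the listener is bound to, or 0 on error. */
unsigned short listener_port(int sock) {
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    assert(sock >= 0);
    if (getsockname(sock, (struct sockaddr *)&addr, &len) != 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

The returned descriptor plays the role of the listening socket that the remaining operations (accept, read or recv, write or send, close) work from.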
Accept loop
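/* Fragment: assumes sock is a listening socket and handle() is defined elsewhere. */
while (1) {
  int acceptsock = accept(sock, NULL, NULL);
  if (acceptsock < 0)
    continue;  /* accept failed (e.g., interrupted): try again */
  char *input = malloc(1024);
  if (input == NULL) {
    close(acceptsock);
    continue;
  }
  ssize_t n = recv(acceptsock, input, 1023, 0);  /* leave room for '\0' */
  if (n > 0) {
    input[n] = '\0';  /* recv does not NUL-terminate */
    int is_html = 0;
    char *contents = handle(input, &is_html);
    …send response…
  }
  free(input);
  close(acceptsock);
}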
If a server is listening on only one
port/socket (“listener”), then it can
skip the select/poll/epoll.
Handling a request
[Figure: request-handling steps: Accept Client Connection → Read HTTP Request Header → Find File → Send HTTP Response Header → Read File, Send Data. Some steps may block waiting on the network; others may block waiting on disk I/O.]
Want to be able to process requests concurrently.
Web server (serial process)
• Option 1: could handle requests serially
• Easy to program, but painfully slow (why?)
[Figure: timeline for Client 1, Client 2, and the web server (WS): R1 arrives, Receive R1, Disk request 1a, R2 arrives, 1a completes, R1 completes, Receive R2.]
Web server (event-driven)
• Option 2: use asynchronous I/O
• Fast, but hard to program (why?)
[Figure: timeline for Client 1, Client 2, the web server (WS), and the disk: R1 arrives, Receive R1, Disk request 1a (disk: Start 1a), R2 arrives, Receive R2, 1a completes (disk: Finish 1a), R1 completes.]
Web server (multi-process)
• Option 3: assign one thread per request
• Where is each request’s state stored?
[Figure: timeline for Client 1, Client 2, and two server threads (WS1, WS2): R1 arrives, Receive R1, Disk request 1a, R2 arrives, Receive R2, 1a completes, R1 completes.]
Concurrency and pipelining
[Figure: timelines of CPU, DISK, and NET activity, before and after.]