Slides on cross-domain call and Remote Procedure Call (RPC)

This classic paper is a good example of a microbenchmarking study. It also explains the RPC abstraction and serves as a case study of the nuts and bolts of I/O and related performance issues. Or is it “just hacking”?
Request/reply messaging
The client sends a request to the server; the server computes and returns a reply.
Messaging: examples and variations
• Details vary!
– Supercomputing: MPI over a fast interconnect
– High-level messages (e.g., HTTP) over sockets and network communication
– Microkernel / Mach / MacOS: high-speed local cross-domain messaging ports (also Windows NT)
– Android: binder, and per-thread message queues
• Common abstraction: “Remote Procedure Call”
– RPC for clients/servers talking over a network.
– For local processes it is often called cross-domain call or “Local Procedure Call” (LPC, in Windows).
Example layering: the Network File System (NFS) runs over Remote Procedure Call (RPC), which uses External Data Representation (XDR) to encode data. [figure: ucla.edu]
Cross-domain call: the basics
A: syscall to post a message to B (e.g., to a message queue), then wait for the reply.
B: syscall to receive an incoming message; wait for a request.
Request: block A, wakeup B. Reply: block B, wakeup A.
Cross-domain call: the basics
Copy the data from A to B, or use a shared memory region.
A: syscall to post a message to B (e.g., to a message queue), then wait for the reply.
B: syscall to receive an incoming message; wait for a request.
Transfer control through the kernel: block A, wakeup B.
Note: could use a socket, or fast IPC for processes on the same host.
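To make the syscall sequence concrete, here is a minimal C sketch of the request/reply pattern above, using POSIX message queues as the kernel messaging primitive. The queue names, message format, and the assumption that the queues already exist (created elsewhere with O_CREAT) are illustrative choices, not part of the slide.

/* Sketch of a cross-domain request/reply over POSIX message queues.
 * Queue names and message format are invented; error handling omitted.
 * Build with: cc example.c -lrt
 */
#include <fcntl.h>
#include <mqueue.h>
#include <string.h>

#define REQ_Q "/demo_request"   /* A -> B (assumed to exist already) */
#define REP_Q "/demo_reply"     /* B -> A (assumed to exist already) */

/* Caller A: post a request, then block waiting for the reply. */
void call_B(const char *request, char *reply, size_t replymax)
{
    mqd_t req = mq_open(REQ_Q, O_WRONLY);
    mqd_t rep = mq_open(REP_Q, O_RDONLY);

    mq_send(req, request, strlen(request) + 1, 0);  /* wakeup B */
    mq_receive(rep, reply, replymax, NULL);         /* block A until the reply */

    mq_close(req);
    mq_close(rep);
}

/* Server B: block waiting for a request, compute, post the reply. */
void serve_one(void)
{
    char buf[8192];             /* must be at least the queue's mq_msgsize */
    mqd_t req = mq_open(REQ_Q, O_RDONLY);
    mqd_t rep = mq_open(REP_Q, O_WRONLY);

    mq_receive(req, buf, sizeof(buf), NULL);        /* block B until a request */
    /* ... compute a result from buf ... */
    mq_send(rep, "done", 5, 0);                     /* wakeup A */

    mq_close(req);
    mq_close(rep);
}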
“Marshalling” (“serializing”)
What if the data is a complex linked structure? Then the sender must “pack” it as a sequence of bytes into a message, and the receiver must reconstitute it on the other side.
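A small sketch of what this packing can look like for a linked list of integers. The wire format here (a count followed by the values) is invented for illustration; real RPC systems use a standard representation such as XDR and also worry about byte order.

/* Sketch: marshal a linked list of integers into a flat byte buffer,
 * and reconstitute it on the other side. Pointers are meaningless in
 * the other address space, so only the values travel.
 */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct node { int32_t value; struct node *next; };

/* Pack: walk the list and copy each value into a contiguous buffer.
 * buf must hold at least 4 + 4*count bytes. Returns bytes to send. */
size_t marshal(const struct node *head, uint8_t *buf)
{
    uint32_t count = 0;
    uint8_t *p = buf + sizeof(count);        /* leave room for the count */
    for (const struct node *n = head; n != NULL; n = n->next) {
        memcpy(p, &n->value, sizeof(n->value));
        p += sizeof(n->value);
        count++;
    }
    memcpy(buf, &count, sizeof(count));      /* prepend the element count */
    return (size_t)(p - buf);
}

/* Unpack: rebuild an equivalent list from the received bytes. */
struct node *unmarshal(const uint8_t *buf)
{
    uint32_t count;
    memcpy(&count, buf, sizeof(count));
    const uint8_t *p = buf + sizeof(count);

    struct node *head = NULL, **tail = &head;
    for (uint32_t i = 0; i < count; i++) {
        struct node *n = malloc(sizeof(*n));
        memcpy(&n->value, p, sizeof(n->value));
        p += sizeof(n->value);
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }
    return head;
}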
Concept: RPC
Remote Procedure Call (RPC) is request/response interaction through a published API, using IPC messaging to cross an inter-process boundary.
API stubs generated from an Interface Description Language (IDL).
Establishing an RPC connection to a named remote interface is often called binding.
RPC is used in many standard Internet services. It is also the basis for component frameworks like DCOM, CORBA, and Android. Software is packaged into named “objects” or components. Components may publish interfaces and/or invoke published interfaces of other components. Components may execute in different processes and/or on different nodes.
The classic picture
Implementing RPC
Birrell/Nelson 1984
RPC Execution
• In general, RPC enables request/response exchanges (e.g., by messaging over a network) that “look like” a local procedure call.
• In Android, RPC allows flexible interaction among apps running in different processes, across the kernel boundary.
• How is this different from a local procedure call?
• How is it different from a system call?
RPC: Language integration
Stubs link with the client/server code to “hide” the boundary crossing.
– They “marshal” args/results, i.e., translate to/from some standard network stream format (also known as linearize, serialize, or “flatten”).
– They propagate PL-level exceptions.
– Stubs are auto-generated from an Interface Description Language (IDL) file by a stub compiler tool at software build time, and linked in.
– Client and server must agree on the protocol signatures in the IDL file.
Marshalling: a metaphor
Android Architecture and Binder
Dhinakaran Pandiyan
Saketh Paranjape
Stubs
• RPC stubs are procedures linked into the client and server.
– RPC stubs are similar to system call stubs, but they do more than just trap to the kernel.
– The RPC stubs construct/deconstruct a message transmitted through a messaging system.
– Binder is an example of such a messaging system, implemented as a Linux kernel plug-in module (a driver) and some user-space libraries.
• The stubs are generated by a tool that takes a description of the application’s RPC API written in an Interface Description Language.
– Looks like any interface definition…
– List of method names and argument/result types and signatures.
– Stub code marshals arguments into a request message, marshals results into a reply message.
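To make the stub idea concrete, here is a hedged sketch of what a hand-written client stub for a hypothetical add(a, b) RPC might look like. The transport calls send_request()/recv_reply(), the procedure number, and the message layout are all assumptions standing in for whatever messaging system carries the bytes; a stub compiler would generate equivalent code from the IDL file.

/* Sketch of a client stub for a hypothetical RPC:  int32 add(int32, int32)
 * Transport functions and message layout are invented for illustration.
 */
#include <stdint.h>
#include <string.h>

#define ADD_PROC 1                       /* procedure number from the IDL (assumed) */

/* Assumed transport provided by the RPC runtime. */
void send_request(const void *msg, size_t len);
void recv_reply(void *msg, size_t maxlen);

int32_t add(int32_t a, int32_t b)        /* looks like a local procedure call */
{
    uint8_t req[12], rep[4];
    uint32_t proc = ADD_PROC;

    /* Marshal: procedure number, then the two arguments. */
    memcpy(req + 0, &proc, sizeof proc);
    memcpy(req + 4, &a, sizeof a);
    memcpy(req + 8, &b, sizeof b);

    send_request(req, sizeof req);       /* cross the process/network boundary */
    recv_reply(rep, sizeof rep);         /* block until the reply arrives */

    /* Unmarshal the result from the reply message. */
    int32_t result;
    memcpy(&result, rep, sizeof result);
    return result;
}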
Stubs and IDL
This picture illustrates the stub generation and build process for an RPC system based on the C language (e.g., ONC or Sun RPC, used in NFS).
Another picture of RPC
Implementing RPC
Birrell/Nelson 1984
Threads and RPC
Q: How do we manage these “call threads”?
A: Create them as needed, and keep idle threads in a thread pool.
When an RPC call arrives, wake up an idle thread from the pool to handle it.
On the client, the client thread blocks until the server thread returns a response.
[OpenGroup, late 1980s]
Thread pool: idealized
Magic elastic worker pool: resize the worker pool to match the incoming request load, creating/destroying workers as needed. (Workers are threads.)
Incoming requests (events) arrive on a queue; a dispatcher hands each event to an idle worker waiting in the pool.
Each worker runs a loop: wait for the next request dispatch, run the handler to handle one event (blocking as necessary), and when the handler is complete, return to the worker pool.
Event/request queue
We can synchronize an event queue with a monitor: a mutex/CV pair. Protect the event queue data structure itself with the mutex.
Workers wait on the CV for the next event if the event queue is empty; signal the CV when a new event arrives. This is a producer/consumer problem.
As before, each worker loops: dispatch the next incoming event from the queue, run the handler to handle one event (blocking as necessary), and when the handler is complete, return to the worker pool.
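A minimal sketch of this pattern with pthreads: a mutex protects a simple linked event queue, workers wait on a condition variable while the queue is empty, and the producer signals when a new event arrives. The event struct and handle_event() are placeholders.

/* Producer/consumer event queue guarded by a mutex/CV pair (a monitor). */
#include <pthread.h>
#include <stdlib.h>

struct event { struct event *next; /* ... request data ... */ };

static struct event *queue_head, *queue_tail;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

void handle_event(struct event *e);    /* handles one event, may block */

/* Producer: called when a new request arrives. */
void enqueue(struct event *e)
{
    pthread_mutex_lock(&lock);
    e->next = NULL;
    if (queue_tail) queue_tail->next = e; else queue_head = e;
    queue_tail = e;
    pthread_cond_signal(&nonempty);    /* wake one idle worker */
    pthread_mutex_unlock(&lock);
}

/* Consumer: each worker thread in the pool runs this loop forever. */
void *worker_loop(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (queue_head == NULL)                  /* wait while the queue is empty */
            pthread_cond_wait(&nonempty, &lock);
        struct event *e = queue_head;               /* dispatch the next event */
        queue_head = e->next;
        if (queue_head == NULL) queue_tail = NULL;
        pthread_mutex_unlock(&lock);

        handle_event(e);    /* handle one event, blocking as necessary */
        free(e);            /* then return to the pool for the next one */
    }
    return NULL;
}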
Some details
• How is incoming data delivered to the correct process?
• On the return, how does the Receiver know which thread
to wake up?
• How does the wakeup happen?
• What if a request/reply is dropped in the net?
• What if a request/reply is duplicated?
• How does the client find the server? (binding)
• What if the server fails?
• How to go faster if client/server are on the same host?
(“LRPC” or “LPC”)
Firefly vs. Web/HTTP etc.
• Firefly does not use TCP/IP.
• Instead, it has a custom packet protocol. Tradeoffs?
• But some of the basics of network communication are
similar/identical.
• How is (say) HTTP different from RPC?
Networked services: big picture
A client host runs client applications above the kernel's network software, which drives the NIC device. The client reaches server hosts (running server applications) across the Internet “cloud”.
Data is sent on the network as messages called packets.
A simple, familiar example
The client (initiator) sends the request “GET /images/fish.gif HTTP/1.1”; the server sends back a reply.
client (initiator):
  sd = socket(…);
  connect(sd, name);
  write(sd, request…);
  read(sd, reply…);
  close(sd);
server:
  s = socket(…);
  bind(s, name);
  listen(s, backlog);
  sd = accept(s);
  read(sd, request…);
  write(sd, reply…);
  close(sd);
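For reference, a slightly more complete C sketch of the server side, filling in the address setup and the listen() call that the outline above elides. Error handling is omitted, and the port number and reply string are placeholders.

/* Sketch of a one-shot TCP server for the exchange above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

void serve_one_request(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in name = {0};
    name.sin_family = AF_INET;
    name.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    name.sin_port = htons(80);                  /* HTTP port (placeholder) */
    bind(s, (struct sockaddr *)&name, sizeof(name));
    listen(s, 8);                               /* allow a backlog of pending connects */

    int sd = accept(s, NULL, NULL);             /* block until a client connects */

    char request[4096];
    read(sd, request, sizeof(request));         /* e.g., "GET /images/fish.gif HTTP/1.1" */

    const char *reply = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n";
    write(sd, reply, strlen(reply));            /* send the reply */

    close(sd);
    close(s);
}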
End-to-end data transfer
Sender: move data from the application into system buffer queues (mbufs, skbufs); the TCP/IP protocol computes a checksum; packets move through packet queues to the network driver, which uses DMA + interrupts to transmit each packet to the network interface.
Receiver: the network driver uses DMA + interrupts to deposit each packet in host memory; the TCP/IP protocol compares the checksum; data moves from packet queues into system buffer queues, and from the system buffer to the application.
Ports and packet demultiplexing
Data is sent on the network in messages called packets, addressed to a destination node and port. The kernel network stack demultiplexes incoming network traffic: it chooses the process/socket to receive each packet based on its destination port.
Incoming network packets arrive at the network adapter hardware (aka network interface controller, “NIC”) and are delivered up to apps with open sockets.
Wakeup from interrupt handler
A thread sleeps on a sleep queue and a wakeup moves it to the ready queue; traps, faults, interrupts, and switches drive the transitions between user mode and the kernel.
Example 1: NIC interrupt wakes thread to receive incoming packets.
Example 2: disk interrupt wakes thread when disk I/O completes.
Example 3: clock interrupt wakes thread after N ms have elapsed.
Note: it isn’t actually the interrupt itself that wakes the thread, but the interrupt handler (software). The awakened thread must have registered for the wakeup before sleeping (e.g., by placing its TCB on some sleep queue for the event).
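A hedged sketch (not real kernel code) of this register-then-sleep pattern: the thread links its TCB onto a per-event sleep queue before blocking, so the interrupt handler can later find it and wake it. All structures and helpers here are invented for illustration; a real kernel also protects these steps with a lock or by disabling interrupts to avoid a lost wakeup.

/* Sketch of sleep/wakeup with a per-event sleep queue. */
struct tcb { struct tcb *next; /* ... thread state ... */ };
struct sleep_queue { struct tcb *head; };

void block(void);                    /* deschedule the current thread (assumed) */
void make_runnable(struct tcb *t);   /* put a thread on the ready queue (assumed) */
extern struct tcb *current;          /* the running thread (assumed) */

/* Called by a thread that must wait for an event (e.g., packet arrival). */
void sleep_on(struct sleep_queue *q)
{
    current->next = q->head;         /* 1. register: link TCB onto the sleep queue */
    q->head = current;
    block();                         /* 2. then sleep; a later wakeup makes us ready */
}

/* Called from the interrupt handler (software), not the interrupt itself. */
void wakeup_all(struct sleep_queue *q)
{
    struct tcb *t = q->head;
    q->head = NULL;
    while (t) {
        struct tcb *next = t->next;
        make_runnable(t);            /* move each waiter to the ready queue */
        t = next;
    }
}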
Process, kernel, and syscalls
In user space, the process calls a syscall stub (e.g., read() {…}) that operates on user buffers and traps into the kernel. The kernel's syscall dispatch table routes the trap to the kernel's read() {…} or write() {…} implementation, which uses the process's I/O descriptor table to reach kernel I/O objects. Data crosses the boundary via copyin/copyout, and control then returns to user mode.
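A sketch (again, not real kernel code) of the dispatch path in the picture: a table of function pointers indexed by syscall number, with an assumed copyout() helper moving data across the user/kernel boundary. All names are invented for illustration.

/* Sketch of kernel-side syscall dispatch. */
#include <stddef.h>

typedef long (*syscall_fn)(long a0, long a1, long a2);

long sys_read(long fd, long user_buf, long len);

/* Syscall dispatch table: indexed by the syscall number from the trap frame. */
static const syscall_fn dispatch_table[] = {
    [0] = sys_read,
    /* [1] = sys_write, ... */
};

/* Entry point reached from the trap handler. */
long syscall_dispatch(long num, long a0, long a1, long a2)
{
    if (num < 0 || (size_t)num >= sizeof(dispatch_table) / sizeof(dispatch_table[0]))
        return -1;                        /* bad syscall number */
    return dispatch_table[num](a0, a1, a2);
}

/* Assumed helper: copy len bytes from kernel memory to a user address. */
int copyout(const void *kaddr, long uaddr, size_t len);

long sys_read(long fd, long user_buf, long len)
{
    char kbuf[512];
    /* Look up fd in the process's I/O descriptor table and read from the
     * underlying I/O object into kbuf, at most len bytes (elided here). */
    long n = 0;                           /* bytes read from the I/O object */
    copyout(kbuf, user_buf, (size_t)n);   /* move the data to the user buffer */
    return n;                             /* result returned to user mode */
}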
Firefly: shared buffers
Performance of Firefly RPC
Michaels Schroeder and Burrows
Binding
Implementing RPC
Birrell/Nelson 1984
Optimize for the common case
Several of the structural features used to improve RPC performance collapse layers of abstraction. Programming a fast RPC is not for the squeamish.
The slower path through the operating-system address space is used when the interrupt routine cannot find the appropriate RPC thread in the call table, when it encounters a lock conflict in the call table, or when it handles a non-RPC packet.
Performance of Firefly RPC
Michaels Schroeder and Burrows
Latency and throughput
Performance of Firefly RPC
Michaels Schroeder and Burrows
Marshalling overhead
Performance of Firefly RPC
Michaels Schroeder and Burrows
Steps and overhead
Performance of Firefly RPC
Michaels Schroeder and Burrows
ASPLOS 1991
Schroeder and Burrows suggest that tripling CPU speed would reduce SRC RPC latency for a small packet by about 50%, on the expectation that the 83% of the time not spent on the wire will decrease by a factor of 3. Looking at Table 3, however, we see that much of the RPC time goes to functions that may not benefit proportionally from modern architectures. … The only real “computation” in RPC, in the traditional sense, is the checksum processing, and this in fact is memory-intensive and not compute-intensive; each checksum addition is paired with a load. … Thus, Ousterhout found in the Sprite operating system [Ousterhout et al. 88] that kernel-to-kernel null RPC time was reduced by only half when moving from a Sun-3/75 to a SPARCstation-1, even though integer performance increased by a factor of five [Ousterhout 90a].
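The point about checksumming being memory-intensive is easy to see in code: the classic Internet-style ones'-complement checksum loop pairs every addition with a load from the packet buffer. A minimal sketch:

/* 16-bit ones'-complement checksum: each addition is paired with a load,
 * so the loop is limited by memory traffic rather than ALU speed.
 */
#include <stddef.h>
#include <stdint.h>

uint16_t checksum(const uint16_t *data, size_t nwords)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += data[i];                   /* load a word, then add it */
    while (sum >> 16)                     /* fold carries back in (end-around carry) */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;                /* ones'-complement of the sum */
}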
Android: object-based RPC channels
Services (Activity Manager Service, etc.) register to advertise for clients; a client binds to a service; bindings are reference-counted. Client and service each run in their own process (JVM + libraries) and communicate through the Android binder, an add-on kernel driver for /dev/binder object RPC in the Linux kernel.
Android services and libraries communicate by sending messages through shared-memory channels set up by binder.
Binder is an add-on driver module that runs in the kernel (kernel space). Unix drivers can define arbitrary “I/O control” APIs invoked through the ioctl system call. The ioctl syscall was designed for device control, but it serves as a general mechanism to extend the kernel and syscall interface (“kitchen sink”).
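To illustrate the “kitchen sink” point, here is a hedged sketch of how a driver-defined ioctl can carry an arbitrary request/response transaction. The device name, command number, and argument struct are all invented for illustration; this is not binder's actual interface.

/* Sketch: a hypothetical driver defines its own ioctl command and
 * argument struct; a user process invokes it through the generic
 * ioctl() syscall. Error handling omitted.
 */
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

struct demo_transaction {                 /* argument block the driver defines */
    uint32_t code;                        /* which operation the caller wants */
    uint64_t send_buf, send_len;          /* request bytes */
    uint64_t recv_buf, recv_len;          /* where to put the reply */
};

/* Command number built with the standard _IOWR macro (type 'D', nr 1). */
#define DEMO_TRANSACT _IOWR('D', 1, struct demo_transaction)

int demo_call(uint32_t code, void *req, size_t reqlen, void *rep, size_t replen)
{
    int fd = open("/dev/demo", O_RDWR);   /* hypothetical device node */
    struct demo_transaction t = {
        .code = code,
        .send_buf = (uintptr_t)req, .send_len = reqlen,
        .recv_buf = (uintptr_t)rep, .recv_len = replen,
    };
    int ret = ioctl(fd, DEMO_TRANSACT, &t);   /* the driver interprets the struct */
    close(fd);
    return ret;
}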
Binder: thread pool details
“The system maintains a pool of transaction threads in each process
that it runs in. These threads are used to dispatch all IPCs coming in
from other processes.
For example, when an IPC is made from process A to process B, the
calling thread in A blocks in transact() as it sends the transaction to
process B. The next available pool thread in B receives the incoming
transaction, calls Binder.onTransact() on the target object, and replies
with the result Parcel.
Upon receiving its result, the thread in process A returns to allow its
execution to continue. …”
[http://developer.android.com/reference/android/os/IBinder.html]
Note: in this setting, a “transaction” is just an RPC request/response exchange.
Stubs and Interface Description Language
This picture illustrates the Android class structure for objects invoked over binder RPC, including classes generated via Android’s IDL (AIDL).