gevent network library
Denis Bilenko
gevent.org
Problem statement
from urllib2 import urlopen
response = urlopen('http://gevent.org')
body = response.read()
How to manage concurrent connections?
Problem statement
Possible answer: Async framework (Twisted, asyncore, ...)
def on_response_read(response):
    d = response.read()
    d.addCallbacks(on_body_read, on_error)

def on_error(error):
    ...

def on_body_read(body):
    ...

d = readURL('http://gevent.org')
d.addCallbacks(on_response_read, on_error)
reactor.run()
simplicity is lost
Problem statement
Possible answer: Threads
import urllib2
from threading import Thread

def read_url(url):
    response = urllib2.urlopen(url)
    body = response.read()

t1 = Thread(target=read_url, args=('http://gevent.org',))
t1.start()
t2 = Thread(target=read_url, args=('http://python.org',))
t2.start()
t1.join()
t2.join()
resource hog
Memory required for 10k connections
threading: 400 MB
twisted: 55 MB
gevent (greenlet + libevent)
from gevent import monkey; monkey.patch_all()
import gevent
import urllib2

def read_url(url):
    response = urllib2.urlopen(url)
    body = response.read()

a = gevent.spawn(read_url, 'http://gevent.org')
b = gevent.spawn(read_url, 'http://python.org')
gevent.joinall([a, b])
concurrent fetch
Memory required for 10k connections
threading: 400 MB
twisted: 55 MB
gevent: 70 MB
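To put the totals above in per-connection terms, here is a small arithmetic sketch. The three totals come from the slide; the helper name `per_connection_kb` and the choice of 1 MB = 1024 KB are assumptions made for illustration only.

```python
# Totals quoted on the slide, in MB, for 10,000 concurrent connections.
TOTAL_MB = {"threading": 400, "twisted": 55, "gevent": 70}
CONNECTIONS = 10000


def per_connection_kb(total_mb, connections=CONNECTIONS):
    """Average memory per connection in KB (assumes 1 MB = 1024 KB)."""
    return total_mb * 1024.0 / connections


for name in sorted(TOTAL_MB):
    kb = per_connection_kb(TOTAL_MB[name])
    print("%-10s %5.1f KB per connection" % (name, kb))
```

Under that assumption a thread-backed connection costs roughly 41 KB, while a gevent or twisted connection stays in the single-digit KB range.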
greenlet
from greenlet import greenlet
>>> def myfunction(arg):
...     return arg + 1
>>> g = greenlet(myfunction)
>>> g.switch(2)
3
from greenlet import greenlet
>>> MAIN = greenlet.getcurrent()
>>> def myfunction(arg):
...     MAIN.switch('hello')
...     return arg + 1
>>> g = greenlet(myfunction)
>>> g.switch(2)
'hello'
>>> g.switch('hello to you')
3
switching deep down the stack
>>> def myfunction(arg):
...     MAIN.switch('hello')
...     return arg + 1
>>> def top_function(arg):
...     return myfunction(arg)
>>> g = greenlet(top_function)
>>> g.switch(2)
'hello'
from greenlet import greenlet
• primitive pseudothreads, share same OS thread
• switched explicitly via switch() and throw()
• organized in a tree, each has .parent except MAIN
• switch(), throw() and .parent reserved for gevent
http://codespeak.net/py/0.9.2/greenlet.html
How gevent uses greenlet
MAIN → HUB → spawned greenlets (parent → child)
Hub: greenlet that runs event loop
import greenlet
from gevent import core

class Hub(greenlet.greenlet):
    def run(self):
        core.dispatch()  # wrapper for event_dispatch()

def get_hub():
    # return the global Hub instance,
    # creating one if it does not exist
gevent/hub.py
Event loop
• libevent 1.4.x or 2.0.5-beta
• gevent.core: wraps libevent API (like pyevent)
>>> def print_hello():
...     print 'hello'
>>> gevent.core.timer(1, print_hello)
<timer ...>
>>> gevent.core.dispatch()
hello
1 # return value (no more events)
Implementation of gevent.sleep()
def sleep(seconds=0):
    """Put the current greenlet to sleep"""
    switch = getcurrent().switch
    timer = core.timer(seconds, switch)
    try:
        get_hub().switch()
    finally:
        timer.cancel()
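The pattern above (register a timer that will resume the current task, then hand control to the loop) can be tried without gevent installed. The sketch below is not gevent's implementation: it swaps greenlets for plain Python generators and libevent for a heap of deadlines, and the names `MiniHub`, `sleep` and `worker` are hypothetical.

```python
import heapq
import itertools
import time


class MiniHub(object):
    """Toy event loop: resumes generator-based tasks when their timers expire."""

    def __init__(self):
        self._timers = []               # heap of (deadline, seq, task)
        self._seq = itertools.count()   # tie-breaker, so tasks are never compared

    def spawn(self, task):
        # Schedule the task to run as soon as the loop starts.
        self._schedule(0, task)

    def _schedule(self, seconds, task):
        deadline = time.time() + seconds
        heapq.heappush(self._timers, (deadline, next(self._seq), task))

    def run(self):
        # Loop until there are no more pending timers.
        while self._timers:
            deadline, _, task = heapq.heappop(self._timers)
            delay = deadline - time.time()
            if delay > 0:
                time.sleep(delay)       # nothing else is due before this deadline
            try:
                seconds = next(task)    # resume the task until its next "sleep"
            except StopIteration:
                continue                # task finished
            self._schedule(seconds, task)


def sleep(seconds):
    """Value a task yields to ask the hub to resume it after `seconds`."""
    return seconds


log = []


def worker(name, pause, steps):
    for i in range(steps):
        log.append((name, i))
        yield sleep(pause)              # give control back to the hub


hub = MiniHub()
hub.spawn(worker("a", 0.02, 2))
hub.spawn(worker("b", 0.03, 2))
hub.run()
print(log)
```

The two workers interleave on a single OS thread. The visible difference from the greenlet version is that a generator has to `yield` at the top level of the task, whereas a greenlet can switch to the hub from any depth of the call stack.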
Cooperative socket
• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket
def recv(self, size):
    while True:
        try:
            return self._sock.recv(size)
        except error, ex:
            if ex[0] == EWOULDBLOCK:
                wait_read(self.fileno())
            else:
                raise
def wait_read(fileno):
    switch = getcurrent().switch
    event = core.read_event(fileno, switch)
    try:
        get_hub().switch()
    finally:
        event.cancel()
gevent/socket.py
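The retry loop shown above (try `recv`, wait for readability when the call would block, try again) can be exercised with the standard library alone. This is a sketch under stated assumptions, not gevent's code: `wait_read` here blocks in `select()` instead of switching to the hub, and `cooperative_recv` is a hypothetical name. It is written for Python 3, where a would-block condition surfaces as `BlockingIOError`.

```python
import select
import socket


def wait_read(sock, timeout=None):
    # Stand-in for gevent's wait_read(): block in select() until readable.
    # gevent would instead switch to the hub and let other greenlets run.
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise socket.timeout("timed out waiting for data")


def cooperative_recv(sock, size):
    """Synchronous-looking recv() on top of a non-blocking socket."""
    while True:
        try:
            return sock.recv(size)
        except BlockingIOError:
            # No data yet (EWOULDBLOCK/EAGAIN): wait, then retry.
            wait_read(sock)


left, right = socket.socketpair()
right.setblocking(False)          # the wrapped socket is non-blocking
left.sendall(b"hello")
data = cooperative_recv(right, 5)
print(data)
left.close()
right.close()
```

The caller sees an ordinary blocking interface; only the waiting strategy behind `wait_read` decides whether the whole thread blocks or just the current greenlet.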
Cooperative socket
• gevent.socket
• DNS queries are resolved through libevent-dns (getaddrinfo, gethostbyname)
• gevent.ssl
Monkey patching
from gevent import monkey; monkey.patch_all()
import gevent
import urllib2

def read_url(url):
    response = urllib2.urlopen(url)
    body = response.read()

a = gevent.spawn(read_url, 'http://gevent.org')
b = gevent.spawn(read_url, 'http://python.org')
gevent.joinall([a, b])
Monkey patching
Patches:
• socket and ssl modules
• time.sleep, select.select
• thread and threading
Beware:
• libraries that wrap C libraries (e.g. MySQLdb)
• Disk I/O
• things not yet patched: subprocess, os.system, sys.stdin
Tested with httplib, urllib2, mechanize, mysql-connector,
SQLAlchemy, ...
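What a monkey patch boils down to can be shown with the standard library alone. The sketch below is only an illustration of the mechanism, not what `monkey.patch_all()` actually does internally: `cooperative_sleep` and `third_party_code` are hypothetical names, and the replacement merely records the call instead of yielding to an event loop.

```python
import time

calls = []
original_sleep = time.sleep


def cooperative_sleep(seconds):
    # Hypothetical replacement; a real patch would install a function
    # that yields to the event loop instead of blocking the thread.
    calls.append(seconds)


def third_party_code():
    # Unsuspecting code: it looks up time.sleep at call time.
    time.sleep(0.5)


time.sleep = cooperative_sleep      # the patch: rebind the module attribute
try:
    third_party_code()
finally:
    time.sleep = original_sleep     # undo the patch so the demo has no side effects

print(calls)
```

Code that ran `from time import sleep` before the patch keeps its reference to the original function, which is one reason to apply the patch as early as possible.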
Greenlet objects
def read_url(url):
    response = urllib2.urlopen(url)
    body = response.read()

g = Greenlet(read_url, url)
g.start()
# the two lines above are equivalent to: g = gevent.spawn(read_url, url)

# wait for it to complete
g.join()
# or raise an exception in the greenlet and wait for it to exit
g.kill()
Greenlet objects
def read_url(url):
    response = urllib2.urlopen(url)
    body = response.read()

g = Greenlet(read_url, url)
g.start()
# the two lines above are equivalent to: g = gevent.spawn(read_url, url)

# wait for it to complete (or until the timeout expires)
g.join(timeout=2)
# or raise an exception in the greenlet and wait for it to exit (or until the timeout expires)
g.kill(timeout=2)
Timeouts
with gevent.Timeout(5):
    response = urllib2.urlopen(url)
    for line in response:
        print line
# raises Timeout if not done after 5 seconds

with gevent.Timeout(5, False):
    response = urllib2.urlopen(url)
    for line in response:
        print line
# exits block if not done after 5 seconds
Beware: a catch-all “except:” swallows the Timeout; non-yielding code cannot be interrupted by it
API
• socket, ssl
• Greenlet
• Timeout
• Event, AsyncResult
• Queue (also JoinableQueue, PriorityQueue, LifoQueue)
– Queue(0) is a synchronous channel
• Pool
• StreamServer: TCP and SSL servers
• WSGI servers
WSGI servers
• gevent.wsgi
– uses libevent-http
– efficient, but lacks important features
• gevent.pywsgi
– uses gevent sockets
• green unicorn (gunicorn.org)
– its own parser or gevent’s server
– pre-fork workers
Caveat emptor
• Reduced portability
– no Jython, IronPython
– not all platforms supported by CPython
• PyThreadState is shared
– exc_info (saved/restored by gevent)
– tracing, profiling info
Future plans
• http://code.google.com/p/gevent/issues/list
• alternative coroutine libraries
– Stackless
– swapcontext
• more libevent:
– http client
– buffered socket operations
– priorities
• process handling (gevent.subprocess)
• even more stable API with 1.0
Examples
• bitbucket.org/denis/gevent/src/tip/examples/
• chat.gevent.org
• omegle.com
• ProjectsUsingGevent
– gevent-mysql
– psycopg2
• bit.ly/use-gevent
– websockets, web crawlers, facebook apps
Summary
• coroutines are easy-to-use threads
• as efficient as async libraries
• works well if app is I/O bound
• simple API, many things familiar
• works with unsuspecting 3rd party modules
Thank you!
gevent.org
@gevent