gevent network library
Denis Bilenko
gevent.org

Problem statement

    from urllib2 import urlopen

    response = urlopen('http://gevent.org')
    body = response.read()

How to manage concurrent connections?

Problem statement
Possible answer: Async framework (Twisted, asyncore, ...)

    def on_response_read(response):
        d = response.read()
        d.addCallbacks(on_body_read, on_error)

    def on_error(error):
        ...

    def on_body_read(body):
        ...

    d = readURL('http://gevent.org')
    d.addCallbacks(on_response_read, on_error)
    reactor.run()

simplicity is lost

Problem statement
Possible answer: Threads

    import urllib2
    from threading import Thread

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    t1 = Thread(target=read_url, args=('http://gevent.org',))
    t1.start()
    t2 = Thread(target=read_url, args=('http://python.org',))
    t2.start()
    t1.join()
    t2.join()

resource hog

Memory required for 10k connections

    threading   400 MB
    twisted      55 MB

gevent (greenlet + libevent)

    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])

concurrent fetch

Memory required for 10k connections

    threading   400 MB
    twisted      55 MB
    gevent       70 MB

greenlet

    >>> from greenlet import greenlet
    >>> def myfunction(arg):
    ...     return arg + 1
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    3

    >>> MAIN = greenlet.getcurrent()
    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    'hello'
    >>> g.switch('hello to you')
    3

switching deep down the stack

    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    >>> def top_function(arg):
    ...     return myfunction(arg)
    >>> g = greenlet(top_function)
    >>> g.switch(2)
    'hello'

from greenlet import greenlet
• primitive pseudothreads, share same OS thread
• switched explicitly via switch() and throw()
• organized in a tree, each has .parent except MAIN
• switch(), throw() and .parent reserved for gevent
http://codespeak.net/py/0.9.2/greenlet.html

How gevent uses greenlet
[diagram: MAIN, HUB, spawned greenlets]

Hub: greenlet that runs event loop

    from gevent import core

    class Hub(greenlet.greenlet):
        def run(self):
            core.dispatch()  # wrapper for event_dispatch()

    def get_hub():
        # return the global Hub instance,
        # creating one if it does not exist

gevent/hub.py

Event loop
• libevent 1.4.x or 2.0.5-beta
• gevent.core: wraps libevent API (like pyevent)

    >>> def print_hello():
    ...     print 'hello'
    >>> gevent.core.timer(1, print_hello)
    <timer ...>
    >>> gevent.core.dispatch()
    hello
    1  # return value (no more events)

Implementation of gevent.sleep()

    def sleep(seconds=0):
        """Put the current greenlet to sleep"""
        switch = getcurrent().switch
        timer = core.timer(seconds, switch)
        try:
            get_hub().switch()
        finally:
            timer.cancel()

Cooperative socket
• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def recv(self, size):
        while True:
            try:
                return self._sock.recv(size)
            except error, ex:
                if ex[0] == EWOULDBLOCK:
                    wait_read(self.fileno())
                else:
                    raise

    def wait_read(fileno):
        switch = getcurrent().switch
        event = core.read_event(fileno, switch)
        try:
            get_hub().switch()
        finally:
            event.cancel()

gevent/socket.py

Cooperative socket
• gevent.socket
  – dns queries are resolved through libevent-dns (getaddrinfo, gethostbyname)
• gevent.ssl

Monkey patching

    from gevent import monkey; monkey.patch_all()

Patches:
• socket and ssl modules
• time.sleep, select.select
• thread and threading
Beware:
• libraries that wrap C libraries (e.g. MySQLdb)
• Disk I/O
• things not yet patched: subprocess, os.system, sys.stdin
Tested with httplib, urllib2, mechanize, mysql-connector, SQLAlchemy, ...

Greenlet objects

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    # these two lines are equivalent to spawn(read_url, url)
    g = Greenlet(read_url, url)
    g.start()

    # wait for it to complete
    g.join()
    # or raise an exception and wait for it to exit
    g.kill()

    # wait for it to complete (or for the timeout to expire)
    g.join(timeout=2)
    # or raise and wait for it to exit (or for the timeout to expire)
    g.kill(timeout=2)

Timeouts

    with gevent.Timeout(5):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # raises Timeout if not done after 5 seconds

    with gevent.Timeout(5, False):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # exits block if not done after 5 seconds

Beware: catch-all "except:", non-yielding code

API
• socket, ssl
• Greenlet
• Timeout
• Event, AsyncResult
• Queue (also JoinableQueue, PriorityQueue, LifoQueue)
  – Queue(0) is a synchronous channel
• Pool
• StreamServer: TCP and SSL servers
• WSGI servers

WSGI servers
• gevent.wsgi
  – uses libevent-http
  – efficient, but lacks important features
• gevent.pywsgi
  – uses gevent sockets
• green unicorn (gunicorn.org)
  – its own parser or gevent's server
  – pre-fork workers

Caveat emptor
• Reduced portability
  – no Jython, IronPython
  – not all platforms supported by CPython
• PyThreadState is shared
  – exc_info (saved/restored by gevent)
  – tracing, profiling info

Future plans
• http://code.google.com/p/gevent/issues/list
• alternative coroutine libraries
  – Stackless
  – swapcontext
• more libevent:
  – http client
  – buffered socket operations
  – priorities
• process handling (gevent.subprocess)
• even more stable API with 1.0

Examples
• bitbucket.org/denis/gevent/src/tip/examples/
• chat.gevent.org
• omegle.com
• ProjectsUsingGevent
  – gevent-mysql
  – psycopg2
• bit.ly/use-gevent
  – websockets, web crawlers, facebook apps

Summary
• coroutines are easy-to-use threads
• as efficient as async libraries
• works well if app is I/O bound
• simple API, many things familiar
• works with unsuspecting 3rd party modules

Thank you!
gevent.org
@gevent
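
Supplementary sketch: the hub and sleep() idea in plain Python

The gevent.sleep() slide boils down to: register a timer whose callback resumes the current task, then hand control to the hub. The sketch below imitates that pattern with standard-library generators in place of greenlets and a heapq timer list in place of libevent, so it runs without gevent installed. It is only an illustration of the scheduling idea, not gevent's implementation; ToyHub, worker and the simulated clock are hypothetical names invented for this example.

```python
import heapq
import itertools


class ToyHub:
    """Toy stand-in for a hub: owns the timers and runs the loop.

    Hypothetical class for illustration; not part of gevent.
    """

    def __init__(self):
        self.now = 0.0                  # simulated clock, so the demo runs instantly
        self._timers = []               # heap of (wake_time, sequence, task)
        self._seq = itertools.count()   # tie-breaker for equal wake times

    def spawn(self, task):
        # Schedule a generator-based task to start at the current time.
        heapq.heappush(self._timers, (self.now, next(self._seq), task))

    def run(self):
        # Dispatch until no timers remain.
        while self._timers:
            wake_time, _, task = heapq.heappop(self._timers)
            self.now = wake_time
            try:
                delay = next(task)      # resume the task until it "sleeps" again
            except StopIteration:
                continue                # task finished, nothing to reschedule
            heapq.heappush(
                self._timers, (self.now + delay, next(self._seq), task)
            )


def worker(name, delay, log):
    # "yield delay" plays the role of sleep(delay):
    # ask for a timer, then give control back to the hub.
    for step in range(2):
        log.append((name, step))
        yield delay


log = []
hub = ToyHub()
hub.spawn(worker("a", 2, log))
hub.spawn(worker("b", 1, log))
hub.run()
print(log)
```

Task "b" sleeps for a shorter time, so its second step runs before the second step of "a", even though "a" was spawned first. The difference from greenlets is the one the "switching deep down the stack" slide points at: a generator can only yield from its own frame, while a greenlet can switch from any depth of nested calls.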