tl;dr: I learned asyncio and rewrote part of dask.distributed with it; this details my experience.
The asyncio library provides concurrent programming in the style of Go, Clojure's core.async library, or more traditional libraries like Twisted. Asyncio offers a programming paradigm that lets many moving parts interact without involving separate threads. These separate parts explicitly yield control to each other and to a central authority and then regain control as others yield control to them. This lets one escape traps like race conditions on shared state, a web of callbacks, lost error reporting, and general confusion.
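As a minimal sketch of that yield-and-regain pattern, the example below runs two coroutines that append to a shared list and hand control back to the event loop after every step. It uses the modern async/await syntax (Python 3.5+, with asyncio.run from 3.7), not the generator-based style available when this post was written; the names step_through and main are made up for illustration.

```python
import asyncio

async def step_through(name, steps, log):
    """Append to a shared log, yielding control after every step."""
    for i in range(steps):
        log.append((name, i))
        # Explicitly hand control back to the event loop (the central authority).
        await asyncio.sleep(0)

async def main():
    log = []  # shared state; only one coroutine touches it at any moment
    await asyncio.gather(step_through("a", 3, log), step_through("b", 3, log))
    return log

log = asyncio.run(main())
print(log)
```

Because control changes hands only at the await points, the shared list needs no lock.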
I'm not going to write too much about asyncio. Instead I'm going to briefly describe my problem, link to a solution, and then dive into good-and-bad points about using asyncio while they're fresh in my mind.
I won't actually discuss the application much after this section; you can safely skip this.
I decided to rewrite the dask.distributed Worker using asyncio. This worker has to do the following:
- Store local data in a dictionary (easy)
- Perform computations on that data as requested by a remote connection (act as a server in a client-server relationship)
- Collect data from other workers when we don't have all of the necessary data for a computation locally (peer-to-peer)
- Serve data to other workers who need our data for their own computations (peer-to-peer)
It's a sort of distributed RPC mechanism with peer-to-peer value sharing. Metadata for who-has-what data is stored in a central metadata store; this could be something like Redis.
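The four responsibilities above can be sketched in a few lines. This is not the code from the repository linked below: it is a hypothetical, in-process stand-in in which a plain dict plays the central metadata store, peers are ordinary Python objects instead of network connections, and the class and method names (Worker, store, serve, collect, compute) are invented for illustration. It uses modern async/await syntax.

```python
import asyncio

class Worker:
    """Hypothetical in-process stand-in for a networked worker."""

    def __init__(self, name, who_has):
        self.name = name
        self.data = {}          # local key -> value store
        self.who_has = who_has  # shared metadata: key -> Worker holding it

    def store(self, key, value):
        """Keep a value locally and record it in the metadata store."""
        self.data[key] = value
        self.who_has[key] = self

    async def serve(self, key):
        """Serve one of our values to a peer."""
        await asyncio.sleep(0)  # stands in for network latency
        return self.data[key]

    async def collect(self, keys):
        """Fetch any keys we lack from the peers that hold them."""
        missing = [k for k in keys if k not in self.data]
        values = await asyncio.gather(
            *(self.who_has[k].serve(k) for k in missing))
        self.data.update(zip(missing, values))

    async def compute(self, key, func, arg_keys):
        """Run func on the named inputs and store the result locally."""
        await self.collect(arg_keys)
        self.store(key, func(*(self.data[k] for k in arg_keys)))
        return self.data[key]

async def main():
    who_has = {}  # a dict standing in for something like Redis
    alice, bob = Worker("alice", who_has), Worker("bob", who_has)
    alice.store("x", 1)
    bob.store("y", 2)
    # alice holds x but must fetch y from bob before computing z
    result = await alice.compute("z", lambda x, y: x + y, ["x", "y"])
    return result, alice

result, alice = asyncio.run(main())
print(result)
```

A real worker would replace serve and collect with socket reads and writes, but the control flow would stay much the same.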
The current implementation of this is a nest of threads, queues, and callbacks. It's not bad and performs well but tends to be hard for others to develop.
Additionally I want to separate the worker code because it's useful outside of dask.distributed. Other distributed computation solutions exist in my head that rely on this technology.
For the moment the code lives here: https://github.com/mrocklin/dist . I like the design. The module-level docstring of worker.py is short and informative. But again, I'm not going to discuss the application yet; instead, here are some thoughts on learning/developing with asyncio.
Disclaimer: I am a novice concurrent programmer. I write lots of parallel code but little concurrent code. I have never used existing frameworks like Twisted.
I liked the experience of using asyncio and recommend the paradigm to anyone building concurrent applications.
- I can write complex code that involves multiple asynchronous calls, complex logic, and exception handling all in a single place. Complex application logic is no longer spread in many places.
- Debugging is much easier now that I can throw import pdb; pdb.set_trace() lines into my code and expect them to work (this fails when using threads).
- My code fails more gracefully, further improving the debug experience.
- The paradigm shared by Go, Clojure's core.async, and Python's asyncio felt viscerally good. I was able to reason well about my program as I was building it and made nice diagrams about explicitly which sequential processes interacted with which others over which channels. I am much more confident of the correctness of the implementation and the design of my program. However, after having gone through this exercise I suspect that I could now implement just about the same design without asyncio. The design paradigm was perhaps as important as the library itself.
- I have to support Python 2. Fortunately I found the trollius port of asyncio to be very usable. It looks like it was a direct fork-then-modify of
- There wasn't a ZeroMQ connectivity layer for Trollius (though aiozmq exists in Python 3) so I ended up having to use threads anyway for inter-node I/O. This, combined with ZeroMQ's finicky behavior, did mean that my program crashed hard sometimes. I'm considering switching to plain sockets (which are supported natively by Trollius and asyncio) due to this.
- While exceptions raise cleanly I can't determine from where they originate. There are no line numbers or tracebacks. Debugging in a concurrent environment is hard; my experience was definitely better than threads but still could be improved. I hope that asyncio in Python 3.4 has better debugging support.
- The API documentation is thorough but Stack Overflow answers, general best practices, and example coverage are very sparse. The project is new so there isn't much to go on. I found that reading documentation for Go and presentations on Clojure's core.async were far more helpful in preparing me to use asyncio than any of the asyncio docs/presentations.
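To illustrate the point above about keeping multiple asynchronous calls, logic, and exception handling in a single place, here is a small sketch in modern async/await syntax. The fetch function is a hypothetical stand-in for a remote lookup and add_remote is an invented name; neither comes from the worker code.

```python
import asyncio

async def fetch(key, store):
    """Hypothetical remote lookup; raises KeyError if the key is missing."""
    await asyncio.sleep(0)  # stands in for a network round trip
    return store[key]

async def add_remote(store, a, b):
    """Two asynchronous calls, the logic, and the error handling together."""
    try:
        x = await fetch(a, store)
        y = await fetch(b, store)
    except KeyError as e:
        # A failure in either call is handled here, not in a distant callback.
        return "missing input: %s" % e.args[0]
    return x + y

async def main():
    store = {"x": 1, "y": 2}
    ok = await add_remote(store, "x", "y")
    err = await add_remote(store, "x", "q")
    return ok, err

ok, err = asyncio.run(main())
print(ok, err)
```

With callbacks, the success path and the failure path of each fetch would typically live in separate functions; here they read top to bottom like sequential code.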
I intend to pursue this into the future and, if the debugging experience is better in Python 3, am considering rewriting the dask.distributed Scheduler in Python 3 with asyncio proper. This is possible because the Scheduler doesn't have to be compatible with user code.
I found these videos to be useful: