"

This work is supported by Continuum Analytics and the XDATA Program as part of the Blaze Project

tl;dr: I learned asyncio and rewrote part of dask.distributed with it; this post details my experience.

asyncio

The asyncio library provides concurrent programming in the style of Go, Clojure's core.async library, or more traditional libraries like Twisted. Asyncio offers a programming paradigm that lets many moving parts interact without involving separate threads. These separate parts explicitly yield control to each other and to a central authority, and then regain control as others yield control to them. This lets one escape traps like race conditions on shared state, a web of callbacks, lost error reporting, and general confusion.
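To make the "explicitly yield control" idea concrete, here is a minimal sketch. It assumes a current Python 3 with async/await syntax (the work described in this post used Trollius on Python 2), and the names step_through and main are made up for illustration.

```python
import asyncio


async def step_through(name, steps, log):
    """Append to a shared log, yielding control after every step."""
    for i in range(steps):
        log.append((name, i))
        # Explicitly hand control back to the event loop so that other
        # coroutines get a turn. No threads or locks are involved.
        await asyncio.sleep(0)


async def main():
    log = []  # shared state, only touched between explicit yield points
    await asyncio.gather(step_through("a", 3, log), step_through("b", 3, log))
    return log


log = asyncio.run(main())
print(log)
```

Because each coroutine only gives up control at the await, the shared list is never modified by two parts of the program at once.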

I'm not going to write too much about asyncio. Instead I'm going to briefly describe my problem, link to a solution, and then dive into good and bad points about using asyncio while they're fresh in my mind.

Exercise

I won't actually discuss the application much after this section; you can safely skip this.

I decided to rewrite the dask.distributed Worker using asyncio. This worker has to do the following:

  1. Store local data in a dictionary (easy)
  2. Perform computations on that data as requested by a remote connection (act as a server in a client-server relationship)
  3. Collect data from other workers when we don't have all of the necessary data for a computation locally (peer-to-peer)
  4. Serve data to other workers who need our data for their own computations (peer-to-peer)

It's a sort of distributed RPC mechanism with peer-to-peer value sharing. Metadata for who-has-what data is stored in a central metadata store; this could be something like Redis.
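The four responsibilities above can be sketched in a few lines. This is not the code from the repository linked below; it is an in-process toy with no networking, where a plain dict stands in for the central metadata store and the names Worker, put, get_data, collect, and compute are all hypothetical.

```python
import asyncio


class Worker:
    """Toy worker: local store, compute on request, peer-to-peer sharing."""

    def __init__(self, name, who_has, peers):
        self.name = name
        self.data = {}          # 1. local data lives in a dictionary
        self.who_has = who_has  # stand-in for the central metadata store
        self.peers = peers      # name -> Worker, stand-in for network addresses
        peers[name] = self

    def put(self, key, value):
        self.data[key] = value
        self.who_has.setdefault(key, set()).add(self.name)

    async def get_data(self, key):
        # 4. serve data to a peer that asks for it
        await asyncio.sleep(0)  # stand-in for network latency
        return self.data[key]

    async def collect(self, keys):
        # 3. gather whatever we are missing from other workers
        async def fetch(key):
            owner = next(iter(self.who_has[key]))
            self.put(key, await self.peers[owner].get_data(key))

        missing = [k for k in keys if k not in self.data]
        await asyncio.gather(*(fetch(k) for k in missing))

    async def compute(self, func, keys, out_key):
        # 2. perform a computation requested by a remote caller
        await self.collect(keys)
        result = func(*(self.data[k] for k in keys))
        self.put(out_key, result)
        return result


async def main():
    who_has, peers = {}, {}
    alice = Worker("alice", who_has, peers)
    bob = Worker("bob", who_has, peers)
    alice.put("x", 1)
    bob.put("y", 2)
    total = await alice.compute(lambda x, y: x + y, ["x", "y"], "z")
    return total, who_has


total, who_has = asyncio.run(main())
print(total, who_has)
```

In a real deployment get_data would go over the network and who_has would live in an external store; the control flow is the part this sketch is meant to show.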

The current implementation of this is a nest of threads, queues, and callbacks. It's not bad and performs well, but it tends to be hard for others to develop.

Additionally, I want to separate the worker code because it's useful outside of dask.distributed. Other distributed computation solutions exist in my head that rely on this technology.

For the moment the code lives here: https://github.com/mrocklin/dist. I like the design. The module-level docstring of worker.py is short and informative. But again, I'm not going to discuss the application yet; instead, here are some thoughts on learning and developing with asyncio.

General Thoughts

Disclaimer: I am a novice concurrent programmer. I write lots of parallel code but little concurrent code. I have never used existing frameworks like Twisted.

I liked the experience of using asyncio and recommend the paradigm to anyone building concurrent applications.

The Good:

  • I can write complex code that involves multiple asynchronous calls, complex logic, and exception handling all in a single place. Complex application logic is no longer spread across many places.
  • Debugging is much easier now that I can throw import pdb; pdb.set_trace() lines into my code and expect them to work (this fails when using threads).
  • My code fails more gracefully, further improving the debug experience. Ctrl-C works.
  • The paradigm shared by Go, Clojure's core.async, and Python's asyncio felt viscerally good. I was able to reason well about my program as I was building it and made nice diagrams showing explicitly which sequential processes interacted with which others over which channels. I am much more confident of the correctness of the implementation and the design of my program. However, after having gone through this exercise I suspect that I could now implement just about the same design without asyncio. The design paradigm was perhaps as important as the library itself.
  • I have to support Python 2. Fortunately I found the trollius port of asyncio to be very usable. It looks like it was a direct fork-then-modify of tulip.
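As an illustration of the first and fourth points, here is a small sketch of two sequential processes talking over a channel, with the asynchronous calls and the error handling sitting together in one function. It assumes a current Python 3 rather than Trollius, uses asyncio.Queue as the channel, and every function name in it (flaky_fetch, producer, consumer) is invented for the example.

```python
import asyncio


async def flaky_fetch(key):
    """Hypothetical remote call that fails for one particular key."""
    await asyncio.sleep(0)
    if key == "bad":
        raise KeyError(key)
    return key.upper()


async def producer(keys, channel):
    for key in keys:
        await channel.put(key)  # blocks while the channel is full
    await channel.put(None)     # sentinel: nothing more to send


async def consumer(channel):
    results, errors = [], []
    while True:
        key = await channel.get()
        if key is None:
            break
        # The async call, the timeout, and the error handling all live here.
        try:
            value = await asyncio.wait_for(flaky_fetch(key), timeout=1.0)
        except KeyError as exc:
            errors.append(exc.args[0])
        else:
            results.append(value)
    return results, errors


async def main():
    channel = asyncio.Queue(maxsize=1)
    _, outcome = await asyncio.gather(
        producer(["a", "bad", "b"], channel), consumer(channel)
    )
    return outcome


results, errors = asyncio.run(main())
print(results, errors)
```

The same logic written with threads and callbacks would typically split the success path, the failure path, and the timeout into separate places.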

The Bad:

  • There wasn't a ZeroMQ connectivity layer for Trollius (though aiozmq exists in Python 3), so I ended up having to use threads anyway for inter-node I/O. This, combined with ZeroMQ's finicky behavior, meant that my program sometimes crashed hard. I'm considering switching to plain sockets (which are supported natively by Trollius and asyncio) because of this.
  • While exceptions raise cleanly, I can't determine where they originate. There are no line numbers or tracebacks. Debugging in a concurrent environment is hard; my experience was definitely better than with threads but could still be improved. I hope that asyncio in Python 3.4 has better debugging support.
  • The API documentation is thorough, but Stack Overflow answers, general best practices, and examples are very sparse. The project is new, so there isn't much to go on. I found that reading documentation for Go and presentations on Clojure's core.async were far more helpful in preparing me to use asyncio than any of the asyncio docs or presentations.
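On the traceback point, the following sketch shows what I would expect on a current Python 3 with native async/await, which is an assumption and not the Trollius setup used here: an exception that propagates through awaited coroutines carries the frames of each coroutine it passed through. The functions load and compute are hypothetical.

```python
import asyncio
import traceback


async def load(key):
    await asyncio.sleep(0)
    raise KeyError(key)  # the failure we want to locate


async def compute(key):
    return await load(key)


async def main():
    try:
        await compute("x")
    except KeyError:
        # Capture the formatted traceback instead of letting it escape.
        return traceback.format_exc()


report = asyncio.run(main())
print(report)
```

Passing debug=True to asyncio.run additionally turns on asyncio's debug mode, which reports things like coroutines that were never awaited.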

Future

I intend to pursue this into the future and, if the debugging experience is better in Python 3, am considering rewriting the dask.distributed Scheduler in Python 3 with asyncio proper. This is possible because the Scheduler doesn't have to be compatible with user code.

I found these videos to be useful:

  1. Stuart Halloway on core.async
  2. David Nolen on core.async
"


