Using Dask.distributed Plugins to Send Task Notifications

Dask is a fantastic library for parallel computing in Python. It is designed to integrate seamlessly with many PyData libraries, including NumPy, Pandas, and Scikit-Learn. I use it heavily to crunch terabytes of data and to scale multiple data science pipelines across a cluster with Dask.distributed.

Recently, I wanted to monitor task progress on long-running jobs submitted to my dask.distributed clusters. This turned out to be easy to implement by extending the distributed scheduler through the SchedulerPlugin interface.

This plugin interface lets custom code run whenever certain events fire on the cluster’s scheduler, such as a task changing state. Since I wanted reports on task states pushed to my Slack workspace and to my phone via the Pushover app, I used a Python library called Notifiers in my plugin script to send the notifications automatically.

Here is a sample scheduler plugin script that sends a report whenever tasks complete and their results are held in memory:
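A minimal sketch of such a plugin might look like the following. It assumes the transition hook of SchedulerPlugin and the get_notifier(...).notify(...) calls from Notifiers. The class name TaskReportPlugin, the every parameter, the send_notifications helper, and the three environment variable names are hypothetical, chosen only for illustration.

```python
import os
import time

try:
    from distributed.diagnostics.plugin import SchedulerPlugin
except ImportError:
    # Fallback so the sketch can be read and tried without dask.distributed installed.
    SchedulerPlugin = object


class TaskReportPlugin(SchedulerPlugin):
    """Count tasks that reach the 'memory' state and report every `every` tasks."""

    def __init__(self, send, every=1000):
        self.send = send          # callable that delivers a message string
        self.every = every        # hypothetical batching knob, not a Dask option
        self.completed = 0
        self.started = time.time()

    def transition(self, key, start, finish, *args, **kwargs):
        # The scheduler calls this hook on every task state change.
        if start == "processing" and finish == "memory":
            self.completed += 1
            if self.completed % self.every == 0:
                elapsed = time.time() - self.started
                self.send(f"{self.completed} tasks completed in {elapsed:.1f}s")


def send_notifications(message):
    """Push a message to Slack and Pushover through Notifiers."""
    from notifiers import get_notifier  # imported lazily; needs `pip install notifiers`

    # Assumed environment variables holding the credentials.
    get_notifier("slack").notify(
        message=message, webhook_url=os.environ["SLACK_WEBHOOK_URL"]
    )
    get_notifier("pushover").notify(
        message=message,
        user=os.environ["PUSHOVER_USER"],
        token=os.environ["PUSHOVER_TOKEN"],
    )


def dask_setup(scheduler):
    # Entry point that the scheduler runs when the file is given as a preload script.
    scheduler.add_plugin(TaskReportPlugin(send_notifications))
```

Reporting in batches rather than per task keeps the number of notifications manageable. Note that transition runs inside the scheduler process, so slow network calls there can delay scheduling; a production version would probably hand the message off to a background thread.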

We can load this plugin by manually launching a scheduler from the command line on one node and passing the --preload option:

$ dask-scheduler --preload /path/to/myplugin.py

Or by passing preload through the scheduler_options parameter of one of Dask's supported cluster deployment interfaces. For this example, I will use SSHCluster to deploy over SSH:

from dask.distributed import Client, SSHCluster

cluster = SSHCluster(['node1', 'node2'],
                     connect_options={'known_hosts': None},
                     worker_options={'nthreads': 2, 'nprocs': 1},
                     scheduler_options={'preload': '/path/to/myplugin.py'}
                    )
client = Client(cluster)

Now, let us submit some tasks:

def square(x):
    return x ** 2

def neg(x):
    return -x

a = client.map(square, range(1000))
b = client.map(neg, a)
total = client.submit(sum, b)
total.result()
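As a quick sanity check that needs no cluster, the same computation can be reproduced with plain Python. This sketch only mirrors the functions above; the names expected_total and task_count are introduced here for illustration. It also counts how many tasks the scheduler should see: 1000 squares, 1000 negations, and one sum.

```python
def square(x):
    return x ** 2


def neg(x):
    return -x


# Same pipeline as the cluster version, evaluated locally.
squares = [square(x) for x in range(1000)]
negated = [neg(y) for y in squares]
expected_total = sum(negated)

# One task per mapped element, plus the final sum.
task_count = len(squares) + len(negated) + 1

print(expected_total)  # -332833500
print(task_count)      # 2001
```

The value returned by total.result() on the cluster should match expected_total.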

Once these tasks complete on the cluster workers, the plugin sends a simple performance report to my Slack channel and mobile app.

Slack workspace notification


Pushover notification on mobile


This was fun to work on! For more details about Dask plugins for the Scheduler and Worker, please refer to the plugins page of the Dask.distributed documentation.
