BwUniCluster 2.0 Python Dask

This guide explains how to use Python Dask and dask-jobqueue on bwUniCluster 2.0.

1 Installation

Use one of our pre-configured Python modules and load it with 'module load ...'. If you are using your own conda environment, you have to install the packages 'dask' and 'dask-jobqueue' yourself.

2 Using Dask

In a new interactive shell, execute the following commands in Python:

>>> from dask_jobqueue import SLURMCluster
>>> cluster = SLURMCluster(cores=X, memory='X GB', queue='X')

You have to specify the number of cores and the amount of memory for one Dask worker. In addition, a batch queue is required.
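As a minimal sketch of what the placeholders might look like, the following helper collects concrete values for one worker in a single place. The 4 cores, the 8 GB and the queue name 'single' are illustrative assumptions, not recommendations; check the batch queue documentation of the cluster for valid queue names. The helper names worker_options and make_cluster are hypothetical.

```python
def worker_options(cores, memory_gb, queue):
    """Collect the keyword arguments for SLURMCluster in one place."""
    return {"cores": cores, "memory": f"{memory_gb} GB", "queue": queue}


def make_cluster(options):
    """Create the cluster; dask-jobqueue is imported only when this is called."""
    from dask_jobqueue import SLURMCluster
    return SLURMCluster(**options)


# Illustrative values only; the queue name "single" is an assumption.
options = worker_options(cores=4, memory_gb=8, queue="single")
print(options)
```

Calling make_cluster(options) on the cluster would then create the cluster object. Its job_script() method returns the batch script that dask-jobqueue would submit, which can be inspected before any workers are requested.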

>>> cluster.scale(X)

After executing this command, e.g. as cluster.scale(5), Dask starts to request five worker processes, each with the specified number of cores and amount of memory.
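To estimate what such a request amounts to in total, the per-worker values can be multiplied by the number of workers. This is a rough sketch that assumes each worker gets the full specified cores and memory, as described above; the helper name total_request is hypothetical and the numbers are illustrative.

```python
def total_request(n_workers, cores_per_worker, memory_gb_per_worker):
    """Return (total cores, total memory in GB) requested for n_workers."""
    return n_workers * cores_per_worker, n_workers * memory_gb_per_worker


# Five workers with 4 cores and 8 GB each (illustrative values).
total_cores, total_memory_gb = total_request(5, 4, 8)
print(f"{total_cores} cores, {total_memory_gb} GB in total")
```

Larger totals usually mean a longer wait in the queue, so it can help to start with a small number of workers and scale up later.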

With the bash command 'squeue' you can check whether your Dask worker processes are already running or whether you have to wait until you get the requested resources. You have a better chance of getting resources quickly if you additionally specify a walltime. You can list all available options with the following command:

>>> help(SLURMCluster)

In IPython or Jupyter, 'SLURMCluster?' shows the same information.
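One of these options is 'walltime', which takes a time limit as a string. The following minimal sketch builds such a string in the 'HH:MM:SS' form from a number of minutes; the helper name format_walltime and the 90 minutes are illustrative assumptions.

```python
def format_walltime(minutes):
    """Format a duration in minutes as an 'HH:MM:SS' walltime string."""
    hours, mins = divmod(minutes, 60)
    return f"{hours:02d}:{mins:02d}:00"


walltime = format_walltime(90)
print(walltime)

# The string could then be passed on when creating the cluster, e.g.:
# cluster = SLURMCluster(cores=X, memory='X GB', queue='X', walltime=walltime)
```

The walltime has to stay within the limit of the chosen queue, and the workers are stopped by the batch system once it has elapsed.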

With the commands

>>> from dask.distributed import Client
>>> client = Client(cluster)
>>> client

Dask will output all resources that are actually available for distributing your computation.
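Once workers are available, computations can be sent to them through the client. The following is a minimal sketch, assuming 'client' is a dask.distributed Client connected to the cluster; the function names square and sum_of_squares are hypothetical. It uses the client's map method to create one task per input value and gather to collect the results.

```python
def square(x):
    """Toy task that runs on a worker."""
    return x * x


def sum_of_squares(client, n):
    """Square the numbers 0..n-1 on the workers and add up the results."""
    futures = client.map(square, range(n))  # one task per input value
    results = client.gather(futures)        # blocks until all tasks are done
    return sum(results)


# With a connected client this could be called as:
# total = sum_of_squares(client, 10)
```

If no worker has started yet, the call simply waits until the batch system has granted at least one of the requested workers.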

3 Dask Dashboard

To view the Dask Dashboard locally, you have to set up SSH port forwarding to the machine on which you started Dask.

$ ssh -N -L 8787:machineName:8787 yourusername@bwunicluster.scc.kit.edu

After executing this command you can access the Dask Dashboard in your local browser at 'localhost:8787/status'.
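The value for 'machineName' is the host on which the Python session with Dask is running. As a minimal sketch, the following helper prints the matching ssh command when it is run in that session. The helper name tunnel_command is hypothetical, and it assumes that socket.gethostname() returns a name that is reachable from the login node.

```python
import socket


def tunnel_command(machine_name, username, port=8787):
    """Build the ssh port-forwarding command for the Dask Dashboard."""
    return (f"ssh -N -L {port}:{machine_name}:{port} "
            f"{username}@bwunicluster.scc.kit.edu")


# Run this on the cluster, in the session where Dask was started,
# then copy the printed command to a terminal on the local machine.
print(tunnel_command(socket.gethostname(), "yourusername"))
```

If port 8787 is already in use on that machine, Dask chooses a different port; the attribute cluster.dashboard_link shows the address actually used, and the port in the command has to be adjusted accordingly.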