= Parallel Programming in Julia =

Julia supports several paradigms of parallel programming:

# Implicit multi-threading by math libraries (OpenBLAS, MKL)
# Explicit multi-threading using Julia threads (e.g. <code>Threads.@threads for</code>) or [https://github.com/JuliaSIMD/Polyester.jl <code>Polyester.jl</code>]
# Multiple processes on one or more nodes
#* the <code>Distributed.jl</code> package and <code>SlurmManager</code> from the [https://github.com/JuliaParallel/ClusterManagers.jl <code>ClusterManagers.jl</code>] package (e.g. <code>@distributed for</code> loops)
#* [https://github.com/JuliaParallel/MPI.jl <code>MPI.jl</code>]
# Execution on GPUs/CUDA using [https://cuda.juliagpu.org/stable/ <code>CUDA.jl</code>]

All paradigms may be used at the same time, but they must be chosen carefully to obtain the desired performance.

== Implicit Multi-Threading ==

The number of threads used by the mathematical linear algebra libraries can be configured with <code>BLAS.set_num_threads()</code> from the <code>LinearAlgebra</code> package. Alternatively, you can set the environment variable <code>OPENBLAS_NUM_THREADS</code>, or <code>MKL_NUM_THREADS</code> if you use MKL.

If your code is already multi-threaded, you probably want to set the number of BLAS threads to 1 in order to avoid running too many competing threads, as every Julia thread comes with its own BLAS threads.
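
A minimal sketch of this pattern (assuming Julia was started with several threads, e.g. <code>-t 4</code>):

<syntaxhighlight lang="julia">
using LinearAlgebra

# Pin BLAS to a single thread so it does not compete with Julia's own threads.
BLAS.set_num_threads(1)

# Each Julia thread now calls single-threaded BLAS, avoiding oversubscription.
Threads.@threads for i in 1:Threads.nthreads()
    A = rand(256, 256)
    B = A * A'   # the matrix multiplication runs single-threaded within each Julia thread
end
</syntaxhighlight>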

== Explicit Multi-Threading ==
Start Julia with the option <code>-t x</code>, where <code>x</code> is the number of (Julia) threads or the keyword <code>auto</code>. Note, however, that <code>auto</code> does not correctly detect the number of threads requested from SLURM with the option <code>--cpus-per-task</code>. Alternatively, you can set the environment variable <code>JULIA_NUM_THREADS</code>. See the [https://docs.julialang.org/en/v1/manual/multi-threading/ Julia documentation] for more details.
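
A minimal sketch of a threaded loop (the launch lines in the comments assume a typical SLURM job):

<syntaxhighlight lang="julia">
# Start with e.g. `julia -t 4 script.jl`; in a SLURM job,
# `julia -t $SLURM_CPUS_PER_TASK script.jl` matches the thread count
# to the --cpus-per-task request.

function threaded_square!(y, x)
    Threads.@threads for i in eachindex(x, y)
        y[i] = x[i]^2    # iterations are independent, so this is safe
    end
    return y
end

x = collect(1.0:1_000_000.0)
y = similar(x)
threaded_square!(y, x)
println("threads = ", Threads.nthreads(), ", y[end] = ", y[end])
</syntaxhighlight>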

== Multiple Processes ==
With the [https://docs.julialang.org/en/v1/manual/distributed-computing/ Distributed package] Julia has native support for distributed computing using multiple processes on different nodes. For good integration with SLURM, it is advised to spawn the worker processes with the <code>addprocs_slurm()</code> function provided by [https://github.com/JuliaParallel/ClusterManagers.jl <code>ClusterManagers.jl</code>].
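
A minimal sketch, assuming <code>ClusterManagers.jl</code> is installed and the script runs inside a SLURM allocation with at least four tasks:

<syntaxhighlight lang="julia">
using Distributed, ClusterManagers

addprocs_slurm(4)            # spawn 4 worker processes via srun

@everywhere f(i) = i^2       # make f available on all workers

# distributed reduction: partial sums are computed on the workers
total = @distributed (+) for i in 1:1_000
    f(i)
end
println("total = ", total)

rmprocs(workers())           # shut the workers down again
</syntaxhighlight>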

== MPI ==

Distributed computing using MPI can be performed with the [https://github.com/JuliaParallel/MPI.jl <code>MPI.jl</code>] package, which provides Julia wrappers for most of the standard MPI functions.
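
A minimal sketch, assuming <code>MPI.jl</code> and its <code>mpiexecjl</code> launcher wrapper are installed:

<syntaxhighlight lang="julia">
# Save as e.g. mpi_hello.jl and launch with: mpiexecjl -n 4 julia mpi_hello.jl
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

println("Hello from rank $rank of $nranks")

# reduce the ranks to a sum on rank 0
total = MPI.Reduce(rank, +, comm)
rank == 0 && println("sum of ranks = $total")

MPI.Finalize()
</syntaxhighlight>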

== CUDA ==
Julia supports computations on NVIDIA GPUs using the [https://cuda.juliagpu.org/stable/ <code>CUDA.jl</code>] package. It allows you to write your own kernels and also provides wrappers for libraries such as cuBLAS and cuFFT, which contain GPU-optimized implementations of standard numerical routines.
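
A minimal sketch, assuming a node with an NVIDIA GPU and <code>CUDA.jl</code> installed:

<syntaxhighlight lang="julia">
using CUDA

a = CUDA.rand(Float32, 1024, 1024)  # allocate and fill directly on the GPU
b = a * a'                          # matrix multiply dispatches to cuBLAS
c = sum(b)                          # reduction runs on the GPU
println("sum = ", c)
</syntaxhighlight>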

== Higher Level Packages ==

There are several Julia packages that allow mixing or switching between the different paradigms of parallel computing with minimal code changes (see the sketch after this list):

* [https://github.com/JuliaFolds2/FLoops.jl <code>FLoops.jl</code>] and its backend [https://juliafolds2.github.io/Folds.jl/dev/ <code>Folds.jl</code>]
* [https://juliaparallel.org/Dagger.jl/stable/ <code>Dagger.jl</code>] (still quite experimental)
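
A minimal sketch with <code>FLoops.jl</code>, where the executor argument selects the backend without changing the loop body:

<syntaxhighlight lang="julia">
using FLoops

function sumsq(xs, ex)
    @floop ex for x in xs
        @reduce s += x^2     # parallel-safe reduction
    end
    return s
end

xs = 1.0:1_000_000.0
println(sumsq(xs, SequentialEx()))  # serial execution
println(sumsq(xs, ThreadedEx()))    # multi-threaded execution
</syntaxhighlight>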
