JUSTUS2/Getting Started

From bwHPC Wiki
<!--

Here is a short list of things you may need to do first when you get onto the cluster
== Basics ==
* log in to the cluster: [[JUSTUS2/Login]]
* get accustomed with the linux commandline:
** [https://hpc-wiki.info/hpc/Introduction_to_Linux_in_HPC/The_Command_Line introduction on the (external) hpc wiki] or
** linux course at [https://training.bwhpc.de/ training.bwhpc.de]

Note that your jobs should not write/read much on the lustre filesystem while the job runs, but either use the ram disk in /tmp or request /scratch if the space of the ram disk isn't sufficient. The [[BwForCluster_JUSTUS_2_Slurm_HOWTO#How_to_clean-up_or_save_files_before_a_job_times_out.3F| Slurm Howto]] shows how to copy and clean up your data from /tmp or /scratch at the end of the job
-->
== General Workflow of Running a Calculation ==

On a compute cluster, you do not simply log in and run your software. Instead, you write a "job script" that contains all commands needed to run and process your job, and you submit it to a waiting queue to be run on one of several hundred computers.

How this is done is described in a little more detail here: [[Running Calculations]]
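
As a rough illustration of what such a job script contains, the following Python sketch assembles a minimal Slurm batch script as text. The <code>#SBATCH</code> options used are standard Slurm options, but the function name <code>render_job_script</code>, the job name, the resource values, and the command <code>./my_program</code> are placeholders invented for this example, not recommendations for JUSTUS 2.

```python
def render_job_script(job_name, nodes, tasks_per_node, walltime, command):
    """Return the text of a minimal Slurm batch script.

    All values are supplied by the caller; nothing here is specific to JUSTUS 2.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={walltime}",
        "",
        # The actual work of the job: here a single placeholder command.
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(render_job_script("example", 1, 4, "00:10:00", "./my_program"))
```

The resulting text would be saved to a file and handed to the scheduler with <code>sbatch</code>. The linked page describes the settings that actually apply on this cluster.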

== Get Access to the Cluster ==

Follow the registration process for the bwForCluster. &rarr; [[Registration/bwForCluster|How to Register for a bwForCluster]]

== Login to the Cluster ==

Set up a service password and a 2FA token, then log in to the cluster. &rarr; [[JUSTUS2/Login|Login JUSTUS2]]

== Using the Linux Commandline ==

HPC Wiki (external site) &rarr; [https://hpc-wiki.info/hpc/Introduction_to_Linux_in_HPC/The_Command_Line Introduction to Linux Commandline]

Training course &rarr; [https://training.bwhpc.de/ Linux course on training.bwhpc.de]

== Transfer your Data to the Cluster ==

Get familiar with available file systems on the cluster. &rarr; [[Hardware_and_Architecture_(bwForCluster_JUSTUS_2)#Storage_Architecture|File Systems]]

Transfer your data to the cluster using appropriate tools. &rarr; [[Data Transfer|Data Transfer]]

== Find Information About Installed Software and Examples ==

Compilers, libraries, and application software are provided as software modules. Learn how to work with
[[Environment_Modules|software modules]]. &rarr; [[JUSTUS2/Software|Software]]
<!-- Overview of available software modules &rarr; [https://www.bwhpc.de/software.php https://www.bwhpc.de/software.php], select <code>Cluster → bwForCluster JUSTUS 2</code> -->

Run a sample script from a pre-installed software package (see the software job examples on the Software page linked above).

== Run your Software in a Batch Job ==

Get familiar with the available node types on the cluster. &rarr; [[Hardware_and_Architecture_(bwForCluster_JUSTUS_2)|Hardware and Architecture]]

Submit and monitor your jobs with Slurm commands.
* &rarr; [[JUSTUS2/Running Your Calculations|Running Your Calculations]] - a very brief introduction.
* &rarr; [[BwForCluster_JUSTUS_2_Slurm_HOWTO| extensive Slurm HOWTO on specific tasks]]


== Learn about Scaling your Job ==

How many compute cores should my job use? This depends on the software and the problem you are trying to solve. If you use too few cores, your computation may take far too long. If you use too many, the extra cores will not speed up your computation and only waste compute resources and energy.

If you run hundreds or thousands of similar calculations, you should look at this carefully before starting.

How to do this is described in: [[Scaling]]
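
One common way to approach this is to time the same calculation with several core counts and compare speedup and parallel efficiency. The Python sketch below does this under stated assumptions: the runtimes are made-up numbers for illustration, the function names are hypothetical, and the 70% efficiency threshold is an arbitrary choice, not a cluster policy.

```python
def scaling_table(runtimes):
    """Compute speedup and parallel efficiency relative to the smallest core count.

    runtimes: dict mapping core count -> measured wall time in seconds.
    Returns a list of (cores, speedup, efficiency) tuples sorted by core count.
    """
    base_cores = min(runtimes)
    base_time = runtimes[base_cores]
    table = []
    for cores in sorted(runtimes):
        speedup = base_time / runtimes[cores]
        # Efficiency: achieved speedup divided by the ideal (linear) speedup.
        efficiency = speedup * base_cores / cores
        table.append((cores, speedup, efficiency))
    return table


def largest_efficient_core_count(runtimes, min_efficiency=0.7):
    """Return the largest core count whose efficiency is at least min_efficiency."""
    good = [c for c, _, eff in scaling_table(runtimes) if eff >= min_efficiency]
    return max(good)


if __name__ == "__main__":
    # Hypothetical wall times in seconds, not measurements from JUSTUS 2.
    measured = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 140.0}
    for cores, speedup, eff in scaling_table(measured):
        print(f"{cores:3d} cores: speedup {speedup:5.2f}, efficiency {eff:4.0%}")
    print("suggested core count:", largest_efficient_core_count(measured))
```

With these example numbers, going from 8 to 16 cores barely reduces the runtime, so the additional cores would mostly be wasted. Real measurements for your own software and problem size may look quite different.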

== Acknowledge the Cluster ==

Remember to mention the cluster in your publications. &rarr; [[bwForCluster JUSTUS 2 Acknowledgement|Acknowledgement]]

Latest revision as of 11:58, 11 September 2024
