JUSTUS2/Getting Started: Difference between revisions

From bwHPC Wiki
<!--

Here is a short list of things you may need to do first when you get onto the cluster

== Basics ==

* log in to the cluster: [[JUSTUS2/Login]]
* get accustomed to the Linux command line:
** [https://hpc-wiki.info/hpc/Introduction_to_Linux_in_HPC/The_Command_Line introduction on the (external) hpc wiki] or
** linux course at [https://training.bwhpc.de/ training.bwhpc.de]

== Running an Example with Preinstalled Software ==
* scientific software: read on how to load [[Software Modules]]
* continue reading until you find the example job scripts: [[Environment_Modules#Software_job_examples]]
* submit a sample job from a software as mentioned in the job example. Also see: [[JUSTUS2/Slurm]]
* monitor your job: [[JUSTUS2/Slurm#Monitoring_Your_Jobs]]
== Running Your Own Calculations ==
* transfer your own data to the cluster: [[Data Transfer]]
* adapt the sample job script to run your own job

Note that your jobs should not read or write much on the Lustre file system while they run. Instead, use the RAM disk in /tmp, or request /scratch if the RAM disk does not offer enough space. The [[BwForCluster_JUSTUS_2_Slurm_HOWTO#How_to_clean-up_or_save_files_before_a_job_times_out.3F| Slurm Howto]] shows how to copy and clean up your data from /tmp or /scratch at the end of the job.
-->
== General Workflow of Running a Calculation ==

On a compute cluster, you do not simply log in and run your software. Instead, you write a "job script" that contains all commands needed to run and process your job, and you submit this script to a waiting queue, from which it is run on one of several hundred computers.

How this is done is described in a little more detail here: [[Running Calculations]]
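
To give a rough idea of what such a job script looks like, the following sketch assembles a minimal Slurm batch script as text. It is only an illustration: the function name <code>render_job_script</code>, the module name <code>chem/example_app</code> and the command <code>example_app</code> are hypothetical placeholders, and the resource values are arbitrary. The <code>#SBATCH</code> options shown (<code>--job-name</code>, <code>--ntasks</code>, <code>--time</code>) are standard Slurm options; the pages linked above describe what is appropriate on this cluster.

```python
def render_job_script(job_name, ntasks, walltime, modules, commands):
    """Return the text of a minimal Slurm batch script.

    All arguments are illustrative; real jobs usually need further options.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",  # name shown in the queue
        f"#SBATCH --ntasks={ntasks}",      # number of tasks requested
        f"#SBATCH --time={walltime}",      # wall-clock limit, HH:MM:SS
        "",
    ]
    # Load the software modules the job needs (hypothetical names below).
    lines += [f"module load {module}" for module in modules]
    # The commands that make up the actual calculation.
    lines += list(commands)
    return "\n".join(lines) + "\n"


script = render_job_script(
    job_name="example",
    ntasks=4,
    walltime="01:00:00",
    modules=["chem/example_app"],                      # hypothetical module
    commands=["example_app input.inp > output.log"],  # hypothetical command
)
print(script)
```

Saved to a file such as <code>job.sh</code>, a script like this would typically be handed to the queue with <code>sbatch job.sh</code>.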

== Get Access to the Cluster ==

Follow the registration process for the bwForCluster. &rarr; [[Registration/bwForCluster|How to Register for a bwForCluster]]

== Login to the Cluster ==

Set up a service password and a 2FA token, then log in to the cluster. &rarr; [[JUSTUS2/Login|Login JUSTUS2]]

== Using the Linux Commandline ==

HPC Wiki (external site) &rarr; [https://hpc-wiki.info/hpc/Introduction_to_Linux_in_HPC/The_Command_Line Introduction to Linux Commandline]

Training course &rarr; [https://training.bwhpc.de/ Linux course on training.bwhpc.de]

== Transfer your Data to the Cluster ==

Get familiar with available file systems on the cluster. &rarr; [[Hardware_and_Architecture_(bwForCluster_JUSTUS_2)#Storage_Architecture|File Systems]]

Transfer your data to the cluster using appropriate tools. &rarr; [[Data Transfer|Data Transfer]]

== Find Information About Installed Software and Examples ==

Compilers, libraries and application software are provided as software modules. Learn how to work with
[[Environment_Modules|software modules]]. &rarr; [[JUSTUS2/Software|Software]]
<!-- Overview of available software modules &rarr; [https://www.bwhpc.de/software.php https://www.bwhpc.de/software.php], select <code>Cluster → bwForCluster JUSTUS 2</code> -->

Run a sample script for one of the pre-installed software packages (see the software job examples on the page linked above).

== Run your Software in a Batch Job ==

Get familiar with the available node types on the cluster. &rarr; [[Hardware_and_Architecture_(bwForCluster_JUSTUS_2)|Hardware and Architecture]]

Submit and monitor your jobs with Slurm commands.
* &rarr; [[JUSTUS2/Running Your Calculations|Running Your Calculations]] - a very brief introduction.
* &rarr; [[BwForCluster_JUSTUS_2_Slurm_HOWTO| extensive Slurm HOWTO on specific tasks]]


== Learn about Scaling your Job ==

How many compute cores should my job use? This depends on the software and on the problem you are trying to solve. If you use too few cores, your computation may take far too long. If you use too many, the additional cores will no longer speed up the computation and only waste compute resources and energy.

If you run hundreds or thousands of similar calculations, you should look at this carefully before starting.

How to do this is described in: [[Scaling]]
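
As a rough illustration of such a check, the sketch below computes the parallel efficiency from a few test runs of the same problem and picks the largest core count that still uses its cores reasonably well. The timings are made-up numbers, and both the function names and the 70% efficiency threshold are assumptions chosen for this example, not a cluster policy.

```python
def efficiency(t_ref, t_n, cores_ref, cores_n):
    """Parallel efficiency of a run on cores_n cores relative to a reference run."""
    speedup = t_ref / t_n
    return speedup * cores_ref / cores_n


def pick_core_count(timings, min_efficiency=0.7):
    """Return the largest measured core count whose efficiency meets the threshold.

    timings maps core count -> wall time in seconds for the same problem.
    The run with the fewest cores serves as the reference.
    """
    cores_ref = min(timings)
    t_ref = timings[cores_ref]
    best = cores_ref
    for cores in sorted(timings):
        if efficiency(t_ref, timings[cores], cores_ref, cores) >= min_efficiency:
            best = cores
    return best


# Hypothetical wall times (seconds) from short test runs.
EXAMPLE_TIMINGS = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 140.0}

for cores, seconds in sorted(EXAMPLE_TIMINGS.items()):
    eff = efficiency(EXAMPLE_TIMINGS[1], seconds, 1, cores)
    print(f"{cores:3d} cores: {seconds:7.1f} s, efficiency {eff:.2f}")

print("suggested core count:", pick_core_count(EXAMPLE_TIMINGS))
```

With these example numbers, going from 8 to 16 cores shortens the run only from 170 s to 140 s, so the efficiency drops well below the threshold and 8 cores is the suggested choice.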

== Acknowledge the Cluster ==

Remember to mention the cluster in your publications. &rarr; [[bwForCluster JUSTUS 2 Acknowledgement|Acknowledgement]]

Latest revision as of 11:58, 11 September 2024
