NEMO/Moab/General: Difference between revisions
K Siegmund (talk | contribs) No edit summary |
Latest revision as of 10:05, 18 June 2024
Moab® HPC Workload Manager
Moab Commands (excerpt)
Some of the most used Moab commands for non-administrators working on a bwHPC cluster.
MOAB commands | Brief explanation |
---|---|
msub | Submits a job and queues it in an input queue [msub] |
checkjob | Displays detailed job state information [checkjob] |
showq | Displays information about active, eligible, blocked, and/or recently completed jobs [showq] |
mjobctl | Cancel a job and more job control options [mjobctl] |
Job Submission : msub
Batch jobs are submitted with the command msub. The main purpose of the msub command is to specify the resources that are needed to run the job. msub then queues the batch job. However, when the batch job actually starts depends on the availability of the requested resources and on the fair-share value.
msub Command Parameters
The syntax and use of msub can be displayed via:
$ man msub
msub options can be used from the command line or in your job script.
msub Options | ||
---|---|---|
Command line | Script | Purpose |
-l resources | #MSUB -l resources | Defines the resources that are required by the job. See the description below for this important flag. |
-N name | #MSUB -N name | Gives a user specified name to the job. |
-o filename | #MSUB -o filename | Defines the file-name to be used for the standard output stream of the batch job. By default the file with defined file name is placed under your |
-q queue | #MSUB -q queue | Defines the queue class |
-v variable=arg | #MSUB -v variable=arg | Expands the list of environment variables that are exported to the job |
-m bea | #MSUB -m bea #MSUB -m n | Send email when job begins (b), ends (e) or aborts (a). Use n if you do not wish to receive emails. |
-M name@uni.de | #MSUB -M name@uni.de | Send email to the specified email address "name@uni.de". |
For cluster specific msub options, read:
msub -l resource_list
The -l option is one of the most important msub options. It is used to specify a number of resource requirements for your job. Multiple resource strings are separated by commas.
msub -l resource_list | ||
---|---|---|
resource | Purpose | |
-l nodes=2:ppn=20 | Number of nodes and number of processes per node | |
-l walltime=600 -l walltime=01:30:00 | Wall-clock time. Default units are seconds. HH:MM:SS format is also accepted. | |
-l pmem=1gb | Maximum amount of physical memory used by any single process of the job. Allowed units are kb, mb, gb. Be aware that processes are either MPI tasks |
Note that none of the compute nodes have swap space, so do NOT specify '-l vmem' or '-l pvmem'; otherwise your jobs will not start.
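Since walltime can be given either in seconds or as HH:MM:SS, it can help to normalize both forms before comparing requests. The following is a minimal sketch, not part of Moab: the function name walltime_to_seconds is hypothetical, and it assumes only plain seconds, MM:SS or HH:MM:SS input (no day field).

```shell
#!/bin/bash
# Hypothetical helper: convert a walltime given as seconds, MM:SS or
# HH:MM:SS into plain seconds. A day field (DD:HH:MM:SS) is NOT handled.
walltime_to_seconds() {
    # awk treats fields such as "08" as decimal numbers, so leading
    # zeros are harmless here.
    printf '%s\n' "$1" | awk -F: '{
        total = 0
        for (i = 1; i <= NF; i++) total = total * 60 + $i
        print total
    }'
}

# Usage:
#   walltime_to_seconds 600        prints 600
#   walltime_to_seconds 01:30:00   prints 5400
```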
msub Examples
Serial Programs
To submit a serial job that runs the script job.sh and that requires 6 GB of main memory and 3 hours of wall clock time
a) execute:
$ msub -N test -l nodes=1:ppn=1,walltime=3:00:00,pmem=6gb job.sh
or b) add after the initial line of your script job.sh the lines (here with a high memory request):
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=3:00:00
#MSUB -l pmem=2gb
#MSUB -N test
and execute the modified script with the command line option -q gpu:
$ msub -q gpu job.sh
Note that msub command line options override script options.
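Variant b) can also be generated instead of edited by hand. The sketch below writes the same #MSUB header for a serial job to standard output; the function make_job_script and its argument order are hypothetical and only illustrate the layout described above.

```shell
#!/bin/bash
# Hypothetical generator for a serial job script.
# Arguments: job name, walltime, pmem, command to run.
make_job_script() {
    name=$1; walltime=$2; pmem=$3; payload=$4
    cat <<EOF
#!/bin/bash
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=${walltime}
#MSUB -l pmem=${pmem}
#MSUB -N ${name}
${payload}
EOF
}

# Usage (./my_program is a placeholder):
#   make_job_script test 3:00:00 6gb ./my_program > job.sh
#   msub job.sh
```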
Multithreaded Programs
Multithreaded programs operate faster than serial programs on CPUs with multiple cores.
Moreover, multiple threads of one process share resources such as memory.
For multithreaded programs based on Open Multi-Processing (OpenMP), the number of threads is defined by the environment variable OMP_NUM_THREADS. By default this variable is set to 1 (OMP_NUM_THREADS=1).
To submit a batch job called OpenMP_Test that runs a fourfold threaded program omp_executable which requires 6 GB of total physical memory and total wall clock time of 3 hours:
- generate the script job_omp.sh containing the following lines:
#!/bin/bash
#MSUB -l nodes=1:ppn=4
#MSUB -l walltime=3:00:00
#MSUB -l pmem=6gb
#MSUB -v EXECUTABLE=./omp_executable
#MSUB -v MODULE=<placeholder>
#MSUB -N OpenMP_Test
#Usually you should set
export KMP_AFFINITY=compact,1,0
#export KMP_AFFINITY=verbose,compact,1,0 prints messages concerning the supported affinity
#KMP_AFFINITY Description: https://software.intel.com/en-us/node/524790#KMP_AFFINITY_ENVIRONMENT_VARIABLE
module load ${MODULE}
export OMP_NUM_THREADS=${MOAB_PROCCOUNT}
echo "Executable ${EXECUTABLE} running on ${MOAB_PROCCOUNT} cores with ${OMP_NUM_THREADS} threads"
startexe=${EXECUTABLE}
echo $startexe
exec $startexe
With the Intel compiler, the environment variable KMP_AFFINITY switches on binding of threads to specific cores. If necessary, replace <placeholder> with the modulefile required to enable the OpenMP environment, then execute the script job_omp.sh:
$ msub job_omp.sh
Note that msub command line options override script options, e.g.,
$ msub -l pmem=2gb job_omp.sh
overrides the script setting of 6 GB with 2 GB.
MPI Parallel Programs
MPI parallel programs run faster than serial programs on multi CPU and multi core systems. N-fold spawned processes of the MPI program, i.e., MPI tasks, run simultaneously and communicate via the Message Passing Interface (MPI) paradigm. MPI tasks do not share memory but can be spawned over different nodes.
Multiple MPI tasks cannot be launched by the MPI parallel program itself; they must be launched via mpirun:
$ mpirun my_par_program # uses nodes times ppn tasks, e.g., nodes=1:ppn=4 -> 1x4=4
Generate a wrapper script job_ompi.sh for OpenMPI containing the following lines:
#!/bin/bash
module load mpi/openmpi
# Use when loading OpenMPI
mpirun --bind-to core --map-by core -report-bindings my_par_program
Attention: Do NOT add the mpirun option -n <number_of_processes> or any other option defining processes or nodes, since MOAB tells mpirun the number of processes and the node hostnames. ALWAYS use the MPI options --bind-to core and --map-by core|socket|node. Type mpirun --help for an explanation of the different values of the mpirun option --map-by.
Considering 4 OpenMPI tasks on a single node, each requiring 1 GB, and running for 1 hour, execute:
$ msub -l nodes=1:ppn=4,pmem=1gb,walltime=01:00:00 job_ompi.sh
Multithreaded + MPI parallel Programs
Multithreaded + MPI parallel programs operate faster than serial programs on multiple CPUs with multiple cores. All threads of one process share resources such as memory. In contrast, MPI tasks do not share memory but can be spawned over different nodes.
Multiple MPI tasks using OpenMPI must be launched with mpirun. For multithreaded programs based on Open Multi-Processing (OpenMP), the number of threads is defined by the environment variable OMP_NUM_THREADS. By default this variable is set to 1 (OMP_NUM_THREADS=1).
For OpenMPI, a job script job_ompi_omp.sh that submits a batch job running the MPI program ompi_omp_program with 4 tasks and 5 threads per task, requiring 6 GB of physical memory per thread (with 5 threads per MPI task this gives 5*6 GB = 30 GB per MPI task) and a total wall clock time of 3 hours, looks like:
#!/bin/bash
#MSUB -l nodes=2:ppn=10
#MSUB -l walltime=03:00:00
#MSUB -l pmem=6gb
#MSUB -v MPI_MODULE=mpi/ompi
#MSUB -v OMP_NUM_THREADS=5
#MSUB -v MPIRUN_OPTIONS="--bind-to core --map-by socket:PE=5 -report-bindings"
#MSUB -v EXECUTABLE=./ompi_omp_program
#MSUB -N test_ompi_omp
module load ${MPI_MODULE}
TASK_COUNT=$((${MOAB_PROCCOUNT}/${OMP_NUM_THREADS}))
echo "${EXECUTABLE} running on ${MOAB_PROCCOUNT} cores with ${TASK_COUNT} MPI-tasks and ${OMP_NUM_THREADS} threads"
startexe="mpirun -n ${TASK_COUNT} ${MPIRUN_OPTIONS} ${EXECUTABLE}"
echo $startexe
exec $startexe
Execute the script job_ompi_omp.sh:
$ msub job_ompi_omp.sh
- With the mpirun option --bind-to core MPI tasks and OpenMP threads are bound to physical cores.
- With the option --map-by socket:PE=<value> (neighbored) MPI tasks will be attached to different sockets and each MPI task is bound to the (in <value>) specified number of cpus. <value> must be set to ${OMP_NUM_THREADS}.
- With the option -bysocket (neighbored) MPI tasks will be attached to different sockets and the option -cpus-per-proc <value> binds each MPI task to the (in <value>) specified number of cpus. <value> must be set to ${OMP_NUM_THREADS}.
- The option -report-bindings shows the bindings between MPI tasks and physical cores.
- The mpirun-options --bind-to core, --map-by socket|...|node:PE=<value> should always be used when running a multithreaded MPI program.
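The arithmetic behind the hybrid example (2 nodes x 10 ppn = 20 cores, 20 / 5 threads = 4 MPI tasks, 5 x 6 GB = 30 GB per task) can be checked before submitting. This is a minimal sketch; hybrid_layout is a hypothetical helper, and it assumes pmem is given in whole gigabytes.

```shell
#!/bin/bash
# Hypothetical helper: derive the MPI task count and memory per task
# from nodes, ppn, threads per task and pmem (in GB, per thread).
hybrid_layout() {
    hl_nodes=$1; hl_ppn=$2; hl_threads=$3; hl_pmem_gb=$4
    hl_cores=$(( hl_nodes * hl_ppn ))
    # The core count must be a multiple of the thread count, otherwise
    # TASK_COUNT in the job script would be silently rounded down.
    if [ $(( hl_cores % hl_threads )) -ne 0 ]; then
        echo "error: ${hl_cores} cores not divisible by ${hl_threads} threads" >&2
        return 1
    fi
    echo "cores=${hl_cores} tasks=$(( hl_cores / hl_threads )) mem_per_task_gb=$(( hl_threads * hl_pmem_gb ))"
}

# Usage, with the values of the job script above:
#   hybrid_layout 2 10 5 6
```

A thread count that does not divide the core count evenly is reported as an error instead of producing a truncated task count.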
Interactive Jobs
Policies of interactive batch jobs are cluster specific and can be found here:
Moab Environment Variables
Once an eligible compute job starts on the compute system, MOAB adds the following variables to the job's environment:
MOAB variables | ||
---|---|---|
Environment variables | Description | |
MOAB_JOBID | Job ID | |
MOAB_JOBNAME | Job name | |
MOAB_NODECOUNT | Number of nodes allocated to job | |
MOAB_PARTITION | Partition name the job is running in | |
MOAB_PROCCOUNT | Number of processors allocated to job | |
MOAB_SUBMITDIR | Directory of job submission | |
MOAB_USER | User name |
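As a rough illustration, a job script could log these variables at start-up. The function job_summary below is hypothetical; the fallback values after ":-" are assumptions chosen so that the snippet also runs outside a Moab job, where the variables are unset.

```shell
#!/bin/bash
# Hypothetical helper: print a one-line summary of the Moab job
# environment. Fallbacks apply only when a variable is unset or empty.
job_summary() {
    echo "job ${MOAB_JOBID:-unknown} (${MOAB_JOBNAME:-unnamed}):" \
         "${MOAB_PROCCOUNT:-1} procs on ${MOAB_NODECOUNT:-1} node(s)," \
         "submitted from ${MOAB_SUBMITDIR:-$PWD}"
}

# Usage inside a job script:
#   job_summary
```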
See also:
- $TMPDIR Variable
- $MOAB_JOBARRAYINDEX Variable
Interpreting PBS exit codes
- The PBS Server logs and accounting logs record the ‘exit status’ of jobs.
- Zero or positive exit status is the status of the top-level shell.
- Certain negative exit statuses are used internally and will never be reported to the user.
- The positive exit status values indicate which signal killed the job.
- Depending on the system, values greater than 128 (or on some systems 256, see wait(2) or waitpid(2) for more information) are the value of the signal that killed the job.
- To interpret (or ‘decode’) the signal contained in the exit status value, subtract the base value from the exit status.
For example, if a job had an exit status of 143, that indicates the job was killed via a SIGTERM (i.e., 143 - 128 = 15; signal 15 is SIGTERM).
Job termination
- The exit code from a batch job follows the standard Unix convention for termination statuses.
- Typically, exit code 0 means successful completion.
- Codes 1-127 are generated from the job calling exit() with a non-zero value to indicate an error.
- Exit codes 129-255 represent jobs terminated by Unix signals.
- Each signal has a corresponding value which is indicated in the job exit code.
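The ranges listed above can be turned into a small decoder. This is a minimal sketch assuming the base value 128; classify_exit_code is a hypothetical name, and the signal name printed by kill -l depends on the shell in use.

```shell
#!/bin/bash
# Hypothetical helper: classify a job exit code using the ranges above
# (0 = success, 1-127 = exit() error, 129-255 = terminated by a signal).
classify_exit_code() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "success"
    elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
        echo "error: program exited with status $code"
    elif [ "$code" -ge 129 ] && [ "$code" -le 255 ]; then
        sig=$(( code - 128 ))        # subtract the base value
        echo "killed by signal $sig ($(kill -l "$sig" 2>/dev/null || echo unknown))"
    else
        echo "unclassified status $code"
    fi
}

# Usage:
#   classify_exit_code 143    reports signal 15, i.e. SIGTERM
```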
Job termination signals
Specific job exit codes are also supplied by the underlying resource manager of the cluster's batch system which is TORQUE. More detailed information can be found in the corresponding documentation:
Submitting Termination Signal
Here is an example of how to save the exit code of your program in a typical bwHPC submit script. Note that $? must be read directly after the command whose exit code you want to keep.
[...]
echo "### Calling YOUR_PROGRAM command ..."
mpirun -np 'NUMBER_OF_CORES' $YOUR_PROGRAM_BIN_DIR/runproc ... (options) 2>&1
exit_code=$?
[ "$exit_code" -eq 0 ] && echo "all clean..." || \
echo "Executable ${YOUR_PROGRAM_BIN_DIR}/runproc finished with exit code ${exit_code}"
[...]
- Do not use 'time' mpirun! The exit code will be the one submitted by the first (time) program and not the msub exit code.
- You do not need an exit $exit_code in the scripts.
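The pattern can be tried out without mpirun. In the sketch below, run_and_report is a hypothetical wrapper and sh -c 'exit 3' merely stands in for a failing program; the point is that $? is captured immediately after the command, before any other command overwrites it.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, capture its exit code right away,
# and report it.
run_and_report() {
    "$@" 2>&1
    exit_code=$?    # must be the very next statement after the command
    if [ "$exit_code" -eq 0 ]; then
        echo "all clean..."
    else
        echo "$1 finished with exit code ${exit_code}"
    fi
    return "$exit_code"
}

# Usage:
#   run_and_report sh -c 'exit 3'
#   run_and_report true
```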
List of your submitted jobs : showq
Displays information about active, eligible, blocked, and/or recently completed jobs. Since the resource manager is not actually scheduling jobs, the job ordering it displays is not valid. The showq command displays the actual job ordering under the Moab Workload Manager. When used without flags, this command displays all jobs in active, idle, and non-queued states.
Flags
Flag | Description |
---|---|
-b | display blocked jobs only |
-c | display details about recently completed jobs (see example, JOBCPURGETIME) |
-i | display extended details about idle jobs |
-r | display extended details about active (running) jobs |
-v | Display local and full resource manager job IDs as well as partitions. If specified with the '-i' option, will display job reservation time. |
-w | display only jobs associated with the specified constraint. Valid constraints include user, group, acct, class, and qos. |
Examples
$ # use UID for option in showq --->
$ showq -u kn_pop332211
active jobs------------------------
JOBID USERNAME STATE PROCS REMAINING STARTTIME
8370992 kn_pop33 Running 1 2:05:09:17 Wed Jan 13 15:59:01
8370991 kn_pop33 Running 1 2:05:09:17 Wed Jan 13 15:59:01
8370993 kn_pop33 Running 1 2:05:10:20 Wed Jan 13 16:00:04
[...]
8371040 kn_pop33 Running 1 2:05:11:41 Wed Jan 13 16:01:25
50 active jobs 50 of 7072 processors in use by local jobs (0.71%)
434 of 434 nodes active (100.00%)
eligible jobs----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
0 eligible jobs
blocked jobs-----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
0 blocked jobs
Total jobs: 50
- The summary of your active jobs shows how many jobs of yours are running, how many processors are in use by your jobs and how many nodes are in use by all active jobs.
- Use showq -u $USER for your own jobs.
- For further options of showq read the manpage of showq.
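For scripting, the job lines of the showq output can be counted with awk. This is only a sketch: count_running_jobs is a hypothetical helper, and it assumes the column layout of the example above (numeric JOBID in the first column, STATE in the third).

```shell
#!/bin/bash
# Hypothetical helper: count running jobs in showq output read from stdin.
# Assumes: column 1 is a numeric job ID, column 3 is the job state.
count_running_jobs() {
    awk '$1 ~ /^[0-9]+$/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage:
#   showq -u $USER | count_running_jobs
```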
Detailed job information : checkjob
Checkjob displays detailed job state information and diagnostic output for a specified job. Detailed information is available for queued, blocked, active, and recently completed jobs.
Access
- End users can use checkjob to view the status of their own jobs only.
Output
Attribute | Value | Description |
---|---|---|
Account | <STRING> | Name of account associated with job |
Actual Run Time | [[[DD:]HH:]MM:]SS | Length of time job actually ran. This info is only displayed in simulation mode. |
Allocated Nodes | Square bracket delimited list of node and processor ids | List of nodes and processors allocated to job |
Applied Nodeset** | <STRING> | Nodeset used for job's node allocation |
Arch | <STRING> | Node architecture required by job |
Attr | Square bracket delimited list of job attributes | Job Attributes (i.e. [BACKFILL][PREEMPTEE]) |
Available Memory** | <INTEGER> | The available memory requested by job. Moab displays the relative or exact value by returning a comparison symbol (>, <, >=, <=, or ==) with the value (i.e. Available Memory <= 2048). |
Available Swap** | <INTEGER> | The available swap requested by job. Moab displays the relative or exact value by returning a comparison symbol (>, <, >=, <=, or ==) with the value (i.e. Available Swap >= 1024). |
Average Utilized Procs* | <FLOAT> | Average load balance for a job |
Avg Util Resources Per Task* | <FLOAT> | |
BecameEligible | <TIMESTAMP> | The date and time when the job moved from Blocked to Eligible. |
Bypass | <INTEGER> | Number of times a lower priority job with a later submit time ran before the job |
CheckpointStartTime** | [[[DD:]HH:]MM:]SS | The time the job was first checkpointed |
Class | [<CLASS NAME> <CLASS COUNT>] | Name of class/queue required by job and number of class initiators required per task. |
Dedicated Resources Per Task* | Space-delimited list of <STRING>:<INTEGER> | Resources dedicated to a job on a per-task basis |
Disk | <INTEGER> | Amount of local disk required by job (in MB) |
Estimated Walltime | [[[DD:]HH:]MM:]SS | The scheduler's estimated walltime. In simulation mode, it is the actual walltime. |
EnvVariables** | Comma-delimited list of <STRING> | List of environment variables assigned to job |
Exec Size* | <INTEGER> | Size of job executable (in MB) |
Executable | <STRING> | Name of command to run |
Features | Square bracket delimited list of <STRING>s | Node features required by job |
Flags | ||
Group | <STRING> | Name of UNIX group associated with job |
Holds | Zero or more of User, System, and Batch | Types of job holds currently applied to job |
Image Size | <INTEGER> | Size of job data (in MB) |
IWD (Initial Working Directory) | <DIR> | Directory to run the executable in |
Job Messages** | <STRING> | Messages attached to a job |
Job Submission** | <STRING> | Job script submitted to RM |
Memory | <INTEGER> | Amount of real memory required per node (in MB) |
Max Util Resources Per Task* | <FLOAT> | |
NodeAccess* | ||
Nodecount | <INTEGER> | Number of nodes required by job |
Opsys | <STRING> | Node operating system required by job |
Partition Mask | ALL or colon delimited list of partitions | List of partitions the job has access to |
PE | <FLOAT> | Number of processor-equivalents requested by job |
Per Partition Priority** | Tabular | Table showing job template priority for each partition |
Priority Analysis** | Tabular | Table showing how job's priority was calculated, with the columns Job PRIORITY*, Cred(User:Group:Class) and Serv(QTime) |
QOS | <STRING> | Quality of Service associated with job |
Reservation | <RSVID> (<TIME1> - <TIME2> Duration: <TIME3>) | RSVID specifies the reservation id, TIME1 is the relative start time, TIME2 the relative end time, TIME3 the duration of the reservation |
Req | [<INTEGER>] TaskCount: <INTEGER> Partition: <partition> | A job requirement for a single type of resource followed by the number of tasks instances required and the appropriate partition |
StartCount | <INTEGER> | Number of times job has been started by Moab |
StartPriority | <INTEGER> | Start priority of job |
StartTime | | Time job was started by the resource management system |
State | One of Idle, Starting, Running, etc | Current Job State |
SubmitTime | | Time job was submitted to resource management system |
Swap | <INTEGER> | Amount of swap disk required by job (in MB) |
Task Distribution* | Square bracket delimited list of nodes | |
Time Queued | ||
Total Requested Nodes** | <INTEGER> | Number of nodes the job requested |
Total Requested Tasks | <INTEGER> | Number of tasks requested by job |
User | <STRING> | Name of user submitting job |
Utilized Resources Per Task* | <FLOAT> | |
WallTime | [[[DD:]HH:]MM:]SS of [[[DD:]HH:]MM:]SS | Length of time job has been running out of the specified limit |
In the above table, fields marked with an asterisk (*) are only displayed when set or when the -v flag is specified. Fields marked with two asterisks (**) are only displayed when set or when the -v -v flag is specified.
Arguments
Argument | Format | Default | Description | Example |
---|---|---|---|---|
--flags | --flags=future | (none) | Evaluates future eligibility of job (ignores current resource state and usage limitations) | $ checkjob -v --flags=future 8370992 (displays reasons why an idle job is blocked, ignoring node state and current node utilization constraints) |
-l (Policy level) | <POLICYLEVEL> HARD, SOFT, or OFF | (none) | Reports job start eligibility subject to specified throttling policy level. | $ checkjob -l SOFT 8370992 $ checkjob -l HARD 8370992 |
-n (NodeID) | <NODEID> | (none) | Checks job access to specified node and preemption status with regards to jobs located on that node. | checkjob -n uc1n320 8370992 |
-r (Reservation) | <RSVID> | (none) | Checks job access to specified reservation <RSVID>. | checkjob -r rainer_kn_resa.1 8370992 |
-v (Verbose) | (n/a) | | Sets verbose mode. If the job is part of an array, the -v option shows pertinent array information before the job-specific information. Specifying double verbose ("-v -v") displays additional information about the job. | checkjob -v 8370992 |
Parameters
Parameters, descriptions (a lot!) and examples can be found on the Adaptive documentation page.
Use this information if you would like to analyze why your job is held or blocked!
- For further options of checkjob see the manual page of checkjob
$ man checkjob
Checkjob Examples
Here is an example from the bwUniCluster.
$ showq -u $USER # show my own jobs
active jobs------------------------
JOBID USERNAME STATE PROCS REMAINING STARTTIME
8370992 kn_popnn Running 1 2:03:56:50 Wed Jan 13 15:59:01
8370991 kn_popnn Running 1 2:03:56:50 Wed Jan 13 15:59:01
[...]
8371040 kn_popnn Running 1 2:03:59:14 Wed Jan 13 16:01:25
49 active jobs 49 of 7072 processors in use by local jobs (0.69%)
434 of 434 nodes active (100.00%)
eligible jobs----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
0 eligible jobs
blocked jobs-----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
0 blocked jobs
Total jobs: 49
$
$ # now, see what's up with the first job in my queue
$
$ checkjob 8370992
job 8370992
AName: Nic_cit_09_Apo_zal_07_cl2_07_cl3_07_cl5_07_2.moab
State: Running
Creds: user:kn_pop'nnnnn' group:kn_kn account:konstanz class:singlenode
WallTime: 20:04:28 of 3:00:00:00
BecameEligible: Wed Jan 13 15:58:11
SubmitTime: Wed Jan 13 15:57:58
(Time Queued Total: 00:01:03 Eligible: 00:00:58)
StartTime: Wed Jan 13 15:59:01
TemplateSets: DEFAULT
NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 1
Req[0] TaskCount: 1 Partition: uc1
Memory >= 4000M Disk >= 0 Swap >= 0
Dedicated Resources Per Task: PROCS: 1 MEM: 4000M
NodeSet=ONEOF:FEATURE:[NONE]
Allocated Nodes:
[uc1n320:1]
SystemID: uc1
SystemJID: 8370992
IWD: /pfs/data2/home/kn/kn_kn/kn_pop139522/fastsimcoal25/Midas_RAD_anchored/Nic_cit_09_Apo_zal_07_cl2_07_cl3_07_cl5_07/1col_DIV-resize_admix_zal1st_starlike_cl2-base_CL-growths_GL-bottlegrowth_onlyintramig/new_est/run_2
SubmitDir: /pfs/data2/home/kn/kn_kn/kn_pop139522/fastsimcoal25/Midas_RAD_anchored/Nic_cit_09_Apo_zal_07_cl2_07_cl3_07_cl5_07/1col_DIV-resize_admix_zal1st_starlike_cl2-base_CL-growths_GL-bottlegrowth_onlyintramig/new_est/run_2
Executable: /opt/moab/spool/moab.job.voNyde
StartCount: 1
BypassCount: 1
Partition List: uc1
Flags: BACKFILL,FSVIOLATION,GLOBALQUEUE
Attr: BACKFILL,FSVIOLATION
StartPriority: -3692
PE: 1.00
Reservation '8370992' (-20:04:34 -> 2:03:55:26 Duration: 3:00:00:00)
[...]
You can use standard Linux pipe commands to filter the very detailed checkjob output.
- Is the job still running?
$ checkjob 8370992 | grep ^State
State: Running
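The same filter can be wrapped in a function that returns only the state value, which is convenient in scripts. The name job_state is hypothetical, and the sketch assumes a "State:" line as in the checkjob output above.

```shell
#!/bin/bash
# Hypothetical helper: extract the value of the "State:" line from
# checkjob output read on stdin.
job_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Usage:
#   checkjob 8370992 | job_state
#   [ "$(checkjob 8370992 | job_state)" = "Running" ] && echo "still running"
```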
Blocked job information : checkjob -v
This command lets you check the detailed status and resource requirements of your active, queued, or recently completed jobs. Additionally, it performs numerous diagnostic checks and determines if and where the job could potentially run. Diagnostic checks include policy violations, reservation constraints, preemption status, and job-to-resource mapping.
If your job is blocked do not delete it!
Job Eligibility
If a job cannot run, a text reason is provided along with a summary of how many nodes are and are not available. If the -v flag is specified, a node by node summary of resource availability will be displayed for idle jobs. For job level eligibility issues, one of the following reasons will be given:
Reason | Description |
---|---|
job has hold in place | one or more job holds are currently in place |
insufficient idle procs | there are currently not adequate processor resources available to start the job |
idle procs do not meet requirements | adequate idle processors are available but these do not meet job requirements |
start date not reached | job has specified a minimum start date which is still in the future |
expected state is not idle | job is in an unexpected state |
state is not idle | job is not in the idle state |
dependency is not met | job depends on another job reaching a certain state |
rejected by policy | job start is prevented by a throttling policy |
If a job cannot run on a particular node, one of the following 'per node' reasons will be given:
Reason | Description |
---|---|
Class | Node does not allow required job class/queue |
CPU | Node does not possess required processors |
Disk | Node does not possess required local disk |
Features | Node does not possess required node features |
Memory | Node does not possess required real memory |
Network | Node does not possess required network interface |
State | Node is not Idle or Running |
Example
A blocked job has hit a limit and will become idle once resources become free. The "-v" (verbose) mode of checkjob also shows a "BLOCK MSG:" line with more details.
$ checkjob -v 8370992
[...]
BLOCK MSG: job <jobID> violates active SOFT MAXPROC limit of 750 for acct mannheim partition ALL (Req: 160 InUse: 742) (recorded at last scheduling iteration)
In this case the job has hit the limit of the account mannheim: it requests 160 cores while 742 are already in use.
The most common cause of blocked jobs is a violation of MAXPROC or MAXPS limits, indicating that your group has scheduled too many outstanding processor seconds at the same time.
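The numbers in the BLOCK MSG above can be checked by hand: 160 requested plus 742 in use is 902, which exceeds the limit of 750. The sketch below reproduces that comparison; check_maxproc is a hypothetical helper, and the rule "requested + in use must not exceed the limit" is an assumption inferred from the message, not taken from the Moab documentation.

```shell
#!/bin/bash
# Hypothetical helper: compare requested + in-use processors against a
# MAXPROC limit (assumed rule: the sum must not exceed the limit).
check_maxproc() {
    requested=$1; in_use=$2; limit=$3
    total=$(( requested + in_use ))
    if [ "$total" -gt "$limit" ]; then
        echo "blocked: $total > $limit"
    else
        echo "ok: $total <= $limit"
    fi
}

# Usage, with the values from the BLOCK MSG above:
#   check_maxproc 160 742 750
```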
The Limits imposed by the Scheduler
This refers to limits on the number of jobs in the queue which are enforced by the scheduler. The largest factors in determining limits on the number of jobs are the Maximum Processor Seconds (MAXPS) and the Maximum Processors (MAXPROC) for each account. MAXPS is the total number of processor core seconds (ps) allocated for each (group) account. It is based on fairshare values, which depend on the values configured for your <OE> (Konstanz, Ulm, etc.).
Users can submit as many jobs as they like, but the jobs cannot be scheduled to run if their group's MAXPROC or MAXPS value is exceeded. They instead enter into a "HOLD" state. If the limits of the group are not reached but the resources are not available, the jobs enter into "IDLE" state and will run once the requested resources become available.
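To get a feeling for MAXPS, the outstanding processor seconds of a set of jobs can be estimated as the sum of processors times remaining walltime in seconds. This is a minimal sketch under that assumption; outstanding_ps is a hypothetical helper and the input format (one "procs seconds" pair per line) is chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: sum processors x remaining seconds over all jobs.
# Input on stdin: one job per line, "<procs> <remaining_seconds>".
outstanding_ps() {
    awk '{ sum += $1 * $2 } END { print sum + 0 }'
}

# Usage: a 4-processor job with 1 hour left and a 2-processor job with
# 6 hours left give 4*3600 + 2*21600 = 57600 processor seconds.
#   printf '%s\n' '4 3600' '2 21600' | outstanding_ps
```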
Canceling own jobs : canceljob
Caution: This command is deprecated. Use mjobctl -c instead!
The canceljob <JobId> command is used to selectively cancel the specified job(s) (active, idle, or non-queued) from the queue.
Note that only own jobs can be cancelled.
Access
This command can be run by any Moab Administrator and by the owner of the job.
Flag | Name | Format | Default | Description | Example |
---|---|---|---|---|---|
-h | HELP | n./a. | | Display usage information | $ canceljob -h |
| JOB ID | <STRING> | (none) | a jobid, a job expression, or the keyword 'ALL' | |
Example Use of Canceljob
Example use of canceljob run on the bwUniCluster
[...calc_repo-0]$ msub bwhpc-fasta-example.moab
8374356 # this is the JobId
$
$ checkjob 8374356
job 8374356
AName: fasta36_job
State: Idle
Creds: user:kn_pop235844 group:kn_kn account:konstanz class:multinode
WallTime: 00:00:00 of 00:10:00
BecameEligible: Fri Jan 15 12:10:53
SubmitTime: Fri Jan 15 12:10:43
(Time Queued Total: 00:00:10 Eligible: 00:00:08)
[...]
$ checkjob 8374356 | grep ^State:
State: Idle # state is 'Idle'
$ # now cancel the job
$ canceljob 8374356
job '8374356' cancelled
$ checkjob 8374356 | grep ^State:
State: Removed # state turned into 'Removed'
Moab Job Control : mjobctl
The mjobctl command controls various aspects of jobs. It is used to submit, cancel, execute, and checkpoint jobs. It can also display diagnostic information about your own jobs.
Canceling own jobs : mjobctl -c
If you want to cancel a job that has been submitted, please do not use the PBS/Torque qdel (n./a.) or the deprecated canceljob commands.
Instead, use mjobctl -c <jobid>.
Flag | Format | Default | Description | Example |
---|---|---|---|---|
-c | JobId | (none) | Cancel a job. | see: example use of mjobctl -c |
Example Use of mjobctl -c
Canceling a job on the bwUniCluster
[...-calc_repo-0]$ msub bwhpc-fasta-example.moab
8374426
$ checkjob 8374426 | grep ^State
State: Idle # job is 'Idle'
$ mjobctl -c 8374426
job '8374426' cancelled # job is cancelled
$ checkjob 8374426 | grep ^State
State: Removed # now, job is removed
$ # my own checkjob wrapper
$ cj 8374426
Job: 8374426 Status: < Removed > Wartezeit: 1m30s Intervall: 30s # wait time: 1m30s, interval: 30s
Job 8374426 wurde gelöscht! # job 8374426 was deleted
$
Other Mjobctl-Options
See also:
Not all of the listed options are available for 'normal' users. Some are for MOAB-admins only.