5. Job Submission with SLURM
SLURM (Simple Linux Utility for Resource Management) is the job scheduler on Falcon. It manages the queue of jobs waiting to run on the supercomputer and allocates computing resources fairly among users. This guide covers the basics of submitting and monitoring jobs.
5.1. Interactive vs. Batch Jobs
Before diving into commands, understand the two ways to use Falcon:
Interactive Jobs
You run commands directly and see output immediately. Good for:
- Testing code before running full simulations
- Debugging issues
- Quick exploratory work
- Interactive Python/MATLAB sessions
Batch Jobs
You write a script with commands and submit it to the queue. The system runs it when resources are available. Good for:
- Long-running simulations
- Jobs that take hours or days
- Running many jobs in sequence
- Unattended computing
Most serious work on Falcon uses batch jobs, because interactive sessions have time limits and require you to stay connected.
5.2. Interactive Jobs with srun
srun — Run a Command Interactively
Run a command directly on compute resources. Useful for testing and short tasks.
srun hostname # Get the name of the compute node
srun -N 1 -n 1 python script.py # Run Python script on 1 node, 1 task
srun -N 1 -n 4 python script.py # Run 4 copies of the script (4 tasks) on 1 node
Common flags:
-N NUM # Number of nodes (--nodes=NUM)
-n NUM # Number of tasks (--ntasks=NUM)
-c NUM # Cores per task (--cpus-per-task=NUM)
-t TIME # Time limit (--time=HH:MM:SS)
--mem=SIZE # Memory per node (e.g., 4G, 16G)
--partition=NAME # Which partition/queue (see sinfo)
--job-name=NAME # Name for the job
Example with multiple flags:
srun -N 1 -n 1 -c 4 -t 00:30:00 --mem=8G python analysis.py
This requests 1 node, 1 task, 4 cores per task, 30 minutes, and 8GB of memory.
Important: Interactive jobs have time limits (usually 1-4 hours depending on the partition). For longer jobs, use batch submission.
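For the interactive Python or debugging sessions mentioned in 5.1, a common pattern is to ask srun for a shell on a compute node. The sketch below wraps that in a hypothetical helper named interactive_shell. It assumes Falcon permits --pty sessions and that the default partition is acceptable; this guide confirms neither.

```shell
# Hypothetical helper (not part of SLURM): open a shell on a compute node.
# $1 = time limit in minutes (default 60), $2 = cores (default 1).
# Assumes the site permits interactive --pty sessions.
interactive_shell() {
    srun -N 1 -n 1 -c "${2:-1}" -t "${1:-60}" --pty bash -i
}

# Usage (on a login node): interactive_shell 30 4
```

Typing exit in the resulting shell ends the job and releases the resources.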
5.3. Batch Jobs with sbatch
sbatch — Submit a Job Script
For longer or more complex jobs, write a script and submit it to the queue.
Basic Script Structure
Create a file called myjob.sh:
#!/bin/bash
#SBATCH --job-name=myanalysis
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=02:00:00
#SBATCH --mem=16G
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err
# Load any modules you need
module load python/3.9
# Your actual commands
python analysis.py
echo "Job finished"
Lines starting with #SBATCH are directives that tell SLURM how to run your job. They must come before any actual commands.
Submit it:
sbatch myjob.sh
SLURM will print out the job ID. You can use this to monitor or cancel the job.
Common SBATCH Directives
#SBATCH --job-name=NAME # Name for your job
#SBATCH --nodes=N # Number of compute nodes
#SBATCH --ntasks=N # Total number of tasks
#SBATCH --cpus-per-task=N # Cores per task (for OpenMP)
#SBATCH --time=HH:MM:SS # Time limit
#SBATCH --mem=SIZE # Memory per node (e.g., 16G)
#SBATCH --mem-per-cpu=SIZE # Memory per core instead
#SBATCH --output=file.out # Where to save stdout
#SBATCH --error=file.err # Where to save stderr
#SBATCH --partition=NAME # Queue/partition name
#SBATCH --account=PROJECT # Project/account to charge
#SBATCH --mail-type=END,FAIL # Email when done or failed
#SBATCH --mail-user=email@domain # Where to send email
#SBATCH --dependency=afterok:123 # Run only after job 123 completes successfully
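The --dependency option can also be passed on the sbatch command line, which lets a small wrapper chain jobs without hard-coding IDs. This sketch relies on sbatch's --parsable option, which prints only the job ID (followed by ;cluster on multi-cluster setups). The helper name submit_job and the script names job1.sh and job2.sh are placeholders, not Falcon conventions.

```shell
# Hypothetical helper: submit a script and print only its job ID.
# --parsable makes sbatch print "jobid" or "jobid;cluster".
submit_job() {
    out=$(sbatch --parsable "$@") || return 1
    echo "${out%%;*}"   # drop the ";cluster" suffix if present
}

# Usage, with placeholder script names:
#   first=$(submit_job job1.sh)
#   submit_job --dependency=afterok:"$first" job2.sh
```

With afterok, the second job starts only if the first exits with status 0; if the first job fails, the second stays pending until it is cancelled.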
Understanding Job IDs
In the output file names, %j gets replaced with the job ID. For example:
#SBATCH --output=job_%j.out
If your job ID is 12345, the output file will be job_12345.out.
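SLURM supports other filename patterns besides %j, for example %x for the job name. The snippet below only mimics the substitution locally so you can preview a file name before submitting. It is an illustration, not SLURM's own logic, and preview_output_name is a hypothetical helper.

```shell
# Illustration only: mimics how %j (job ID) and %x (job name) expand.
# Assumes the job name contains no "/" or "&" (they are special to sed).
# $1 = pattern, $2 = job ID, $3 = job name
preview_output_name() {
    printf '%s\n' "$1" | sed -e "s/%j/$2/g" -e "s/%x/$3/g"
}

preview_output_name "job_%j.out" 12345 myanalysis   # job_12345.out
preview_output_name "%x_%j.err" 12345 myanalysis    # myanalysis_12345.err
```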
Example: Running Python with NumPy
#!/bin/bash
#SBATCH --job-name=numpy_analysis
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=04:00:00
#SBATCH --mem=32G
#SBATCH --output=results_%j.out
#SBATCH --error=errors_%j.err
# Load Python module
module load python/3.9
# Run your script
python -u analysis.py > output_log.txt
The -u flag makes Python print output immediately instead of buffering it.
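Requesting 8 cores does not by itself make NumPy use them. Many NumPy builds take their thread count from OMP_NUM_THREADS, although this depends on the math library behind NumPy, so treat it as an assumption for Falcon's python/3.9 module. SLURM sets SLURM_CPUS_PER_TASK when --cpus-per-task is given, so a job script can derive one from the other. The helper name threads_for_job is hypothetical.

```shell
# Hypothetical helper: number of threads to use inside a job.
# Falls back to 1 when SLURM_CPUS_PER_TASK is unset (e.g. on a login node).
threads_for_job() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

OMP_NUM_THREADS=$(threads_for_job)
export OMP_NUM_THREADS
echo "Using $OMP_NUM_THREADS thread(s)"
```

Placed before the python line in the script above, this keeps the thread count in step with the directive if you later change --cpus-per-task.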
5.4. Monitoring Jobs
squeue — Show Job Queue
See what jobs are running and queued.
squeue # Show all jobs in the queue (every user, unless the site restricts this)
squeue -u $USER # Show jobs for current user
squeue -A PROJECT # Show jobs in a project
squeue -p NAME # Show jobs in a partition
squeue -j 12345 # Show specific job details
squeue -l # Long format (more details)
squeue -S S # Sort by start time (-S takes the same field letters as -o)
squeue --states=PENDING # Show only pending jobs
squeue --states=RUNNING # Show only running jobs
Output columns:
- JOBID: Job identifier (use this to cancel or modify)
- PARTITION: Queue name
- NAME: Job name
- USER: Who submitted it
- ST: Status (R=Running, PD=Pending, CA=Cancelled, etc.)
- TIME: How long it’s been running
- NODES: Number of nodes
- NODELIST: Which nodes it’s running on
Example output:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
12345 cpu myanalysis omar R 2:30:15 1 node[01]
12346 cpu nextjob omar PD 0:00 1 (None)
Job 12345 has been running for 2 hours 30 minutes. Job 12346 is pending (waiting for resources).
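When the queue is long, a quick summary by state is often easier to read than the full listing. The sketch below counts jobs per state from squeue's default output. It assumes the state code is in the fifth column, as in the example above, and that job names contain no spaces; count_states is a hypothetical helper.

```shell
# Hypothetical helper: count jobs per state from squeue's default output.
# Assumes the state code (R, PD, ...) is in column 5 and skips the header line.
count_states() {
    awk 'NR > 1 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Demo on the sample output above; on Falcon: squeue -u "$USER" | count_states
count_states <<'EOF'
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
12345 cpu myanalysis omar R 2:30:15 1 node[01]
12346 cpu nextjob omar PD 0:00 1 (None)
EOF
```

For the sample input this prints one line per state: PD 1, then R 1.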
sinfo — Show Partition Information
See available partitions and resources.
sinfo # Show all partitions
sinfo -p PARTITION_NAME # Show specific partition
sinfo -N # Show per-node information
sinfo -l # Long format
Output shows:
- PARTITION: Queue name, marked with * if it’s the default
- AVAIL: Whether the partition is available (up/down)
- TIMELIMIT: Maximum job time allowed
- NODES: Number of nodes in each state
- STATE: Node state (idle, allocated, down, etc.)
sacct — Show Accounting Information
View details about completed jobs.
sacct # Show your recent jobs
sacct -j 12345 # Details for job 12345
sacct --starttime=2024-01-15 # Jobs since a date
sacct --format=JobID,JobName,Elapsed,MaxRSS,State
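One practical use of the Elapsed column is choosing --time for the next run. The sketch below converts an Elapsed value to seconds and adds a 20% margin. The helper names and the 20% figure are illustrative choices, and the parser assumes Elapsed is printed as HH:MM:SS or D-HH:MM:SS.

```shell
# Hypothetical helper: convert HH:MM:SS or D-HH:MM:SS to seconds.
hms_to_seconds() {
    echo "$1" | awk -F'[-:]' '
        NF == 4 { print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }
        NF == 3 { print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical helper: print the elapsed time plus 20% as HH:MM:SS.
padded_limit() {
    hms_to_seconds "$1" | awk '{
        t = int($1 * 12 / 10)
        printf "%02d:%02d:%02d\n", int(t / 3600), int((t % 3600) / 60), t % 60
    }'
}

padded_limit 02:30:15   # 03:00:18
```

A job that ran for 2:30:15 would then get #SBATCH --time=03:00:18, or a rounder value slightly above it.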
sstat — Show Currently Running Job Stats
Get real-time statistics on a running job (useful for optimization).
sstat -j 12345 # Stats for job 12345
sstat -j 12345 --format=AveCPU,AveVMSize,MaxRSS
5.5. Canceling and Modifying Jobs
scancel — Cancel a Job
scancel 12345 # Cancel job 12345
scancel -u $USER # Cancel all your jobs (be careful!)
scancel --state=PENDING -u $USER # Cancel all your pending jobs
scontrol — Control Jobs
View and modify job properties.
scontrol show job 12345 # Show full job details
scontrol update JobId=12345 TimeLimit=04:00:00 # Change time limit (users can usually only lower it; raising it needs an admin)
5.6. Job States
When you check your job status, you’ll see one of these states:
RUNNING (R): Job is currently executing
PENDING (PD): Job is waiting for resources to become available
COMPLETED (CD): Job finished successfully
FAILED (F): Job exited with error
CANCELLED (CA): You cancelled it with scancel
TIMEOUT: Job hit its time limit and was killed
OUT_OF_MEMORY: Job used more memory than allocated
The reason column shows why a job is pending:
- (None): No reason has been recorded yet, usually because the scheduler has not evaluated the job
- (Resources): Waiting for the requested resources to become free
- (Priority): Other jobs have higher priority
- (Dependency): Waiting for another job to complete
5.7. Writing Efficient Job Scripts
Best Practices
Set realistic time limits: The scheduler can often fit shorter jobs into gaps between larger ones, so an inflated request usually means a longer wait. Don’t request 24 hours if you need 2 hours.
Set realistic memory: Check how much your job actually uses. Don’t request 100GB if you need 8GB.
Use the right number of cores: More cores doesn’t always mean faster. Test with different values.
Load modules early: Load required software before running commands.
Test interactively first: Use srun to test before submitting long batch jobs.
Save outputs carefully:
mkdir -p results/
python analysis.py > results/output.txt
python visualize.py > results/plots.txt
Chain jobs with dependencies instead of running them sequentially:
# Submit first job
JOB1=$(sbatch job1.sh | awk '{print $NF}')
# Submit job2 only after job1 completes
sbatch --dependency=afterok:$JOB1 job2.sh
Use arrays for many similar jobs:
#!/bin/bash
#SBATCH --job-name=parameter_sweep
#SBATCH --array=1-100 # Run 100 copies with different SLURM_ARRAY_TASK_ID
#SBATCH --time=01:00:00
PARAM=$SLURM_ARRAY_TASK_ID
python analysis.py --parameter $PARAM
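Array task IDs are plain integers, so a sweep over file names or other non-numeric settings needs a lookup. A common approach, sketched here, keeps one parameter set per line in a text file and lets each task read its own line. The file name params.txt and the helper get_param are hypothetical.

```shell
# Hypothetical helper: print line N of a parameter file.
# $1 = file with one parameter set per line, $2 = 1-based line number.
get_param() {
    sed -n "${2}p" "$1"
}

# Inside an array job script (params.txt is a placeholder name):
#   PARAM=$(get_param params.txt "$SLURM_ARRAY_TASK_ID")
#   python analysis.py --parameter "$PARAM"
```

With --array=1-100 the file needs at least 100 lines; a task whose line is missing gets an empty PARAM.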
5.8. Common SLURM Errors
“sbatch: error: Batch script must start with #! (or optionally a blank line)”
Make sure your script starts with #!/bin/bash
“INVALID GRES SPECIFICATION” or similar
Check your #SBATCH directives for typos. Common mistakes:
- --cpus-per-task not --cpus-per-tasks (no ‘s’)
- --nodes not --node
“squeue shows my job but it won’t start”
Your job is pending. Check the reason:
squeue -j YOUR_JOB_ID
If it says (Resources), the partition is busy. Try a shorter time limit or different partition.
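To see only the state and reason for one job, squeue's output format can be narrowed down. In this sketch, -h drops the header, %T is the full state name and %r is the reason; job_reason is a hypothetical helper name.

```shell
# Hypothetical helper: print "STATE REASON" for one job.
# $1 = job ID
job_reason() {
    squeue -h -j "$1" -o "%T %r"
}

# Usage on Falcon: job_reason 12346
```

Running squeue --start -j YOUR_JOB_ID additionally shows an estimated start time when the scheduler has one.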
“Job killed because it ran out of memory (OUT_OF_MEMORY)”
Increase --mem in your script:
#SBATCH --mem=32G # Instead of 16G
“TIMEOUT - Job exceeded time limit”
Increase --time:
#SBATCH --time=04:00:00 # Instead of 02:00:00
5.9. Useful Workflow
Here’s a typical workflow for running jobs on Falcon:
Develop and test locally on your laptop or in an interactive session
Test with srun on a small subset of data:
srun -N 1 -n 1 -c 4 -t 00:10:00 python analysis.py --small-data
Write a batch script for the full job
Submit with sbatch:
sbatch myjob.sh
Monitor with squeue:
squeue -u $USER
Check results when done:
cat job_12345.out
cat job_12345.err
Analyze and iterate if needed
5.10. Example: Complete Job Script
Here’s a realistic example for a climate science job:
#!/bin/bash
#SBATCH --job-name=geos_chem_run
#SBATCH --nodes=4
#SBATCH --ntasks=128
#SBATCH --cpus-per-task=1
#SBATCH --time=06:00:00
#SBATCH --mem=128G
#SBATCH --partition=cpu
#SBATCH --output=geos_chem_%j.out
#SBATCH --error=geos_chem_%j.err
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=user@cardiff.ac.uk
# Load required modules
module load intel/2023
module load netcdf/4.9.2
module load hdf5/1.12.2
# Set up environment
export OMP_NUM_THREADS=1
export GEOS_CHEM_DATA=/data/geos_chem_inputs
# Create output directory
mkdir -p results/${SLURM_JOB_ID}
cd results/${SLURM_JOB_ID}
# Copy input files
cp /home/user/input/* .
# Run simulation
srun ./GeosChem run_config.txt
# Post-process results
python /home/user/postprocess.py
echo "Job completed at $(date)"
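It helps to check that the numbers in a multi-node request are consistent. In the script above, 128 tasks over 4 nodes means 32 tasks per node, and because --mem is per node, each task gets roughly 128G / 32 = 4G. The sketch below does that arithmetic. The helper name check_request is hypothetical, and this guide does not state whether 32 cores per node matches Falcon's hardware.

```shell
# Hypothetical helper: summarise a request.
# $1 = nodes, $2 = total tasks, $3 = memory per node in GB
check_request() {
    nodes=$1 ntasks=$2 mem_gb=$3
    if [ $(( ntasks % nodes )) -ne 0 ]; then
        echo "warning: $ntasks tasks do not divide evenly over $nodes nodes"
        return 1
    fi
    per_node=$(( ntasks / nodes ))
    echo "$per_node tasks/node, $(( mem_gb / per_node )) GB per task"
}

check_request 4 128 128   # 32 tasks/node, 4 GB per task
```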
5.11. Need Help?
Check SLURM documentation:
man sbatch, man srun, man squeue
Ask Omar or colleagues for script help
Check the Falcon wiki for partition information
Look at existing job scripts from groupmates