Run a Non-Interactive Job on the Cluster

This pathway walks you through running your first non-interactive computing job on the gizmo computing cluster at Fred Hutch, using the command-line interface.

Pre-requisites

A desktop computer, access to the internet, and a good text editor.

If you are unfamiliar with any of the terms or subjects used in this pathway, see the definitions listed below.

Interactive sessions on a login node
Node: An individual server in a collection of networked servers that make up a computing cluster.
Shell: A command-line user interface for Unix-like operating systems.
Script: A set of commands that are executed by an operating system or application.
Session: A temporary and interactive information interchange between two or more communicating devices, or between a computer and a user.
Workload manager: Software that coordinates job submission to nodes on a cluster.
SLURM: The workload manager used on Fred Hutch’s gizmo computing cluster.
HutchNet ID: A user ID specific to Fred Hutch.
Workflow manager: Software that coordinates the submission of jobs, and the inputs and outputs of individual jobs, in a scientific workflow.
rhino: The login node (actually several nodes) of the Fred Hutch high performance computing cluster.
gizmo: The name of the Fred Hutch high performance computing cluster’s compute nodes.

Steps

Familiarize Yourself with SLURM

SLURM is the workload manager for our gizmo computing cluster. Review the documentation for basic information about how SLURM works. Once you have logged in to a login node (rhino), you will use SLURM commands to send non-interactive jobs to the compute nodes (gizmo).

Create a Script

For this pathway we will use an existing script from this template GitHub repository. Copy the file to a location you can access from rhino by running:

wget https://raw.githubusercontent.com/FredHutch/slurm-examples/main/01-introduction/1-hello-world/01.sh
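For orientation, a SLURM batch script is an ordinary shell script whose `#SBATCH` comment lines pass options to the scheduler. The sketch below writes a hypothetical `hello.sh` to show the general shape. Its file name, options, and message are assumptions for illustration and may differ from the repository's `01.sh`.

```shell
# Write a hypothetical minimal batch script (not the repository's 01.sh).
cat > hello.sh <<'EOF'
#!/bin/bash
# Name shown in the queue.
#SBATCH --job-name=hello
# Wall-clock limit of five minutes.
#SBATCH --time=00:05:00
# A single task.
#SBATCH --ntasks=1

# SLURM sets SLURM_JOB_ID inside a job; fall back to "local" elsewhere.
echo "Hello from job ${SLURM_JOB_ID:-local}"
EOF

# The #SBATCH lines are plain comments to bash, so the script also runs locally.
bash hello.sh
```

Because the scheduler options live in comments, the same file can be tested with `bash` before it is ever submitted.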

Submit the Script

Use the sbatch command to submit the script to gizmo, for example sbatch 01.sh. Output should appear as a log file in the current directory with the job ID in the filename (by default, slurm-<jobID>.out).
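The submission step might look like the sketch below. It assumes `01.sh` is in the current directory and that `sbatch` is on the `PATH`, as it should be on rhino. The helper name `submit_job` is hypothetical, and the log name follows SLURM's default `slurm-<jobID>.out` pattern, which applies only if the script does not set its own `--output`.

```shell
# Submit a batch script and report where its log will be written.
# submit_job is a hypothetical helper, not part of SLURM.
submit_job() {
    local script="$1"
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found; run this on a rhino login node" >&2
        return 1
    fi
    local job_id
    # --parsable prints only the job ID (plus ";cluster" on multi-cluster setups).
    job_id=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix.
    job_id="${job_id%%;*}"
    echo "Submitted job ${job_id}; output goes to slurm-${job_id}.out"
}

submit_job 01.sh || echo "submission failed" >&2
```

Once the job finishes, `cat slurm-<jobID>.out` in the submission directory shows whatever the script printed.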

Where to go from here

Managing Jobs

This documentation provides more information on managing jobs that are queued and running on the cluster, including steps to take when jobs don’t run.
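As a rough sketch of day-to-day job management, the standard SLURM commands `squeue`, `sacct`, and `scancel` cover most needs. The wrapper names below (`my_jobs`, `job_summary`, `cancel_job`) are hypothetical conveniences, and the sketch assumes job accounting is enabled so that `sacct` returns data.

```shell
# Hypothetical convenience wrappers around standard SLURM commands.

# List your own queued and running jobs.
my_jobs() {
    squeue -u "${USER}"
}

# Summarize a job's state, run time, and exit code from accounting data.
job_summary() {
    sacct -j "$1" --format=JobID,JobName,State,Elapsed,ExitCode
}

# Cancel a queued or running job by its ID.
cancel_job() {
    scancel "$1"
}
```

On rhino, `my_jobs` shows what is pending or running, `job_summary <jobID>` helps explain why a job ended the way it did, and `cancel_job <jobID>` removes a job you no longer need.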

Writing More Complicated Scripts

This was a simple script; you’ll need more advanced scripts to run real workloads on the cluster. These resources are good starting points:

The Linux Documentation Project manual on bash scripting
The Advanced Guide for bash scripting

Larger Compute Needs

We’ve described a single job. When your work requires many jobs or many steps, more advanced tools are necessary, and you begin to delve into parallel computing. Two commonly used approaches at Fred Hutch are:

SLURM job arrays provide an easy mechanism for submitting thousands of homogeneous jobs.
Workflow managers are the gold standard for managing computational workflows and are particularly valuable for multi-step analyses. Cromwell and Nextflow are the two preferred workflow managers at the Hutch.
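To make the job array idea concrete, here is a minimal sketch under stated assumptions: the file name `array-example.sh`, the index range `1-10`, and the `input_<N>.txt` naming are all hypothetical. SLURM runs one copy of the script per index and sets `SLURM_ARRAY_TASK_ID` in each copy, so every task can pick its own input.

```shell
# Write a hypothetical array script: SLURM runs one copy per index in --array.
cat > array-example.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=array-example
#SBATCH --array=1-10
# %A is the array's job ID and %a the task index, giving one log per task.
#SBATCH --output=array-%A_%a.out

# SLURM sets SLURM_ARRAY_TASK_ID for each task; default to 1 for a local dry run.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
echo "Processing input_${task_id}.txt"
EOF

# Local dry run of a single task; on rhino, submit with: sbatch array-example.sh
bash array-example.sh
```

A single `sbatch` call then queues all ten tasks, which is what makes arrays convenient for large batches of similar jobs.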
