Supported Technologies and Platforms


The Fred Hutch provides researchers on campus with access to high performance computing using on-premise resources. The various technologies provided are outlined here, along with the basic information researchers need to identify which resource is best suited to their particular computing needs.

The Fred Hutch managed systems listed here serve needs that rise above those that can be met with your desktop computer or web-based services. Common reasons to move to these high performance computing (HPC) resources include:

  • reproducible compute jobs
  • version controlled and/or specialized software
  • increased compute capability
  • rapid access to large data sets in central data storage locations

Overview of On-Premise Resources

| Compute Resource | Access Interface | Resource Admin | Connection to FH Data Storage |
|---|---|---|---|
| Gizmo | Via Rhino or NoMachine hosts (CLI, FH credentials on campus/VPN off campus) | Scientific Computing | Direct to all local storage types |
| Beagle | Via Rhino or NoMachine hosts (CLI, FH credentials on campus/VPN off campus) | Center IT | home, fast, economy, AWS-S3, and Beagle-specific scratch |
| Rhino | CLI, FH credentials on campus/VPN off campus | Scientific Computing | Direct to all local storage types |
| NoMachine | NX Client, FH credentials on campus/VPN off campus | Scientific Computing | Direct to all local storage types |
| Python/Jupyter Notebooks | Via Rhino (CLI, FH credentials on campus/VPN off campus) | Scientific Computing | Direct to all local storage types |
| R/R Studio | Via Rhino (CLI, FH credentials on campus/VPN off campus) | Scientific Computing | Direct to all local storage types |

Gizmo and Beagle Cluster

While we generally don’t recommend interactive computing on the HPC clusters (interactive use can limit the amount of work you can do and introduce “fragility” into your computing), there are many scenarios where interactively using cluster nodes is a valid approach. For example, if you have a single task that is too resource-intensive for a rhino host, opening a session on a cluster node is the way to go.

If you need an interactive session with dedicated resources, you can start a job on the cluster using the command grabnode. The grabnode command starts an interactive login session on a cluster node, prompting you for how many cores, how much memory, and how much time you require.

This command can be run from any NoMachine or rhino host.
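
As a quick sketch (the exact prompts, defaults, and limits may differ from what you see on the current system), a typical interactive request looks like this:

```
# Sketch only: log in to a rhino host with your FH credentials
# (use the VPN if you are off campus), then request an interactive node.
ssh rhino       # the short hostname is an assumption; see the ssh guidance below
grabnode        # answer the prompts for cores, memory, and time
# ...work on the allocated cluster node...
exit            # end the session to release the node when you are done
```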

Note: At this time we aren’t running interactive jobs on Beagle nodes. If you have a need for this, please email scicomp.

Rhino

Gizmo is not actually a standalone system; instead, access to the resource is based on the Rhino platform supported by Center IT. Rhino, or more specifically the Rhinos, are three locally managed HPC servers, all accessed via the name rhino. Together, they function as a data and compute hub for a variety of data storage resources and high performance computing (HPC). The specific guidance for each approach to HPC access differs slightly, but every approach requires the user to learn how to access and interact with rhino.

Note: Any user interacting with the following systems will need to be proficient in the care and keeping of the Rhinos.

More information on ssh configuration for access to rhino can be found here. Specific guidance for using rhino and gizmo is available in our Resource Library for rhino and for gizmo.
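
As an illustration only, an entry like the following in your ~/.ssh/config file lets you connect with a short "ssh rhino"; the hostname and username shown are placeholders, so defer to the linked ssh configuration guidance for the authoritative settings:

```
# Minimal ~/.ssh/config sketch for reaching rhino from on campus or over the VPN.
# HostName and User are placeholders; adjust them per the linked guidance.
Host rhino
    HostName rhino.fhcrc.org
    User your-hutchnet-id
```

With an entry like this in place, running "ssh rhino" opens a command-line session on one of the Rhinos.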

The NoMachine Cluster

NoMachine is a software suite that allows you to run a Linux desktop session remotely. The session runs on the NoMachine server but is displayed on your desktop or laptop using the NoMachine client. NoMachine (also abbreviated NX) is installed on CIT-supported PC desktops and laptops.

NX has the particular advantage of maintaining your session even when you disconnect or lose connectivity. All that is required is to restart the client, and your session will be just as you last left it.

There are three systems you can use for NX sessions: lynx, manx, and sphinx. These are not computational systems; rather, these hosts are used solely as launch points for sessions on gizmo or rhino. Running computational tools on these systems will get you a warning from SciComp.

Other Available Resources

More to come here regarding VMs, Shiny, Rancher, and data transfer.

Resource and Node Description Information

Below we describe the current basic configurations (node types, counts, and memory) for a variety of SciComp-supported computing resources. These tables are useful when deciding what type of resources to request when using rhino and gizmo for interactive and non-interactive jobs; an example resource request follows the Gizmo tables below. The tables are auto-generated and are a work in progress so that we can provide the most up-to-date information on the Wiki for your use. Please file an Issue in our GitHub repository if you notice something amiss or need clarification.

Resource Information

| Name | Type | Authentication | Authorization | Location |
|---|---|---|---|---|
| rstudio | web | web | hutchnetID | FHCRC |
| proxmox | VM cluster | web | hutchnetID | FHCRC |

Cluster Node Information

GIZMO

Location: FHCRC

| Partition | Node Name | Node Count | CPU | Cores | Memory |
|---|---|---|---|---|---|
| campus | f | 456 | Intel E3-1270v3 | 4 | 32GB |
| largenode | g | 18 | Intel E5-2667v3 | 6 | 256GB |
| largenode | h | 3 | Intel E5-2697v3 | 14 | 768GB |
| none (interactive use) | rhino | 3 | Intel E5-2697v3 | 14 | 384GB |

Additional resources

| Node Name | Network | Local Storage |
|---|---|---|
| f | 1G (up to 100MB/s throughput) | 800GB @ /loc (ca. 100MB/s throughput) |
| g | 10G (up to 1GB/s throughput) | 5TB @ /loc (300MB/s throughput / 1000 IOPS) and 200GB @ /loc/ssd (1GB/s throughput / 500k IOPS) |
| h | 10G (up to 1GB/s throughput) | 5TB @ /loc (300MB/s throughput / 1000 IOPS) and 200GB @ /loc/ssd (1GB/s throughput / 500k IOPS) |
| rhino | 10G (up to 1GB/s throughput) | 5TB @ /loc (300MB/s throughput / 1000 IOPS) |
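
To make these figures concrete, here is a minimal sketch of translating the tables into a resource request, assuming the cluster is scheduled with Slurm; the partition names come from the Gizmo table above, while the script names, memory values, and time limits are illustrative placeholders to check against current documentation:

```
# Illustrative only: size a batch request to fit a campus (f-class) node,
# listed above as 4 cores and 32GB of memory per node.
sbatch --partition=campus --cpus-per-task=4 --mem=30G --time=1-00:00:00 my_job.sh

# Larger jobs go to the largenode partition (the g and h class nodes above);
# leave some memory headroom below the node total for the operating system.
sbatch --partition=largenode --cpus-per-task=6 --mem=240G --time=1-00:00:00 big_job.sh
```

For interactive work, grabnode (described above) asks for the same three quantities: cores, memory, and time.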

BEAGLE

Location: AWS

| Partition | Node Name | Node Count | CPU | Cores | Memory |
|---|---|---|---|---|---|
| campus | f | 777 | Intel c5 | 4 | 15GB |
| largenode | g | 103 | Intel c5 | 18 | 60GB |
| largenode | h | 34 | Intel r4 | 16 | 244GB |

Additional resources

| Node Name | Network | Local Storage |
|---|---|---|
| f | EC2 | EBS |
| g | EC2 | EBS |
| h | EC2 | EBS |

KOSHU

Location: GCP

| Partition | Node Name | Node Count | CPU | Cores | Memory |
|---|---|---|---|---|---|
| campus | f | 70 | Intel ?? | 4 | 32GB |
| largenode | g | 10 | Intel ?? | 8 | 256GB |
| gpu | h | 10 | Intel ?? | 4 | 128GB |

Additional resources

| Node Name | Network | Local Storage |
|---|---|---|
| f | GCP | Google Persistent Disk |
| g | GCP | Google Persistent Disk |
| h | GCP | Google Persistent Disk |

