Interactive computing is probably the more familiar way of working once you need resources beyond those available on your desktop or laptop. You start a shell and enter commands to be run; output is directed to a terminal and input is taken from a keyboard.
Rhino nodes provide this capability: simply log in via SSH and begin your computing.
The rhino compute nodes are large memory, shared systems. These are systems intended for:
- interactive work
- prototyping and development
- compiling software
- jobs requiring more than 48GB RAM
Access to these systems is via secure shell (ssh). There are four rhino nodes; when you use the alias rhino, a round-robin system distributes your session to any one of the four nodes.
These systems should not be used for intensive computational tasks unless the task requires significant memory (i.e. greater than 48GB). Other tasks should be limited in quantity and in run-time. Do not run many jobs (more than 10) or run them for a significant amount of time (more than 1,000 CPU-seconds).
To use Fred Hutch IT supported computing resources, you will need to acquire and manage various credentials. You can read more about how to get started on our Credentials page.
There are multiple ways to access Scientific Computing resources, all of which run a supported version of Ubuntu Linux. The simplest form of access is secure shell terminal software such as ssh or PuTTY. You may also need graphical output (GUI), for example to use tools like Rstudio or advanced text editors. See our page on Access Methods to learn more about how to set up your connection to the rhino nodes.
To access SciComp resources from outside FHCRC, you have to use WebVPN (Cisco AnyConnect) to come through the FHCRC firewall.
Connections with Storage
Storage mounted on all rhino nodes comes in four basic flavors:
You can read more about Data Storage options on our Storage pages.
mrg@rhino$ ls -l /fh
total 4
drwxr-xr-x 270 root root    0 Nov  4 21:11 economy
drwxr-xr-x 271 root root    0 Nov 19 11:39 fast
dr-xr-xr-x   3 root root 4096 Nov  2 13:09 secure
mrg@rhino$ ls -l /fh/fast/corey_l/scicomp
total 13
drwxrwxr-x 2 mrg g_mrg    52 Oct 29 10:42 bcl2fastq
drwxrwsr-x 4 mrg SciComp 123 Nov 17 14:16 build
drwxrwsr-x 4 mrg SciComp  53 Aug 25 09:14 econofile
Running an app on rhino
You can run an app directly on the rhino node to which you are connected. This is useful for lightweight apps and testing, but please refrain from running compute-intensive (CPU time and/or memory resources) processes as the rhino machines are a shared resource. To run directly, simply start your GUI application. Common apps include:
- Rstudio - run `module load R/3.4.1-foss-2016b-fh2 rstudio` followed by `rstudio` to start. You can load any version of the R environment module you like.
The Rstudio program (and rsession) tend to be resource hogs. For this reason, we limit each user to one rstudio session per rhino (or lamprey) at a time. If we see more than one rstudio session for one user, we send a warning email, and give the user an opportunity to exit one of the sessions.
- MATLAB - run `module load matlab/R2016b` followed by `matlab` to start. There are several versions of MATLAB installed; run `module avail matlab` to see them all.
- Mozilla/Firefox - it can be handy to run a browser on the remote system sometimes. Start one by simply running the
How to run GUI (X Windows/X11) apps on rhino
X11, or X Windows, is the standard and default Unix/Linux windowing system. It is used locally when you run an app on your Linux laptop/desktop to connect the app to your chosen windowing system (like Gnome, KDE, xfce, etc.), but it has always had the capability of remote execution, where the windowing system the user sees and the application they are running do not have to be on the same host. The architecture of X11 is backward from what most people assume: you run an X11 server on your client device, and the applications you run are known as X clients. The version everyone uses is version 11, hence X11.
The application we all use to connect to remote servers, OpenSSH, does a great job of transparently tunnelling the connections required by X11. However, it can only tunnel when executed in a terminal that is set up with your X11 server. This is the default on Linux, but on macOS (with XQuartz installed) you need to ssh from an XQuartz terminal, not iTerm/iterm2. Since Windows lacks an X11 server, you will need to use NoMachine to run X apps on one of the (Linux) NoMachine servers. Once logged in to a NoMachine server, you should open any terminal application.
Once you have a terminal open, use ssh to connect to a rhino node: `ssh rhino`. Once connected, you have two options:
You may see font errors and/or other warnings on your terminal when you start and run your X11 app. Many of these can safely be ignored, but if you do experience an error or crash, often the messages on your ssh terminal can be helpful in troubleshooting.
If you get an error like `Error: Can't open display:` or any other error mentioning “display”, you likely do not have X enabled in the terminal on your local system in which you are running ssh to the remote server. On macOS this can also be caused by a default configuration that, in some cases, disables X forwarding. On these systems, you should ssh with ‘-Y’ to enable trusted X forwarding, like this: `ssh -Y <hutchnetid>@rhino`.
A useful ‘test’ of the X forwarding is a simple program invoked with the command `xeyes`. This creates a small window with a pair of eyes that track the mouse cursor. It is much faster to start than either Rstudio or MATLAB, and gives a positive indication of functioning X11.
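Before launching a GUI app, it can help to confirm that the remote shell has a display to talk to. The following is a minimal sketch, assuming that an empty DISPLAY variable is the cause of the error; the function name check_x11 is hypothetical and not part of any SciComp tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether this shell has an X11 display configured.
# ssh sets DISPLAY on the remote side when X forwarding is active.
check_x11() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; reconnect with: ssh -Y <hutchnetid>@rhino"
        return 1
    fi
    echo "DISPLAY is set to $DISPLAY"
}
```

On a rhino node you could then run `check_x11 && xeyes` so that xeyes only starts when a display is available.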
If you want to run a computationally-intensive X/GUI app, you should grab your own node to do so. This will not impact other users of the rhino systems. This method can also be used to run non-GUI apps interactively.
“Grab” a node using one of the grab commands; you can read more about how to do this in the documentation for those commands. Once done, the grab command you used will have created an ssh session to the remote node you have reserved, and you are now ready to run your app. Use the same commands as you would on a rhino (see above).
Tips, Tricks and Gotchas
We look for processes using the “top” command. If a process has accumulated >1000 (“seconds”) under “TIME+”, we send a warning email to the user. The purpose of the email is to remind or encourage the user to use the gizmo cluster instead for this kind of job, for example, by using the “grab” commands (look here for documentation).
If the user does not respond, we send a warning when TIME+ hits 2000, asking for an explanation. We also warn that if TIME+ hits 4000 without a response from the user, we will kill the process.
If the user still does not respond, and TIME+ for the process hits 4000, we kill the process.
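To see where one of your own processes stands against these thresholds, you can convert its accumulated CPU time to seconds and compare. This is a rough sketch, not the script SciComp uses: it reads the TIME column of ps (format [DD-]HH:MM:SS) instead of TIME+ from top, and the function names and stage labels are invented for illustration.

```shell
#!/bin/sh
# Convert a ps-style CPU time ([DD-]HH:MM:SS or MM:SS) to whole seconds.
cpu_time_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        if (NF == 4)      print $1 * 86400 + $2 * 3600 + $3 * 60 + $4
        else if (NF == 3) print $1 * 3600 + $2 * 60 + $3
        else              print $1 * 60 + $2
    }'
}

# Map accumulated CPU seconds to a stage of the policy described above.
# The labels are hypothetical; the thresholds are the ones quoted in the text.
policy_stage() {
    if   [ "$1" -ge 4000 ]; then echo "kill-eligible"
    elif [ "$1" -ge 2000 ]; then echo "second-warning"
    elif [ "$1" -gt 1000 ]; then echo "first-warning"
    else echo "ok"
    fi
}

# Report the stage for each process owned by the current user
# (defined here, not called).
report_my_processes() {
    ps -u "$(id -un)" -o pid=,time=,comm= | while read -r pid t comm; do
        echo "$pid $comm $(policy_stage "$(cpu_time_to_seconds "$t")")"
    done
}
```

Running `report_my_processes` on a rhino node prints one line per process, which makes it easy to spot work that belongs on a gizmo node instead.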
Multiple processes where time adds up to 1000
If multiple processes for one user have TIME+ that adds up to >1000, we also send a warning email. The point is to maintain good interactive response on the rhinos. If one user is hogging many cpu cores, that good response time can degrade. We direct the user to do this kind of activity on a gizmo node, via the grab commands.
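The combined total can be estimated by summing the CPU time of every process you own. The sketch below assumes the ps TIME column ([DD-]HH:MM:SS) is a reasonable stand-in for TIME+ in top; the function names are hypothetical.

```shell
#!/bin/sh
# Sum ps-style CPU times, one per line on stdin, and print the total in seconds.
sum_cpu_seconds() {
    awk -F'[-:]' '
        NF == 4 { total += $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }  # DD-HH:MM:SS
        NF == 3 { total += $1 * 3600 + $2 * 60 + $3 }               # HH:MM:SS
        END     { print total + 0 }
    '
}

# Total CPU seconds across all processes owned by the current user
# (defined here, not called).
my_total_cpu_seconds() {
    ps -u "$(id -un)" -o time= | sum_cpu_seconds
}
```

If `my_total_cpu_seconds` reports a value above 1000, that is a hint to move the work to a gizmo node.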
Other cases that usually result in a warning
Pybedtools and /tmp - pybedtools can be run in such a way that it leaves files in tmpdir for the duration. If you use pybedtools, please read this bug report and adjust your code accordingly: pybedtools issue 159.
If we see a process using a huge amount of real or virtual memory (say >300 GB for a short period of time, or >90 GB for a few days), we will send an email to the user. The purpose is to find out what they are doing, and if there might be a more efficient way to do that.
A process that has accumulated a TIME+ >1000 but seems to be doing nothing, and hangs around for more than a week, will draw an email from us.
Processes running disconnected from a parent process, such as R running with a parent process ID of “1”, draw our attention. This could be a runaway process, stuck in a loop. We will send an email to the user, but if it’s obviously a runaway, we may kill it first and ask the user later.
If a SCREEN process is older than two months, most likely the user who started it has forgotten it. If the user has a pattern of behavior where they start SCREEN and launch R processes and leave them for months, we will “clean up” after them.
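Several of the cases above can be checked for your own processes with ps and a few small filters. This is only an illustrative sketch: the function names are invented, it looks at resident memory only (not virtual), and it treats "two months" as 60 days. Note that a parent process ID of 1 is also normal for detached screen sessions, so that filter is a starting point, not a verdict.

```shell
#!/bin/sh
# All helper names here are hypothetical; each filter reads ps-style rows on stdin.

# Rows "PID RSS_KiB COMMAND": print those whose resident memory exceeds $1 GiB.
over_rss_gib() {
    awk -v limit="$1" '{
        gib = $2 / 1048576            # ps reports RSS in KiB
        if (gib > limit) printf "%s %.1f %s\n", $1, gib, $3
    }'
}

# Rows "PID PPID ...": keep processes whose parent is PID 1.
filter_orphans() {
    awk '$2 == 1'
}

# Rows "PID ETIME ...": keep processes at least $1 days old.
# ps prints ETIME as [[DD-]hh:]mm:ss, so the day count is the part before "-".
older_than_days() {
    awk -v days="$1" '{
        n = split($2, parts, "-")
        age = (n == 2) ? parts[1] + 0 : 0
        if (age >= days) print
    }'
}

# Review the current user's processes (defined here, not called).
review_my_processes() {
    me=$(id -un)
    echo "Resident memory over 90 GiB:"
    ps -u "$me" -o pid=,rss=,comm= | over_rss_gib 90
    echo "Parent process ID is 1:"
    ps -u "$me" -o pid=,ppid=,etime=,args= | filter_orphans
    echo "screen sessions older than 60 days:"
    ps -u "$me" -o pid=,etime=,comm= | grep -i screen | older_than_days 60
}
```

Calling `review_my_processes` now and then is one way to notice forgotten or runaway processes early.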
Our hope is that people will pay attention to the processes they start, and be conscious that they’re working in a shared environment with limited resources.
Updated: September 24, 2020