Linux, Unix and Bash

Updated: November 12, 2019


What is Linux?

Linux is a Unix-like operating system that has been under development since 1991. From hobbyist and student beginnings, it has grown into a versatile, mature, and fairly robust technology.

However, calling what we use today “Linux” glosses over the source of many of the most important tools that make Linux useful: the GNU project. GNU, a recursive acronym for “GNU’s Not Unix”, is where most of the tools we use on Linux come from. Shells, compilers, utilities, and even games used on Linux come from the GNU project. Thus you will sometimes see Linux referred to as “GNU/Linux”. Most often it is just “Linux,” but it is important to remember that much of its utility comes from that other project.

Linux has become a core part of modern bioinformatic investigation: many of the most popular tools only run on Linux. Thus, it’s important that you become at least comfortable using Linux and navigating the computational resources provided by the Hutch.

Learning Linux

Learning Linux means learning the shell. The most common shell is bash, and it is the one we assume is in use here. A good way to get started is to work through some of the many readily available tutorials:

  • The Unix Shell course from Software Carpentry (this organization also offers a number of other software-oriented tutorials and resources).
  • The Introduction to Linux guide from The Linux Documentation Project

The rest of this document assumes you’ve gone through at least one of the basic introductions above. For more advanced use of the shell, such as scripting or programming, see the Shell Scripting section that follows.

Shell Scripting

During the course of your work you may need to do a simple task on a large number of files, like renaming all the files from a sequencing run, or raising the contrast on microscopy images. Performing these tasks on individual files by hand is time-consuming and prone to errors. Unix, macOS, and Windows all have simple shell scripting languages built in for small, repetitive tasks like these that require only simple logic.

The benefits of shell scripting are:

  1. not needing to install additional software on your computer, and
  2. ease of use.

Most Unix-like systems (e.g. Ubuntu) come with the Bourne Again SHell (“Bash”), which is also available on macOS systems. Windows has the Command Prompt and PowerShell. You can enter shell commands directly at a command line interface, or save them in a shell script to be run non-interactively.
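As a minimal sketch of the kind of repetitive task described above, the following Bash script renames a batch of files in a loop. The file names (sample1.fq and so on) and the .fq to .fastq renaming are hypothetical, chosen only for illustration; the script works in a temporary directory so it is safe to run anywhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so this sketch cannot touch real data.
workdir="$(mktemp -d)"

# Hypothetical file names standing in for the output of a sequencing run.
touch "$workdir/sample1.fq" "$workdir/sample2.fq" "$workdir/sample3.fq"

# Rename every .fq file to .fastq.
# "${f%.fq}" is the value of $f with the trailing ".fq" removed.
for f in "$workdir"/*.fq; do
    mv -- "$f" "${f%.fq}.fastq"
done

# List the renamed files.
ls "$workdir"
```

Saving these lines to a file and running it with `bash` performs the whole batch non-interactively. The same `for` loop can also be typed directly at the prompt.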

Shell Scripting Resources

Linux at the Fred Hutch

With these skills in hand, we will now discuss how to navigate the various Linux-based computational resources available to you in the Hutch computing environment.

Using the Network

These systems all live remotely: either in a server room on campus or possibly in a cloud provider’s datacenter. Thus, we need to use the network to connect to them. Most of our systems require that you are connected to the campus network, either via a wired network connection at a workstation, the Marconi Wi-Fi network, or via VPN from off-campus networks.

The next requirement is an SSH (Secure SHell) client. macOS has one built in: go to Applications, then Utilities, and open the Terminal application, from which you can run ssh. Windows users may need to install a client. PuTTY is a freely available SSH and terminal client that has been the go-to for Windows users for years.

An alternative is to use a NoMachine client to start a graphical session. That process is described here. Within a NoMachine session you can open a terminal on the NoMachine server, and from there start an SSH session.

With those tools, you are now ready to connect to one of the session servers described in our Technologies page. Most commonly you will connect to the host rhino.
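From a terminal, the connection step might look like the sketch below. The user name `jdoe` is a hypothetical placeholder for your own HutchNetID, and the short host name `rhino` is taken from the text above; whether a fully qualified host name is needed depends on your network setup, so treat both values as assumptions. The script only prints the command it would run; uncomment the last line to actually connect.

```shell
#!/usr/bin/env bash

# Hypothetical placeholder: replace with your own HutchNetID.
hutch_id="jdoe"

# Session server named in the text; a longer host name may be required
# depending on how your machine resolves names (assumption).
host="rhino"

# Build the command as an array so each argument stays a separate word.
ssh_cmd=(ssh "${hutch_id}@${host}")

# Show the command instead of running it.
printf 'Would run: %s\n' "${ssh_cmd[*]}"

# To open the session, uncomment the following line:
# "${ssh_cmd[@]}"
```

When the connection succeeds you will be prompted for your password, after which you get a shell prompt on the remote host.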

Setting up your Account

For the most part, your HutchNetID and password are all that are required to access the computational environment here. If you do have trouble accessing hosts, contact Scientific Computing (scicomp).

Finding Data

Once you have access to these systems you’re ready to start computing. Of course, computation really requires data, so next let’s discuss where all the interesting data resides on these systems.

See the Data Storage section in our Wiki for more guidance about storage locations and access.
