What is Linux?
Linux is a Unix-like operating system that has been developed over the past 27 years. From hobbyist and student beginnings it has grown into a versatile, mature, and fairly robust technology.
However, calling what we use today “Linux” glosses over the source of many of the most important tools that make Linux useful: the GNU project. GNU, a recursive acronym for “GNU’s Not Unix,” is where most of the tools we use on Linux come from. Shells, compilers, utilities, and even games used on Linux come from the GNU project. Thus you will sometimes see Linux referred to as “GNU/Linux.” Mostly it is just called “Linux,” but it is important to remember that much of its utility comes from that other project.
Linux has become a core part of modern bioinformatic investigation: many of the most popular tools only run on Linux. Thus, it’s important that you become at least comfortable using Linux and navigating the computational resources provided by the Hutch.
Learning Linux means learning the shell. The most common shell is bash, and it is the one we assume is in use here. A good way to get started is to work through one of the many tutorials that are readily available from various providers:
- The Unix Shell course from Software Carpentry (note that this organization also offers a number of other software-oriented tutorials and resources).
- The Introduction to Linux guide from The Linux Documentation Project.
The rest of this document assumes you have gone through one of the basic introductions above. For more advanced use of the shell, such as scripting or programming, see:
During the course of your work you may need to do a simple task on a large number of files, like renaming all the files from a sequencing run, or raising the contrast on microscopy images. Performing these tasks on individual files by hand is time-consuming and prone to errors. Unix, Mac OSX, and Windows all have simple shell scripting languages built in for these small, repetitive tasks that require only simple logic.
The benefits of shell scripting are:
- not needing to install additional software on your computer, and
- ease of use.
Most Unix-based systems (e.g. Ubuntu) come with the Bourne Again SHell (“Bash”), which is also standard on Mac OSX systems. Windows has the Command Prompt and PowerShell. You can enter shell scripting commands directly via a command line interface, or save these commands in a shell script to be run non-interactively.
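As a rough illustration of the kind of repetitive task described above, the following Bash sketch prepends a prefix to every `.fastq` file in a directory. The function name `add_prefix`, the `.fastq` extension, and the prefix are assumptions chosen for the example, not part of any Hutch tooling; adapt them to your own files and try it on copies first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a prefix to every .fastq file in a directory.
# The extension and naming scheme are illustrative assumptions.
add_prefix() {
    local dir="$1" prefix="$2" path name
    for path in "$dir"/*.fastq; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$path" ] || continue
        name="$(basename "$path")"
        # "--" stops option parsing in case a file name starts with "-".
        mv -- "$path" "$dir/${prefix}_${name}"
    done
}

# Example call (directory and prefix are placeholders):
# add_prefix ./run_data run42
```

Saving these lines in a file and running it with `bash` is the non-interactive use; pasting the loop into a terminal is the interactive use.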
Shell Scripting Resources
- A gentle introduction to the command line interface and related concepts is here.
- Some basic Bash commands can be found here.
- Slightly more advanced Bash scripting is found here.
- Common Bash pitfalls goes into more subtle, advanced usage.
- Overview of Windows PowerShell and an example comparing Command Prompt and PowerShell.
Linux at the Fred Hutch
With these skills in hand, we will now discuss how to navigate the various Linux-based computational resources available to you in the Hutch computing environment.
Using the Network
These systems all live remotely: either in a server room on campus or possibly in a cloud provider’s datacenter. Thus, we need to use the network to connect to them. Most of our systems require that you are connected to the campus network, whether through a wired connection at a workstation, the Marconi wifi network, or VPN from off-campus networks.
The next requirement is a tool called “SSH” (for Secure SHell). Mac OSX has one built in: go to Applications, then select Utilities, and you will see the Terminal application. Windows users will need to find an add-on. PuTTY is a freely available SSH/terminal client that has been the go-to for Windows users for years.
An alternative is using a NoMachine client to start a graphical session. That process is described here. Within these NoMachine sessions you can start a terminal on the NoMachine server, from which you can start an SSH session.
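Starting an SSH session from a terminal generally takes the form `ssh user@host`. The sketch below checks that an SSH client is installed and prints the command you would run. The user name and host name are hypothetical placeholders, not real Hutch systems; substitute your own HutchNetID and one of the session servers listed on the Technologies page.

```shell
#!/usr/bin/env bash
# Placeholder values: both are hypothetical and must be replaced with
# your own HutchNetID and a real session server name.
ssh_user="your_hutchnetid"
ssh_host="session-server.example.org"
ssh_target="${ssh_user}@${ssh_host}"

# Confirm an SSH client is installed before trying to connect.
if command -v ssh >/dev/null 2>&1; then
    echo "SSH client found. To connect, run: ssh ${ssh_target}"
else
    echo "No SSH client found on this machine."
fi
```

When you run the printed command, you will typically be prompted for your password before the remote shell starts.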
With those tools, you are now ready to connect to one of the session servers described in our Technologies page. Most commonly you will connect to the host
Setting up your Account
For the most part, your HutchNetID and password are all that are required to access the computational environment here. If you do have trouble accessing hosts, contact Scientific Computing (
Once you have access to these systems you’re ready to start computing. Of course, computation really requires data, so next let’s discuss where all the interesting data resides on these systems.
This guide shows the common supported storage options. Additionally, see the Data Storage section in our Wiki for more guidance about storage locations and access. All, except for the transfer drive, are available on SciComp supported systems:
What you do next will depend on the direction your work will take you. The Scientific Computing Resource Overview will have more details about the technologies available for your use.