Visualizing and sharing results is a growing part of large-scale data analysis. Because researchers need to communicate the results and implications of large, complex datasets concisely and clearly, there has been an explosion of platforms, tools, software, and approaches for data visualization. That is a boon to your research, since there is likely a good tool ready for you, but it can also make choosing a path overwhelming. Here we have tried to provide a jump start for finding and applying data visualization tools and approaches at Fred Hutch. This is not an exhaustive list; we have highlighted the resources that tend to be the most commonly used or the easiest to access.

Software for data visualization

Desktop software

Fred Hutch’s Center IT (CIT) supports a wide range of commonly used software at little to no cost to you! We’ve pulled out a shortlist of software relevant to data visualization; the rest is listed in CIT's full software catalog. Tableau, MATLAB, and Microsoft Excel are all great options for users who prefer point-and-click data visualization tools.

Plotting in R

While it is possible to plot using base R, there are many packages available to make plotting easier and more visually appealing. Data visualization in R has been dominated by the ggplot2 package and a wealth of add-on packages that allow for further customization (such as RColorBrewer for color palettes). Meanwhile, interactive web apps built with Shiny, which is also R based, are a common way to share data visualizations and lend themselves well to displaying ggplot2 and plotly visualizations.

Packages for plotting

Packages that extend ggplot capabilities

Packages for arranging plots

Packages for coloring plots

Plotting in Python

Historically, Matplotlib was the go-to library for scientific data visualization in Python. Matplotlib is still a powerful plotting tool, but its syntax is complex and its default graphics can look outdated when compared to R’s ggplot2. The seaborn library was developed as a higher-level, easier-to-use interface built on top of Matplotlib, and the plotnine library was developed to mimic ggplot2’s grammar-of-graphics plotting syntax. Still, some Python users choose to do their data processing in Python and switch to R for visualization. plotly and Altair are two options for interactive visualizations.
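To give a feel for what Matplotlib code looks like, here is a minimal sketch of a line plot with one line per group. The data, the group names, and the helper name `make_dose_response_plot` are all hypothetical and made up for illustration; the `Agg` backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

# Hypothetical example data: responses for two made-up sample groups.
doses = [1, 2, 4, 8, 16]
responses = {
    "group_a": [0.10, 0.18, 0.35, 0.62, 0.80],
    "group_b": [0.05, 0.09, 0.20, 0.41, 0.66],
}


def make_dose_response_plot(doses, responses):
    """Draw one line per group on a shared axis and return the figure."""
    fig, ax = plt.subplots(figsize=(5, 3.5))
    for label, values in responses.items():
        # Each call to ax.plot adds one labeled line to the same axes.
        ax.plot(doses, values, marker="o", label=label)
    ax.set_xlabel("Dose (arbitrary units)")
    ax.set_ylabel("Response (fraction)")
    ax.set_title("Hypothetical dose-response")
    ax.legend()
    fig.tight_layout()
    return fig


fig = make_dose_response_plot(doses, responses)
```

Calling `fig.savefig("dose_response.png", dpi=150)` afterwards would write the figure to disk. Note how every label, legend, and layout step is set explicitly; this verbosity is the kind of thing that higher-level libraries such as seaborn and plotnine aim to reduce.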


Data visualization resources at Fred Hutch

The FH-Data Slack, and more specifically the #data-viz channel, is always available as a space for researchers to ask questions and share resources about data visualization.

Books that cover data visualization

Books can be a great way to dive deeper into a specific coding subject, and fortunately many of these books are available online for free! Fundamentals of Data Visualization by Claus Wilke is a great reference for code-agnostic data visualization concepts. For language-specific references, books and documentation that cover a particular language (like Python or R) will often also cover the basics of plotting in that language.




Other data visualization resources!

Blogs and screencasts focused on data visualization can be a great way to find inspiration and think outside the box.