June SciComp Updates

Latest R: R/3.5.0-foss-2016b-fh1
Latest Python: Python/3.6.5-foss-2016b-fh3
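The module names in this post appear to follow the EasyBuild pattern of name/version-toolchain-suffix. As a rough sketch of how to pull those parts apart in a script, the helper below splits a module name on the two toolchain labels used in this post. The function name, the tuple fields, and the KNOWN_TOOLCHAINS list are illustrative assumptions, not part of any module tooling.

```python
from typing import NamedTuple, Optional

# Toolchain labels that appear in this post (assumption: extend as needed).
KNOWN_TOOLCHAINS = ("foss-2016b", "GCC-5.4.0-2.26")


class ModuleName(NamedTuple):
    name: str
    version: str
    toolchain: Optional[str]
    suffix: Optional[str]


def parse_module(full_name: str) -> ModuleName:
    """Split 'name/version-toolchain-suffix' into its parts (hypothetical helper)."""
    name, _, rest = full_name.partition("/")
    for toolchain in KNOWN_TOOLCHAINS:
        marker = "-" + toolchain
        idx = rest.find(marker)
        if idx != -1:
            version = rest[:idx]
            # Whatever follows the toolchain label is the local version suffix.
            suffix = rest[idx + len(marker):].lstrip("-") or None
            return ModuleName(name, version, toolchain, suffix)
    # No recognized toolchain: everything after the slash is the version.
    return ModuleName(name, rest, None, None)


if __name__ == "__main__":
    print(parse_module("R/3.5.0-foss-2016b-fh1"))
```

Modules built without a listed toolchain, such as sbcl/1.4.6, come back with the toolchain and suffix fields set to None.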

For a complete list of modules built with R and Python visit: https://fredhutch.github.io/easybuild-life-sciences/

TeX Live has been updated to version 2018.05.31. This new version uses libraries that are compatible with our newest versions of R: 3.4.2, 3.4.3 and 3.5.0. If you have been getting errors with the pdflatex command, please try this new build of texlive. Module name: texlive/20180531-foss-2016b

Python-3.6.0 fh3 - 155 updated packages, 25 new packages. User-requested packages: https://github.com/wdecoster/nanostat https://github.com/rrwick/Porechop https://github.com/wdecoster/NanoPlot

Nanopolish - Software package for signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more. Module name: nanopolish/0.7.1-foss-2016b

Clairvoyante - A deep neural network based variant caller. Requires DISPLAY to be set. Only works with Python2. Module name: Clairvoyante/0.1

Assembly Stats - Get assembly statistics from FASTA and FASTQ files. Module name: assembly-stats/1.0.1-foss-2016b

Minimap2 is a versatile sequence alignment program that aligns DNA or mRNA sequences against a large reference database. Typical use cases include: (1) mapping PacBio or Oxford Nanopore genomic reads to the human genome; (2) finding overlaps between long reads with error rate up to ~15%; (3) splice-aware alignment of PacBio Iso-Seq or Nanopore cDNA or Direct RNA reads against a reference genome; (4) aligning Illumina single- or paired-end reads; (5) assembly-to-assembly alignment; (6) full-genome alignment between two closely related species with divergence below ~15%. Module Name: minimap2/2.10-foss-2016b
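Each of the use cases above corresponds to a minimap2 preset selected with -x. The sketch below builds a command line for a given use case. The dictionary keys, the function name, and the file names are placeholders chosen for illustration; the preset names come from the minimap2 documentation, but check `minimap2 --help` after loading the module to confirm what this build supports.

```python
from typing import List, Sequence

# Illustrative labels (assumption) mapped to minimap2 -x presets.
PRESETS = {
    "pacbio-genomic": "map-pb",     # (1) PacBio genomic reads vs. reference
    "nanopore-genomic": "map-ont",  # (1) Nanopore genomic reads vs. reference
    "pacbio-overlap": "ava-pb",     # (2) all-vs-all overlap, PacBio
    "nanopore-overlap": "ava-ont",  # (2) all-vs-all overlap, Nanopore
    "spliced": "splice",            # (3) splice-aware long-read alignment
    "short-reads": "sr",            # (4) Illumina single- or paired-end reads
    "assembly": "asm5",             # (5) assembly-to-assembly alignment
}


def minimap2_cmd(use_case: str, reference: str, reads: Sequence[str],
                 sam: bool = True) -> List[str]:
    """Return an argv list for minimap2; nothing is executed here."""
    cmd = ["minimap2"]
    if sam:
        cmd.append("-a")  # emit SAM instead of the default PAF
    cmd += ["-x", PRESETS[use_case], reference, *reads]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; minimap2 writes alignments to stdout.
    print(" ".join(minimap2_cmd("nanopore-genomic", "ref.fa", ["reads.fq"])))
```

The resulting list can be passed to subprocess.run with stdout redirected to a file once the minimap2 module is loaded.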

IUPred - Intrinsically unstructured/disordered proteins have no single well-defined tertiary structure in their native, functional state. IUPred recognizes such regions from the amino acid sequence based on the estimated pairwise energy content. The underlying assumption is that globular proteins are composed of amino acids which have the potential to form a large number of favorable interactions, whereas intrinsically unstructured proteins (IUPs) adopt no stable structure because their amino acid composition does not allow sufficient favorable interactions to form. Module name: iupred/1.0-GCC-5.4.0-2.26

RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments, and evolutionary placement of short reads. Module name: RAxML/8.2.11-foss-2016b-hybrid-avx2

Tesseract is an OCR engine. Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co, Greeley Colorado between 1985 and 1994, with some more changes made in 1996 to port to Windows, and some C++izing in 1998. In 2005 Tesseract was open sourced by HP. Since 2006 it has been developed by Google. Module name: tesseract/4.0.0-beta.1-foss-2016b

Pindel - Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads. Module name: Pindel/0.2.5b8-foss-2016b

R 3.5.0 - The latest and largest R we have built yet, with over 800 packages. Updated with packages from BioConductor 3.7. Module Name: R/3.5.0-foss-2016b-fh1

Python 2.7.15 - 268 packages, 187 updated packages. Module Name: Python/2.7.15-foss-2016b-fh1

Leptonica - Graphics libraries for image processing and image analysis applications. Module name: leptonica/1.75.3-foss-2016b

RGEOS - R interface to Geometry Engine - Open Source (‘GEOS’) using the C ‘API’ for topology operations on geometries. The ‘GEOS’ library is external to the package, and, when installing the package from source, must be correctly installed first. Windows and Mac Intel OS X binaries are provided on ‘CRAN’. Based on package GEOS-3.6.2 and R-3.5.0. Module Name: rgeos/0.3-26-foss-2016b-R-3.5.0

awscli/1.15.16 updated. The AWS CLI changes almost weekly, so we are constantly updating this one package.

Steel Bank Common Lisp (SBCL) is a high-performance Common Lisp compiler. It is open source / free software, with a permissive license. In addition to the compiler and runtime system for ANSI Common Lisp, it provides an interactive environment including a debugger, a statistical profiler, a code coverage tool, and many other extensions. Module Name: sbcl/1.4.6

Sniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs (10bp+) using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note that the current version of Sniffles requires sorted output from BWA-MEM (use the -M and -x parameters) or NGMLR with the optional SAM attributes enabled! Module Name: Sniffles/1.0.8-foss-2016b
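As a rough illustration of the sorted-input requirement, the sketch below lays out one possible sequence of steps: align with BWA-MEM using -M and -x, sort with samtools, then call variants with Sniffles. The function name, the step layout, and the file names are assumptions made for this example, and it assumes bwa and samtools are available alongside Sniffles. The flags shown (bwa mem -M -x, samtools sort -o, sniffles -m and -v) are taken from each tool's documentation and should be confirmed against the installed versions.

```python
from typing import Dict, List, Optional, Union

Step = Dict[str, Union[List[str], Optional[str]]]


def sniffles_steps(reference: str, reads: str, prefix: str,
                   bwa_preset: str = "ont2d") -> List[Step]:
    """Describe the commands to run; nothing is executed here.

    bwa_preset is passed to 'bwa mem -x' ('ont2d' or 'pacbio').
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    vcf = f"{prefix}.vcf"
    return [
        # bwa mem writes SAM to stdout, so record where to redirect it.
        {"argv": ["bwa", "mem", "-M", "-x", bwa_preset, reference, reads],
         "stdout": sam},
        # Sniffles needs coordinate-sorted alignments.
        {"argv": ["samtools", "sort", "-o", bam, sam], "stdout": None},
        # -m: sorted BAM input, -v: VCF output.
        {"argv": ["sniffles", "-m", bam, "-v", vcf], "stdout": None},
    ]


if __name__ == "__main__":
    for step in sniffles_steps("ref.fa", "reads.fq", "sample"):
        redirect = f" > {step['stdout']}" if step["stdout"] else ""
        print(" ".join(step["argv"]) + redirect)
```

Each step can be run with subprocess.run, opening the "stdout" file for the first step, after loading the relevant modules.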

NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences sequencers. Module Name: NanoSV/1.1.2-foss-2016b

SAMtools, BCFtools, HTSlib - Samtools is a suite of programs for interacting with high-throughput sequencing data. All three have been updated to version 1.8.

Python 3.6.5 - 111 updates from 3.6.5-fh2, 8 new packages, 460 total packages. Module Name: Python/3.6.5-foss-2016b-fh3