In addition to the material here, our teaching team maintains a list of useful bioinformatics resources.
Our NGS analyses rely on the bcbio-nextgen framework, a community-developed NGS workflow that is fully documented, open source under the MIT license, and in use at over a dozen sites internationally. It is installed in both the FAS and HMS Research Computing environments and provides researchers with best-practice workflows for exome and whole-genome sequencing (built around GATK 3.0 and FreeBayes), RNA-Seq (TopHat2/STAR, Sailfish/Salmon, edgeR/DESeq2/limma) and, most recently, small RNA-seq, single-cell RNA-seq and the initial steps of ChIP-seq. Take a look at some of the key blog posts about bcbio:
- Scaling NGS pipelines for WGS
- Ensemble variant calling
- Docker for easier distribution
- Introducing RNA-Seq to bcbio
If you are working on the HMS Orchestra cluster and need a piece of bioinformatics software, we recommend looking at the BioGrids project, a bioinformatics software stack available on that cluster.
We also recommend conda as an easy way to install and run software in a shared environment without worrying about dependencies. Many general use packages can be found on Anaconda Cloud and bioinformatics software can be found through bioconda.
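As a rough illustration of the conda approach, the sketch below builds the command that would create an isolated environment and install tools from bioconda. The environment name `rnaseq`, the package selection, and the helper function `conda_create_command` are assumptions made for this example, not part of our setup. The channel order (conda-forge before bioconda) follows the usual bioconda recommendation.

```python
import shlex


def conda_create_command(env_name, packages, channels=("conda-forge", "bioconda")):
    """Build a `conda create` command for a named environment.

    Hypothetical helper for illustration only; it returns the argument
    list and does not run conda itself.
    """
    if not packages:
        raise ValueError("at least one package is required")
    cmd = ["conda", "create", "--yes", "--name", env_name]
    # Channels are searched in the order given.
    for channel in channels:
        cmd += ["--channel", channel]
    cmd += list(packages)
    return cmd


# Example: an environment holding two of the RNA-Seq tools mentioned above.
command = conda_create_command("rnaseq", ["salmon", "star"])
print(shlex.join(command))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(command, check=True)` on a machine where conda is installed.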
Computational and Data Storage Resources
We work closely with and use the computational resources of both:
- the Odyssey high performance cluster at the FAS Research Computing Center
- the Orchestra high performance cluster at Harvard Medical School Research Computing
Data storage needs increase by the day. To help keep storage costs manageable, we contribute to an important initiative at HMS to develop and propagate good data management practices and resources across the Harvard community. For details and tips on managing your data, see the HMS data management page.
Next Generation Sequencing
We work with data from any sequencing core but currently work most closely with the: