Welcome to the Harvard Informatics Bioinformatics Tips & Tricks workshop!

This web page will guide you through the activities we have planned for each day of the workshop.

Instructors

Tim Sackton: Director of the FAS Informatics group at Harvard University.

Danielle Khost: A bioinformatics scientist in the FAS Informatics group at Harvard University.

Nathan Weeks: A research application developer in the FAS Informatics group at Harvard University.

Gregg Thomas: A bioinformatics scientist in the FAS Informatics group at Harvard University and a recent postdoc at the University of Montana, where he studied the phylogenetics and comparative genomics of the mouse and rat radiation. He earned his PhD at Indiana University, where he worked on comparative genomics of arthropods, mutation rate evolution in primates, and convergent evolution. In general, Gregg uses and develops computational methods to study molecular evolution and phylogenetics, with the goal of determining what forces drive divergence and adaptation between species.

Workshop Summary & Outline

This workshop aims to introduce students to some basic bioinformatics file formats, tools, and general best practices. The first two days of the workshop will introduce bioinformatics file formats and the command line tools that we use to view, manipulate, and analyze them. After that, we will shift from using individual commands to writing scripts and constructing bioinformatics workflows, including setting up environments with conda and interacting with SLURM, the job scheduling software on the cluster.

Here is a brief outline of the topics we'll be covering:

Day 1: Bioinformatics Tools, part 1

Wednesday March 22nd, 9:30 am - 12:30 pm: Biolabs 2062/2064
  • Sequence files (FASTA, FASTQ)
  • Introduction to commands useful for bioinformatics (grep, awk)
  • Alignment files (BAM/SAM) and samtools
  • Introduction to piping and redirecting

Day 2: Bioinformatics Tools, part 2

Thursday March 23rd, 9:30 am - 12:30 pm: Northwest Building, 353
  • More on piping and redirecting
  • Interval files (BED, GFF)
  • More on grep and awk
  • Introduction to bedtools

Day 3: Workflows, part 1

Wednesday March 29th, 9:30 am - 12:30 pm: Biolabs 2062/2064
  • Variant files (VCF)
  • Introduction to bcftools
  • Shell scripting

Day 4: Workflows, part 2

Thursday March 30th, 9:30 am - 12:30 pm: Northwest Building, 353
  • Conda/mamba environments
  • Installing software with conda/mamba
  • Interacting with the cluster using SLURM
  • Job scripts and submitting jobs to the cluster

Click the Get Started link below to read some info before class. Additional links to resources will appear for each day of the workshop.


Get Started