Assembly and Annotation of Genomes


12-16 February 2018



This course will introduce biologists and bioinformaticians to the concepts of de novo genome assembly and annotation. Data from different sequencing technologies (Illumina, PacBio, Oxford Nanopore and possibly 10x Genomics) will be combined with approaches such as read correction and Hi-C scaffolding to generate good draft assemblies. Particular attention will be given to quality control of the assemblies and to understanding how errors arise. Annotation tools that use RNA-Seq data will also be introduced, and an outlook on potential downstream analyses will be given. By the end of the course, students should understand what is needed to generate a good annotated genome.


Target Audience & Assumed Background

The course is aimed at researchers interested in learning more about genome assembly and annotation. It will include information useful for both beginners and more advanced users. We will start by introducing general concepts and then describe, step by step, all major components of a genome assembly and annotation workflow, from raw data all the way to a final assembled and annotated genome. There will be a mix of lectures and hands-on practical exercises on the Linux command line.

Attendees should have a background in biology. We will dedicate one session to basic and more advanced Linux concepts. Attendees should also have some familiarity with genomic data, such as the data produced by NGS sequencers.

Learning outcomes

-       Understand the concepts and quality of de novo assembly and annotation for genomes of all sizes, from viruses to mammals

-       Learn the advantages of the different sequencing technologies, e.g. Illumina, Pacific Biosciences and Oxford Nanopore, for de novo assembly, and how to assess the quality of genome sequences

-       Gain hands-on experience with common tools for de novo sequence assembly, including visualization, contig ordering, scaffolding and error correction

-       Gain hands-on experience with gene finding, including the use of RNA-Seq data

-       Be comfortable assembling and annotating genomes




Monday 12th – Classes from 09:30 to 17:30 - “getting started”


Session 1: Introduction (morning)

In this session I will kick off with an introductory lecture about genome assembly and annotation - the past, the present and the future. I will use this introduction to motivate the five-day course. Next, I will explain the use of the virtual machine (VM) and of cloud computing. This is followed by a short introduction to Linux (although I would prefer that students already know a bit of Linux). During the morning we will run our first assembly and put it through an annotation tool (Companion).