22 – 26 February 2021
Due to the COVID-19 outbreak, this course will be held online
New advances in sequencing technologies have opened the door to more contiguous genome assemblies due to the increased length of obtained fragments. Although this comes at the cost of lower accuracy, a broad range of algorithms has been developed to cope with it.
This course will introduce the audience to a spectrum of methods that are present in a typical assembly workflow, starting from raw data and finishing with a fully assembled genome. We will see
how to obtain nucleotide sequences from raw signals, dive deeper into the most used assembly paradigm for long fragments, try out and compare several state-of-the-art assemblers, and finally
assess the quality of the obtained assembly with and without a reference genome.
Structured over five days, this course consists of both theoretical and practical aspects, which are intertwined throughout each day. The presented theoretical foundation will be applied to small
bacterial datasets and visualized in order to better grasp the algorithms at hand.
This course is intended for researchers interested in learning the concepts of algorithms for de novo genome assembly with Oxford Nanopore Technologies data. Both beginners and more advanced users will find useful information in the presented material. Course attendees should bring a laptop with either macOS or any Unix version. Some background in using the mentioned operating systems via the command line is desirable, but we will cover the needed essentials throughout the hands-on sessions.
Monday
Session 1: Introduction
This course starts with a general introduction to sequencing and assembly. The audience will get familiar with Oxford Nanopore sequencing: how it works, and its advantages and disadvantages.
Afterwards, we will transform a subset of a bacterial dataset, containing electric current signals, into a set of nucleotide sequences with a higher error rate than those of previous generations of
sequencing.
Session 2: Stitching fragments
Sequencing technologies are still unable to read the whole genome at once; therefore, the obtained fragments need to be joined together. We will first try to use sequence alignment, the basis of
many bioinformatics tools. As alignment is not feasible for larger amounts of data, we will investigate a heuristic approach that uses short substrings of predefined length (Minimap). We will discuss
the trade-off between execution time and sensitivity, and its impact on assembly contiguity, and apply this method to a small bacterial dataset.
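To give a flavour of the heuristic mentioned above, here is a minimal Python sketch of seeding overlaps with short substrings of fixed length (k-mers). It keeps only the lexicographically smallest k-mer in each window of w consecutive k-mers (a "minimizer") and reports positions where two reads share one. This is an illustration under simplifying assumptions, not Minimap's actual implementation: the function names, the toy sequence, and the values of k and w are hypothetical, k-mers are compared as plain strings, and the reverse strand and sequencing errors are ignored.

```python
def kmer_minimizers(seq, k=5, w=4):
    """Return the set of (kmer, position) minimizers of seq.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer is kept (ties are broken by the leftmost position).
    """
    kmers = [(seq[i:i + k], i) for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()
    picked = set()
    # If the sequence holds fewer than w k-mers, treat it as one window.
    for start in range(max(1, len(kmers) - w + 1)):
        picked.add(min(kmers[start:start + w]))
    return picked


def shared_minimizers(a, b, k=5, w=4):
    """Return sorted (pos_a, pos_b) pairs where a and b share a minimizer."""
    index = {}
    for kmer, pos in kmer_minimizers(a, k, w):
        index.setdefault(kmer, []).append(pos)
    hits = []
    for kmer, pos_b in kmer_minimizers(b, k, w):
        for pos_a in index.get(kmer, []):
            hits.append((pos_a, pos_b))
    return sorted(hits)


# Toy example: two error-free "reads" cut from the same made-up sequence,
# overlapping by 15 bases (read_b starts 15 bases after read_a).
genome = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGTATGCAAGTC"
read_a = genome[0:30]
read_b = genome[15:45]

for pos_a, pos_b in shared_minimizers(read_a, read_b):
    print(pos_a, pos_b, read_a[pos_a:pos_a + 5])
```

Hits that agree on the same offset between the two reads hint at an overlap without aligning every base. In this sketch, larger k or w means fewer seeds to compare (faster) but a higher chance of missing true overlaps in noisy reads, which is the time versus sensitivity trade-off discussed in this session.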
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee
< 30 days before the start date = No Refund.
Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.