Deep Learning methods in population genomics and phylogeography

Dates

20th - 23rd February 2023

 

Due to the COVID-19 outbreak, this course will be held online

 

Course overview

In recent years, deep learning techniques have been increasingly used in evolutionary studies because they are flexible and data-hungry, which makes them well suited to analyzing large and complex genomic datasets. This course will focus on using deep learning, specifically Convolutional Neural Networks (CNNs), to extract information from genetic data for population genomics and phylogeographic inference. The theoretical background for simulating genetic data and developing deep learning architectures will be covered and followed by practical examples, in modules structured over four days. On Day 1, participants will learn how to conceive and simulate genetic data under competing demographic scenarios. Day 2 will cover the background of deep learning, with simple practical examples that show how a CNN works. On Day 3, deep learning will be used to compare the demographic scenarios conceived on Day 1. Day 4 will be dedicated to simulating genomic regions with selective sweeps and using CNNs to detect such regions in real genomes. The course combines lectures and discussions of key concepts with practical hands-on sessions, contextualised with research case studies.

 

Target audience and assumed background

The course is aimed at graduate students, researchers and professionals interested in genetics, evolution and deep learning who want to develop applications to test explicit demographic hypotheses and search for selective sweeps. The course will cover the general concepts of genetic data simulation and deep learning, and will also include more advanced discussion of their internal machinery. The examples discussed during the course will span datasets from both model organisms, for which whole genomes are available, and non-model organisms, for which less information is available.

 

Program

 

Monday – Classes from 2-8 PM Berlin time

- Basic concepts of simulation-based, likelihood-free methods

- Modelling demographic history with genetic data simulators

- Practical: building a script to simulate genetic data under competing demographic scenarios


Tuesday – Classes from 2-8 PM Berlin time


- A gentle introduction to supervised deep learning

- Understanding the basic CNN architecture for image recognition

- Practical: image classification with a CNN



Wednesday – Classes from 2-8 PM Berlin time

- Using CNNs to learn directly from genetic data

- The building blocks of a CNN script

- Practical: comparing demographic scenarios with deep learning


Thursday – Classes from 2-8 PM Berlin time


- Introduction to approaches for detecting selection

- Recognizing signatures of selection with deep learning

- Practical: simulating genetic data and using a CNN to predict whether a given locus is under selection


 


Cost overview

 

Package 1

480 €


Should you have any further questions, please send an email to info@physalia-courses.org

Cancellation Policy:

 

More than 30 days before the start date: 30% cancellation fee

Fewer than 30 days before the start date: no refund

 

Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.