Model-based demographic inference from population genomics

Dates

27 - 31 May 2024

To foster international participation, this course will be held online

OVERVIEW

The advent of large-scale sequencing technologies represents a unique opportunity to learn about the evolutionary history of natural populations. Furthermore, a thorough characterization of non-adaptive models is necessary for investigating adaptive processes. This course will focus on inferring demographic models from genomic datasets (e.g. whole genomes or RAD-seq) from model and non-model organisms using single nucleotide polymorphisms (SNPs) and the site-frequency spectrum (SFS). Theoretical background on population genomics and coalescent theory will be provided through a set of lectures, followed by hands-on exercises using simulated and real data. Participants will compute the SFS from genomic files (in VCF format), formalise hypotheses and design demographic models, estimate demographic parameters (e.g. divergence times, effective population sizes) from the data and compare alternative scenarios of evolution. This course will provide the participants with the theoretical and practical knowledge required to infer the demographic history from any standard next-generation sequencing (NGS) dataset on their own.
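As a rough illustration of what the SFS summarises, the minimal Python sketch below counts how many sites carry each number of derived-allele copies and then folds the spectrum for the case where the ancestral state is unknown. The function names, the 0/1/2 genotype coding and the toy data are assumptions made for this example only; they are not course material, and the course itself works from VCF files and fastsimcoal2.

```python
def unfolded_sfs(sites, n_chrom):
    """Count sites by number of derived-allele copies.

    sites: list of sites, each a list of per-individual derived-allele
           counts (0, 1 or 2 for diploids); no missing data assumed.
    n_chrom: total number of sampled chromosomes (2 x diploid individuals).
    """
    sfs = [0] * (n_chrom + 1)
    for site in sites:
        sfs[sum(site)] += 1
    return sfs


def fold_sfs(sfs):
    """Fold the spectrum: class i and class n - i become one minor-allele class."""
    n = len(sfs) - 1
    folded = [0] * (n // 2 + 1)
    for i, count in enumerate(sfs):
        folded[min(i, n - i)] += count
    return folded


# Toy data (hypothetical): 5 sites genotyped in 3 diploid individuals.
sites = [
    [0, 0, 0],  # monomorphic, 0 derived copies
    [1, 0, 0],  # singleton
    [2, 1, 0],  # 3 derived copies
    [2, 2, 1],  # 5 derived copies
    [0, 1, 0],  # singleton
]

sfs = unfolded_sfs(sites, n_chrom=6)
folded = fold_sfs(sfs)
print(sfs)     # [1, 2, 0, 1, 0, 1, 0]
print(folded)  # [1, 3, 0, 1]
```

Entry 0 of either spectrum holds the monomorphic sites, which is why their number has to be tracked alongside the SNPs.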

Target audience and assumed background

This course is intended for graduate students, postdocs and any researchers interested in population genomics and statistical inference, in particular those who want an introduction to model-based demographic inference from NGS data. Participants are expected to have a basic background in population genetics, but no previous programming/scripting experience is required. We will provide an introduction to R and the UNIX command line in the context of using high-performance computing (HPC) facilities. However, we strongly encourage you to go through these online tutorials before the course: 1) R tutorial; 2) Bash tutorial. Most of the exercises will be done using fastsimcoal2, but other simulation/inference tools will be discussed.

Background reading

Rosenberg, N., Nordborg, M. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat Rev Genet 3, 380–390 (2002).

Schraiber, J., Akey, J. Methods and models for unravelling human evolutionary history. Nat Rev Genet 16, 727–740 (2015).

Excoffier, L., Marchi, N., Marques, D. A., Matthey-Doret, R., Gouy, A., Sousa, V. C. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics 37, 4882–4885 (2021).

Johri, P., Aquadro, C. F., Beaumont, M., Charlesworth, B., Excoffier, L., Eyre-Walker, A., et al. Recommendations for improving statistical inference in population genomics. PLoS Biol 20(5), e3001669 (2022).

Program

Monday. 2-8 pm Berlin time
NGS data: sequencing/genotyping technologies (whole genomes, RAD-seq, pooling)
Short introduction to population genetics concepts
Practicals: 1) Introduction to Unix, R and RStudio; 2) Filtering VCFs (from WGS and RAD-seq) to obtain SNPs for demographic inference.

Tuesday. 2-8 pm Berlin time
Introduction to population genomics and coalescent theory
Definition and computation of the site frequency spectrum
Practicals: 1) Simulating coalescent trees under alternative models; 2) Computing SFS from simulated and real data.

Wednesday. 2-8 pm Berlin time
Designing demographic models
Setting up fastsimcoal2 (and other software packages) input files
Practicals: 1) Handling missing data and low depth of coverage; 2) Computing the number of monomorphic sites and the folded SFS; 3) Defining models and simulating data

Thursday. 2-8 pm Berlin time
Demographic history inference: parameter estimation
Examples of demographic history inference: common aspects and specificities.
Practicals: 1) Simulating data with ms and DaDi - part II; 2) inferring demographic parameters from real data.

Friday. 2-8 pm Berlin time
Comparison of demographic models: model choice using fastsimcoal2 and other available R packages
Discussion of advantages and limitations of different approaches (e.g., PSMC/MSMC-like approaches)
Practicals: 1) Model comparison (real data); 2) bootstrap parameter estimation
Final discussion

Instructors

Dr Isabel Alves (University of Strasbourg, Fr)

Dr Camille Roux (CNRS - University of Lille, Fr)

Cost overview

Package 1

530 €

Cancellation Policy:

More than 30 days before the start date: 30% cancellation fee.

Less than 30 days before the start date: no refund.

Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.