Inferring demographic history from population genomics data


12 - 16 July 2021

Due to the COVID-19 outbreak, this course will be held online


This course will provide a comprehensive introduction to population genomics data and to methods for modelling and inferring the demographic history of populations/species. It will focus on methods based on single nucleotide polymorphisms (SNPs) and the site frequency spectrum (SFS), which can be obtained from whole-genome or reduced-representation (e.g., GBS, RAD) sequencing data from model and non-model species.


The course will take participants through all the steps required to prepare the data, run demographic history inference, and interpret the results. Specifically, it will go from generating the SFS from VCF files to formalizing hypotheses as demographic models with parameters of interest (e.g., past effective sizes, migration rates and times of split).
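As a rough illustration of the first of these steps, the sketch below counts alternate alleles per site in a handful of VCF records and tabulates them into an unfolded and a folded SFS. It is a minimal sketch, not the course's own pipeline: the function names `unfolded_sfs` and `fold_sfs` and the toy records are hypothetical, samples are assumed diploid, the ALT allele is assumed to be the derived allele, and sites with any missing genotype are simply skipped.

```python
def unfolded_sfs(vcf_lines, n_samples):
    """Count sites by number of ALT allele copies (0..2N) in diploid samples.

    Assumptions: ALT is treated as the derived allele, and sites with any
    missing or non-biallelic genotype are skipped.
    """
    n_chrom = 2 * n_samples
    sfs = [0] * (n_chrom + 1)
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        alleles = []
        for sample in fields[9:9 + n_samples]:
            genotype = sample.split(":")[0]  # GT is the first FORMAT field
            alleles.extend(genotype.replace("|", "/").split("/"))
        if len(alleles) != n_chrom or any(a not in ("0", "1") for a in alleles):
            continue  # missing data or more than two alleles
        sfs[alleles.count("1")] += 1
    return sfs


def fold_sfs(sfs):
    """Fold an unfolded SFS into minor-allele-count classes."""
    n_chrom = len(sfs) - 1
    folded = [0] * (n_chrom // 2 + 1)
    for k, count in enumerate(sfs):
        folded[min(k, n_chrom - k)] += count
    return folded


# Toy records for three hypothetical diploid individuals.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\tind3",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t0/0",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/1\t1/1\t0/0",
    "chr1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t./.\t0/1\t0/0",  # skipped: missing
    "chr1\t400\t.\tT\tC\t.\tPASS\t.\tGT\t0|1\t0/0\t0/0",
]

sfs = unfolded_sfs(toy_vcf, n_samples=3)
print(sfs)            # [0, 2, 0, 1, 0, 0, 0]
print(fold_sfs(sfs))  # [0, 2, 0, 1]
```

Skipping every site with a missing genotype is the simplest possible choice and discards data quickly as sample size grows; approaches that retain such sites are outside this sketch.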


Participants will learn how to compare alternative models (model choice) and estimate parameters, as well as simulate data under different demographic history scenarios. By combining lectures addressing key concepts in population genomics with hands-on exercises, participants will learn key approaches used in population genomics that can be applied to several species and types of data, including low coverage and ancient DNA. After completing the course, participants should be able to begin using NGS data to model and infer the demographic history in their study system of choice.


This course is designed for researchers and graduate students with a strong interest in using population genomics NGS data to reconstruct the demographic history of species. The course will mainly focus on the analysis of single nucleotide polymorphism/variant (SNP or SNV) data and on methods based on the site frequency spectrum (SFS).

We will provide theoretical lectures and hands-on exercises with examples of whole-genome and RAD sequencing data, i.e., we will provide population genomics datasets from model/non-model organisms to be analysed during the course. Participants are encouraged to bring their own data, as they will have the opportunity to test the methods on them. In the tutorials we will use the UNIX command line and R.


Assumed Background

Participants should have a basic background in evolution and population genetics. No programming or scripting expertise is required, but previous experience with the UNIX command line and R is an advantage. We will provide a very brief introduction to UNIX and R, but we ask participants without previous experience in R to go through this tutorial (from Mark Ravinet), and participants without previous knowledge of the UNIX command line to go through this tutorial, before the course. Hands-on exercises will be run in a Linux environment on remote servers. Visualization of results and statistical analyses will be run in R using RStudio.

Learning Outcomes

1)   Handling NGS data from VCF (variant call format) files
2)   Computing and interpreting basic population genetic statistics and the site frequency spectrum (SFS)
3)   Building the SFS for both model and non-model species from NGS data, accounting for missing data and low depth of coverage
4)   Modeling and inferring demographic history based on the SFS
5)   Understanding the potential and limitations of different methods and approaches
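To give a flavour of what interpreting the SFS can involve, the sketch below computes the expected shape of the unfolded SFS under the standard neutral model with constant population size, where the expected number of sites with i derived copies is proportional to 1/i. This is an illustration only and is not taken from the course material; the function name `neutral_sfs_proportions` is hypothetical, and the baseline assumes no selection, no population structure and an infinite-sites mutation model.

```python
def neutral_sfs_proportions(n_chrom):
    """Expected proportion of segregating sites in each derived-allele class.

    Under the standard neutral model with constant size, class i
    (i = 1 .. n_chrom - 1) has expected weight proportional to 1 / i.
    """
    weights = [1.0 / i for i in range(1, n_chrom)]
    total = sum(weights)
    return [w / total for w in weights]


# Expected proportions for a sample of 6 chromosomes (3 diploid individuals).
for i, p in enumerate(neutral_sfs_proportions(6), start=1):
    print(f"{i} derived copies: {p:.3f}")
```

An observed SFS with more singletons than this baseline is often read as a signal of population expansion, and one with fewer as a signal of contraction, although other processes can produce similar patterns.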


Monday. 2-8 pm Berlin time


·       Introduction to Unix, R and RStudio
·       Introduction to NGS data (e.g. RADseq, whole genomes) and genetic variants (VCF files)
·       Filtering VCFs to obtain SNPs for demographic inference
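The filtering step in the last item can be sketched as follows: keep only biallelic SNPs whose fraction of missing genotypes is below a threshold. This is a minimal sketch under stated assumptions rather than the procedure used in the course; the function name `is_usable_snp`, the default threshold of 0.2 and the toy records are hypothetical, and real filtering usually also considers depth and genotype quality.

```python
BASES = {"A", "C", "G", "T"}


def is_usable_snp(record, max_missing=0.2):
    """Return True for a biallelic SNP with an acceptable missing-data rate.

    Assumptions: `record` is one tab-separated VCF data line with GT as the
    first FORMAT field; a genotype containing "." counts as missing.
    """
    fields = record.rstrip("\n").split("\t")
    ref, alt = fields[3].upper(), fields[4].upper()
    # A single-base REF and a single-base ALT exclude indels and
    # multiallelic sites such as ALT = "T,G".
    if ref not in BASES or alt not in BASES:
        return False
    genotypes = [sample.split(":")[0] for sample in fields[9:]]
    if not genotypes:
        return False
    n_missing = sum("." in gt for gt in genotypes)
    return n_missing / len(genotypes) <= max_missing


# Toy records for three hypothetical individuals.
records = [
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t0/0",    # kept
    "chr1\t200\t.\tAT\tA\t.\tPASS\t.\tGT\t0/1\t0/0\t0/0",   # indel
    "chr1\t300\t.\tC\tT,G\t.\tPASS\t.\tGT\t0/1\t0/2\t0/0",  # multiallelic
    "chr1\t400\t.\tG\tA\t.\tPASS\t.\tGT\t./.\t./.\t0/1",    # too much missing
]

kept = [r for r in records if is_usable_snp(r)]
print(len(kept))  # 1
```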






Dr Vitor C Sousa



Teaching Assistants



Dr Barbara Parreira



Cost overview

Cancellation Policy:




> 30 days before the start date = 30% cancellation fee

< 30 days before the start date = No Refund.




Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.