30 October - 17 November 2023
To foster international participation, this course will be held online.
This course will provide biologists and bioinformaticians with practical statistical analysis skills to perform rigorous analysis of high-throughput genomic data. The course assumes basic familiarity with genomics and with R programming, but does not assume prior statistical training. It covers the statistical concepts necessary to analyze genomic and transcriptomic high-throughput data generated by next-generation sequencing, including: hypothesis testing, data visualization, genomic region analysis, differential expression analysis, and gene set analysis.
Come to the first class with the following installed:
● R and Bioconductor: www.bioconductor.org/install
● RStudio: https://www.rstudio.com/products/rstudio/download3/
● Modern Statistics for Modern Biology (by Holmes and Huber)
● The Bioconductor 2018 Workshop collection
** Session 1 – Introduction (Mon, Oct 30, 11 AM-2 PM, Berlin time)
- Introduction to R / RStudio
- Creating high-quality graphics in R
** Session 2 – Hypothesis testing (Wed, Nov 1, 11 AM-2 PM, Berlin time)
- CDF, p-value, binomial test
- types of error, t-test, permutation test
** Session 3 - Introduction to Bioconductor (Fri, Nov 3, 3-6 PM, Berlin time)
- Introduction to Bioconductor
- Working with genomic region data in Bioconductor (GenomicRanges)
** Session 4 - Tidyverse (Mon, Nov 6, 3-6 PM, Berlin time)
- Motivation and introduction to tidy analysis
- Useful packages and paradigms, integration with ggplot2
- Why tidy analysis works for genomics
** Session 5 - RNA-seq data analysis (Wed, Nov 8, 12-3 PM, Berlin time)
- Characteristics of RNA-seq data
- Storing and analyzing RNA-seq data in Bioconductor (SummarizedExperiment)
** Session 6 - Genomic Data Visualization (Fri, Nov 10, 3-6 PM, Berlin time)
- Visualization of genomic region data using the Gviz package
- Visualization of gene expression data using the ComplexHeatmap package
** Session 7 - Differential expression analysis (Mon, Nov 13, 12-3 PM, Berlin time)
- Multiple hypothesis testing
- Performing differential expression analysis with DESeq2
** Session 8 - Gene set analysis (Wed, Nov 15, 12-3 PM, Berlin time)
- A primer on terminology, existing methods & statistical theory
- GO/KEGG overrepresentation analysis
- Functional class scoring & permutation testing
** Session 9 - Bioconductor tidy workflows (Fri, Nov 17, 3-6 PM, Berlin time)
- Tidy analysis of GenomicRanges datasets
- Genomic overlaps as "joins"
- Performing genomic enrichment analysis with bootstrapping and matching (nullranges package)
- Tidy analysis for transcriptomics: tidybulk, tidySingleCellExperiment, tidyseurat
(Center for Computational Biomedicine, Harvard Medical School, USA)
530 €
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee.
< 30 days before the start date = No refund.
Physalia-courses cannot be held responsible for any travel fees, accommodation, or other expenses incurred by you as a result of the cancellation.