Introduction to statistics in R for biologists and ecologists

Dates

15th-18th July 2024

To foster international participation, this course will be held online

 

 

 

Course overview

This course is for scientists and practitioners who want to apply statistical approaches in their daily work, using R as a working environment. Participants will be introduced to the mysteries of R and RStudio while learning how to perform common statistical analyses. After a short introduction to R and its principles, the focus will be on questions that can be addressed with common statistical analyses, covering both descriptive statistics and statistical inference.

TARGET AUDIENCE & ASSUMED BACKGROUND

The course is aimed at early-career researchers (PhD students, early postdocs) and any other practitioners interested in widening their analytical toolbox. It is structured so that even an attendee with no prior experience can benefit from carrying out statistical analyses in R. There will be a mix of lectures and hands-on practical exercises using R, which is freely available software, and online resources.
The course is intended for beginners with no prior knowledge of statistics, programming, or the R language, but with a keen interest in using R as a platform for statistical analyses. All scripts will be carefully explained so that all attendees understand the rationale and usage of the statistical approaches.

LEARNING OUTCOMES

1. Understand how to read, interpret and write scripts in R.

2. Learn statistical tools to address common questions in research activities.

3. Gain an introduction to efficient, readable and reproducible analyses.

4. Become comfortable using R for both descriptive and inferential statistics.

 

Program

 

 Monday – Classes from 2 to 8 pm Berlin time

 

 Session 1: Data Insights

In this session, we will introduce the tidyverse range of packages and how they can be used for data science. We will progress through a data workflow example, learning how to import, check and clean data and start building graphs with the ggplot2 R package. We will cover how to create accurate, clear and beautiful graphs and the rationale behind using data visualisations to understand our data.


Session 2: Descriptive statistics

We will introduce different types of data and the approaches to understanding and summarising them. We will also develop insights from our data using descriptive statistics of centrality and dispersion.
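As an illustration of what "centrality and dispersion" mean in practice, here is a minimal sketch in base R. It is not taken from the course materials; the built-in `iris` dataset and the variable names are assumptions chosen only for the example.

```r
# Built-in 'iris' data: sepal length (cm) measured on three iris species
x <- iris$Sepal.Length

# Measures of centrality
centre <- c(mean = mean(x), median = median(x))

# Measures of dispersion
spread <- c(sd = sd(x), iqr = IQR(x), min = min(x), max = max(x))

# Per-group summary: mean and standard deviation for each species
by_species <- aggregate(
  Sepal.Length ~ Species,
  data = iris,
  FUN = function(v) c(mean = mean(v), sd = sd(v))
)

print(round(centre, 2))
print(round(spread, 2))
print(by_species)
```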


 

Tuesday – Classes from 2 to 8 pm Berlin time

 


Session 3: Introduction to inferential statistics

We will spend session three developing our skills in inferential statistics to make generalisations about data. We will practice with common statistical tests and learn how to use estimates and measures of uncertainty (standard error, confidence intervals) in our analyses.
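To make the idea of an estimate with its uncertainty concrete, the sketch below computes a standard error and a 95% confidence interval for a mean by hand and cross-checks the result against `t.test()`. It is an illustrative example, not course material; the `iris` dataset is an assumption.

```r
# Sample: sepal lengths (cm) from the built-in 'iris' data
x <- iris$Sepal.Length
n <- length(x)

# Standard error of the mean: sample standard deviation / sqrt(sample size)
se <- sd(x) / sqrt(n)

# 95% confidence interval based on the t distribution with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)
ci <- mean(x) + c(-1, 1) * t_crit * se

# Cross-check: t.test() reports the same interval
ci_check <- t.test(x)$conf.int

print(se)
print(ci)
print(ci_check)
```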


Session 4: Hypothesis testing

We will practice hypothesis formation and testing and discuss probability distributions (normal, z, chi-square, F), p-values and significance testing.


 
Wednesday – Classes from 2 to 8 pm Berlin time

Session 5: Introduction to linear models

We will go through multiple case studies to apply linear models (regression, t-test, ANOVA), balancing theory and application. We will discuss the process of turning model outputs into results summaries for publication.
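Regression, the t-test and ANOVA can all be written as linear models with `lm()`. The following is a minimal sketch of that idea, assuming the built-in `mtcars` dataset; it is not one of the course case studies.

```r
# Regression: continuous predictor (car weight) explaining fuel economy
fit_reg <- lm(mpg ~ wt, data = mtcars)

# Recode transmission (0/1) and cylinder count as factors for the group comparisons
cars <- transform(
  mtcars,
  am  = factor(am, labels = c("automatic", "manual")),
  cyl = factor(cyl)
)

# t-test as a linear model: a two-level factor as the only predictor
fit_t <- lm(mpg ~ am, data = cars)
p_lm <- summary(fit_t)$coefficients["ammanual", "Pr(>|t|)"]

# The classical equal-variance t-test gives the same p-value
p_tt <- t.test(mpg ~ am, data = cars, var.equal = TRUE)$p.value

# One-way ANOVA as a linear model: a factor with three levels
fit_aov <- lm(mpg ~ cyl, data = cars)

print(coef(fit_reg))
print(c(lm = p_lm, t_test = p_tt))
print(anova(fit_aov))
```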

 
Session 6: Complex linear models

In this session, we will work through complex model building, adding multiple categorical and continuous explanatory variables to linear models. We will practise and discuss working with complex models, testing for interaction effects and refining models.
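One common way to test for an interaction is to fit the model with and without the interaction term and compare the two. The sketch below assumes the built-in `ToothGrowth` dataset (tooth length by supplement type and dose) purely for illustration.

```r
# Additive model: one categorical (supp) and one continuous (dose) predictor
fit_add <- lm(len ~ supp + dose, data = ToothGrowth)

# Interaction model: the effect of dose is allowed to differ between supplements
# (supp * dose expands to supp + dose + supp:dose)
fit_int <- lm(len ~ supp * dose, data = ToothGrowth)

# F-test comparing the nested models: does the interaction term improve the fit?
comparison <- anova(fit_add, fit_int)

print(coef(fit_int))
print(comparison)
```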

 
Thursday – Classes from 2 to 8 pm Berlin time

Session 7: Introduction to generalised linear models

We will spend this session working with generalised linear models, extending the linear model framework to response variables that are not normally distributed (e.g. Poisson, binomial).
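A minimal sketch of the two families mentioned above, using `glm()` from base R. The built-in `warpbreaks` (counts) and `mtcars` (binary outcome) datasets are assumptions for the example and are not necessarily those used in the course.

```r
# Poisson GLM: number of warp breaks (a count) by wool type and tension
fit_pois <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# Coefficients are on the log scale; exponentiate to get multiplicative effects
rate_ratios <- exp(coef(fit_pois))

# Binomial GLM (logistic regression): manual transmission (0/1) by car weight
fit_bin <- glm(am ~ wt, family = binomial, data = mtcars)

# Fitted values are predicted probabilities between 0 and 1
p_hat <- fitted(fit_bin)

print(rate_ratios)
print(summary(fit_bin)$coefficients)
```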

 

Session 8: Introduction to linear mixed models
 
In this session, we will practise working with mixed models. Ecological and biological data are often complicated and messy: observations fall into clusters, so data points are not truly independent. We will practise the proper implementation of these tricky but useful models.
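To show what clustering looks like in a model, the sketch below fits a random-intercept model with the `nlme` package, which ships with standard R installations. The choice of `nlme` and of its `Orthodont` dataset (repeated measurements on 27 children) is an assumption for this example; the course may well use a different package such as `lme4`.

```r
library(nlme)

# Ordinary linear model: treats all 108 measurements as independent
fit_lm <- lm(distance ~ age, data = Orthodont)

# Mixed model: a random intercept per child accounts for repeated measurements
fit_mixed <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# Fixed effects: population-level intercept and slope for age
print(fixef(fit_mixed))

# Variance components: between-child variation vs residual variation
print(VarCorr(fit_mixed))
```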


Instructor

Dr. Philip Leftwich (University of East Anglia, UK)


Cost overview

Package 1: 480 €

 


Cancellation Policy:

 

> 30 days before the start date = 30% cancellation fee

< 30 days before the start date = No refund

 

Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses you incur as a result of the cancellation.