Efficient data analysis with data.table in R

Dates

3rd September 2025

To foster international participation, this course will be held online.

 

Course overview

data.table is one of the most powerful and efficient R packages for fast data manipulation. Built on highly optimized C code, it enables rapid and memory-efficient operations such as grouping, summarizing, reshaping, and updating data tables. It also offers fast alternatives to R’s base functions for reading and writing large datasets. This hands-on tutorial will introduce participants to the fundamentals of data.table through live coding demonstrations and practical exercises. Participants will learn how to use data.table within a typical data analysis workflow — from data import and exploration to transformation and export. This course also serves as a stepping stone to more advanced data.table functionalities, including the use of special symbols and combined operations.

 

Requirements

  • Basic familiarity with R and the RStudio environment

  • Understanding of basic R data structures (e.g., vectors, data frames)

  • No prior experience with data.table is required

Learning outcomes

By the end of this course, participants will be able to:

  • Read data efficiently using fread() and write results with fwrite()

  • Understand and apply the core syntax of data.table

  • Perform fast and efficient row subsetting and column operations

  • Use data.table for data exploration, transformation, and summarization

  • Begin exploring more advanced functionalities such as special symbols and combined operations
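As an illustration of the first outcomes listed above, here is a minimal sketch of a read, transform, summarize workflow. The dataset, the column names (station, temp), and the temporary file are hypothetical and are not taken from the course materials; only fread(), fwrite(), and the general DT[i, j, by] form come from data.table itself.

```r
library(data.table)

# Hypothetical example data; the course datasets are not specified here.
dt <- data.table(
  station = c("A", "A", "B", "B", "C"),
  temp    = c(12.1, 14.3, 9.8, 11.0, 15.2)
)

# Write the table to a temporary CSV file and read it back.
path <- tempfile(fileext = ".csv")
fwrite(dt, path)
dt2 <- fread(path)

# General form: DT[i, j, by]
warm <- dt2[temp > 10]                      # i: keep rows where temp exceeds 10
dt2[, temp_f := temp * 9 / 5 + 32]          # j: add a column by reference
means <- dt2[, .(mean_temp = mean(temp)),   # j: compute a summary ...
             by = station]                  # by: ... once per station
```

Note that := modifies dt2 in place rather than returning a copy, which is one reason data.table operations tend to be memory-efficient.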

Session content

Day 1: 3 September, 10 AM to 1 PM Berlin time

 

  • Introduction
    Overview of data.table and its advantages for data manipulation.

  • Reading data
    Efficiently importing datasets using fread().

  • The data.table syntax
    Understanding the core syntax and structure of data.table operations.

  • Subsetting rows
    Techniques for fast and flexible row filtering.

  • Operating with columns
    Creating, updating, and summarizing columns efficiently.

  • Putting it all together
    Building complete workflows for data exploration and transformation.

  • Special symbols
    Introduction to advanced features, such as .N, .SD, and .I, to enhance functionality.
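For readers unfamiliar with the special symbols mentioned in the last item, the following is a small sketch of how they are commonly used. The table and its column names (group, a, b) are made up for illustration and are not part of the course materials.

```r
library(data.table)

# Hypothetical example data.
dt <- data.table(
  group = c("x", "x", "y", "y", "y"),
  a     = 1:5,
  b     = c(10, 20, 30, 40, 50)
)

# .N: number of rows in each group.
counts <- dt[, .N, by = group]

# .SD: the subset of data for each group, restricted here to columns a and b.
col_means <- dt[, lapply(.SD, mean), by = group, .SDcols = c("a", "b")]

# .I: row numbers in the original table; here, the first row of each group.
# The unnamed result column is called V1 by default.
first_rows <- dt[, .I[1], by = group]
```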

 

Instructors

Elio Campitelli: Elio Campitelli has a Ph.D. in atmospheric sciences from Universidad de Buenos Aires, where they studied the large-scale circulation of the Southern Hemisphere, and now studies tropical influences on Antarctic sea ice at Monash University. They also taught Introduction to Programming and Visualization of Information at Universidad Nacional Guillermo Brown and are a certified The Carpentries instructor. They are an active member of the R community and maintain several open-source R packages (e.g., ggnewscale, metR).

More information about Elio: https://eliocamp.github.io

 

Paola Corrales: Paola has a PhD in Atmospheric Science from Universidad de Buenos Aires. During her PhD she applied data assimilation techniques to improve the representation of mesoscale convective systems and associated precipitation. She has experience working with Numerical Weather Prediction models using HPC systems and programming languages such as R, bash, and Fortran. She is an active R user and developer and contributes to many communities of practice, such as R-Ladies and rOpenSci. Since 2021, Paola has held a professor position at Universidad Nacional Guillermo Brown, where she teaches Visualization of Information and Data Management. In 2023, Paola became a member of The Carpentries Board of Directors.

More information about Paola: https://paocorrales.github.io

 


Cost overview

 

Package 1: 150 €


Cancellation Policy:

More than 30 days before the start date: 30% cancellation fee.

Less than 30 days before the start date: no refund.

 

Physalia-courses cannot be held responsible for any travel fees, accommodation, or other expenses you incur as a result of the cancellation.