Program

Monday - Introducing the tools - 2-8 pm Berlin time

 

  • What is data visualization? Why is it important? Are there any general rules we can follow?
  • Introduction to Jupyter (formerly IPython Notebook): this will be our main development environment, since it allows us to mix Python code, plots, tables and freeform notes
  • How to set up a dataviz project (or any project). A folder where you throw everything is not enough!
  1. tips on how to keep a clean, ordered project
  2. brief introduction to Git, plus the two main repository hosting platforms out there: GitHub and GitLab
  • Sample datasets, types of datasets: it’s time to meet the data that we are going to butcher throughout the course. We’ll talk about data types in general and, more specifically, the data types you are likely to encounter as bioinformaticians

  1. BYOD - Bring Your Own Data: let us know about your own data sets and problems

  • Introduction to Pandas, one of the main data manipulation libraries for Python. We’ll become familiar with dataframes, data import/export, missing data points and basic data description
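
To give a flavour of this block, here is a minimal sketch of the Pandas basics listed above: importing data, spotting and filling missing values, describing a column and exporting again. The table, its column names and its values are invented for illustration, and an in-memory string stands in for a real file.

```python
import io

import pandas as pd

# Hypothetical toy table; in the course this would be a real file on disk
csv_text = """sample,gene,expression
s1,geneA,2.5
s1,geneB,
s2,geneA,3.1
s2,geneB,0.7
"""

# Import: read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Missing data: the empty field is read as NaN
n_missing = int(df["expression"].isna().sum())

# One possible (not always appropriate) way to handle it: fill with a default
filled = df.fillna({"expression": 0.0})

# Basic description: count, mean, std, min, quartiles, max
summary = df["expression"].describe()

# Export: write the cleaned table back out as CSV text
out = io.StringIO()
filled.to_csv(out, index=False)
exported = out.getvalue()

print(summary)
```

Whether filling with zero is sensible depends entirely on the dataset; dropping the rows with `dropna` is an equally common choice.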

Tuesday - Pandas all the way down - 2-8 pm Berlin time

  • More pandas: grouping, pivoting, data aggregation, indexing, remapping. A good half of data visualization is preparing your data in the correct shape and format, so that you can then plot it. This block will give you most of the tools you’ll need to manipulate your data
  • Even more pandas: PCA, timestamps, networks, binning. Each project carries its own case-specific requirements. Here we take a look at many interesting, if not general, use cases
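
As a rough preview of the reshaping tools named above, the sketch below shows grouping with aggregation, pivoting, remapping, binning and timestamp parsing on a tiny table. All column names and values are made up for illustration; real projects will need different bins and mappings.

```python
import pandas as pd

# Hypothetical long-format table: one row per (sample, gene) measurement
df = pd.DataFrame({
    "sample": ["s1", "s1", "s2", "s2"],
    "gene": ["geneA", "geneB", "geneA", "geneB"],
    "expression": [2.0, 4.0, 6.0, 8.0],
    "collected": ["2021-03-01", "2021-03-01", "2021-03-08", "2021-03-08"],
})

# Grouping + aggregation: mean expression per gene
mean_by_gene = df.groupby("gene")["expression"].mean()

# Pivoting: long format -> wide format (samples as rows, genes as columns)
wide = df.pivot(index="sample", columns="gene", values="expression")

# Remapping: translate values through a dictionary
df["gene_label"] = df["gene"].map({"geneA": "A", "geneB": "B"})

# Binning: cut a numeric column into labelled intervals (0, 5] and (5, 10]
df["level"] = pd.cut(df["expression"], bins=[0, 5, 10], labels=["low", "high"])

# Timestamps: parse strings into datetimes, then use the .dt accessor
df["collected"] = pd.to_datetime(df["collected"])
df["day"] = df["collected"].dt.day

print(wide)
```

The wide table produced by `pivot` is the shape a heatmap expects, while the long table is usually what plotting libraries such as Seaborn prefer.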

Wednesday - Drawing on the screen - 2-8 pm Berlin time

  • Introduction to the Seaborn plotting library: what it is, why use it, what its limitations are. We’ll see the standard Seaborn use cases and general philosophy
  1. Regular plots: scatter, regression, bar, pie
  2. Distribution plots: histograms, box and violin plots, facets
  3. More exotic plots: heatmap, networks, maps
  • Advanced Seaborn: manipulating axes and charts for finer (and trickier!) customization. Ever wished to do a plot-within-a-plot? This is your chance!
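
One possible way to get a plot-within-a-plot is sketched below: a Seaborn scatter plot on the main axes, with a small box plot drawn on inset axes created through Matplotlib's `Axes.inset_axes`. The data, labels and inset position are invented for illustration, and this is only one of several approaches.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch also runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    "group": ["a", "a", "a", "b", "b", "b"],
})

# Main plot: Seaborn draws onto the Matplotlib axes we pass in
fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=df, x="x", y="y", hue="group", ax=ax)
ax.set_title("Scatter plot with an inset")

# Inset: [left, bottom, width, height], as fractions of the parent axes
inset = ax.inset_axes([0.6, 0.1, 0.35, 0.35])
sns.boxplot(data=df, x="group", y="y", ax=inset)

# Finer customization happens on the axes objects themselves
inset.set_xlabel("")
inset.set_ylabel("")
```

In a Jupyter notebook the backend line can be dropped and the figure is displayed inline; elsewhere, `fig.savefig` with a file name of your choice writes it to disk.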

Thursday - Practice, practice, practice - 2-8 pm Berlin time

  • This day is dedicated to guided exercises. We will choose three or four plots from a list of prepared visualization problems. Depending on the students’ interests and requirements, we will either revisit the topics that sparked the most interest during the previous days or tackle more advanced tasks, such as creating interactive plots and infographics.