Data analysis paradigms in the tidyverse
Case studies in identifying and implementing data analysis paradigms
The relationship between tidyverse packages and base R packages can be complicated. And it’s sometimes difficult to see how tidyverse packages relate to each other—let alone use them in concert.
Join expert Rick Scavetta to explore the recurring paradigms in data analysis that the tidyverse ecosystem is well suited to address. Over three hours, you’ll learn to identify these small recurring patterns as you trace the path from raw data to results using real-world examples and a small collection of useful functions. By the time you’re through, you’ll understand the close link between data structure and these functions as well as how they facilitate efficient data analysis that you can apply directly to your own projects.
What you’ll learn and how you can apply it
By the end of this live online course, you’ll understand:
- Tidyverse functions and data analysis paradigms
- The purpose of different packages and how they work in concert
And you’ll be able to:
- Identify recurring themes in data analysis workflows, moving from raw data to completed results in an efficient manner
- Dig deeper into tidyverse packages not covered in the course and understand how they all fit together to build an ecosystem
This training course is for you because...
- You have existing R scripts that you need to modify.
- You’ve been using only base R and need to expand your knowledge to include the tidyverse.
Prerequisites
- A basic knowledge of RStudio
- A working knowledge of base R fundamentals (functions, objects, indexing, and logical expressions)
- Familiarity with the most common data structures in R (vectors, lists, and data frames)
- An RStudio account (you’ll be provided with a web-based RStudio Cloud instance for the course)
- Take Inferential Statistics Using R (live online training course with Rick Scavetta)
About your instructor
Rick Scavetta has worked as an independent data science trainer since 2012. Operating as Scavetta Academy, Rick has a close and recurring presence at primary research institutes all over Germany, including many Max Planck Institutes and Excellence Clusters, in fields as varied as primatology, earth sciences, marine biology, molecular genetics, and behavioral psychology.
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Introduction (20 minutes)
- Lecture: Poor data structure and base R—the impetus for the tidyverse
- Group discussion: Prework exercises
tidyr, tibble, and readr (20 minutes)
- Lecture: The structure of messy and tidy data; tidyverse solutions
- Hands-on exercises: Identify various paradigms in data analysis workflows
- Group discussion: Problems with data structure
- Break (10 minutes)
dplyr, stringr, and (bare minimum) ggplot2 (60 minutes)
- Lecture: The five verbs, one adjective, and punctuation of dplyr; helper functions and variants of summarize and mutate; pattern matching with stringr; plotting as part of the tidyverse
- Hands-on exercises: Implement dplyr; use dplyr helper functions and special variants; clean up data and plot it
- Group discussion: Building functional sequences with dplyr grammar
- Break (10 minutes)
purrr and forcats (45 minutes)
- Lecture: Introduction to iteration
- Hands-on exercise: Complete a coding challenge
- Group discussion: Use cases for map versus walk; results and alternative approaches
Wrap-up and Q&A (15 minutes)