O'Reilly
Live online training

Statistical literacy: Linear models as a unifying concept (using R)

Uncovering the foundational concepts that link inferential statistics to deep learning

Rick Scavetta

Linear models are an essential underlying concept for many statistical and machine learning techniques. They provide a framework that unites everything from basic equations like the mean and variance all the way up to complex modern processes like deep learning.

Join expert Rick Scavetta for a crash course in linear models. You’ll explore major themes in data analysis through insightful connections and examples as you develop a deeper understanding of the key concepts that unite what at first glance seem to be disparate techniques.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • What linear models are
  • The mean and two-sample t-tests as linear models
  • Models as best-guess predictions
  • The curse of dimensionality
  • The minimization of loss functions (such as the sum of squared residuals)
  • Similarities among equations for various situations
  • The bias-variance trade-off
  • Complex methods as elaborations of concepts present in simple linear models
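
As a small illustration of the first few points, the mean can be written as the simplest possible linear model: one with an intercept and nothing else. The sketch below is not taken from the course materials; the data values are made up for illustration only.

```r
# Hypothetical data, chosen only for illustration
y <- c(4.1, 5.3, 6.2, 4.8, 5.6)

# An intercept-only linear model: y = b0 + error
fit <- lm(y ~ 1)

# The fitted intercept is the sample mean
b0 <- unname(coef(fit))
all.equal(b0, mean(y))

# The residual variance of this model is the sample variance
resid_var <- sum(resid(fit)^2) / df.residual(fit)
all.equal(resid_var, var(y))
```

Both comparisons hold because the mean is the single value that minimizes the sum of squared residuals, which is the same loss that ordinary least squares minimizes.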

And you’ll be able to:

  • Understand reported results based on linear models
  • Use your newfound knowledge as a solid basis for further independent study

This training course is for you because...

  • You encounter linear models but aren’t sure what they mean.
  • You want to understand how techniques like the mean and variance, t-tests, ordinary least squares regression, and ANOVA are built on the same fundamental concepts.
  • You want to learn how more complex or iterative methods like clustering, gradient descent, and deep learning are connected to linear models.
  • You apply linear models but aren’t sure how to interpret the results.

Prerequisites

  • A basic knowledge of R and RStudio
  • Familiarity with statistics fundamentals (e.g., simple random samples, systematic versus random error, types of selection bias, and measures for location and spread)
  • An RStudio account (You’ll be provided with RStudio Cloud projects preloaded with exercise scripts and datasets.)

About your instructor

  • Rick Scavetta has worked as an independent data science trainer since 2012. Operating as Scavetta Academy, Rick has a close and recurring presence at primary research institutes all over Germany, including many Max Planck Institutes and Excellence Clusters, in fields as varied as primatology, earth sciences, marine biology, molecular genetics, and behavioral psychology.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction (20 minutes)

  • Group discussion: What are models and linear models? Where do they appear?
  • Lecture: Overview of methods explored in this course
  • Q&A

Classic OLS regression (60 minutes)

  • Lecture: Defining models; the bias-variance trade-off; minimizing loss functions
  • Demo: The basics of linear models in R
  • Hands-on exercise: Code OLS regression from scratch
  • Q&A
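
To give a flavor of what coding OLS regression from scratch can look like, here is a minimal sketch that solves the normal equations directly and checks the result against `lm()`. It is an illustration under assumptions, not the course's actual exercise: the built-in `mtcars` dataset and the variable names are choices made here.

```r
# Built-in dataset, used here only as an example
x <- mtcars$wt   # predictor: car weight
y <- mtcars$mpg  # response: miles per gallon

# Design matrix: a column of 1s for the intercept, then the predictor
X <- cbind(1, x)

# Solve the normal equations (X'X) b = X'y for the coefficients
b <- solve(crossprod(X), crossprod(X, y))

# Residuals and the loss being minimized (sum of squared residuals)
res <- y - X %*% b
ssr <- sum(res^2)

# Compare with R's built-in fit
fit <- lm(mpg ~ wt, data = mtcars)
all.equal(as.vector(b), unname(coef(fit)))
```

Note that `lm()` itself uses a QR decomposition rather than the normal equations, which is numerically more stable; the two approaches agree closely on well-conditioned data like this.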

Break (5 minutes)

Other statistical tests (30 minutes)

  • Lecture: Understanding two-sample t-tests and ANOVA; the curse of dimensionality
  • Group discussion: Similarities to regression
  • Demo: Executing t-test and ANOVA as linear models
  • Hands-on exercise: Perform tests in R
  • Q&A
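
The equivalence between these tests and regression can be checked in a few lines. The following sketch is an illustration only, assuming the built-in `sleep` and `PlantGrowth` datasets as stand-ins for whatever data the course uses.

```r
# Two-sample t-test with pooled variance on the built-in sleep data
tt <- t.test(extra ~ group, data = sleep, var.equal = TRUE)

# The same comparison as a linear model with a two-level factor
fit_t <- lm(extra ~ group, data = sleep)
lm_t <- summary(fit_t)$coefficients["group2", "t value"]
lm_p <- summary(fit_t)$coefficients["group2", "Pr(>|t|)"]

# Same p-value; the t statistics differ only in sign,
# because the two functions subtract the groups in opposite order
all.equal(lm_p, tt$p.value)
all.equal(abs(lm_t), abs(unname(tt$statistic)))

# One-way ANOVA on the built-in PlantGrowth data (three groups)
aov_fit <- aov(weight ~ group, data = PlantGrowth)
aov_F <- summary(aov_fit)[[1]][["F value"]][1]

# The same analysis as a linear model: the overall F statistic matches
fit_a <- lm(weight ~ group, data = PlantGrowth)
lm_F <- summary(fit_a)$fstatistic[["value"]]
all.equal(aov_F, lm_F)
```

The match with the t-test depends on `var.equal = TRUE`; R's default Welch test does not assume equal variances and gives slightly different results.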

Extending linear models (30 minutes)

  • Lecture: Elaborating on simple models for regression and ANOVA
  • Hands-on exercises: Explore model forms in R
  • Q&A

Break (5 minutes)

Complex methods (20 minutes)

  • Lecture: Analytical versus iterative approaches to minimize the loss function
  • Group discussion: Linear models as the basis for advanced methods
  • Hands-on exercise: Execute advanced methods in R
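
The contrast between the two approaches to minimizing a loss function can be sketched on a simple regression. This is an illustrative example, not the course's material: the `mtcars` data, the step size, and the iteration count are all assumptions chosen so that gradient descent converges on this particular problem.

```r
x <- mtcars$wt
y <- mtcars$mpg
X <- cbind(1, x)  # design matrix with an intercept column
n <- length(y)

# Analytical approach: solve the normal equations in one step
b_exact <- as.vector(solve(crossprod(X), crossprod(X, y)))

# Iterative approach: gradient descent on the mean squared error
b <- c(0, 0)   # starting guess for intercept and slope
rate <- 0.05   # assumed step size, small enough to converge here
for (i in seq_len(5000)) {
  # Gradient of mean((y - X b)^2) with respect to b
  grad <- as.vector(-2 / n * crossprod(X, y - X %*% b))
  b <- b - rate * grad
}

# Both routes arrive at the same coefficients
all.equal(b, b_exact, tolerance = 1e-6)
```

For a linear model the closed-form solution is available, so iteration is unnecessary; methods such as deep learning rely on the iterative route because no closed form exists for their loss functions.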

Wrap-up and Q&A (10 minutes)