Live Online training

# Statistics and hypothesis testing with Python: Essential math for data science

## What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

• How to use the central limit theorem in statistics
• What hypothesis testing and parameter estimation are
• How bootstrapping for parameter estimation works

And you’ll be able to:

• Perform hypothesis testing to determine if a result is statistically significant
• Calculate confidence intervals to quantify the uncertainty of a measurement
• Apply bootstrapping to determine confidence intervals for any general estimator
• Implement A/B testing

## This training course is for you because...

• You’re in a technical role, but you’re looking to transition into a data scientist or data analyst position.
• You want to apply data-driven decision making in your position.
• You work with data and want to generate insights and analysis.

Prerequisites:

• A basic understanding of Python (variable creation, conditional statements, functions, and loops) and of basic summary statistics (mean, median, and mode)

Recommended preparation:

Recommended follow-up:

## About your instructor

• Russell Martin is a Data Scientist in Residence at The Data Incubator. He received his PhD in Applied Mathematics from the Georgia Institute of Technology. Russ lived and worked in the UK for seventeen years, including at Warwick University and the University of Liverpool, where he taught in the Department of Computer Science. As a Data Scientist in Residence, Russ instructs Fellows in our Data Science Fellowship, teaches online courses, and leads trainings with our corporate partners.

## Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction to statistics and statistical inference (10 minutes)

• Lecture: The Jupyter Notebook environment; statistical inference
• Group discussion: Do union or nonunion construction workers have higher salaries?

Hypothesis testing of the mean (10 minutes)

• Lecture: Mean estimation
• Hands-on exercise: Generate income samples for union workers

Standard error of the mean (10 minutes)

• Lecture: Standard error of the mean
• Hands-on exercise: Estimate the standard error
• Group discussion: Increasing sample size
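
As a rough illustration of this segment (not taken from the course materials), the sketch below estimates the standard error of the mean as s / sqrt(n) and shows how it shrinks as the sample grows. The salary figures are synthetic and the helper name `standard_error` is an assumption made for this example.

```python
import math
import random
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical population: synthetic salaries, not real wage data.
rng = random.Random(0)
population = [rng.gauss(50_000, 8_000) for _ in range(100_000)]

# Draw a small and a large sample without replacement.
small = rng.sample(population, 100)
large = rng.sample(population, 10_000)

se_small = standard_error(small)
se_large = standard_error(large)
print(f"n=100:   SE = {se_small:,.0f}")
print(f"n=10000: SE = {se_large:,.0f}")
```

Because the standard error scales with 1 / sqrt(n), a sample 100 times larger should give a standard error roughly 10 times smaller.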

Confidence intervals (10 minutes)

• Lecture: Confidence intervals
• Hands-on exercise: Calculate the confidence interval for the unionized salary mean
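
A minimal sketch of a confidence interval for a mean, assuming a sample large enough for the normal approximation to hold. The data are synthetic and the function name `mean_confidence_interval` is hypothetical; the course exercise may take a different approach.

```python
import math
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean (large n assumed)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


# Hypothetical sample: synthetic salaries, not real wage data.
rng = random.Random(1)
salaries = [rng.gauss(50_000, 8_000) for _ in range(400)]

low95, high95 = mean_confidence_interval(salaries, 0.95)
low99, high99 = mean_confidence_interval(salaries, 0.99)
print(f"95% CI: ({low95:,.0f}, {high95:,.0f})")
print(f"99% CI: ({low99:,.0f}, {high99:,.0f})")
```

Asking for more confidence widens the interval, which is why the 99% interval encloses the 95% one.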

Hypothesis testing of two means (5 minutes)

• Lecture: Null hypothesis
• Group discussion: Rejecting the null hypothesis with unknown population means
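
One way to test the null hypothesis that two groups share the same mean is a two-sample z-test, sketched below under a large-sample assumption. The two groups are synthetic, with true means chosen by hand purely for illustration, and `two_sample_z_test` is a hypothetical helper name.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def two_sample_z_test(a, b):
    """Return (z, two-sided p-value) for the difference in means of a and b."""
    diff = fmean(a) - fmean(b)
    # Standard error of the difference, using each group's sample variance.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Synthetic groups; the means are set by hand and say nothing about real salaries.
rng = random.Random(2)
group_a = [rng.gauss(53_000, 8_000) for _ in range(1_000)]
group_b = [rng.gauss(50_000, 8_000) for _ in range(1_000)]

z, p_value = two_sample_z_test(group_a, group_b)
print(f"z = {z:.2f}, p = {p_value:.3g}")
```

A small p-value means a difference this large would be unlikely if the two population means were equal, so the null hypothesis is rejected at the chosen significance level.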

Estimating variance (5 minutes)

• Lecture: Unbiased estimator
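
The idea of an unbiased variance estimator can be checked with a small simulation. The sketch below, which is an illustration rather than course material, compares dividing by n (`statistics.pvariance`) with dividing by n - 1 (`statistics.variance`) on many small samples drawn from a distribution whose true variance is 1.

```python
import random
import statistics

rng = random.Random(0)
n, trials = 5, 20_000

biased, unbiased = [], []
for _ in range(trials):
    sample = [rng.gauss(0, 1) for _ in range(n)]  # true variance is 1
    biased.append(statistics.pvariance(sample))   # divides by n
    unbiased.append(statistics.variance(sample))  # divides by n - 1

avg_biased = statistics.fmean(biased)
avg_unbiased = statistics.fmean(unbiased)
print(f"average of divide-by-n estimates:     {avg_biased:.3f}")
print(f"average of divide-by-(n-1) estimates: {avg_unbiased:.3f}")
```

On average the divide-by-n estimator lands near (n - 1) / n = 0.8 of the true variance, while the divide-by-(n - 1) version averages close to 1.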

Student’s t-distribution (15 minutes)

• Lecture: Large n assumption
• Q&A
• Break (5 minutes)

Standard error of proportion and variance (15 minutes)

• Lecture: Standard error of proportion; the rule of three; the standard error of variance estimate
• Hands-on exercise: Change variables to interact with SEP figures
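
For reference, the two proportion formulas named in this segment can be sketched as follows: the standard error of a proportion, sqrt(p(1 - p) / n), and the rule of three, which approximates the 95% upper bound on an event's probability as 3 / n when no events were seen in n trials. The function names are hypothetical, chosen for this example.

```python
import math


def standard_error_of_proportion(p_hat, n):
    """Standard error of an observed proportion p_hat from n trials."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def rule_of_three_upper_bound(n):
    """Approximate 95% upper bound on p after 0 events in n trials."""
    return 3 / n


def exact_zero_event_upper_bound(n, confidence=0.95):
    """Exact bound: the largest p for which (1 - p)**n >= 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n)


n = 100
print(f"SE of p_hat=0.5, n={n}: {standard_error_of_proportion(0.5, n):.4f}")
print(f"rule of three:          {rule_of_three_upper_bound(n):.4f}")
print(f"exact upper bound:      {exact_zero_event_upper_bound(n):.4f}")
```

The rule of three works because -ln(0.05) is close to 3, so the approximation and the exact bound agree closely once n is moderately large.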

Hypothesis testing for counts (10 minutes)

• Lecture: The chi-squared hypothesis test
• Group discussion: Where else would you use a chi-squared test?
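
A chi-squared test on counts is one common way to analyze an A/B test. The sketch below handles a 2x2 table without a continuity correction; the counts are made up for illustration and `chi_squared_2x2` is a hypothetical helper. With one degree of freedom the p-value can be computed from the standard library as erfc(sqrt(chi2 / 2)).

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: rows are variants A and B, columns are (converted, not converted).
observed = [[100, 900], [150, 850]]
chi2, p_value = chi_squared_2x2(observed)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```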

Bootstrapping (10 minutes)

• Lecture: Subsampling data
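
A minimal sketch of a percentile bootstrap, assuming resampling with replacement from the observed data: the estimator is recomputed on each resample, and the middle 95% of those values is taken as the confidence interval. The sample is synthetic and `bootstrap_ci` is a hypothetical helper; the course may present a different variant.

```python
import random
import statistics


def bootstrap_ci(sample, estimator, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary estimator."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the estimator each time.
    estimates = sorted(
        estimator(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = 1 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Synthetic, right-skewed incomes (log-normal); not real data.
data_rng = random.Random(1)
incomes = [data_rng.lognormvariate(10.8, 0.4) for _ in range(200)]

low, high = bootstrap_ci(incomes, statistics.median)
print(f"sample median: {statistics.median(incomes):,.0f}")
print(f"95% bootstrap CI: ({low:,.0f}, {high:,.0f})")
```

Because `estimator` is just a function of a list, the same helper applies to a median, a trimmed mean, or any other statistic that lacks a simple standard-error formula.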

Determining distributions (5 minutes)

• Lecture: Data matching distribution
• Hands-on exercise: Apply the Kolmogorov-Smirnov test
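
The one-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical CDF of the data and the CDF of a reference distribution. The sketch below computes it with the standard library and compares it with the asymptotic 5% critical value, about 1.358 / sqrt(n), which assumes the reference distribution is fully specified in advance. The data are synthetic and `ks_statistic` is a hypothetical helper.

```python
import math
import random
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Largest absolute gap between the sample's empirical CDF and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


rng = random.Random(3)
n = 500
standard_normal = NormalDist(0, 1)
normal_sample = [rng.gauss(0, 1) for _ in range(n)]
skewed_sample = [rng.expovariate(1.0) for _ in range(n)]

critical = 1.358 / math.sqrt(n)  # asymptotic 5% critical value
for name, sample in [("normal", normal_sample), ("exponential", skewed_sample)]:
    d = ks_statistic(sample, standard_normal.cdf)
    print(f"{name:12s} D = {d:.3f} (critical value {critical:.3f})")
```

A statistic above the critical value suggests the data do not follow the reference distribution; a statistic below it means only that the test found no evidence of a mismatch.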

Wrap-up and Q&A (10 minutes)