O'Reilly Live Online Training

Core Statistics for Data Science in R and Python

Take your knowledge of statistics to the next level

Noah Gift
Mehul Rangwala

Take your knowledge of data science to the next level and do it in both Python and R at the same time. In this unique training, you learn applied statistical techniques that are fundamental to data science work in both Python and R.

This training covers an introduction to statistics that will prepare you for data science. Learn the foundational techniques of data distributions, skew, and descriptive statistics while exploring both R and Python.

What you'll learn and how you can apply it

  • Apply R statistics fundamentals
  • Apply Python statistics fundamentals
  • Use both Python and R to perform Exploratory Data Analysis

This training course is for you because...

  • You work as a software engineer and want to learn the statistical foundation for data science tasks.
  • You work in data and know R, but want to learn Python as well.
  • You work in data and know Python, but want to learn R as well.

About your instructor

  • Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program (MSDS) at Northwestern. He teaches and designs graduate machine learning, AI, and data science courses and consults on machine learning and cloud architecture for students and faculty. These responsibilities include leading a multi-cloud certification initiative for students. He has published close to 100 technical publications, including two books, on subjects ranging from cloud machine learning to DevOps. Gift received an MBA from UC Davis, an M.S. in Computer Information Systems from Cal State Los Angeles, and a B.S. in Nutritional Science from Cal Poly San Luis Obispo.

    Professionally, Noah has approximately 20 years’ experience programming in Python. He is a Python Software Foundation Fellow, an AWS Subject Matter Expert (SME) on Machine Learning, an AWS Certified Solutions Architect and AWS Academy Accredited Instructor, a Google Certified Professional Cloud Architect, and a Microsoft MTA on Python. He has worked in roles including CTO, General Manager, Consulting CTO, and Cloud Architect. This experience has been with a wide variety of companies, including ABC, Caltech, Sony Imageworks, Disney Feature Animation, Weta Digital, AT&T, Turner Studios, and Linden Lab. In the last ten years, he has been responsible for shipping many new products at multiple companies that generated millions of dollars of revenue and had global scale. Currently, he consults for startups and other companies.

  • Lecturer Mehul Rangwala has nearly 20 years of experience in information technology and quantitative and qualitative data analysis. An information technology manager at the Sacramento Regional County Sanitation District, Rangwala researches, plans, and provides insights to executive management on IT strategy and the latest technology trends while providing leadership in IT investments, cost/benefit analysis, asset management, and IT operations and support.

    His career has largely been in data analysis, budgeting, forecasting, and information technology, and in making important data-driven decisions based on research and on quantitative and economic principles.

    In 2018, students from both the MBA and MSBA programs selected him as the GSM Teacher of the Year. It is very rare for faculty to receive this honor across multiple programs.

    Rangwala earned his MBA from the UC Davis Graduate School of Management and his B.S. in electronics engineering from India.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Part 1: Core Statistical Concepts for Data Science in R (45 min)

  • Basic statistics in R
  • Measures of central tendency in R
  • Measures of dispersion in R
  • Statistical anomalies in R
  • Z-scores in R
  • Correlation in R
  • Regression in R
  • Fundamentals of data visualization in R

Q&A (10 min)

Break (5 min)

Part 2: Core Statistical Concepts for Data Science in Python - Section 1 (45 min)

  • Basic statistics in Python
  • Measures of central tendency in Python
  • Measures of dispersion in Python

Q&A (10 min)

Break (5 min)

Part 3: Core Statistical Concepts for Data Science in Python - Section 2 (45 min)

  • Statistical anomalies in Python
  • Z-scores in Python
  • Correlation in Python
  • Regression in Python
  • Fundamentals of data visualization in Python

Q&A (10 min)

Break (5 min)

Part 4: Next Level Statistical Concepts for Data Science in R (30 min)

  • Advanced statistics in R
  • Analytics techniques in R

Q&A (10 min)

Break (5 min)

Part 5: Next Level Statistical Concepts for Data Science in Python - Section 1 (30 min)

  • Advanced statistics in Python

Q&A (10 min)

Break (5 min)

Part 6: Next Level Statistical Concepts for Data Science in Python - Section 2 (30 min)

  • Analytics techniques in Python