Introduction to Statistics for Data Analysis with Python
Learn the fundamentals of statistics by answering real-world questions
This training session focuses on how to apply the fundamental concepts of statistics that are essential for every data scientist. We'll see how statistics enables us to derive insights from raw data and answer real-world questions. For every aspiring data scientist, statistics opens the door to all the major domains that make use of data science.
What you'll learn and how you can apply it
By the end of this live, online course, you’ll understand:
 Data exploration and visualization
 Fundamentals of descriptive statistics: mean, median, mode, measures of spread, standard deviation, percentile, variance, skewness, correlation, etc.
 Inferential statistics: basic principles behind using data for estimation and for assessing theories
And you’ll be able to:
 Explore the data using statistics.
 Build statistical models.
This training course is for you because...
 You are a programmer or an aspiring data analyst/scientist.
 You are a beginner in the field of data, ML, or AI with some familiarity with elementary mathematics and Python programming.
Prerequisites
 Python Programming, Pandas, Matplotlib
 Basic Mathematics
 No prior experience with statistics necessary
About your instructor

Harshit Tyagi is a Full Stack Developer and Data Engineer at Elucidata, a Cambridge-based biotech company. He develops algorithms for research scientists at the world’s best medical schools, such as Yale, UCLA, and MIT. Before Elucidata, he worked as a Systems Development Engineer at Tradelogic, an investment management firm, where he designed a framework to analyze financial news from all prominent sources to produce accurate trading signals. He is a Python evangelist and loves to contribute to tech communities like Google Developers Groups, Python Delhi User Groups, and other e-learning platforms. Drawing on skills acquired over the years and more than 3 years as a mentor and reviewer in e-learning, he aims to share enterprise-grade practices that help produce more skillful data scientists and quantitative traders.
Schedule
The timeframes are only estimates and may vary according to how the class is progressing.
Introduction to Data Visualisation (50 mins)
 Presentation (15 mins): Learning how to extract and explore data, and understanding what different plots and charts represent.
 Discussion (5 mins): Which libraries can we use in Python for plotting?
 Presentation (15 mins): Overview of Python libraries used for plotting and data analysis, including NumPy, pandas, statsmodels, Matplotlib, and seaborn.
 Exercise (15 mins): Practice plotting and exploratory data analysis
 Q&A (5 mins)
Introduction to Descriptive Statistics (50 mins)
 Presentation (20 mins): Basics of descriptive statistics: mean, median, mode, variance, standard deviation, central tendency, etc.
 Discussion (10 mins): How can we answer real-world questions using statistics? Example: Who is the best football player in the world?
 Presentation (15 mins): How does Netflix know what we like? Percentile, variance, skewness, correlation.
 Exercise (15 mins): Problem: Should we buy an extended warranty on electrical appliances?
 Q&A (5 mins)
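As a rough illustration of the measures listed in this session, here is a minimal sketch. The data is made up for the example, and the code uses only Python's standard `statistics` module so that it runs without extra installs; the session itself may well use pandas or NumPy instead. The `pearson` helper and the variable names are hypothetical.

```python
import statistics as st

# Hypothetical sample: goals scored and minutes played by one player in ten matches.
goals = [0, 1, 1, 2, 0, 3, 1, 1, 4, 2]
minutes = [60, 90, 90, 90, 45, 90, 75, 90, 90, 90]

mean = st.mean(goals)                  # arithmetic average
median = st.median(goals)              # middle value of the sorted data
mode = st.mode(goals)                  # most frequent value
variance = st.variance(goals)          # sample variance (n - 1 denominator)
std_dev = st.stdev(goals)              # sample standard deviation
quartiles = st.quantiles(goals, n=4)   # three cut points splitting the data into quarters

# Skewness (population form): the average of the cubed z-scores.
pop_std = st.pstdev(goals)
skewness = sum(((g - mean) / pop_std) ** 3 for g in goals) / len(goals)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = st.mean(xs), st.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


correlation = pearson(goals, minutes)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.3f}, std_dev={std_dev:.3f}, quartiles={quartiles}")
print(f"skewness={skewness:.3f}, correlation={correlation:.3f}")
```

A positive skewness here reflects a few high-scoring matches pulling the tail of the distribution to the right, which is also why the mean sits above the median.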
Basics of inferential statistics (60 mins)
 Presentation (20 mins): Basic principles behind inferential statistics: analyzing categorical and qualitative data, constructing confidence intervals, and sampling.
 Codelab walkthrough (15 mins): Use NumPy, pandas, statsmodels, and seaborn to analyze case studies.
 Exercise (15 mins): Use the concepts to work on an industry problem
 Q&A (10 mins)
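To give a feel for what constructing a confidence interval from a sample involves, here is a minimal sketch. The data is invented for illustration, and the interval uses a normal approximation from the standard library rather than the statsmodels-based workflow the codelab names, so treat it as one possible approach, not the course's method.

```python
import statistics as st
from statistics import NormalDist

# Hypothetical sample: repair costs (in dollars) for 12 appliances.
costs = [120, 95, 140, 110, 130, 105, 150, 125, 115, 135, 100, 145]

n = len(costs)
sample_mean = st.mean(costs)
std_error = st.stdev(costs) / n ** 0.5   # standard error of the mean

# Two-sided 95% interval using the normal approximation.
# Assumption: for a sample this small, a t critical value would usually be
# preferred; the normal quantile keeps the sketch dependency-free.
z = NormalDist().inv_cdf(0.975)
lower = sample_mean - z * std_error
upper = sample_mean + z * std_error

print(f"mean={sample_mean}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval widens as the sample's spread grows and narrows as the sample size increases, which is the trade-off the sampling discussion is about.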
Take-home exercise:
 Exercise: Create a statistical model to recommend the type of insurance to individuals based on their location, occupation, marital status, and many other features.