O'Reilly Live Online Training

Time Series Data Processing and Modeling

Bruno Gonçalves

The availability of large quantities of cheap sensors brought forth by the so-called “Internet of Things” has resulted in an explosion in the amount of time-varying data. Understanding how to mine, process, and analyze such data will become an ever more important skill in any data scientist's toolkit.

In this course, we will work through the entire process of analyzing and modeling time series data: how to detect and extract trend and seasonality effects, and how to implement the ARIMA class of forecasting models. Both real and synthetic datasets will be used to illustrate the different kinds of models and their underlying assumptions.

What you'll learn and how you can apply it

  • Analyze and process time-varying data
  • Identify the different kinds of drifts, lags and trends in time series data
  • Understand auto-correlations and partial auto-correlations
  • Generate and use random walks as test beds for time series
  • Implement a wide range of ARIMA models with nothing but basic Python

This training course is for you because...

This course is for you if you:

  • Work with time-varying data
  • Are curious about the generic mechanisms that result in time series
  • Want to learn how to handle and model time series
  • Wish to identify trends and correlations in time series
  • Want to apply time series analysis to real-world datasets

Prerequisites

  • Basic Python
  • Numpy
  • Matplotlib
  • Jupyter

Course Set-up

  • Scientific Python distribution, like Anaconda

Recommended Preparation

Recommended Follow-up

About your instructor

  • Bruno Gonçalves is currently a Senior Data Scientist working at the intersection of Data Science and Finance. Previously, he was a Data Science fellow at NYU's Center for Data Science while on leave from a tenured faculty position at Aix-Marseille Université. Since completing his PhD in the Physics of Complex Systems in 2008, he has been pursuing the use of Data Science and Machine Learning to study Human Behavior. Using large datasets from Twitter, Wikipedia, web access logs, and Yahoo! Meme, he studied how we can observe both large-scale and individual human behavior in an unobtrusive and widespread manner. The main applications have been to the study of Computational Linguistics, Information Diffusion, Behavioral Change and Epidemic Spreading. In 2015 he was awarded the Complex Systems Society's Junior Scientific Award for "outstanding contributions in Complex Systems Science" and in 2018 he was named a Science Fellow of the Institute for Scientific Interchange in Turin, Italy.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Segment 1 – Understanding Time Series (30 min)

  • Empirical Examples
  • Trends
  • Seasons and Cycles
  • Break (10 min)
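As a rough illustration of the trend and seasonality topics in this segment, the sketch below splits a series into a straight-line trend, a repeating seasonal pattern, and a residual. It is not taken from the course material: the function name `decompose`, the assumption of a linear trend, and the assumption of a known, fixed `period` are all choices made here for illustration.

```python
import numpy as np

def decompose(x, period):
    """Split x into linear trend + periodic seasonal part + residual.

    Assumes (for illustration only) a straight-line trend and a
    known seasonal period given in number of samples.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))

    # Least-squares straight line through the data
    slope, intercept = np.polyfit(t, x, deg=1)
    trend = slope * t + intercept

    # Average the detrended values at each position within the period
    detrended = x - trend
    seasonal_means = np.array(
        [detrended[i::period].mean() for i in range(period)]
    )
    seasonal = seasonal_means[t % period]

    # Whatever is left over after removing trend and seasonality
    residual = x - trend - seasonal
    return trend, seasonal, residual

# Synthetic example: upward trend plus a 12-step cycle plus noise
rng = np.random.default_rng(42)
t = np.arange(120)
series = 0.1 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, size=120)
trend, seasonal, residual = decompose(series, period=12)
```

By construction the three components add back up to the original series, which makes the decomposition easy to sanity-check on synthetic data before trying it on a real dataset.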

Segment 2 – Processing Time Series Data (60 min)

  • Time series transformations (diff, lag, sqrt, etc.)
  • Resampling/fill methods
  • Bootstrapping/Jackknife
  • Autocorrelations and Partial Autocorrelation Function
  • Correlations between two time series
  • Break (10 min)
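The transformations and autocorrelation topics listed in this segment can be sketched with NumPy alone. The helper names below (`difference`, `lag`, `autocorrelation`) are hypothetical and chosen for this example; the autocorrelation uses the common sample estimator normalized by the lag-0 value, which is one of several conventions.

```python
import numpy as np

def difference(x, d=1):
    """Apply first differencing d times (x[t] - x[t-1])."""
    return np.diff(np.asarray(x, dtype=float), n=d)

def lag(x, k=1):
    """Shift the series forward by k steps, padding the start with NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, np.nan)
    if k < len(x):
        out[k:] = x[:len(x) - k]
    return out

def autocorrelation(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (lag 0 is always 1)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[:len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

# Example: differencing turns a quadratic sequence into a linear one
squares = np.arange(1, 6) ** 2      # 1, 4, 9, 16, 25
steps = difference(squares)         # 3, 5, 7, 9
acf = autocorrelation(squares, max_lag=2)
```

Padding the lagged series with NaN keeps it the same length as the input, so the two can be compared or plotted side by side without re-aligning indices.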

Segment 3 – Random Walks (30 min)

  • White noise
  • Drift
  • Smoothing/Rolling window
  • Fast Fourier Transform
  • Break (10 min)
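The topics in this segment fit together naturally: a random walk is the cumulative sum of white noise, optionally with a constant drift added at each step. The following is a minimal sketch under those assumptions; the function names, the Gaussian noise, and the fixed seed are illustrative choices, not the course's own code.

```python
import numpy as np

def random_walk(n_steps, drift=0.0, sigma=1.0, seed=0):
    """Cumulative sum of Gaussian white noise plus a constant drift."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=n_steps)
    return np.cumsum(drift + noise)

def rolling_mean(x, window):
    """Simple moving average; returns len(x) - window + 1 values."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

def power_spectrum(x):
    """Frequencies (cycles per sample) and power of the mean-removed series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    power = np.abs(np.fft.rfft(x)) ** 2
    return freqs, power

# A drifting walk, its smoothed version, and its spectrum
walk = random_walk(500, drift=0.05)
smooth = rolling_mean(walk, window=20)
freqs, power = power_spectrum(walk)
```

Using `mode="valid"` in the convolution avoids edge effects at the cost of a shorter output; other padding strategies are possible when the smoothed series must match the input length.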

Segment 4 – ARIMA Models (60 min)

  • Auto-regressive models (AR)
  • Moving Averages (MA)
  • Fitting ARIMA models
  • Seasonal ARIMA models
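To give a flavor of the autoregressive part of ARIMA, the sketch below simulates an AR(p) process and recovers its coefficients with ordinary least squares on lagged copies of the series. This is only one way to fit such a model and covers neither the MA terms nor the differencing step; the names `simulate_ar` and `fit_ar` are hypothetical, and the zero-mean, no-intercept setup is an assumption made to keep the example short.

```python
import numpy as np

def simulate_ar(phi, n_steps, sigma=1.0, seed=0):
    """Simulate x[t] = phi[0]*x[t-1] + ... + phi[p-1]*x[t-p] + noise."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    x = np.zeros(n_steps + p)
    noise = rng.normal(0.0, sigma, size=n_steps + p)
    for t in range(p, n_steps + p):
        # x[t-p:t][::-1] is x[t-1], x[t-2], ..., x[t-p]
        x[t] = np.dot(phi, x[t - p:t][::-1]) + noise[t]
    return x[p:]  # drop the zero-initialized start-up values

def fit_ar(x, p):
    """Estimate AR(p) coefficients by least squares (no intercept)."""
    x = np.asarray(x, dtype=float)
    # Column k holds the series lagged by k + 1 steps
    X = np.column_stack(
        [x[p - k - 1:len(x) - k - 1] for k in range(p)]
    )
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulate an AR(2) process and try to recover its coefficients
series = simulate_ar([0.6, -0.3], n_steps=5000)
estimated = fit_ar(series, p=2)
```

With enough data the estimates should land close to the coefficients used in the simulation, which is a useful check before applying the same fit to a real series whose true coefficients are unknown.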