Live Online Training

Concurrency in Python

Threads, Processes, and Coroutines Inside and Out

Vishal Prasad

This course is a deep-dive into concurrent programming in Python. With the tapering-off of Moore’s Law, future gains in gross computing power will be primarily underpinned by the ability of working programmers to build clean, scalable, and efficient concurrent systems. Python programmers have numerous flavors of concurrency available to them in the form of multithreading, multiprocessing, coroutines, and more. The aim of this course is to build a firm foundation for understanding these tools.

What you'll learn and how you can apply it

  • Understand the difference between concurrency and parallelism
  • Write threaded Python with locks and queues
  • Achieve parallelism via multiprocessing
  • Build concurrency primitives from coroutines
  • Build concurrent programs with asyncio

This training course is for you because...

  • You’re a programmer who wants to deep dive into concurrent programming
  • You’re a programmer who has worked with concurrent programs in other languages before, and wants to learn how things are done in Python
  • You’re a Python programmer at an intermediate level who would like to be exposed to advanced topics

Prerequisites

  • Attendees should be familiar with most of the material covered in David Beazley’s Python Programming Language LiveLessons (video)
  • If you are not very familiar with concurrency in general, watch this video of the legendary engineer Rob Pike talking about it (in the context of Go)

Course Set-up

  • A Python 3.6 interpreter

Recommended Follow-up

  • All advanced Python programmers should be familiar with David Beazley’s Python Cookbook, which contains hundreds of instructive, powerful, and clean code snippets for rumination
  • Luciano Ramalho’s Fluent Python is also an excellent resource. The book is structured more like a (very readable) textbook, and is perfect for the programmer who is comfortable working in Python but does not yet feel that their code is “Pythonic”
  • Beyond Python, for those of you who design distributed systems or who would like to, Martin Kleppmann’s Designing Data-Intensive Applications is a must-read. It is an exceptionally good resource written for the working programmer

  • If you are not already in the habit of doing so, you should regularly watch videos of lectures from Python conferences, which are archived at PyVideo. Particular talks that are relevant to concurrency are:

About your instructor

  • Vishal Prasad is a Backend Engineer at Peloton. He works on scaling distributed databases, cleaning up analysis pipelines, and punching race conditions in the face.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Python Thread Overview: The GIL, Python & Kernel Threads, CPython internals, Nonblocking I/O, C Extensions (20 mins)

  • In this section we will discuss the Global Interpreter Lock (GIL), and the limitations it does and does not impose on Python programs. We will learn why the GIL is used, why removing it is difficult, and how it can be worked around. We will introduce threads. We will look at the CPython interpreter and see how the GIL still allows for concurrent I/O. We will write a C extension that achieves native parallelism.
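As a rough illustration of how threads can overlap blocking I/O despite the GIL, the sketch below compares sequential and threaded waits. It uses `time.sleep` as a stand-in for a blocking I/O call, since it also releases the GIL while waiting; the function names `blocking_io`, `run_sequential`, and `run_threaded` are hypothetical and not taken from the course material.

```python
import threading
import time


def blocking_io(delay):
    # Stand-in for a blocking read or network call (assumption for this sketch).
    # time.sleep releases the GIL while it waits, so other threads can run.
    time.sleep(delay)


def run_sequential(n, delay):
    # Perform n blocking waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        blocking_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    # Perform n blocking waits in n threads and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=blocking_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", round(run_sequential(4, 0.2), 2), "seconds")
    print("threaded:  ", round(run_threaded(4, 0.2), 2), "seconds")
```

The threaded run should finish in roughly the time of a single wait, because the waits overlap. The same experiment with a CPU-bound pure-Python function in place of `blocking_io` would typically show no such speedup under the GIL.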

Locks: A little course on semaphores (20 mins)

  • In this section we will discuss thread locking. We will discuss deadlocking and semaphores, and look at a couple of toy programs that employ Python threads.
  • Participants will do some code exercises adapted from The Little Book of Semaphores.
  • We will discuss some larger problems with locks.
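To make the locking topics above concrete, here is a minimal sketch of two common patterns: a mutex protecting a shared counter, and a semaphore used for signaling between two threads (in the spirit of the signaling pattern in The Little Book of Semaphores). The class and function names are hypothetical, and this is not the course's own exercise code.

```python
import threading


class Counter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is not atomic, so guard it with the lock.
        with self._lock:
            self.value += 1


def count_with_threads(n_threads, n_increments):
    # Each of n_threads threads increments the shared counter n_increments times.
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


def signal_demo():
    # Signaling: thread B must not record its step until thread A has finished.
    done = threading.Semaphore(0)  # starts at 0, so acquire() blocks until release()
    events = []

    def thread_a():
        events.append("a1")
        done.release()

    def thread_b():
        done.acquire()
        events.append("b1")

    tb = threading.Thread(target=thread_b)
    ta = threading.Thread(target=thread_a)
    tb.start()  # start B first to show that it really waits for A
    ta.start()
    ta.join()
    tb.join()
    return events


if __name__ == "__main__":
    print(count_with_threads(4, 10000))
    print(signal_demo())
```

Starting the semaphore at zero is what turns it into a signal rather than a mutex: the waiting thread cannot proceed until the other thread releases it.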

Queues: “Share memory by communicating” (20 mins)

  • In this section we will discuss a new paradigm for sharing state between concurrent executors via thread-safe queues.
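One way to picture the queue-based approach is a producer thread and a consumer thread that share no mutable state other than two `queue.Queue` objects. The sketch below is an assumption about what such a program could look like; `producer`, `consumer`, `square_all`, and the sentinel object are illustrative names only.

```python
import queue
import threading

# Unique marker the producer sends to tell the consumer there is no more work.
_SENTINEL = object()


def producer(work, items):
    # Put every item on the work queue, then signal completion.
    for item in items:
        work.put(item)
    work.put(_SENTINEL)


def consumer(work, results):
    # Take items off the work queue until the sentinel arrives.
    while True:
        item = work.get()
        if item is _SENTINEL:
            break
        results.put(item * item)


def square_all(items):
    work = queue.Queue(maxsize=8)  # bounded: the producer blocks when it is full
    results = queue.Queue()        # unbounded: the consumer never blocks on put
    p = threading.Thread(target=producer, args=(work, items))
    c = threading.Thread(target=consumer, args=(work, results))
    p.start()
    c.start()
    p.join()
    c.join()
    # Both threads have finished, so draining the results queue is safe here.
    out = []
    while not results.empty():
        out.append(results.get())
    return out


if __name__ == "__main__":
    print(square_all(range(5)))
```

With a single consumer the results come out in the order the items were produced. With several consumers that ordering would no longer be guaranteed, and one sentinel per consumer would be needed.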

Multiprocessing: Concurrency AND Parallelism! (20 mins)

  • In this section we will discuss multiprocessing, and write some code that is both concurrent and parallel. We will write an analysis pipeline, map-reduce style.
  • We will also benchmark the memory and latency overheads of multiprocessing vs. multithreading.
  • General Q&A/Break (10 minutes)
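For the multiprocessing section, a map-reduce style pipeline might look roughly like the word count below. It is a sketch under the assumption that the input is already split into text chunks; `count_words`, `merge_counts`, and `parallel_word_count` are hypothetical names, not the pipeline built in the course.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def count_words(chunk):
    # Map step: runs in a worker process and counts the words in one chunk.
    return Counter(chunk.split())


def merge_counts(left, right):
    # Reduce step: combine two partial counts into one.
    return left + right


def parallel_word_count(chunks, processes=2):
    # Distribute the chunks across worker processes, then reduce in the parent.
    with Pool(processes=processes) as pool:
        partials = pool.map(count_words, chunks)
    return reduce(merge_counts, partials, Counter())
```

Because each worker is a separate process with its own interpreter and its own GIL, the map step can run on several cores at once. On platforms that start workers by spawning a fresh interpreter, `parallel_word_count` should be called from under an `if __name__ == "__main__":` guard, and the arguments and results must be picklable.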

Coroutines: Concurrency from scratch (40 minutes)

  • In this section we will build a coroutine-based concurrency library from scratch.
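A very small version of the idea can be sketched with plain generators: each task yields whenever it is willing to give up control, and a scheduler loop resumes the tasks in round-robin order. This is an illustrative assumption about the general approach, not the library built in the course; `countdown` and `run_round_robin` are hypothetical names.

```python
from collections import deque


def countdown(name, n, log):
    # A cooperative task: do one unit of work, then yield to the scheduler.
    while n > 0:
        log.append((name, n))
        n -= 1
        yield


def run_round_robin(tasks):
    # Resume each task in turn until every task has finished.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue    # the task finished, so do not reschedule it
        ready.append(task)


if __name__ == "__main__":
    log = []
    run_round_robin([countdown("a", 3, log), countdown("b", 2, log)])
    print(log)
```

All tasks run in a single thread, so no locks are needed: a task can only be interrupted at a `yield`. The cost is that a task which never yields starves all the others.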

Gevent: A little theory on Greenlets & Monkey Patching (10 minutes)

  • In this section we will look at some gevent internals, and learn the mechanics of monkey patching to understand how nonblocking concurrent I/O is achieved in the wild.
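The mechanics of monkey patching can be shown without gevent itself: a module attribute is replaced at runtime, and existing code that looks the function up through the module picks up the replacement. The sketch below uses only the standard library and swaps `time.sleep` for a recording stub. It does not reproduce gevent's internals, and `legacy_poll`, `patched_sleep`, and `demo` are hypothetical names.

```python
import contextlib
import time


def legacy_poll(attempts, delay):
    # Stand-in for existing code that calls time.sleep and cannot be edited.
    for _ in range(attempts):
        time.sleep(delay)
    return attempts


@contextlib.contextmanager
def patched_sleep(replacement):
    # Temporarily rebind time.sleep, restoring the original on exit.
    original = time.sleep
    time.sleep = replacement
    try:
        yield
    finally:
        time.sleep = original


def demo():
    calls = []
    # Inside the block, legacy_poll's calls go to calls.append instead of
    # the real sleep, so nothing actually blocks.
    with patched_sleep(calls.append):
        legacy_poll(3, 5.0)
    return calls


if __name__ == "__main__":
    print(demo())
```

The patch only affects callers that resolve `time.sleep` through the module at call time. Code that ran `from time import sleep` before the patch keeps its own reference to the original function, which is one reason patching is usually applied as early as possible in a program.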

AsyncIO: A practical course on AsyncIO: async/await, event loops, concurrent futures, and more (30 minutes)

  • This section will serve as a general introduction to AsyncIO. I will build out a web scraper using clean AsyncIO.
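The overall shape of an asyncio scraper can be sketched without any network access. In the example below, `fetch` is a hypothetical placeholder that simulates latency with `asyncio.sleep` instead of making an HTTP request, and the event loop is driven with `run_until_complete` so that the code also runs on the Python 3.6 interpreter listed in the course set-up.

```python
import asyncio


async def fetch(url, delay=0.05):
    # Placeholder for a real HTTP request (assumption for this sketch):
    # wait without blocking the event loop, then return a fake result.
    await asyncio.sleep(delay)
    return url, len(url)


async def scrape(urls):
    # Start all fetches concurrently and collect the results in input order.
    return await asyncio.gather(*(fetch(url) for url in urls))


def run(urls):
    # Drive the coroutine to completion on a fresh event loop.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(scrape(urls))
    finally:
        loop.close()


if __name__ == "__main__":
    for url, size in run(["example.org/a", "example.org/bb"]):
        print(url, size)
```

Because every `fetch` awaits at the same time, the total running time is close to the slowest single fetch rather than the sum of all of them. A real scraper would replace the body of `fetch` with a non-blocking HTTP client call.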

Hacks: A grab bag of curious ideas (5 minutes)

  • This section will contain some async hacks, which will be enjoyable to contemplate.

General Q&A 2 (5 minutes)