Data Superstream: The Data Engineering Lifecycle
Closed Captioning available in German, English, Spanish, French, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

Sponsored by Redpanda

Millions (if not billions) of touch points from customers, systems, and processes enter the average business’s data stream every day. Farther down that stream, analysts, data scientists, and ML engineers use that data to develop hypotheses, identify insights, feed learning models, and much more. The data engineer’s job is to manage this lifecycle from initial generation through storage, ingestion, and transformation to serving the data, using tools like AWS, Azure, Google Cloud, Spark, Kafka, SQL, and many more. The work is important and no small feat. That’s why data engineering is one of the fastest-growing jobs, and why data engineers are employed by many of the most recognizable tech companies in the world, including IBM, Amazon, Microsoft, Apple, Google, and Facebook.

Join experienced industry experts to learn how the data engineering lifecycle fits into the overall data lifecycle, explore the technologies you’ll need to master along the path from generation to serving, and better understand how to meet the needs of analysts, scientists, and ML engineers as well as the business stakeholders and customers driving decisions.

What you’ll learn and how you can apply it
  • Discover how the data engineering lifecycle allows data professionals to design and build a robust architecture
  • Standardize the process of ML model deployment and monitoring with MLOps
  • Learn essential data preprocessing techniques crucial for harnessing the potential of LLMs

This live course is for you because…
  • You're a data engineer, ML engineer, or data scientist.
  • You want to effectively approach the data lifecycle from ingestion to labeling to solving problems with machine learning.
  • You want to learn more about prompt engineering and management to tame the inherent unpredictability of AI-generated outputs.

You might also like

Data Superstream: Data Lakes and Warehouses

Alistair Croll, Lena Hall, Vini Jaiswal, Einat Orr, Wannes Rosiers, Jessica Larson, Ryan Blue, Tejas Chopra
Data Superstream: Data Lakes and Warehouses 2023

Alistair Croll, Piethein Strengholt, Joy Payton, Max Schultze, Arif Wider, Gavita Regunath, Sam Redai, Rachel Bradley-Haas, Robert Bruce Thompson
Data Superstream: Real-Time Analytics

Lorien Pratt, Karin Wolok, Mary Grygleski, Ben Gamble, Mark Needham, Dunith Dhanushka, Chris Andrassy, James Corcoran, Yingjun Wu
Data Superstream: Analytics Engineering

Alistair Croll, Anna Filippova, Emilie Schario, Lewis Davies, Jacob Frackson, Benn Stancil, Nick Acosta, Elizabeth Caley

Publisher Resources

ISBN: 0636920906889