Data Engineering on Microsoft Azure (DP-203)

Course 8595

Duration: 4 days
Exam Voucher: Yes
Language: English
Level: Intermediate

In this Azure Data Engineering training course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Azure Data Engineering Training Delivery Methods

In-Person
Online

Azure Data Engineering Training Information

In this course, you will learn how to:

Explore compute and storage options for data engineering workloads in Azure.
Run interactive queries using serverless SQL pools.
Perform data exploration and transformation in Azure Databricks.
Explore, transform, and load data into the data warehouse using Apache Spark.
Ingest and load data into the data warehouse.
Transform data with Azure Data Factory or Azure Synapse pipelines.
Integrate data from notebooks with Azure Data Factory or Azure Synapse pipelines.
Support hybrid transactional analytical processing (HTAP) with Azure Synapse Link.
Implement end-to-end security with Azure Synapse Analytics.
Perform real-time stream processing with Stream Analytics.
Create a stream processing solution with Event Hubs and Azure Databricks.

Training Prerequisites

Successful students start this course with knowledge of cloud computing and core data concepts, and with professional experience with data solutions. Specifically, they should have completed:

Learning Tree course 8566, Microsoft Azure Fundamentals Training (AZ-900T00)
Learning Tree course 8586, Microsoft Azure Data Fundamentals Training (DP-900)

Certification Information

This course prepares you to take Microsoft certification exam DP-203.

Azure Data Engineering Training Outline

Module 1: Introduction to data engineering on Azure

Microsoft Azure provides a comprehensive platform for data engineering; but what is data engineering? Complete this module to find out.

In this module you will learn how to:

Identify common data engineering tasks
Describe common data engineering concepts
Identify Azure services for data engineering

Module 2: Introduction to Azure Data Lake Storage Gen2

Data lakes are a core element of data analytics architectures. Azure Data Lake Storage Gen2 provides a scalable, secure, cloud-based solution for data lake storage.

In this module you will learn how to:

Describe the key features and benefits of Azure Data Lake Storage Gen2
Enable Azure Data Lake Storage Gen2 in an Azure Storage account
Compare Azure Data Lake Storage Gen2 and Azure Blob storage
Describe where Azure Data Lake Storage Gen2 fits in the stages of analytical processing
Describe how Azure Data Lake Storage Gen2 is used in common analytical workloads

Module 3: Introduction to Azure Synapse Analytics

Learn about the features and capabilities of Azure Synapse Analytics - a cloud-based platform for big data processing and analysis.

In this module, you'll learn how to:

Identify the business problems that Azure Synapse Analytics addresses.
Describe core capabilities of Azure Synapse Analytics.
Determine when to use Azure Synapse Analytics.

Module 4: Use Azure Synapse serverless SQL pool to query files in a data lake

With Azure Synapse serverless SQL pool, you can leverage your SQL skills to explore and analyse data in files, without the need to load the data into a relational database.

After the completion of this module, you will be able to:

Identify capabilities and use cases for serverless SQL pools in Azure Synapse Analytics
Query CSV, JSON, and Parquet files using a serverless SQL pool
Create external database objects in a serverless SQL pool

Module 5: Use Azure Synapse serverless SQL pools to transform data in a data lake

By using a serverless SQL pool in Azure Synapse Analytics, you can use the ubiquitous SQL language to transform data in files in a data lake.

After completing this module, you'll be able to:

Use a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement to transform data.
Encapsulate a CETAS statement in a stored procedure.
Include a data transformation stored procedure in a pipeline.

Module 6: Create a lake database in Azure Synapse Analytics

Why choose between working with files in a data lake or a relational database schema? With lake databases in Azure Synapse Analytics, you can combine the benefits of both.

After completing this module, you will be able to:

Understand lake database concepts and components
Describe database templates in Azure Synapse Analytics
Create a lake database

Module 7: Analyse data with Apache Spark in Azure Synapse Analytics

Apache Spark is a core technology for large-scale data analytics. Learn how to use Spark in Azure Synapse Analytics to analyse and visualise data in a data lake.

After completing this module, you will be able to:

Identify core features and capabilities of Apache Spark.
Configure a Spark pool in Azure Synapse Analytics.
Run code to load, analyse, and visualise data in a Spark notebook.

Module 8: Transform data with Spark in Azure Synapse Analytics

Data engineers commonly need to transform large volumes of data. Apache Spark pools in Azure Synapse Analytics provide a distributed processing platform that they can use to accomplish this goal.

In this module, you will learn how to:

Use Apache Spark to modify and save dataframes
Partition data files for improved performance and scalability.
Transform data with SQL
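
To give a feel for what partitioning data files means in practice, here is a minimal sketch in plain Python rather than Spark. It writes rows into one folder per partition value, using the `column=value` folder naming that Spark's `partitionBy` also produces. The function name `write_partitioned`, the file name `part-0000.csv`, and the sample `orders` data are all assumptions made for illustration; they are not taken from the course material.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_partitioned(rows, root, partition_key):
    """Write rows to CSV files under root/<key>=<value>/part-0000.csv.

    Returns the list of files written, sorted by path.
    """
    # Group the rows by the value of the partition column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    written = []
    for value, members in groups.items():
        folder = Path(root) / f"{partition_key}={value}"
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / "part-0000.csv"
        # The partition column is encoded in the folder name, so it is
        # left out of the file itself.
        columns = [c for c in members[0] if c != partition_key]
        with target.open("w", newline="") as handle:
            writer = csv.DictWriter(
                handle, fieldnames=columns, extrasaction="ignore"
            )
            writer.writeheader()
            writer.writerows(members)
        written.append(target)
    return sorted(written)


# Hypothetical sample data, partitioned by year.
orders = [
    {"order_id": 1, "year": 2022, "amount": 10.0},
    {"order_id": 2, "year": 2023, "amount": 25.5},
    {"order_id": 3, "year": 2023, "amount": 7.25},
]

with tempfile.TemporaryDirectory() as tmp:
    files = write_partitioned(orders, tmp, "year")
    partition_folders = [f.parent.name for f in files]
    print(partition_folders)
```

A query that filters on `year` only needs to read the matching folder, which is the main reason partitioning can improve performance on large data sets.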

Module 9: Use Delta Lake in Azure Synapse Analytics

Delta Lake is an open source relational storage area for Spark that you can use to implement a data lakehouse architecture in Azure Synapse Analytics.

In this module, you'll learn how to:

Describe core features and capabilities of Delta Lake.
Create and use Delta Lake tables in a Synapse Analytics Spark pool.
Create Spark catalog tables for Delta Lake data.
Use Delta Lake tables for streaming data.
Query Delta Lake tables from a Synapse Analytics SQL pool.

Module 10: Analyse data in a relational data warehouse

Relational data warehouses are a core element of most enterprise Business Intelligence (BI) solutions, and are used as the basis for data models, reports, and analysis.

In this module, you'll learn how to:

Design a schema for a relational data warehouse.
Create fact, dimension, and staging tables.
Use SQL to load data into data warehouse tables.
Use SQL to query relational data warehouse tables.
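
As a rough illustration of fact and dimension tables, the sketch below builds a tiny star schema and runs an aggregate query against it. It uses Python's built-in `sqlite3` module as a stand-in so that it runs anywhere; it is not Azure Synapse, and the table names, columns, and sample rows are assumptions invented for the example.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for a warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct (
    ProductKey  INTEGER PRIMARY KEY,
    ProductName TEXT,
    Category    TEXT
);
CREATE TABLE FactSales (
    SalesKey   INTEGER PRIMARY KEY,
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    Quantity   INTEGER,
    Amount     REAL
);
INSERT INTO DimProduct VALUES
    (1, 'Road Bike', 'Bikes'),
    (2, 'Helmet', 'Accessories');
INSERT INTO FactSales VALUES
    (1, 1, 2, 2000.0),
    (2, 2, 5, 150.0),
    (3, 1, 1, 1000.0);
""")

# Join the fact table to its dimension and aggregate by a dimension attribute.
rows = conn.execute("""
SELECT d.Category, SUM(f.Quantity) AS Units, SUM(f.Amount) AS Revenue
FROM FactSales AS f
JOIN DimProduct AS d ON f.ProductKey = d.ProductKey
GROUP BY d.Category
ORDER BY d.Category
""").fetchall()

for category, units, revenue in rows:
    print(category, units, revenue)
```

The fact table holds the numeric measures and a key for each dimension, while the dimension table holds the descriptive attributes used for grouping and filtering.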

Module 11: Load data into a relational data warehouse

A core responsibility for a data engineer is to implement a data ingestion solution that loads new data into a relational data warehouse.

In this module, you'll learn how to:

Load staging tables in a data warehouse
Load dimension tables in a data warehouse
Load time dimensions in a data warehouse
Load slowly changing dimensions in a data warehouse
Load fact tables in a data warehouse
Perform post-load optimizations in a data warehouse
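
Loading a slowly changing dimension is the least obvious item in this list, so here is a minimal sketch of the common "Type 2" approach, where a changed record closes the old row and adds a new version instead of overwriting it. It is written in plain Python over a list of dictionaries rather than SQL, and the function `apply_scd2`, the column names (`surrogate_key`, `start_date`, `end_date`, `is_current`), and the sample data are all assumptions chosen for the example.

```python
from datetime import date


def apply_scd2(dimension, incoming, business_key, tracked, today):
    """Apply Type 2 changes from `incoming` to `dimension` in place."""
    # Index the current version of each business key.
    current = {row[business_key]: row for row in dimension if row["is_current"]}
    next_key = max((row["surrogate_key"] for row in dimension), default=0) + 1

    for record in incoming:
        existing = current.get(record[business_key])
        if existing is not None:
            if all(existing[c] == record[c] for c in tracked):
                continue  # nothing changed, keep the current version
            # Close the old version instead of overwriting it.
            existing["is_current"] = False
            existing["end_date"] = today
        new_row = {
            "surrogate_key": next_key,
            business_key: record[business_key],
            **{c: record[c] for c in tracked},
            "start_date": today,
            "end_date": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[record[business_key]] = new_row
        next_key += 1
    return dimension


# Hypothetical customer dimension with one existing row.
dim_customer = [
    {"surrogate_key": 1, "customer_id": "C1", "city": "Stockholm",
     "start_date": date(2023, 1, 1), "end_date": None, "is_current": True},
]
updates = [
    {"customer_id": "C1", "city": "London"},   # changed attribute
    {"customer_id": "C2", "city": "Toronto"},  # new customer
]
apply_scd2(dim_customer, updates, "customer_id", ["city"], date(2024, 6, 1))

for row in dim_customer:
    print(row["surrogate_key"], row["customer_id"], row["city"], row["is_current"])
```

Because each version gets its own surrogate key, fact rows loaded before the change keep pointing at the customer's old city, which preserves history in reports.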

Module 12: Build a data pipeline in Azure Synapse Analytics

Pipelines are the lifeblood of a data analytics solution. Learn how to use Azure Synapse Analytics pipelines to build integrated data solutions that extract, transform, and load data across diverse systems.

In this module, you will learn how to:

Describe core concepts for Azure Synapse Analytics pipelines.
Create a pipeline in Azure Synapse Studio.
Implement a data flow activity in a pipeline.
Initiate and monitor pipeline runs.

Module 13: Use Spark Notebooks in an Azure Synapse Pipeline

Apache Spark provides data engineers with a scalable, distributed data processing platform, which can be integrated into an Azure Synapse Analytics pipeline.

In this module, you will learn how to:

Describe notebook and pipeline integration.
Use a Synapse notebook activity in a pipeline.
Use parameters with a notebook activity.

Module 14: Plan hybrid transactional and analytical processing using Azure Synapse Analytics

Learn how hybrid transactional / analytical processing (HTAP) can help you perform operational analytics with Azure Synapse Analytics.

After completing this module, you'll be able to:

Describe Hybrid Transactional / Analytical Processing patterns.
Identify Azure Synapse Link services for HTAP.

Module 15: Implement Azure Synapse Link for SQL

Azure Synapse Link for SQL enables low-latency synchronization of operational data in a relational database to Azure Synapse Analytics.

In this module, you'll learn how to:

Understand key concepts and capabilities of Azure Synapse Link for SQL.
Configure Azure Synapse Link for Azure SQL Database.
Configure Azure Synapse Link for Microsoft SQL Server.

Module 16: Get started with Azure Stream Analytics

Azure Stream Analytics enables you to process real-time data streams and integrate the data they contain into applications and analytical solutions.

In this module, you'll learn how to:

Understand data streams.
Understand event processing.
Understand window functions.
Get started with Azure Stream Analytics.
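
Window functions are easier to picture with a small example. The sketch below counts events in fixed, non-overlapping one-minute windows, which is the idea behind a tumbling window. It is plain Python, not a Stream Analytics query, and the function `tumbling_window_counts` and the sample events are assumptions made for illustration. The sketch also assumes that an event landing exactly on a boundary belongs to the window that starts there; a real streaming engine may treat boundaries differently.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Windows are aligned to this fixed origin (an assumption of the sketch).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumbling_window_counts(events, size):
    """Count events per fixed, non-overlapping window of length `size`.

    Each event is a (timestamp, payload) pair with a timezone-aware
    timestamp. The result maps each window start to its event count.
    """
    counts = Counter()
    for timestamp, _payload in events:
        # Round the timestamp down to the start of its window.
        start = timestamp - (timestamp - EPOCH) % size
        counts[start] += 1
    return dict(sorted(counts.items()))


base = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
events = [
    (base + timedelta(seconds=5), "a"),
    (base + timedelta(seconds=59), "b"),
    (base + timedelta(seconds=60), "c"),
    (base + timedelta(seconds=125), "d"),
]
counts = tumbling_window_counts(events, timedelta(minutes=1))

for window_start, count in counts.items():
    print(window_start.isoformat(), count)
```

Every event falls into exactly one window, which is what distinguishes a tumbling window from overlapping window types.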

Module 17: Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics

Azure Stream Analytics provides a real-time data processing engine that you can use to ingest streaming event data into Azure Synapse Analytics for further analysis and reporting.

After completing this module, you'll be able to:

Describe common stream ingestion scenarios for Azure Synapse Analytics.
Configure inputs and outputs for an Azure Stream Analytics job.
Define a query to ingest real-time data into Azure Synapse Analytics.
Run a job to ingest real-time data, and consume that data in Azure Synapse Analytics.

Module 18: Visualise real-time data with Azure Stream Analytics and Power BI

By combining the stream processing capabilities of Azure Stream Analytics and the data visualization capabilities of Microsoft Power BI, you can create real-time data dashboards.

In this module, you'll learn how to:

Configure a Stream Analytics output for Power BI.
Use a Stream Analytics query to write data to Power BI.
Create a real-time data visualization in Power BI.

Module 19: Introduction to Microsoft Purview

In this module, you'll evaluate whether Microsoft Purview is the right choice for your data discovery and governance needs.

By the end of this module, you'll be able to:

Evaluate whether Microsoft Purview is appropriate for your data discovery and governance needs.
Describe how the features of Microsoft Purview work to provide data discovery and governance.

Module 20: Integrate Microsoft Purview and Azure Synapse Analytics

Learn how to integrate Microsoft Purview with Azure Synapse Analytics to improve data discoverability and lineage tracking.

After completing this module, you'll be able to:

Catalog Azure Synapse Analytics database assets in Microsoft Purview.
Configure Microsoft Purview integration in Azure Synapse Analytics.
Search the Microsoft Purview catalog from Synapse Studio.
Track data lineage in Azure Synapse Analytics pipeline activities.

Module 21: Explore Azure Databricks

Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.

In this module, you'll learn how to:

Provision an Azure Databricks workspace.
Identify core workloads and personas for Azure Databricks.
Describe key concepts of an Azure Databricks solution.

Module 22: Use Apache Spark in Azure Databricks

Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to transform, analyse and visualise data at scale.

In this module, you'll learn how to:

Describe key elements of the Apache Spark architecture.
Create and configure a Spark cluster.
Describe use cases for Spark.
Use Spark to process and analyse data stored in files.
Use Spark to visualise data

Module 23: Run Azure Databricks Notebooks with Azure Data Factory

Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.

In this module, you'll learn how to:

Describe how Azure Databricks notebooks can be run in a pipeline.
Create an Azure Data Factory linked service for Azure Databricks.
Use a Notebook activity in a pipeline.
Pass parameters to a notebook.

Get This Course 30 730 kr

4-Day Instructor-Led Training Course
Microsoft Official Courseware
One-on-one after-course coaching included
MOC exam voucher included in course tuition

Get the Advantage 35 495 kr

Take this course and gain unlimited access to more than 300 virtual instructor-led courses
Future-proof your career with more than 100 sought-after certifications in the market
Build real skills through hands-on learning in more than 180 virtual labs
Grow your skills and capabilities with more than one course at a time and save


Course dates:

May 21 - 24, 15:00 - 22:30 CEST, AnyWare
Jun 4 - 7, 10:00 - 17:30 CEST, London or AnyWare
Jun 4 - 7, 9:00 - 16:30 CEST, Stockholm or AnyWare
Jun 11 - 14, 15:00 - 22:30 CEST, New York or AnyWare
Jun 25 - 28, 15:00 - 22:30 CEST, AnyWare
Aug 6 - 9, 10:00 - 17:30 CEST, London or AnyWare
Aug 6 - 9, 9:00 - 16:30 CEST, Stockholm or AnyWare
Aug 6 - 9, 15:00 - 22:30 CEST, Herndon, VA or AnyWare
Aug 20 - 23, 15:00 - 22:30 CEST, Toronto or AnyWare
Aug 27 - 30, 16:00 - 23:30 CEST, Austin or AnyWare
Oct 1 - 4, 10:00 - 17:30 CEST, London or AnyWare
Oct 1 - 4, 9:00 - 16:30 CEST, Stockholm or AnyWare
Oct 29 - Nov 1, 14:00 - 21:30 CET, New York or AnyWare
Dec 3 - 6, 10:00 - 17:30 CET, London or AnyWare
Dec 3 - 6, 9:00 - 16:30 CET, Stockholm or AnyWare
Dec 10 - 13, 15:00 - 22:30 CET, Herndon, VA or AnyWare
Jan 7 - 10, 15:00 - 22:30 CET, Toronto or AnyWare
Jan 21 - 24, 16:00 - 23:30 CET, Austin or AnyWare
Feb 4 - 7, 9:00 - 16:30 CET, Stockholm or AnyWare
Feb 4 - 7, 10:00 - 17:30 CET, London or AnyWare
Mar 25 - 28, 14:00 - 21:30 CET, New York or AnyWare


Bring this or any training to your organisation
Full-scale programme development
Delivered when, where, and how you want it
Blended learning models
Tailored content
Expert team coaching
MOC exam voucher included in course tuition

Questions about this course?

Customise Your Team Training Experience

Call +46 (0)8 506 668 00.


Need Help Finding The Right Training Solution?

Our training advisors are here for you.

Azure Data Engineering Training FAQs

What is Azure Data Engineering Training (DP-203)?
Microsoft Azure Data Engineering Training (DP-203) teaches participants how to design and implement data solutions using Azure services.

Who is this course intended for?
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course is data analysts and data scientists who work with analytical solutions built on Microsoft Azure.

What topics are covered in the course?
The course covers a variety of topics related to data engineering, including designing data storage solutions, ingesting and processing data, implementing data security, and monitoring and optimizing data solutions.

What skills will I gain from this course?
Participants will gain skills in designing and implementing data solutions using Azure services, as well as in data processing and transformation, data security, and monitoring and optimization.

How long is the course?
The course is live and instructor-led over four days.

What is the format of the course?
The course is online and consists of modules that include videos, readings, and hands-on labs. Participants can work through the course at their own pace.

Is there a certification exam for this course?
Yes, participants who complete the course can take the DP-203 exam to earn the Microsoft Certified: Azure Data Engineer Associate certification.

What are the prerequisites for taking this course?
Participants should have experience working with data, including data processing and transformation. Some familiarity with Azure services is also helpful. Specifically, you should have completed both Learning Tree course 8566, Microsoft Azure Fundamentals Training (AZ-900T00) and Learning Tree course 8586, Microsoft Azure Data Fundamentals Training (DP-900).

Does the DP-203 Microsoft Certified Exam replace both Exams DP-200 and DP-201?
Yes, Exam DP-203 replaced both Exam DP-200 and Exam DP-201, which retired on June 30, 2021.

How do I access my Microsoft Exam Voucher?
Please reach out to info@learningtree.com after your course to obtain your exam voucher.