• Welcome to CloudMonks
  • +91 9849668223
  • info@thecloudmonks.com

Azure Data Factory Training

About The Course

Azure data engineers are responsible for data-related tasks that include provisioning data storage services, ingesting batch and streaming data, implementing security requirements, transforming data, implementing data retention policies, identifying performance bottlenecks, and accessing external data sources. In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. However, on its own, raw data doesn't have the proper context or meaning to provide meaningful insights to analysts, data scientists, or business decision makers.

Big data requires a service that can orchestrate and operationalize processes to refine these enormous stores of raw data into actionable business insights. Azure Data Factory is a managed cloud service that's built for these complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.

For example, imagine a gaming company that collects petabytes of game logs that are produced by games in the cloud. The company wants to analyze these logs to gain insights into customer preferences, demographics, and usage behavior. It also wants to identify up-sell and cross-sell opportunities, develop compelling new features, drive business growth, and provide a better experience to its customers.

To analyze these logs, the company needs to use reference data such as customer information, game information, and marketing campaign information that is in an on-premises data store. The company wants to utilize this data from the on-premises data store, combining it with additional log data that it has in a cloud data store.

To extract insights, it hopes to process the joined data by using a Spark cluster in the cloud (Azure HDInsight), and publish the transformed data into a cloud data warehouse such as Azure Synapse Analytics to easily build a report on top of it. They want to automate this workflow, and monitor and manage it on a daily schedule. They also want to execute it when files land in a blob store container.

Azure Data Factory is the platform that solves such data scenarios. It is the cloud-based ETL and data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. Using Azure Data Factory, you can create and schedule data-driven workflows (called pipelines) that can ingest data from disparate data stores. You can build complex ETL processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database.
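To give a feel for what such a pipeline looks like, here is a minimal sketch of the JSON shape of an ADF pipeline with a single Copy activity, expressed as a Python dict. The pipeline, dataset, and source/sink type names are illustrative placeholders, not a definitive definition.

```python
import json

# Hypothetical pipeline definition mirroring the JSON shape ADF uses:
# a pipeline holds activities; a Copy activity references input and
# output datasets and declares source/sink types.
pipeline = {
    "name": "CopyGameLogsPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyLogsToWarehouse",
                "type": "Copy",
                "inputs": [
                    {"referenceName": "GameLogsBlobDataset",
                     "type": "DatasetReference"}
                ],
                "outputs": [
                    {"referenceName": "WarehouseStagingDataset",
                     "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "SqlDWSink"}
                }
            }
        ]
    }
}

print(json.dumps(pipeline, indent=2))
```

In practice this JSON is authored for you by the ADF visual designer; the sketch only shows how the key components (pipeline, activity, dataset references) nest.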

Module 1: Cloud Computing Concepts

  • What is the "Cloud"?
  • Why cloud services
  • Types of cloud models
    • Deployment Models
      • Private Cloud
      • Public Cloud
      • Hybrid Cloud
  • Types of cloud services
    • Infrastructure as a Service (IaaS)
    • Platform as a Service (PaaS)
    • Software as a Service (SaaS)
  • Comparing Cloud Platforms
    • Microsoft Azure
    • Amazon Web Services
    • Google Cloud Platform
  • Characteristics of cloud computing
    • On-demand self-service
    • Broad network access
    • Multi-tenancy and resource pooling
    • Rapid elasticity and scalability
    • Measured service
  • Cloud Data Warehouse Architecture
    • Shared Memory architecture
    • Shared Disk architecture
    • Shared Nothing architecture

Module 2: Big Data Introduction

  • What is Big Data?
  • Big Data Sources
  • Data vs Information
  • Characteristics of Big Data
    • Variety
    • Velocity
    • Volume
    • Veracity
    • Value
  • Types of Big Data
    • Structured Data
    • Unstructured Data
    • Semi-Structured Data

Module 3: Dimensional Modelling

  • OLTP System
    • Relational Modelling
  • Characteristic Features of OLTP
  • Enterprise Data Warehouse
    • Dimensional Modelling
  • Dimensional Modelling-Schemas
    • Star Schema
    • Snowflake Schema
    • Multi Star Schema
  • Dimension Tables
  • Fact Tables
  • Types of Slowly Changing Dimensions
    • Type 1 Dimension
    • Type 2 Dimension
    • Type 3 Dimension
  • Types of Facts
    • Additive Facts
    • Semi-Additive Facts
    • Non-Additive Facts
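To make the SCD types above concrete, here is a hedged pure-Python sketch using a hypothetical customer dimension: a Type 1 update overwrites the attribute in place (history is lost), while a Type 2 update expires the current row and appends a new current version.

```python
from datetime import date

# Hypothetical dimension table: Type 2 tracks history with
# valid_from/valid_to dates and an is_current flag.
dim_customer = [
    {"customer_id": 1, "city": "Mumbai", "valid_from": date(2020, 1, 1),
     "valid_to": None, "is_current": True},
]

def scd_type1_update(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place -- no history kept."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["city"] = new_city

def scd_type2_update(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row, then append a new current version."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["valid_to"] = change_date
            row["is_current"] = False
    rows.append({"customer_id": customer_id, "city": new_city,
                 "valid_from": change_date, "valid_to": None,
                 "is_current": True})

scd_type2_update(dim_customer, 1, "Hyderabad", date(2023, 6, 1))
current = [r for r in dim_customer if r["is_current"]]
print(len(dim_customer), current[0]["city"])  # two rows of history, one current
```

Type 3 (not shown) would instead keep a single row with an extra "previous value" column such as `prior_city`.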

Module 4: Azure SQL Database

  • Introduction to Azure SQL Database
  • Comparing Single Database and Managed Instance
  • Creating and Using SQL Server
  • Creating SQL Database Services
  • Azure SQL Database Tools
  • Migrating an on-premises database to Azure SQL Database
  • Purchasing Models
  • DTU service tiers
  • vCore-based Model
  • Serverless compute tier
  • Service Tiers
    • General purpose / Standard
    • Business Critical / Premium
    • Hyperscale
  • Deployment of an Azure SQL Database
  • Elastic Pools
  • What are SQL elastic pools?
    • Choosing the correct pool size
  • Creating a New Pool
  • Manage Pools

Module 5: Azure Storage Service

  • Azure Storage Account
  • Features of Azure Storage Service
  • Introduction to Blob Storage Service
  • Blob Storage Architecture
  • Blob Storage Features
  • Types of Blobs
    • Block Blobs
    • Append Blobs
    • Page Blobs
  • Creating a Storage Account
  • Azure Storage Performance Tiers
    • Standard
    • Premium Performance
  • Understanding Data Replication
    • LRS (Locally Redundant Storage)
    • ZRS (Zone Redundant Storage)
    • GRS (Geo Redundant Storage)
  • Azure Storage-Access Tiers
    • Hot
    • Cool
    • Archive
  • Working with Containers and Blobs
  • Soft Delete
  • Azure Storage Explorer
  • Access blobs securely
  • Access Key
  • Account Shared Access Signature (SAS)
  • Service Shared Access Signature (SAS)
  • Azure Storage Scalability and Limits

Module 6: Azure Data Lake Storage Services

  • Introduction to Azure Data Lake
  • What is a Data Lake?
  • What is Azure Data Lake?
  • Data Lake Architecture
  • Working with Azure Data Lake Storage Gen1
  • Features of Data Lake Storage Gen1
  • Understanding Azure Data Lake Gen2
  • Features of Data Lake Storage Gen2
  • Differences Between Gen1 & Gen2 Storage
  • Exploring Data Lake Storage
  • Provisioning Data Lake Storage Gen1 Service
  • Provisioning Data Lake Storage Gen2 Service
  • Uploading Sample File
  • Using Azure Portal
  • Using Storage Explorer

Module 7: Azure Data Factory Introduction

  • What is Azure Data Factory(ADF)?
  • Azure Data Factory Key Components
    • Pipeline
    • Activity
    • Linked Service
    • Data Set
    • Integration Runtime
    • Triggers
    • Data Flows
  • Create Resource Group
  • Create Storage Account
  • Creation of Azure Data Factory Service

Module 8: Working with Copy Activity

  • Understanding Azure Data Factory UI
  • Copy Data from Blob Storage Service to Azure SQL Database
  • Copy data from file storage account to file storage account
  • Create Linked service for various data stores and compute
  • Creation of Datasets that points to file and table
  • Design Pipelines with various activities
  • Create SQL Server on Virtual Machines (On-Premises)
  • Define Copy Activity and its features
  • Copy Activity - Copy Behaviour
  • Copy Activity - Data Integration Units
  • Copy Activity - User Properties
  • Copy Activity - Number of parallel copies
  • Working with Lookup Activity
  • Understanding of Each Activity
  • Filter Activity
  • Get Metadata Activity
  • Lift and Shift
  • Hosting Azure - SSIS Integration Runtime
  • Execute SSIS Packages from ADF
  • Monitor pipeline runs
  • Debug pipeline
  • Trigger pipeline manually
  • Trigger pipeline on schedule
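The Get Metadata → Filter → ForEach → Copy pattern covered in this module can be sketched outside ADF as plain Python, just to show the control flow each activity contributes. The landing/staging folders below are hypothetical stand-ins for blob containers.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical source and sink folders standing in for blob containers.
landing = Path(tempfile.mkdtemp())
staging = Path(tempfile.mkdtemp())
for name in ["sales.csv", "sales.json", "orders.csv"]:
    (landing / name).write_text("demo data")

# Get Metadata activity: list the child items of the source folder.
child_items = [p.name for p in landing.iterdir()]

# Filter activity: keep only the CSV files.
csv_files = [n for n in child_items if n.endswith(".csv")]

# ForEach activity wrapping a Copy activity: copy each match to the sink.
for name in csv_files:
    shutil.copy(landing / name, staging / name)

print(sorted(p.name for p in staging.iterdir()))
# → ['orders.csv', 'sales.csv']
```

In ADF itself, each step is a separate activity wired together on the pipeline canvas, with the Filter output feeding the ForEach's items property.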

Module 9 : Practical Scenarios and Use Cases

  • ADF_PracticeSession1_Blob_To_Blob
  • ADF_PracticeSession2_CopyActivity_Prefix_Wildcard_FilePath_Blob_To_Blob
  • ADF_PracticeSession3_Blob_To_Azure_SQLDB
  • ADF_PracticeSession4_Blob_To_Azure_SQLDB
  • ADF_PracticeSession5_Dataset_Parameters_Blob_To_Azure_SQLDB
  • ADF_PracticeSession6_Blob_To_ADLS_Gen2
  • ADF_PracticeSession7_ADLS_Gen1_To_ADLS_Gen2
  • ADF_PracticeSession8_Pipeline_Dataset_LinkedService_Parameters
  • ADF_PracticeSession9_FilteringFileFormats_Getmetadata_Filter_ForEach_Copy_Activity
  • ADF_PracticeSession10_FilteringFileFormats_Getmetadata_Filter_ForEach_Copy_Activity
  • ADF_PracticeSession11_BulkCopy_Tables_Files
  • ADF_PracticeSession12_Container_Parameterization_Blob_To_Blob_Storage
  • ADF_PracticeSession13_ExecuteCopyActivity_BasedOnFileCount
  • ADF_PracticeSession14_StoredProcedures_Parameters
  • ADF_PracticeSession15_CopyActivity_CustomSQL_Queries_StoredProcedures
  • ADF_PracticeSession16_Pipeline_Audit_Log
  • ADF_PracticeSession17_Copybehaviour
  • ADF_PracticeSession18_CSV_To_JSON_Format
  • ADF_PracticeSession19_Copy_JSON_File_To_AzureSQL
  • ADF_PracticeSession20_Add_AdditionalColumns_WhileCopyingData
  • ADF_PracticeSession21_CopyDataTool
  • ADF_PracticeSession22_Custom_Email_Notification
  • ADF_PracticeSession23_AzureKeyVault_Integration
  • ADF_PracticeSession24_Incremental_Load
  • ADF_PracticeSession25_Integration_Runtime
  • ADF_PracticeSession26_On-Premise_SQLServer_ADLS_Gen2
  • ADF_PracticeSession27_On-Premise_FileSystem_ADLS_Gen2
  • ADF_PracticeSession28_AzureSynapseAnalytics_Integration
  • ADF_PracticeSession29_AzureSynapse_BlobStorage_Polybase_Integration
  • ADF_PracticeSession30_AzureSynapse_AzureSQLDatabase_Polybase_CopyStatement_Integration
  • ADF_PracticeSession31_AWS_S3_Integration
  • ADF_PracticeSession32_AWS_S3_Integration
  • ADF_PracticeSession33_GCP_Integration
  • ADF_PracticeSession34_Snowflake_Integration
  • ADF_PracticeSession35_REST_API_Integration
  • ADF_PracticeSession36_CosmosDB_Introduction
  • ADF_PracticeSession37_Eventbased_Trigger
  • ADF_PracticeSession38_Scheduled_Trigger
  • ADF_PracticeSession39_TumblingWindow_Trigger
  • ADF_PracticeSession40_Blob_SQLDB_Executepipeline_Activity
  • ADF_PracticeSession42_SQLDB_BLOB_Overwrite_Append_Mode
  • ADF_PracticeSession43_Dataflows_Introduction
  • ADF_PracticeSession44_Dataflows_Select_Filter_DerivedColumn_Transformation
  • ADF_PracticeSession45_Dataflows_Select_DerivedColumn_Aggregator_Sort_Transformation
  • ADF_PracticeSession46_Dataflows_ConditionalSplit_Transformation
  • ADF_PracticeSession47_Dataflows_Join_Transformation
  • ADF_PracticeSession48_Dataflows_Union_Transformation
  • ADF_PracticeSession49_Dataflows_Lookup_Transformation
  • ADF_PracticeSession50_Dataflows_Exists_Transformation
  • ADF_PracticeSession51_Dataflows_Rank_Transformation
  • ADF_PracticeSession52_Dataflows_Pivot_Transformation
  • ADF_PracticeSession53_Dataflows_UnPivot_Transformation
  • ADF_PracticeSession54_Dataflows_SurrogateKey_Transformation
  • ADF_PracticeSession55_Dataflows_Windows_Transformation
  • ADF_PracticeSession56_Dataflows_AlterRow_Transformation
  • ADF_PracticeSession57_Switch Activity-Move and delete data
  • ADF_PracticeSession58_Until Activity-Parameters & Variables
  • ADF_PracticeSession59_Copy Recent Files From Blob input to Blob Output folder without LPV
  • ADF_PracticeSession60_Nested ForEach -pass parameters from Master to child pipeline
  • ADF_PracticeSession61_Restart data processing from failure
  • ADF_PracticeSession62_Remove Duplicate rows using data flows
  • ADF_PracticeSession63_Slowly Changing Dimension Type1 (SCD1) with HashKey Function
  • ADF_PracticeSession64_Slowly Changing Dimension Type2

Module 10: Assignments & Case Studies

  • ADF_Azure_Functions Integration
  • ADF_Azure_HDInsight Integration
  • ADF_Azure_HDInsight with Spark Cluster
  • ADF_Azure_Databricks Integration
  • ADF_Azure_Data Lake Analytics integration

Module 11: Introduction to Azure Databricks

  • Introduction to Databricks
  • Azure Databricks Architecture
  • Azure Databricks Main Concepts

Module 12: Databricks Cluster Management

  • Creating and configuring clusters
  • Managing Clusters
    • Displaying clusters
    • Starting a cluster
    • Terminating a cluster
    • Delete a cluster
    • Cluster Information
    • Cluster logs
    • Cluster access control
  • Types of Clusters
    • All-purpose clusters
    • Job cluster
  • Databricks Pools
    • Databricks without pools
    • Databricks with Pools
  • Cluster Modes
    • Standard
    • High Concurrency
    • Single Node
  • Autoscaling
  • Databricks runtime versions
  • Multiuser Clusters

Module 13: Databricks Notebook Core Functionalities

  • Creating and managing notebooks
  • Exporting notebooks
  • Importing notebooks
  • Attaching a notebook to a cluster
  • Spark environment variables
    • SparkContext(sc)
    • SQLContext/HiveContext(sqlContext)
    • SparkSession(spark)
  • Scheduling a notebook
  • Default Language
  • Notebook permissions
  • Folder permissions
  • Cloning notebook
  • Renaming notebook

Module 14: Databricks Utilities and Notebook Parameters

  • Dbutils commands on files, directories
  • Notebooks and libraries
  • Databricks Variables
  • Widget Types
  • Databricks notebook parameters

Module 15: Databricks Integration with Azure Blob Storage

  • Read data from Blob Storage and Creating Blob mount point

Module 16: Databricks Integration with Azure Data Lake Storage Gen2

  • Reading files from Azure Data Lake Storage Gen2

Module 17: Databricks Integration with Azure Data Lake Storage Gen1

  • Reading Files from data lake storage Gen1

Module 18: Databricks Integration with Azure Data Lake Storage Gen2

  • Reading Files from data lake storage Gen2

Module 19: Reading and Writing CSV Files in Databricks

  • Read CSV Files
  • Read TSV Files and Pipe-Separated CSV Files
  • Read CSV Files with multiple delimiters in Spark 2 and Spark 3
  • Read multi-delimiter CSV files with delimiters in different positions
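In Databricks the separator is set on the reader, e.g. `spark.read.option("sep", "|").option("header", True).csv(path)`. As a hedged stand-in that runs without a Spark cluster, the same delimiter handling can be sketched with Python's standard csv module; the sample data below is hypothetical.

```python
import csv
import io

# Hypothetical pipe-separated content standing in for a file in storage.
pipe_data = "id|name|city\n1|Asha|Hyderabad\n2|Ravi|Pune\n"

# DictReader uses the first row as the header; delimiter="|" handles
# the pipe separator (use "\t" for TSV files).
reader = csv.DictReader(io.StringIO(pipe_data), delimiter="|")
rows = list(reader)
print(rows[0]["name"])  # → Asha
```

The concept is the same in both environments: the parser is told which character splits the fields, rather than assuming a comma.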

Module 20: Introduction to Python

  • Introduction to Python
  • Datatypes In Python
  • Operators in Python
  • Input And Output
  • Control Statements
  • Strings and Characters
  • Lists
  • Tuples
  • Dictionaries
  • Sets
  • Functions
  • Modules

Train your teams on the theory and build technical mastery of the cloud computing skills essential to the enterprise, such as security, compliance, and migration on AWS, Azure, and Google Cloud Platform.

Talk With Us