• Welcome to CloudMonks
  • +91 96660 64406 | +91 9849668223
  • info@thecloudmonks.com

Azure Data Factory Training

About The Course

Azure data engineers are responsible for data-related tasks that include provisioning data storage services, ingesting streaming and batch data, implementing security requirements, transforming data, implementing data retention policies, identifying performance bottlenecks, and accessing external data sources. In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. However, on its own, raw data doesn't have the proper context or meaning to provide meaningful insights to analysts, data scientists, or business decision makers.

Big data requires a service that can orchestrate and operationalize processes to refine these enormous stores of raw data into actionable business insights. Azure Data Factory is a managed cloud service that's built for these complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.

For example, imagine a gaming company that collects petabytes of game logs that are produced by games in the cloud. The company wants to analyze these logs to gain insights into customer preferences, demographics, and usage behavior. It also wants to identify up-sell and cross-sell opportunities, develop compelling new features, drive business growth, and provide a better experience to its customers.

To analyze these logs, the company needs to use reference data such as customer information, game information, and marketing campaign information that is in an on-premises data store. The company wants to utilize this data from the on-premises data store, combining it with additional log data that it has in a cloud data store.

To extract insights, the company hopes to process the joined data by using a Spark cluster in the cloud (Azure HDInsight) and publish the transformed data into a cloud data warehouse such as Azure Synapse Analytics, so that it can easily build reports on top of it. It wants to automate this workflow and to monitor and manage it on a daily schedule. It also wants to execute the workflow when files land in a blob store container.

Azure Data Factory is the platform that solves such data scenarios. It is the cloud-based ETL and data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. Using Azure Data Factory, you can create and schedule data-driven workflows (called pipelines) that can ingest data from disparate data stores. You can build complex ETL processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database.


Azure Data Factory:

Module 1: Cloud Computing Concepts

  • What is the "Cloud"?
  • Why Cloud Services
  • Types of cloud services
    • Infrastructure as a Service (IaaS)
    • Platform as a Service (PaaS)
    • Software as a Service (SaaS)

Module 2: Big Data Introduction

  • What is Big Data?
  • Characteristics of Big Data
  • Types of Big Data
    • Structured Data
    • Unstructured Data
    • Semi-Structured Data

Module 3: Azure Cloud Storage Technologies

  • Azure Blob Storage
  • Azure Data Lake Storage Gen1
  • Azure Data Lake Storage Gen2
  • Azure SQL Database
  • Synapse Dedicated SQL Pool

Module 4: Azure Blob Storage

  • Storage Account
  • Containers
  • Types Of Blobs
  • Performance Tiers
  • Access Tiers
  • Data Replication Policies

Module 5: Azure Data Lake Storage Gen2

  • Enable Hierarchical Namespace
  • Access Control List (ACL)
  • Features of ADLS Gen2

Module 6: Azure SQL Database

  • Compute & Storage Configurations
  • vCore Based Purchasing Model
  • DTU Based Purchasing Model
  • Firewall Rules

Module 7: Azure Data Factory Introduction

  • What is Azure Data Factory (ADF)?
  • Azure Data Factory Key Components
    • Pipeline
    • Activity
    • Linked Service
    • Dataset
    • Integration Runtime
    • Triggers
    • Data Flows
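
As a rough illustration of how the components listed above refer to one another, the sketch below builds the JSON-style definitions of a linked service, two datasets, a pipeline with one Copy activity, and a daily schedule trigger as plain Python dictionaries. Every name in it (ExampleBlobStorage, RawLogs, CuratedLogs, ExampleCopyPipeline, ExampleDailyTrigger) and the connection string placeholder are assumptions made for this example, and the field names should be checked against the current Azure Data Factory documentation before use. Integration runtimes and data flows are left out to keep it short.

```python
import json

# Linked service: connection information for a data store (hypothetical name).
linked_service = {
    "name": "ExampleBlobStorage",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {"connectionString": "<placeholder, not a real value>"},
    },
}


def blob_dataset(name, container, folder):
    """Dataset: a named view of data reachable through a linked service."""
    return {
        "name": name,
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": linked_service["name"],
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": container,
                    "folderPath": folder,
                }
            },
        },
    }


source = blob_dataset("RawLogs", "raw", "logs")
sink = blob_dataset("CuratedLogs", "curated", "logs")

# Pipeline: a logical group of activities; here a single Copy activity.
pipeline = {
    "name": "ExampleCopyPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyRawToCurated",
                "type": "Copy",
                "inputs": [{"referenceName": source["name"], "type": "DatasetReference"}],
                "outputs": [{"referenceName": sink["name"], "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "DelimitedTextSink"},
                },
            }
        ]
    },
}

# Trigger: decides when the pipeline runs; here once per day.
trigger = {
    "name": "ExampleDailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2024-01-01T00:00:00Z",
                "timeZone": "UTC",
            }
        },
        "pipelines": [
            {"pipelineReference": {"referenceName": pipeline["name"], "type": "PipelineReference"}}
        ],
    },
}

print(json.dumps(pipeline, indent=2))
```

The point of the sketch is the chain of references: the trigger names the pipeline, the activity names its datasets, and each dataset names the linked service that holds the connection.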

Module 8: Working with Copy Activity

  • Create linked services for data stores and compute
  • Create datasets that point to files and tables
  • Design pipelines with Copy activities
  • Define the Copy activity and its features
  • Copy Activity: Copy Behavior
  • Copy Activity: Data Integration Units
  • Copy Activity: User Properties
  • Copy Activity: Number of Parallel Copies
  • Monitoring Pipeline
  • Debug Pipeline
  • Trigger pipeline manually
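
The Copy activity settings named in this module (copy behavior, Data Integration Units, user properties, and parallel copies) can be pictured as fields of the activity's JSON definition. The helper below is a minimal sketch that assembles such a definition as a Python dictionary. The function name copy_activity, the dataset names, and the choice of Binary source and sink over Blob Storage are assumptions for illustration; the property names follow the Copy activity JSON as generally documented and should be verified before relying on them.

```python
ALLOWED_COPY_BEHAVIORS = {"PreserveHierarchy", "FlattenHierarchy", "MergeFiles"}


def copy_activity(name, source_dataset, sink_dataset, *,
                  copy_behavior="PreserveHierarchy",
                  parallel_copies=None,
                  data_integration_units=None,
                  user_properties=None):
    """Build a Copy activity definition (hypothetical helper, not an SDK call)."""
    if copy_behavior not in ALLOWED_COPY_BEHAVIORS:
        raise ValueError(f"unsupported copy behavior: {copy_behavior}")

    type_properties = {
        "source": {
            "type": "BinarySource",
            "storeSettings": {"type": "AzureBlobStorageReadSettings", "recursive": True},
        },
        "sink": {
            "type": "BinarySink",
            # copyBehavior controls how the folder structure is written at the sink.
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings",
                "copyBehavior": copy_behavior,
            },
        },
    }
    # Both settings are optional; when omitted the service picks its own values.
    if parallel_copies is not None:
        type_properties["parallelCopies"] = parallel_copies
    if data_integration_units is not None:
        type_properties["dataIntegrationUnits"] = data_integration_units

    activity = {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": type_properties,
    }
    # User properties are name/value labels that appear in monitoring views.
    if user_properties:
        activity["userProperties"] = [
            {"name": key, "value": value} for key, value in user_properties.items()
        ]
    return activity


example = copy_activity(
    "CopyLandingToArchive",
    "LandingFolder",
    "ArchiveFolder",
    copy_behavior="FlattenHierarchy",
    parallel_copies=4,
    data_integration_units=8,
    user_properties={"Source": "landing", "Destination": "archive"},
)
```

Leaving parallel_copies and data_integration_units unset keeps them out of the definition entirely, which mirrors the usual advice to start from the service defaults and tune only after measuring.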

Module 9: ADF Activities

  • Lookup Activity
  • Get Metadata Activity
  • Delete Activity
  • Data Flow Activity
  • Execute Pipeline Activity
  • Append Variable Activity
  • Fail Activity
  • Stored Procedure Activity
  • Set Variable activity
  • Validation Activity
  • Web Activity
  • Wait Activity
  • Script Activity
  • Filter Activity
  • ForEach Activity
  • If Condition Activity
  • Switch Activity
  • Until Activity
  • Notebook Activity

Module 10: Practical Scenarios and Use Cases

  • ADF_PracticeSession1_Blob_To_Blob
  • ADF_PracticeSession2_Blob_To_Azure_SQLDB
  • ADF_PracticeSession3_Dataset_Parameters_Blob_To_Azure_SQLDB
  • ADF_PracticeSession4_Pipeline_Dataset_LinkedService_Parameters
  • ADF_PracticeSession5_FilteringFileFormats_Getmetadata
    Filter_ForEach_Copy_Activity
  • ADF_PracticeSession6_BulkCopy_Tables_Files
  • ADF_PracticeSession7_Container_Parameterization
    Blob_To_Blob_Storage
  • ADF_PracticeSession8_ExecuteCopyActivity_BasedOnFileCount
  • ADF_PracticeSession9_StoredProcedures_Parameters
  • ADF_PracticeSession11_CopyActivity
    CustomSQL_Queries_StoredProcedures
  • ADF_PracticeSession12_Pipeline_Audit_Log
  • ADF_PracticeSession13_Copybehaviour
  • ADF_PracticeSession14_CopyDataTool
  • ADF_PracticeSession15_Custom_Email_Notification
  • ADF_PracticeSession16_AzureKeyVault_Integration
  • ADF_PracticeSession17_Incremental_Load
  • ADF_PracticeSession18_On-Premise_SQLServer_ADLS_Gen2
  • ADF_PracticeSession19_On-Premise_FileSystem_ADLS_Gen2
  • ADF_PracticeSession20_Eventbased_Trigger
  • ADF_PracticeSession21_Scheduled_Trigger
  • ADF_PracticeSession22_TumblingWindow_Trigger
  • ADF_PracticeSession23_Dataflows_Select_Filter
    DerivedColumn_Transformation
  • ADF_PracticeSession24_Dataflows_Select__DerivedColumn
    Aggregator_Sort_Transformation
  • ADF_PracticeSession25_Dataflows_ConditionalSplit_Transformation
  • ADF_PracticeSession26_Dataflows_Join_Transformation
  • ADF_PracticeSession27_Dataflows_Union_Transformation
  • ADF_PracticeSession28_Dataflows_Lookup_Transformation
  • ADF_PracticeSession29_Dataflows_Exists_Transformation
  • ADF_PracticeSession30_Slowly Changing Dimension Type1 (SCD1)
    with HashKey Function
  • ADF_PracticeSession31_Slowly Changing Dimension Type2
  • ADF_PracticeSession32_Copy_JSON_File_To_AzureSQL
  • ADF_PracticeSession33_Copy Data Activity_Excel File Formats
  • ADF_PracticeSession34_Copy Data Activity_Excel File Formats
    Activity_Pipeline Variables
  • ADF_PracticeSession35_Copy Data Activity_XML File Formats

Assignments: Practice Sessions

  • ADF_Assignment1_CopyActivity_Prefix_Wildcard_FilePath_Blob_To_Blob
  • ADF_Assignment2_Add_AdditionalColumns_WhileCopyingData
  • ADF_Assignment3_CSV_To_JSON_Format
  • ADF_Assignment4_Blob_SQLDB_Executepipeline_Activity
  • ADF_Assignment5_SQLDB_BLOB_Overwrite_Append_Mode
  • ADF_Assignment6_Dataflows_Rank_Transformation
  • ADF_Assignment7_Dataflows_Pivot_Transformation
  • ADF_Assignment8_Dataflows_UnPivot_Transformation
  • ADF_Assignment9_Dataflows_SurrogateKey_Transformation
  • ADF_Assignment10_Dataflows_Windows_Transformation
  • ADF_Assignment11_Dataflows_AlterRow_Transformation
  • ADF_Assignment12_Switch Activity-Move and delete data
  • ADF_Assignment13_Until Activity-Parameters & Variables
  • ADF_Assignment14_Remove Duplicate rows using data flows
  • ADF_Assignment15_AWS_S3_Integration
  • ADF_Assignment16_GCP_Integration
  • ADF_Assignment17_Snowflake_Integration
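
The incremental load scenario in the practice sessions is often built around a watermark: the pipeline remembers the latest modification time it has already copied and, on the next run, copies only rows changed after it. The pure-Python sketch below shows that logic on in-memory rows. It is one common way to approach the idea, not necessarily how the session implements it, and the function name, the modified_at column, and the sample rows are assumptions for illustration.

```python
from datetime import datetime


def incremental_load(source_rows, sink_rows, last_watermark):
    """Append rows modified after last_watermark to sink_rows.

    Returns the new watermark: the latest modified_at value that was copied,
    or the old watermark when nothing has changed.
    """
    changed = [row for row in source_rows if row["modified_at"] > last_watermark]
    sink_rows.extend(changed)
    return max((row["modified_at"] for row in changed), default=last_watermark)


# Hypothetical source table with a last-modified column.
source = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 1, 5)},
    {"id": 3, "modified_at": datetime(2024, 1, 9)},
]
sink = []

# The previous run finished on 3 January, so only rows 2 and 3 are new.
watermark = datetime(2024, 1, 3)
watermark = incremental_load(source, sink, watermark)
print([row["id"] for row in sink], watermark.date())
```

In a pipeline, the stored watermark would usually live in a control table, with one activity reading it before the copy and another updating it afterwards.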

Azure Databricks:

Module 1: Introduction to Azure Databricks

  • Introduction to Databricks

Module 2: Databricks Cluster Management

  • Creating and configuring clusters
  • Managing Clusters
    • Displaying clusters
    • Starting a cluster
    • Terminating a cluster
    • Deleting a cluster
    • Cluster Information
    • Cluster logs
    • Cluster access control
  • Types of Clusters
    • All-purpose clusters
    • Job clusters
  • Databricks Pools
    • Databricks without pools
    • Databricks with Pools
  • Cluster Modes
    • Standard
    • High Concurrency
    • Single Node
  • Autoscaling
  • Databricks runtime versions
  • Multiuser Clusters

Module 3: Azure Databricks Integration with Azure Blob Storage

  • Mount Azure Blob Storage to DBFS
  • Access Blob Storage Using a Direct Connection (Account Access Key)

Module 4: Azure Databricks Integration with Azure Data Lake Storage Gen2

  • Mount ADLS Gen2 To DBFS Using OAuth2.0 With Service Principal
  • Access ADLS Gen2 Using a Direct Connection
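
For the OAuth 2.0 mount with a service principal, most of the work is assembling the right configuration keys and the abfss source URI. The sketch below builds both as plain Python values so it can run outside a notebook. The helper names and all sample values (application ID, tenant ID, container, and storage account) are placeholders invented for this example; the configuration keys are the ones commonly shown for the ABFS driver and should be checked against the current Databricks documentation.

```python
def adls_oauth_configs(client_id, client_secret, tenant_id):
    """Spark/Hadoop settings for service principal (client credentials) access."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def abfss_uri(container, storage_account, path=""):
    """Build the abfss:// URI for a container in an ADLS Gen2 account."""
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/{path.lstrip('/')}"


# Placeholder values; in practice the secret would come from a secret scope,
# for example dbutils.secrets.get(scope=..., key=...), never from source code.
configs = adls_oauth_configs(
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret="<secret from a secret scope>",
    tenant_id="11111111-1111-1111-1111-111111111111",
)
source_uri = abfss_uri("raw", "examplestorageacct")

# Inside a Databricks notebook these values would typically be passed on as:
#   dbutils.fs.mount(source=source_uri,
#                    mount_point="/mnt/raw",
#                    extra_configs=configs)
print(source_uri)
```

For the direct connection variant, the same settings are usually applied to the Spark session configuration with the storage account host appended to each key, and the data is then read straight from the abfss URI without a mount.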

Module 5: Databricks Integration with Azure SQL Database

  • Reading and Writing data from Azure SQL Database

Module 6: Databricks Integration with Azure Synapse

  • Reading and Writing Azure Synapse data from Azure Databricks

Module 7: Azure Databricks CSV File Formats

  • Read CSV Files
  • Read TSV Files
  • Read Pipe-Separated Files
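
The three file types in this module differ only in the delimiter character. The sketch below makes that concrete with Python's standard csv module, so it runs without a cluster. The sample data and the read_delimited helper are invented for this example. In a Databricks notebook the same idea is usually expressed through the reader's separator option, along the lines of spark.read.option("header", True).option("sep", "|").csv(path).

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# The same two records in comma-, tab-, and pipe-separated form.
csv_text = "id,name\n1,Asha\n2,Ravi\n"
tsv_text = "id\tname\n1\tAsha\n2\tRavi\n"
pipe_text = "id|name\n1|Asha\n2|Ravi\n"

rows_csv = read_delimited(csv_text)
rows_tsv = read_delimited(tsv_text, delimiter="\t")
rows_pipe = read_delimited(pipe_text, delimiter="|")

print(rows_pipe)
```

All three calls produce the same rows. Note that every value is read as a string here; type inference or an explicit schema is a separate step.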

Azure Synapse Analytics:

Module 1: Introduction To Azure Synapse

  • Technical requirements
  • Introducing the components of Azure Synapse
  • Creating a Synapse workspace
  • Understanding Azure Data Lake
  • Exploring Synapse Studio

Module 2: Using Synapse Pipelines to Orchestrate Your Data

  • Technical requirements
  • Introducing Synapse pipelines
    • Integration runtime
    • Activities
    • Pipelines
    • Triggers
  • Creating linked services
  • Defining source and target
  • Using various activities in Synapse pipelines
  • Scheduling Synapse pipelines
  • Creating pipelines using samples

Train your teams on the theory and enable technical mastery of cloud computing courses essential to the enterprise such as security, compliance, and migration on AWS, Azure, and Google Cloud Platform.
