Offer

Azure Integration Pipelines Deep-Dive

Instructor
Paul Andrew, Solution Architect & Data Platform MVP
Start
6 December 2021 09:00
End
7 December 2021 16:30
Address
Østergade 10, 8000 Aarhus

9,000 DKK

Details

In this course we’ll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources.

Within a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets, with the goal being actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, Paul Andrew will show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, data mesh principles, and the latest metadata-driven frameworks. We will take a deep dive into the services, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise-grade deployments. All this and more comes in a complete set of 12 modules, based on real-world experience, that take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns.

If that’s not enough learning for you, a set of hands-on labs will also be made available that you can work through at your own pace. You will leave this course with new skills, ideas, and a much deeper understanding of the resources for your future data platform projects.

Prerequisites

If you’ve never used Azure Data Integration Pipelines before in either Azure Data Factory or Azure Synapse Analytics, but you’re a fast learner, that’s OK! However, please watch Paul’s complete one-hour introduction session, recorded as part of a recent community MeetUp: https://mrpaulandrew.com/2021/08/23/an-introduction-to-azure-data-integration-pipelines/

Agenda

The following offers an insight into the complete agenda and module breakdown for this course.

Module 1: Pipeline Fundamentals

  • The History of Azure Orchestration
  • Synapse Analytics vs Data Factory
  • Integration Components
  • Common Activities
  • Execution Dependencies

Module 2: Integration Runtime Design Patterns

  • Compute Types
    • Azure
    • Hosted
    • SSIS
  • Patterns & Configuration

Module 3: Data Transformation

  • Data Flows
  • Power Query Injection
  • Spark Configuration
  • Use Cases

Module 4: Dynamic Pipelines

  • Expressions & Interpolation
  • Dynamic Content Chains
  • Metadata Driven
  • Orchestration Framework – procfwk.com

Module 5: Execution Parallelism

  • Control Flow Scale Out
  • Concurrency Limitations
  • Internal vs External Activities
  • Decoupling Pipeline Workloads

Module 6: Pipeline Extensibility

  • Azure Batch Service
    • Tasks
    • Compute Pools
    • Scaling
  • Pipeline Custom Activities

Module 7: VNet Integration

  • Private Endpoints
  • Managed VNets
  • Firewall Bypass

Module 8: Security

  • Managed Identities vs Service Principals
  • Azure Key Vault Backing
  • Pipeline Access & Permissions

Module 9: Monitoring & Alerting

  • Portal Monitoring
  • Log Analytics & Kusto Queries
  • Operational Dashboards
  • Advanced Alerting

Module 10: CI/CD

  • Source Control vs Developer UI
  • Basic ARM Template Deployments
  • Advanced Deployment Patterns

Module 11: Solution Testing

  • Development Time Validation
  • Test Coverage
  • NUnit Tests

Module 12: Final Thoughts

  • Running Costs
  • Conclusions
  • Best Practices