In Azure Data Factory (ADF), triggers are used to schedule and automate pipeline execution. They allow pipelines to run on specific schedules, in response to events, or when manually invoked. Triggers help orchestrate and manage workflows without the need for manual intervention.
Types of Triggers in ADF
- Schedule Trigger
- Tumbling Window Trigger
- Event-Based Trigger
1. Schedule Trigger
- Definition: Executes a pipeline at specific times and intervals.
- Key Features:
- Allows precise scheduling (e.g., daily at 12:00 AM, every 15 minutes, etc.).
- Supports advanced recurrence patterns like every Monday or the last day of the month.
- Does not retain state between runs.
- Use Cases:
- Running ETL pipelines daily or hourly.
- Periodic data processing.
- Example:
- Run a pipeline every day at midnight to process sales data.
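As a sketch, the midnight example above could be authored in ADF's trigger JSON roughly as follows (the trigger name DailySalesTrigger and pipeline name ProcessSalesPipeline are illustrative, not part of any real factory):

```json
{
  "name": "DailySalesTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2024-01-01T00:00:00Z",
        "timeZone": "UTC",
        "schedule": { "hours": [0], "minutes": [0] }
      }
    },
    "pipelines": [
      {
        "pipelineReference": {
          "type": "PipelineReference",
          "referenceName": "ProcessSalesPipeline"
        }
      }
    ]
  }
}
```

Note that a Schedule Trigger uses a `pipelines` array, so one trigger can start several pipelines on the same recurrence.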
2. Tumbling Window Trigger
- Definition: A time-based trigger that executes pipelines at fixed intervals and retains state information.
- Key Features:
- Works with a fixed-size, non-overlapping time window (e.g., hourly, daily).
- Allows processing data in chunks corresponding to the window.
- Supports backfilling (triggering past time windows).
- Use Cases:
- Processing time-series data or logs.
- Ensuring exactly-once execution for data pipelines.
- Example:
- Process IoT device logs every 30 minutes for the previous half-hour.
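A rough JSON sketch of the 30-minute IoT example (names are illustrative, and the `windowStart`/`windowEnd` parameters are assumed to be defined on the target pipeline so it can query only that window's data):

```json
{
  "name": "IoTLogsWindowTrigger",
  "properties": {
    "type": "TumblingWindowTrigger",
    "typeProperties": {
      "frequency": "Minute",
      "interval": 30,
      "startTime": "2024-01-01T00:00:00Z",
      "delay": "00:05:00",
      "maxConcurrency": 1,
      "retryPolicy": { "count": 3, "intervalInSeconds": 30 }
    },
    "pipeline": {
      "pipelineReference": {
        "type": "PipelineReference",
        "referenceName": "ProcessIoTLogsPipeline"
      },
      "parameters": {
        "windowStart": "@trigger().outputs.windowStartTime",
        "windowEnd": "@trigger().outputs.windowEndTime"
      }
    }
  }
}
```

Unlike a Schedule Trigger, a Tumbling Window Trigger references a single `pipeline`, and the `@trigger().outputs.windowStartTime`/`windowEndTime` system variables expose the window boundaries that give this trigger type its stateful, chunked-processing behavior.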
3. Event-Based Trigger
- Definition: Executes a pipeline in response to events like file creation or deletion in Azure Blob Storage or Azure Data Lake Storage Gen2.
- Key Features:
- Triggers pipelines dynamically based on file events.
- Works in near-real time for event-driven workflows.
- Supports filters to process specific files or folders.
- Use Cases:
- Ingesting and processing new files as they arrive in storage.
- Event-driven processing pipelines.
- Example:
- Trigger a pipeline whenever a new file is uploaded to a specific Blob Storage folder.
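A hedged sketch of the same example as a storage event trigger definition (the trigger and pipeline names are illustrative, and the `scope` placeholders must be replaced with a real storage account resource ID):

```json
{
  "name": "NewSalesFileTrigger",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/input/blobs/sales/",
      "blobPathEndsWith": ".csv",
      "ignoreEmptyBlobs": true,
      "events": ["Microsoft.Storage.BlobCreated"],
      "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
    },
    "pipelines": [
      {
        "pipelineReference": {
          "type": "PipelineReference",
          "referenceName": "IngestSalesFilePipeline"
        }
      }
    ]
  }
}
```

Storage event triggers are backed by Azure Event Grid, which is why they fire in near-real time rather than on a polling schedule.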
Comparison of Trigger Types
| Feature | Schedule Trigger | Tumbling Window Trigger | Event-Based Trigger |
| --- | --- | --- | --- |
| Execution Time | Fixed schedule | Fixed time window | On specific events (e.g., file upload) |
| State Retention | No | Yes | No |
| Supports Backfill | No | Yes | No |
| Use Case | Periodic jobs | Time-series or chunked data | Event-driven workflows |
Creating Triggers in ADF
1. Schedule Trigger
- Steps:
- Navigate to Author > Triggers > New Trigger.
- Choose “Schedule” and configure:
- Start time: When the trigger should start.
- Recurrence: Interval (e.g., every 1 hour, every day at 3 PM).
- Attach it to a pipeline.
2. Tumbling Window Trigger
- Steps:
- Navigate to Author > Triggers > New Trigger.
- Choose “Tumbling Window” and configure:
- Start time: When the window starts.
- Window size: Duration (e.g., 1 hour, 1 day).
- Dependency: Define whether each window depends on other tumbling window triggers, or on the trigger's own previous windows, before it can run.
- Attach it to a pipeline.
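The dependency step above maps to a `dependsOn` entry inside the trigger's `typeProperties`; a sketch of one entry is shown below (UpstreamHourlyTrigger is a hypothetical upstream trigger name, and the one-hour `offset` and `size` are illustrative):

```json
"dependsOn": [
  {
    "type": "TumblingWindowTriggerDependencyReference",
    "referenceTrigger": {
      "referenceName": "UpstreamHourlyTrigger",
      "type": "TriggerReference"
    },
    "offset": "-01:00:00",
    "size": "01:00:00"
  }
]
```

With this in place, a window of the dependent trigger only runs after the matching window of the upstream trigger has succeeded, which is how multi-stage time-series pipelines are chained.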
3. Event-Based Trigger
- Steps:
- Navigate to Author > Triggers > New Trigger.
- Choose “Event” and configure:
- Blob path begins with/ends with: Specify file naming patterns or paths.
- Event type: Choose “Blob created”, “Blob deleted”, or both.
- Attach it to a pipeline.
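When attaching the pipeline, metadata about the blob that raised the event can be passed through as pipeline parameters; a sketch of the `pipelines` section follows (the parameter names `sourceFolder` and `sourceFile` are illustrative and must be declared on the pipeline itself):

```json
"pipelines": [
  {
    "pipelineReference": {
      "type": "PipelineReference",
      "referenceName": "IngestFilePipeline"
    },
    "parameters": {
      "sourceFolder": "@triggerBody().folderPath",
      "sourceFile": "@triggerBody().fileName"
    }
  }
]
```

The `@triggerBody().folderPath` and `@triggerBody().fileName` expressions let the pipeline process exactly the file that arrived, rather than rescanning the whole container.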
Best Practices for Triggers in ADF
- Use Tumbling Window for Time-Sensitive Data:
- Ideal for scenarios requiring incremental processing with a guarantee of no overlapping runs.
- Optimize Event-Based Triggers:
- Use the blob path “begins with” and “ends with” filters to avoid triggering unnecessary pipeline executions.
- Monitor Trigger Runs:
- Use the Monitor tab to track trigger executions and debug issues.
- Combine Trigger Types:
- Use a combination of triggers to handle different workloads (e.g., daily batch processing with Schedule Trigger and real-time file ingestion with Event-Based Trigger).
- Manage Tumbling Window Concurrency:
- Tumbling windows are fixed-size and non-overlapping by design; use the maxConcurrency setting to control how many windows run in parallel and prevent duplicate or contending processing.
Common Scenarios and Examples
Example 1: Daily Data Load
- Use a Schedule Trigger to run a pipeline every day at midnight to copy data from SQL to Azure Blob Storage.
Example 2: Processing IoT Data
- Use a Tumbling Window Trigger to process hourly IoT device logs for near-real-time analytics.
Example 3: Real-Time File Processing
- Use an Event-Based Trigger to ingest and process a file as soon as it’s uploaded to a storage account.