Azure Event Hubs
Streams real-time data from Azure Event Hubs for event ingestion and processing.
Sync Type: Incremental
Overview
The Azure Event Hubs input connector establishes a streaming connection to Azure Event Hubs to ingest real-time event data. It processes events from all partitions concurrently and maintains checkpoint state to ensure reliable data consumption without duplication.
Prerequisites
Before setting up the Azure Event Hubs input, ensure you have:
- A Microsoft Azure account with an active subscription
- An Azure Event Hubs namespace and Event Hub instance
- A registered application in Azure Entra ID with appropriate permissions
- Access to the Event Hub's connection details
Required Permissions
The application requires the following permissions:
- Azure Event Hubs Data Receiver role on the Event Hubs namespace or specific Event Hub
- Access to the specified consumer group (default: $Default)
Setting up API Access
1. Create an Event Hubs Namespace and Event Hub
- Navigate to the Azure Portal
- Create an Event Hubs namespace:
  - Go to "Create a resource" → "Integration" → "Event Hubs"
  - Configure the namespace settings (name, subscription, resource group, location)
  - Choose an appropriate pricing tier
- Create an Event Hub within the namespace:
  - Navigate to your Event Hubs namespace
  - Click "Event Hubs" → "+ Event Hub"
  - Configure the Event Hub (name, partition count, message retention)
2. Register Application in Azure Entra ID
- Open the App Registration page in Azure Entra ID
- Select "New Registration"
- Add a name for your application
- Click "Register"
- Save the Application (client) ID and Directory (tenant) ID
- Navigate to "Certificates & secrets"
- Click "New client secret"
- Add a description and set expiration
- Save the client secret value immediately (it won't be visible again)
3. Assign Event Hubs Permissions
- Navigate to your Event Hubs namespace in the Azure Portal
- Click "Access control (IAM)" in the left menu
- Select "Add" → "Add role assignment"
- Choose the "Azure Event Hubs Data Receiver" role
- Click "Next"
- Select "User, group, or service principal"
- Click "Select members"
- Search for your registered application name
- Select the application and click "Select"
- Click "Review + assign"
Configuration
Settings
| Setting | Type | Required | Description |
|---|---|---|---|
| Tenant ID | string | Yes | The Azure Entra ID tenant (directory) ID where your application is registered |
| Subscription ID | string | Yes | The Azure subscription ID containing your Event Hubs namespace |
| Event Hub Namespace | string | Yes | The fully qualified namespace host name (e.g., your-namespace.servicebus.windows.net) |
| Event Hub Name | string | Yes | The name of the specific Event Hub to consume from |
| Consumer Group | string | Yes | The consumer group name for reading events (default: $Default) |
| Lookback Duration | integer | Yes | How far back to look for events, in minutes (default: 60) |
| Use Synthetic Data | boolean | No | Generate synthetic demo data instead of connecting to the real Event Hub |
Secrets
| Setting | Type | Required | Description |
|---|---|---|---|
| Client ID | string | Yes | The application (client) ID from your Azure Entra ID app registration |
| Client Secret | string | Yes | The client secret value from your Azure Entra ID app registration |
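
As an illustration of how the settings and secrets above fit together, here is a minimal Python sketch that models them as one object and checks the required fields. The class name, field names, and validation rules are hypothetical and chosen for this example; they are not the connector's actual schema. Only the defaults ($Default and 60 minutes) come from the tables above.

```python
from dataclasses import dataclass

# Hypothetical field names; only the defaults are taken from the tables above.
REQUIRED_TEXT_FIELDS = (
    "tenant_id",
    "subscription_id",
    "event_hub_namespace",
    "event_hub_name",
    "client_id",
    "client_secret",
    "consumer_group",
)


@dataclass(frozen=True)
class EventHubInputConfig:
    tenant_id: str
    subscription_id: str
    event_hub_namespace: str
    event_hub_name: str
    client_id: str
    client_secret: str
    consumer_group: str = "$Default"
    lookback_minutes: int = 60
    use_synthetic_data: bool = False

    def validate(self):
        """Return a list of problems; an empty list means the config looks usable."""
        problems = []
        for name in REQUIRED_TEXT_FIELDS:
            if not getattr(self, name).strip():
                problems.append(f"{name} is required")
        # Assumption for this sketch: a negative lookback makes no sense.
        if self.lookback_minutes < 0:
            problems.append("lookback_minutes must not be negative")
        return problems
```

Calling `validate()` before connecting gives one place to report every missing value at once, rather than failing on the first bad field.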
How It Works
Streaming Process
- Connection Initialization: Establishes authenticated connection to Event Hubs using client credentials
- Partition Discovery: Automatically discovers all partitions within the specified Event Hub
- Checkpoint Loading: Loads previous consumption state from persistent storage
- Parallel Processing: Creates separate consumers for each partition to process events concurrently
- Event Streaming: Continuously receives events from all partitions
- State Management: Updates checkpoints after processing each batch to track progress
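
The steps above can be approximated with the public Azure SDK for Python (`azure-eventhub` and `azure-identity`). This is a sketch under stated assumptions, not the connector's own implementation: the `run` and `on_event_batch` function names and the `processed` list are illustrative, and no `checkpoint_store` is passed to the client, so `update_checkpoint` does not persist anything here. A real deployment would supply a checkpoint store.

```python
MAX_BATCH_SIZE = 100  # batch limit stated in this document

processed = []  # stand-in sink, for illustration only


def on_event_batch(partition_context, events):
    """Handle one batch from one partition, then checkpoint it."""
    if not events:
        return  # empty batches are possible when max_wait_time is set
    for event in events:
        processed.append((partition_context.partition_id, event.body_as_str()))
    # Record progress only after the whole batch has been handled.
    partition_context.update_checkpoint(events[-1])


def run(tenant_id, client_id, client_secret, namespace, eventhub_name,
        consumer_group="$Default"):
    # Imported here so this module loads even where the SDK is not installed.
    from azure.identity import ClientSecretCredential
    from azure.eventhub import EventHubConsumerClient

    credential = ClientSecretCredential(tenant_id, client_id, client_secret)
    client = EventHubConsumerClient(
        fully_qualified_namespace=namespace,
        eventhub_name=eventhub_name,
        consumer_group=consumer_group,
        credential=credential,
    )
    with client:
        # With no partition_id given, the client reads from every partition.
        # "-1" means "start of the stream" when no checkpoint exists.
        client.receive_batch(
            on_event_batch=on_event_batch,
            max_batch_size=MAX_BATCH_SIZE,
            starting_position="-1",
        )
```

Checkpointing after the batch, not before it, means a crash mid-batch causes that batch to be re-read on restart instead of being skipped.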
Checkpoint Management
- Per-Partition Tracking: Maintains separate checkpoints for each Event Hub partition
- Automatic Recovery: Resumes from last processed event after restarts
- First-Time Setup: Starts from earliest available events when no checkpoint exists
- Persistent Storage: Saves checkpoint state to prevent data loss during interruptions
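
To make the per-partition tracking concrete, the following Python sketch keeps one sequence number per partition in a JSON file. The storage format and the `FileCheckpointStore` name are assumptions made for this example; the document does not say where or how the connector stores its checkpoints.

```python
import json
import os
import tempfile


class FileCheckpointStore:
    """Keeps the last processed sequence number for each partition in a JSON file."""

    def __init__(self, path):
        self.path = path
        self._state = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self._state = json.load(f)

    def get(self, partition_id):
        """Return the saved sequence number, or None when no checkpoint exists."""
        return self._state.get(partition_id)

    def update(self, partition_id, sequence_number):
        """Save progress for one partition without touching the others."""
        self._state[partition_id] = sequence_number
        # Write to a temporary file and rename it into place, so an
        # interruption mid-write cannot leave a truncated checkpoint file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self._state, f)
        os.replace(tmp_path, self.path)
```

A `None` result from `get` corresponds to the first-time case, where a consumer would start from the earliest available events; any other value is the point to resume from after a restart.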
Performance Considerations
- Partition Count: Higher partition counts enable greater parallelism and throughput
- Consumer Group: Use dedicated consumer groups to avoid conflicts with other applications
- Batch Size: Processes up to 100 events per batch for optimal performance
Troubleshooting
Common Issues
- Authentication Errors:
  - Verify client ID and secret are correct
  - Ensure application has "Azure Event Hubs Data Receiver" role
  - Check tenant ID matches your Azure subscription
- Connection Failures:
  - Verify Event Hub namespace URL format
  - Ensure Event Hub name exists in the namespace
  - Check network connectivity and firewall rules
- No Events Received:
  - Verify events are being sent to the Event Hub
  - Check consumer group configuration
  - Ensure partition has available events
- Ownership Lost Errors:
  - Another consumer in the same group may be competing
  - Use dedicated consumer groups for Monad
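
For connection failures, a quick local check can rule out namespace formatting and firewall problems before looking further. The Python sketch below uses only the standard library; the function names are illustrative. It assumes the connection uses AMQP over TLS on port 5671, the standard Event Hubs port; if the connector uses AMQP over WebSockets instead, port 443 would be the one to test. Which transport the connector uses is not stated in this document.

```python
import socket
from urllib.parse import urlparse


def normalize_namespace(value):
    """Reduce user input to a bare host name such as ns.servicebus.windows.net."""
    value = value.strip()
    if "://" in value:
        # Drop a scheme such as sb:// or https:// if one was pasted in.
        value = urlparse(value).hostname or ""
    # Drop any path, port, or trailing dot that remains.
    return value.split("/")[0].split(":")[0].rstrip(".").lower()


def can_reach(host, port=5671, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach(normalize_namespace("sb://your-namespace.servicebus.windows.net/"))` returning `False` points to DNS, network, or firewall issues, not to credentials.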