Azure Event Hubs

Streams real-time data from Azure Event Hubs for event ingestion and processing.

Sync Type: Incremental

Overview

The Azure Event Hubs input connector establishes a streaming connection to Azure Event Hubs to ingest real-time event data. It processes events from all partitions concurrently and maintains checkpoint state to ensure reliable data consumption without duplication.

Prerequisites

Before setting up the Azure Event Hubs input, ensure you have:

  1. A Microsoft Azure account with an active subscription
  2. An Azure Event Hubs namespace and Event Hub instance
  3. A registered application in Azure Entra ID with appropriate permissions
  4. Access to the Event Hub's connection details

Required Permissions

The application requires the following permissions:

  • Azure Event Hubs Data Receiver role on the Event Hubs namespace or specific Event Hub
  • Access to the specified consumer group (default: $Default)

Setting up API Access

1. Create an Event Hubs Namespace and Event Hub

  1. Navigate to the Azure Portal
  2. Create an Event Hubs namespace:
    • Go to "Create a resource" → "Integration" → "Event Hubs"
    • Configure the namespace settings (name, subscription, resource group, location)
    • Choose an appropriate pricing tier
  3. Create an Event Hub within the namespace:
    • Navigate to your Event Hubs namespace
    • Click "Event Hubs" → "+ Event Hub"
    • Configure the Event Hub (name, partition count, message retention)

2. Register Application in Azure Entra ID

  1. Open the App Registration page in Azure Entra ID
  2. Select "New Registration"
  3. Add a name for your application
  4. Click "Register"
  5. Save the Application (client) ID and Directory (tenant) ID
  6. Navigate to "Certificates & secrets"
  7. Click "New client secret"
  8. Add a description and set expiration
  9. Save the client secret value immediately (it won't be visible again)

3. Assign Event Hubs Permissions

  1. Navigate to your Event Hubs namespace in the Azure Portal
  2. Click "Access control (IAM)" in the left menu
  3. Select "Add" → "Add role assignment"
  4. Choose the "Azure Event Hubs Data Receiver" role
  5. Click "Next"
  6. Select "User, group, or service principal"
  7. Click "Select members"
  8. Search for your registered application name
  9. Select the application and click "Select"
  10. Click "Review + assign"

Configuration

Settings

Setting | Type | Required | Description
------- | ---- | -------- | -----------
Tenant ID | string | Yes | The Azure Entra ID tenant (directory) ID where your application is registered
Subscription ID | string | Yes | The Azure subscription ID containing your Event Hubs namespace
Event Hub Namespace | string | Yes | The fully qualified namespace URL (e.g., your-namespace.servicebus.windows.net)
Event Hub Name | string | Yes | The name of the specific Event Hub to consume from
Consumer Group | string | Yes | The consumer group name for reading events (default: $Default)
Lookback Duration | integer | Yes | The duration to look back for events in minutes (default: 60 minutes)
Use Synthetic Data | boolean | No | Generate synthetic demo data instead of connecting to the real Event Hub
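
The settings above can be pictured as a small typed record. The sketch below is illustrative only: the class name `EventHubSettings`, its field names, and the `lookback_start` helper are hypothetical and are not the connector's actual schema. It assumes the defaults listed in the table ($Default and 60 minutes) and shows one plausible way to turn the lookback duration into a starting timestamp. How the connector combines the lookback with saved checkpoints is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EventHubSettings:
    """Hypothetical mirror of the settings table; not the connector's real schema."""

    tenant_id: str
    subscription_id: str
    namespace: str                    # e.g. your-namespace.servicebus.windows.net
    event_hub_name: str
    consumer_group: str = "$Default"  # default from the table
    lookback_minutes: int = 60        # default from the table
    use_synthetic_data: bool = False

    def __post_init__(self) -> None:
        # Reject blank required values early instead of failing at connect time.
        for name in ("tenant_id", "subscription_id", "namespace",
                     "event_hub_name", "consumer_group"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if self.lookback_minutes < 0:
            raise ValueError("lookback_minutes must be zero or positive")

    def lookback_start(self, now: datetime) -> datetime:
        """Earliest point in time covered by the lookback window."""
        return now - timedelta(minutes=self.lookback_minutes)


# Usage with placeholder values.
settings = EventHubSettings(
    tenant_id="00000000-0000-0000-0000-000000000000",
    subscription_id="11111111-1111-1111-1111-111111111111",
    namespace="your-namespace.servicebus.windows.net",
    event_hub_name="your-event-hub",
)
window_start = settings.lookback_start(datetime.now(timezone.utc))
```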

Secrets

Setting | Type | Required | Description
------- | ---- | -------- | -----------
Client ID | string | Yes | The application (client) ID from your Azure Entra ID app registration
Client Secret | string | Yes | The client secret value from your Azure Entra ID app registration
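
To illustrate how the tenant ID, client ID, and client secret fit together, the sketch below builds (but does not send) an OAuth 2.0 client-credentials token request against the Microsoft identity platform, using the Event Hubs resource scope. This is an assumption about the general mechanism, not a description of the connector's internals; in practice an SDK credential class would normally handle this exchange. The function name `build_token_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build, without sending, a client-credentials token request for Event Hubs."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests whatever roles were assigned to the application.
        "scope": "https://eventhubs.azure.net/.default",
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Usage with placeholder values; nothing is sent over the network here.
request = build_token_request("your-tenant-id", "your-client-id", "your-client-secret")
```

A token request that fails with these values usually points at the tenant ID, client ID, or secret rather than at Event Hubs itself.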

How It Works

Streaming Process

  1. Connection Initialization: Establishes authenticated connection to Event Hubs using client credentials
  2. Partition Discovery: Automatically discovers all partitions within the specified Event Hub
  3. Checkpoint Loading: Loads previous consumption state from persistent storage
  4. Parallel Processing: Creates separate consumers for each partition to process events concurrently
  5. Event Streaming: Continuously receives events from all partitions
  6. State Management: Updates checkpoints after processing each batch to track progress
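
The steps above can be sketched with a small in-memory simulation. Nothing here talks to Azure: `FAKE_HUB`, `consume_partition`, and `run_once` are hypothetical names, the "hub" is a plain dictionary, and a checkpoint is simply the index of the next unread event. The sketch only illustrates the shape of the process: one worker per partition, resuming from a checkpoint and advancing it after each batch.

```python
import threading
from typing import Dict, List, Tuple

# Hypothetical stand-in for an Event Hub: partition id -> ordered events.
FAKE_HUB: Dict[str, List[str]] = {
    "0": ["a0", "a1", "a2"],
    "1": ["b0", "b1"],
}


def consume_partition(
    partition_id: str,
    events: List[str],
    checkpoints: Dict[str, int],
    sink: List[Tuple[str, str]],
    lock: threading.Lock,
    batch_size: int = 2,
) -> None:
    """Read one partition from its checkpoint onward, one batch at a time."""
    with lock:
        position = checkpoints.get(partition_id, 0)  # step 3: load checkpoint
    while position < len(events):
        batch = events[position:position + batch_size]  # step 5: receive a batch
        with lock:
            sink.extend((partition_id, event) for event in batch)
            position += len(batch)
            checkpoints[partition_id] = position  # step 6: update checkpoint


def run_once(hub: Dict[str, List[str]], checkpoints: Dict[str, int]) -> List[Tuple[str, str]]:
    """Drain every partition concurrently and return what was processed."""
    sink: List[Tuple[str, str]] = []
    lock = threading.Lock()
    workers = [  # steps 2 and 4: one consumer per discovered partition
        threading.Thread(target=consume_partition,
                         args=(pid, events, checkpoints, sink, lock))
        for pid, events in hub.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sink


checkpoints: Dict[str, int] = {}
first_pass = run_once(FAKE_HUB, checkpoints)   # processes all five events
second_pass = run_once(FAKE_HUB, checkpoints)  # nothing new, so nothing is re-read
```

Events from different partitions may interleave in any order, but order within a partition is preserved because each partition has exactly one worker.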

Checkpoint Management

  • Per-Partition Tracking: Maintains separate checkpoints for each Event Hub partition
  • Automatic Recovery: Resumes from last processed event after restarts
  • First-Time Setup: Starts from earliest available events when no checkpoint exists
  • Persistent Storage: Saves checkpoint state to prevent data loss during interruptions
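
As a rough illustration of per-partition checkpoints that survive restarts, the sketch below keeps one sequence number per partition in a JSON file. The class `FileCheckpointStore` and the file layout are hypothetical; the connector's actual storage backend is not described here.

```python
import json
import os
import tempfile
from typing import Dict, Optional


class FileCheckpointStore:
    """Hypothetical per-partition checkpoint store backed by one JSON file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._offsets: Dict[str, int] = {}
        if os.path.exists(path):
            # Automatic recovery: reload whatever was saved before the restart.
            with open(path, "r", encoding="utf-8") as handle:
                self._offsets = json.load(handle)

    def get(self, partition_id: str) -> Optional[int]:
        """Return the last saved sequence number, or None on first-time setup."""
        return self._offsets.get(partition_id)

    def save(self, partition_id: str, sequence_number: int) -> None:
        """Record progress for one partition and persist the full state."""
        self._offsets[partition_id] = sequence_number
        # Write to a temporary file and rename it into place, so an
        # interrupted write does not leave a truncated checkpoint file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(self._offsets, handle)
        os.replace(tmp_path, self.path)


# Usage: a second store opened on the same file sees the saved progress.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoints.json")
store = FileCheckpointStore(checkpoint_path)
store.save("0", 41)
resumed = FileCheckpointStore(checkpoint_path)
```

A caller that gets `None` from `get` has no checkpoint for that partition and would fall back to its first-time starting position.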

Performance Considerations

  • Partition Count: Higher partition counts enable greater parallelism and throughput
  • Consumer Group: Use dedicated consumer groups to avoid conflicts with other applications
  • Batch Size: Processes up to 100 events per batch for optimal performance
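
The batch-size limit can be pictured as a simple chunking step. The sketch below is a generic helper, not the connector's code: `batched` and `MAX_BATCH_SIZE` are hypothetical names, and the only value taken from this page is the limit of 100 events per batch.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 100  # limit stated in the text above


def batched(events: Iterable[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` events, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return  # source exhausted
        yield batch


# Usage: 250 events become batches of 100, 100, and 50.
batch_sizes = [len(batch) for batch in batched(range(250))]
```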

Troubleshooting

Common Issues

  1. Authentication Errors:
    • Verify the client ID and client secret are correct and the secret has not expired
    • Ensure the application has the "Azure Event Hubs Data Receiver" role
    • Check that the tenant ID matches the directory where the application is registered
  2. Connection Failures:
    • Verify the Event Hub namespace URL format
    • Ensure the Event Hub name exists in the namespace
    • Check network connectivity and firewall rules
  3. No Events Received:
    • Verify events are being sent to the Event Hub
    • Check the consumer group configuration
    • Ensure the partitions have available events
  4. Ownership Lost Errors:
    • Another consumer in the same consumer group may be competing for the same partitions
    • Use a dedicated consumer group for Monad
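
For the namespace format check listed under Connection Failures, a small helper like the one below can reduce common input variants to the bare host form shown in the settings (your-namespace.servicebus.windows.net). The function `normalize_namespace` is hypothetical, and whether the connector itself tolerates a scheme, port, or path in this setting is not stated here, so treat this as a pre-flight check only.

```python
from urllib.parse import urlsplit


def normalize_namespace(value: str) -> str:
    """Reduce a namespace value to a bare, lowercase, fully qualified host."""
    text = value.strip()
    if "://" not in text:
        # Prefix "//" so urlsplit parses the input as a network location.
        text = "//" + text
    host = urlsplit(text).hostname  # drops scheme, port, and path
    if not host or "." not in host:
        raise ValueError(f"not a fully qualified namespace: {value!r}")
    return host


# Usage: a pasted endpoint with a scheme and trailing slash is reduced to the host.
namespace = normalize_namespace("sb://your-namespace.servicebus.windows.net/")
```

A bare name without a domain suffix, such as your-namespace, is rejected because it is not fully qualified.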