
SQS S3 CloudTrail

Ingests AWS CloudTrail log files from S3 via Amazon Simple Queue Service (SQS).

Details

The Amazon SQS S3 CloudTrail connector is a specialized variant of the SQS S3 connector for ingesting AWS CloudTrail log files delivered to S3. It handles the CloudTrail-specific file structure (a top-level Records array containing many events per file) so events are delivered correctly through the data pipeline.

Use this connector when ingesting native AWS CloudTrail S3 deliveries. Use the standard SQS S3 connector for all other S3-based data sources.

How It Works

  1. S3 Event Generation: When CloudTrail delivers a log file to the S3 bucket, S3 generates an object-created event notification
  2. SQS Message Receipt: The notification is sent to the configured SQS queue as a JSON message
  3. Event Processing: The connector polls the SQS queue, receives S3 event messages, and parses them
  4. File Download: For each S3 event, the connector downloads the referenced object from S3
  5. Data Processing: Downloaded files are decompressed (if needed) and parsed according to the format settings
  6. Record Extraction: Individual records are extracted and sent to the data pipeline
  7. Cleanup: Successfully processed SQS messages are deleted from the queue
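Steps 3–4 above can be sketched in Python: parsing an S3 event notification message and extracting the bucket and object key to download. This is a minimal illustration, not the connector's implementation; the bucket and key names are hypothetical.

```python
import json
from urllib.parse import unquote_plus

def parse_s3_event(message_body: str) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification message."""
    event = json.loads(message_body)
    objects = []
    for record in event.get("Records", []):
        # Only S3 object-created notifications carry an "s3" section.
        if record.get("eventSource") != "aws:s3":
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects

# Example notification body, shaped like what S3 delivers to the queue
# (bucket and key are hypothetical):
body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "my-cloudtrail-logs-bucket"},
            "object": {"key": "AWSLogs/123456789012/CloudTrail/us-east-1/file.json.gz"},
        },
    }]
})
objects = parse_s3_event(body)
```

Messages whose records lack `eventSource: aws:s3` yield no objects, which mirrors the event filtering described under Important Notes below.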

Prerequisites

  • An AWS account with existing S3 buckets and SQS queue, or permissions to create them
  • AWS CloudTrail trail configured to deliver log files to the S3 bucket
  • An IAM role that the platform can assume, with permissions to access both SQS and S3, or static AWS credentials with equivalent permissions
  • S3 bucket configured to send event notifications to the SQS queue
  • Network connectivity between the platform and AWS

Setup Instructions

Step 1: Create IAM Policy

Create an IAM policy that grants the connector access to both S3 and SQS resources:

The connector authenticates either by assuming an IAM role or with static AWS credentials. In either case, attach a policy such as the following to the role or user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "sqs:ChangeMessageVisibility",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl",
        "sqs:ReceiveMessage"
      ],
      "Resource": "*"
    }
  ]
}
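The policy above uses "Resource": "*" for brevity. In production you would typically scope each statement to the specific bucket and queue; here is a sketch of a tighter policy built in Python (the bucket name and queue ARN are placeholders):

```python
import json

BUCKET = "my-cloudtrail-logs-bucket"  # placeholder bucket name
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:cloudtrail-events-queue"  # placeholder ARN

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Read access to the CloudTrail bucket and the objects in it.
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
        },
        {   # Consume messages from the notification queue only.
            "Effect": "Allow",
            "Action": [
                "sqs:ChangeMessageVisibility",
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ReceiveMessage",
            ],
            "Resource": QUEUE_ARN,
        },
    ],
}
policy_json = json.dumps(policy, indent=2)
```

Note that `s3:ListBucket` applies to the bucket ARN itself while `s3:GetObject` applies to the `/*` object ARN, which is why both resource forms appear in the first statement.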

Step 2: Create S3 Bucket (if needed)

Create an S3 bucket where CloudTrail will deliver log files (skip if you already have one):

  1. Sign in to the AWS Management Console at https://console.aws.amazon.com/s3.
  2. Click the AWS region drop-down list next to the account name in the upper right and select the desired region.
  3. Under General Purpose Buckets, click the Create Bucket button.
  4. Under General Configuration, enter a name for the bucket (e.g., my-cloudtrail-logs-bucket).
  5. Under Object Ownership, ensure ACLs Disabled is selected.
  6. Leave the remaining default selections unchanged and click the Create Bucket button.
  7. Note the bucket name and ARN for later use.

Step 3: Create SQS Queue

Set up an SQS queue to receive S3 bucket event notifications:

  1. Sign in to the AWS Management Console at https://console.aws.amazon.com/sqs.
  2. Under Get Started, click the Create Queue button.
  3. Under Details, enter a name for the queue (e.g., cloudtrail-events-queue).
  4. Under Configuration:
    • Visibility Timeout: 600 seconds (10 minutes)
    • Message Retention Period: 7 days
    • Receive Message Wait Time: 20 seconds (enables long polling)
  5. Under Access Policy > Choose Method, select the Advanced radio button.
  6. Delete the entire policy JSON and copy/paste the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": [
        "SQS:SendMessage"
      ],
      "Resource": "arn:aws:sqs:{AWS_REGION}:{AWS_ACCOUNT_ID}:{QUEUE_NAME}",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:*:*:{BUCKET_NAME}"
        },
        "StringEquals": {
          "aws:SourceAccount": "{AWS_ACCOUNT_ID}"
        }
      }
    }
  ]
}
  7. Replace the placeholders:
    • {AWS_REGION} with the AWS region (e.g., us-east-1)
    • {AWS_ACCOUNT_ID} with the 12-digit AWS account ID
    • {QUEUE_NAME} with the queue name
    • {BUCKET_NAME} with the S3 bucket name
  8. Click the Create Queue button.
  9. Note the queue URL and ARN for later use.
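The placeholder substitution can be scripted to avoid copy/paste mistakes. A small sketch, using example values for the region, account, and resource names:

```python
# The two placeholder-bearing strings from the access policy above.
placeholders = {
    "Resource": "arn:aws:sqs:{AWS_REGION}:{AWS_ACCOUNT_ID}:{QUEUE_NAME}",
    "SourceArn": "arn:aws:s3:*:*:{BUCKET_NAME}",
}

values = {  # example values; substitute your own
    "AWS_REGION": "us-east-1",
    "AWS_ACCOUNT_ID": "123456789012",
    "QUEUE_NAME": "cloudtrail-events-queue",
    "BUCKET_NAME": "my-cloudtrail-logs-bucket",
}

# str.format fills every {NAME} placeholder; the literal "*" wildcards pass through.
filled = {key: template.format(**values) for key, template in placeholders.items()}
```

After substitution, no `{...}` placeholders should remain anywhere in the policy.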

Step 4: Configure S3 Event Notifications

Configure the S3 bucket to send event notifications to the SQS queue:

  1. Sign in to the AWS Management Console at https://console.aws.amazon.com/s3.
  2. Under General Purpose Buckets, click on the source bucket.
  3. Click the Properties tab at the top.
  4. Locate the Event Notifications section and click Create Event Notification.
  5. Configure the event notification:
    • Event Name: Enter a descriptive name (e.g., cloudtrail-notification)
    • Event Types: Select the checkbox for All object create events
    • Prefix/Suffix: Optionally filter by object key prefix or suffix. CloudTrail delivers to AWSLogs/<account-id>/CloudTrail/<region>/ — set the prefix to AWSLogs/ (or narrower) if the bucket is shared with non-CloudTrail uploads.
    • Destination: Select the radio button for SQS queue
    • SQS Queue: Choose the queue from the dropdown or enter the queue ARN
  6. Click the Save Changes button.
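If you prefer to script Step 4, the same notification can be applied with boto3's `put_bucket_notification_configuration`. A sketch of the configuration shape it expects (the queue ARN is an example; the prefix matches CloudTrail's delivery path):

```python
# Shape accepted by:
#   s3_client.put_bucket_notification_configuration(
#       Bucket="my-cloudtrail-logs-bucket",
#       NotificationConfiguration=notification_config,
#   )
notification_config = {
    "QueueConfigurations": [
        {
            "Id": "cloudtrail-notification",
            "QueueArn": "arn:aws:sqs:us-east-1:123456789012:cloudtrail-events-queue",  # example ARN
            "Events": ["s3:ObjectCreated:*"],  # all object create events
            "Filter": {
                "Key": {
                    # Restrict to CloudTrail's delivery prefix when the bucket is shared.
                    "FilterRules": [{"Name": "prefix", "Value": "AWSLogs/"}]
                }
            },
        }
    ]
}
```

The call replaces the bucket's entire notification configuration, so include any existing configurations in the same payload.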

Step 5: Verify Event Notifications

AWS sends a test notification to the queue when the event notification is created. To verify:

  1. Navigate to the SQS queue in the AWS Console.
  2. Under Details, locate Messages available, which should have a value of 1.
  3. This confirms that S3 can successfully send notifications to the queue.

Note: The test message is an s3:TestEvent payload, not a real S3 event. The connector ignores it once polling starts.
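The test message is easy to recognize because its payload carries an Event field of s3:TestEvent rather than a Records array. One way to check, sketched in Python (the payload shown is an example):

```python
import json

def is_test_event(message_body: str) -> bool:
    """True for the s3:TestEvent payload S3 sends when notifications are first configured."""
    try:
        payload = json.loads(message_body)
    except json.JSONDecodeError:
        return False
    return payload.get("Event") == "s3:TestEvent"

# Example payloads: the initial test message vs. a real S3 event notification.
test_body = json.dumps({"Service": "Amazon S3", "Event": "s3:TestEvent",
                        "Bucket": "my-cloudtrail-logs-bucket"})
real_body = json.dumps({"Records": [{"eventSource": "aws:s3"}]})
```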

Step 6: Test the Configuration

Before configuring the connector in the platform, verify that the AWS setup is working:

  1. Wait for the next CloudTrail delivery (typically within 5–15 minutes), or push a CloudTrail-shaped log file to the bucket manually for faster feedback.
  2. Check the SQS queue to confirm a message was received:
    • Navigate to the SQS queue in the AWS Console
    • Under Details, verify Messages available has increased
  3. Optionally, use the IAM Policy Simulator to confirm that the role or user can perform the required S3 and SQS actions.
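For the manual test in step 1, a CloudTrail-shaped file is just a gzipped JSON object with a top-level Records array. A sketch that writes one locally (the event fields and upload path are illustrative):

```python
import gzip
import json

# Minimal CloudTrail-shaped payload: a top-level Records array of events.
sample = {
    "Records": [
        {
            "eventVersion": "1.08",
            "eventSource": "sts.amazonaws.com",
            "eventName": "GetCallerIdentity",
            "awsRegion": "us-east-1",
        }
    ]
}

path = "sample-cloudtrail.json.gz"
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump(sample, f)

# Then upload it to the bucket, e.g. with the AWS CLI (path is illustrative):
#   aws s3 cp sample-cloudtrail.json.gz \
#       s3://my-cloudtrail-logs-bucket/AWSLogs/123456789012/CloudTrail/us-east-1/
```

The upload triggers an object-created notification, so a new message should appear on the queue within seconds.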

Settings

For native AWS CloudTrail S3 deliveries, the recommended configuration is Format=json, Compression=gzip, and an empty Record Location.

  • Queue URL (string, required): The URL of the SQS queue that receives S3 event notifications (e.g., https://sqs.us-east-1.amazonaws.com/123456789012/my-queue)
  • Role ARN (string, optional): The ARN of the IAM role to assume for accessing SQS and S3 (e.g., arn:aws:iam::123456789012:role/s3-sqs-connector-role). Required when not using static credentials. See Authentication Methods.
  • Region (string, required): The AWS region where the SQS queue and S3 buckets are located (e.g., us-east-1, us-west-2)
  • Format (string, required): File format for processing S3 objects. Supported values: json, csv, wsv. For native CloudTrail files, use json.
  • Compression (string, required): Compression format of S3 objects. Supported values: gzip, auto, none. For native CloudTrail files delivered by AWS, use gzip.
  • Record Location (string, optional): Location of the record in the object using dot notation (e.g., data.records). Applies only to JSON objects. Leave empty to emit the entire record. If the location path doesn't exist on a given record, that record is dropped. For native CloudTrail files, leave empty.
  • With Metadata (boolean, optional): If enabled, adds a _monad_metadata field with bucket information to each emitted message (default: false)
  • Uses SNS (boolean, optional): If enabled, expects SQS messages to be SNS envelopes wrapping S3 event payloads. Enable only if bucket notifications pass through SNS before reaching SQS (default: false)

Files are automatically decompressed before processing based on the compression setting in the connector configuration.

Important Notes

  • Event Filtering: Only messages with eventSource of aws:s3 are processed; other messages in the queue are ignored
  • Order Preservation: The order of events within the Records array is preserved
  • File Processing: Each S3 object is downloaded, decompressed, and parsed according to the format settings
  • Record Extraction: Individual records are extracted from files and sent downstream
  • Error Handling: If file processing fails, the SQS message remains in the queue and will be retried
  • Message Deletion: SQS messages are deleted after successful file processing. Messages that aren't S3 event notifications are detected via their metadata and deleted from the queue without further processing.
  • Output Shape: Each emitted message contains a Records array. To produce one message per CloudTrail event, add a downstream jq transform with the query .Records[].
  • Metadata Preservation: If With Metadata is enabled, the _monad_metadata field (containing bucket information) is added at the top level of each emitted message
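The effect of the jq transform mentioned above (.Records[]) can be mimicked for local inspection. A sketch that flattens one downloaded file into per-event messages (the file contents are illustrative):

```python
import gzip
import json

def explode_records(path: str) -> list[dict]:
    """Return one dict per CloudTrail event -- the effect of the jq query .Records[]."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f).get("Records", [])

# Illustrative gzipped file with two events, shaped like a CloudTrail delivery.
events = [{"eventName": "CreateBucket"}, {"eventName": "DeleteBucket"}]
with gzip.open("demo.json.gz", "wt", encoding="utf-8") as f:
    json.dump({"Records": events}, f)

per_event = explode_records("demo.json.gz")  # one message per CloudTrail event
```

As noted above, the order of events within the Records array is preserved, so the flattened messages come out in delivery order.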