SQS S3 CloudTrail
Ingests AWS CloudTrail log files from S3 via Amazon Simple Queue Service (SQS).
Details
The Amazon SQS S3 CloudTrail connector is a specialized variant of the SQS S3 connector for ingesting AWS CloudTrail log files delivered to S3. It handles the CloudTrail-specific file structure (a top-level Records array containing many events per file) so events are delivered correctly through the data pipeline.
Use this connector when ingesting native AWS CloudTrail S3 deliveries. Use the standard SQS S3 connector for all other S3-based data sources.
How It Works
- S3 Event Generation: When CloudTrail delivers a log file to the S3 bucket, S3 generates an object-created event notification
- SQS Message Receipt: The notification is sent to the configured SQS queue as a JSON message
- Event Processing: The connector polls the SQS queue, receives S3 event messages, and parses them
- File Download: For each S3 event, the connector downloads the referenced object from S3
- Data Processing: Downloaded files are decompressed (if needed) and parsed according to the format settings
- Record Extraction: Individual records are extracted and sent to the data pipeline
- Cleanup: Successfully processed SQS messages are deleted from the queue
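The Event Processing and File Download steps can be sketched as follows. This is a minimal illustration, not the connector's actual code: the function name `extract_s3_objects` is hypothetical, and the field names follow the publicly documented S3 event message structure.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs referenced by one SQS message body.

    Hypothetical helper: shows how an S3 notification can be parsed,
    not how the connector itself is implemented.
    """
    payload = json.loads(sqs_body)
    objects = []
    # An s3:TestEvent message has no "Records" list, so it yields nothing.
    for record in payload.get("Records", []):
        if record.get("eventSource") != "aws:s3":
            continue  # not an S3 notification record
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # only object-created events point at new log files
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Each returned pair identifies one object to download. A real poller would delete the SQS message only after every referenced object has been processed.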
Prerequisites
- An AWS account with an existing S3 bucket and SQS queue, or permissions to create them
- AWS CloudTrail trail configured to deliver log files to the S3 bucket
- An IAM role that the platform can assume, or static AWS credentials, with permissions to access both SQS and S3
- S3 bucket configured to send event notifications to the SQS queue
- Network connectivity between the platform and AWS
Setup Instructions
Step 1: Create IAM Policy
Create an IAM policy that grants the connector access to both S3 and SQS resources:
- IAM Role Assumption / Static Credentials
- Example permission to attach to the role/user:
Code
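As an illustration only, a least-privilege policy for this workflow might look like the sketch below. The action list is an assumption inferred from the steps the connector performs (polling the queue, deleting messages, downloading objects). It is not the platform's published policy, so check the official permissions before use. The helper name `build_connector_policy` is hypothetical.

```python
import json


def build_connector_policy(queue_arn: str, bucket_name: str) -> dict:
    """Build an IAM policy document (assumed action set, see note above)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadNotificationQueue",
                "Effect": "Allow",
                # Assumption: receive, delete, and inspect the queue.
                "Action": [
                    "sqs:ReceiveMessage",
                    "sqs:DeleteMessage",
                    "sqs:GetQueueAttributes",
                ],
                "Resource": queue_arn,
            },
            {
                "Sid": "ReadCloudTrailObjects",
                "Effect": "Allow",
                # Assumption: read-only access to the log objects.
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    policy = build_connector_policy(
        "arn:aws:sqs:us-east-1:123456789012:cloudtrail-events-queue",
        "my-cloudtrail-logs-bucket",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor. If the bucket uses SSE-KMS encryption, the role would also need decrypt permission on the key.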
Step 2: Create S3 Bucket (if needed)
Create an S3 bucket where CloudTrail will deliver log files (skip if you already have one):
- Sign in to the AWS Management Console at https://console.aws.amazon.com/s3.
- Click the AWS region drop-down list next to the account name in the upper right and select the desired region.
- Under General Purpose Buckets, click the Create Bucket button.
- Under General Configuration, enter a name for the bucket (e.g., my-cloudtrail-logs-bucket).
- Under Object Ownership, ensure ACLs Disabled is selected.
- Leave the remaining default selections unchanged and click the Create Bucket button.
- Note the bucket name and ARN for later use.
Step 3: Create SQS Queue
Set up an SQS queue to receive S3 bucket event notifications:
- Sign in to the AWS Management Console at https://console.aws.amazon.com/sqs.
- Under Get Started, click the Create Queue button.
- Under Details, enter a name for the queue (e.g., cloudtrail-events-queue).
- Under Configuration:
  - For Visibility Timeout: Enter 600 seconds (10 minutes)
  - For Message Retention Period: Enter 7 days
  - Set Receive message wait time to 20 seconds for long polling
- Under Access Policy > Choose Method, select the Advanced radio button.
- Delete the entire policy JSON and copy/paste the following policy:
Code
- Replace the placeholders:
  - {AWS_REGION} with the AWS region (e.g., us-east-1)
  - {AWS_ACCOUNT_ID} with the 12-digit AWS account ID
  - {QUEUE_NAME} with the queue name
  - {BUCKET_NAME} with the S3 bucket name
- Click the Create Queue button.
- Note the queue URL and ARN for later use.
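For reference, a queue access policy that lets S3 publish to the queue generally follows the pattern AWS documents for S3-to-SQS notifications. The sketch below builds one from the four placeholders used in this step. It is an assumed shape based on that general AWS pattern, not necessarily the exact policy intended here, and the helper name `build_queue_access_policy` is hypothetical.

```python
import json


def build_queue_access_policy(
    region: str, account_id: str, queue_name: str, bucket_name: str
) -> dict:
    """Allow the S3 service to send messages to one queue from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowS3ToSendMessages",
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": f"arn:aws:sqs:{region}:{account_id}:{queue_name}",
                "Condition": {
                    # Restrict to notifications from the named bucket...
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"},
                    # ...owned by the expected account.
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(
        build_queue_access_policy(
            "us-east-1", "123456789012",
            "cloudtrail-events-queue", "my-cloudtrail-logs-bucket",
        ),
        indent=2,
    ))
```

The two conditions keep other buckets and other accounts from writing to the queue.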
Step 4: Configure S3 Event Notifications
Configure the S3 bucket to send event notifications to the SQS queue:
- Sign in to the AWS Management Console at https://console.aws.amazon.com/s3.
- Under General Purpose Buckets, click on the source bucket.
- Click the Properties tab at the top.
- Locate the Event Notifications section and click Create Event Notification.
- Configure the event notification:
  - Event Name: Enter a descriptive name (e.g., cloudtrail-notification)
  - Event Types: Select the checkbox for All object create events
  - Prefix/Suffix: Optionally filter by object key prefix or suffix. CloudTrail delivers to AWSLogs/<account-id>/CloudTrail/<region>/, so set the prefix to AWSLogs/ (or narrower) if the bucket is shared with non-CloudTrail uploads.
  - Destination: Select the radio button for SQS queue
  - SQS Queue: Choose the queue from the dropdown or enter the queue ARN
- Click the Save Changes button.
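The same notification can be expressed as the configuration document that the S3 API accepts. The sketch below only builds that document. The helper name `build_notification_config` and its defaults are hypothetical, chosen to mirror the console values in this step.

```python
def build_notification_config(
    queue_arn: str,
    prefix: str = "AWSLogs/",
    name: str = "cloudtrail-notification",
) -> dict:
    """Build an S3 NotificationConfiguration targeting one SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "Id": name,
                "QueueArn": queue_arn,
                # "All object create events" in the console.
                "Events": ["s3:ObjectCreated:*"],
                # Only notify for keys under the CloudTrail prefix.
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "prefix", "Value": prefix}]
                    }
                },
            }
        ]
    }
```

With boto3 installed, the result could be applied with `s3.put_bucket_notification_configuration(Bucket=..., NotificationConfiguration=config)`. That call replaces the bucket's entire notification configuration, so merge in any existing rules first.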
Step 5: Verify Event Notifications
AWS sends a test notification to the queue after creation. To verify:
- Navigate to the SQS queue in the AWS Console.
- Under Details, locate Messages available, which should have a value of 1.
- This confirms that S3 can successfully send notifications to the queue.
Note: The test message is an s3:TestEvent payload, not a real S3 event. The connector ignores it once polling starts.
Step 6: Test the Configuration
Before configuring the connector in the platform, verify that the AWS setup is working:
- Wait for the next CloudTrail delivery (typically within 5–15 minutes), or push a CloudTrail-shaped log file to the bucket manually for faster feedback.
- Check the SQS queue to confirm a message was received:
  - Navigate to the SQS queue in the AWS Console
  - Under Details, verify Messages available has increased
- Use the IAM Policy Simulator (optional) to test permissions:
  - Go to https://policysim.aws.amazon.com/home/index.jsp
  - Select the IAM role
  - Test the required S3 and SQS actions listed in the permissions section
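For the manual push mentioned in the first step, a CloudTrail-shaped file is a gzip-compressed JSON document with a top-level Records array. The sketch below builds one in memory. The helper name `build_test_log`, the sample event fields, and the `connector-test` suffix are illustrative assumptions. The key layout follows CloudTrail's usual AWSLogs/<account-id>/CloudTrail/<region>/ convention.

```python
import gzip
import json
from datetime import datetime, timezone
from typing import Optional, Tuple


def build_test_log(
    account_id: str, region: str, now: Optional[datetime] = None
) -> Tuple[str, bytes]:
    """Return (object_key, gzipped_body) for a one-event test file."""
    now = now or datetime.now(timezone.utc)
    event = {  # minimal, made-up event for testing only
        "eventVersion": "1.08",
        "eventTime": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "eventSource": "s3.amazonaws.com",
        "eventName": "ListBuckets",
        "awsRegion": region,
        "recipientAccountId": account_id,
    }
    body = gzip.compress(json.dumps({"Records": [event]}).encode("utf-8"))
    key = (
        f"AWSLogs/{account_id}/CloudTrail/{region}/{now:%Y/%m/%d}/"
        f"{account_id}_CloudTrail_{region}_{now:%Y%m%dT%H%MZ}_connector-test.json.gz"
    )
    return key, body
```

With boto3 installed, the pair could be uploaded with `s3.put_object(Bucket=..., Key=key, Body=body)`, which should trigger an object-created notification within seconds.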
Settings
This connector is purpose-built for native AWS CloudTrail S3 deliveries, so the file format (json), compression (gzip), and record location are fixed internally — they're not exposed as user settings.
| Setting | Type | Required | Description |
|---|---|---|---|
| Queue URL | string | Yes | The URL of the SQS queue that receives S3 event notifications (e.g., https://sqs.us-east-1.amazonaws.com/123456789012/my-queue) |
| Role ARN | string | No | The ARN of the IAM role to assume for accessing SQS and S3 (e.g., arn:aws:iam::123456789012:role/s3-sqs-connector-role). Required when not using static credentials. See Authentication Methods. |
| Region | string | Yes | The AWS region where the SQS queue and S3 buckets are located (e.g., us-east-1, us-west-2) |
| Chunking Mode | string | No | How each CloudTrail file's Records array is split before downstream emission. by_size (default) keeps the original envelope and chunks Records into ~1MB batches — every other top-level field is preserved on each chunk. per_record emits one message per Records[] element as a bare event — the surrounding envelope is dropped. |
| Exclude Digest Files | boolean | No | When true, skips S3 objects whose key path contains /CloudTrail-Digest/. CloudTrail digest files contain hash signatures rather than events. Default: false (digest files pass through). |
| With Metadata | boolean | No | If enabled, adds a _monad_metadata field with bucket information to each emitted message (default: false) |
| Uses SNS | boolean | No | If enabled, expects SQS messages to be SNS envelopes wrapping S3 event payloads. Enable this only if the bucket sends notifications through SNS before reaching SQS (default: false) |
Files are automatically decompressed (gzip) and parsed as JSON; no user configuration is needed for either.
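To make the Uses SNS and Exclude Digest Files settings concrete, the sketch below shows one plausible way each could behave. Both function names are hypothetical and this is not the connector's implementation. It assumes the standard SNS-to-SQS envelope, where the original payload is a JSON string in the Message field.

```python
import json


def unwrap_sns(sqs_body: str, uses_sns: bool) -> dict:
    """Return the S3 event payload, unwrapping an SNS envelope if needed."""
    payload = json.loads(sqs_body)
    if uses_sns:
        # SNS carries the original S3 event as a JSON string in "Message".
        return json.loads(payload["Message"])
    return payload


def should_skip(key: str, exclude_digest_files: bool) -> bool:
    """True when the object is a digest file and digests are excluded."""
    return exclude_digest_files and "/CloudTrail-Digest/" in key
```

For example, a key such as AWSLogs/123456789012/CloudTrail-Digest/us-east-1/... is skipped only when Exclude Digest Files is true. Regular log keys under /CloudTrail/ are never affected.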
Important Notes
- Event Filtering: Only messages with eventSource of aws:s3 are processed; other messages in the queue are ignored
- Order Preservation: The order of events within the Records array is preserved
- File Processing: Each S3 object is downloaded, decompressed, and parsed according to the format settings
- Record Extraction: Individual records are extracted from files and sent downstream
- Error Handling: If file processing fails, the SQS message remains in the queue and will be retried
- Message Deletion: SQS messages are deleted after successful file processing. Messages that aren't S3 event notifications are detected via their metadata and deleted from the queue without further processing.
- Output Shape: Depends on Chunking Mode.
  - by_size (default): each emitted message is the CloudTrail envelope with Records chunked into ~1MB batches. All other top-level fields (_monad_metadata, etc.) are preserved on every chunk.
  - per_record: each emitted message is a single bare CloudTrail event (the envelope is dropped). If With Metadata is enabled, _monad_metadata is added to each event individually rather than to the envelope.

  With by_size, downstream consumers that want one event per message can still flatten with a jq transform: .Records[].
- Metadata Preservation: If With Metadata is enabled, the _monad_metadata field (containing bucket information) is added at the top level of each emitted message
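The two output shapes can be illustrated with a small sketch. It assumes chunk size is measured on the JSON-serialized records with a limit of about 1 MB. The connector's exact accounting is not documented here, and both function names are hypothetical.

```python
import json
from typing import Iterator, Optional


def chunk_by_size(envelope: dict, max_bytes: int = 1_000_000) -> Iterator[dict]:
    """Yield copies of the envelope, each holding a size-bounded Records batch."""
    base = {k: v for k, v in envelope.items() if k != "Records"}
    batch, size = [], 0
    for record in envelope.get("Records", []):
        record_size = len(json.dumps(record).encode("utf-8"))
        # Start a new chunk when adding this record would exceed the limit.
        if batch and size + record_size > max_bytes:
            yield {**base, "Records": batch}
            batch, size = [], 0
        batch.append(record)  # original order is kept
        size += record_size
    if batch:
        yield {**base, "Records": batch}


def per_record(envelope: dict, metadata: Optional[dict] = None) -> Iterator[dict]:
    """Yield each event on its own, dropping the surrounding envelope."""
    for record in envelope.get("Records", []):
        if metadata is not None:
            yield {**record, "_monad_metadata": metadata}
        else:
            yield record
```

In this sketch a single record larger than the limit still forms its own chunk, so no event is ever split or dropped.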
Related Articles
- AWS CloudTrail Documentation
- Creating a Trail (delivers logs to S3)
- CloudTrail Log File Examples
- Amazon S3 Event Notifications
- Amazon SQS Documentation
- S3 Event Message Structure
- AWS IAM Role Chaining and Temporary Credentials