
Amazon Security Lake

Send OCSF records to AWS Security Lake, enabling centralized storage and analysis of security data in a standardized format.

Overview

The Amazon Security Lake connector enables you to send OCSF (Open Cybersecurity Schema Framework) records to AWS Security Lake. This connector streamlines the complex process of preparing and uploading security data to AWS Security Lake by handling the required format conversions, partitioning, and upload requirements.

Setup Instructions

  1. Ensure you have the necessary IAM role with permissions to write to the S3 bucket and access AWS Security Lake. If using static credentials, make sure they include the following permission policy for the underlying S3 bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::{bucket_name}",
        "arn:aws:s3:::{bucket_name}/*"
      ]
    }
  ]
}
  2. Configure the connector with the required settings, including the S3 bucket URL, IAM role ARN, and source account details.
  3. Deploy the connector and validate its functionality by sending test records.

Steps to Register a Custom Source in Amazon Security Lake

For custom sources, Amazon Security Lake provisions the S3 bucket location, creates the Lake Formation table associated with that location, creates a role in the customer’s account that is used to write the custom source data, sets up a Glue crawler to maintain partition metadata, and coordinates subscriber access to the source once data is written to Amazon Security Lake.

Follow these steps to register a custom source:

  1. Navigate to Amazon Security Lake in the AWS Management Console.
  2. Select Create Custom Source.
  3. Provide a name for your data source.
  4. Choose the desired OCSF event class from the dropdown menu.
  5. In the Account Details section:
    • Enter 339712996529 in the AWS Account ID field.
    • Enter your Monad organization ID in the External ID field.
  6. Click Create.
  7. Once completed, make note of the S3 URL and IAM role ARN generated for the connector configuration.
  8. Open the IAM console in your AWS account and locate the role whose ARN you noted in step 7.
  9. Access the Trust Relationship tab for that role.
  10. Add the following statement to the Statement array:
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::339712996529:root"
      },
      "Action": "sts:TagSession"
    }
  11. The updated trust relationship should resemble the following:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::339712996529:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<monad organization id>"
            }
          }
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::339712996529:root"
          },
          "Action": "sts:TagSession"
        }
      ]
    }
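If you manage this trust relationship through infrastructure-as-code, the two statements can be generated programmatically. The sketch below builds the same policy document; the `build_trust_policy` helper and its argument names are illustrative, not part of the connector.

```python
import json

MONAD_ACCOUNT = "339712996529"  # the Monad AWS account referenced above


def build_trust_policy(monad_org_id: str) -> dict:
    """Build the trust relationship: Monad's account may assume the role
    (gated by the organization ID as external ID) and tag sessions."""
    principal = {"AWS": f"arn:aws:iam::{MONAD_ACCOUNT}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": principal,
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": monad_org_id}},
            },
            {
                "Effect": "Allow",
                "Principal": principal,
                "Action": "sts:TagSession",
            },
        ],
    }


print(json.dumps(build_trust_policy("<monad organization id>"), indent=2))
```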

Additional Steps for Managed Instances

For managed Monad instances, after creating a custom source in Security Lake, go to the IAM console and find the role created by Security Lake. Update the trust relationship for the role with the following policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account_id>:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<monad organization id>"
        }
      }
    }
  ]
}

Functionality

  • Ingests OCSF-formatted security events from your systems
  • Transforms JSON events into Apache Parquet format with proper schema and compression
  • Organizes files according to Security Lake's required partition structure
  • Securely delivers the data to your designated S3 bucket in AWS Security Lake

Configuration

The following configuration defines the input parameters. Each field's specifications, such as type, requirements, and descriptions, are detailed below.

Settings

| Setting | Type | Required | Description |
| --- | --- | --- | --- |
| AWS IAM Role ARN | string | Yes | The Amazon Resource Name (ARN) of the IAM role to assume for S3 access. |
| S3 Bucket Url | string | Yes | The S3 bucket URL where data will be stored. |
| Source Account Details | string | Yes | Your 12-digit AWS account ID where the security events originate. |
| Parquet Format | ParquetFormat | Yes | Configuration for formatting data in Apache Parquet format. |
| Batch Config | BatchConfig | No | Controls batch processing limits (record count, data size, publish rate). |

Advanced Configuration

Source Account Details

| Setting | Type | Required | Description |
| --- | --- | --- | --- |
| AWS Region | string | Yes | The AWS region of the events, as a single bucket can store events from different regions. |
| Account ID | string | Yes | The account ID of the account where the events were created, as a single bucket can store events from different accounts. |

ParquetFormatter Details

The ParquetFormat object requires a schema that maps your OCSF events to Parquet format:

{
  "Tag": "name=my_schema, repetitiontype=REQUIRED",
  "Fields": [
    {"Tag": "name=Activity_id, type=INT64, convertedtype=INT_64, repetitiontype=OPTIONAL"},
    {"Tag": "name=Activity_name, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"},
    {"Tag": "name=Time, type=INT64, convertedtype=INT_64, repetitiontype=OPTIONAL"},
    {"Tag": "name=Severity
    ...
}

Your schema must include fields matching your OCSF data structure. All string fields should use type=BYTE_ARRAY, convertedtype=UTF8, and numeric fields should use appropriate types such as type=INT64.

For more details on how to write Parquet schemas, refer to the parquet-go repository.
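When a schema has many fields, assembling the tag strings by hand is error-prone. A small helper can generate them from the conventions above; the `ocsf_field` function below is an illustrative sketch, not part of the connector or of parquet-go.

```python
import json


def ocsf_field(name: str, kind: str) -> dict:
    """Build one parquet-go field entry: 'string' maps to BYTE_ARRAY/UTF8,
    'int' maps to INT64/INT_64, matching the conventions described above."""
    if kind == "string":
        tag = f"name={name}, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=OPTIONAL"
    elif kind == "int":
        tag = f"name={name}, type=INT64, convertedtype=INT_64, repetitiontype=OPTIONAL"
    else:
        raise ValueError(f"unsupported kind: {kind}")
    return {"Tag": tag}


schema = {
    "Tag": "name=my_schema, repetitiontype=REQUIRED",
    "Fields": [
        ocsf_field("Activity_id", "int"),
        ocsf_field("Activity_name", "string"),
        ocsf_field("Time", "int"),
    ],
}
print(json.dumps(schema, indent=2))
```

The generated JSON can then be pasted into the ParquetFormat setting.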

BatchConfig Details

| Setting | Type | Required | Description |
| --- | --- | --- | --- |
| Batch Data Size | int | Yes | The maximum amount of data to accumulate before a batch is uploaded. |
| Batch Record Count | int | Yes | The maximum number of records to accumulate before a batch is uploaded. |
| Publish Rate | int | Yes | How often pending batches are checked for upload. |
  • Lower record count and data size values create smaller, more frequent uploads
  • Higher values batch more data, reducing API calls but increasing latency
  • The publish rate controls how often batches are checked for upload
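The interplay of the record-count and data-size limits can be sketched as follows. This is a simplified model to illustrate the trade-offs above, not the connector's actual implementation; the `Batcher` class and its names are hypothetical.

```python
class Batcher:
    """Accumulate records; flush when either the record-count or the
    data-size limit is reached. The publish rate (not modeled here)
    would control how often these checks run."""

    def __init__(self, max_records: int, max_bytes: int):
        self.max_records = max_records
        self.max_bytes = max_bytes
        self.records, self.size = [], 0
        self.flushed = []  # batches that would be uploaded to S3

    def add(self, record: bytes) -> None:
        self.records.append(record)
        self.size += len(record)
        if len(self.records) >= self.max_records or self.size >= self.max_bytes:
            self.flush()

    def flush(self) -> None:
        if self.records:
            self.flushed.append(self.records)
            self.records, self.size = [], 0


b = Batcher(max_records=3, max_bytes=1024)
for i in range(7):
    b.add(b'{"Activity_id": %d}' % i)
b.flush()  # drain the partial final batch
print([len(batch) for batch in b.flushed])  # → [3, 3, 1]
```

Lowering `max_records` or `max_bytes` produces smaller, more frequent uploads; raising them batches more data per API call at the cost of latency.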

Secrets

None.

AWS Security Lake Requirements Handled by the Connector

The connector automatically ensures your data meets these AWS Security Lake requirements:

  1. Proper Partitioning: Creates the required directory structure:
    bucket/source_name/region=<region>/accountId=<account_id>/eventDay=<yyyyMMdd>/

  2. Parquet Format Compliance:

    • Converts JSON to Parquet
    • Uses zstandard compression
  3. Consistent Schema: Ensures all records follow the same schema structure
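The partition structure from requirement 1 can be expressed as a simple key-prefix builder (relative to the bucket). A minimal sketch; the `partition_prefix` helper and the example values are illustrative.

```python
from datetime import datetime, timezone


def partition_prefix(source_name: str, region: str, account_id: str,
                     event_time: datetime) -> str:
    """Build the Security Lake partition prefix for an object:
    source_name/region=<region>/accountId=<account_id>/eventDay=<yyyyMMdd>/"""
    event_day = event_time.strftime("%Y%m%d")
    return f"{source_name}/region={region}/accountId={account_id}/eventDay={event_day}/"


prefix = partition_prefix("my-ocsf-source", "us-east-1", "123456789012",
                          datetime(2024, 5, 1, tzinfo=timezone.utc))
print(prefix)  # → my-ocsf-source/region=us-east-1/accountId=123456789012/eventDay=20240501/
```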

Troubleshooting

If your data isn't appearing in Security Lake:

  1. Check S3 Permissions: Ensure the IAM role has proper write access
  2. Verify File Structure: Confirm files are uploaded with the correct partitioning
  3. Inspect Parquet Files: Use parquet-tools to verify file format and content
  4. Check AWS Glue Catalog: Security Lake uses Glue to catalog data; ensure the tables exist
  5. Review Lake Formation Permissions: Verify you have permissions to view the data

Best Practices

  1. Optimize Batch Size: Balance between latency and throughput based on your data volume
  2. Use Descriptive Names: Name your connector to reflect the data source for easier management
  3. Monitor Uploads: Regularly check S3 to ensure data is being properly uploaded
  4. Test With Sample Data: Validate your configuration with a small dataset before full deployment

Next Steps

After configuring the connector:

  1. Verify data appears in AWS Security Lake
  2. Set up queries in Amazon Athena to analyze your security data
  3. Configure subscribers to consume and act on your security data