B2 Cloud Storage
Streams data stored in Backblaze B2 Cloud Storage buckets.
Sync Type: Incremental
Requirements
- Backblaze B2 Account with Application Key credentials
- B2 bucket
- Required permissions (read-only):
  - listBuckets - to list available buckets
  - listFiles - to list files within the bucket
  - readFiles - to read file contents
Configuration
The tables below define the input parameters, listing each field's type, whether it is required, and its description.
Settings
| Setting | Type | Required | Description |
|---|---|---|---|
| Region | string | Yes | B2 Region of your bucket (e.g., us-west-001, us-west-002, eu-central-003) |
| Bucket | string | Yes | Name of the B2 bucket |
| Prefix | string | No | Prefix of the B2 object keys to read |
| Compression | string | Yes | Compression format of the B2 objects |
| Format | string | Yes | File format of the B2 objects |
| Partition Format | string | Yes | The existing partition format used in your B2 bucket |
| Record Location | string | No | Location of the record in the JSON object. Applies only if the format is JSON |
| Backfill Start Time | string | No | The date to start fetching data from. If not specified, no past records will be fetched. |
Partition Format
The Partition Format setting specifies the existing organization of data within your B2 bucket. This is crucial for the system to correctly navigate and read your data. Select the option that matches your current B2 bucket structure:
- Simple Date Format ('Simple Date'):
  - Structure: YYYY/MM/DD
  - Example: 2024/01/01
  - Use case: For buckets using basic chronological organization of data
- Hive-compatible Format ('Hive'):
  - Structure: year=YYYY/month=MM/day=DD
  - Example: year=2024/month=01/day=01
  - Use case: For buckets set up in a Hive-compatible format, common in data lake configurations
Selecting the correct Partition Format ensures that the system can efficiently locate and process your existing data by matching your B2 bucket's current structure.
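To make the two layouts concrete, here is a minimal Python sketch that builds the object-key prefix for a given day. The function name `partition_path` and its arguments are hypothetical and are not part of the connector; only the two structures and the option names come from the descriptions above.

```python
from datetime import date


def partition_path(day: date, partition_format: str, prefix: str = "") -> str:
    """Build the key prefix under which a day's files would be stored.

    Hypothetical helper: the option names mirror the 'Simple Date' and
    'Hive' choices described above.
    """
    if partition_format == "Simple Date":
        # YYYY/MM/DD
        path = f"{day:%Y}/{day:%m}/{day:%d}"
    elif partition_format == "Hive":
        # year=YYYY/month=MM/day=DD
        path = f"year={day:%Y}/month={day:%m}/day={day:%d}"
    else:
        raise ValueError(f"unknown partition format: {partition_format!r}")

    # An optional prefix is placed in front of the date partition.
    prefix = prefix.strip("/")
    return f"{prefix}/{path}" if prefix else path


print(partition_path(date(2024, 1, 1), "Simple Date"))
print(partition_path(date(2024, 1, 1), "Hive"))
```

Comparing the printed paths against a few real object keys in your bucket is a quick way to confirm which option to select.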
Record Location
The record_location setting helps you specify where to find the array of records within a JSON object. This is particularly useful when your data is nested within the JSON structure.
Example Usage
If your JSON files have the following structure:
{
  "metadata": {
    "timestamp": "2024-01-01T10:00:00Z",
    "version": "1.0"
  },
  "data": {
    "events": [
      { "id": 1, "type": "login" },
      { "id": 2, "type": "logout" }
    ]
  }
}
To process the events array, set:
record_location = "data.events"
If no record_location is specified, it defaults to "@this", which treats the entire JSON object as a single record or expects an array at the root level.
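The lookup described above can be approximated with the following Python sketch. It is an illustration only: `extract_records` is a hypothetical name, and it handles just plain dotted keys plus the "@this" default, whereas the connector's actual path syntax may support more.

```python
import json
from typing import Any


def extract_records(document: Any, record_location: str = "@this") -> list:
    """Return the records found at record_location inside a parsed JSON value.

    Hypothetical helper covering only dotted keys and the "@this" default.
    """
    node = document
    if record_location != "@this":
        # Walk down one key at a time, e.g. "data.events" -> ["data", "events"].
        for key in record_location.split("."):
            if not isinstance(node, dict) or key not in node:
                raise KeyError(f"{record_location!r} not found (stopped at {key!r})")
            node = node[key]
    # An array yields one record per element; anything else is a single record.
    return node if isinstance(node, list) else [node]


sample = json.loads("""
{
  "metadata": {"timestamp": "2024-01-01T10:00:00Z", "version": "1.0"},
  "data": {"events": [{"id": 1, "type": "login"}, {"id": 2, "type": "logout"}]}
}
""")

for record in extract_records(sample, "data.events"):
    print(record)
```

With the default location, the same file would be treated as one record containing both the metadata and data objects.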
Secrets
| Setting | Type | Required | Description |
|---|---|---|---|
| Application Key ID | string | Yes | Backblaze B2 Application Key ID for authentication |
| Application Key | string | Yes | Backblaze B2 Application Key for authentication |
Setup Instructions
Step 1: Create Application Keys in Backblaze B2
- Log in to your Backblaze B2 Console
- Navigate to "App Keys" in the sidebar
- Click "Add a New Application Key"
- Configure the key:
  - Name: Give your key a descriptive name (e.g., "Monad Integration")
  - Capabilities: Select the following read-only permissions:
    - listBuckets
    - listFiles
    - readFiles
  - Bucket Access: Choose either:
    - "All" for access to all buckets, or
    - "Specific bucket" and select your target bucket
- Click "Create New Key"
- Important: Copy and securely store both the keyID and applicationKey. You won't be able to see the applicationKey again.
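Before entering the new key into the configuration, it can be useful to confirm that it carries the three capabilities listed above. The Python sketch below is one possible way to do that with the B2 native API's b2_authorize_account call, which uses HTTP Basic authentication with the keyID and applicationKey. The helper names are hypothetical, and the response layout (`allowed.capabilities`) is assumed to be that of API version 2; check the current Backblaze documentation for the version you target.

```python
import base64
import json
import urllib.request

REQUIRED_CAPABILITIES = {"listBuckets", "listFiles", "readFiles"}
# Assumption: version 2 of the B2 native API.
AUTHORIZE_URL = "https://api.backblazeb2.com/b2api/v2/b2_authorize_account"


def basic_auth_header(key_id: str, application_key: str) -> str:
    """Encode keyID:applicationKey as an HTTP Basic Authorization value."""
    raw = f"{key_id}:{application_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def missing_capabilities(granted) -> list:
    """Return the required capabilities that the key does not have, sorted."""
    return sorted(REQUIRED_CAPABILITIES - set(granted))


def fetch_granted_capabilities(key_id: str, application_key: str) -> list:
    """Ask B2 which capabilities the key has (performs a network request)."""
    request = urllib.request.Request(
        AUTHORIZE_URL,
        headers={"Authorization": basic_auth_header(key_id, application_key)},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    # Assumed v2 response layout: {"allowed": {"capabilities": [...]}, ...}
    return body["allowed"]["capabilities"]
```

A call such as `missing_capabilities(fetch_granted_capabilities(key_id, application_key))` would return an empty list when the key is sufficient; the two credentials are supplied by you and should never be hard-coded.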
Step 2: Prepare Your B2 Bucket Structure
Ensure your bucket follows one of these partition formats:
Simple Date Format
bucket/
  2024/
    01/
      01/
        data.json
        logs.json
Hive Format
bucket/
  year=2024/
    month=01/
      day=01/
        data.json
        logs.json
You can optionally include a prefix for better organization:
bucket/
  data/
    device-logs/
      2024/
        01/
          01/
            events.json
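To check that existing object keys follow one of these layouts, a small classifier along the following lines may help. This is a sketch, not connector behavior: `detect_partition_format` is a hypothetical name, and it assumes the date partition comes directly after the optional prefix, as in the trees above.

```python
import re
from typing import Optional

# Patterns for the two layouts shown above, anchored at the start of the key.
_SIMPLE_DATE = re.compile(r"^\d{4}/\d{2}/\d{2}/")
_HIVE = re.compile(r"^year=\d{4}/month=\d{2}/day=\d{2}/")


def detect_partition_format(key: str, prefix: str = "") -> Optional[str]:
    """Classify an object key as 'Simple Date', 'Hive', or None (no match)."""
    prefix = prefix.strip("/")
    if prefix:
        # The key must live under the configured prefix.
        if not key.startswith(prefix + "/"):
            return None
        key = key[len(prefix) + 1:]
    if _SIMPLE_DATE.match(key):
        return "Simple Date"
    if _HIVE.match(key):
        return "Hive"
    return None


print(detect_partition_format("2024/01/01/data.json"))
print(detect_partition_format("year=2024/month=01/day=01/logs.json"))
print(detect_partition_format("data/device-logs/2024/01/01/events.json", "data/device-logs"))
```

A result of None for keys you expect to be read usually points to a wrong Prefix value or a layout that matches neither option.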
Troubleshooting
Common Issues
- Authentication Failed:
  - Verify your Application Key ID and Application Key are correct
  - Ensure the key has the required read-only capabilities (listBuckets, listFiles, readFiles)
  - Check that the key has access to the specified bucket
- No Files Found:
  - Verify the bucket name is correct
  - Check that the prefix matches your bucket structure
  - Ensure the partition format setting matches your data organization
- Connection Timeout:
  - Verify the region setting matches your bucket's actual region
  - Check network connectivity to Backblaze B2 endpoints
- Parse Errors:
  - Ensure the file format setting matches your actual file format
  - Verify the record location setting for JSON files
  - Check that the compression setting matches your file compression
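For the compression check under Parse Errors, one way to see how a file is actually compressed is to inspect its first few bytes after downloading it. The Python sketch below compares them against well-known magic numbers. The function name and the returned labels are hypothetical and may not match the values the Compression setting accepts.

```python
import gzip

# Well-known file signatures (magic numbers) for common compression formats.
MAGIC_NUMBERS = {
    "gzip": b"\x1f\x8b",
    "zstd": b"\x28\xb5\x2f\xfd",
    "bzip2": b"BZh",
}


def sniff_compression(head: bytes) -> str:
    """Guess the compression format from the first bytes of a file.

    Returns "none" when no known signature matches, which usually means
    the file is uncompressed or uses a format not listed here.
    """
    for name, magic in MAGIC_NUMBERS.items():
        if head.startswith(magic):
            return name
    return "none"


print(sniff_compression(gzip.compress(b'{"id": 1}')[:4]))
print(sniff_compression(b'{"id": 1}'))
```

If the detected format differs from the configured one, update the Compression setting rather than the files.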