
B2 Cloud Storage

Enables seamless streaming of data stored in Backblaze B2 Cloud Storage buckets.

Sync Type: Incremental

Requirements

  • Backblaze B2 Account with Application Key credentials
  • B2 bucket
  • Required permissions (read-only):
    • listBuckets - to list available buckets
    • listFiles - to list files within the bucket
    • readFiles - to read file contents

Configuration

The following configuration defines the input parameters. Each field's specifications, such as type, requirements, and descriptions, are detailed below.

Settings

| Setting | Type | Required | Description |
| --- | --- | --- | --- |
| Region | string | Yes | B2 Region of your bucket (e.g., us-west-001, us-west-002, eu-central-003) |
| Bucket | string | Yes | Name of the B2 bucket |
| Prefix | string | No | Prefix of the B2 object keys to read |
| Compression | string | Yes | Compression format of the B2 objects |
| Format | string | Yes | File format of the B2 objects |
| Partition Format | string | Yes | The existing partition format used in your B2 bucket |
| Record Location | string | No | Location of the record in the JSON object. Applies only if the format is JSON |
| Backfill Start Time | string | No | The date to start fetching data from. If not specified, no past records will be fetched. |

Partition Format

The Partition Format setting specifies the existing organization of data within your B2 bucket. This is crucial for the system to correctly navigate and read your data. Select the option that matches your current B2 bucket structure:

  1. Simple Date Format ('Simple Date'):

    • Structure: YYYY/MM/DD
    • Example: 2024/01/01
    • Use case: For buckets using basic chronological organization of data
  2. Hive-compatible Format ('Hive'):

    • Structure: year=YYYY/month=MM/day=DD
    • Example: year=2024/month=01/day=01
    • Use case: For buckets set up in a Hive-compatible format, common in data lake configurations

Selecting the correct Partition Format ensures that the system can efficiently locate and process your existing data by matching your B2 bucket's current structure.
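The two layouts above can be illustrated with a short Python sketch that renders the date portion of an object key for a given day. The function name partition_path is hypothetical and only mirrors the structures documented here; it is not the connector's own code.

```python
from datetime import date


def partition_path(day: date, partition_format: str) -> str:
    """Render the date portion of an object key for one partition format."""
    if partition_format == "Simple Date":
        # YYYY/MM/DD
        return f"{day.year:04d}/{day.month:02d}/{day.day:02d}"
    if partition_format == "Hive":
        # year=YYYY/month=MM/day=DD
        return f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}"
    raise ValueError(f"unknown partition format: {partition_format!r}")


print(partition_path(date(2024, 1, 1), "Simple Date"))  # 2024/01/01
print(partition_path(date(2024, 1, 1), "Hive"))         # year=2024/month=01/day=01
```

If the paths your bucket actually contains do not look like either output, neither option will match and no files will be found.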

Record Location

The Record Location setting (record_location) specifies where the array of records sits within a JSON object. This is useful when your records are nested inside the JSON structure.

Example Usage

If your JSON files have the following structure:

{
  "metadata": {
    "timestamp": "2024-01-01T10:00:00Z",
    "version": "1.0"
  },
  "data": {
    "events": [
      { "id": 1, "type": "login" },
      { "id": 2, "type": "logout" }
    ]
  }
}

To process the events array, set:

record_location = "data.events"

If no record_location is specified, it defaults to "@this", which reads records from the root of the document: a root-level JSON object is treated as a single record, and a root-level array is treated as a list of records.
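The lookup described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the connector's implementation: the helper name extract_records is hypothetical, and it supports only plain dot-separated keys plus the "@this" default.

```python
import json
from typing import Any


def extract_records(document: Any, record_location: str = "@this") -> list:
    """Return the records found at record_location as a list."""
    node = document
    if record_location != "@this":
        # Walk one key at a time, e.g. "data.events" -> document["data"]["events"]
        for key in record_location.split("."):
            node = node[key]
    # An array yields many records; any other value is a single record.
    return node if isinstance(node, list) else [node]


raw = """
{
  "metadata": {"timestamp": "2024-01-01T10:00:00Z", "version": "1.0"},
  "data": {"events": [{"id": 1, "type": "login"}, {"id": 2, "type": "logout"}]}
}
"""
doc = json.loads(raw)
print(extract_records(doc, "data.events"))  # the two event objects
print(len(extract_records(doc)))            # 1: the whole object is one record
```

A KeyError from this sketch corresponds to a record location that does not exist in the file, which is one of the causes to check when you see parse errors.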

Secrets

| Setting | Type | Required | Description |
| --- | --- | --- | --- |
| Application Key ID | string | Yes | Backblaze B2 Application Key ID for authentication |
| Application Key | string | Yes | Backblaze B2 Application Key for authentication |

Setup Instructions

Step 1: Create Application Keys in Backblaze B2

  1. Log in to your Backblaze B2 Console
  2. Navigate to "App Keys" in the sidebar
  3. Click "Add a New Application Key"
  4. Configure the key:
    • Name: Give your key a descriptive name (e.g., "Monad Integration")
    • Capabilities: Select the following read-only permissions:
      • listBuckets
      • listFiles
      • readFiles
    • Bucket Access: Choose either:
      • "All" for access to all buckets, or
      • "Specific bucket" and select your target bucket
  5. Click "Create New Key"
  6. Important: Copy and securely store both the keyID and the applicationKey. The applicationKey is shown only once and cannot be viewed again later.
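Before entering the key, it can help to confirm that it carries the three capabilities listed in step 4. Backblaze's native b2_authorize_account call accepts the keyID and applicationKey as HTTP Basic credentials and reports the key's capabilities in its response. The Python sketch below builds that header and compares a capability list against the required set; it makes no network request, and both function names are hypothetical.

```python
import base64

# Capabilities this integration requires, as listed in step 4.
REQUIRED_CAPABILITIES = {"listBuckets", "listFiles", "readFiles"}


def basic_auth_header(key_id: str, application_key: str) -> str:
    """Build the HTTP Basic Authorization header value from the key pair."""
    raw = f"{key_id}:{application_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def missing_capabilities(granted) -> set:
    """Return the required capabilities absent from the granted list."""
    return REQUIRED_CAPABILITIES - set(granted)


# Example: a key created without readFiles.
print(missing_capabilities(["listBuckets", "listFiles"]))  # {'readFiles'}
```

An empty result from missing_capabilities means the key has everything listed under Requirements.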

Step 2: Prepare Your B2 Bucket Structure

Ensure your bucket follows one of these partition formats:

Simple Date Format

bucket/
  2024/
    01/
      01/
        data.json
        logs.json

Hive Format

bucket/
  year=2024/
    month=01/
      day=01/
        data.json
        logs.json

You can optionally include a prefix for better organization:

bucket/
  data/
    device-logs/
      2024/
        01/
          01/
            events.json
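To check whether existing object keys follow one of these layouts, a sketch like the following extracts the partition date from a key. It assumes the prefix and the date portion are joined with "/", which this page implies but does not state. The function name partition_date and the PATTERNS table are hypothetical.

```python
import re
from datetime import date
from typing import Optional

# One pattern per partition format, matched against the key after the prefix.
PATTERNS = {
    "Simple Date": re.compile(r"^(?P<y>\d{4})/(?P<m>\d{2})/(?P<d>\d{2})/"),
    "Hive": re.compile(r"^year=(?P<y>\d{4})/month=(?P<m>\d{2})/day=(?P<d>\d{2})/"),
}


def partition_date(key: str, partition_format: str, prefix: str = "") -> Optional[date]:
    """Return the partition date encoded in key, or None if the layout differs."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"  # assumption: prefix and date portion are joined with "/"
    if not key.startswith(prefix):
        return None
    match = PATTERNS[partition_format].match(key[len(prefix):])
    if match is None:
        return None
    try:
        return date(int(match["y"]), int(match["m"]), int(match["d"]))
    except ValueError:
        return None  # digits present but not a real calendar date


key = "data/device-logs/2024/01/01/events.json"
print(partition_date(key, "Simple Date", "data/device-logs"))  # 2024-01-01
print(partition_date(key, "Hive", "data/device-logs"))         # None
```

A None result for keys you expect to be read points to a mismatch between the Prefix or Partition Format settings and the bucket's actual structure.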

Troubleshooting

Common Issues

  1. Authentication Failed:

    • Verify your Application Key ID and Application Key are correct
    • Ensure the key has the required read-only capabilities (listBuckets, listFiles, readFiles)
    • Check that the key has access to the specified bucket
  2. No Files Found:

    • Verify the bucket name is correct
    • Check that the prefix matches your bucket structure
    • Ensure the partition format setting matches your data organization
  3. Connection Timeout:

    • Verify the region setting matches your bucket's actual region
    • Check network connectivity to Backblaze B2 endpoints
  4. Parse Errors:

    • Ensure the file format setting matches your actual file format
    • Verify the record location setting for JSON files
    • Check that the compression setting matches your file compression
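When diagnosing parse errors, it can help to inspect a downloaded object locally. The Python sketch below assumes gzip is one of the available compression options (this page does not list them) and that the file format is JSON. It detects gzip by its two-byte magic number, decompresses if needed, and then attempts to parse. Both function names are hypothetical.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(payload: bytes) -> bool:
    """Report whether the payload starts with the gzip magic number."""
    return payload[:2] == GZIP_MAGIC


def try_parse_json(payload: bytes):
    """Decompress gzip payloads if needed, then parse the result as JSON."""
    if looks_gzipped(payload):
        payload = gzip.decompress(payload)
    return json.loads(payload)


sample = gzip.compress(b'{"id": 1, "type": "login"}')
print(looks_gzipped(sample))   # True
print(try_parse_json(sample))  # {'id': 1, 'type': 'login'}
```

If looks_gzipped returns True for a file while the Compression setting says the objects are uncompressed (or the reverse), the setting is the likely cause of the parse error.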
