Delimited Format for Output

This document explains how to configure delimited file output (such as CSV or TSV) for any Monad output component that supports delimited formats.

Overview

Monad provides a flexible delimited format configuration that allows you to convert your data into various delimiter-separated values formats. The most common is CSV (Comma-Separated Values), but you can specify any single character as a delimiter to create other formats such as TSV (Tab-Separated Values) or custom-delimited files.

Configuration Options

When configuring delimited output in Monad, you specify a delimiter and, optionally, a list of headers:

{
  "delimiter": "[character]",
  "headers": ["header1", "header2", "..."]
}

Parameters

Parameter	Description	Required	Default
`delimiter`	A single character to use as the field separator	Yes	-
`headers`	An array of column header names in the desired order	No	All possible fields, sorted alphabetically
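
The constraints in the table can be checked before a configuration is submitted. The following is a minimal sketch, not Monad's own validation logic: `validate_delimited_format` is a hypothetical helper, and the error messages are assumptions made for illustration.

```python
def validate_delimited_format(config):
    """Check a delimited-format config dict against the parameter table.

    Hypothetical helper for illustration; Monad's real validation may differ.
    """
    delimiter = config.get("delimiter")
    # delimiter is required and must be exactly one character
    if not isinstance(delimiter, str) or len(delimiter) != 1:
        raise ValueError("delimiter is required and must be a single character")

    headers = config.get("headers")
    # headers is optional; when present it must be a list of strings
    if headers is not None:
        if not isinstance(headers, list) or not all(isinstance(h, str) for h in headers):
            raise ValueError("headers must be an array of strings")
    return config


validate_delimited_format({"delimiter": ",", "headers": ["name", "age", "email"]})
validate_delimited_format({"delimiter": "\t"})  # headers omitted: allowed
```

A multi-character delimiter such as `",,"` or a configuration without `delimiter` raises `ValueError` in this sketch.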

Delimiter Examples

Here are some common delimiter characters you can use:

Format	Delimiter	Description
CSV	`,`	Comma-separated values (most common)
TSV	`\t`	Tab-separated values
Pipe-separated	`|`	Pipe-separated values
Semicolon-separated	`;`	Semicolon-separated values (common in locales where comma is used as a decimal separator)
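
To see how the delimiter choice changes a row, the sketch below renders the same values with each delimiter from the table. It uses Python's standard `csv` module as a stand-in for Monad's writer; `render_row` is a hypothetical helper name.

```python
import csv
import io


def render_row(values, delimiter):
    """Join one row of values using the given single-character delimiter."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerow(values)
    return buffer.getvalue()


row = ["John", 30, "john@example.com"]
for label, delimiter in [("CSV", ","), ("TSV", "\t"), ("Pipe", "|"), ("Semicolon", ";")]:
    # repr() makes the tab character visible as \t
    print(f"{label}: {render_row(row, delimiter)!r}")
```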

Header Configuration

Explicit Headers

When you explicitly define the headers array, Monad will:

Only include the specified columns in the output
Order the columns exactly as specified in the array
Output empty values for any missing fields

Example Configuration:

{
  "delimiter": ",",
  "headers": ["name", "age", "email"]
}

This configuration will produce a file with exactly these three columns in the specified order.
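
The three rules for explicit headers can be approximated with Python's `csv.DictWriter`. This is a sketch of the described behavior under the assumption that it maps onto `DictWriter` options, not Monad's implementation; `to_delimited` and the sample records are made up for illustration.

```python
import csv
import io


def to_delimited(records, headers, delimiter=","):
    """Render records with a fixed column set and order.

    Fields not listed in headers are dropped; missing fields become empty.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=headers,      # column order follows the headers list
        delimiter=delimiter,
        extrasaction="ignore",   # drop fields that are not in headers
        restval="",              # empty value for missing fields
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [
    {"name": "John", "age": 30},  # no email: the column is left empty
    {"name": "Jane", "age": 25, "email": "jane@example.com", "department": "Sales"},
]
print(to_delimited(records, ["name", "age", "email"]))
# name,age,email
# John,30,
# Jane,25,jane@example.com
```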

Automatic Headers

If you omit the headers array, Monad will:

Automatically detect all possible fields from your data
Sort the headers alphabetically
Include all fields found in any record

Example Configuration:

{
  "delimiter": ","
}

This configuration will include all fields present in your data, ordered alphabetically.
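
Automatic header detection amounts to taking the union of field names across all records and sorting it. The sketch below assumes a plain code-point sort, as Python's `sorted` performs; how Monad orders mixed-case or non-ASCII field names is not stated here. `detect_headers` is a hypothetical name.

```python
def detect_headers(records):
    """Return the sorted union of field names found in any record."""
    fields = set()
    for record in records:
        fields.update(record.keys())
    return sorted(fields)


records = [
    {"name": "John", "age": 30, "email": "john@example.com"},
    {"name": "Jane", "age": 25, "email": "jane@example.com", "department": "Sales"},
]
print(detect_headers(records))
# ['age', 'department', 'email', 'name']
```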

Output Example

Given the following JSON data:

[
  {"name": "John", "age": 30, "email": "john@example.com"},
  {"name": "Jane", "age": 25, "email": "jane@example.com", "department": "Sales"}
]

With Explicit Headers

Configuration:

{
  "delimiter": ",",
  "headers": ["name", "email", "age"]
}

Output:

name,email,age
John,john@example.com,30
Jane,jane@example.com,25

Note that department is omitted because it wasn't in the specified headers.

With Automatic Headers

Configuration:

{
  "delimiter": ","
}

Output:

age,department,email,name
30,,john@example.com,John
25,Sales,jane@example.com,Jane

All fields are included, sorted alphabetically, with empty values for missing fields.

Best Practices

Explicit Headers Recommended: For consistent output across multiple batches, always specify explicit headers. This ensures that your file structure remains consistent even if the available fields change between batches.
Handling Special Characters: When your data contains the delimiter character, Monad will automatically handle proper escaping in the output.
Unicode Support: Monad supports Unicode characters in both delimiters and data, making it useful for international datasets.
Field Order: Explicit headers fix the column order, which keeps wide files readable and lets downstream tools rely on stable column positions.
Missing Values: Fields that don't exist in a particular record will be output as empty values, not as null or other placeholders.
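
As an illustration of the special-character point above, the sketch below shows one common escaping convention: fields containing the delimiter or a quote are wrapped in double quotes, and embedded quotes are doubled. This is the default behavior of Python's `csv` module; the exact escaping rules Monad applies are not specified in this document, so treat the output as an assumption.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
writer.writerow(["name", "note"])
# The first value contains the delimiter, the second contains quotes.
writer.writerow(["Doe, John", 'said "hi"'])
escaped = buffer.getvalue()
print(escaped)
# name,note
# "Doe, John","said ""hi"""

# Reading the text back recovers the original values.
rows = list(csv.reader(io.StringIO(escaped), delimiter=","))
print(rows[1])
```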

Batch Considerations

When processing data in batches without explicit headers:

Each batch may generate files with different header sets if the fields vary between batches
Column positions may shift between batches when new fields appear, because each new field is inserted at its alphabetical position

To prevent this inconsistency, it's strongly recommended to explicitly define headers when your data structure might vary between batches.
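
The drift described above can be reproduced with two small batches. This sketch assumes headers are derived per batch as the sorted union of field names; `batch_headers` and the batch contents are illustrative only.

```python
def batch_headers(batch):
    """Derive headers for a single batch: sorted union of its field names."""
    return sorted({field for record in batch for field in record})


batch_1 = [{"name": "John", "age": 30, "email": "john@example.com"}]
batch_2 = [{"name": "Jane", "age": 25, "email": "jane@example.com", "department": "Sales"}]

headers_1 = batch_headers(batch_1)
headers_2 = batch_headers(batch_2)

print(headers_1)  # ['age', 'email', 'name']
print(headers_2)  # ['age', 'department', 'email', 'name']

# The same field lands in a different column in each file.
print(headers_1.index("email"), headers_2.index("email"))  # 1 2
```

A consumer that reads columns by position would pick up `department` where it expects `email` in the second file, which is why a fixed `headers` array is the safer choice.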

File Extension

Delimited output files use the .csv extension by default, regardless of the delimiter used.

Complete Example

Here's a complete configuration example for a Monad output component using CSV format:

{
  "component": "file_output",
  "config": {
    "path": "/data/exports/",
    "format": "delimited",
    "delimited_format": {
      "delimiter": ",",
      "headers": ["id", "name", "email", "created_at"]
    }
  }
}

This configuration will output your data as a CSV file with the specified column order, including only the fields listed in the headers array.
