GitHub

GitHub Actions Workflow Logs

Streams GitHub Actions workflow run logs for completed runs in a repository, chunked into ~1 MB pieces split at newline boundaries.

Sync Type: Incremental

Sync Interval: 10 minutes

Requirements

Before configuring this input, you need to:

Create a Personal Access Token (PAT) or GitHub App — GitHub authentication docs

Option A: Personal Access Token (PAT)
- Go to your GitHub settings
- Navigate to Developer settings > Personal access tokens > Tokens (classic) or Fine-grained tokens
- Click "Generate new token"
- For fine-grained tokens, select the repository and grant Actions: Read + Metadata: Read permissions
- For classic tokens, select the repo or public_repo scope (and optionally actions:read)
- Copy and securely store the generated token
Option B: GitHub App
- Create a GitHub App in your organization or account
- Grant the app Actions: Read and Metadata: Read permissions
- Install the app on the target repository
- Generate and store the private key in PEM format
- Note the Client ID and Installation ID
Token Scopes & Permissions:
- Personal Access Token (Classic): repo (full control) or public_repo (public repos only), optionally add actions:read
- Personal Access Token (Fine-grained): Actions: Read and Metadata: Read on target repository
- GitHub App: Actions: Read and Metadata: Read permissions
API Access: Ensure the account/organization has not restricted API access for PATs or apps

Details

How Incremental Syncing Works

On the first sync with no Backfill Start Time configured, the input starts from the current time — the first sync emits nothing and subsequent syncs pick up new completed runs going forward. Configure Backfill Start Time to pull historical data.

Each sync queries a lookback window anchored to the cursor (the latest timestamp the input has fully covered), not to the current time. As a result, a run whose lifecycle spans a sync boundary (created before the last sync ended, completed after it) is still captured in the next sync, because the window reaches back up to Maximum Job Execution Time hours behind the cursor.

The walk is checkpointed per sub-window rather than once at the end. As each sub-window of the lookback range is successfully drained, the cursor advances to reflect that coverage — even if the sub-window emitted no records. The benefit is that crash recovery during a long backfill is cheap: a sync that fails partway through a 30-day backfill picks up from the last successfully-drained sub-window, not from the original starting cursor.

A run that completes within the last ~5 seconds of a sync's walk is deferred to a future sync rather than emitted in the current one. This propagation buffer gives GitHub's API state time to fully settle — so when the cursor advances past a given timestamp, any second-granularity sibling runs at that timestamp have all had time to become visible to the list endpoint, and the next sync sees them together rather than missing the laggards. The practical effect is a ~5-second freshness lag at the tail of every sync, in exchange for exactly-once emission with no boundary-second duplicates.
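The lookback window, per-sub-window checkpointing, and propagation buffer described above can be sketched roughly as follows. This is a minimal illustration, not the input's actual implementation: the function names, the `step` width, and the `drain` / `save_cursor` callbacks are all hypothetical, and the rule "only move the cursor forward" is an assumption about how coverage is tracked.

```python
from datetime import datetime, timedelta, timezone

PROPAGATION_BUFFER = timedelta(seconds=5)  # the ~5 s buffer described above


def plan_sub_windows(cursor, now, max_job_hours=2, step=timedelta(minutes=10)):
    """Split the lookback range into (start, end) sub-windows.

    The range starts max_job_hours behind the cursor and stops
    PROPAGATION_BUFFER short of `now`. `step` is a hypothetical width.
    """
    start = cursor - timedelta(hours=max_job_hours)
    stop = now - PROPAGATION_BUFFER
    windows = []
    while start < stop:
        end = min(start + step, stop)
        windows.append((start, end))
        start = end
    return windows


def walk(cursor, now, drain, save_cursor, **plan_kwargs):
    """Drain sub-windows in order, checkpointing after each success."""
    for start, end in plan_sub_windows(cursor, now, **plan_kwargs):
        drain(start, end)        # may raise; earlier checkpoints survive
        if end > cursor:         # assumption: the cursor never moves backwards
            cursor = end
            save_cursor(cursor)  # advance even if the window emitted nothing
    return cursor


# Usage: a sync that starts 10 minutes after the stored cursor.
cursor = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
now = cursor + timedelta(minutes=10)
saved = []
final = walk(cursor, now, drain=lambda s, e: None,
             save_cursor=saved.append, step=timedelta(hours=1))
```

If `drain` raises partway through, the last value passed to `save_cursor` is where the next sync would resume, which is the cheap crash recovery the text describes.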

Correctness contract: logs are guaranteed for any completed run where (updated_at − created_at) ≤ Maximum Job Execution Time. Runs with a longer execution duration may be missed. Increase Maximum Job Execution Time (default: 2 hours, maximum: 12) if your workflows regularly run longer than that.
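To check whether a given run falls inside this contract, the comparison can be written out directly. The sketch below is illustrative only; `within_contract` is a hypothetical helper, and the timestamps are taken from the sample record at the end of this page.

```python
from datetime import datetime, timedelta


def within_contract(created_at, updated_at, max_job_hours=2):
    """True when (updated_at - created_at) <= Maximum Job Execution Time."""
    # Replace the trailing "Z" so older Python versions can parse the value.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    updated = datetime.fromisoformat(updated_at.replace("Z", "+00:00"))
    return updated - created <= timedelta(hours=max_job_hours)


# The sample record's run took under two minutes, so it is covered.
covered = within_contract("2026-04-15T09:42:40Z", "2026-04-15T09:44:32Z")
```

A run that took three hours would only be covered once Maximum Job Execution Time is raised to 3 or more.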

Subsequent syncs pick up where the previous one ended, using an internal cursor advanced per sub-window. Combined with the propagation buffer, this gives exactly-once emission under normal operation. (See "6. Duplicate Records After Errors or Wall-Clock Cap" under Troubleshooting for partial-walk semantics.)

Data Retrieval

- Workflow runs are fetched from the GitHub REST API, filtered by their creation date.
- Log archives (ZIP files) are downloaded from GitHub's CDN using signed URLs. These downloads do not count against REST API rate limits.
- Root-level per-job log files are extracted and chunked for emission. Per-step logs are skipped.

Log Chunking

A single job's log file is split into ~1 MB chunks at newline boundaries. Each chunk is emitted as one record. Each chunk carries its position via chunk_meta.chunk_index and its end-of-file marker via chunk_meta.last_chunk. Records that share a file_id reassemble into one full log file — see Reconstructing full log files below.
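A rough sketch of newline-boundary chunking is shown below. It is an approximation under stated assumptions: the exact byte limit (the text only says "~1 MB"), the `chunk_log` name, and the handling of over-long lines are guesses modelled on the `chunk_meta` fields documented on this page, not the input's real code.

```python
CHUNK_LIMIT = 1_000_000  # assumed value; the text only says "~1 MB"


def chunk_log(data, limit=CHUNK_LIMIT):
    """Split raw log bytes into records of at most `limit` bytes each,
    breaking only after a line ending."""
    pieces = []              # list of (raw_bytes, truncated_line_count)
    buf = bytearray()
    truncated = 0
    # Note: bytes.splitlines also breaks on a bare "\r".
    for line in data.splitlines(keepends=True):
        was_truncated = len(line) > limit
        if was_truncated:
            line = line[:limit]          # keep only the first `limit` bytes
        if buf and len(buf) + len(line) > limit:
            pieces.append((bytes(buf), truncated))
            buf, truncated = bytearray(), 0
        buf += line
        truncated += int(was_truncated)
    if buf or not pieces:
        pieces.append((bytes(buf), truncated))
    return [
        {
            "chunk_meta": {
                "chunk_index": i,
                "last_chunk": i == len(pieces) - 1,
                "chunk_size_bytes": len(raw),
                "truncated_lines": count,
            },
            "content": raw.decode("utf-8", errors="replace"),
        }
        for i, (raw, count) in enumerate(pieces)
    ]


# Usage with a tiny limit so the split is visible.
records = chunk_log(b"aaaa\nbbbb\ncccc\n", limit=10)
```

With a 10-byte limit the three 5-byte lines land in two chunks, and joining the `content` values in `chunk_index` order gives back the original text.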

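The input automatically batches its workflow-run list queries so that each one stays under GitHub's per-query result cap (the list endpoint returns at most 1,000 results for a filtered search).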

Special Cases

Expired logs: GitHub retains workflow logs for ~90 days. Expired runs (404 / 410 HTTP status) are skipped without failing the sync. The query window's lower bound is floored at 30 days ago (a safety bound, tighter than GitHub's retention) so in normal operation we don't approach the retention edge.
Skipped workflow runs: Workflow runs with conclusion: skipped (e.g., a workflow whose top-level if: condition evaluated false, or an Auto approve workflow on an already-approved PR) are not emitted. Their log archives contain only nested per-step runner system entries — no root-level per-job log to ingest. If you compare your emitted-run count against GitHub's status=completed count for the same window and see a gap, expect it to be exactly the number of conclusion=skipped runs in that range.

Re-runs and the `created_at` filter

GitHub's workflow-runs list endpoint only supports filtering by created_at. A re-run preserves the original created_at but bumps run_attempt, so once an input sync has advanced past a run's window, subsequent re-runs of that same run will not be picked up. This is a limitation of GitHub's API, not the input.

Similarly, if a run is already at run_attempt > 1 the first time a sync sees it, only the most recent attempt's logs are fetched; earlier attempts cannot be retrieved retroactively.

Configuration

Settings

Setting	Type	Required	Default	Description
Owner	string	Yes	-	Repository owner (user or organization name)
Repository	string	Yes	-	Repository name (without owner prefix, e.g., `api-service` not `owner/api-service`)
Authentication Method	oneOf	Yes	Personal Access Token	Authentication method to use: Personal Access Token or GitHub App. See sub-fields below.
Maximum Job Execution Time (Hours)	integer	No	2	Upper bound on workflow run duration (`updated_at` minus `created_at`). Runs that complete within this duration are guaranteed to have their logs captured. Runs exceeding this duration may be missed. Accepted range: 1–12. Increase this value if your workflows regularly run longer than 2 hours.
Backfill Start Time	date (RFC3339)	No	Current time	Captures runs whose `updated_at` is after this time. Defaults to the current time on first sync (no historical backfill). Values older than 30 days are silently clipped to a 30-day lookback cap (a safety bound, tighter than GitHub's ~90-day workflow-log retention).
Generate Synthetic Data	boolean	No	false	Generate synthetic demo data instead of connecting to the real data source.
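The defaulting and clipping rules for Backfill Start Time in the table above amount to a small calculation. The following is a sketch only; `effective_backfill_start` is a hypothetical name, and the real input may apply the cap at a slightly different point in the sync.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK_CAP = timedelta(days=30)  # the 30-day cap from the table above


def effective_backfill_start(configured, now):
    """Return the start time actually used on the first sync."""
    if configured is None:
        return now                         # default: no historical backfill
    return max(configured, now - LOOKBACK_CAP)  # older values are clipped


now = datetime(2026, 7, 9, tzinfo=timezone.utc)
requested = now - timedelta(days=45)       # older than the cap
start = effective_backfill_start(requested, now)
```

Here a request for 45 days of history yields a start time exactly 30 days before `now`.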

Authentication Method: Personal Access Token

Field	Type	Required	Description
Personal Access Token	secret	Yes	Classic token with the `repo` (or `public_repo`) scope, or fine-grained token with Actions: Read and Metadata: Read on the target repository.

Authentication Method: GitHub App

Field	Type	Required	Description
Client ID	string	Yes	The GitHub App's client ID
Installation ID	string	Yes	The installation ID for accessing the repository
Private Key	secret	Yes	The GitHub App's private key in PEM format

Record Fields

Each emitted record represents a single chunk of a job's log file. The following table describes the top-level and most commonly used nested fields:

Field	Description
`workflow_meta`	Object containing identifying metadata about the run, workflow, and job that produced this log file. The same `workflow_meta` is repeated on every chunk of the same file.
`workflow_meta.run_id`	Unique identifier for the workflow run within the repository (GitHub's run ID).
`workflow_meta.run_attempt`	The attempt number for re-runs; starts at 1. Useful for distinguishing retry attempts of the same run.
`workflow_meta.workflow_id`	Unique identifier for the workflow definition itself.
`workflow_meta.workflow_name`	Human-readable name of the workflow (e.g., "CI/Build").
`workflow_meta.run_number`	Sequential run number within the workflow.
`workflow_meta.head_branch`	Git ref the run was associated with — branch name, tag (e.g. `v1.2.3`), or PR ref (e.g. `refs/pull/N/head`).
`workflow_meta.head_sha`	Git commit SHA for the code that ran.
`workflow_meta.event`	The trigger event (e.g., `push`, `pull_request`, `schedule`).
`workflow_meta.status`	Run status. Always `completed` — this input only ingests completed runs.
`workflow_meta.conclusion`	Final result if status is `completed`: `success`, `failure`, `cancelled`, etc.
`workflow_meta.created_at`	RFC3339 timestamp when the run was created.
`workflow_meta.updated_at`	RFC3339 timestamp of the most recent update to the run.
`workflow_meta.run_started_at`	RFC3339 timestamp when the run began executing.
`workflow_meta.html_url`	Link to the run in the GitHub UI.
`workflow_meta.repository`	Repository identifier in `owner/repo` format.
`workflow_meta.actor`	GitHub user or app that triggered the run.
`workflow_meta.triggering_actor`	The actor that directly triggered this particular run attempt (may differ from `actor`, for example when someone else re-runs the workflow).
`workflow_meta.job_name`	Human-readable name of the job within the workflow.
`workflow_meta.log_file`	Source filename within GitHub's log archive (e.g., `0_build.txt`). Useful for telling apart the log files of different jobs in the same run.
`file_id`	Stable, deterministic identifier for the source log file: `{owner}/{repo}/{run_id}-{run_attempt}-{log_file}`. Group records by `file_id` to reassemble a full file.
`chunk_meta.chunk_index`	Zero-based position of this chunk within the source file. Use to order chunks when reassembling.
`chunk_meta.last_chunk`	`true` on the final chunk of a file; `false` on every earlier chunk. Signals completeness of the log.
`chunk_meta.chunk_size_bytes`	Byte length of the `content` for this chunk.
`chunk_meta.truncated_lines`	Count of log lines in this chunk that exceeded 1 MB and were truncated. `0` for nearly all real-world logs.
`content`	The raw log text for this chunk, as UTF-8. See Note on non-UTF-8 content below.
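The `file_id` format in the table can be reproduced from the `workflow_meta` fields, which is handy when checking a destination for a specific file. This helper is hypothetical; the values in the usage line come from the sample record at the end of this page.

```python
def build_file_id(repository, run_id, run_attempt, log_file):
    """Compose {owner}/{repo}/{run_id}-{run_attempt}-{log_file}.

    `repository` is the `owner/repo` string from workflow_meta.repository.
    """
    return f"{repository}/{run_id}-{run_attempt}-{log_file}"


file_id = build_file_id("monad-inc/inputs", 24447590112, 1,
                        "0_Cleanup artifacts.txt")
```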

Reconstructing full log files

To rebuild a complete log file from its chunks:

1. Group by file_id: Collect all records that share the same file_id.
2. Sort by chunk index: Within each group, sort records ascending by chunk_meta.chunk_index.
3. Verify completeness: Confirm that the highest-index record has chunk_meta.last_chunk == true, and that chunk indexes form a contiguous sequence from 0 to N (no gaps).
4. Concatenate: Append the content strings in index order. The result is the original log file as it appeared in GitHub's archive.

If no record in the group has chunk_meta.last_chunk == true, or the indexes have gaps, the log is incomplete (e.g., a partial sync, or chunks pending in a downstream queue). Wait for the next sync or investigate.

Content format

GitHub's job log files have a consistent structure that's preserved verbatim in the content field:

Leading byte-order mark. The first chunk of every log file begins with U+FEFF (the UTF-8 BOM). When reassembling a full file, the BOM appears once at the start. Most parsers handle this transparently; strict ones may need to strip it.
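Per-line timestamp prefix. Each line is formatted as <RFC3339Nano timestamp><space><message>, e.g. 2026-04-28T00:31:56.3946773Z Current runner version: '2.334.0'. Timestamps are UTC with fractional seconds; the example carries seven fractional digits (100 ns resolution), so parsers should accept a variable number of digits rather than assume exactly nine.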
Inline GitHub annotation markers. GitHub uses tags like ##[group]...##[endgroup], ##[error], ##[warning], ##[command], and ##[debug] inline in the log text to drive UI rendering (collapsible sections, severity badges). They appear as literal text in content — downstream consumers can parse them out or display them as-is.
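A consumer that wants structured lines might handle the three conventions above along these lines. The regular expressions and the `parse_log` name are illustrative assumptions, and the timestamp is kept as a string because some standard-library parsers reject seven fractional digits.

```python
import re

# Timestamp with optional fractional seconds, one space, then the message.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z) (.*)$")
# Annotation marker at the start of the message, e.g. ##[group] or ##[error].
MARKER_RE = re.compile(r"^##\[(\w+)\]")


def parse_log(text):
    """Return a list of (timestamp, marker, message) tuples."""
    text = text.lstrip("\ufeff")          # drop the leading BOM if present
    parsed = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            parsed.append((None, None, line))   # line without a timestamp
            continue
        timestamp, message = match.groups()
        marker = MARKER_RE.match(message)
        parsed.append((timestamp, marker.group(1) if marker else None, message))
    return parsed


lines = parse_log(
    "\ufeff2026-04-28T00:31:56.3946773Z Current runner version: '2.334.0'\n"
    "2026-04-28T00:31:57.0000000Z ##[group]Setup\n"
)
```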

Note on non-UTF-8 content

GitHub Actions logs are UTF-8 encoded. If a job's stdout or stderr writes raw non-UTF-8 bytes (rare — usually only when a process emits binary data into its log stream), those bytes are replaced with the Unicode replacement character U+FFFD ("�") when the record is JSON-encoded. This replacement is not reversible. The vast majority of real-world workflow logs are clean UTF-8 and unaffected.
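The effect is similar to decoding with replacement in most languages. The snippet below uses Python's own decoder purely as an analogy; the input's encoder may group invalid bytes differently, and the byte string is made up for the example.

```python
raw = b"build ok \xff\xfe done"            # two bytes that are not valid UTF-8
text = raw.decode("utf-8", errors="replace")

# Each invalid byte became U+FFFD, so the original bytes cannot be recovered.
round_trip = text.encode("utf-8")
```

After this, `text` contains two replacement characters and `round_trip` no longer equals `raw`.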

Rate Limits

Scope	Limit	Window	Notes
REST API (Core) - Personal Access Token	5,000 requests	Per hour	Increases to 15,000 for Enterprise Cloud
REST API (Core) - GitHub App	5,000 base + scales	Per hour	Scales to 12,500 per hour; 15,000 on Enterprise Cloud
Secondary Rate Limit	900 points	Per minute	100 concurrent requests max
Log Downloads	Unlimited	N/A	Downloaded from signed CDN URLs; does not count against REST API limits

Rate Limit Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-RateLimit-Used

Source: GitHub REST API Rate Limits
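For anyone calling the same API outside this input, the headers listed above are enough to decide how long to pause. This is a minimal sketch: `seconds_until_reset` is a hypothetical helper, it expects a plain dictionary with the exact header casing shown, and it relies on `X-RateLimit-Reset` being a UTC epoch timestamp in seconds.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request (0 if quota remains)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_epoch = int(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, reset_epoch - now)


wait = seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"},
    now=1700000000,
)
```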

Troubleshooting

1. Authentication Errors

- "invalid credentials": Verify the PAT or GitHub App private key is correct and hasn't expired.
- "insufficient permissions": Ensure the classic PAT has repo (or public_repo for public repositories), the fine-grained PAT has Actions: Read and Metadata: Read, or the GitHub App has Actions: Read and Metadata: Read permissions.
- "repository not found": Double-check the Owner and Repository settings; verify the token has access to the repository.

2. Rate Limit Errors

The input automatically handles rate limiting by waiting until the reset time
If sync is consistently slow due to rate limits, consider:
- Using a GitHub App for higher rate limits
- Configuring a more recent Backfill Start Time to reduce historical data to process
- Running syncs during off-peak hours

3. No Data

First sync: The default starting point is the current time, so the initial sync emits nothing. Set Backfill Start Time to an RFC3339 timestamp to pull historical runs.
Missing logs: Workflow logs are retained for 90 days by default. Older runs won't have downloadable logs and are skipped.
Empty repositories: Verify that the repository has Actions workflows enabled and has run at least once.

4. Missing Runs (Sync Boundary Gap)

If runs that were in-flight during a previous sync appear to be missing, check whether their total execution time exceeded Maximum Job Execution Time. Increase this setting (up to 12 hours) to widen the lookback window and capture longer-running workflows.

5. Partial Records

Records may have an empty job_name if the log filename doesn't follow the expected <N>_<job_name>.txt or <job_name>.txt pattern
A non-zero truncated_lines value indicates one or more log lines in that chunk exceeded 1 MB and were truncated to their first 1 MB — the record is still valid; only the tail of the offending line was dropped
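The filename patterns mentioned above could be matched with a regular expression like the one below. This is a guess at the parsing rule based only on the two documented patterns; the input's actual logic is not shown on this page and may differ for unusual filenames.

```python
import re

# Optional "<N>_" prefix, then the job name, then ".txt".
JOB_FILE_RE = re.compile(r"^(?:\d+_)?(?P<job>.+)\.txt$")


def job_name_from_log_file(log_file):
    """Return the job name, or "" when the filename does not match."""
    match = JOB_FILE_RE.match(log_file)
    return match.group("job") if match else ""


name = job_name_from_log_file("0_Cleanup artifacts.txt")
```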

6. Duplicate Records After Errors or Wall-Clock Cap

The cursor advances per sub-window (typically every few minutes of the lookback range). If a sync errors mid-walk (transient API failure, network issue) or hits the 4-hour wall-clock cap (only triggered by very large backfills or pathological workloads), the next sync resumes from the last successfully-drained sub-window — re-emitting only records from the in-flight sub-window that was interrupted, not from earlier successfully-drained sub-windows.
The re-emit set is bounded by one sub-window's worth of records (typically minutes of activity, not hours).
Destinations that key on file_id + chunk_meta.chunk_index (most upsert-capable sinks) absorb this transparently. Append-only destinations will see duplicate records bounded by what got through before the failure.
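For append-only destinations, the same keying can be applied at read time. The sketch below keeps the last copy seen for each (file_id, chunk_index) pair, which mimics an upsert; `dedupe_chunks` is an illustrative name and records are assumed to be dictionaries shaped like the sample record.

```python
def dedupe_chunks(records):
    """Drop re-emitted chunks, keeping the latest copy of each key."""
    latest = {}
    for rec in records:
        key = (rec["file_id"], rec["chunk_meta"]["chunk_index"])
        latest[key] = rec            # last write wins, like an upsert
    return list(latest.values())


first = {"file_id": "a", "chunk_meta": {"chunk_index": 0}, "content": "x"}
again = {"file_id": "a", "chunk_meta": {"chunk_index": 0}, "content": "x"}
other = {"file_id": "a", "chunk_meta": {"chunk_index": 1}, "content": "y"}
unique = dedupe_chunks([first, other, again])
```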

Sample Record

Each record represents one ~1 MB chunk of a single job's combined log file. Workflow/run/job identifying metadata is nested under workflow_meta; chunk-specific fields are under chunk_meta.


{
  "workflow_meta": {
    "run_id": 24447590112,
    "run_attempt": 1,
    "workflow_id": 233609137,
    "workflow_name": "Copilot code review",
    "run_number": 158,
    "head_branch": "refs/pull/1879/head",
    "head_sha": "9abf70ec123456789abcdef0123456789abcdef0",
    "event": "pull_request",
    "status": "completed",
    "conclusion": "success",
    "created_at": "2026-04-15T09:42:40Z",
    "updated_at": "2026-04-15T09:44:32Z",
    "run_started_at": "2026-04-15T09:42:40Z",
    "html_url": "https://github.com/monad-inc/inputs/actions/runs/24447590112",
    "repository": "monad-inc/inputs",
    "actor": "Copilot",
    "triggering_actor": "Copilot",
    "job_name": "Cleanup artifacts",
    "log_file": "0_Cleanup artifacts.txt"
  },
  "file_id": "monad-inc/inputs/24447590112-1-0_Cleanup artifacts.txt",
  "chunk_meta": {
    "chunk_index": 0,
    "last_chunk": true,
    "chunk_size_bytes": 5062,
    "truncated_lines": 0
  },
  "content": "2026-04-15T09:44:30Z ##[group]Run actions/upload-artifact@v4\n2026-04-15T09:44:31Z Uploading artifact\n2026-04-15T09:44:32Z Artifact successfully uploaded\n2026-04-15T09:44:32Z ##[endgroup]\n2026-04-15T09:44:32Z ##[group]Post Run actions/checkout@v4\n2026-04-15T09:44:33Z Cleaning up workspace\n2026-04-15T09:44:32Z ##[endgroup]\n"
}

Last modified on July 9, 2026
