Configuration

When to read this: you want to control settings, environments, or declarative pipeline definitions without hard-coding everything in Python.

Agora has two complementary configuration layers:

runtime settings via AgoraSettings and environment variables
declarative pipeline definitions via agora/v1 TOML

Use whichever fits your team. Many projects use both.

Project settings with `AgoraSettings`

The scaffolded project creates src/settings.py:

from functools import lru_cache

from agora.config import AgoraSettings


class Settings(AgoraSettings):
    pass


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return Settings()

AgoraSettings reads from:

environment variables
agora.env
Python defaults

Core settings include:

LOG_LEVEL=INFO
AGORA_ENV=dev

Extend Settings with your own project-specific fields for database URLs, API keys, feature flags, and service endpoints.

Inspect the resolved settings with:

agora config show
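
To make the layering concrete, here is a minimal sketch of how a single setting could be resolved across the three sources listed above. It is not Agora's implementation: `parse_env_text` and `resolve_setting` are hypothetical helpers, and the precedence (environment variables first, then agora.env, then Python defaults) is an assumption inferred from the order of the list.

```python
import os
from typing import Dict, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_setting(
    name: str,
    default: str,
    env_file_text: str = "",
    environ: Optional[Dict[str, str]] = None,
) -> str:
    """Return the first value found: process environment, env file, default.

    The precedence order here is an assumption, not documented behavior.
    """
    environ = dict(os.environ) if environ is None else environ
    if name in environ:
        return environ[name]
    return parse_env_text(env_file_text).get(name, default)
```

For example, `resolve_setting("LOG_LEVEL", "INFO", "LOG_LEVEL=DEBUG", {})` picks up the file value, while a `LOG_LEVEL` entry in the environment mapping would win over it. Use agora config show to see what Agora actually resolved.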

Declarative pipeline configs

Agora supports TOML documents with format = "agora/v1".

Minimal example:

format = "agora/v1"

[defaults]
pipeline = "users"

[pipelines.users]
pipeline_id = "users-import"

[pipelines.users.source]
type = "csv"
path = "data/users.csv"
encoding = "utf-8"
delimiter = ","
has_header = true
row_mapper = { import = "pipelines.mappers:user_from_csv" }

[[pipelines.users.middlewares]]
type = "validate"
schema = { import = "models:UserRecord" }

[[pipelines.users.sinks]]
type = "jsonl"
path = "output/users.jsonl"

Validate the resolved plan without running:

agora run --config pipelines.toml --plan

Run the pipeline:

agora run --config pipelines.toml

Selecting pipelines

One config file can contain multiple pipelines:

format = "agora/v1"

[pipelines.users.source]
type = "iterable"
records = []

[pipelines.orders.source]
type = "iterable"
records = []

Select one by name:

agora run users --config pipelines.toml
agora run orders --config pipelines.toml

If you omit the name, defaults.pipeline is used when present.
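
The selection rule can be sketched as a small function over the parsed document. `select_pipeline` is a hypothetical helper written for illustration. Raising an error when no name is given and defaults.pipeline is absent is an assumption, since the text above does not say what Agora does in that case.

```python
from typing import Any, Dict, Optional


def select_pipeline(doc: Dict[str, Any], name: Optional[str] = None) -> str:
    """Pick a pipeline name: explicit argument first, then defaults.pipeline."""
    pipelines = doc.get("pipelines", {})
    chosen = name or doc.get("defaults", {}).get("pipeline")
    if chosen is None:
        # Assumption: with nothing to fall back on, fail loudly.
        raise ValueError("no pipeline name given and defaults.pipeline is not set")
    if chosen not in pipelines:
        raise ValueError(f"unknown pipeline {chosen!r}; known: {sorted(pipelines)}")
    return chosen


doc = {
    "format": "agora/v1",
    "defaults": {"pipeline": "users"},
    "pipelines": {"users": {}, "orders": {}},
}
default_choice = select_pipeline(doc)             # falls back to defaults.pipeline
explicit_choice = select_pipeline(doc, "orders")  # explicit name wins
```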

Profiles and environments

Config overlays let you keep one base definition and specialize it for local, staging, or production use.

format = "agora/v1"

[defaults]
pipeline = "orders"
environment = "local"

[pipelines.orders.source]
type = "jsonl"
path = "data/orders.jsonl"

[[pipelines.orders.sinks]]
type = "stdout"

[environments.local.pipelines.orders.dlq]
enabled = true
failure_policy = "log_only"

[environments.local.pipelines.orders.dlq.sink]
type = "sqlite_dlq"
path = ".orders.dlq.db"

[environments.prod.pipelines.orders.dlq]
enabled = true
failure_policy = "raise"

Select an environment explicitly:

agora run --config pipelines.toml --environment prod

Or rely on AGORA_ENV:

AGORA_ENV=prod agora run --config pipelines.toml

Profiles work the same way via [profiles.<name>] and --profile.
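
One way to picture an overlay is as a recursive merge of the environment's tables onto the base document. The sketch below rests on two assumptions that the text does not confirm: overlays merge key by key instead of replacing whole tables, and the environment is chosen in the order --environment, then AGORA_ENV, then defaults.environment. All function names are hypothetical.

```python
import copy
import os
from typing import Any, Dict, Optional


def deep_merge(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where overlay values win and nested dicts merge."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_environment(
    doc: Dict[str, Any],
    cli_environment: Optional[str] = None,
    environ: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Assumed precedence: CLI flag, then AGORA_ENV, then defaults.environment."""
    environ = dict(os.environ) if environ is None else environ
    return (
        cli_environment
        or environ.get("AGORA_ENV")
        or doc.get("defaults", {}).get("environment")
    )


def apply_environment(doc: Dict[str, Any], env_name: Optional[str]) -> Dict[str, Any]:
    """Merge environments.<env_name> onto the base document."""
    base = {k: v for k, v in doc.items() if k != "environments"}
    overlay = doc.get("environments", {}).get(env_name, {}) if env_name else {}
    return deep_merge(base, overlay)


DOC = {
    "format": "agora/v1",
    "defaults": {"pipeline": "orders", "environment": "local"},
    "pipelines": {
        "orders": {
            "source": {"type": "jsonl", "path": "data/orders.jsonl"},
            "sinks": [{"type": "stdout"}],
        }
    },
    "environments": {
        "local": {"pipelines": {"orders": {"dlq": {"enabled": True, "failure_policy": "log_only"}}}},
        "prod": {"pipelines": {"orders": {"dlq": {"enabled": True, "failure_policy": "raise"}}}},
    },
}

prod_view = apply_environment(DOC, resolve_environment(DOC, "prod", {}))
```

In this sketch `prod_view` keeps the base source and sinks and gains the prod dlq table. Running agora run --config pipelines.toml --plan remains the authoritative way to see what Agora resolves.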

Import references

Some config values can point to Python callables or classes using an import reference:

row_mapper = { import = "pipelines.mappers:user_from_csv" }
schema = { import = "models:UserRecord" }

That allows TOML configs to stay declarative while reusing project code.

Because import references resolve real Python modules from your project, treat pipeline config as trusted input. Do not accept unreviewed config files from untrusted users.

When you run:

agora run --config pipelines.toml
agora run --config pipelines.toml --plan
agora dlq replay --config pipelines.toml

Agora prepends the project root and src/ to sys.path, then resolves those imports as normal Python objects. In practice, that means a declarative config is operational code with a TOML wrapper.
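
To show why this matters, here is a minimal sketch of how a "module:attribute" reference can be resolved with the standard library. `resolve_import_ref` is a hypothetical helper, not Agora's resolver, and the order in which the two directories are placed on sys.path is an assumption. Note that `importlib.import_module` executes the target module's top-level code, which is exactly why config files must be trusted.

```python
import importlib
import sys
from pathlib import Path
from typing import Any, Optional


def resolve_import_ref(ref: str, project_root: Optional[Path] = None) -> Any:
    """Resolve a 'package.module:attribute' string to the Python object."""
    module_name, sep, attr_path = ref.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {ref!r}")

    if project_root is not None:
        # Make project code importable, mirroring the behavior described above.
        for entry in (project_root / "src", project_root):
            if str(entry) not in sys.path:
                sys.path.insert(0, str(entry))

    # Importing runs the module's top-level code.
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


# Works with any importable module; the standard library is used here for illustration.
dumps = resolve_import_ref("json:dumps")
encoded = dumps({"id": 1})
```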

DLQ and tracing sections

Each pipeline can include optional dlq and tracing sections:

[pipelines.orders.dlq]
enabled = true
failure_policy = "log_only"

[pipelines.orders.dlq.sink]
type = "sqlite_dlq"
path = ".orders.dlq.db"

[pipelines.orders.tracing]
enabled = true
backend = "in_memory"
auto_configure = true
service_name = "orders-local"

Supported tracing backends:

noop
in_memory
opentelemetry

For opentelemetry, auto_configure = true tells Agora to reuse an existing global tracer provider when one is already configured, or to auto-configure an OTLP exporter when the optional OTel SDK/exporter packages are installed.

Performance section

Use [performance] at the document level to set defaults for every pipeline:

[performance]
acceleration = "auto"
profile = "balanced"

Use [pipelines.<name>.performance] to override one pipeline:

[pipelines.orders.performance]
acceleration = "required"
profile = "throughput"

Acceleration modes:

auto: use compatible agora-etl-rs paths when installed
off: force pure Python paths
required: fail before source/sink open if acceleration is missing or incompatible

Profiles:

balanced: keep explicit runtime settings as-is
throughput: apply larger default writer batches, flush cadence, buffer limits, adaptive backpressure, and opt into Rust source prefetch outside buffered lanes
low_latency: apply tighter buffer limits and faster flush cadence

Manual code-level settings win over profile defaults. The resolved acceleration mode, profile, compatibility state, and concrete profile settings are shown by agora run --config pipelines.toml --plan, Pipeline.explain(), run summaries, and agora doctor --config pipelines.toml.
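
The relationship between the document-level and per-pipeline tables can be sketched as a two-step merge. `resolve_performance` is a hypothetical helper, and merging key by key (so a pipeline can override only `profile` and inherit `acceleration`) is an assumption. The allowed values are the ones listed above.

```python
from typing import Any, Dict

ACCELERATION_MODES = {"auto", "off", "required"}
PROFILES = {"balanced", "throughput", "low_latency"}


def resolve_performance(doc: Dict[str, Any], pipeline_name: str) -> Dict[str, str]:
    """Start from [performance], then apply [pipelines.<name>.performance]."""
    resolved = dict(doc.get("performance", {}))
    pipeline = doc.get("pipelines", {}).get(pipeline_name, {})
    resolved.update(pipeline.get("performance", {}))

    # Reject values outside the documented sets.
    acceleration = resolved.get("acceleration")
    if acceleration is not None and acceleration not in ACCELERATION_MODES:
        raise ValueError(f"unknown acceleration mode: {acceleration!r}")
    profile = resolved.get("profile")
    if profile is not None and profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return resolved


doc = {
    "performance": {"acceleration": "auto", "profile": "balanced"},
    "pipelines": {
        "orders": {"performance": {"acceleration": "required", "profile": "throughput"}},
        "users": {},
    },
}
orders_perf = resolve_performance(doc, "orders")  # per-pipeline override
users_perf = resolve_performance(doc, "users")    # inherits document defaults
```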

Scheduled worker config

The same agora/v1 file can also define a WorkerPool, so you do not need a custom worker.py module.

Top-level worker options:

[worker]
graceful_shutdown_timeout = 45.0
health_port = 8080
health_host = "0.0.0.0"

Per-pipeline schedules:

[pipelines.orders.schedule]
mode = "every"
minutes = 15

[pipelines.reports.schedule]
mode = "cron"
expression = "0 * * * *"

[pipelines.stream.schedule]
mode = "continuous"

Supported schedule modes:

every with seconds, minutes, hours, or days
cron with expression
continuous
once
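
For the every mode, the unit keys map naturally onto `datetime.timedelta`. The sketch below is illustrative only: `interval_for` is a hypothetical helper, and allowing several units in one schedule (summed together) is an assumption, since the list above may mean exactly one unit per schedule.

```python
from datetime import timedelta
from typing import Any, Dict

EVERY_UNITS = ("seconds", "minutes", "hours", "days")


def interval_for(schedule: Dict[str, Any]) -> timedelta:
    """Convert an 'every' schedule table into a timedelta."""
    if schedule.get("mode") != "every":
        raise ValueError("interval_for only handles mode = 'every'")
    parts = {unit: schedule[unit] for unit in EVERY_UNITS if unit in schedule}
    if not parts:
        raise ValueError("an 'every' schedule needs seconds, minutes, hours, or days")
    interval = timedelta(**parts)
    if interval <= timedelta(0):
        raise ValueError("schedule interval must be positive")
    return interval


# Matches [pipelines.orders.schedule] from the example above.
orders_interval = interval_for({"mode": "every", "minutes": 15})
```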

Component type names

Declarative configs use the same registry keys shown by:

agora plugins list

Examples:

built-in sources: csv, jsonl, parquet, http
built-in sinks: stdout, jsonl, csv, parquet, webhook, log
plugin sinks and sources: redis, kafka, postgres

Recommended workflow

A good working pattern is:

start in Python while shaping the pipeline
extract stable callables and schemas into importable modules
move operational wiring into agora/v1 TOML
use --plan in CI to validate configs before deployment
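
The last step could be scripted along these lines. This is a sketch under assumptions: it presumes that --plan can be combined with --environment and that the command exits with a non-zero status when the config is invalid, neither of which is stated above. `plan_command` and `validate_configs` are hypothetical names.

```python
import subprocess
from typing import List, Optional, Sequence


def plan_command(config_path: str, environment: Optional[str] = None) -> List[str]:
    """Build the argv for a plan-only run, optionally for one environment."""
    cmd = ["agora", "run", "--config", config_path, "--plan"]
    if environment:
        cmd += ["--environment", environment]
    return cmd


def validate_configs(config_path: str, environments: Sequence[str]) -> None:
    """Run the plan check once per environment; stop at the first failure."""
    for environment in environments:
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumed here to be how an invalid config is reported).
        subprocess.run(plan_command(config_path, environment), check=True)


if __name__ == "__main__":
    validate_configs("pipelines.toml", ["local", "prod"])
```

Because --plan still resolves import references, run this check in an environment where the project code and its dependencies are installed.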

Security and operations notes

agora run --config ... and agora dlq replay --config ... both import Python code from the project.
agora config show imports src/settings.py and executes get_settings().
agora run --plan is read-only with respect to pipeline execution, but it still resolves trusted import references from the config.
Health endpoints are intentionally lightweight. Keep them bound to private network interfaces or protect them with AGORA_HEALTH_AUTH_TOKEN.
Treat the built-in health server as a private operations endpoint, not as a public API edge.
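
A pre-deployment lint for the health endpoint guidance might look like the following sketch. It uses the standard `ipaddress` module to flag a `[worker]` table whose `health_host` is a wildcard or publicly routable address while AGORA_HEALTH_AUTH_TOKEN is unset. `health_bind_warnings` is a hypothetical helper and is not part of Agora.

```python
import ipaddress
from typing import Any, Dict, List


def health_bind_warnings(worker: Dict[str, Any], environ: Dict[str, str]) -> List[str]:
    """Return warnings for a health server that may be reachable publicly."""
    host = worker.get("health_host")
    if host is None:
        return []
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames cannot be classified without DNS resolution; skip them.
        return []

    # 0.0.0.0 and :: listen on every interface; global addresses are routable.
    exposed = address.is_unspecified or address.is_global
    if exposed and not environ.get("AGORA_HEALTH_AUTH_TOKEN"):
        return [
            f"health_host {host!r} may be publicly reachable and "
            "AGORA_HEALTH_AUTH_TOKEN is not set"
        ]
    return []


warnings = health_bind_warnings({"health_host": "0.0.0.0", "health_port": 8080}, {})
```

A wildcard bind is often fine inside a container on a private network, so treat the result as a prompt to check the deployment, not as a hard failure.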