Observability

Declaragent emits three signal families out of the box: Prometheus metrics (/metrics on 127.0.0.1:9464), OpenTelemetry traces + metrics (when OTEL_EXPORTER_OTLP_ENDPOINT is set), and a hash-chained audit log (SQLite, exportable to Splunk / Elastic / Datadog).

This page is the counter-to-dashboard index. For the HTTP surface, see /reference/control-plane. For the trace story, see the Grafana tracing recipe.

Prometheus metrics exposed by declaragent up -d

Every metric below is available on the /metrics endpoint at 127.0.0.1:9464 whenever the CLI runs with -d (override the port with DECLARAGENT_METRICS_PORT; set to 0 to disable).
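To read these metrics outside Grafana, the endpoint's text output can be parsed directly. The following is a minimal sketch, not Declaragent code: `parseMetrics` and `Sample` are hypothetical names, and the sample scrape text is made up for illustration, shaped like the standard Prometheus text exposition format. It ignores timestamps and escaped characters inside label values.

```typescript
// One parsed sample line from a /metrics scrape.
interface Sample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Parse Prometheus text exposition output into samples.
// Handles the common `name{label="value",...} number` shape and skips
// blank lines and `#` comment lines (HELP / TYPE).
function parseMetrics(text: string): Sample[] {
  const samples: Sample[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line[0] === "#") continue;
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(line);
    if (match === null) continue; // not a sample line this sketch understands
    const labels: Record<string, string> = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    const labelText = match[2] === undefined ? "" : match[2];
    let pair: RegExpExecArray | null;
    while ((pair = labelPattern.exec(labelText)) !== null) {
      labels[pair[1]] = pair[2];
    }
    samples.push({ name: match[1], labels: labels, value: Number(match[3]) });
  }
  return samples;
}

// Made-up scrape output using two metric names from this page.
const exampleScrape = [
  "# TYPE mcp_server_restarts_total counter",
  'mcp_server_restarts_total{server_id="fs",reason="crash"} 3',
  'mcp_server_circuit_state{server_id="fs"} 0',
].join("\n");

const parsedSamples = parseMetrics(exampleScrape);
console.log(parsedSamples);
```

In practice the input text would come from an HTTP GET against /metrics on 127.0.0.1:9464, or whichever port DECLARAGENT_METRICS_PORT selects.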

MCP supervisor

| Metric | Kind | Labels | Source |
| --- | --- | --- | --- |
| mcp_server_restarts_total | counter | server_id, reason | packages/core/src/mcp/supervisor.ts |
| mcp_server_circuit_state | gauge (0, 1, or 2) | server_id | same |
| mcp_server_circuit_open_total | counter | server_id | same |
| mcp_server_drain_duration_ms | histogram | server_id, outcome | same |
| mcp_server_rate_limited_total | counter | server_id, reason | same |

Audit + SIEM export

| Metric | Kind | Labels | Source |
| --- | --- | --- | --- |
| declaragent_audit_export_acked_total | counter | exporter, vendor | packages/core/src/audit/exporter-loop.ts |
| declaragent_audit_export_failures_total | counter | exporter, vendor, retryable | same |
| declaragent_audit_export_paused | gauge | exporter, vendor | same |
| declaragent_audit_export_last_seq | gauge | exporter, vendor | same |
| declaragent_audit_backpressure_active | gauge | exporter, vendor | same |
| declaragent_audit_backpressure_paused_total | counter | exporter, vendor | same |
| declaragent_audit_backpressure_drops_total | counter | exporter, vendor | same |
| declaragent_audit_backpressure_backlog_ms | gauge | exporter, vendor | same |
| declaragent_audit_batch_interval_ms | gauge | exporter, vendor | same |
| declaragent_audit_batch_rows | histogram | exporter, vendor | same |
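A common use of the two export counters is a failure ratio over a scrape window. The sketch below shows one way to compute it from two snapshots; it is an illustration only, not how the shipped dashboard computes anything. `CounterSnapshot`, `counterIncrease`, and `exportFailureRatio` are hypothetical names, and it assumes that every export attempt ends up counted in exactly one of the two counters and that values were already summed across the exporter and vendor labels.

```typescript
// Counter values from one scrape, keyed by metric name.
// Assumption: already summed across the exporter / vendor labels.
type CounterSnapshot = Record<string, number>;

const ACKED = "declaragent_audit_export_acked_total";
const FAILURES = "declaragent_audit_export_failures_total";

// Increase of a counter between two scrapes. A lower current value means the
// process restarted and the counter reset, so the current value is the increase.
function counterIncrease(prev: number, curr: number): number {
  return curr >= prev ? curr - prev : curr;
}

// Read a counter from a snapshot, treating a missing series as 0.
function readCounter(snapshot: CounterSnapshot, name: string): number {
  const value = snapshot[name];
  return value === undefined ? 0 : value;
}

// Fraction of export attempts that failed between two scrapes,
// or null when no attempts were recorded in the window.
function exportFailureRatio(prev: CounterSnapshot, curr: CounterSnapshot): number | null {
  const acked = counterIncrease(readCounter(prev, ACKED), readCounter(curr, ACKED));
  const failed = counterIncrease(readCounter(prev, FAILURES), readCounter(curr, FAILURES));
  const attempts = acked + failed;
  return attempts === 0 ? null : failed / attempts;
}

// Made-up numbers: 90 new acks and 10 new failures between scrapes.
const before: CounterSnapshot = { [ACKED]: 100, [FAILURES]: 5 };
const after: CounterSnapshot = { [ACKED]: 190, [FAILURES]: 15 };
const ratio = exportFailureRatio(before, after);
console.log(ratio);
```

Returning null for an idle window keeps "no traffic" distinct from "no failures", which matters when the result feeds an alert threshold.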

Rate limits

| Metric | Kind | Labels | Source |
| --- | --- | --- | --- |
| declaragent_provider_rate_limit_waits | counter | provider | packages/cli/src/up-cli.ts |
| declaragent_provider_rate_limit_wait_ms | histogram | provider | same |
| declaragent_tool_rate_limit_waits_total | counter | agent, tool | same |
| declaragent_tool_rate_limit_wait_ms | counter | agent, tool | same |

Event sources + channels

| Metric | Kind | Labels | Source |
| --- | --- | --- | --- |
| source_messages_received | counter | id | packages/core/src/events/base-source.ts |
| source_messages_processed | counter | id | same |
| source_messages_failed | counter | id | same |
| source_messages_dlq | counter | id | same |
| source_connection_errors | counter | id | same |
| source_inflight | gauge | id | same |
| source_process_duration_ms | histogram | id | same |
| channel_outbound_sent | counter | type, id | packages/core/src/channels/base-channel.ts |
| channel_outbound_failed | counter | type, id, reason | same |
| channel_outbound_latency_ms | histogram | type, id | same |
| channel_inbound_received | counter | type, id | same |

Naming note

Internal metric keys use dotted identifiers (e.g. declaragent.audit.export.acked_total) for OTel compatibility. They are normalized to Prometheus-valid names ([a-zA-Z_:][a-zA-Z0-9_:]*) at scrape time — every . becomes _. The tables above show the wire names you'll see in Grafana.
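The normalization rule described above can be sketched in a few lines. This is an illustration of the stated rule (every `.` becomes `_`), not the project's actual implementation; `toPrometheusName` and `PROMETHEUS_NAME` are hypothetical names.

```typescript
// The Prometheus-valid name pattern quoted in the naming note.
const PROMETHEUS_NAME = /^[a-zA-Z_:][a-zA-Z0-9_:]*$/;

// Convert a dotted internal key to its wire name by replacing every "." with "_".
function toPrometheusName(internalKey: string): string {
  return internalKey.replace(/\./g, "_");
}

const wireName = toPrometheusName("declaragent.audit.export.acked_total");
console.log(wireName);
console.log(PROMETHEUS_NAME.test(wireName));
```

The dotted form fails the Prometheus pattern because `.` is not an allowed character, which is why the conversion has to happen before the name reaches a scraper.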

Grafana dashboard

Declaragent ships a ready-made Grafana dashboard that aggregates the key counters into three rows — MCP health, Audit + SIEM, Rate limits + dispatch — so you don't have to hand-author panels from scratch.

Quick import:

# Grafana UI → Dashboards → Import → Upload JSON file → pick the file above.
# Pick your Prometheus data source when prompted for DS_PROMETHEUS.

Prometheus scrape config:

scrape_configs:
  - job_name: declaragent
    static_configs:
      - targets: ['your-host:9464']
    metrics_path: /metrics
    scrape_interval: 15s

Other shipped dashboards

Per-signal dashboards are available under packages/testkit/dashboards/.

Alert rules

Six rule files ship under packages/testkit/alerts/: channels, event-sources, daemon, security, WhatsApp windows, and chaos-assertions. Every alert carries a runbook_url that points into the runbook index.

OpenTelemetry

When OTEL_EXPORTER_OTLP_ENDPOINT is set, Declaragent exports spans + metrics over OTLP/HTTP. Key spans:

  • channel.inbound.<platform> — raw inbound decode.
  • bus.dispatch — envelope handed to the engine.
  • engine.turn — per-LLM-call turn (turn_number, model, tokens_in, tokens_out).
  • tool.invoke — per tool call.
  • channel.outbound.<platform> — outbound send + status code.

The full env-var surface is documented in docs/OTEL_SETUP.md. A Docker Compose bundle with Prometheus, Tempo, and Grafana pre-wired lives in packages/testkit/observability/.