Observability & Request Tracing

Logging, tracing, metrics, and incident triage practices for reliable Vitae API integrations.

Observability is essential for production integrations. Without request-level traces and metrics, troubleshooting matching and rendering failures becomes guesswork.

What to log per request

  • Request ID / correlation ID
  • Endpoint path and status code
  • Retry count and latency
  • Quota-related failure markers (HTTP 429 responses, daily cap reached)
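
As a minimal sketch of the fields above, the helper below builds one JSON log line per request. The field names, the build_log_record function, and the daily_cap_hit flag are illustrative assumptions, not part of the Vitae API; adapt them to whatever your logging pipeline expects.

```python
import json


def build_log_record(request_id, path, status_code, retry_count,
                     latency_ms, daily_cap_hit=False):
    """Return one JSON log line for a single API call.

    All field names here are hypothetical; they mirror the list above.
    """
    record = {
        "request_id": request_id,            # request / correlation ID
        "path": path,                        # endpoint path
        "status_code": status_code,          # HTTP status of the final attempt
        "retry_count": retry_count,          # retries before this result
        "latency_ms": round(latency_ms, 1),  # wall-clock latency in ms
        "rate_limited": status_code == 429,  # quota marker: HTTP 429
        "daily_cap_hit": daily_cap_hit,      # quota marker: daily cap (set by caller)
    }
    # sort_keys keeps lines stable, which makes them easier to diff and grep.
    return json.dumps(record, sort_keys=True)
```

A caller would emit the returned string through its usual logger once per request, reusing the same request_id across retries so that all attempts can be correlated.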

Metrics to track

  • Success rate by endpoint
  • P95 and P99 latency
  • 429 volume over time
  • Queue depth and dead-letter counts
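
The first two metrics can be derived directly from per-request log records. The sketch below assumes each record is a dict with "path", "status_code", and "latency_ms" keys (hypothetical names), treats any 2xx status as a success, and uses the nearest-rank definition of a percentile. Most metrics backends compute these for you, often with a slightly different interpolation method.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def success_rate_by_endpoint(records):
    """Map each endpoint path to its fraction of 2xx responses."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for record in records:
        totals[record["path"]] += 1
        if 200 <= record["status_code"] < 300:
            successes[record["path"]] += 1
    return {path: successes[path] / totals[path] for path in totals}


def latency_summary(records):
    """Return P95 and P99 latency in milliseconds across all records."""
    latencies = [record["latency_ms"] for record in records]
    return {"p95": percentile(latencies, 95), "p99": percentile(latencies, 99)}
```

Tracking P95 and P99 rather than the mean matters because a small share of slow requests can hide behind a healthy-looking average.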

Incident triage flow

  1. Check error class distribution (4xx vs 5xx)
  2. Check auth failures and quota spikes
  3. Identify regressions by deployment timestamp
  4. Apply rollback or hotfix as needed
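
Steps 1 to 3 can be sketched as a single pass over recent log records. The example below is an illustration under stated assumptions: each record is a dict with "status_code" and "timestamp" keys (hypothetical names, with timestamps as epoch seconds), auth failures are taken to be HTTP 401 and 403, and quota failures are taken to be HTTP 429.

```python
from collections import Counter


def error_class(status_code):
    """Bucket a status code into the classes used during triage."""
    if status_code in (401, 403):
        return "auth"
    if status_code == 429:
        return "quota"
    if 400 <= status_code < 500:
        return "4xx_other"
    if 500 <= status_code < 600:
        return "5xx"
    return "ok"


def triage_summary(records, deployed_at):
    """Count error classes before and after a deployment timestamp.

    A class that grows sharply in the "after" bucket points at the
    deployment as a likely cause of the regression.
    """
    before, after = Counter(), Counter()
    for record in records:
        bucket = after if record["timestamp"] >= deployed_at else before
        bucket[error_class(record["status_code"])] += 1
    return {"before": dict(before), "after": dict(after)}
```

If the "after" counts show a jump in 5xx responses while the "before" counts do not, a rollback is usually the faster fix; a jump in auth or quota failures more often points at credentials or traffic volume than at the deployed code.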

Operations checklist

  • Dashboards include request rate, failures, and latency
  • Alerting thresholds are set for sustained 429 and 5xx rates
  • Runbooks link to the exact docs pages and fix procedures
  • Post-incident review captures root cause and docs updates
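
One way to read "sustained" in the alerting item above is that a rate must stay over its threshold for several consecutive evaluation windows before an alert fires, so a single short spike does not page anyone. The sketch below assumes you already have a list of per-window error rates, oldest first; the function name, the threshold, and the window count are illustrative, not recommended values.

```python
def sustained_breach(rates, threshold, min_windows):
    """Return True if the most recent min_windows rates all exceed threshold.

    rates is a sequence of per-window error rates (for example, the
    fraction of 429 or 5xx responses per minute), oldest first.
    """
    if min_windows <= 0:
        raise ValueError("min_windows must be positive")
    if len(rates) < min_windows:
        return False  # not enough history to call the breach sustained
    return all(rate > threshold for rate in rates[-min_windows:])
```

For example, sustained_breach(rates, threshold=0.05, min_windows=5) would fire only once the rate has stayed above 5% for five consecutive windows.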