4.3 Agent Health

Orbnetes deployment and release orchestration documentation for operators and platform teams.

Agent Health summarizes the readiness of the execution plane, that is, the agents that claim and run jobs.

What this block is for:

  • Detect infrastructure capacity and routing issues early.
  • Confirm whether pipeline delays are caused by queue logic or agent availability.
  • Observe runtime pressure signals (CPU/memory/disk where reported).

Typical indicators:

  • online/offline/inactive state,
  • last heartbeat age,
  • runner version and update state,
  • tags and OS profile,
  • optional runtime metrics trends.

How to interpret quickly:

  • Online + fresh heartbeat: the agent is likely able to claim jobs.
  • Stale heartbeat: a connectivity or agent-service problem is likely.
  • No matching tags: jobs stay queued even when agents are online, because no online agent is eligible to claim them.
  • High pressure metrics: expect slower execution or instability.
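
The interpretation rules above can be sketched as a small triage function. This is an illustrative sketch only: the `AgentSnapshot` record, its field names, the two thresholds, and the rule that an agent must carry every tag a job requires are all assumptions made for the example, not the Orbnetes schema or its actual routing logic.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; tune them to your heartbeat interval and hardware.
STALE_HEARTBEAT_SECONDS = 90
HIGH_PRESSURE_PERCENT = 90.0


@dataclass(frozen=True)
class AgentSnapshot:
    """Illustrative agent record; field names are assumptions."""
    name: str
    state: str                       # "online", "offline", or "inactive"
    heartbeat_age_seconds: float
    tags: frozenset = frozenset()
    cpu_percent: Optional[float] = None     # None when the metric is not reported
    memory_percent: Optional[float] = None
    disk_percent: Optional[float] = None


def triage(agent, required_tags):
    """Return a list of findings for one agent against one job's tags.

    An empty list means none of the four quick checks raised a concern.
    """
    findings = []

    if agent.state != "online":
        findings.append(f"state is {agent.state}: agent cannot claim jobs")
    elif agent.heartbeat_age_seconds > STALE_HEARTBEAT_SECONDS:
        findings.append("stale heartbeat: check connectivity or the agent service")

    # Assumed matching rule: the agent must carry every tag the job requires.
    if not set(required_tags) <= agent.tags:
        findings.append("tags do not match: job stays queued for this agent")

    metrics = {
        "cpu": agent.cpu_percent,
        "memory": agent.memory_percent,
        "disk": agent.disk_percent,
    }
    for metric, value in metrics.items():
        if value is not None and value >= HIGH_PRESSURE_PERCENT:
            findings.append(f"high {metric} pressure: expect slow or unstable runs")

    return findings
```

Running `triage` over every agent for the tags of a delayed job helps separate the two causes named earlier: if at least one agent returns no findings, the delay is more likely queue logic than agent availability.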

Operational best practice:

  • Keep at least one fallback agent per critical tag class.
  • Standardize tags and avoid ambiguous routing labels.
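
The fallback guideline can be checked mechanically. The sketch below is a minimal example under stated assumptions: the input shape (agent name mapped to an online flag and a tag set) and the function name `tags_lacking_fallback` are hypothetical, and "one fallback" is read as "at least two online agents per critical tag".

```python
from collections import defaultdict


def tags_lacking_fallback(agents, critical_tags, min_agents=2):
    """Report critical tags served by fewer than `min_agents` online agents.

    agents: mapping of agent name -> (is_online, set of tags); a hypothetical
            shape chosen for this example.
    Returns a dict of tag -> sorted list of online agents that carry it.
    """
    online_by_tag = defaultdict(set)
    for name, (is_online, tags) in agents.items():
        if is_online:
            for tag in tags:
                online_by_tag[tag].add(name)

    return {
        tag: sorted(online_by_tag.get(tag, set()))
        for tag in critical_tags
        if len(online_by_tag.get(tag, set())) < min_agents
    }


if __name__ == "__main__":
    fleet = {
        "agent-1": (True, {"linux", "deploy"}),
        "agent-2": (True, {"linux"}),
        "agent-3": (False, {"deploy"}),
    }
    # "deploy" has a single online agent and "gpu" has none.
    print(tags_lacking_fallback(fleet, ["linux", "deploy", "gpu"]))
```

A tag that maps to an empty list has no online agent at all, so jobs routed to it would remain queued; a tag with a single agent has no fallback if that agent goes offline.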