oracle-watchdog

Grafana Dashboard

Pre-built Grafana dashboard for monitoring Oracle Cloud node health and recovery

The dashboard is organized into three sections. Monitor Mode shows a per-node status table with Consul connection and session heartbeat health for each Oracle node, alongside a session activity timeseries and monitor logs from Loki. Agent Mode displays the agent’s Consul and OCI connection status, how many nodes are being watched, and whether any are currently missing. Restart Activity tracks cumulative restart attempts, successes, and failures per node in a table, with a timeseries view of restart events over time. Agent logs round out the bottom of the dashboard.


Metrics

Monitor Mode (:9104)

| Metric | Type | Description |
| --- | --- | --- |
| oracle_watchdog_consul_connected | gauge | Consul connection status (1=connected, 0=disconnected) |
| oracle_watchdog_session_active | gauge | Session status (1=active, 0=inactive) |
| oracle_watchdog_reconnect_attempts_total | counter | Consul reconnection attempts |
| oracle_watchdog_session_renewals_total | counter | Successful session renewals |
| oracle_watchdog_session_failures_total | counter | Session creation or renewal failures |
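
The two monitor gauges above are enough for a basic health check outside Grafana. The following is a minimal sketch, not part of oracle-watchdog itself: it assumes the monitor serves the standard Prometheus text exposition format at `/metrics` on port 9104 of `localhost` (the path and host are assumptions; only the port comes from this page), and the function names are hypothetical.

```python
import urllib.request


def parse_unlabeled(text):
    """Collect unlabeled samples (name -> float) from Prometheus text output."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # Ignore malformed lines and labeled series such as name{...}.
        if len(parts) < 2 or "{" in parts[0]:
            continue
        try:
            values[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return values


def monitor_healthy(values):
    """Healthy only if Consul is connected AND the session is active."""
    return (
        values.get("oracle_watchdog_consul_connected") == 1.0
        and values.get("oracle_watchdog_session_active") == 1.0
    )


def fetch_metrics(url="http://localhost:9104/metrics", timeout=5):
    """Fetch raw metrics text. The URL is an assumption; adjust as needed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a monitor running, `monitor_healthy(parse_unlabeled(fetch_metrics()))` would return `True` only when both gauges report 1. A missing gauge is treated as unhealthy, which seems the safer default for a watchdog.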

Agent Mode (:9105)

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| oracle_watchdog_agent_consul_connected | gauge | | Consul connection status |
| oracle_watchdog_agent_oci_connected | gauge | | OCI connection status |
| oracle_watchdog_agent_nodes_monitored | gauge | | Number of configured nodes |
| oracle_watchdog_agent_nodes_missing | gauge | | Currently missing nodes |
| oracle_watchdog_agent_restart_attempts_total | counter | node | Restart attempts per node |
| oracle_watchdog_agent_restart_successes_total | counter | node | Successful restarts per node |
| oracle_watchdog_agent_restart_failures_total | counter | node | Failed restarts per node |
| oracle_watchdog_agent_consul_check_failures_total | counter | | Consul KV check failures |
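
The per-node restart counters can be combined to spot nodes whose restarts mostly fail. The sketch below is illustrative only: it assumes the agent exposes the standard Prometheus text format with the `node` label rendered as `node="..."`. The function names and the 0.5 threshold are hypothetical choices, not values taken from oracle-watchdog.

```python
import re

# Matches a labeled sample line: metric_name{labels} value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+(\S+)')
# Extracts the value of the "node" label (and not, say, "othernode").
NODE_RE = re.compile(r'(?:^|,)\s*node="([^"]*)"')

PREFIX = "oracle_watchdog_agent_restart_"
SUFFIX = "_total"


def restart_counts(text):
    """Return {node: {"attempts": n, "successes": n, "failures": n}}."""
    counts = {}
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue
        name, labels, value = match.groups()
        if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
            continue
        node = NODE_RE.search(labels)
        if not node:
            continue
        # e.g. "oracle_watchdog_agent_restart_failures_total" -> "failures"
        kind = name[len(PREFIX):-len(SUFFIX)]
        try:
            counts.setdefault(node.group(1), {})[kind] = float(value)
        except ValueError:
            continue
    return counts


def failing_nodes(counts, threshold=0.5):
    """Nodes whose failures/attempts ratio is at or above the threshold."""
    flagged = []
    for node, c in sorted(counts.items()):
        attempts = c.get("attempts", 0.0)
        if attempts > 0 and c.get("failures", 0.0) / attempts >= threshold:
            flagged.append(node)
    return flagged
```

Because these are cumulative counters, the ratio reflects the whole lifetime of the agent process. For a recent-window view, a rate-based query in Prometheus or Grafana would likely be more appropriate.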