ADR-003: Blocking Queries for Change Detection
Status: Accepted
Date: 2026-04-04
Context
Consul Guardian needs to detect KV changes in near-real-time. Consul provides three mechanisms for change detection:
- Blocking queries (long-polling) -- Client sends a request with the last known index. Consul holds the connection until the index changes or a timeout expires.
- Events API (gossip-based) -- Fire-and-forget events propagated via the gossip protocol.
- Periodic polling -- Fetch all keys on a fixed interval and diff.
Decision
Use Consul blocking queries as the primary change detection mechanism.
A blocking query works like this:
GET /v1/kv/config/?recurse&index=42&wait=5m
→ Consul holds the connection open
→ Key changes → returns immediately with new data and index=43
→ No change after 5m → returns with the same data and index=42
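The request/response cycle above can be sketched in Python using only the standard library. This is a minimal illustration, not Guardian's actual client: the function names `blocking_query_url` and `fetch_once`, the base URL, and the 330-second client timeout are assumptions made for the example. The `X-Consul-Index` response header and the `index` and `wait` query parameters are part of Consul's HTTP API.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def blocking_query_url(base_url, prefix, index, wait="5m"):
    """Build the recursive KV blocking-query URL for a prefix (hypothetical helper)."""
    query = urllib.parse.urlencode({"recurse": "true", "index": index, "wait": wait})
    return f"{base_url}/v1/kv/{urllib.parse.quote(prefix)}?{query}"


def fetch_once(base_url, prefix, index, wait="5m", timeout=330):
    """Issue one blocking query and return (entries, new_index).

    The client timeout is an assumed value; it only needs to exceed the
    server-side wait so the client does not hang up first.
    """
    url = blocking_query_url(base_url, prefix, index, wait)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            entries = json.loads(resp.read())
            return entries, int(resp.headers["X-Consul-Index"])
    except urllib.error.HTTPError as err:
        # Consul answers 404 when no keys exist under the prefix,
        # but still reports the index to block on.
        if err.code == 404:
            return [], int(err.headers["X-Consul-Index"])
        raise
```

A caller would invoke `fetch_once` in a loop, feeding the returned index back into the next request, for example `fetch_once("http://127.0.0.1:8500", "config/", 42)`.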
Consequences
Positive
- Near-real-time detection. Changes are typically detected in under 5 seconds, with no polling delay.
- Built on Raft consensus. The ModifyIndex comes from Raft, so it's consistent and reliable.
- Official, well-documented API. Supported since Consul 0.7.
- Streaming backend (Consul 1.10+). Reduces per-query overhead on the server side.
- Efficient change tracking. The ModifyIndex per key lets Guardian detect exactly which keys changed without diffing the entire prefix.
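As a rough illustration of per-key change tracking, the sketch below compares the `ModifyIndex` of each entry in a recursive KV response with the value remembered from the previous response. The function name `diff_by_modify_index` and the shape of the `known` dictionary are assumptions for this example; the `Key` and `ModifyIndex` fields are those returned by Consul's KV endpoint.

```python
def diff_by_modify_index(known, entries):
    """Compare a recursive KV response with the last known ModifyIndex per key.

    known:   dict mapping key -> ModifyIndex from the previous response
    entries: list of KV entries, each a dict with "Key" and "ModifyIndex"
    Returns (changed, deleted, new_known), with key lists sorted by name.
    """
    new_known = {entry["Key"]: entry["ModifyIndex"] for entry in entries}
    # A key is changed if it is new or its ModifyIndex differs from before.
    changed = sorted(k for k, idx in new_known.items() if known.get(k) != idx)
    # A key is deleted if it was known before but is absent now.
    deleted = sorted(k for k in known if k not in new_known)
    return changed, deleted, new_known
```

Only index integers are compared, so values never need to be decoded or compared byte by byte to decide which keys to process.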
Negative
- Holds an HTTP connection open per watched prefix. Each connection consumes a small amount of server resources.
- The index can go backward after a Raft snapshot restore or key deletion. Guardian handles this by resetting to index 0 and doing a full re-sync.
- A response with a new index doesn't guarantee an actual change to watched keys (the index is global, not per-prefix). Guardian must still compare the response against its last known state before acting on it.
- Maximum wait time is 10 minutes. Connections must be re-established periodically.
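The index and wait-time caveats above could be handled along the following lines. This is a sketch under stated assumptions, not Guardian's implementation: the function names and the 5-second timeout margin are hypothetical. The clamp to a minimum index of 1 and the extra wait of up to wait/16 follow the guidance in Consul's blocking-query documentation.

```python
MAX_WAIT_SECONDS = 600  # the 10-minute maximum wait


def next_index(last_index, new_index):
    """Return (index_for_next_query, needs_full_resync).

    If the index went backward, reset to 0 so the next query returns
    immediately with the current state, and flag a full re-sync.
    Otherwise keep the index at 1 or above, so the next request
    actually blocks instead of returning at once in a busy loop.
    """
    if new_index < last_index:
        return 0, True
    return max(new_index, 1), False


def clamp_wait_seconds(requested):
    """Limit a requested wait to the maximum wait time."""
    return min(requested, MAX_WAIT_SECONDS)


def client_timeout_seconds(wait_seconds, margin=5.0):
    """Client-side timeout covering the wait, Consul's jitter of up to
    wait/16, and an assumed safety margin."""
    return wait_seconds + wait_seconds / 16 + margin
```

For example, `next_index(42, 7)` yields `(0, True)`, which would trigger the full re-sync described above, while `next_index(42, 43)` simply advances to `(43, False)`.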
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| Events API | Simple fire-and-forget | No delivery guarantee, not persisted, no global ordering. Unreliable for backup. |
| Periodic polling | Simplest to implement | Higher latency (poll interval), more load on Consul, more bandwidth |
| consul watch CLI | Built-in tool | External process, harder to integrate, stdout parsing, less control over backoff |