Architecture Review Task Ledger

This ledger converts the adversarial architecture review into tracked work. Keep it current as implementation lands; unresolved P1 items should block any deployment with real provider credentials.

Active Tasks

| ID | Priority | Area | Status | Task |
| --- | --- | --- | --- | --- |
| ARCH-001 | P1 | Security | In progress | Remove or gate prototype mutation surfaces before real credentials are configured. POST /__test/config is now disabled unless :allow_test_config or WARDWRIGHT_ALLOW_TEST_CONFIG=1 is set. /admin/*, receipt reads, and policy-cache APIs now require loopback access or an admin token. Public synthetic-model discovery now returns summaries instead of route graphs, prompt transforms, or governance internals. This is a homelab/single-operator guard, not a complete product auth model. Provider API credentials should be managed separately through fnox-backed secret lookup. Remaining work is to define deployment-topology-specific caller authorization: local-only, SSO/reverse-proxy integration, API-key authorization to use specific synthetic models, or a database-backed user/permission system if the product later needs one. |
| ARCH-002 | P1 | Route policy | Closed | Make route-policy overrides fail closed by default. Forced missing/too-small models block unless the matching route policy explicitly sets allow_fallback: true; when fallback is allowed, receipts preserve the failed forced override, selected fallback route, skipped targets, and policy_route_constraints so the operator can see that fallback was policy-authorized. |
| ARCH-003 | P1 | Policy engines | In progress | Normalize primitive, Dune, WASM, and hybrid policy outputs into one action shape. wardwright.policy_action.v1 and wardwright.policy_result.v1 now carry phase, effect, source, priority, and conflict metadata; receipts include decision.policy_conflicts for ordered same-key conflicts. Remaining work is to extend the same contract across stream/output phases and make conflict resolution enforceable beyond request-phase declaration order. |
| ARCH-004 | P1 | TTSR | In progress | Stream policies now evaluate selected-target provider chunks through the provider runtime, restart unreleased provider attempts for retry and retry_with_reminder, inject retry reminders into the retried provider request, fail closed when the injected retry prompt no longer fits the selected target context window, preserve failed attempts as unreleased receipt evidence, and record provider call/mock status plus generated/released/held/rewritten/blocked byte accounting. Split literal/regex matches are checked against the buffered stream window, including rewrite actions across chunk boundaries. Ollama NDJSON streams and OpenAI-compatible SSE streams now pass through native HTTP streaming transport adapters. The router now drives the incremental stream arbiter while provider chunks arrive, releases bounded safe prefixes over SSE, cancels the provider attempt when a later trigger fires, fails closed when declared max_hold_ms latency budgets are exceeded, and records terminal stream-policy evidence in receipts. Remaining work is richer raw provider-event offsets, provider-specific pools, reroute/degrade semantics after retry prompt growth, and clearer retry semantics after any bytes have already reached the client. |
| ARCH-005 | P2 | Policy architecture | In progress | Keep moving policy evaluation out of HTTP request handling. Request governance now lives in Wardwright.Policy.Plan, normalized action/result contracts sit below the router, and stable pure decisions are beginning to move into Gleam wrappers. State-machine authoring is now treated as a structured governance artifact that can be visualized, simulated, and compiled to pure transition logic or BEAM runtime processes as an implementation detail. Remaining work is compiled plans, phase-specific evaluators, state-machine artifact validation, and projection/trace emission from the same plan. |
| ARCH-006 | P2 | State/history | In progress | Hot policy history now lives behind Wardwright.PolicyCache with deterministic eviction, status reporting, and LiveView/PubSub visibility for recent cache writes. The cache now uses a protected ETS session catalog plus supervised per-session owner processes with bounded ETS tables, so session-local writes no longer serialize through one global history table owner. Recent history threshold classification has a Gleam core. Remaining work is declared aggregate/index tables for cross-session facts, richer owner/index health in LiveView, durable sink persistence, replay/checkpoint strategy, and explicit behavior across restart and multi-instance deployment. |
| ARCH-007 | P2 | Alerting | In progress | Alert enqueue/backpressure classification has a Gleam core, and the current in-memory sink now exposes queue health through /admin/policy-alerts plus redacted PubSub delivery events. Remaining work is turning alert delivery into a supervised sink abstraction with delivery workers, retry/dead-letter persistence, external sink adapters, and durable queue recovery. |
| ARCH-008 | P2 | Projection UI | In progress | Stop hardcoding projection/simulation examples; generate workbench projection data from deterministic policy artifacts, compiled plans, and receipts. Route-policy projections now derive nodes/effects/conflicts and a simulation trace from configured governance rules through Wardwright.Policy.Plan, with a visible no-route-policy gap state when no matching rules exist. Remaining work is to make stream/output projections consume the same compiled-plan/receipt path and remove the last canned simulation previews. |
| ARCH-009 | P2 | Provider runtime | In progress | Provider calls now run through Wardwright.ProviderRuntime using a supervised task boundary, per-target provider_timeout_ms, and runtime PubSub events for provider attempt start/finish. Slow provider calls time out and surface as provider_error receipts instead of hanging the request indefinitely. Streaming requests now use a stream_each provider boundary that emits normalized chunks to the router as they arrive, parses native Ollama/OpenAI-compatible stream transports incrementally, and supports cancellation when the stream arbiter halts an attempt. Remaining work is provider-specific pools, circuit breaking, richer telemetry, credential lookup isolation, and health/degraded-state reporting. |
| ARCH-010 | P2 | Test quality | In progress | Add behavior tests for fail-closed policy semantics and make property tests exercise implementation paths, not only oracle helpers. New coverage exists for forced-route failure, hybrid propagation, test-config gating, and the Gleam-backed structured/history/alert decision cores. |
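
The fail-closed rule recorded under ARCH-002 can be sketched roughly as follows. This is an illustration in Python, not Wardwright's own code: `RoutePolicy`, `Decision`, `resolve_forced_route`, and the `targets` mapping of model name to context-window size are hypothetical names. Only `allow_fallback`, `policy_route_constraints`, and the idea of preserving the failed override and skipped targets come from the ledger.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RoutePolicy:
    # Mirrors the ledger's allow_fallback flag; the default is fail-closed.
    allow_fallback: bool = False


@dataclass
class Decision:
    status: str  # "selected", "fallback", or "blocked"
    selected: Optional[str] = None
    failed_forced_override: Optional[str] = None
    skipped_targets: List[str] = field(default_factory=list)
    policy_route_constraints: Dict[str, object] = field(default_factory=dict)


def resolve_forced_route(forced: str, needed_tokens: int,
                         targets: Dict[str, int], policy: RoutePolicy) -> Decision:
    """Resolve a forced model override (hypothetical shape, for illustration)."""
    window = targets.get(forced)
    if window is not None and window >= needed_tokens:
        return Decision(status="selected", selected=forced)

    constraints = {"forced_model": forced, "allow_fallback": policy.allow_fallback}
    if not policy.allow_fallback:
        # Fail closed: a missing or too-small forced model blocks the request.
        return Decision("blocked", failed_forced_override=forced,
                        policy_route_constraints=constraints)

    # Fallback was explicitly authorized: pick the first target that fits and
    # record every target passed over on the way there.
    skipped: List[str] = []
    for name, size in targets.items():
        if name == forced:
            continue
        if size >= needed_tokens:
            return Decision("fallback", selected=name,
                            failed_forced_override=forced,
                            skipped_targets=skipped,
                            policy_route_constraints=constraints)
        skipped.append(name)
    return Decision("blocked", failed_forced_override=forced,
                    skipped_targets=skipped, policy_route_constraints=constraints)
```

With `allow_fallback` left at its default, forcing a model that is missing or too small yields a `blocked` decision. Setting it to true selects the first target that fits and keeps the failed override and skipped targets on the decision, which is the evidence the ledger says receipts should preserve.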

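ARCH-004 mentions checking split literal matches against a buffered stream window and releasing bounded safe prefixes. One common way to do that for a single literal is to hold back the last `len(pattern) - 1` characters of the stream. The sketch below assumes that approach; `LiteralStreamGuard` and its methods are hypothetical, it counts characters rather than bytes, and it does not cover regex triggers, rewrites, retries, or `max_hold_ms` budgets.

```python
class LiteralStreamGuard:
    """Holds back just enough text to catch a literal split across chunks."""

    def __init__(self, pattern: str):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self.held = ""        # generated but not yet released
        self.released = 0     # number of characters released so far
        self.blocked = False

    def feed(self, chunk: str) -> str:
        """Return the prefix that is safe to release after this chunk."""
        if self.blocked:
            return ""
        self.held += chunk
        if self.pattern in self.held:
            # The trigger fired before any matching text was released.
            self.blocked = True
            return ""
        # Keep the last len(pattern) - 1 characters: they could be the start
        # of a match that the next chunk completes.
        keep = len(self.pattern) - 1
        cut = max(len(self.held) - keep, 0)
        safe, self.held = self.held[:cut], self.held[cut:]
        self.released += len(safe)
        return safe

    def finish(self) -> str:
        """Flush the held tail once the provider stream ends cleanly."""
        if self.blocked:
            return ""
        safe, self.held = self.held, ""
        self.released += len(safe)
        return safe
```

Feeding `"hello SEC"` and then `"RET tail"` to a guard for `"SECRET"` releases only `"hell"` before the trigger fires, so no part of the match reaches the client even though it arrived in two chunks.
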
Follow-Up Review Gates