Synthetic Model Composition
Wardwright synthetic models are stable public model IDs backed by a route graph, not aliases for one upstream provider. The first route primitives are inherited from Calciforge’s model gateway work and are deliberately small:
- dispatcher: choose the smallest eligible context window for the request, while keeping larger eligible targets as fallback attempts
- cascade: try configured models in declaration order, skipping targets whose context windows cannot fit the request
- alloy: blend equivalent constituents through deterministic-all, weighted, or round-robin-style selection
Two outside ideas define the shape of this work. XBOW’s “model alloy” writeups show that alternating multiple LLMs inside one agent context can outperform a single-model monoculture on agentic search tasks. oh-my-pi’s Time Traveling Streamed Rules show that rules can sit outside the prompt until model output actually triggers them, then abort and retry with a targeted reminder.
Dispatcher
Dispatchers are for request-shape routing. The current implementation uses estimated prompt tokens and declared context windows. A small request can use a local model; a larger request promotes to a bigger-context model without the caller changing the public model name.
{
  "route_root": "fit-dispatcher",
  "targets": [
    {"model": "local/qwen", "context_window": 32768},
    {"model": "managed/kimi", "context_window": 262144}
  ],
  "dispatchers": [
    {
      "id": "fit-dispatcher",
      "models": ["local/qwen", "managed/kimi"]
    }
  ]
}
Expected behavior:
- prompt estimate below 32768: select local/qwen, keep managed/kimi as fallback
- prompt estimate above 32768 and below 262144: select managed/kimi, record that local/qwen was skipped for context fit
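The selection rule above can be sketched in a few lines of Python. This is an illustrative sketch, not Wardwright's implementation: `Target`, `RoutePlan`, and `plan_dispatch` are hypothetical names, and treating an estimate exactly equal to the window as a fit is an assumption the text does not settle.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    model: str
    context_window: int


@dataclass
class RoutePlan:
    selected: Optional[str]
    fallbacks: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def plan_dispatch(targets: List[Target], prompt_tokens: int) -> RoutePlan:
    """Pick the smallest context window that fits; larger fits become fallbacks."""
    # Assumption: a target fits when the estimate does not exceed its window.
    eligible = sorted(
        (t for t in targets if prompt_tokens <= t.context_window),
        key=lambda t: t.context_window,
    )
    skipped = [t.model for t in targets if prompt_tokens > t.context_window]
    if not eligible:
        return RoutePlan(selected=None, skipped=skipped)
    return RoutePlan(
        selected=eligible[0].model,
        fallbacks=[t.model for t in eligible[1:]],
        skipped=skipped,
    )


targets = [Target("local/qwen", 32768), Target("managed/kimi", 262144)]
small = plan_dispatch(targets, 4_000)    # fits both windows
large = plan_dispatch(targets, 100_000)  # only fits the larger window
```

Here `small` selects local/qwen with managed/kimi as fallback, while `large` selects managed/kimi and lists local/qwen as skipped.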
Policy Control
The route graph is the baseline model definition. Route policy runs before provider selection and can narrow or override that baseline:
- restrict_routes adds an allowed_targets constraint. Entries may be concrete model IDs such as local/qwen or provider prefixes such as local.
- switch_model and reroute add a forced_model constraint.
- receipts include both the base route decision and policy_route_constraints, so the UI can show “what the model definition allowed” separately from “what policy removed or forced for this request.”
- if policy removes every provider candidate, Wardwright fails closed and records route_blocked in the receipt instead of falling through to an arbitrary provider.
Built-in declarative route gates and Dune-backed policy snippets can both emit these actions. WASM remains fail-closed until the runtime is enabled.
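One way to picture how these constraints narrow the baseline is the Python sketch below. It is illustrative only: `apply_policy` and the receipt keys other than policy_route_constraints and route_blocked are hypothetical, prefix matching on the part before "/" is an assumption, and so is the rule that a forced model must still be among the remaining candidates.

```python
from typing import List, Optional


def apply_policy(
    candidates: List[str],
    allowed_targets: Optional[List[str]] = None,
    forced_model: Optional[str] = None,
) -> dict:
    """Narrow base candidates with policy constraints; fail closed when none remain."""
    remaining = list(candidates)
    if allowed_targets is not None:
        # Assumption: an entry matches an exact model ID or the provider prefix before "/".
        remaining = [
            m for m in remaining
            if any(m == a or m.split("/", 1)[0] == a for a in allowed_targets)
        ]
    if forced_model is not None:
        # Assumption: a forced model must still be among the remaining candidates.
        remaining = [m for m in remaining if m == forced_model]
    receipt = {
        "base_candidates": list(candidates),  # hypothetical key
        "policy_route_constraints": {
            "allowed_targets": allowed_targets,
            "forced_model": forced_model,
        },
        "candidates": remaining,  # hypothetical key
    }
    if not remaining:
        # Fail closed: never fall through to an arbitrary provider.
        receipt["status"] = "route_blocked"
    return receipt


base = ["local/qwen", "managed/kimi"]
narrowed = apply_policy(base, allowed_targets=["local"])
blocked = apply_policy(base, allowed_targets=["local"], forced_model="managed/kimi")
```

In this sketch `narrowed` keeps only local/qwen, and `blocked` ends with no candidates, so its receipt carries route_blocked while still showing the base candidates next to the constraints that removed them.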
Cascade
Cascades are reliability plans. They preserve declaration order and skip impossible targets before a provider attempt is made.
{
  "route_root": "local-then-remote",
  "cascades": [
    {
      "id": "local-then-remote",
      "models": ["local/qwen", "managed/kimi", "managed/reserve"]
    }
  ]
}
Expected behavior:
- try the first configured model that can fit the prompt
- keep later eligible models as fallback attempts
- never send an obviously oversized request to a smaller context window
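The cascade behavior can be sketched as a single pass over the declared models. This is a minimal sketch under stated assumptions: `plan_cascade` is a hypothetical name, and the context window used for managed/reserve is an invented value for illustration, since the configuration above does not declare one.

```python
from typing import Dict, List


def plan_cascade(models: List[str], windows: Dict[str, int], prompt_tokens: int) -> dict:
    """Keep declaration order; drop targets whose window cannot fit the prompt."""
    attempts: List[str] = []
    skipped: List[str] = []
    for model in models:  # declaration order is preserved, never re-sorted
        if prompt_tokens <= windows[model]:
            attempts.append(model)
        else:
            skipped.append(model)
    return {"attempts": attempts, "skipped": skipped}


windows = {
    "local/qwen": 32768,
    "managed/kimi": 262144,
    "managed/reserve": 131072,  # hypothetical window, not from the source
}
models = ["local/qwen", "managed/kimi", "managed/reserve"]
plan = plan_cascade(models, windows, 50_000)
```

The first entry in `attempts` is the model tried first, and the rest are fallbacks in their declared order. Unlike the dispatcher, nothing is sorted by window size.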
Alloy
Alloys are for composition when constituents are useful as one synthetic behavior. Wardwright supports ordinary compatible-window alloys and a deliberate partial-context mode for the user-facing “local plus long-context” use case.
{
  "route_root": "local-kimi-partial",
  "alloys": [
    {
      "id": "local-kimi-partial",
      "strategy": "deterministic_all",
      "partial_context": true,
      "constituents": ["local/qwen", "managed/kimi"]
    }
  ]
}
Expected behavior:
- when the prompt fits both constituents, both participate
- when the prompt outgrows the smaller context window, the smaller constituent is skipped and the larger one continues alone
- receipts expose selected models, skipped targets, fallback models, route type, strategy, and the context-window reason
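The partial-context rule can be sketched as follows. The function name `plan_alloy` and the receipt keys are hypothetical, and the handling of an ordinary alloy (no constituents selected when any of them cannot fit) is an assumption, since the text only describes the partial-context case.

```python
from typing import Dict, List


def plan_alloy(
    constituents: List[str],
    windows: Dict[str, int],
    prompt_tokens: int,
    partial_context: bool,
) -> dict:
    """Plan a deterministic_all alloy, optionally dropping constituents that cannot fit."""
    fitting = [m for m in constituents if prompt_tokens <= windows[m]]
    skipped = [m for m in constituents if prompt_tokens > windows[m]]
    if skipped and not partial_context:
        # Assumption: an ordinary alloy requires every constituent to fit the prompt.
        fitting = []
    return {
        "route_type": "alloy",
        "strategy": "deterministic_all",
        "selected_models": fitting,
        "skipped_targets": skipped,
        "reason": "context_window" if skipped else None,
    }


windows = {"local/qwen": 32768, "managed/kimi": 262144}
constituents = ["local/qwen", "managed/kimi"]
both = plan_alloy(constituents, windows, 8_000, partial_context=True)
long_prompt = plan_alloy(constituents, windows, 100_000, partial_context=True)
```

With a short prompt both constituents participate. With a long one only managed/kimi remains, and the receipt-like result records local/qwen as skipped for a context-window reason.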
Weighted alloys model stochastic composition while keeping test runs reproducible inside the pure route planner. Caller-supplied request metadata must not control provider selection in the serving path.
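One common way to get reproducible stochastic selection is to give the planner its own seeded random generator. The sketch below assumes that approach. `pick_weighted`, the weight values, and the idea that the seed comes from planner or test configuration rather than from request metadata are all illustrative assumptions, not Wardwright's documented design.

```python
import random
from typing import Dict


def pick_weighted(weights: Dict[str, float], seed: int) -> str:
    """Pick one constituent by weight using a planner-owned seed."""
    # A local generator keeps the function pure: same inputs, same choice,
    # and no dependence on global random state.
    rng = random.Random(seed)
    models = list(weights)
    return rng.choices(models, weights=[weights[m] for m in models], k=1)[0]


weights = {"local/qwen": 3.0, "managed/kimi": 1.0}  # hypothetical weights
first = pick_weighted(weights, seed=7)
second = pick_weighted(weights, seed=7)
```

Because the seed is an explicit argument, a test can replay the same choice, and nothing supplied by the caller in request metadata takes part in the selection.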