diff --git a/workspace/OPTIMIZATION.md b/workspace/OPTIMIZATION.md index 4ee2701..67a2a6e 100644 --- a/workspace/OPTIMIZATION.md +++ b/workspace/OPTIMIZATION.md @@ -11,34 +11,39 @@ **Monthly Budget:** - **$20 total** - **Warn at 75%** ($15 spent) -- **Ollama tasks are free** — prioritize for routine work --- ## MODEL SELECTION -### Tier 1: Ollama (Free Local) -Use for: +### System Default: Haiku + +Haiku is your primary model. It decides routing internally: + +**Haiku routes to Ollama when:** - File checking and organization -- Heartbeat tasks (status checks, log review) - Simple templating/formatting +- Log review and cleanup - Non-critical analysis +- Routine status checks -**Advantage:** Free, instant, zero API cost +**Haiku handles directly when:** +- Most tasks fit within Haiku's capability +- Reasoning is needed but not deeply complex +- Code review (non-production) +- Documentation and writing -### Tier 2: Default - Haiku - -Switch to **Sonnet** ONLY when: +**Haiku escalates to Sonnet when:** - Architecture decisions - Production-like code review - Security analysis - Complex debugging/reasoning - Strategic multi-project decisions +### Heartbeat: Ollama Only + +Heartbeats ALWAYS use Ollama. No escalation. If Ollama fails, the heartbeat fails. + ### Decision Rule -- **Ollama first** for routine/repetitive work -- **Haiku second** for most other tasks -- **Sonnet last** for genuinely complex thinking - -**When in doubt:** Try Ollama first, escalate if needed. +Let Haiku decide. It's smart enough to route to Ollama when appropriate and escalate to Sonnet when needed. You only override when you know you need Sonnet upfront.