2026-02-05 10:12:39 -07:00
# OPTIMIZATION.md - Cost & Efficiency Rules
## RATE LIMITS
**API Call Throttling:**
- **5 seconds minimum** between API calls
- **10 seconds minimum** between web searches
- **Batch similar work** whenever possible
- **If you hit a 429 error:** STOP and wait 5 minutes
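
The throttling rules above could be enforced with a small helper along these lines. This is a minimal sketch, not an existing implementation: the `Throttle` class, its method names, and the `"api"` / `"search"` labels are all hypothetical, and it assumes a 429 should pause every kind of call.

```python
import time

# Minimum gaps taken from the rules above; the "api" and "search" keys are
# illustrative labels, not names defined anywhere in this document.
MIN_GAP_SECONDS = {"api": 5.0, "search": 10.0}
BACKOFF_429_SECONDS = 5 * 60  # "STOP and wait 5 minutes"


class Throttle:
    """Tracks the earliest time each kind of call may run next."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        # -inf means "no call made yet", so the first call never waits.
        self._next_allowed = {kind: float("-inf") for kind in MIN_GAP_SECONDS}

    def wait_time(self, kind):
        """Seconds still to wait before a call of this kind is allowed."""
        return max(0.0, self._next_allowed[kind] - self._clock())

    def acquire(self, kind):
        """Block until the call is allowed, then reserve the next slot."""
        delay = self.wait_time(kind)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed[kind] = self._clock() + MIN_GAP_SECONDS[kind]

    def record_429(self):
        """After a 429, hold every kind of call for five minutes (assumption)."""
        resume_at = self._clock() + BACKOFF_429_SECONDS
        for kind in self._next_allowed:
            self._next_allowed[kind] = resume_at
```

Call `acquire("api")` or `acquire("search")` right before each request, and `record_429()` when a response comes back with status 429. The injectable `clock` and `sleep` exist only so the helper can be tested without real waiting.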
**Monthly Budget:**
- **$20 total**
- **Warn at 75%** ($15 spent)
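
As a rough illustration of the budget thresholds, the check below classifies month-to-date spend. The function name `budget_status` and the `"exhausted"` state are assumptions added for the sketch; the rules above only specify the $20 total and the 75% warning.

```python
MONTHLY_BUDGET_USD = 20.00
WARN_FRACTION = 0.75  # warn at 75% of the budget, i.e. $15


def budget_status(spent_usd):
    """Classify month-to-date spend as 'ok', 'warn', or 'exhausted'."""
    if spent_usd >= MONTHLY_BUDGET_USD:
        # Assumed behavior: treat the full $20 as a hard ceiling.
        return "exhausted"
    if spent_usd >= MONTHLY_BUDGET_USD * WARN_FRACTION:
        return "warn"
    return "ok"
```

How spend is measured (provider dashboard, local token counting, and so on) is left open here; the sketch only covers the threshold comparison.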
---
## MODEL SELECTION
### System Default: Haiku
Haiku is your primary model. It decides routing internally:
**Haiku routes to Ollama for:**
- File checking and organization
- Simple templating/formatting
- Log review and cleanup
- Non-critical analysis
- Routine status checks
**Haiku handles directly:**
- Most tasks, since they fit within Haiku's capability
- Reasoning that is needed but not deeply complex
- Code review (non-production)
- Documentation and writing
**Haiku escalates to Sonnet for:**
- Architecture decisions
- Production-like code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions
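
The three tiers above can be pictured as a lookup table. This is only an illustration of the tiers, not a description of how Haiku actually routes internally; the category strings, the `Model` enum, and the `route` function are hypothetical names invented for the sketch.

```python
from enum import Enum


class Model(Enum):
    OLLAMA = "ollama"
    HAIKU = "haiku"
    SONNET = "sonnet"


# Hypothetical category labels mirroring the bullet lists above.
OLLAMA_TASKS = {
    "file_organization",
    "templating",
    "log_review",
    "non_critical_analysis",
    "status_check",
}
SONNET_TASKS = {
    "architecture",
    "production_code_review",
    "security_analysis",
    "complex_debugging",
    "multi_project_strategy",
}


def route(task_category):
    """Pick a tier for a task; anything unlisted stays with Haiku."""
    if task_category in SONNET_TASKS:
        return Model.SONNET
    if task_category in OLLAMA_TASKS:
        return Model.OLLAMA
    return Model.HAIKU
```

Defaulting to Haiku for unlisted categories matches the rule that most tasks fit within Haiku's capability.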
### Heartbeat: Ollama Only
Heartbeats ALWAYS use Ollama. No escalation. If Ollama fails, the heartbeat fails.
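
A minimal sketch of the no-escalation rule, assuming the heartbeat is a single callable that talks to Ollama. Both `run_heartbeat` and the `ollama_check` parameter are hypothetical; the point is only that a failure is reported rather than retried on another model.

```python
def run_heartbeat(ollama_check):
    """Run one heartbeat against Ollama only; never fall back to another model."""
    try:
        ollama_check()
    except Exception as exc:
        # No escalation: if Ollama fails, the heartbeat fails.
        return False, f"heartbeat failed: {exc}"
    return True, "ok"
```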
### Decision Rule
Let Haiku decide. It's smart enough to route to Ollama when appropriate and escalate to Sonnet when needed. Override only when you know upfront that you need Sonnet.