OPTIMIZATION.md - Cost & Efficiency Rules
RATE LIMITS
API Call Throttling:
- 5 seconds minimum between API calls
- 10 seconds minimum between web searches
- Batch similar work whenever possible
- If you hit a 429 (rate limit) error: STOP and wait 5 minutes before retrying
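The throttling rules above could be enforced with a small helper like the one below. This is a minimal sketch, not a prescribed implementation: the `Throttle` class, the `"api"` / `"web_search"` labels, and the injectable `clock` and `sleep` parameters are all hypothetical names introduced for illustration.

```python
import time

# Minimum gap in seconds between calls of each kind (values from the rules above).
MIN_INTERVAL = {"api": 5.0, "web_search": 10.0}
# Pause after a 429 response: 5 minutes.
BACKOFF_429_SECONDS = 5 * 60


class Throttle:
    """Hypothetical helper that spaces out calls and pauses on HTTP 429."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the logic can be exercised without real waiting.
        self._clock = clock
        self._sleep = sleep
        self._last_call = {}

    def wait_turn(self, kind):
        """Sleep until the minimum gap for `kind` has passed; return seconds waited."""
        last = self._last_call.get(kind)
        delay = 0.0
        if last is not None:
            delay = max(0.0, MIN_INTERVAL[kind] - (self._clock() - last))
        if delay > 0:
            self._sleep(delay)
        self._last_call[kind] = self._clock()
        return delay

    def on_response(self, status_code):
        """Stop and wait 5 minutes on a 429; return seconds waited."""
        if status_code == 429:
            self._sleep(BACKOFF_429_SECONDS)
            return BACKOFF_429_SECONDS
        return 0
```

Call `wait_turn("api")` or `wait_turn("web_search")` immediately before each request, and pass the response status to `on_response` afterwards.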
Monthly Budget:
- $20 total
- Warn at 75% ($15 spent)
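The budget thresholds work out to a warning at 0.75 × $20 = $15. A minimal sketch of that check follows. The function name and the status strings `"ok"`, `"warn"`, and `"exhausted"` are assumptions for illustration, and how spend is actually measured is left out.

```python
MONTHLY_BUDGET_USD = 20.00  # total monthly budget from the rules above
WARN_FRACTION = 0.75        # warn once 75% of the budget is spent


def budget_status(spent_usd):
    """Classify month-to-date spend against the budget (status names are hypothetical)."""
    if spent_usd >= MONTHLY_BUDGET_USD:
        return "exhausted"
    if spent_usd >= MONTHLY_BUDGET_USD * WARN_FRACTION:
        return "warn"
    return "ok"
```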
MODEL SELECTION RULE
Default: Haiku
Switch to Sonnet ONLY for:
- Architecture decisions
- Complex code review
- Security analysis
- Complex debugging/reasoning
When in doubt: Try Haiku first
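The selection rule amounts to a default with a short allow-list of exceptions. The sketch below assumes tasks arrive already tagged with a category. The category strings and the return values `"haiku"` and `"sonnet"` are placeholder labels, not real API model identifiers.

```python
# Task categories that justify switching to Sonnet (labels are hypothetical).
SONNET_TASKS = {
    "architecture",
    "complex_code_review",
    "security_analysis",
    "complex_debugging",
}


def choose_model(task_type):
    """Return 'sonnet' only for allow-listed task types; otherwise default to 'haiku'."""
    return "sonnet" if task_type in SONNET_TASKS else "haiku"
```

Unknown or untagged task types fall through to Haiku, which matches the "when in doubt" rule.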
Heartbeat: Ollama Only
Heartbeats ALWAYS use Ollama. No escalation. If it fails, it fails.
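One way to express "no escalation" in code is a heartbeat wrapper that has no fallback path at all. This is only a sketch: `run_heartbeat` and the `ollama_call` parameter are hypothetical, and the caller is assumed to supply whatever function actually talks to the local Ollama instance.

```python
def run_heartbeat(ollama_call):
    """Run one heartbeat via the supplied Ollama callable; never fall back to another model."""
    try:
        ollama_call()
        return True
    except Exception:
        # If it fails, it fails: report the failure and do not retry elsewhere.
        return False
```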