50 lines
1.2 KiB
Markdown
50 lines
1.2 KiB
Markdown
# OPTIMIZATION.md - Cost & Efficiency Rules
|
|
|
|
## RATE LIMITS
|
|
|
|
**API Call Throttling:**
|
|
- **5 seconds minimum** between API calls
|
|
- **10 seconds minimum** between web searches
|
|
- **Batch similar work** whenever possible
|
|
- **If you hit 429 error:** STOP and wait 5 minutes
|
|
|
|
**Monthly Budget:**
|
|
- **$20 total**
|
|
- **Warn at 75%** ($15 spent)
|
|
|
|
---
|
|
|
|
## MODEL SELECTION
|
|
|
|
### System Default: Haiku
|
|
|
|
Haiku is your primary model. It decides routing internally:
|
|
|
|
**Haiku routes to Ollama when:**
|
|
- File checking and organization
|
|
- Simple templating/formatting
|
|
- Log review and cleanup
|
|
- Non-critical analysis
|
|
- Routine status checks
|
|
|
|
**Haiku handles directly when:**
|
|
- Most tasks fit within Haiku's capability
|
|
- Reasoning is needed but not deeply complex
|
|
- Code review (non-production)
|
|
- Documentation and writing
|
|
|
|
**Haiku escalates to Sonnet when:**
|
|
- Architecture decisions
|
|
- Production-like code review
|
|
- Security analysis
|
|
- Complex debugging/reasoning
|
|
- Strategic multi-project decisions
|
|
|
|
### Heartbeat: Ollama Only
|
|
|
|
Heartbeats ALWAYS use Ollama. No escalation. If Ollama fails, the heartbeat fails.
|
|
|
|
### Decision Rule
|
|
|
|
Let Haiku decide. It's smart enough to route to Ollama when appropriate and escalate to Sonnet when needed. You only override when you know you need Sonnet upfront.
|