# OPTIMIZATION.md - Cost & Efficiency Rules

## RATE LIMITS

**API Call Throttling:**
- **5 seconds minimum** between API calls
- **10 seconds minimum** between web searches
- **Batch similar work** whenever possible
- **If you hit a 429 error** (HTTP Too Many Requests): STOP and wait 5 minutes

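
The throttling rules above could be enforced with a small helper like the one below. This is only a sketch under assumptions: the `Throttle` class, the `"api"` and `"search"` labels, and measuring the gap from the start of the previous call are all hypothetical choices, not something this file prescribes. The clock and sleep functions are injectable so the logic can be checked without real waiting.

```python
import time

# Minimum spacing in seconds, taken from the rules above.
MIN_INTERVAL = {"api": 5.0, "search": 10.0}
RATE_LIMIT_COOLDOWN = 300.0  # 5 minutes after an HTTP 429


class Throttle:
    """Enforces a minimum gap between calls of the same kind (hypothetical helper)."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        # -inf means "no previous call", so the first call never waits.
        self._next_allowed = {kind: float("-inf") for kind in MIN_INTERVAL}

    def wait(self, kind):
        """Block until a call of this kind is allowed; return seconds waited."""
        delay = max(0.0, self._next_allowed[kind] - self._clock())
        if delay > 0:
            self._sleep(delay)
        self._next_allowed[kind] = self._clock() + MIN_INTERVAL[kind]
        return delay

    def on_rate_limited(self):
        """Call after a 429 response: pause every kind of call for 5 minutes."""
        resume_at = self._clock() + RATE_LIMIT_COOLDOWN
        for kind in self._next_allowed:
            self._next_allowed[kind] = max(self._next_allowed[kind], resume_at)
```

A caller would run `throttle.wait("api")` before each API request and `throttle.on_rate_limited()` whenever a response comes back with status 429.
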
**Monthly Budget:**
- **$20 total**
- **Warn at 75%** ($15 spent)
- **Ollama tasks are free:** prioritize them for routine work

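
As a rough illustration of the budget thresholds, the check below classifies month-to-date spend. The function name `budget_status` and the `"over"` state for spend at or above $20 are assumptions added for the example; the rules above only define the $20 cap and the 75% warning.

```python
MONTHLY_BUDGET_USD = 20.00
WARN_FRACTION = 0.75  # warn once $15 of the $20 is spent


def budget_status(spent_usd):
    """Classify month-to-date spend as 'ok', 'warn', or 'over' (hypothetical helper)."""
    if spent_usd >= MONTHLY_BUDGET_USD:
        return "over"  # assumed behavior: the cap has been reached
    if spent_usd >= MONTHLY_BUDGET_USD * WARN_FRACTION:
        return "warn"
    return "ok"
```
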
---

## MODEL SELECTION

### Tier 1: Ollama (Free Local)
Use for:
- File checking and organization
- Heartbeat tasks (status checks, log review)
- Simple templating/formatting
- Non-critical analysis

**Advantage:** Free and local, with no API cost or network latency

### Tier 2: Default - Haiku

Switch to **Sonnet** ONLY for:
- Architecture decisions
- Production-like code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions

### Decision Rule

- **Ollama first** for routine/repetitive work
- **Haiku second** for most other tasks
- **Sonnet last** for genuinely complex thinking

**When in doubt:** Try Ollama first, escalate if needed.

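
One way to express the decision rule in code is sketched below. The task labels (`"heartbeat"`, `"security"`, and so on) and the function names `choose_model` and `escalate` are hypothetical, invented for this example; the mapping simply mirrors the tier lists in this file, with Haiku as the default for anything unlisted.

```python
from enum import Enum


class Model(Enum):
    OLLAMA = "ollama"
    HAIKU = "haiku"
    SONNET = "sonnet"


# Hypothetical task labels mirroring the Tier 1 list.
ROUTINE_TASKS = {"file_check", "heartbeat", "formatting", "non_critical_analysis"}
# Hypothetical task labels mirroring the "switch to Sonnet" list.
COMPLEX_TASKS = {"architecture", "code_review", "security", "complex_debugging", "strategy"}

ESCALATION_ORDER = [Model.OLLAMA, Model.HAIKU, Model.SONNET]


def choose_model(task_type):
    """Pick the cheapest tier the rules allow for this task type."""
    if task_type in COMPLEX_TASKS:
        return Model.SONNET
    if task_type in ROUTINE_TASKS:
        return Model.OLLAMA
    return Model.HAIKU  # default tier for everything else


def escalate(model):
    """Return the next tier up, or the same model if already at Sonnet."""
    index = ESCALATION_ORDER.index(model)
    return ESCALATION_ORDER[min(index + 1, len(ESCALATION_ORDER) - 1)]
```

For the "when in doubt" case, a caller could start with `Model.OLLAMA` and call `escalate` only if the result is not good enough.
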