OPTIMIZATION.md - Cost & Efficiency Rules
RATE LIMITS
API Call Throttling:
- 5 seconds minimum between API calls
- 10 seconds minimum between web searches
- Batch similar work whenever possible
- If you hit a 429 (Too Many Requests) error: STOP and wait 5 minutes before retrying
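The throttling rules above could be enforced with a small helper along these lines. This is a minimal sketch, not a prescribed implementation: the `Throttle` class, the `"api"` and `"search"` labels, and the choice to pause both kinds of call after a 429 are all assumptions made for illustration.

```python
import time

# Minimum spacing in seconds, taken from the rules above.
MIN_INTERVAL = {"api": 5.0, "search": 10.0}
BACKOFF_429 = 300.0  # 5 minutes


class Throttle:
    """Tracks the earliest time each kind of call may run next (hypothetical helper)."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the logic can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {kind: 0.0 for kind in MIN_INTERVAL}

    def wait_time(self, kind):
        """Seconds remaining before a call of this kind is allowed."""
        return max(0.0, self._next_allowed[kind] - self._clock())

    def acquire(self, kind):
        """Block until the call is allowed, then reserve the next slot."""
        delay = self.wait_time(kind)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed[kind] = self._clock() + MIN_INTERVAL[kind]

    def record_429(self):
        """Assumption: a 429 pauses every kind of call for the full backoff."""
        resume_at = self._clock() + BACKOFF_429
        for kind in self._next_allowed:
            self._next_allowed[kind] = max(self._next_allowed[kind], resume_at)
```

Call `acquire("api")` or `acquire("search")` immediately before each request, and `record_429()` whenever a response comes back with status 429.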
Monthly Budget:
- $20 total
- Warn at 75% ($15 spent)
- Ollama tasks are free — prioritize for routine work
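A budget check matching these thresholds might look like the sketch below. The function name `budget_status` and the `"exhausted"` state at 100% are assumptions; the rules above only specify the $20 total and the 75% warning.

```python
from decimal import Decimal

MONTHLY_BUDGET = Decimal("20.00")  # $20 total
WARN_FRACTION = Decimal("0.75")    # warn at 75%, i.e. $15 spent


def budget_status(spent):
    """Classify month-to-date spend (in dollars) against the budget."""
    spent = Decimal(str(spent))  # go through str to avoid float rounding artifacts
    if spent >= MONTHLY_BUDGET:
        return "exhausted"  # assumption: the rules do not say what happens at 100%
    if spent >= MONTHLY_BUDGET * WARN_FRACTION:
        return "warn"
    return "ok"
```

Ollama work costs nothing, so only paid API calls would need to be added to `spent`.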
MODEL SELECTION
Tier 1: Ollama (Free Local)
Use for:
- File checking and organization
- Heartbeat tasks (status checks, log review)
- Simple templating/formatting
- Non-critical analysis
Advantage: Runs locally, so zero API cost and no network latency
Tier 2: Haiku (Default)
Switch to Sonnet ONLY for:
- Architecture decisions
- Production-like code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions
Decision Rule
- Ollama first for routine/repetitive work
- Haiku second for most other tasks
- Sonnet last for genuinely complex thinking
When in doubt: Try Ollama first, escalate if needed.
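The decision rule can be sketched as a simple lookup with an escalation path. The task labels below are hypothetical names derived from the lists above, and the returned strings are tier names, not real model identifiers.

```python
# Hypothetical task labels based on the Tier 1 and Sonnet lists above.
ROUTINE_TASKS = {"file_check", "heartbeat", "templating", "non_critical_analysis"}
COMPLEX_TASKS = {
    "architecture",
    "code_review",
    "security_analysis",
    "complex_debugging",
    "strategy",
}

# Cheapest tier first; escalation moves one step to the right.
TIERS = ["ollama", "haiku", "sonnet"]


def choose_model(task_type):
    """Pick the starting tier for a task."""
    if task_type in ROUTINE_TASKS:
        return "ollama"
    if task_type in COMPLEX_TASKS:
        return "sonnet"
    return "haiku"  # default for everything else


def escalate(model):
    """Return the next tier up, or None if already at the top."""
    index = TIERS.index(model)
    return TIERS[index + 1] if index + 1 < len(TIERS) else None
```

For a task whose category is unclear, one option is to start at `"ollama"` and call `escalate` only when the result is not good enough, which mirrors the "when in doubt" rule.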