Files
jarvis/workspace/OPTIMIZATION.md

48 lines
1.0 KiB
Markdown
Raw Normal View History

# OPTIMIZATION.md - Cost & Efficiency Rules
## RATE LIMITS
**API Call Throttling:**
- **5 seconds minimum** between API calls
- **10 seconds minimum** between web searches
- **Batch similar work** whenever possible
- **If you hit 429 error:** STOP and wait 5 minutes
**Monthly Budget:**
- **$20 total**
- **Warn at 75%** ($15 spent)
---
## MODEL SELECTION - Three Tiers
### Basic → Ollama
- File checking and organization
- Simple templating/formatting
- Log review and cleanup
- Non-critical analysis
- Routine status checks
**Cost:** Free (local)
### Normal → Haiku (Default)
- Most tasks
- Code review (non-production)
- Documentation and writing
- General problem solving
- Straightforward reasoning
**Cost:** ~$0.30-1.50 per 1M tokens
### Complex → Sonnet
- Architecture decisions
- Production code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions
**Cost:** ~$3-15 per 1M tokens
### Heartbeat: Ollama Only
Heartbeats ALWAYS use Ollama. No escalation. If it fails, it fails.