OPTIMIZATION.md - Cost & Efficiency Rules

RATE LIMITS

API Call Throttling:

  • 5 seconds minimum between API calls
  • 10 seconds minimum between web searches
  • Batch similar work whenever possible
  • If you hit a 429 (Too Many Requests) error: STOP and wait 5 minutes before retrying
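
The throttling rules above can be sketched as a small helper. This is a minimal illustration, not an existing part of the workspace: the `Throttle` class, its method names, and the call kinds `"api"` and `"web_search"` are hypothetical. It also assumes that a 429 pauses every kind of call, which is one reading of "STOP".

```python
import time

# Minimum gaps taken from the rules above; the dictionary keys are hypothetical names.
MIN_INTERVAL_SECONDS = {"api": 5.0, "web_search": 10.0}
RATE_LIMIT_COOLDOWN_SECONDS = 5 * 60  # pause after a 429 response


class Throttle:
    """Tracks the earliest time each kind of call may run again."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the logic can be checked without waiting.
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {kind: float("-inf") for kind in MIN_INTERVAL_SECONDS}

    def wait_time(self, kind):
        """Seconds still to wait before a call of this kind is allowed."""
        return max(0.0, self._next_allowed[kind] - self._clock())

    def acquire(self, kind):
        """Block until a call of this kind is allowed, then reserve the next slot."""
        delay = self.wait_time(kind)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed[kind] = self._clock() + MIN_INTERVAL_SECONDS[kind]

    def record_response(self, status_code):
        """On HTTP 429, push every kind of call back by the cooldown (assumption)."""
        if status_code == 429:
            resume_at = self._clock() + RATE_LIMIT_COOLDOWN_SECONDS
            for kind in self._next_allowed:
                self._next_allowed[kind] = max(self._next_allowed[kind], resume_at)
```

A caller would run `throttle.acquire("api")` before each request and `throttle.record_response(status)` after it, so the 5-minute pause is enforced on the next `acquire`.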

Monthly Budget:

  • $20 total
  • Warn at 75% ($15 spent)
  • Ollama tasks are free — prioritize for routine work
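
The budget thresholds above work out to a warning at $15.00 of the $20.00 total. A minimal sketch of that check follows. The `BudgetTracker` class and its status strings are hypothetical, amounts are kept in integer cents to avoid floating-point drift, and the "exhausted" state at 100% is an assumption the rules do not spell out.

```python
from dataclasses import dataclass

MONTHLY_BUDGET_CENTS = 2000  # $20 total
WARN_AT_CENTS = MONTHLY_BUDGET_CENTS * 75 // 100  # 75% of budget = $15


@dataclass
class BudgetTracker:
    """Running total of paid API spend for the current month, in cents."""

    spent_cents: int = 0

    def record(self, cost_cents: int) -> str:
        """Add one charge and return the resulting status."""
        if cost_cents < 0:
            raise ValueError("cost_cents must be non-negative")
        self.spent_cents += cost_cents
        return self.status()

    def status(self) -> str:
        """Return 'ok', 'warn' (75% or more), or 'exhausted' (100% or more)."""
        if self.spent_cents >= MONTHLY_BUDGET_CENTS:
            return "exhausted"  # assumption: treat the full budget as a hard stop
        if self.spent_cents >= WARN_AT_CENTS:
            return "warn"
        return "ok"
```

Ollama tasks would never call `record`, since they carry no API cost.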

MODEL SELECTION

Tier 1: Ollama (Free Local)

Use for:

  • File checking and organization
  • Heartbeat tasks (status checks, log review)
  • Simple templating/formatting
  • Non-critical analysis

Advantage: Runs locally, so zero API cost and no network latency

Tier 2: Haiku (Default)

Use for most other tasks. Switch to Sonnet ONLY for:

  • Architecture decisions
  • Production-like code review
  • Security analysis
  • Complex debugging/reasoning
  • Strategic multi-project decisions

Decision Rule

  • Ollama first for routine/repetitive work
  • Haiku second for most other tasks
  • Sonnet last for genuinely complex thinking

When in doubt: Try Ollama first, escalate if needed.
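
The decision rule above can be sketched as a lookup plus an escalation step. The task-kind labels, `pick_model`, and `escalate` are hypothetical names chosen for illustration. The sketch assumes that tasks matching neither list fall through to the Haiku default.

```python
# Hypothetical task labels mirroring the Tier 1 and Sonnet lists above.
OLLAMA_TASKS = {"file_check", "heartbeat", "templating", "noncritical_analysis"}
SONNET_TASKS = {
    "architecture",
    "code_review",
    "security_analysis",
    "complex_debugging",
    "strategy",
}

# Cheapest tier first; escalation moves one step to the right.
TIERS = ["ollama", "haiku", "sonnet"]


def pick_model(task_kind: str) -> str:
    """Return the starting tier for a task kind."""
    if task_kind in OLLAMA_TASKS:
        return "ollama"
    if task_kind in SONNET_TASKS:
        return "sonnet"
    return "haiku"  # default tier for everything else (assumption)


def escalate(model: str) -> str:
    """Return the next tier up, or the same tier if already at the top."""
    index = TIERS.index(model)
    return TIERS[min(index + 1, len(TIERS) - 1)]
```

For example, a heartbeat check starts on `pick_model("heartbeat")`, and if the local result is not good enough, `escalate("ollama")` gives the next tier to try.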