workspace/OPTIMIZATION.md

# OPTIMIZATION.md - Cost & Efficiency Rules

## RATE LIMITS

**API Call Throttling:**
- **5 seconds minimum** between API calls
- **10 seconds minimum** between web searches
- **Batch similar work** whenever possible
- **If you hit 429 error:** STOP and wait 5 minutes

**Monthly Budget:**
- **$20 total**
- **Warn at 75%** ($15 spent)

---

## MODEL SELECTION - Three Tiers

### Basic → Ollama
- File checking and organization
- Simple templating/formatting
- Log review and cleanup
- Non-critical analysis
- Routine status checks

**Cost:** Free (local)

### Normal → Haiku (Default)
- Most tasks
- Code review (non-production)
- Documentation and writing
- General problem solving
- Straightforward reasoning

**Cost:** ~$0.30-1.50 per 1M tokens

### Complex → Sonnet
- Architecture decisions
- Production code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions

**Cost:** ~$3-15 per 1M tokens

### Heartbeat: Ollama Only
Heartbeats ALWAYS use Ollama. No escalation. If it fails, it fails.
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00			`# OPTIMIZATION.md - Cost & Efficiency Rules`

			`## RATE LIMITS`

			`API Call Throttling:`
			`- 5 seconds minimum between API calls`
			`- 10 seconds minimum between web searches`
			`- Batch similar work whenever possible`
			`- If you hit 429 error: STOP and wait 5 minutes`

			`Monthly Budget:`
			`- $20 total`
			`- Warn at 75% ($15 spent)`

			`---`

Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`## MODEL SELECTION - Three Tiers`
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`### Basic → Ollama`
Integrate Ollama for heartbeats and routine tasks - local LLM for zero cost 2026-02-05 10:29:44 -07:00			`- File checking and organization`
			`- Simple templating/formatting`
Rework model tiers: Haiku default with internal routing to Ollama/Sonnet 2026-02-05 10:35:59 -07:00			`- Log review and cleanup`
Integrate Ollama for heartbeats and routine tasks - local LLM for zero cost 2026-02-05 10:29:44 -07:00			`- Non-critical analysis`
Rework model tiers: Haiku default with internal routing to Ollama/Sonnet 2026-02-05 10:35:59 -07:00			`- Routine status checks`
Integrate Ollama for heartbeats and routine tasks - local LLM for zero cost 2026-02-05 10:29:44 -07:00
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`Cost: Free (local)`

			`### Normal → Haiku (Default)`
			`- Most tasks`
Rework model tiers: Haiku default with internal routing to Ollama/Sonnet 2026-02-05 10:35:59 -07:00			`- Code review (non-production)`
			`- Documentation and writing`
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`- General problem solving`
			`- Straightforward reasoning`

			`Cost: ~$0.30-1.50 per 1M tokens`
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`### Complex → Sonnet`
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00			`- Architecture decisions`
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`- Production code review`
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00			`- Security analysis`
			`- Complex debugging/reasoning`
			`- Strategic multi-project decisions`

Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`Cost: ~$3-15 per 1M tokens`
Add OPTIMIZATION.md with rate limits and model selection rules 2026-02-05 10:12:39 -07:00
Simplify model tiers: Basic=Ollama, Normal=Haiku, Complex=Sonnet 2026-02-05 10:36:52 -07:00			`### Heartbeat: Ollama Only`
			`Heartbeats ALWAYS use Ollama. No escalation. If it fails, it fails.`