Optimize for Claude Agent SDK: Memory, context, and model selection

## Memory & Context Optimizations

### agent.py
- MAX_CONTEXT_MESSAGES: 10 → 20 (better conversation coherence)
- MEMORY_RESPONSE_PREVIEW_LENGTH: 200 → 500 (richer memory storage)
- MAX_CONVERSATION_HISTORY: 50 → 100 (longer session continuity)
- search_hybrid max_results: 2 → 5 (better memory recall)
- System prompt: Now mentions tool count and flat-rate subscription
- Memory format: Changed "User (username)/Agent" to "username/Garvis"

### llm_interface.py
- Added claude_agent_sdk model (Sonnet) to defaults
- Mode-based model selection:
  * Agent SDK → Sonnet (best quality, flat-rate)
  * Direct API → Haiku (cheapest, pay-per-token)
- Updated logging to show active model

## SOUL.md Rewrite

- Added Garvis identity (name, email, role)
- Listed all 17 tools (was missing 12 tools)
- Added "Critical Behaviors" section
- Emphasized flat-rate subscription benefits
- Clear instructions to always check user profiles

## Benefits

With flat-rate Agent SDK:
-  Use Sonnet for better reasoning (was Haiku)
-  2x context messages (10 → 20)
-  2.5x memory results (2 → 5)
-  2.5x richer memory previews (200 → 500 chars)
-  Bot knows its name and all capabilities
-  Zero marginal cost for thoroughness

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
2026-02-15 10:22:23 -07:00
parent ce2c384387
commit 911d362ba2
3 changed files with 65 additions and 50 deletions

View File

@@ -10,11 +10,11 @@ from self_healing import SelfHealingSystem
from tools import TOOL_DEFINITIONS, execute_tool
# Maximum number of recent messages to include in LLM context
MAX_CONTEXT_MESSAGES = 10 # Increased for better context retention
MAX_CONTEXT_MESSAGES = 20 # Optimized for Agent SDK flat-rate subscription
# Maximum characters of agent response to store in memory
MEMORY_RESPONSE_PREVIEW_LENGTH = 200
MEMORY_RESPONSE_PREVIEW_LENGTH = 500 # Store more context for better memory retrieval
# Maximum conversation history entries before pruning
MAX_CONVERSATION_HISTORY = 50
MAX_CONVERSATION_HISTORY = 100 # Higher limit with flat-rate subscription
class Agent:
@@ -142,14 +142,16 @@ class Agent:
"""Inner chat logic, called while holding _chat_lock."""
soul = self.memory.get_soul()
user_profile = self.memory.get_user(username)
relevant_memory = self.memory.search_hybrid(user_message, max_results=2)
relevant_memory = self.memory.search_hybrid(user_message, max_results=5)
memory_lines = [f"- {mem['snippet']}" for mem in relevant_memory]
system = (
f"{soul}\n\nUser Profile:\n{user_profile}\n\n"
f"Relevant Memory:\n" + "\n".join(memory_lines) +
f"\n\nYou have access to tools for file operations and command execution. "
f"Use them freely to help the user."
f"\n\nYou have access to {len(TOOL_DEFINITIONS)} tools for file operations, "
f"command execution, and Google services. Use them freely to help the user. "
f"Note: You're running on a flat-rate Agent SDK subscription, so don't worry "
f"about API costs when making multiple tool calls or processing large contexts."
)
self.conversation_history.append(
@@ -210,8 +212,8 @@ class Agent:
preview = final_response[:MEMORY_RESPONSE_PREVIEW_LENGTH]
self.memory.write_memory(
f"**User ({username})**: {user_message}\n"
f"**Agent**: {preview}...",
f"**{username}**: {user_message}\n"
f"**Garvis**: {preview}...",
daily=True,
)