Optimize for Claude Agent SDK: Memory, context, and model selection

## Memory & Context Optimizations

### agent.py
- MAX_CONTEXT_MESSAGES: 10 → 20 (better conversation coherence)
- MEMORY_RESPONSE_PREVIEW_LENGTH: 200 → 500 (richer memory storage)
- MAX_CONVERSATION_HISTORY: 50 → 100 (longer session continuity)
- search_hybrid max_results: 2 → 5 (better memory recall)
- System prompt: Now mentions tool count and flat-rate subscription
- Memory format: Changed "User (username)/Agent" to "username/Garvis"

### llm_interface.py
- Added claude_agent_sdk model (Sonnet) to defaults
- Mode-based model selection:
  * Agent SDK → Sonnet (best quality, flat-rate)
  * Direct API → Haiku (cheapest, pay-per-token)
- Updated logging to show active model

## SOUL.md Rewrite
- Added Garvis identity (name, email, role)
- Listed all 17 tools (was missing 12 tools)
- Added "Critical Behaviors" section
- Emphasized flat-rate subscription benefits
- Clear instructions to always check user profiles

## Benefits

With flat-rate Agent SDK:
- ✅ Use Sonnet for better reasoning (was Haiku)
- ✅ 2x context messages (10 → 20)
- ✅ 2.5x memory results (2 → 5)
- ✅ 2.5x richer memory previews (200 → 500 chars)
- ✅ Bot knows its name and all capabilities
- ✅ Zero marginal cost for thoroughness

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 10:22:23 -07:00
parent ce2c384387
commit 911d362ba2
3 changed files with 65 additions and 50 deletions
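
The commit message describes mode-based model selection in llm_interface.py, but that file's diff is not shown here. A minimal sketch of the idea might look like the following. It assumes a simple lookup table keyed by mode; the `direct_api` key, the `sonnet`/`haiku` placeholder strings, and the `select_model` helper are hypothetical names for illustration, not the identifiers used in the actual file.

```python
from typing import Optional

# Hypothetical defaults table. Only the "claude_agent_sdk" key is named in the
# commit message; "direct_api" and the model strings are illustrative placeholders.
DEFAULT_MODELS = {
    "claude_agent_sdk": "sonnet",  # flat-rate subscription: favor quality
    "direct_api": "haiku",         # pay-per-token: favor cost
}


def select_model(mode: str, override: Optional[str] = None) -> str:
    """Return the model for a mode, letting an explicit override win."""
    if override:
        return override
    try:
        return DEFAULT_MODELS[mode]
    except KeyError:
        # Fail loudly instead of silently falling back to a paid model.
        raise ValueError(f"Unknown mode: {mode!r}") from None


if __name__ == "__main__":
    for mode in DEFAULT_MODELS:
        print(f"{mode}: {select_model(mode)}")
```

Raising on an unknown mode, rather than defaulting, is a design choice in this sketch: a typo in the mode name would otherwise route requests to whichever model happened to be the fallback.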
--- a/agent.py
+++ b/agent.py
@@ -10,11 +10,11 @@ from self_healing import SelfHealingSystem
 from tools import TOOL_DEFINITIONS, execute_tool

 # Maximum number of recent messages to include in LLM context
-MAX_CONTEXT_MESSAGES = 10  # Increased for better context retention
+MAX_CONTEXT_MESSAGES = 20  # Optimized for Agent SDK flat-rate subscription
 # Maximum characters of agent response to store in memory
-MEMORY_RESPONSE_PREVIEW_LENGTH = 200
+MEMORY_RESPONSE_PREVIEW_LENGTH = 500  # Store more context for better memory retrieval
 # Maximum conversation history entries before pruning
-MAX_CONVERSATION_HISTORY = 50
+MAX_CONVERSATION_HISTORY = 100  # Higher limit with flat-rate subscription


 class Agent:
@@ -142,14 +142,16 @@ class Agent:
        """Inner chat logic, called while holding _chat_lock."""
        soul = self.memory.get_soul()
        user_profile = self.memory.get_user(username)
-        relevant_memory = self.memory.search_hybrid(user_message, max_results=2)
+        relevant_memory = self.memory.search_hybrid(user_message, max_results=5)

        memory_lines = [f"- {mem['snippet']}" for mem in relevant_memory]
        system = (
            f"{soul}\n\nUser Profile:\n{user_profile}\n\n"
            f"Relevant Memory:\n" + "\n".join(memory_lines) +
-            f"\n\nYou have access to tools for file operations and command execution. "
-            f"Use them freely to help the user."
+            f"\n\nYou have access to {len(TOOL_DEFINITIONS)} tools for file operations, "
+            f"command execution, and Google services. Use them freely to help the user. "
+            f"Note: You're running on a flat-rate Agent SDK subscription, so don't worry "
+            f"about API costs when making multiple tool calls or processing large contexts."
        )

        self.conversation_history.append(
@@ -210,8 +212,8 @@ class Agent:

                preview = final_response[:MEMORY_RESPONSE_PREVIEW_LENGTH]
                self.memory.write_memory(
-                    f"**User ({username})**: {user_message}\n"
-                    f"**Agent**: {preview}...",
+                    f"**{username}**: {user_message}\n"
+                    f"**Garvis**: {preview}...",
                    daily=True,
                )