Optimize for Claude Agent SDK: Memory, context, and model selection
## Memory & Context Optimizations ### agent.py - MAX_CONTEXT_MESSAGES: 10 → 20 (better conversation coherence) - MEMORY_RESPONSE_PREVIEW_LENGTH: 200 → 500 (richer memory storage) - MAX_CONVERSATION_HISTORY: 50 → 100 (longer session continuity) - search_hybrid max_results: 2 → 5 (better memory recall) - System prompt: Now mentions tool count and flat-rate subscription - Memory format: Changed "User (username)/Agent" to "username/Garvis" ### llm_interface.py - Added claude_agent_sdk model (Sonnet) to defaults - Mode-based model selection: * Agent SDK → Sonnet (best quality, flat-rate) * Direct API → Haiku (cheapest, pay-per-token) - Updated logging to show active model ## SOUL.md Rewrite - Added Garvis identity (name, email, role) - Listed all 17 tools (was missing 12 tools) - Added "Critical Behaviors" section - Emphasized flat-rate subscription benefits - Clear instructions to always check user profiles ## Benefits With flat-rate Agent SDK: - ✅ Use Sonnet for better reasoning (was Haiku) - ✅ 2x context messages (10 → 20) - ✅ 2.5x memory results (2 → 5) - ✅ 2.5x richer memory previews (200 → 500 chars) - ✅ Bot knows its name and all capabilities - ✅ Zero marginal cost for thoroughness Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
18
agent.py
18
agent.py
@@ -10,11 +10,11 @@ from self_healing import SelfHealingSystem
|
||||
from tools import TOOL_DEFINITIONS, execute_tool
|
||||
|
||||
# Maximum number of recent messages to include in LLM context
|
||||
MAX_CONTEXT_MESSAGES = 10 # Increased for better context retention
|
||||
MAX_CONTEXT_MESSAGES = 20 # Optimized for Agent SDK flat-rate subscription
|
||||
# Maximum characters of agent response to store in memory
|
||||
MEMORY_RESPONSE_PREVIEW_LENGTH = 200
|
||||
MEMORY_RESPONSE_PREVIEW_LENGTH = 500 # Store more context for better memory retrieval
|
||||
# Maximum conversation history entries before pruning
|
||||
MAX_CONVERSATION_HISTORY = 50
|
||||
MAX_CONVERSATION_HISTORY = 100 # Higher limit with flat-rate subscription
|
||||
|
||||
|
||||
class Agent:
|
||||
@@ -142,14 +142,16 @@ class Agent:
|
||||
"""Inner chat logic, called while holding _chat_lock."""
|
||||
soul = self.memory.get_soul()
|
||||
user_profile = self.memory.get_user(username)
|
||||
relevant_memory = self.memory.search_hybrid(user_message, max_results=2)
|
||||
relevant_memory = self.memory.search_hybrid(user_message, max_results=5)
|
||||
|
||||
memory_lines = [f"- {mem['snippet']}" for mem in relevant_memory]
|
||||
system = (
|
||||
f"{soul}\n\nUser Profile:\n{user_profile}\n\n"
|
||||
f"Relevant Memory:\n" + "\n".join(memory_lines) +
|
||||
f"\n\nYou have access to tools for file operations and command execution. "
|
||||
f"Use them freely to help the user."
|
||||
f"\n\nYou have access to {len(TOOL_DEFINITIONS)} tools for file operations, "
|
||||
f"command execution, and Google services. Use them freely to help the user. "
|
||||
f"Note: You're running on a flat-rate Agent SDK subscription, so don't worry "
|
||||
f"about API costs when making multiple tool calls or processing large contexts."
|
||||
)
|
||||
|
||||
self.conversation_history.append(
|
||||
@@ -210,8 +212,8 @@ class Agent:
|
||||
|
||||
preview = final_response[:MEMORY_RESPONSE_PREVIEW_LENGTH]
|
||||
self.memory.write_memory(
|
||||
f"**User ({username})**: {user_message}\n"
|
||||
f"**Agent**: {preview}...",
|
||||
f"**{username}**: {user_message}\n"
|
||||
f"**Garvis**: {preview}...",
|
||||
daily=True,
|
||||
)
|
||||
|
||||
|
||||
Reference in New Issue
Block a user