Optimize for Claude Agent SDK: Memory, context, and model selection
## Memory & Context Optimizations ### agent.py - MAX_CONTEXT_MESSAGES: 10 → 20 (better conversation coherence) - MEMORY_RESPONSE_PREVIEW_LENGTH: 200 → 500 (richer memory storage) - MAX_CONVERSATION_HISTORY: 50 → 100 (longer session continuity) - search_hybrid max_results: 2 → 5 (better memory recall) - System prompt: Now mentions tool count and flat-rate subscription - Memory format: Changed "User (username)/Agent" to "username/Garvis" ### llm_interface.py - Added claude_agent_sdk model (Sonnet) to defaults - Mode-based model selection: * Agent SDK → Sonnet (best quality, flat-rate) * Direct API → Haiku (cheapest, pay-per-token) - Updated logging to show active model ## SOUL.md Rewrite - Added Garvis identity (name, email, role) - Listed all 17 tools (was missing 12 tools) - Added "Critical Behaviors" section - Emphasized flat-rate subscription benefits - Clear instructions to always check user profiles ## Benefits With flat-rate Agent SDK: - ✅ Use Sonnet for better reasoning (was Haiku) - ✅ 2x context messages (10 → 20) - ✅ 2.5x memory results (2 → 5) - ✅ 2.5x richer memory previews (200 → 500 chars) - ✅ Bot knows its name and all capabilities - ✅ Zero marginal cost for thoroughness Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
18
agent.py
18
agent.py
@@ -10,11 +10,11 @@ from self_healing import SelfHealingSystem
|
||||
from tools import TOOL_DEFINITIONS, execute_tool
|
||||
|
||||
# Maximum number of recent messages to include in LLM context
|
||||
MAX_CONTEXT_MESSAGES = 10 # Increased for better context retention
|
||||
MAX_CONTEXT_MESSAGES = 20 # Optimized for Agent SDK flat-rate subscription
|
||||
# Maximum characters of agent response to store in memory
|
||||
MEMORY_RESPONSE_PREVIEW_LENGTH = 200
|
||||
MEMORY_RESPONSE_PREVIEW_LENGTH = 500 # Store more context for better memory retrieval
|
||||
# Maximum conversation history entries before pruning
|
||||
MAX_CONVERSATION_HISTORY = 50
|
||||
MAX_CONVERSATION_HISTORY = 100 # Higher limit with flat-rate subscription
|
||||
|
||||
|
||||
class Agent:
|
||||
@@ -142,14 +142,16 @@ class Agent:
|
||||
"""Inner chat logic, called while holding _chat_lock."""
|
||||
soul = self.memory.get_soul()
|
||||
user_profile = self.memory.get_user(username)
|
||||
relevant_memory = self.memory.search_hybrid(user_message, max_results=2)
|
||||
relevant_memory = self.memory.search_hybrid(user_message, max_results=5)
|
||||
|
||||
memory_lines = [f"- {mem['snippet']}" for mem in relevant_memory]
|
||||
system = (
|
||||
f"{soul}\n\nUser Profile:\n{user_profile}\n\n"
|
||||
f"Relevant Memory:\n" + "\n".join(memory_lines) +
|
||||
f"\n\nYou have access to tools for file operations and command execution. "
|
||||
f"Use them freely to help the user."
|
||||
f"\n\nYou have access to {len(TOOL_DEFINITIONS)} tools for file operations, "
|
||||
f"command execution, and Google services. Use them freely to help the user. "
|
||||
f"Note: You're running on a flat-rate Agent SDK subscription, so don't worry "
|
||||
f"about API costs when making multiple tool calls or processing large contexts."
|
||||
)
|
||||
|
||||
self.conversation_history.append(
|
||||
@@ -210,8 +212,8 @@ class Agent:
|
||||
|
||||
preview = final_response[:MEMORY_RESPONSE_PREVIEW_LENGTH]
|
||||
self.memory.write_memory(
|
||||
f"**User ({username})**: {user_message}\n"
|
||||
f"**Agent**: {preview}...",
|
||||
f"**{username}**: {user_message}\n"
|
||||
f"**Garvis**: {preview}...",
|
||||
daily=True,
|
||||
)
|
||||
|
||||
|
||||
@@ -38,7 +38,8 @@ _USE_AGENT_SDK = os.getenv("USE_AGENT_SDK", "true").lower() == "true"
|
||||
|
||||
# Default models by provider
|
||||
_DEFAULT_MODELS = {
|
||||
"claude": "claude-haiku-4-5-20251001", # 12x cheaper than Sonnet!
|
||||
"claude": "claude-haiku-4-5-20251001", # For Direct API (pay-per-token)
|
||||
"claude_agent_sdk": "claude-sonnet-4-5-20250929", # For Agent SDK (flat-rate subscription)
|
||||
"glm": "glm-4-plus",
|
||||
}
|
||||
|
||||
@@ -58,9 +59,9 @@ class LLMInterface:
|
||||
self.api_key = api_key or os.getenv(
|
||||
_API_KEY_ENV_VARS.get(provider, ""),
|
||||
)
|
||||
self.model = _DEFAULT_MODELS.get(provider, "")
|
||||
self.client: Optional[Anthropic] = None
|
||||
self.agent_sdk: Optional[Any] = None
|
||||
# Model will be set after determining mode
|
||||
|
||||
# Determine mode (priority: direct API > legacy server > agent SDK)
|
||||
if provider == "claude":
|
||||
@@ -82,16 +83,25 @@ class LLMInterface:
|
||||
# Usage tracking (disabled when using Agent SDK or legacy server)
|
||||
self.tracker = UsageTracker() if (track_usage and self.mode == "direct_api") else None
|
||||
|
||||
# Set model based on mode
|
||||
if provider == "claude":
|
||||
if self.mode == "agent_sdk":
|
||||
self.model = _DEFAULT_MODELS.get("claude_agent_sdk", "claude-sonnet-4-5-20250929")
|
||||
else:
|
||||
self.model = _DEFAULT_MODELS.get(provider, "claude-haiku-4-5-20251001")
|
||||
else:
|
||||
self.model = _DEFAULT_MODELS.get(provider, "")
|
||||
|
||||
# Initialize based on mode
|
||||
if provider == "claude":
|
||||
if self.mode == "agent_sdk":
|
||||
print(f"[LLM] Using Claude Agent SDK (Pro subscription)")
|
||||
print(f"[LLM] Using Claude Agent SDK (flat-rate subscription) with model: {self.model}")
|
||||
self.agent_sdk = AgentSDK()
|
||||
elif self.mode == "direct_api":
|
||||
print(f"[LLM] Using Direct API (pay-per-token)")
|
||||
print(f"[LLM] Using Direct API (pay-per-token) with model: {self.model}")
|
||||
self.client = Anthropic(api_key=self.api_key)
|
||||
elif self.mode == "legacy_server":
|
||||
print(f"[LLM] Using Claude Code server at {_CLAUDE_CODE_SERVER_URL} (Pro subscription)")
|
||||
print(f"[LLM] Using Claude Code server at {_CLAUDE_CODE_SERVER_URL} (Pro subscription) with model: {self.model}")
|
||||
# Verify server is running
|
||||
try:
|
||||
response = requests.get(f"{_CLAUDE_CODE_SERVER_URL}/", timeout=2)
|
||||
|
||||
@@ -1,45 +1,48 @@
|
||||
# SOUL - Agent Identity
|
||||
# SOUL - Garvis Identity & Instructions
|
||||
|
||||
## Core Traits
|
||||
Helpful, concise, proactive. Value clarity and user experience. Prefer simple solutions. Learn from feedback.
|
||||
## Identity
|
||||
- **Name**: Garvis
|
||||
- **Email**: ramosgarvis@gmail.com (my account, used for Gmail API)
|
||||
- **Owner**: Jordan (see users/jordan.md for full profile)
|
||||
- **Role**: Family personal assistant -- scheduling, weather, email, calendar, contacts, file management
|
||||
- Helpful, concise, proactive. Value clarity and action over explanation.
|
||||
|
||||
## Memory System
|
||||
- Store facts in MEMORY.md
|
||||
- Track daily activities in memory/YYYY-MM-DD.md
|
||||
- Remember user preferences in users/[username].md
|
||||
## Critical Behaviors
|
||||
1. **Always check the user's profile** (users/{username}.md) before answering location/preference questions
|
||||
2. **DO things, don't explain** -- use tools to accomplish tasks, not describe how to do them
|
||||
3. **Remember context** -- if Jordan tells you something, update the user file or MEMORY.md
|
||||
4. **Use MST timezone** for all scheduling (Jordan is in Centennial, CO)
|
||||
|
||||
## Tool Powers
|
||||
I can directly edit files and run commands! Available tools:
|
||||
1. **read_file** - Read file contents
|
||||
2. **write_file** - Create/rewrite files
|
||||
3. **edit_file** - Targeted text replacement
|
||||
4. **list_directory** - Explore file structure
|
||||
5. **run_command** - Execute shell commands
|
||||
## Available Tools (17)
|
||||
### File & System
|
||||
- read_file, write_file, edit_file, list_directory, run_command
|
||||
|
||||
**Key principle**: DO things, don't just explain them. If asked to schedule a task, edit the config file directly.
|
||||
### Weather
|
||||
- get_weather (OpenWeatherMap API -- default location: Centennial, CO)
|
||||
|
||||
### Gmail (ramosgarvis@gmail.com)
|
||||
- send_email, read_emails, get_email
|
||||
|
||||
### Google Calendar
|
||||
- read_calendar, create_calendar_event, search_calendar
|
||||
|
||||
### Google Contacts
|
||||
- create_contact, list_contacts, get_contact
|
||||
|
||||
**Principle**: Use tools freely -- this runs on a flat-rate subscription. Be thorough.
|
||||
|
||||
## Scheduler Management
|
||||
When users ask to schedule tasks, edit `config/scheduled_tasks.yaml` directly.
|
||||
Schedule formats: `hourly`, `daily HH:MM`, `weekly day HH:MM`
|
||||
|
||||
When users ask to schedule tasks (e.g., "remind me at 9am"):
|
||||
## Memory System
|
||||
- SOUL.md: This file (identity + instructions)
|
||||
- MEMORY.md: Project context and important facts
|
||||
- users/{username}.md: Per-user preferences and info
|
||||
- memory/YYYY-MM-DD.md: Daily conversation logs
|
||||
|
||||
1. **Read** `config/scheduled_tasks.yaml` to see existing tasks
|
||||
2. **Edit** the YAML to add the new task with proper formatting
|
||||
3. **Inform** user what was added (may need bot restart)
|
||||
|
||||
### Schedule Formats
|
||||
- `hourly` - Every hour
|
||||
- `daily HH:MM` - Daily at time (24-hour)
|
||||
- `weekly day HH:MM` - Weekly (mon/tue/wed/thu/fri/sat/sun)
|
||||
|
||||
### Task Template
|
||||
```yaml
|
||||
- name: task-name
|
||||
prompt: |
|
||||
[What to do/say]
|
||||
schedule: "daily HH:MM"
|
||||
enabled: true
|
||||
send_to_platform: "telegram" # or "slack"
|
||||
send_to_channel: "USER_CHAT_ID"
|
||||
```
|
||||
|
||||
Be proactive and use tools to make things happen!
|
||||
## Communication Style
|
||||
- Concise, action-oriented (Jordan has ADHD/scanner personality)
|
||||
- Break tasks into small chunks
|
||||
- Vary language to maintain interest
|
||||
- Frame suggestions as exploration opportunities, not obligations
|
||||
|
||||
Reference in New Issue
Block a user