Optimize for Claude Agent SDK: Memory, context, and model selection
## Memory & Context Optimizations

### agent.py
- MAX_CONTEXT_MESSAGES: 10 → 20 (better conversation coherence)
- MEMORY_RESPONSE_PREVIEW_LENGTH: 200 → 500 (richer memory storage)
- MAX_CONVERSATION_HISTORY: 50 → 100 (longer session continuity)
- search_hybrid max_results: 2 → 5 (better memory recall)
- System prompt: now mentions the tool count and the flat-rate subscription
- Memory format: changed "User (username)/Agent" to "username/Garvis"

### llm_interface.py
- Added the claude_agent_sdk model (Sonnet) to the defaults
- Mode-based model selection:
  * Agent SDK → Sonnet (best quality, flat-rate)
  * Direct API → Haiku (cheapest, pay-per-token)
- Updated logging to show the active model

## SOUL.md Rewrite
- Added the Garvis identity (name, email, role)
- Listed all 17 tools (12 were previously missing)
- Added a "Critical Behaviors" section
- Emphasized the flat-rate subscription benefits
- Clear instructions to always check user profiles

## Benefits
With the flat-rate Agent SDK:
- ✅ Sonnet for better reasoning (was Haiku)
- ✅ 2x context messages (10 → 20)
- ✅ 2.5x memory results (2 → 5)
- ✅ 2.5x richer memory previews (200 → 500 chars)
- ✅ The bot knows its name and all its capabilities
- ✅ Zero marginal cost for thoroughness

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
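The agent.py tuning described above can be sketched as plain module-level constants. Only the names and values come from the commit message; the surrounding module (and how these constants are consumed) is assumed.

```python
# Hypothetical sketch of the agent.py constants after this change.
# Names and values are taken from the commit message; everything
# around them in the real module is assumed.

MAX_CONTEXT_MESSAGES = 20            # was 10: better conversation coherence
MEMORY_RESPONSE_PREVIEW_LENGTH = 500  # was 200: richer memory storage
MAX_CONVERSATION_HISTORY = 100       # was 50: longer session continuity
SEARCH_HYBRID_MAX_RESULTS = 5        # was 2: better memory recall
```

With a flat-rate subscription there is no per-token cost to widening these limits, which is why every one of them moves upward in the same commit.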
```diff
@@ -38,7 +38,8 @@ _USE_AGENT_SDK = os.getenv("USE_AGENT_SDK", "true").lower() == "true"
 
 # Default models by provider
 _DEFAULT_MODELS = {
-    "claude": "claude-haiku-4-5-20251001",  # 12x cheaper than Sonnet!
+    "claude": "claude-haiku-4-5-20251001",  # For Direct API (pay-per-token)
+    "claude_agent_sdk": "claude-sonnet-4-5-20250929",  # For Agent SDK (flat-rate subscription)
     "glm": "glm-4-plus",
 }
 
@@ -58,9 +59,9 @@ class LLMInterface:
         self.api_key = api_key or os.getenv(
             _API_KEY_ENV_VARS.get(provider, ""),
         )
-        self.model = _DEFAULT_MODELS.get(provider, "")
         self.client: Optional[Anthropic] = None
         self.agent_sdk: Optional[Any] = None
+        # Model will be set after determining mode
 
         # Determine mode (priority: direct API > legacy server > agent SDK)
         if provider == "claude":
@@ -82,16 +83,25 @@ class LLMInterface:
         # Usage tracking (disabled when using Agent SDK or legacy server)
         self.tracker = UsageTracker() if (track_usage and self.mode == "direct_api") else None
 
+        # Set model based on mode
+        if provider == "claude":
+            if self.mode == "agent_sdk":
+                self.model = _DEFAULT_MODELS.get("claude_agent_sdk", "claude-sonnet-4-5-20250929")
+            else:
+                self.model = _DEFAULT_MODELS.get(provider, "claude-haiku-4-5-20251001")
+        else:
+            self.model = _DEFAULT_MODELS.get(provider, "")
+
         # Initialize based on mode
         if provider == "claude":
             if self.mode == "agent_sdk":
-                print(f"[LLM] Using Claude Agent SDK (Pro subscription)")
+                print(f"[LLM] Using Claude Agent SDK (flat-rate subscription) with model: {self.model}")
                 self.agent_sdk = AgentSDK()
             elif self.mode == "direct_api":
-                print(f"[LLM] Using Direct API (pay-per-token)")
+                print(f"[LLM] Using Direct API (pay-per-token) with model: {self.model}")
                 self.client = Anthropic(api_key=self.api_key)
             elif self.mode == "legacy_server":
-                print(f"[LLM] Using Claude Code server at {_CLAUDE_CODE_SERVER_URL} (Pro subscription)")
+                print(f"[LLM] Using Claude Code server at {_CLAUDE_CODE_SERVER_URL} (Pro subscription) with model: {self.model}")
                 # Verify server is running
                 try:
                     response = requests.get(f"{_CLAUDE_CODE_SERVER_URL}/", timeout=2)
```
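The mode-based model selection introduced in the patch can be isolated into a small standalone function. The `_DEFAULT_MODELS` keys and model IDs mirror the diff; the helper function and its name are illustrative, since the real code sets `self.model` inline inside `LLMInterface.__init__`.

```python
# Standalone sketch of the mode-based model selection from the patch.
# The dict mirrors _DEFAULT_MODELS in llm_interface.py; select_model is
# a hypothetical helper extracted for illustration.
_DEFAULT_MODELS = {
    "claude": "claude-haiku-4-5-20251001",             # Direct API (pay-per-token)
    "claude_agent_sdk": "claude-sonnet-4-5-20250929",  # Agent SDK (flat-rate)
    "glm": "glm-4-plus",
}

def select_model(provider: str, mode: str) -> str:
    """Pick the default model for a provider/mode pair."""
    if provider == "claude":
        if mode == "agent_sdk":
            # Flat-rate subscription: use the stronger Sonnet model.
            return _DEFAULT_MODELS.get("claude_agent_sdk", "claude-sonnet-4-5-20250929")
        # Pay-per-token paths (direct_api, legacy_server): stay on cheap Haiku.
        return _DEFAULT_MODELS.get(provider, "claude-haiku-4-5-20251001")
    # Non-Claude providers keep their single default.
    return _DEFAULT_MODELS.get(provider, "")
```

The design point is that model choice follows cost structure, not provider alone: the same `claude` provider resolves to Sonnet when usage is flat-rate and to Haiku when every token is billed.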
||||