Fix critical performance issues: thread pool exhaustion and tool tracking

Root Cause Analysis:
- delegate_task used run_in_executor with the default ThreadPoolExecutor (8-12 threads)
- Each delegation blocked one thread for 2-8 minutes (a full sub-agent conversation)
- After 6-8 parallel delegations the pool was exhausted → all work hung
- Tool tracking used hasattr(block, 'type'), but ToolUseBlock has no .type attribute

Changes:

1. mcp_tools.py: Replace thread pool with dedicated threads
   - Each delegate_task call creates a dedicated daemon thread with an isolated event loop
   - Uses asyncio.Future + loop.call_soon_threadsafe for result communication
   - Added a semaphore to limit concurrent delegations (4 max)
   - Eliminates pool exhaustion; parallel delegations are bounded by the semaphore, not by the shared pool

2. llm_interface.py: Fix tool tracking
   - Added TextBlock/ToolUseBlock imports from claude_agent_sdk
   - Replaced hasattr(block, 'type') checks with isinstance() checks
   - Fixes the tool_calls=0 bug (tools used are now tracked correctly)

3. agent.py: Event loop isolation and thread safety
   - Added a defensive sub_agent.llm._event_loop = None in spawn_sub_agent
   - Ensures sub-agents use the asyncio.run() fallback with isolated loops
   - Generates unique agent IDs with timestamps to prevent caching race conditions

Impact:
- Fixes the hang that appeared after 6-8 messages (no more 10-minute timeouts)
- Enables parallel sub-agent execution via delegate_task
- Tool tracking now reports accurate tool usage counts
- All sub-agents remain in Agent SDK mode (as required)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-03 20:48:43 -07:00
parent cc7e623d74
commit a8f3ed40a8
3 changed files with 2101 additions and 27 deletions
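
The mcp_tools.py change described in the commit message is not part of the diff shown here, so the following is only a minimal sketch of the pattern it names: a dedicated daemon thread per delegation, an isolated event loop inside that thread, an asyncio.Future resolved through loop.call_soon_threadsafe, and a semaphore capping concurrency at 4. The names delegate_in_thread, _resolve, fake_sub_agent and main are hypothetical and do not come from the repository.

```python
import asyncio
import threading
from typing import Any, Awaitable, Callable, List, Optional

MAX_CONCURRENT_DELEGATIONS = 4  # the limit stated in the commit message


def _resolve(future: asyncio.Future, result: Any,
             error: Optional[BaseException]) -> None:
    """Runs on the caller's loop; ignores futures that are already done."""
    if future.done():
        return
    if error is not None:
        future.set_exception(error)
    else:
        future.set_result(result)


async def delegate_in_thread(
    coro_factory: Callable[[], Awaitable[Any]],
    semaphore: asyncio.Semaphore,
) -> Any:
    """Run coro_factory() on its own thread and event loop (hypothetical helper)."""
    caller_loop = asyncio.get_running_loop()
    future = caller_loop.create_future()

    def worker() -> None:
        result, error = None, None
        try:
            # asyncio.run() creates a fresh event loop for this thread
            # and closes it when the coroutine finishes.
            result = asyncio.run(coro_factory())
        except Exception as exc:
            error = exc
        # Hand the outcome back to the caller's loop in a thread-safe way.
        caller_loop.call_soon_threadsafe(_resolve, future, result, error)

    async with semaphore:  # at most MAX_CONCURRENT_DELEGATIONS run at once
        threading.Thread(target=worker, daemon=True).start()
        return await future


async def fake_sub_agent(n: int) -> str:
    """Stand-in for a long sub-agent conversation."""
    await asyncio.sleep(0.05)
    return f"result-{n}"


async def main() -> List[str]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_DELEGATIONS)
    return await asyncio.gather(
        *(delegate_in_thread(lambda n=n: fake_sub_agent(n), semaphore)
          for n in range(8))
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the caller only awaits a future, no thread from the default executor is held while a delegation runs; the semaphore, not the pool size, decides how many delegations proceed in parallel.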
--- a/llm_interface.py
+++ b/llm_interface.py
@@ -24,6 +24,7 @@ from typing import Any, Dict, List, Optional, Set

 import requests
 from anthropic import Anthropic
+from claude_agent_sdk import TextBlock, ToolUseBlock
 from usage_tracker import UsageTracker

 logger = logging.getLogger(__name__)
@@ -607,12 +608,13 @@ class LLMInterface:
                            assistant_messages.append(message.content)
                        elif isinstance(message.content, list):
                            for block in message.content:
-                                if hasattr(block, 'type'):
-                                    if block.type == 'text' and hasattr(block, 'text'):
-                                        assistant_messages.append(block.text)
-                                    elif block.type == 'tool_use' and hasattr(block, 'name'):
-                                        tool_names.append(block.name)
-                                        self._last_tool_names = tool_names.copy()
+                                # Use isinstance() checks instead of hasattr(block, 'type')
+                                # ToolUseBlock dataclass has no .type attribute
+                                if isinstance(block, TextBlock):
+                                    assistant_messages.append(block.text)
+                                elif isinstance(block, ToolUseBlock):
+                                    tool_names.append(block.name)
+                                    self._last_tool_names = tool_names.copy()

                    if isinstance(message, ResultMessage):
                        # DEBUG: Log what we captured during message processing
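
To illustrate why the hunk above fixes the tool_calls=0 symptom, here is a small sketch that compares the two checks. The TextBlock and ToolUseBlock classes below are local stand-ins with assumed fields, modeled on the commit's statement that the block dataclasses carry no .type attribute; they are not imported from claude_agent_sdk, and the two collect_* functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class TextBlock:  # stand-in; fields are an assumption
    text: str


@dataclass
class ToolUseBlock:  # stand-in; fields are an assumption
    id: str
    name: str
    input: Dict[str, Any] = field(default_factory=dict)


def collect_by_type_attr(blocks: List[Any]) -> Tuple[List[str], List[str]]:
    """Old approach: silently skips any block that lacks a .type attribute."""
    texts, tools = [], []
    for block in blocks:
        if hasattr(block, "type"):
            if block.type == "text" and hasattr(block, "text"):
                texts.append(block.text)
            elif block.type == "tool_use" and hasattr(block, "name"):
                tools.append(block.name)
    return texts, tools


def collect_by_isinstance(blocks: List[Any]) -> Tuple[List[str], List[str]]:
    """New approach: dispatches on the block's class."""
    texts, tools = [], []
    for block in blocks:
        if isinstance(block, TextBlock):
            texts.append(block.text)
        elif isinstance(block, ToolUseBlock):
            tools.append(block.name)
    return texts, tools


blocks = [TextBlock("Looking that up."), ToolUseBlock("tu_1", "web_search")]
old_result = collect_by_type_attr(blocks)   # ([], [])
new_result = collect_by_isinstance(blocks)  # (['Looking that up.'], ['web_search'])
```

With the hasattr check, neither stand-in block passes the outer condition, so nothing is recorded and no error is raised, which matches the silent zero count the commit describes.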