ajarbot

jramos/ajarbot

Fork 0

Commit Graph

Author	SHA1	Message	Date
Jordan Ramos	916f86725d	feat: RSO observation system, child safety, Discord adapter, Telegram watchdog, email attachments Core agent improvements: - RSO (Relevance Scoring & Observation) system: interaction_logger, memory_scorer, signal_detector - Memory access logging (memory_access_log table) for relevance scoring; high-signal turn detection - Rich conversation storage for notable turns; compact_conversation truncates long user messages - Task-type classifier (query/action/analysis/creative) for observation tagging - Nested sub-agent visibility: deep delegations now register against the main agent's manager Child safety (Gabriel profile): - child_safety.py: filtering, audit logging, prompt constants for restricted sessions - .kiro/specs/child-safety-profile: requirements, design, tasks specs - GABRIEL_BOT_PROPOSAL.md: initial proposal doc - Reduced context window (10 msgs) and tutor-mode identity for restricted users Telegram adapter: - Polling watchdog: auto-restarts updater if polling drops unexpectedly - get_me() with exponential-backoff retry on NetworkError at startup - Correct stop() ordering: signal watchdog before cancelling tasks Email / Gmail: - send_email: supports file attachments (attachments list param) - get_email: surfaces attachment metadata in response Scheduled tasks / weather: - Remove OpenWeatherMap API calls from morning-weather task; use wttr.in exclusively - New scheduled tasks and scheduler state persistence Discord: - adapters/discord/__init__.py scaffold - discord-plugin: MCP plugin for Claude Code Discord integration (server.ts, skills, config) Infrastructure: - n8n workflow exports (garvis_webhook, content_pipeline variants) - memory_workspace: context, homelab-repo-updates, weekly observation summaries, error logs - UCS C240 migration plan doc - requirements.txt: new deps - .claude/settings.json, fix_hooks.py: hook/permission tuning	2026-04-23 07:54:01 -06:00
Jordan Ramos	8c039b6cad	Implement adaptive timeout system with activity-based loop detection Problem: Fixed 10-minute timeout kills legitimately slow operations (e.g., 5-minute web searches) while infinite loops waste resources. Solution: Dual-timeout strategy that distinguishes slow from stuck: 1. Idle timeout (5 min): No progress = kill - Tracks message_count growth via heartbeat - Only resets timer when count increases - Slow web searches keep progressing → allowed 2. Total timeout (15 min): Hard cap safety net - Prevents runaway tasks from consuming resources forever - Allows legitimately slow operations to complete 3. Loop detection: Kills after 5+ errors - Tracks error_count and last_error - Detects repetitive failures quickly - Independent of time-based checks Key Changes: - SubAgentState: Add message_count, error_count tracking fields - SubAgentManager.__init__: Dual timeout params (idle=300s, total=900s) - SubAgentManager.update_activity: Accepts message_count, smart timer reset - SubAgentManager.update_error: NEW - tracks errors for loop detection - SubAgentManager.get_hung_agents: 3-check system (idle/total/loop) - SubAgentManager.cleanup_agent: Detailed error messages by type - agent.py heartbeat: Passes sub_agent.llm.message_count every 10s - mcp_tools._DELEGATE_TIMEOUT: Increased to 900s (15 min) Impact: - Slow operations (5-12 min with progress) complete successfully - Infinite loops killed in <5 min via idle timeout or error detection - Clear diagnostics: "Idle timeout: No progress for 305s (23 messages)" - Zero config needed - adaptive behavior works automatically Example: CVE research taking 5 min with 117 messages now completes instead of timing out at 10 min. Loop with repeated errors killed at 3 min. See ADAPTIVE_TIMEOUT_SYSTEM.md for full specification and scenarios. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-04 17:40:58 -07:00
Jordan Ramos	e909cc0044	Add MCP delegation bridge and diagram tools Features Added: 1. Agent Registry (agent_registry.py) - Thread-safe global singleton for MCP tool access to Agent instance - Enables MCP tools to call Agent.delegate() without circular imports - Registered at bot startup in bot_runner.py 2. Sub-Agent Manager (sub_agent_manager.py) - Watchdog system monitoring sub-agent lifecycle - Detects hung agents (5min timeout, 30s check interval) - Auto-cleanup and status tracking 3. delegate_task MCP Tool (mcp_tools.py) - Exposes Agent.delegate() to Claude via MCP protocol - Enables parallel sub-agent execution via tool calls - Supports specialist prompts and agent ID caching 4. Memory Write Locks (memory_system.py) - Thread-safe writes to prevent file corruption - Protects write_memory(), update_soul(), update_user() 5. Diagram Tools - Mermaid MCP server (flowcharts, sequence diagrams, etc.) - Excalidraw MCP server (hand-drawn style diagrams) - Config files in config/ directory 6. Adapter Improvements - Enhanced error handling across all adapters - Unified logging patterns Testing: Ready for parallel sub-agent testing Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-03-01 14:34:24 -07:00

Author

SHA1

Message

Date

Jordan Ramos

916f86725d

feat: RSO observation system, child safety, Discord adapter, Telegram watchdog, email attachments

Core agent improvements:
- RSO (Relevance Scoring & Observation) system: interaction_logger, memory_scorer, signal_detector
- Memory access logging (memory_access_log table) for relevance scoring; high-signal turn detection
- Rich conversation storage for notable turns; compact_conversation truncates long user messages
- Task-type classifier (query/action/analysis/creative) for observation tagging
- Nested sub-agent visibility: deep delegations now register against the main agent's manager

Child safety (Gabriel profile):
- child_safety.py: filtering, audit logging, prompt constants for restricted sessions
- .kiro/specs/child-safety-profile: requirements, design, tasks specs
- GABRIEL_BOT_PROPOSAL.md: initial proposal doc
- Reduced context window (10 msgs) and tutor-mode identity for restricted users

Telegram adapter:
- Polling watchdog: auto-restarts updater if polling drops unexpectedly
- get_me() with exponential-backoff retry on NetworkError at startup
- Correct stop() ordering: signal watchdog before cancelling tasks

Email / Gmail:
- send_email: supports file attachments (attachments list param)
- get_email: surfaces attachment metadata in response

Scheduled tasks / weather:
- Remove OpenWeatherMap API calls from morning-weather task; use wttr.in exclusively
- New scheduled tasks and scheduler state persistence

Discord:
- adapters/discord/__init__.py scaffold
- discord-plugin: MCP plugin for Claude Code Discord integration (server.ts, skills, config)

Infrastructure:
- n8n workflow exports (garvis_webhook, content_pipeline variants)
- memory_workspace: context, homelab-repo-updates, weekly observation summaries, error logs
- UCS C240 migration plan doc
- requirements.txt: new deps
- .claude/settings.json, fix_hooks.py: hook/permission tuning

2026-04-23 07:54:01 -06:00

Jordan Ramos

8c039b6cad

Implement adaptive timeout system with activity-based loop detection

**Problem**: Fixed 10-minute timeout kills legitimately slow operations
(e.g., 5-minute web searches) while infinite loops waste resources.

**Solution**: Dual-timeout strategy that distinguishes slow from stuck:

1. **Idle timeout** (5 min): No progress = kill
   - Tracks message_count growth via heartbeat
   - Only resets timer when count increases
   - Slow web searches keep progressing → allowed

2. **Total timeout** (15 min): Hard cap safety net
   - Prevents runaway tasks from consuming resources forever
   - Allows legitimately slow operations to complete

3. **Loop detection**: Kills after 5+ errors
   - Tracks error_count and last_error
   - Detects repetitive failures quickly
   - Independent of time-based checks

**Key Changes**:
- SubAgentState: Add message_count, error_count tracking fields
- SubAgentManager.__init__: Dual timeout params (idle=300s, total=900s)
- SubAgentManager.update_activity: Accepts message_count, smart timer reset
- SubAgentManager.update_error: NEW - tracks errors for loop detection
- SubAgentManager.get_hung_agents: 3-check system (idle/total/loop)
- SubAgentManager.cleanup_agent: Detailed error messages by type
- agent.py heartbeat: Passes sub_agent.llm.message_count every 10s
- mcp_tools._DELEGATE_TIMEOUT: Increased to 900s (15 min)

**Impact**:
- Slow operations (5-12 min with progress) complete successfully
- Infinite loops killed in <5 min via idle timeout or error detection
- Clear diagnostics: "Idle timeout: No progress for 305s (23 messages)"
- Zero config needed - adaptive behavior works automatically

**Example**: CVE research taking 5 min with 117 messages now completes
instead of timing out at 10 min. Loop with repeated errors killed at 3 min.

See ADAPTIVE_TIMEOUT_SYSTEM.md for full specification and scenarios.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-03-04 17:40:58 -07:00

Jordan Ramos

e909cc0044

Add MCP delegation bridge and diagram tools

**Features Added**:

1. **Agent Registry (agent_registry.py)**
   - Thread-safe global singleton for MCP tool access to Agent instance
   - Enables MCP tools to call Agent.delegate() without circular imports
   - Registered at bot startup in bot_runner.py

2. **Sub-Agent Manager (sub_agent_manager.py)**
   - Watchdog system monitoring sub-agent lifecycle
   - Detects hung agents (5min timeout, 30s check interval)
   - Auto-cleanup and status tracking

3. **delegate_task MCP Tool (mcp_tools.py)**
   - Exposes Agent.delegate() to Claude via MCP protocol
   - Enables parallel sub-agent execution via tool calls
   - Supports specialist prompts and agent ID caching

4. **Memory Write Locks (memory_system.py)**
   - Thread-safe writes to prevent file corruption
   - Protects write_memory(), update_soul(), update_user()

5. **Diagram Tools**
   - Mermaid MCP server (flowcharts, sequence diagrams, etc.)
   - Excalidraw MCP server (hand-drawn style diagrams)
   - Config files in config/ directory

6. **Adapter Improvements**
   - Enhanced error handling across all adapters
   - Unified logging patterns

**Testing**: Ready for parallel sub-agent testing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-03-01 14:34:24 -07:00

3 Commits