Jordan Ramos 0271dea551 Add comprehensive structured logging system
Features:
- JSON-formatted logs for easy parsing and analysis
- Rotating log files (prevents disk space issues)
  * ajarbot.log: All events, 10MB rotation, 5 backups
  * errors.log: Errors only, 5MB rotation, 3 backups
  * tools.log: Tool execution tracking, 10MB rotation, 3 backups

Tool Execution Tracking:
- Every tool call logged with inputs, outputs, duration
- Success/failure status tracking
- Performance metrics (execution time in milliseconds)
- Error messages captured with full context

Logging Integration:
- tools.py: All tool executions automatically logged
- Structured logger classes with context preservation
- Console output (human-readable) + file logs (JSON)
- Separate error log for quick issue identification

Log Analysis:
- JSON format enables programmatic analysis
- Easy to search for patterns (max tokens, iterations, etc.)
- Performance tracking (slow tools, failure rates)
- Historical debugging with full context

Documentation:
- LOGGING.md: Complete usage guide
- Log analysis examples with jq commands
- Error pattern reference
- Maintenance and integration instructions

Benefits:
- Quick error diagnosis with separate errors.log
- Performance monitoring and optimization
- Historical analysis for troubleshooting
- Automatic log rotation (max ~120MB total)

Updated .gitignore to exclude logs/ directory

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 16:32:18 -07:00


Structured Logging System

Overview

Ajarbot now includes a comprehensive structured logging system to track errors, tool executions, and system behavior.

Log Files

All logs are stored in the logs/ directory (gitignored):

1. ajarbot.log - Main Application Log

  • Format: JSON (one record per line)
  • Level: DEBUG and above
  • Size: Rotates at 10MB, keeps 5 backups
  • Contents: All application events, tool executions, LLM calls

2. errors.log - Error-Only Log

  • Format: JSON
  • Level: ERROR and CRITICAL only
  • Size: Rotates at 5MB, keeps 3 backups
  • Contents: Only errors and critical issues for quick diagnosis

3. tools.log - Tool Execution Log

  • Format: JSON
  • Level: INFO and above
  • Size: Rotates at 10MB, keeps 3 backups
  • Contents: Every tool call with inputs, outputs, duration, and success/failure
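
The three handlers above can be wired up with the standard library alone. A minimal sketch (the `JsonFormatter` and `add_rotating_json_handler` names here are illustrative, not the actual `logging_config` internals):

```python
import json
import logging
from logging.handlers import RotatingFileHandler

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""
    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "extra": getattr(record, "extra_fields", {}),
        }
        return json.dumps(payload)

def add_rotating_json_handler(logger, path, level, max_bytes, backups):
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setLevel(level)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)

# Wire up the three files described above.
root = logging.getLogger("ajarbot")
root.setLevel(logging.DEBUG)
add_rotating_json_handler(root, "ajarbot.log", logging.DEBUG, 10 * 1024 * 1024, 5)
add_rotating_json_handler(root, "errors.log", logging.ERROR, 5 * 1024 * 1024, 3)
add_rotating_json_handler(root, "tools.log", logging.INFO, 10 * 1024 * 1024, 3)
```

Each handler filters by its own level, so an ERROR record lands in all three files while a DEBUG record only reaches ajarbot.log.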

Log Format

JSON Structure

{
  "timestamp": "2026-02-16T12:34:56.789Z",
  "level": "ERROR",
  "logger": "tools",
  "message": "Tool failed: permanent_note",
  "module": "tools",
  "function": "execute_tool",
  "line": 500,
  "extra": {
    "tool_name": "permanent_note",
    "inputs": {"title": "Test", "content": "..."},
    "success": false,
    "error": "Unknown tool error",
    "duration_ms": 123.45
  }
}

Tool Log Example

{
  "timestamp": "2026-02-16T06:00:15.234Z",
  "level": "INFO",
  "logger": "tools",
  "message": "Tool executed: get_weather",
  "extra": {
    "tool_name": "get_weather",
    "inputs": {"location": "Centennial, CO"},
    "success": true,
    "result_length": 456,
    "duration_ms": 1234.56
  }
}

Usage in Code

Get a Logger

from logging_config import get_logger, get_tool_logger

# General logger
logger = get_logger("my_module")

# Specialized tool logger
tool_logger = get_tool_logger()

Logging Methods

Basic logging:

logger.debug("Detailed debug info", key="value")
logger.info("Informational message", user_id=123)
logger.warning("Warning message", issue="something")
logger.error("Error occurred", exc_info=True, error_code="E001")
logger.critical("Critical system failure", exc_info=True)
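
Stdlib loggers don't accept arbitrary keyword fields like `user_id=123`, so `get_logger` presumably returns a thin wrapper. A sketch of what such a wrapper might look like (the `StructuredLogger` class and `extra_fields` attribute are assumptions, not the real implementation):

```python
import logging

class StructuredLogger:
    """Wrap a stdlib logger so extra keyword fields travel with each record."""
    def __init__(self, name):
        self._logger = logging.getLogger(name)

    def _log(self, level, message, exc_info=False, **fields):
        # Stash the structured fields on the record; a JSON formatter can
        # then emit them under the "extra" key.
        self._logger.log(level, message, exc_info=exc_info,
                         extra={"extra_fields": fields})

    def debug(self, message, **fields):
        self._log(logging.DEBUG, message, **fields)

    def info(self, message, **fields):
        self._log(logging.INFO, message, **fields)

    def warning(self, message, **fields):
        self._log(logging.WARNING, message, **fields)

    def error(self, message, exc_info=False, **fields):
        self._log(logging.ERROR, message, exc_info=exc_info, **fields)

    def critical(self, message, exc_info=False, **fields):
        self._log(logging.CRITICAL, message, exc_info=exc_info, **fields)
```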

Tool execution logging:

tool_logger.log_tool_call(
    tool_name="permanent_note",
    inputs={"title": "Test", "content": "..."},
    success=True,
    result="Created note successfully",
    duration_ms=123.45
)
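
The `duration_ms` values come from timing the call site. How `tools.py` might wrap an execution (a sketch; `run_and_log` and the `execute` callable are illustrative names, not the actual `execute_tool` code):

```python
import time

def run_and_log(tool_logger, tool_name, inputs, execute):
    """Execute a tool callable and log the outcome with its duration."""
    start = time.perf_counter()
    try:
        result = execute(**inputs)
    except Exception as exc:
        duration_ms = (time.perf_counter() - start) * 1000
        tool_logger.log_tool_call(tool_name=tool_name, inputs=inputs,
                                  success=False, error=str(exc),
                                  duration_ms=duration_ms)
        raise  # still surface the failure to the caller
    duration_ms = (time.perf_counter() - start) * 1000
    tool_logger.log_tool_call(tool_name=tool_name, inputs=inputs,
                              success=True, result=result,
                              duration_ms=duration_ms)
    return result
```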

Analyzing Logs

View Recent Errors

# Last 20 errors
tail -20 logs/errors.log | jq .

# Errors from specific module
grep '"module": "tools"' logs/errors.log | jq .

Tool Performance Analysis

# Average tool execution time (ms), skipping records without a duration
jq -r 'select(.extra.duration_ms != null) | .extra.duration_ms' logs/tools.log | awk '{sum+=$1; count++} END {if (count) print sum/count}'

# Failed tools
grep '"success": false' logs/tools.log | jq -r '.extra.tool_name' | sort | uniq -c

# Slowest tool calls
jq -r 'select(.extra.duration_ms != null) | [.extra.tool_name, .extra.duration_ms] | @csv' logs/tools.log | sort -t, -k2 -rn | head -10

Find Specific Errors

# Max token errors
grep -i "max.*token" logs/errors.log | jq .

# Tool iteration limits
grep -i "iteration.*exceeded" logs/ajarbot.log | jq .

# MCP tool failures
grep '"tool_name": "permanent_note"' logs/tools.log | grep '"success": false' | jq .

Error Patterns to Watch

  1. Max Tool Iterations - Search: "iteration.*exceeded"
  2. Max Tokens - Search: "max.*token"
  3. MCP Tool Failures - Search: "Unknown tool" or failed MCP tool names
  4. Slow Tools - Tools taking > 5000ms
  5. Repeated Failures - Same tool failing multiple times
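
These patterns can also be scanned programmatically. A sketch that flags slow calls and repeated failures from tools.log records (the function name and thresholds here just follow the list above):

```python
import json
from collections import Counter

SLOW_MS = 5000  # "Slow Tools" threshold from the list above

def scan_tool_log(lines, slow_ms=SLOW_MS):
    """Return (slow_calls, failure_counts) from JSON-per-line tool records."""
    slow_calls = []
    failures = Counter()
    for line in lines:
        rec = json.loads(line)
        extra = rec.get("extra", {})
        name = extra.get("tool_name")
        if extra.get("duration_ms") and extra["duration_ms"] > slow_ms:
            slow_calls.append((name, extra["duration_ms"]))
        if extra.get("success") is False:
            failures[name] += 1
    return slow_calls, failures
```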

Maintenance

Log Rotation

Logs automatically rotate when they reach size limits:

  • ajarbot.log: 10MB active file + 5 backups (60MB max)
  • errors.log: 5MB active file + 3 backups (20MB max)
  • tools.log: 10MB active file + 3 backups (40MB max)

Total max disk usage: ~120MB
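
The rotation behavior is easy to verify in isolation with a deliberately tiny size limit (a throwaway sketch, not part of `logging_config`):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Demonstrate that backupCount caps disk usage: with maxBytes=200 and
# backupCount=2, at most demo.log, demo.log.1, and demo.log.2 ever exist.
logdir = tempfile.mkdtemp()
path = os.path.join(logdir, "demo.log")
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.addHandler(RotatingFileHandler(path, maxBytes=200, backupCount=2))

for i in range(50):
    logger.info("event %d padded to force rotation ........", i)
```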

Manual Cleanup

# Remove old logs
rm logs/*.log.*

# Clear all logs (careful!)
rm logs/*.log

Integration

Automatic Integration

The logging system is automatically integrated into:

  • tools.py - All tool executions logged
  • Console output - Human-readable format
  • File logs - JSON format for parsing

Adding Logging to New Modules

from logging_config import get_logger

logger = get_logger(__name__)

def my_function():
    logger.info("Starting operation", operation_id=123)
    try:
        # Do work
        logger.debug("Step completed", step=1)
    except Exception:
        logger.error("Operation failed", exc_info=True, operation_id=123)
        raise

Benefits

  1. Quick Error Diagnosis: Separate errors.log for immediate issue identification
  2. Performance Tracking: Tool execution times and success rates
  3. Historical Analysis: JSON format enables programmatic analysis
  4. Debugging: Full context with inputs, outputs, and stack traces
  5. Monitoring: Easy to parse logs for alerting systems

Future Enhancements

  • Web dashboard for log visualization
  • Real-time log streaming via WebSocket
  • Automatic error rate alerts (email/Telegram)
  • Integration with external monitoring (Datadog, CloudWatch)
  • Log aggregation for multi-instance deployments

Last Updated: 2026-02-16
Log System Version: 1.0