---
version: 2.2.0
last_updated: 2025-12-07
infrastructure_source: CLAUDE_STATUS.md
repository_type: homelab
primary_node: serviceslab
proxmox_version: 8.3.3
vm_count: 10
lxc_count: 4
working_directory: /home/jramos/homelab
git_remote: http://192.168.2.102:3060/jramos/homelab.git
---

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Repository Overview

This is a homelab infrastructure repository managing a Proxmox VE 8.3.3-based services and development laboratory environment. The infrastructure follows a hybrid architecture pattern combining traditional virtualization (KVM/QEMU) with containerization (LXC) for optimal resource utilization and service isolation.

## Quick Reference

| Resource | Value |
|----------|-------|
| **Proxmox Node** | serviceslab (192.168.2.200:8006) |
| **Proxmox Version** | PVE 8.3.3 |
| **Infrastructure** | 10 VMs, 4 LXC containers |
| **Monitoring** | http://192.168.2.114:3000 (Grafana) |
| **Version Control** | Gitea at 192.168.2.102:3060 |
| **Working Directory** | /home/jramos/homelab |
| **Live Status** | See `CLAUDE_STATUS.md` for current inventory |

**Key Services:**
- VM 101 (monitoring-docker): Grafana, Prometheus, PVE Exporter
- CT 102 (nginx): Nginx Proxy Manager (reverse proxy)
- CT 112 (twingate-connector): Zero-trust network access
- CT 113 (n8n): Workflow automation at 192.168.2.107

## Agent Selection Guide

When working with this repository, choose the appropriate agent based on task type:

| Task Type | Primary Agent | Tools Available | Notes |
|-----------|---------------|-----------------|-------|
| **Git Operations** | `librarian` | Bash, Read, Grep, Edit, Write | Commits, branches, merges, .gitignore |
| **Documentation** | `scribe` | Read, Grep, Glob, Edit, Write | READMEs, architecture docs, diagrams |
| **Infrastructure Ops** | `lab-operator` | Bash, Read, Grep, Glob, Edit, Write | Proxmox, Docker, networking, storage |
| **Code/IaC Development** | `backend-builder` | Bash, Read, Grep, Glob, Edit, Write | Ansible, Terraform, Python, Shell |
| **File Creation** | Main Agent | All tools | Use when sub-agents lack specific tools |
| **Complex Multi-Agent Tasks** | Main Agent | All tools | Coordinates between specialized agents |

### Task Routing Decision Tree

```
Is this a git/version control task?
├── Yes → Use librarian
└── No ↓

Is this documentation (README, guides, diagrams)?
├── Yes → Use scribe
└── No ↓

Does this require system commands (docker, ssh, proxmox)?
├── Yes → Use lab-operator
└── No ↓

Is this code/config creation (Ansible, Python, Terraform)?
├── Yes → Use backend-builder
└── No → Use Main Agent
```

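
The decision tree above is an ordered chain of checks in which the first matching question wins. A minimal sketch of that ordering follows. The `Task` class, its four flags, and the `"main-agent"` return value are hypothetical names introduced only for illustration; nothing in this repository defines them.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task descriptor; each flag mirrors one question in the tree."""
    is_git: bool = False
    is_documentation: bool = False
    needs_system_commands: bool = False
    is_code_or_config: bool = False


def route(task: Task) -> str:
    """Return the agent for a task, asking the questions in the tree's order."""
    if task.is_git:
        return "librarian"
    if task.is_documentation:
        return "scribe"
    if task.needs_system_commands:
        return "lab-operator"
    if task.is_code_or_config:
        return "backend-builder"
    # No question matched: fall through to the main agent.
    return "main-agent"
```

Because the checks run top to bottom, a task that is both a git operation and a documentation change goes to `librarian`, which matches the precedence the tree implies.
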
### Agent Collaboration Patterns

**Documentation Workflow:**
1. `backend-builder` or `lab-operator` creates/modifies infrastructure
2. `scribe` updates documentation
3. `librarian` commits all changes

**Infrastructure Deployment:**
1. `backend-builder` writes IaC (Ansible/Terraform/Compose)
2. `lab-operator` deploys to Proxmox/Docker
3. `scribe` documents deployment
4. `librarian` commits configuration

## Infrastructure Overview

**For detailed, current infrastructure inventory, see:**
- **Live Status**: `CLAUDE_STATUS.md` (most current)
- **Service Details**: `services/README.md`
- **Complete Index**: `INDEX.md`

**Quick Summary:**
- **VMs**: 10 total (IDs: 100, 101, 104-111)
- **LXC Containers**: 4 total (IDs: 102, 103, 112, 113)
- **Storage Pools**: local, local-lvm, Vault (ZFS), PBS-Backups, iso-share
- **Monitoring**: VM 101 at 192.168.2.114 (Grafana/Prometheus/PVE Exporter)

**Note**: Infrastructure details change frequently. Always reference `CLAUDE_STATUS.md` for accurate counts, IPs, and status.

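
Since the counts and ID lists in the quick summary are maintained by hand, a small consistency check can catch drift between them. The sketch below expands the ID notation used above (single IDs and inclusive `start-end` ranges) and compares the result with the stated totals. `expand_ids` is a hypothetical helper, not something that exists in this repository.

```python
def expand_ids(spec: str) -> list[int]:
    """Expand a spec such as '100, 101, 104-111' into a list of integer IDs."""
    ids: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # 'start-end' denotes an inclusive range.
            start, end = part.split("-")
            ids.extend(range(int(start), int(end) + 1))
        else:
            ids.append(int(part))
    return ids


vm_ids = expand_ids("100, 101, 104-111")
lxc_ids = expand_ids("102, 103, 112, 113")

# The summary states 10 VMs and 4 LXC containers, with no ID used twice.
assert len(vm_ids) == 10
assert len(lxc_ids) == 4
assert not set(vm_ids) & set(lxc_ids)
```
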
### Architecture Patterns & Design Decisions

**Tiered Application Architecture**: The infrastructure implements a classic three-tier design with dedicated web servers (109, 110), a database server (111), and a reverse proxy (102), suggesting this lab is used for practicing production-like deployments.

**Automation-First Approach**: The presence of Ansible-Control (106), Gitea (100), and NetBox (103) indicates a focus on Infrastructure as Code and proper documentation practices, which is rather civilized.

**Network Simulation Capability**: CML (108) suggests network engineering activities, possibly testing configurations before production deployment.

**Container Strategy**: The selective use of LXC for stateless or lightweight services (nginx, netbox, twingate, n8n) versus full VMs for complex applications demonstrates thoughtful resource optimization.

**Monitoring & Observability**: The dedicated monitoring VM (101) with Grafana, Prometheus, and PVE Exporter provides comprehensive infrastructure visibility, enabling proactive capacity planning and performance optimization.

**Zero-Trust Security**: The Twingate connector (CT 112) demonstrates modern security practices, providing secure remote access without traditional VPN complexity.

**Backup Strategy**: PBS-Backups utilization is at 27.43% (see `CLAUDE_STATUS.md` for current metrics). Automated daily incremental backups with weekly full backups ensure data protection across all VMs and containers.

## Working with This Environment

### Universal Workflow
For every complex task, every Agent must follow this loop:
1. **Read**: `cat CLAUDE_STATUS.md` to see where we are.
2. **Execute**: Perform your specific task (Coding, Docs, Sysadmin).
3. **Update**: Edit `CLAUDE_STATUS.md` to mark your step as `[x]` and update the "Current Context".

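
The **Update** step of the loop amounts to ticking one checklist item in the status file. The sketch below shows one way to do that on the file's text. It assumes checklist items use the common Markdown form `- [ ] task`, which this document does not specify, and `mark_step_done` is a hypothetical helper.

```python
import re


def mark_step_done(status_text: str, step: str) -> str:
    """Tick the first unchecked '- [ ]' item whose text contains `step`."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        # Only unchecked items are candidates; already-ticked ones are skipped.
        if re.match(r"\s*- \[ \] ", line) and step in line:
            lines[i] = line.replace("[ ]", "[x]", 1)
            break
    # splitlines() drops a trailing newline, so restore it if it was present.
    trailing = "\n" if status_text.endswith("\n") else ""
    return "\n".join(lines) + trailing
```

An agent would read `CLAUDE_STATUS.md`, pass its contents and a fragment of the step name to this function, and write the result back. Updating the "Current Context" remains a separate, manual edit.
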
### Status File Template
If `CLAUDE_STATUS.md` is missing or corrupted, recover it from the latest disaster recovery export:
- **Location**: `disaster-recovery/homelab-export-YYYYMMDD-HHMMSS/CLAUDE_STATUS.md`
- **Alternative**: Use the scribe agent to recreate from current infrastructure state

**Minimum required structure:**
```markdown
# Homelab Infrastructure Status
**Last Updated**: YYYY-MM-DD HH:MM:SS
**Export Reference**: disaster-recovery/homelab-export-YYYYMMDD-HHMMSS

## Current Infrastructure Snapshot
- Proxmox VE 8.3.3 on serviceslab (192.168.2.200)
- 10 VMs, 4 LXC containers

## Current Initiative
**Goal**: [Initiative description]
**Phase**: [Planning / Implementation / Testing]
**Progress Checklist**: [Task list with checkboxes]

## Recent Infrastructure Changes
[Chronological log of changes with dates]
```

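
Recovering the status file means locating the most recent export directory. Because the `YYYYMMDD-HHMMSS` suffix is fixed-width, sorting the directory names as plain strings also sorts them chronologically. The following sketch relies on that; `latest_status_backup` is a hypothetical helper, and it assumes the exports sit directly under the `disaster-recovery/` directory as the location pattern above suggests.

```python
from pathlib import Path
from typing import Optional


def latest_status_backup(root: Path) -> Optional[Path]:
    """Return CLAUDE_STATUS.md from the newest homelab-export-* directory, if any."""
    candidates = [
        p for p in root.glob("homelab-export-*/CLAUDE_STATUS.md") if p.is_file()
    ]
    if not candidates:
        return None
    # Fixed-width timestamps mean the greatest name is the newest export.
    return max(candidates, key=lambda p: p.parent.name)
```

Calling `latest_status_backup(Path("disaster-recovery"))` from the repository root returns the path to copy back, or `None` when no export contains a status file, in which case the scribe alternative applies.
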
### Access Patterns

- **Proxmox Web UI**: Primary management interface for VM/CT lifecycle operations
- **Ansible**: Automated configuration deployment and orchestration
- **Gitea**: CI/CD pipelines for infrastructure testing and deployment
- **NetBox**: Network documentation and IP address management

### Maintenance Considerations

- **Uptime**: Track uptime metrics in disaster recovery exports for trend analysis
- **Storage Growth**: PBS-Backups at 27.43%, Vault at 10.88%, local at 15.13% (see CLAUDE_STATUS.md for current metrics)
- **Capacity Planning**: Current utilization suggests comfortable headroom; monitor trends via Proxmox metrics in monitoring-docker (101)

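
To make "comfortable headroom" concrete, the utilization figures above can be checked against an alert threshold. In the sketch below, the percentages are copied from this section, while the 80% default threshold and the `pools_over` helper are assumptions made for illustration, not settings taken from this homelab.

```python
# Utilization figures quoted in this section (percent of pool capacity).
USAGE_PERCENT = {"PBS-Backups": 27.43, "Vault": 10.88, "local": 15.13}


def pools_over(usage: dict[str, float], threshold: float = 80.0) -> list[str]:
    """Return the names of pools at or above the threshold, sorted by name."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)


# With the assumed 80% threshold, no pool currently needs attention.
assert pools_over(USAGE_PERCENT) == []
```

In practice the figures would come from `CLAUDE_STATUS.md` or the monitoring stack rather than a hard-coded dictionary.
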
## Development Setup

The repository structure will house:
- Ansible playbooks and roles for infrastructure automation
- Terraform/OpenTofu configurations for Proxmox resource provisioning
- Docker Compose files for service definitions
- Documentation and runbooks for common operations
- Network diagrams and architecture documentation

## Notes

- This is a Windows Subsystem for Linux (WSL2) environment
- Working directory: /home/jramos/homelab
- Proxmox node `serviceslab` is the single point of management
- Infrastructure demonstrates production-like patterns suitable for learning and testing