homelab/CLAUDE_STATUS.md
Jordan Ramos f42eeaba92 feat(docs): update documentation for monitoring stack and infrastructure changes
- Update INDEX.md with VM 101 (monitoring-docker) and CT 112 (twingate-connector)
- Update README.md with monitoring and security sections
- Update CLAUDE.md with new architecture patterns
- Update services/README.md with monitoring stack documentation
- Update CLAUDE_STATUS.md with current infrastructure state
- Update infrastructure counts: 10 VMs, 4 Containers
- Update storage stats: PBS 27.43%, Vault 10.88%
- Create comprehensive monitoring/README.md
- Add .gitignore rules for monitoring sensitive files (pve.yml, .env)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 12:41:08 -07:00


Homelab Infrastructure Status

Last Updated: 2025-12-07 12:00:40
Export Reference: disaster-recovery/homelab-export-20251207-120040

Current Infrastructure Snapshot

Proxmox Environment

  • Node: serviceslab
  • Version: Proxmox VE 8.3.3
  • Management IP: 192.168.2.200
  • Architecture: Single-node cluster
  • Total Resources: 10 VMs, 4 LXC Containers

Virtual Machines (QEMU/KVM) - 10 VMs

| VM ID | Name | IP Address | Status | Purpose |
|-------|------|------------|--------|---------|
| 100 | docker-hub | 192.168.2.XXX | Running | Container registry/Docker hub mirror |
| 101 | monitoring-docker | 192.168.2.114 | Running | Monitoring stack (Grafana/Prometheus/PVE Exporter) |
| 104 | ubuntu-dev | - | Stopped | Ubuntu development environment |
| 105 | dev | - | Stopped | General-purpose development workstation |
| 106 | Ansible-Control | 192.168.2.XXX | Running | IaC orchestration, configuration management |
| 107 | ubuntu-docker | - | Stopped | Ubuntu Docker host |
| 108 | CML | - | Stopped | Cisco Modeling Labs - network simulation |
| 109 | web-server-01 | 192.168.2.XXX | Running | Web application server (clustered) |
| 110 | web-server-02 | 192.168.2.XXX | Running | Load-balanced pair with web-server-01 |
| 111 | db-server-01 | 192.168.2.XXX | Running | Backend database server |

Recent Changes:

  • Added VM 101 (monitoring-docker) for dedicated monitoring infrastructure
  • Removed gitlab, the previous holder of VM ID 101 (service decommissioned); the ID was reused for monitoring-docker

Containers (LXC) - 4 Containers

| CT ID | Name | IP Address | Status | Purpose |
|-------|------|------------|--------|---------|
| 102 | nginx | 192.168.2.101 | Running | Reverse proxy/load balancer & NPM |
| 103 | netbox | 192.168.2.XXX | Stopped | Network documentation/IPAM |
| 112 | twingate-connector | 192.168.2.XXX | Running | Zero-trust network access connector |
| 113 | n8n | 192.168.2.107 | Running | Workflow automation platform |

Recent Changes:

  • Added CT 112 (twingate-connector) for zero-trust network security
  • Added CT 113 (n8n) for workflow automation
  • Removed Anytype, the previous holder of CT ID 112; its role was taken over by n8n (CT 113) and the ID was reused for twingate-connector
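
Because VM and CT IDs share one namespace on a Proxmox node and IDs 101 and 112 have been reused, it can help to keep the inventory in a form that is easy to tally. The sketch below transcribes the two tables above into a list and counts guests by type and status; the `INVENTORY` structure and the `summarize` helper are illustrative names, not part of the repository.

```python
from collections import Counter

# Inventory transcribed from the VM and container tables above.
# Each entry is (ID, name, kind, status); kind is "vm" or "ct".
INVENTORY = [
    (100, "docker-hub", "vm", "running"),
    (101, "monitoring-docker", "vm", "running"),
    (104, "ubuntu-dev", "vm", "stopped"),
    (105, "dev", "vm", "stopped"),
    (106, "Ansible-Control", "vm", "running"),
    (107, "ubuntu-docker", "vm", "stopped"),
    (108, "CML", "vm", "stopped"),
    (109, "web-server-01", "vm", "running"),
    (110, "web-server-02", "vm", "running"),
    (111, "db-server-01", "vm", "running"),
    (102, "nginx", "ct", "running"),
    (103, "netbox", "ct", "stopped"),
    (112, "twingate-connector", "ct", "running"),
    (113, "n8n", "ct", "running"),
]


def summarize(inventory):
    """Count guests per (kind, status) and reject duplicate IDs."""
    ids = [guest_id for guest_id, _, _, _ in inventory]
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate guest IDs: {duplicates}")
    return Counter((kind, status) for _, _, kind, status in inventory)


if __name__ == "__main__":
    for (kind, status), n in sorted(summarize(INVENTORY).items()):
        print(f"{kind} {status}: {n}")
```

Totals from this tally should match the documented scale of 10 VMs and 4 containers; a mismatch suggests one of the tables has drifted.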

Storage Architecture

| Storage Pool | Type | Total | Used | % Used | Purpose |
|--------------|------|-------|------|--------|---------|
| local | Directory | - | - | 15.13% | System files, ISOs, templates |
| local-lvm | LVM-Thin | - | - | 0.0% | VM disk images (thin provisioned) |
| Vault | NFS/Directory | - | - | 10.88% | Secure storage for sensitive data |
| PBS-Backups | PBS | - | - | 27.43% | Automated backup repository |
| iso-share | NFS/CIFS | - | - | 1.4% | Installation media library |
| localnetwork | Network Share | - | - | N/A | Shared resources across infrastructure |

Capacity Notes:

  • PBS-Backups utilization increased to 27.43% (healthy retention)
  • Vault utilization decreased to 10.88% (space optimization)
  • local storage at 15.13% (system overhead normal)
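
The capacity notes can be turned into a simple threshold check. This is a minimal sketch using the percentages from the storage table; the `pools_over` helper and the 80% warning level are assumptions chosen for illustration, not values taken from the homelab configuration.

```python
# Utilization figures copied from the storage table; None marks "N/A".
POOL_USAGE_PERCENT = {
    "local": 15.13,
    "local-lvm": 0.0,
    "Vault": 10.88,
    "PBS-Backups": 27.43,
    "iso-share": 1.4,
    "localnetwork": None,
}


def pools_over(usage, threshold_percent):
    """Return the names of pools at or above the threshold, sorted."""
    return sorted(
        name
        for name, percent in usage.items()
        if percent is not None and percent >= threshold_percent
    )


if __name__ == "__main__":
    # 80% is a hypothetical warning level, not a documented policy.
    flagged = pools_over(POOL_USAGE_PERCENT, 80.0)
    print("pools needing attention:", flagged or "none")
```

With the current figures no pool comes close to that level, which agrees with the notes above.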

Key Services & Stacks

Monitoring & Observability (NEW)

VM 101 - monitoring-docker (192.168.2.114)

  • Grafana: Port 3000 - Visualization and dashboards
  • Prometheus: Port 9090 - Metrics collection and time-series database
  • PVE Exporter: Port 9221 - Proxmox VE metrics exporter
  • Documentation: /home/jramos/homelab/monitoring/README.md
  • Status: Fully operational
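
A quick way to confirm the "Fully operational" status is to test whether the three documented ports accept TCP connections. The sketch below uses only the host and ports listed above; the helper names are illustrative, and the `http://` scheme in the URLs is an assumption, since this file does not say whether the services are served over TLS.

```python
import socket

# Host and ports as documented for VM 101.
MONITORING_HOST = "192.168.2.114"
MONITORING_PORTS = {"grafana": 3000, "prometheus": 9090, "pve-exporter": 9221}


def service_urls(host, ports):
    """Build a URL per service (plain HTTP is an assumption)."""
    return {name: f"http://{host}:{port}" for name, port in ports.items()}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host=MONITORING_HOST, ports=MONITORING_PORTS):
    """Print one reachability line per documented service."""
    for name, url in service_urls(host, ports).items():
        state = "open" if is_port_open(host, ports[name]) else "unreachable"
        print(f"{name:<13} {url:<28} {state}")
```

Calling `report()` from a machine on the management network prints one line per service. An open port only shows that something is listening, not that the service behind it is healthy.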

Network Security (NEW)

CT 112 - twingate-connector

  • Purpose: Zero-trust network access
  • Type: Lightweight connector
  • Status: Running
  • Integration: Connects homelab to Twingate network

Automation & Integration

CT 113 - n8n (192.168.2.107)

  • Purpose: Workflow automation platform
  • Technology: n8n.io
  • Database: PostgreSQL 15+
  • Features: API integration, scheduled workflows, webhook triggers
  • Documentation: /home/jramos/homelab/services/README.md#n8n-workflow-automation
  • Status: Operational (resolved database locale issues)

Infrastructure Documentation

CT 103 - netbox

  • Purpose: Network documentation and IPAM
  • Status: Stopped (on-demand use)
  • Function: Infrastructure source of truth

Reverse Proxy & Load Balancing

CT 102 - nginx (192.168.2.101)

  • Purpose: Nginx Proxy Manager
  • Ports: 80, 81, 443
  • Function: SSL termination, reverse proxy, certificate management
  • Upstream Services: All web-facing applications

Three-Tier Application Stack

Web Tier:

  • VM 109 (web-server-01) - Primary web server
  • VM 110 (web-server-02) - Load-balanced pair

Database Tier:

  • VM 111 (db-server-01) - Backend database

Proxy Tier:

  • CT 102 (nginx) - Load balancer and SSL termination

Development & Automation

VM 106 - Ansible-Control

  • Purpose: Infrastructure as Code orchestration
  • Tools: Ansible, Terraform/OpenTofu (potential)
  • Status: Running

Container Registry

VM 100 - docker-hub

  • Purpose: Local Docker registry and hub mirror
  • Function: Caching container images for faster deployments
  • Status: Running

Network Simulation

VM 108 - CML

  • Purpose: Cisco Modeling Labs
  • Function: Network topology testing and simulation
  • Status: Stopped (resource-intensive, on-demand use)

Architecture Patterns

Monitoring & Observability (NEW)

The infrastructure now implements a comprehensive monitoring stack following industry best practices:

  • Metrics Collection: Prometheus scraping Proxmox metrics via PVE Exporter
  • Visualization: Grafana providing real-time dashboards and alerting
  • Isolation: Dedicated VM for monitoring services (fault isolation)
  • Integration: Ready for AlertManager, additional exporters, and integrations

Design Decision: VM-based deployment provides kernel-level isolation and limits resource contention with critical infrastructure services.
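
To make the metrics path concrete, the sketch below builds a scrape configuration of the kind Prometheus would need to pull Proxmox metrics through PVE Exporter. It is not the repository's actual `prometheus.yml`, which is not reproduced here. The `/pve` metrics path and the `module` and `target` query parameters follow the usage documented by the prometheus-pve-exporter project; the job name, the `default` module name, and the assumption that Prometheus reaches the exporter at 192.168.2.114:9221 are illustrative. The 30 second interval comes from the maintenance schedule in this file.

```python
import json


def pve_scrape_config(exporter, pve_host, interval="30s"):
    """Return a Prometheus config dict that scrapes PVE Exporter.

    exporter: host:port where PVE Exporter listens.
    pve_host: Proxmox node the exporter should query.
    """
    return {
        "global": {"scrape_interval": interval},
        "scrape_configs": [
            {
                "job_name": "pve",  # hypothetical job name
                "metrics_path": "/pve",
                "params": {"module": ["default"], "target": [pve_host]},
                "static_configs": [{"targets": [exporter]}],
            }
        ],
    }


if __name__ == "__main__":
    config = pve_scrape_config("192.168.2.114:9221", "192.168.2.200")
    # JSON is a subset of YAML 1.2, so this output can be pasted into a
    # YAML file; validate it before use.
    print(json.dumps(config, indent=2))
```

Any generated file should be checked with `promtool check config` before Prometheus is reloaded.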

Zero-Trust Security (NEW)

Implementation of zero-trust network access principles:

  • Twingate Connector: Lightweight connector providing secure access without VPNs
  • Container Deployment: LXC container for minimal resource overhead
  • Network Segmentation: Secure access to homelab from external networks

Design Decision: LXC container chosen for quick provisioning and low resource consumption.

Automation-First Approach

Workflow automation and infrastructure orchestration:

  • n8n Platform: Visual workflow builder for API integrations
  • Scheduled Tasks: Automated backup checks, monitoring alerts, reports
  • Integration Hub: Connects monitoring, documentation, and operational tools

Design Decision: PostgreSQL backend ensures data persistence and supports complex workflows.

Tiered Application Architecture

Classic three-tier design for production-like environments:

  • Presentation Tier: Paired web servers (109, 110) behind load balancer
  • Business Logic: Application processing on web tier
  • Data Tier: Dedicated database server (111) with backup strategy

Design Decision: Separation of concerns, scalability testing, high availability patterns.

Selective Containerization Strategy

Hybrid approach balancing performance and resource efficiency:

  • LXC Containers: Lightweight services (nginx, netbox, twingate, n8n)
  • Full VMs: Complex applications, kernel dependencies, heavy workloads
  • Rationale: LXC for far lower overhead (containers share the host kernel), VMs for isolation and compatibility

Recent Infrastructure Changes (2025-12-07)

Additions

  1. VM 101 (monitoring-docker): New dedicated monitoring infrastructure

    • Grafana for visualization
    • Prometheus for metrics collection
    • PVE Exporter for Proxmox integration
    • IP: 192.168.2.114
  2. CT 112 (twingate-connector): Zero-trust network security

    • Lightweight connector
    • Secure remote access without VPN
  3. CT 113 (n8n): Workflow automation platform

    • PostgreSQL 15+ backend
    • IP: 192.168.2.107
    • Resolved database locale issues

Modifications

  • Storage utilization updated across all pools
  • PBS-Backups now at 27.43% (increased retention)
  • Vault optimized to 10.88% (reduced usage)

Removals

  • VM 101 (gitlab): Decommissioned (previously at this ID)
  • CT 112 (Anytype): Replaced by n8n (CT 113) for better integration; ID 112 now belongs to twingate-connector

Documentation Updates

  • Created comprehensive monitoring stack documentation
  • Updated all infrastructure tables with current VMs/CTs
  • Added architecture patterns for observability and zero-trust
  • Updated storage statistics
  • Referenced latest export: disaster-recovery/homelab-export-20251207-120040

Repository Structure

homelab/
├── monitoring/                      # NEW: Monitoring stack configurations
│   ├── README.md                    # Comprehensive monitoring documentation
│   ├── grafana/
│   │   └── docker-compose.yml
│   ├── prometheus/
│   │   ├── docker-compose.yml
│   │   └── prometheus.yml
│   └── pve-exporter/
│       ├── docker-compose.yml
│       ├── pve.yml
│       └── .env
├── services/                        # Docker Compose service configurations
│   ├── n8n/                         # n8n workflow automation
│   ├── netbox/                      # Network documentation & IPAM
│   └── README.md                    # Services overview (updated)
├── disaster-recovery/
│   └── homelab-export-20251207-120040/  # Latest infrastructure export
├── scripts/
│   ├── crawlers-exporters/          # Infrastructure collection scripts
│   ├── fixers/                      # Problem-solving scripts
│   └── qol/                         # Quality of life improvements
├── CLAUDE.md                        # AI assistant guidance (updated)
├── INDEX.md                         # Navigation index (updated)
├── README.md                        # Repository overview (updated)
└── CLAUDE_STATUS.md                 # This file - current infrastructure status

Current Phase: Infrastructure Documentation Complete

Goal

Comprehensive documentation of monitoring stack and updated infrastructure inventory.

Phase

Documentation & Maintenance

Completed Tasks

  • Created /home/jramos/homelab/monitoring/README.md with comprehensive monitoring documentation
  • Updated CLAUDE_STATUS.md with current infrastructure state
  • Documented 10 VMs and 4 LXC containers
  • Updated storage statistics (PBS 27.43%, Vault 10.88%, local 15.13%)
  • Added monitoring stack architecture and deployment procedures
  • Documented new services: monitoring-docker, twingate-connector, n8n
  • Referenced latest export: disaster-recovery/homelab-export-20251207-120040

Next Steps (Pending)

  • Update INDEX.md with monitoring section and current VM/CT counts
  • Update README.md with all 10 VMs and 4 CTs
  • Update CLAUDE.md with architecture tables for monitoring and zero-trust
  • Update services/README.md with monitoring stack and twingate sections
  • Verify all documentation cross-references are accurate
  • Test monitoring stack deployment procedures

Access Information

Management Interfaces

Key Network Segments

  • Management Network: 192.168.2.0/24
  • Proxmox Host: 192.168.2.200
  • Reverse Proxy: 192.168.2.101 (CT 102)
  • n8n: 192.168.2.107 (CT 113)
  • Monitoring: 192.168.2.114 (VM 101)
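
The address plan above can be sanity-checked with the standard `ipaddress` module. This sketch only covers the addresses spelled out in this file (entries shown as 192.168.2.XXX are left out because their values are not given); the `audit` helper is an illustrative name.

```python
import ipaddress

MANAGEMENT_NET = ipaddress.ip_network("192.168.2.0/24")

# Only addresses stated explicitly in this document.
HOSTS = {
    "proxmox": "192.168.2.200",
    "nginx (CT 102)": "192.168.2.101",
    "n8n (CT 113)": "192.168.2.107",
    "monitoring (VM 101)": "192.168.2.114",
}


def audit(hosts, network):
    """Return a list of problems: off-subnet or duplicated addresses."""
    problems = []
    seen = {}
    for name, addr in hosts.items():
        ip = ipaddress.ip_address(addr)
        if ip not in network:
            problems.append(f"{name}: {ip} is outside {network}")
        if ip in seen:
            problems.append(f"{name}: {ip} already used by {seen[ip]}")
        else:
            seen[ip] = name
    return problems


if __name__ == "__main__":
    issues = audit(HOSTS, MANAGEMENT_NET)
    print("\n".join(issues) if issues else "address plan is consistent")
```

The same check can be rerun whenever a guest is added or an ID is reused.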

Maintenance Schedule

Automated Tasks

  • Backups: Proxmox Backup Server - Daily incremental, Weekly full
  • Monitoring Scrapes: Prometheus - Every 30 seconds
  • Certificate Renewal: Nginx Proxy Manager - Automatic via Let's Encrypt

Manual Tasks

  • Weekly: Review Grafana dashboards for anomalies
  • Monthly: Update monitoring stack Docker images
  • Quarterly: Review backup retention policies
  • Semi-Annual: Kernel updates on Proxmox host and VMs

Known Issues & Resolutions

Resolved

  • n8n PostgreSQL locale errors (fixed with fix_n8n_db_c_locale.sh)
  • n8n database permissions (fixed with fix_n8n_db_permissions.sh)

Active Monitoring

  • PVE Exporter SSL verification (set to false for self-signed certificates)
  • Prometheus retention policies (currently 15 days, may need adjustment)
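
When deciding whether 15 days of retention "may need adjustment", a rough sizing helps. The Prometheus storage documentation gives the rule of thumb disk ≈ retention seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. The sketch below applies it to the 30 second scrape interval documented in this file; the active series count of 2,000 is a placeholder assumption, not a measured value from this stack.

```python
SECONDS_PER_DAY = 86_400


def samples_per_series(retention_days, scrape_interval_s):
    """Samples kept for one series over the retention window."""
    return retention_days * SECONDS_PER_DAY // scrape_interval_s


def estimated_bytes(retention_days, scrape_interval_s, active_series,
                    bytes_per_sample=2.0):
    """Rough TSDB size; bytes_per_sample uses the upper rule-of-thumb value."""
    per_series = samples_per_series(retention_days, scrape_interval_s)
    return per_series * active_series * bytes_per_sample


if __name__ == "__main__":
    # 2,000 active series is a hypothetical figure for illustration only.
    size = estimated_bytes(15, 30, active_series=2_000)
    print(f"samples per series: {samples_per_series(15, 30)}")
    print(f"estimated size: {size / 1024**2:.1f} MiB")
```

The real series count can be read from Prometheus itself and substituted into the estimate. Retention is set with the `--storage.tsdb.retention.time` flag.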

Deferred

  • NetBox container offline (on-demand service)
  • Development VMs stopped (resource conservation)

Version History

  • v2.1.0 (2025-12-07): Added monitoring stack, twingate connector, updated infrastructure counts
  • v2.0.0 (2025-12-02): Repository reorganization, services migration from GitLab
  • v1.0.0 (2025-11-29): Initial infrastructure documentation

Maintained by: jramos
Repository: Homelab Infrastructure Configuration
Platform: Proxmox VE 8.3.3
Infrastructure Scale: 10 VMs, 4 Containers
Current Status: Operational - Monitoring & Documentation Phase