homelab/CLAUDE_STATUS.md
Jordan Ramos f42eeaba92 feat(docs): update documentation for monitoring stack and infrastructure changes
- Update INDEX.md with VM 101 (monitoring-docker) and CT 112 (twingate-connector)
- Update README.md with monitoring and security sections
- Update CLAUDE.md with new architecture patterns
- Update services/README.md with monitoring stack documentation
- Update CLAUDE_STATUS.md with current infrastructure state
- Update infrastructure counts: 10 VMs, 4 Containers
- Update storage stats: PBS 27.43%, Vault 10.88%
- Create comprehensive monitoring/README.md
- Add .gitignore rules for monitoring sensitive files (pve.yml, .env)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 12:41:08 -07:00

# Homelab Infrastructure Status
**Last Updated**: 2025-12-07 12:00:40
**Export Reference**: disaster-recovery/homelab-export-20251207-120040
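The export reference appears to encode the same timestamp as **Last Updated**. A minimal sketch of recovering it, assuming every export directory follows the `homelab-export-YYYYMMDD-HHMMSS` pattern seen in this one example (the function name is hypothetical):

```python
from datetime import datetime


def parse_export_timestamp(export_path: str) -> datetime:
    """Extract the timestamp from a path ending in homelab-export-YYYYMMDD-HHMMSS."""
    # Assumption: the naming pattern is inferred from the single export listed here.
    name = export_path.rstrip("/").split("/")[-1]
    prefix = "homelab-export-"
    if not name.startswith(prefix):
        raise ValueError(f"unexpected export name: {name!r}")
    return datetime.strptime(name[len(prefix):], "%Y%m%d-%H%M%S")


stamp = parse_export_timestamp("disaster-recovery/homelab-export-20251207-120040")
print(stamp.isoformat(sep=" "))  # 2025-12-07 12:00:40
```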
## Current Infrastructure Snapshot
### Proxmox Environment
- **Node**: serviceslab
- **Version**: Proxmox VE 8.3.3
- **Management IP**: 192.168.2.200
- **Architecture**: Single-node cluster
- **Total Resources**: 10 VMs, 4 LXC Containers
---
## Virtual Machines (QEMU/KVM) - 10 VMs
| VM ID | Name | IP Address | Status | Purpose |
|-------|------|------------|--------|---------|
| 100 | docker-hub | 192.168.2.XXX | Running | Local container registry and Docker Hub mirror |
| 101 | monitoring-docker | 192.168.2.114 | Running | Monitoring stack (Grafana/Prometheus/PVE Exporter) |
| 104 | ubuntu-dev | - | Stopped | Ubuntu development environment |
| 105 | dev | - | Stopped | General-purpose development workstation |
| 106 | Ansible-Control | 192.168.2.XXX | Running | IaC orchestration, configuration management |
| 107 | ubuntu-docker | - | Stopped | Ubuntu Docker host |
| 108 | CML | - | Stopped | Cisco Modeling Labs - network simulation |
| 109 | web-server-01 | 192.168.2.XXX | Running | Web application server (clustered) |
| 110 | web-server-02 | 192.168.2.XXX | Running | Load-balanced pair with web-server-01 |
| 111 | db-server-01 | 192.168.2.XXX | Running | Backend database server |
**Recent Changes**:
- Added VM 101 (monitoring-docker) for dedicated monitoring infrastructure
- Removed the previous VM 101 (gitlab): service decommissioned, ID reused by monitoring-docker
---
## Containers (LXC) - 4 Containers
| CT ID | Name | IP Address | Status | Purpose |
|-------|------|------------|--------|---------|
| 102 | nginx | 192.168.2.101 | Running | Nginx Proxy Manager (NPM): reverse proxy and load balancer |
| 103 | netbox | 192.168.2.XXX | Stopped | Network documentation/IPAM |
| 112 | twingate-connector | 192.168.2.XXX | Running | Zero-trust network access connector |
| 113 | n8n | 192.168.2.107 | Running | Workflow automation platform |
**Recent Changes**:
- Added CT 112 (twingate-connector) for zero-trust network security
- Added CT 113 (n8n) for workflow automation
- Removed the previous CT 112 (Anytype): replaced by n8n (CT 113), ID reused by twingate-connector
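The two inventory tables can be cross-checked against the stated totals (10 VMs, 4 containers). The sketch below copies the IDs, names, and statuses from the tables; the helper names are hypothetical, and the uniqueness check relies on Proxmox sharing one ID space between VMs and containers.

```python
from collections import Counter

# (ID, name, status) copied from the VM and container tables in this file.
VMS = [
    (100, "docker-hub", "Running"),
    (101, "monitoring-docker", "Running"),
    (104, "ubuntu-dev", "Stopped"),
    (105, "dev", "Stopped"),
    (106, "Ansible-Control", "Running"),
    (107, "ubuntu-docker", "Stopped"),
    (108, "CML", "Stopped"),
    (109, "web-server-01", "Running"),
    (110, "web-server-02", "Running"),
    (111, "db-server-01", "Running"),
]
CONTAINERS = [
    (102, "nginx", "Running"),
    (103, "netbox", "Stopped"),
    (112, "twingate-connector", "Running"),
    (113, "n8n", "Running"),
]


def summarize(guests):
    """Count guests per status, e.g. Running and Stopped."""
    return Counter(status for _, _, status in guests)


def duplicate_ids(*groups):
    """Return any guest ID that appears more than once across all groups."""
    counts = Counter(gid for group in groups for gid, _, _ in group)
    return sorted(gid for gid, n in counts.items() if n > 1)


print(len(VMS), dict(summarize(VMS)))
print(len(CONTAINERS), dict(summarize(CONTAINERS)))
print(duplicate_ids(VMS, CONTAINERS))
```

An empty list from `duplicate_ids` indicates that the reused IDs 101 and 112 each belong to exactly one current guest.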
---
## Storage Architecture
| Storage Pool | Type | Total | Used | % Used | Purpose |
|--------------|------|-------|------|--------|---------|
| local | Directory | - | - | 15.13% | System files, ISOs, templates |
| local-lvm | LVM-Thin | - | - | 0.0% | VM disk images (thin provisioned) |
| Vault | NFS/Directory | - | - | 10.88% | Secure storage for sensitive data |
| PBS-Backups | PBS | - | - | 27.43% | Automated backup repository |
| iso-share | NFS/CIFS | - | - | 1.4% | Installation media library |
| localnetwork | Network Share | - | - | N/A | Shared resources across infrastructure |
**Capacity Notes**:
- PBS-Backups utilization increased to 27.43% (healthy retention)
- Vault utilization decreased to 10.88% (space optimization)
- local storage at 15.13% (system overhead normal)
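A capacity check over these figures might look like the following sketch. The percentages are copied from the storage table; the 80% warning threshold and the function names are assumptions for illustration, not settings taken from this homelab.

```python
# Utilization percentages copied from the storage table
# (localnetwork is omitted because its usage is listed as N/A).
POOL_USAGE_PERCENT = {
    "local": 15.13,
    "local-lvm": 0.0,
    "Vault": 10.88,
    "PBS-Backups": 27.43,
    "iso-share": 1.4,
}

WARN_THRESHOLD = 80.0  # hypothetical threshold, not defined anywhere in this repo


def pools_over(usage, threshold):
    """Return the names of pools at or above the given utilization percentage."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)


def fullest_pool(usage):
    """Return the name of the pool with the highest utilization."""
    return max(usage, key=usage.get)


print(pools_over(POOL_USAGE_PERCENT, WARN_THRESHOLD))  # []
print(fullest_pool(POOL_USAGE_PERCENT))  # PBS-Backups
```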
---
## Key Services & Stacks
### Monitoring & Observability (NEW)
**VM 101** - monitoring-docker (192.168.2.114)
- **Grafana**: Port 3000 - Visualization and dashboards
- **Prometheus**: Port 9090 - Metrics collection and time-series database
- **PVE Exporter**: Port 9221 - Proxmox VE metrics exporter
- **Documentation**: `/home/jramos/homelab/monitoring/README.md`
- **Status**: Fully operational
### Network Security (NEW)
**CT 112** - twingate-connector
- **Purpose**: Zero-trust network access
- **Type**: Lightweight connector
- **Status**: Running
- **Integration**: Connects homelab to Twingate network
### Automation & Integration
**CT 113** - n8n (192.168.2.107)
- **Purpose**: Workflow automation platform
- **Technology**: n8n.io
- **Database**: PostgreSQL 15+
- **Features**: API integration, scheduled workflows, webhook triggers
- **Documentation**: `/home/jramos/homelab/services/README.md#n8n-workflow-automation`
- **Status**: Operational (resolved database locale issues)
### Infrastructure Documentation
**CT 103** - netbox
- **Purpose**: Network documentation and IPAM
- **Status**: Stopped (on-demand use)
- **Function**: Infrastructure source of truth
### Reverse Proxy & Load Balancing
**CT 102** - nginx (192.168.2.101)
- **Purpose**: Nginx Proxy Manager
- **Ports**: 80, 81, 443
- **Function**: SSL termination, reverse proxy, certificate management
- **Upstream Services**: All web-facing applications
### Three-Tier Application Stack
**Web Tier**:
- VM 109 (web-server-01) - Primary web server
- VM 110 (web-server-02) - Load-balanced pair
**Database Tier**:
- VM 111 (db-server-01) - Backend database
**Proxy Tier**:
- CT 102 (nginx) - Load balancer and SSL termination
### Development & Automation
**VM 106** - Ansible-Control
- **Purpose**: Infrastructure as Code orchestration
- **Tools**: Ansible, Terraform/OpenTofu (potential)
- **Status**: Running
### Container Registry
**VM 100** - docker-hub
- **Purpose**: Local Docker registry and hub mirror
- **Function**: Caching container images for faster deployments
- **Status**: Running
### Network Simulation
**VM 108** - CML
- **Purpose**: Cisco Modeling Labs
- **Function**: Network topology testing and simulation
- **Status**: Stopped (resource-intensive, on-demand use)
---
## Architecture Patterns
### Monitoring & Observability (NEW)
The infrastructure now runs a dedicated monitoring stack:
- **Metrics Collection**: Prometheus scraping Proxmox metrics via PVE Exporter
- **Visualization**: Grafana providing real-time dashboards and alerting
- **Isolation**: Dedicated VM for monitoring services (fault isolation)
- **Integration**: Ready for AlertManager, additional exporters, and integrations
**Design Decision**: VM-based deployment provides kernel-level isolation and prevents resource contention with critical infrastructure services.
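One way to confirm that Prometheus is actually scraping its targets is to query the built-in `up` series through the Prometheus HTTP API (`/api/v1/query`). The sketch below assumes the Prometheus address listed in this file; the helper names are hypothetical, and which targets appear depends on a `prometheus.yml` that is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://192.168.2.114:9090"  # VM 101, as listed in this file


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def fetch(url, timeout=5.0):
    """Fetch and decode a JSON response (requires network access to VM 101)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def down_targets(response):
    """From a decoded response to the `up` query, list instances whose value is 0."""
    results = response["data"]["result"]
    return sorted(
        r["metric"].get("instance", "unknown")
        for r in results
        if float(r["value"][1]) == 0
    )


print(build_query_url(PROMETHEUS_URL, "up"))
```

From a host that can reach VM 101, `down_targets(fetch(build_query_url(PROMETHEUS_URL, "up")))` would list any scrape target Prometheus currently considers down.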
### Zero-Trust Security (NEW)
Implementation of zero-trust network access principles:
- **Twingate Connector**: Lightweight connector providing secure access without VPNs
- **Container Deployment**: LXC container for minimal resource overhead
- **Network Segmentation**: Secure access to homelab from external networks
**Design Decision**: LXC container chosen for quick provisioning and low resource consumption.
### Automation-First Approach
Workflow automation and infrastructure orchestration:
- **n8n Platform**: Visual workflow builder for API integrations
- **Scheduled Tasks**: Automated backup checks, monitoring alerts, reports
- **Integration Hub**: Connects monitoring, documentation, and operational tools
**Design Decision**: PostgreSQL backend ensures data persistence and supports complex workflows.
### Tiered Application Architecture
Classic three-tier design for production-like environments:
- **Presentation Tier**: Paired web servers (109, 110) behind load balancer
- **Business Logic**: Application processing on web tier
- **Data Tier**: Dedicated database server (111) with backup strategy
**Design Decision**: Separating the tiers allows practicing separation of concerns, scalability testing, and high-availability patterns.
### Selective Containerization Strategy
Hybrid approach balancing performance and resource efficiency:
- **LXC Containers**: Lightweight services (nginx, netbox, twingate, n8n)
- **Full VMs**: Complex applications, kernel dependencies, heavy workloads
- **Rationale**: LXC for ~10x lower overhead, VMs for isolation and compatibility
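The placement rule above can be written as a small decision helper. This is only a sketch of the stated rationale; the function name and its three criteria are illustrative assumptions, not an existing script in this repository.

```python
def choose_guest_type(
    needs_own_kernel: bool,
    heavy_workload: bool,
    needs_strong_isolation: bool = False,
) -> str:
    """Return "VM" or "LXC" following the hybrid strategy described above."""
    # Full VMs: kernel dependencies, heavy workloads, or isolation requirements.
    if needs_own_kernel or heavy_workload or needs_strong_isolation:
        return "VM"
    # Everything else runs as an LXC container to keep overhead low.
    return "LXC"


# A lightweight connector such as twingate-connector.
print(choose_guest_type(needs_own_kernel=False, heavy_workload=False))  # LXC
# A resource-intensive simulator such as CML.
print(choose_guest_type(needs_own_kernel=False, heavy_workload=True))  # VM
```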
---
## Recent Infrastructure Changes (2025-12-07)
### Additions
1. **VM 101 (monitoring-docker)**: New dedicated monitoring infrastructure
   - Grafana for visualization
   - Prometheus for metrics collection
   - PVE Exporter for Proxmox integration
   - IP: 192.168.2.114
2. **CT 112 (twingate-connector)**: Zero-trust network security
   - Lightweight connector
   - Secure remote access without VPN
3. **CT 113 (n8n)**: Workflow automation platform
   - PostgreSQL 15+ backend
   - IP: 192.168.2.107
   - Resolved database locale issues
### Modifications
- Storage utilization updated across all pools
- PBS-Backups now at 27.43% (increased retention)
- Vault optimized to 10.88% (reduced usage)
### Removals
- **VM 101 (gitlab)**: Decommissioned (previously at this ID)
- **CT 112 (Anytype)**: Replaced by n8n for better integration
### Documentation Updates
- Created comprehensive monitoring stack documentation
- Updated all infrastructure tables with current VMs/CTs
- Added architecture patterns for observability and zero-trust
- Updated storage statistics
- Referenced latest export: disaster-recovery/homelab-export-20251207-120040
---
## Repository Structure
```
homelab/
    monitoring/                          # NEW: Monitoring stack configurations
        README.md                        # Comprehensive monitoring documentation
        grafana/
            docker-compose.yml
        prometheus/
            docker-compose.yml
            prometheus.yml
        pve-exporter/
            docker-compose.yml
            pve.yml
            .env
    services/                            # Docker Compose service configurations
        n8n/                             # n8n workflow automation
        netbox/                          # Network documentation & IPAM
        README.md                        # Services overview (updated)
    disaster-recovery/
        homelab-export-20251207-120040/  # Latest infrastructure export
    scripts/
        crawlers-exporters/              # Infrastructure collection scripts
        fixers/                          # Problem-solving scripts
        qol/                             # Quality of life improvements
    CLAUDE.md                            # AI assistant guidance (updated)
    INDEX.md                             # Navigation index (updated)
    README.md                            # Repository overview (updated)
    CLAUDE_STATUS.md                     # This file - current infrastructure status
```
---
## Current Phase: Infrastructure Documentation Complete
### Goal
Comprehensive documentation of monitoring stack and updated infrastructure inventory.
### Phase
Documentation & Maintenance
### Completed Tasks
- [x] Created `/home/jramos/homelab/monitoring/README.md` with comprehensive monitoring documentation
- [x] Updated `CLAUDE_STATUS.md` with current infrastructure state
- [x] Documented 10 VMs and 4 LXC containers
- [x] Updated storage statistics (PBS 27.43%, Vault 10.88%, local 15.13%)
- [x] Added monitoring stack architecture and deployment procedures
- [x] Documented new services: monitoring-docker, twingate-connector, n8n
- [x] Referenced latest export: disaster-recovery/homelab-export-20251207-120040
### Next Steps (Pending)
- [ ] Update INDEX.md with monitoring section and current VM/CT counts
- [ ] Update README.md with all 10 VMs and 4 CTs
- [ ] Update CLAUDE.md with architecture tables for monitoring and zero-trust
- [ ] Update services/README.md with monitoring stack and twingate sections
- [ ] Verify all documentation cross-references are accurate
- [ ] Test monitoring stack deployment procedures
---
## Access Information
### Management Interfaces
- **Proxmox UI**: https://192.168.2.200:8006
- **Grafana**: http://192.168.2.114:3000
- **Prometheus**: http://192.168.2.114:9090
- **Nginx Proxy Manager**: http://192.168.2.101:81
- **n8n**: http://192.168.2.107:5678
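A quick reachability check for these interfaces could be sketched as follows. The URLs are copied from the list above; the helper names and the 2-second timeout are assumptions. The check only confirms that a TCP port accepts connections, not that the service behind it is healthy.

```python
import socket
from urllib.parse import urlsplit

# URLs copied from the management interface list in this file.
MANAGEMENT_URLS = {
    "Proxmox UI": "https://192.168.2.200:8006",
    "Grafana": "http://192.168.2.114:3000",
    "Prometheus": "http://192.168.2.114:9090",
    "Nginx Proxy Manager": "http://192.168.2.101:81",
    "n8n": "http://192.168.2.107:5678",
}


def host_and_port(url):
    """Split a URL into (host, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, url in MANAGEMENT_URLS.items():
    print(name, host_and_port(url))
```

Calling `is_reachable(*host_and_port(url))` for each entry from a machine on the 192.168.2.0/24 network gives a basic up/down view without opening a browser.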
### Key Network Segments
- **Management Network**: 192.168.2.0/24
- **Proxmox Host**: 192.168.2.200
- **Reverse Proxy**: 192.168.2.101 (CT 102)
- **n8n**: 192.168.2.107 (CT 113)
- **Monitoring**: 192.168.2.114 (VM 101)
---
## Maintenance Schedule
### Automated Tasks
- **Backups**: Proxmox Backup Server - Daily incremental, Weekly full
- **Monitoring Scrapes**: Prometheus - Every 30 seconds
- **Certificate Renewal**: Nginx Proxy Manager - Automatic via Let's Encrypt
### Recommended Manual Tasks
- **Weekly**: Review Grafana dashboards for anomalies
- **Monthly**: Update monitoring stack Docker images
- **Quarterly**: Review backup retention policies
- **Semi-Annual**: Kernel updates on Proxmox host and VMs
---
## Known Issues & Resolutions
### Resolved
- n8n PostgreSQL locale errors (fixed with `fix_n8n_db_c_locale.sh`)
- n8n database permissions (fixed with `fix_n8n_db_permissions.sh`)
### Active Monitoring
- PVE Exporter SSL verification (set to false for self-signed certificates)
- Prometheus retention policies (currently 15 days, may need adjustment)
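When weighing a retention change, it helps to know how many samples each time series accumulates. A rough calculation, using the 30-second scrape interval from the Maintenance Schedule and the current 15-day retention (the function name is hypothetical, and real disk usage also depends on the number of series and on compression):

```python
def samples_per_series(retention_days: int, scrape_interval_seconds: int) -> int:
    """Number of samples one series holds over the full retention window."""
    seconds_retained = retention_days * 24 * 60 * 60
    return seconds_retained // scrape_interval_seconds


print(samples_per_series(15, 30))  # 43200
print(samples_per_series(30, 30))  # 86400
```

Doubling retention to 30 days doubles the samples kept per series, so storage on VM 101 would need to grow roughly in proportion.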
### Deferred
- NetBox container offline (on-demand service)
- Development VMs stopped (resource conservation)
---
## Version History
- **v2.1.0** (2025-12-07): Added monitoring stack, twingate connector, updated infrastructure counts
- **v2.0.0** (2025-12-02): Repository reorganization, services migration from GitLab
- **v1.0.0** (2025-11-29): Initial infrastructure documentation
---
**Maintained by**: jramos
**Repository**: Homelab Infrastructure Configuration
**Platform**: Proxmox VE 8.3.3
**Infrastructure Scale**: 10 VMs, 4 Containers
**Current Status**: Operational - Monitoring & Documentation Phase