Root Cause:
- N8N_ENCRYPTION_KEY in /opt/n8n/.env contained literal shell command string $(openssl rand -hex 32) instead of executed value
- .env files do not execute shell commands, only parse literal strings
- Caused n8n service crash loop preventing startup

Troubleshooting Process:
- Identified service crash loop via journalctl logs
- Backend-Builder diagnosed invalid encryption key issue
- Multiple heredoc script attempts failed due to Windows/Linux line ending issues in WSL environment
- Created simple fix script using echo statements (no heredoc)

Solution:
- Fix script created at /tmp/fix_n8n_simple.sh
- Generates proper encryption key using openssl rand -hex 32
- Recreates .env with corrected configuration including missing N8N_LISTEN_ADDRESS=0.0.0.0 and NODE_ENV=production
- Backs up existing .env before changes
- Sets proper permissions (600, n8n:n8n)

Reviews:
- Backend-Builder: APPROVED (95% confidence, technically sound)
- Lab-Operator: APPROVED with safeguards (ZFS snapshot, DB backup)

Status: Ready for deployment by user on CT 113 tomorrow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# Homelab Status Tracker

**Last Updated**: 2025-11-30 17:37:00
**Goal**: Document and commit recent infrastructure planning and integration documentation
**Phase**: Completed
**Current Context**: All documentation corrections committed. Architecture updates for Debian 12 and NPM committed to repository. Latest commit hash: c16d5210709c38ccf3ef22785c23ac99a61f1703

---

## Current Tasks

### Pre-Commit Security & Sanitization
- [x] **Step 1**: Sanitize API key in OBSIDIAN-MCP-SETUP.md
  - Status: Completed at 2025-11-30 13:20:00
  - Owner: Librarian
  - Action: Replaced all 5 occurrences of real API key with placeholder
  - Result: Verified no production secrets remain in file

- [x] **Step 2**: Update .gitignore to exclude Claude config files
  - Status: Completed at 2025-11-30 13:21:00
  - Owner: Librarian
  - Action: Added .claude.json, *.claude.json, and .claude/ patterns
  - Result: Claude configuration files will not be committed to repository

- [x] **Step 3**: Stage all changes for commit
  - Status: Completed at 2025-11-30 13:22:00
  - Owner: Librarian
  - Action: Executed git add -A
  - Result: Staged 6 files (1 deleted, 2 modified, 3 new)

- [x] **Step 4**: Create commit with proper message
  - Status: Completed at 2025-11-30 13:24:29
  - Owner: Librarian
  - Action: Created commit with comprehensive conventional commit message
  - Result: Commit hash a1841f1c4193b143c9fa71746929cfe3cd9cbdbe
  - Changes: 6 files changed, 2,849 insertions(+), 73 deletions(-)

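
Steps 1 and 2 amount to two checks that can be repeated before any later commit. The sketch below is one possible way to do that in shell, not the procedure the Librarian actually ran: `count_matches` is a hypothetical helper, and `YOUR_API_KEY_HERE` and `example-real-key` stand in for whatever placeholder and secret strings apply.

```bash
#!/bin/sh
# count_matches FILE STRING: print how many lines of FILE contain STRING
# (matched as a fixed string, not as a regular expression).
count_matches() {
    grep -cF -- "$2" "$1" || true   # grep exits 1 on zero matches; the 0 is still printed
}

# Demo on a throwaway file so nothing in the repository is touched.
demo_file=$(mktemp)
printf '%s\n' 'api_key: YOUR_API_KEY_HERE' 'vault: /path/to/vault' > "$demo_file"

echo "placeholder lines: $(count_matches "$demo_file" 'YOUR_API_KEY_HERE')"
echo "secret lines: $(count_matches "$demo_file" 'example-real-key')"

# Inside the repository, the new .gitignore patterns can be checked with:
#   git check-ignore -v .claude.json
```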
---

## Completed Reviews

- [x] **Scribe Review**: Documented all changes comprehensively
- [x] **Librarian Security Review**: Identified security concerns
- [x] **Lab-Operator Infrastructure Review**: Validated operational impact

---

## Changes Being Committed

### Modified Files
- **CLAUDE.md**: Enhanced with Universal Workflow sections

### Deleted Files
- **.claude/agents/homelab-steve.md**: Removed legacy agent definition

### New Files
- **CLAUDE_STATUS.md**: Status tracking file
- **OBSIDIAN-MCP-SETUP.md**: Obsidian MCP guide (820 lines)
- **n8n/N8N-SETUP-PLAN.md**: n8n deployment plan (1,948 lines)

---

## Post-Commit Documentation Corrections

- [x] **Fix PostgreSQL Installation Instructions**: n8n/N8N-SETUP-PLAN.md
  - Status: Completed at 2025-11-30 13:30:00
  - Owner: Scribe
  - Issue: PostgreSQL 16 installation failed - package not in standard repos
  - Action: Added PostgreSQL official repository setup steps (lines 587-605)
  - Result: Installation instructions now work correctly
  - Reported by: User (real-world deployment feedback)

- [x] **Architecture Corrections - Batch Updates**: n8n/N8N-SETUP-PLAN.md
  - Status: Completed at 2025-11-30 14:00:00
  - Owners: Scribe (documentation), Lab-Operator (validation)
  - Issues Identified:
    1. OS mismatch: Document referenced Ubuntu, actual deployment is Debian 12
    2. Reverse proxy mismatch: Document described standalone nginx, actual is Nginx Proxy Manager (NPM)
  - Total Changes Applied: 30+ corrections across 4 batches

  **Batch 1 - OS Corrections (2 changes)**:
  - Line 200: Updated OS template "Debian 12 or Ubuntu" → "Debian 12"
  - Line 588: Updated comment "Ubuntu repositories" → "Debian repositories"

  **Batch 2 - NPM Terminology Updates (10 changes)**:
  - Line 12: Executive summary updated to reference NPM
  - Lines 112-113: CT 102 specs updated (2 cores, 4GB RAM, 10GB disk) and renamed to nginx-proxy-mgr
  - Line 170: LXC consistency reference updated to NPM
  - Lines 260, 286, 308-309: Network diagrams updated (nginx → NPM, added port 81)
  - Line 320: Firewall comment updated
  - Lines 583-584: Removed nginx-light and certbot from prerequisites
  - Line 893: Firewall rule comment updated to NPM

  **Batch 3 - Major Section Rewrites (2 sections)**:
  - Lines 379-437: Section VI-A completely rewritten for NPM architecture
    * Added NPM overview with GitHub link
    * Replaced manual nginx config with NPM web UI instructions
    * Documented NPM admin access (port 81)
    * Updated SSL configuration approach (GUI vs certbot)
  - Lines 765-917: Phase 7 completely rewritten (reduced from 20min to 10min)
    * Replaced SSH/manual config with browser-based NPM UI steps
    * Added step-by-step proxy host creation guide
    * Included SSL certificate request via NPM interface
    * Added NPM-specific troubleshooting section

  **Batch 4 - Remaining Updates (15+ changes)**:
  - Line 1093: "HTTPS through nginx" → "HTTPS through NPM"
  - Lines 1360-1372: Troubleshooting section updated for NPM (Docker commands, UI access)
  - Line 1376: Firewall check comment updated
  - Line 1392: Timeout check reference updated to NPM Advanced settings
  - Line 1444: Security hardening checklist updated
  - Lines 1478-1487: Rate limiting implementation updated for NPM
  - Line 1575: Workflow diagram updated
  - Line 1801: Architecture diagram updated (nginx → NPM)
  - Line 1868: Deployment checklist updated

  **Key Architecture Changes Documented**:
  1. Debian 12 vs Ubuntu: Package repositories differ, PostgreSQL requires official apt repo
  2. NPM vs Standalone Nginx:
     - Configuration: Web UI at :81 vs manual config files
     - SSL Management: Automatic via UI vs manual certbot commands
     - Monitoring: Built-in dashboard vs log file review
     - Architecture: Docker-based NPM vs system nginx service
     - Maintenance: GUI-based vs SSH/command-line

  **Lab-Operator Validation**: ✅ APPROVED
  - All changes verified against actual Proxmox infrastructure
  - NPM compatibility confirmed (Docker on LXC with nesting=1)
  - Security implications reviewed and documented
  - No operational risks identified

  **Impact**:
  - Phase 7 time reduced: 20 minutes → 10 minutes
  - Deployment complexity reduced (no SSH to CT 102 required)
  - Maintenance simplified (web UI vs config files)
  - Documentation accuracy: Aligned with real deployment environment

- [x] **Commit Architecture Corrections to Repository**
  - Status: Completed at 2025-11-30 17:37:00
  - Owner: Librarian
  - Action: Created commit with conventional commit message for n8n architecture corrections
  - Result: Commit hash c16d5210709c38ccf3ef22785c23ac99a61f1703
  - Changes: 2 files changed, 325 insertions(+), 194 deletions(-)
    * CLAUDE_STATUS.md: Updated with detailed change log
    * n8n/N8N-SETUP-PLAN.md: 30+ architecture corrections (Debian 12 + NPM)

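
The PostgreSQL correction logged at the top of this section rests on Debian 12 not packaging PostgreSQL 16 in its standard repositories, so the PostgreSQL project's own APT repository has to be added first. As a rough sketch of what such setup steps can look like (the actual lines 587-605 of the plan are not reproduced here, and both the keyring path and the `pgdg_source_line` helper are assumptions):

```bash
#!/bin/sh
# pgdg_source_line CODENAME KEYRING: print the APT source line for the
# PostgreSQL (PGDG) repository, e.g. CODENAME "bookworm" for Debian 12.
pgdg_source_line() {
    printf 'deb [signed-by=%s] https://apt.postgresql.org/pub/repos/apt %s-pgdg main\n' "$2" "$1"
}

# Assumed keyring location; any root-owned path referenced by signed-by works.
pgdg_source_line bookworm /usr/share/keyrings/postgresql.gpg

# On the container itself (root and network access required), roughly:
#   curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc \
#     | gpg --dearmor -o /usr/share/keyrings/postgresql.gpg
#   pgdg_source_line bookworm /usr/share/keyrings/postgresql.gpg \
#     > /etc/apt/sources.list.d/pgdg.list
#   apt update && apt install -y postgresql-16
```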
---

## Active Troubleshooting: n8n 502 Bad Gateway

**Started**: 2025-11-30
**Updated**: 2025-12-01
**Status**: Ready for Deployment
**Issue**: n8n returns 502 Bad Gateway - Root cause identified and fix script prepared

### Problem Summary

**Symptoms**:
- ❌ External access: `https://n8n.apophisnetworking.net` returns 502 Bad Gateway (from mobile)
- ❌ Internal access: Returns nginx default page or connection issues
- ✅ Comparison: `beszel.apophisnetworking.net` works perfectly (both internal and external)

**Configuration Context**:
- n8n location: CT 113 at 192.168.2.113:5678
- NPM location: CT 102 at 192.168.2.101
- Beszel location: 192.168.2.102:8090 (working reference)
- All services behind same NPM, same Cloudflare DNS setup

### n8n Configuration (from /opt/n8n/.env)

```bash
# n8n Configuration
N8N_PROTOCOL=https
N8N_HOST=n8n.apophisnetworking.net
N8N_PORT=5678
N8N_PATH=/
WEBHOOK_URL=https://n8n.apophisnetworking.net/

# Database
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=localhost
DB_POSTGRESDB_PORT=5432
DB_POSTGRESDB_DATABASE=n8n_db
DB_POSTGRESDB_USER=n8n_user
```

### NPM Proxy Host Configuration (from screenshots)

**Details Tab**:
- Domain: `n8n.apophisnetworking.net`
- Scheme: `http`
- Forward to: `192.168.2.113:5678`
- Websockets: ✓ Enabled
- Status: Online (green)

**SSL Tab**:
- Certificate: `*.apophisnetworking.net` (wildcard)
- Force SSL: ✓ Enabled
- HTTP/2: ✓ Enabled
- HSTS: ✓ Enabled

### Diagnostic Steps Completed

- [x] **Verify n8n service status** (Lab-Operator)
  - Status: Service in crash loop - repeatedly starting and failing
  - Command: `systemctl status n8n` showed "activating (auto-restart)"

- [x] **Review service logs** (Lab-Operator)
  - Command: `journalctl -u n8n -n 100`
  - Errors found: Encryption key validation failures
  - Log showed: n8n exiting immediately after start attempt

- [x] **Analyze .env configuration** (Backend-Builder)
  - Found: `N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)`
  - Issue: .env files don't execute shell commands - this is a literal string
  - Missing: `N8N_LISTEN_ADDRESS=0.0.0.0`
  - Missing: `NODE_ENV=production`
  - Password needs quoting: `DB_POSTGRESDB_PASSWORD="Nbkx4mdmay1)"`

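
The findings from the .env analysis lend themselves to a mechanical check. A minimal sketch, assuming two hypothetical helpers (`missing_keys` and `unexpanded_lines`) and run here against a scratch file rather than the real `/opt/n8n/.env`:

```bash
#!/bin/sh
# missing_keys FILE KEY...: print every KEY that has no "KEY=" line in FILE.
missing_keys() {
    envfile=$1
    shift
    for key in "$@"; do
        grep -q "^${key}=" "$envfile" || printf '%s\n' "$key"
    done
}

# unexpanded_lines FILE: print lines whose value still contains a shell
# command substitution, either $(...) or backticks.
unexpanded_lines() {
    grep -E '=.*(\$\(|`)' "$1" || true
}

# Scratch file reproducing the two kinds of problem found above.
sample=$(mktemp)
printf '%s\n' 'N8N_PORT=5678' 'N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)' > "$sample"

echo "Missing keys:"
missing_keys "$sample" N8N_PORT N8N_LISTEN_ADDRESS NODE_ENV
echo "Unexpanded command substitutions:"
unexpanded_lines "$sample"
```

Both helpers only read the file, so they can be pointed at a live configuration without changing it.
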
### Root Cause Analysis

**PRIMARY ISSUE**: Invalid N8N_ENCRYPTION_KEY in /opt/n8n/.env

**Technical Explanation**:
The `.env` file contained `N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)`, which was intended to generate a random encryption key. However, `.env` files are not shell scripts, so nothing in them is executed. The variable was set to the **literal string** `$(openssl rand -hex 32)` instead of an actual 64-character hexadecimal key.

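
The difference between the literal string and an executed command can be reproduced in a few lines. This is an illustrative sketch only: the `sed` line imitates a generic line-based reader (how the n8n service actually loads the file is not stated here), and `is_hex_key` is a hypothetical helper.

```bash
#!/bin/sh
envfile=$(mktemp)
# Single quotes stop the shell from expanding $(...), so the file receives
# the same literal text that was found in /opt/n8n/.env.
printf '%s\n' 'N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)' > "$envfile"

# A line-based reader takes everything after "KEY=" verbatim; nothing runs.
literal_value=$(sed -n 's/^N8N_ENCRYPTION_KEY=//p' "$envfile")
printf 'value read from file: %s\n' "$literal_value"

# is_hex_key VALUE: succeed only if VALUE is exactly 64 hexadecimal characters.
is_hex_key() {
    printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{64}$'
}
is_hex_key "$literal_value" || echo "not a 64-character hex key"

# In a shell the substitution does run, which is what the fix script relies on.
if command -v openssl >/dev/null 2>&1; then
    real_key=$(openssl rand -hex 32)
    is_hex_key "$real_key" && echo "generated key is valid (${#real_key} characters)"
fi
```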
**Impact**:
- n8n service fails encryption key validation on startup
- Service enters crash loop (start → fail → restart → fail)
- NPM returns 502 Bad Gateway because backend service is down
- Issue was NOT hairpin NAT or NPM misconfiguration (beszel works fine with same setup)

**Additional Configuration Issues Identified**:
1. Missing `N8N_LISTEN_ADDRESS=0.0.0.0` - would cause service to listen only on localhost
2. Missing `NODE_ENV=production` - affects performance and security
3. Database password not quoted - special characters need proper escaping

### Attempted Solutions & Lessons Learned

**Attempts 1-3: Heredoc Script Failures**
- Created fix script using heredoc syntax for .env generation
- Error: `warning: here-document at line 22 delimited by end-of-file (wanted 'ENVEOF')`
- Root cause: Windows (CRLF) vs. Linux (LF) line-ending mismatch when copying the script from WSL to the LXC container
- Backend-Builder's first attempt incorrectly changed to SQLite (corrected to maintain PostgreSQL)
- Lesson: Heredoc syntax is fragile in cross-platform environments

**Final Solution: Simple Echo-Based Script**
- Replaced heredoc with simple `echo` statements
- More robust to copy-paste and line ending issues
- Avoids CRLF/LF conversion problems

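
Line-ending problems of this kind can be detected and removed before a script is run. One plausible mechanism for the heredoc error, not confirmed by the notes above, is that a carriage return left on the closing `ENVEOF` line keeps it from matching the delimiter exactly. A minimal sketch, with `has_crlf` and `strip_crlf` as hypothetical helper names:

```bash
#!/bin/sh
# has_crlf FILE: succeed if any line of FILE ends in a carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# strip_crlf FILE: rewrite FILE in place with LF-only line endings.
strip_crlf() {
    tmpfile=$(mktemp)
    tr -d '\r' < "$1" > "$tmpfile" && cat "$tmpfile" > "$1"
    rm -f "$tmpfile"
}

# Demo: a small heredoc script saved with Windows line endings.
script=$(mktemp)
printf 'cat <<ENVEOF\r\nhello\r\nENVEOF\r\n' > "$script"

has_crlf "$script" && echo "CRLF line endings found"
strip_crlf "$script"
has_crlf "$script" || echo "line endings are clean"
```

Running a check like `has_crlf` on a script right after it has been copied into the container shows whether the transfer changed its line endings.
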
### Solution: Fix Script Ready for Deployment

**Script Location**: `/tmp/fix_n8n_simple.sh` (on WSL, ready to transfer to CT 113)

**Script Actions**:
1. Generates proper encryption key: `ENCRYPTION_KEY=$(openssl rand -hex 32)`
2. Backs up existing .env with timestamp: `/opt/n8n/.env.backup.YYYYMMDD_HHMMSS`
3. Creates new .env file with corrected configuration:
   - Actual generated encryption key (not shell command)
   - Adds `N8N_LISTEN_ADDRESS=0.0.0.0`
   - Adds `NODE_ENV=production`
   - Properly quotes `DB_POSTGRESDB_PASSWORD`
   - Maintains PostgreSQL database configuration
4. Sets secure permissions: `chmod 600` and `chown n8n:n8n`
5. Restarts n8n service
6. Verifies service status and local connectivity

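
The script actions above can be sketched as follows. This is not the contents of `/tmp/fix_n8n_simple.sh`: the `write_env` function is hypothetical, it uses `printf` where the real script uses `echo`, it takes the password as an argument instead of hardcoding it, and it writes only a few of the variables (the rest would be added the same way). The demo runs in a scratch directory, with `od` reading `/dev/urandom` as a stand-in for `openssl rand -hex 32`.

```bash
#!/bin/sh
# write_env TARGET KEY DB_PASSWORD
# Back up TARGET if it exists, then rewrite it one line at a time (no heredoc).
write_env() {
    target=$1
    enc_key=$2
    db_password=$3

    if [ -f "$target" ]; then
        cp -p "$target" "$target.backup.$(date +%Y%m%d_%H%M%S)"
    fi

    {
        printf 'N8N_ENCRYPTION_KEY=%s\n' "$enc_key"
        printf 'N8N_LISTEN_ADDRESS=0.0.0.0\n'
        printf 'NODE_ENV=production\n'
        printf 'DB_TYPE=postgresdb\n'
        printf 'DB_POSTGRESDB_PASSWORD="%s"\n' "$db_password"
    } > "$target"

    chmod 600 "$target"
}

# Demo in a scratch directory; the real target would be /opt/n8n/.env.
workdir=$(mktemp -d)
printf '%s\n' 'N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)' > "$workdir/.env"

# 32 random bytes printed as 64 hexadecimal characters.
demo_key=$(od -An -tx1 -N32 /dev/urandom | tr -d ' \n')
write_env "$workdir/.env" "$demo_key" 'example-password'

ls -a "$workdir"
# On CT 113 the remaining steps would be, as root:
#   chown n8n:n8n /opt/n8n/.env
#   systemctl restart n8n && systemctl status n8n
```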
**Reviews Completed**:
- ✅ **Backend-Builder**: Code review APPROVED (95% confidence, technically sound)
- ✅ **Lab-Operator**: Operational review APPROVED with safeguards documented
  - Minimal downtime (~13 seconds)
  - No database corruption risk
  - Rollback procedures documented
  - Security recommendations provided

**Pre-Execution Safeguards**:
1. Create ZFS snapshot of CT 113: `pct snapshot 113 pre-n8n-fix`
2. Back up PostgreSQL database: `pg_dump n8n_db > /tmp/n8n_db_pre_fix_backup.sql`
3. Verify no encrypted credentials exist (likely none since service never started)

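
The rollback procedures themselves are not spelled out in this file. As one possible shape for them, assuming the timestamped backup naming described under Script Actions, a hypothetical `latest_backup` helper could locate the file to restore:

```bash
#!/bin/sh
# latest_backup DIR: print the newest DIR/.env.backup.YYYYMMDD_HHMMSS file.
# Names with this timestamp layout sort chronologically, so the last glob
# match is the newest one.
latest_backup() {
    newest=
    for candidate in "$1"/.env.backup.*; do
        [ -e "$candidate" ] && newest=$candidate
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Demo with two empty stand-in backups in a scratch directory.
scratch=$(mktemp -d)
: > "$scratch/.env.backup.20251201_213000"
: > "$scratch/.env.backup.20251202_090000"
latest_backup "$scratch"

# On CT 113, rolling back the file alone might then be:
#   cp -p "$(latest_backup /opt/n8n)" /opt/n8n/.env && systemctl restart n8n
# Rolling back the whole container, from the Proxmox host:
#   pct rollback 113 pre-n8n-fix
```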
**Security Notes**:
- Script contains hardcoded password - **delete after use**: `shred -u /tmp/fix_n8n_simple.sh`
- Do NOT commit script to git repository
- Encryption key properly secured in .env with 600 permissions

### Next Actions

- [ ] User to deploy fix script on CT 113 tomorrow (2025-12-02)
- [ ] Test external access after fix: `https://n8n.apophisnetworking.net`
- [ ] Verify service stability for 24 hours
- [ ] Update this status file to RESOLVED after successful deployment

### Files Referenced

- `/home/jramos/homelab/n8n/N8N-SETUP-PLAN.md` - Phase 5 configuration
- `/opt/n8n/.env` - n8n configuration (on CT 113)
- `/tmp/fix_n8n_simple.sh` - Fix script (NOT committed to git - contains password)
- `/data/nginx/proxy_host/*.conf` - NPM proxy configs (on CT 102)

---

**Repository**: /home/jramos/homelab | **Branch**: main