docs(n8n): document troubleshooting session for 502 Bad Gateway issue
Root Cause:
- N8N_ENCRYPTION_KEY in /opt/n8n/.env contained literal shell command string $(openssl rand -hex 32) instead of executed value
- .env files do not execute shell commands, only parse literal strings
- Caused n8n service crash loop preventing startup

Troubleshooting Process:
- Identified service crash loop via journalctl logs
- Backend-Builder diagnosed invalid encryption key issue
- Multiple heredoc script attempts failed due to Windows/Linux line ending issues in WSL environment
- Created simple fix script using echo statements (no heredoc)

Solution:
- Fix script created at /tmp/fix_n8n_simple.sh
- Generates proper encryption key using openssl rand -hex 32
- Recreates .env with corrected configuration including missing N8N_LISTEN_ADDRESS=0.0.0.0 and NODE_ENV=production
- Backs up existing .env before changes
- Sets proper permissions (600, n8n:n8n)

Reviews:
- Backend-Builder: APPROVED (95% confidence, technically sound)
- Lab-Operator: APPROVED with safeguards (ZFS snapshot, DB backup)

Status: Ready for deployment by user on CT 113 tomorrow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
169
CLAUDE_STATUS.md
@@ -1,9 +1,9 @@
|
|||||||
# Homelab Status Tracker
|
# Homelab Status Tracker
|
||||||
|
|
||||||
**Last Updated**: 2025-11-30 13:25:00
|
**Last Updated**: 2025-11-30 17:37:00
|
||||||
**Goal**: Document and commit recent infrastructure planning and integration documentation
|
**Goal**: Document and commit recent infrastructure planning and integration documentation
|
||||||
**Phase**: Completed
|
**Phase**: Completed
|
||||||
**Current Context**: All pre-commit tasks completed successfully. Documentation committed to repository with proper security sanitization. Commit hash: a1841f1c4193b143c9fa71746929cfe3cd9cbdbe
|
**Current Context**: All documentation corrections committed. Architecture updates for Debian 12 and NPM committed to repository. Latest commit hash: c16d5210709c38ccf3ef22785c23ac99a61f1703
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -135,6 +135,171 @@
|
|||||||
- Maintenance simplified (web UI vs config files)
|
- Maintenance simplified (web UI vs config files)
|
||||||
- Documentation accuracy: Aligned with real deployment environment
|
- Documentation accuracy: Aligned with real deployment environment
|
||||||
|
|
||||||
|
- [x] **Commit Architecture Corrections to Repository**
|
||||||
|
- Status: Completed at 2025-11-30 17:37:00
|
||||||
|
- Owner: Librarian
|
||||||
|
- Action: Created commit with conventional commit message for n8n architecture corrections
|
||||||
|
- Result: Commit hash c16d5210709c38ccf3ef22785c23ac99a61f1703
|
||||||
|
- Changes: 2 files changed, 325 insertions(+), 194 deletions(-)
|
||||||
|
* CLAUDE_STATUS.md: Updated with detailed change log
|
||||||
|
* n8n/N8N-SETUP-PLAN.md: 30+ architecture corrections (Debian 12 + NPM)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Active Troubleshooting: n8n 502 Bad Gateway
|
||||||
|
|
||||||
|
**Started**: 2025-11-30
|
||||||
|
**Updated**: 2025-12-01
|
||||||
|
**Status**: Ready for Deployment
|
||||||
|
**Issue**: n8n returns 502 Bad Gateway - Root cause identified and fix script prepared
|
||||||
|
|
||||||
|
### Problem Summary
|
||||||
|
|
||||||
|
**Symptoms**:
|
||||||
|
- ❌ External access: `https://n8n.apophisnetworking.net` returns 502 Bad Gateway (from mobile)
|
||||||
|
- ❌ Internal access: Returns nginx default page or connection issues
|
||||||
|
- ✅ Comparison: `beszel.apophisnetworking.net` works perfectly (both internal and external)
|
||||||
|
|
||||||
|
**Configuration Context**:
|
||||||
|
- n8n location: CT 113 at 192.168.2.113:5678
|
||||||
|
- NPM location: CT 102 at 192.168.2.101
|
||||||
|
- Beszel location: 192.168.2.102:8090 (working reference)
|
||||||
|
- All services behind same NPM, same Cloudflare DNS setup
|
||||||
|
|
||||||
|
### n8n Configuration (from /opt/n8n/.env)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# n8n Configuration
|
||||||
|
N8N_PROTOCOL=https
|
||||||
|
N8N_HOST=n8n.apophisnetworking.net
|
||||||
|
N8N_PORT=5678
|
||||||
|
N8N_PATH=/
|
||||||
|
WEBHOOK_URL=https://n8n.apophisnetworking.net/
|
||||||
|
|
||||||
|
# Database
|
||||||
|
DB_TYPE=postgresdb
|
||||||
|
DB_POSTGRESDB_HOST=localhost
|
||||||
|
DB_POSTGRESDB_PORT=5432
|
||||||
|
DB_POSTGRESDB_DATABASE=n8n_db
|
||||||
|
DB_POSTGRESDB_USER=n8n_user
|
||||||
|
```
|
||||||
|
|
||||||
|
### NPM Proxy Host Configuration (from screenshots)
|
||||||
|
|
||||||
|
**Details Tab**:
|
||||||
|
- Domain: `n8n.apophisnetworking.net`
|
||||||
|
- Scheme: `http`
|
||||||
|
- Forward to: `192.168.2.113:5678`
|
||||||
|
- Websockets: ✓ Enabled
|
||||||
|
- Status: Online (green)
|
||||||
|
|
||||||
|
**SSL Tab**:
|
||||||
|
- Certificate: `*.apophisnetworking.net` (wildcard)
|
||||||
|
- Force SSL: ✓ Enabled
|
||||||
|
- HTTP/2: ✓ Enabled
|
||||||
|
- HSTS: ✓ Enabled
|
||||||
|
|
||||||
|
### Diagnostic Steps Completed
|
||||||
|
|
||||||
|
- [x] **Verify n8n service status** (Lab-Operator)
|
||||||
|
- Status: Service in crash loop - repeatedly starting and failing
|
||||||
|
- Command: `systemctl status n8n` showed "activating (auto-restart)"
|
||||||
|
|
||||||
|
- [x] **Review service logs** (Lab-Operator)
|
||||||
|
- Command: `journalctl -u n8n -n 100`
|
||||||
|
- Errors found: Encryption key validation failures
|
||||||
|
- Log showed: n8n exiting immediately after start attempt
|
||||||
|
|
||||||
|
- [x] **Analyze .env configuration** (Backend-Builder)
|
||||||
|
- Found: `N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)`
|
||||||
|
- Issue: .env files don't execute shell commands - this is a literal string
|
||||||
|
- Missing: `N8N_LISTEN_ADDRESS=0.0.0.0`
|
||||||
|
- Missing: `NODE_ENV=production`
|
||||||
|
- Password needs quoting: `DB_POSTGRESDB_PASSWORD="Nbkx4mdmay1)"`
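The `.env` findings above could be reproduced with a small scan. This is a minimal sketch, not the check Backend-Builder actually ran: the function names `find_unexpanded_substitutions` and `find_missing_keys` are hypothetical, and the pattern only catches `$(...)` and backtick forms on `KEY=value` lines.

```bash
#!/bin/sh
# Hypothetical helper: print (with line numbers) every KEY=value line whose
# value still contains an unexpanded command substitution, $( or a backtick.
# Exit status is non-zero when nothing suspicious is found (grep semantics).
find_unexpanded_substitutions() {
    env_file="$1"
    grep -nE '^[A-Za-z_][A-Za-z0-9_]*=.*(\$\(|`)' "$env_file"
}

# Hypothetical helper: print each required key that has no KEY= line.
find_missing_keys() {
    env_file="$1"
    shift
    for key in "$@"; do
        grep -qE "^${key}=" "$env_file" || echo "$key"
    done
}
```

On CT 113 this might be called as `find_unexpanded_substitutions /opt/n8n/.env` and `find_missing_keys /opt/n8n/.env N8N_LISTEN_ADDRESS NODE_ENV`. The first helper prints matching lines in full, so its output should be treated as sensitive.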
|
||||||
|
|
||||||
|
### Root Cause Analysis
|
||||||
|
|
||||||
|
**PRIMARY ISSUE**: Invalid N8N_ENCRYPTION_KEY in /opt/n8n/.env
|
||||||
|
|
||||||
|
**Technical Explanation**:
|
||||||
|
The `.env` file contained `N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)` which was intended to generate a random encryption key. However, `.env` files are not shell scripts - they don't execute commands. The variable was set to the **literal string** `$(openssl rand -hex 32)` instead of an actual 64-character hexadecimal key.
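To make the literal-versus-generated distinction concrete, the sketch below checks whether a value has the shape that `openssl rand -hex 32` prints (64 lowercase hex characters). The helper name `is_hex_key` is hypothetical, and the check reflects the format this document expects, not a rule n8n itself is known to enforce.

```bash
#!/bin/sh
# Hypothetical check: succeeds only if $1 is exactly 64 lowercase hex
# characters, the shape that `openssl rand -hex 32` prints.
is_hex_key() {
    printf '%s' "$1" | grep -qE '^[0-9a-f]{64}$'
}

# What the env file held. Single quotes stand in for "no shell involved":
# the text is stored as-is and never executed.
literal='$(openssl rand -hex 32)'

if is_hex_key "$literal"; then
    echo "literal: looks like a generated key"
else
    echo "literal: not a generated key"
fi
```

In an actual shell script, `key=$(openssl rand -hex 32)` without the single quotes is expanded before assignment, so `is_hex_key "$key"` would be expected to succeed there.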
|
||||||
|
|
||||||
|
**Impact**:
|
||||||
|
- n8n service fails encryption key validation on startup
|
||||||
|
- Service enters crash loop (start → fail → restart → fail)
|
||||||
|
- NPM returns 502 Bad Gateway because backend service is down
|
||||||
|
- Issue was NOT hairpin NAT or NPM misconfiguration (beszel works fine with same setup)
|
||||||
|
|
||||||
|
**Additional Configuration Issues Identified**:
|
||||||
|
1. Missing `N8N_LISTEN_ADDRESS=0.0.0.0` - set explicitly so the service is reachable from NPM on CT 102 rather than relying on the default bind address
|
||||||
|
2. Missing `NODE_ENV=production` - affects performance and security
|
||||||
|
3. Database password not quoted - special characters need proper escaping
|
||||||
|
|
||||||
|
### Attempted Solutions & Lessons Learned
|
||||||
|
|
||||||
|
**Attempts 1-3: Heredoc Script Failures**
|
||||||
|
- Created fix script using heredoc syntax for .env generation
|
||||||
|
- Error: `warning: here-document at line 22 delimited by end-of-file (wanted 'ENVEOF')`
|
||||||
|
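- Root cause: Windows (CRLF) line endings introduced when copying the script from WSL to the LXC container; with a trailing carriage return the closing line no longer matches the `ENVEOF` delimiter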
|
||||||
|
- Backend-Builder's first attempt incorrectly changed to SQLite (corrected to maintain PostgreSQL)
|
||||||
|
- Lesson: Heredoc syntax fragile in cross-platform environments
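The line-ending problem can be detected before a script is run. With CRLF endings the heredoc terminator line is read as `ENVEOF` plus a carriage return, which is consistent with the "delimited by end-of-file" warning above. The sketch below is one way to check for and strip carriage returns; `has_crlf` and `strip_crlf` are hypothetical names, not part of the original fix.

```bash
#!/bin/sh
# Succeeds if the file contains at least one carriage return character.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Writes an LF-only copy of $1 to $2 by deleting every carriage return.
strip_crlf() {
    tr -d '\r' < "$1" > "$2"
}
```

A possible use on the container, assuming the script was copied to `/tmp/fix.sh`: `has_crlf /tmp/fix.sh && strip_crlf /tmp/fix.sh /tmp/fix.lf.sh`.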
|
||||||
|
|
||||||
|
**Final Solution: Simple Echo-Based Script**
|
||||||
|
- Replaced heredoc with simple `echo` statements
|
||||||
|
- More robust to copy-paste and line ending issues
|
||||||
|
- Avoids the heredoc delimiter mismatch caused by CRLF/LF conversion
|
||||||
|
|
||||||
|
### Solution: Fix Script Ready for Deployment
|
||||||
|
|
||||||
|
**Script Location**: `/tmp/fix_n8n_simple.sh` (on WSL, ready to transfer to CT 113)
|
||||||
|
|
||||||
|
**Script Actions**:
|
||||||
|
1. Generates proper encryption key: `ENCRYPTION_KEY=$(openssl rand -hex 32)`
|
||||||
|
2. Backs up existing .env with timestamp: `/opt/n8n/.env.backup.YYYYMMDD_HHMMSS`
|
||||||
|
3. Creates new .env file with corrected configuration:
|
||||||
|
- Actual generated encryption key (not shell command)
|
||||||
|
- Adds `N8N_LISTEN_ADDRESS=0.0.0.0`
|
||||||
|
- Adds `NODE_ENV=production`
|
||||||
|
- Properly quotes `DB_POSTGRESDB_PASSWORD`
|
||||||
|
- Maintains PostgreSQL database configuration
|
||||||
|
4. Sets secure permissions: `chmod 600` and `chown n8n:n8n`
|
||||||
|
5. Restarts n8n service
|
||||||
|
6. Verifies service status and local connectivity
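Steps 2 to 4 above can be sketched as a single function. This is an illustration under stated assumptions, not the contents of `/tmp/fix_n8n_simple.sh`: the function name `write_n8n_env` and its three arguments are hypothetical, the password is passed in by the caller instead of being hardcoded, and the variable values are the ones listed in this document.

```bash
#!/bin/sh
# Sketch only: write_n8n_env and its arguments are illustrative names.
#   $1 = path to the env file (e.g. /opt/n8n/.env)
#   $2 = encryption key, already generated by the caller's shell
#   $3 = database password, supplied by the caller
write_n8n_env() {
    env_file="$1"
    key="$2"
    db_password="$3"

    # Step 2: back up the existing file with a timestamp suffix.
    if [ -f "$env_file" ]; then
        cp -p "$env_file" "$env_file.backup.$(date +%Y%m%d_%H%M%S)"
    fi

    # Step 3: rebuild the file line by line (no heredoc). The subshell keeps
    # the restrictive umask from leaking into the calling shell.
    (
        umask 077
        {
            echo "N8N_PROTOCOL=https"
            echo "N8N_HOST=n8n.apophisnetworking.net"
            echo "N8N_PORT=5678"
            echo "N8N_PATH=/"
            echo "WEBHOOK_URL=https://n8n.apophisnetworking.net/"
            echo "N8N_LISTEN_ADDRESS=0.0.0.0"
            echo "NODE_ENV=production"
            echo "DB_TYPE=postgresdb"
            echo "DB_POSTGRESDB_HOST=localhost"
            echo "DB_POSTGRESDB_PORT=5432"
            echo "DB_POSTGRESDB_DATABASE=n8n_db"
            echo "DB_POSTGRESDB_USER=n8n_user"
            printf 'DB_POSTGRESDB_PASSWORD="%s"\n' "$db_password"
            printf 'N8N_ENCRYPTION_KEY=%s\n' "$key"
        } > "$env_file"
    )

    # Step 4 (partial): owner-only permissions. Changing ownership with
    # `chown n8n:n8n` needs root and an existing n8n user, so it is left
    # to the caller.
    chmod 600 "$env_file"
}
```

A call on CT 113 might look like `write_n8n_env /opt/n8n/.env "$(openssl rand -hex 32)" "$DB_PASSWORD"`, followed by `chown n8n:n8n /opt/n8n/.env` and `systemctl restart n8n` as root. Here the command substitution is expanded by the calling shell, so the file receives the generated key and not the command text.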
|
||||||
|
|
||||||
|
**Reviews Completed**:
|
||||||
|
- ✅ **Backend-Builder**: Code review APPROVED (95% confidence, technically sound)
|
||||||
|
- ✅ **Lab-Operator**: Operational review APPROVED with safeguards documented
|
||||||
|
- Minimal downtime (~13 seconds)
|
||||||
|
- No database corruption risk
|
||||||
|
- Rollback procedures documented
|
||||||
|
- Security recommendations provided
|
||||||
|
|
||||||
|
**Pre-Execution Safeguards**:
|
||||||
|
1. Create ZFS snapshot of CT 113: `pct snapshot 113 pre-n8n-fix`
|
||||||
|
2. Backup PostgreSQL database: `pg_dump n8n_db > /tmp/n8n_db_pre_fix_backup.sql`
|
||||||
|
3. Verify no encrypted credentials exist (likely none since service never started)
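The first two safeguards can be rehearsed as a dry run before anything is changed. In the sketch below, `run`, `snapshot_ct`, `dump_db`, and the `DRY_RUN` variable are hypothetical names. It assumes `pct` is run on the Proxmox host and `pg_dump` inside CT 113 as a user with access to the database, and it uses `pg_dump -f` in place of the shell redirect shown above.

```bash
#!/bin/sh
# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Safeguard 1, on the Proxmox host: snapshot the container.
snapshot_ct() {
    run pct snapshot "$1" "$2"
}

# Safeguard 2, inside the container: dump the database to a file.
dump_db() {
    run pg_dump -f "$2" "$1"
}
```

For example, `snapshot_ct 113 pre-n8n-fix` prints the planned command, and `DRY_RUN=0 snapshot_ct 113 pre-n8n-fix` executes it. Keeping dry run as the default means a mistyped container ID is seen before it is acted on.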
|
||||||
|
|
||||||
|
**Security Notes**:
|
||||||
|
- Script contains hardcoded password - **delete after use**: `shred -u /tmp/fix_n8n_simple.sh`
|
||||||
|
- Do NOT commit script to git repository
|
||||||
|
- Encryption key properly secured in .env with 600 permissions
|
||||||
|
|
||||||
|
### Next Actions
|
||||||
|
|
||||||
|
- [ ] User to deploy fix script on CT 113 tomorrow (2025-12-02)
|
||||||
|
- [ ] Test external access after fix: `https://n8n.apophisnetworking.net`
|
||||||
|
- [ ] Verify service stability for 24 hours
|
||||||
|
- [ ] Update this status file to RESOLVED after successful deployment
|
||||||
|
|
||||||
|
### Files Referenced
|
||||||
|
|
||||||
|
- `/home/jramos/homelab/n8n/N8N-SETUP-PLAN.md` - Phase 5 configuration
|
||||||
|
- `/opt/n8n/.env` - n8n configuration (on CT 113)
|
||||||
|
- `/tmp/fix_n8n_simple.sh` - Fix script (NOT committed to git - contains password)
|
||||||
|
- `/data/nginx/proxy_host/*.conf` - NPM proxy configs (on CT 102)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
**Repository**: /home/jramos/homelab | **Branch**: main
|
**Repository**: /home/jramos/homelab | **Branch**: main
|
||||||
|
|||||||