# Homelab Infrastructure Collection Guide
This guide explains how to use the `collect-homelab-config.sh` script to automatically gather your Proxmox infrastructure configuration for documentation, backup, and Infrastructure as Code purposes.
## Overview
The collection script is a read-only tool that:
- Gathers Proxmox VE configurations (VMs, containers, storage, network)
- Exports system information and cluster state
- Organizes everything in a well-structured directory
- Sanitizes sensitive information (passwords, tokens, optionally IPs)
- Generates summary reports and documentation
- Creates compressed archives for easy storage
**Important**: The script only reads your Proxmox configuration and writes its results to the output directory. It does not modify VMs, containers, or host settings.
## Prerequisites
### Running from WSL2 (Your Current Environment)
You'll need SSH access to your Proxmox host. The script must run **on the Proxmox node itself**, not from WSL.
### Access Methods
**Option 1: SSH to Proxmox and run directly**
```bash
# From WSL, copy the script to your Proxmox host
scp collect-homelab-config.sh root@<proxmox-ip>:/root/
# SSH into Proxmox
ssh root@<proxmox-ip>
# Make executable and run
chmod +x /root/collect-homelab-config.sh
/root/collect-homelab-config.sh
```
**Option 2: Run via SSH from WSL**
```bash
# Copy script to Proxmox
scp collect-homelab-config.sh root@<proxmox-ip>:/tmp/
# Execute remotely
ssh root@<proxmox-ip> 'bash /tmp/collect-homelab-config.sh'
# Copy results back to WSL
scp -r 'root@<proxmox-ip>:/root/homelab-export-*' ./
```
**Option 3: Use a wrapper script** (recommended)
A small wrapper run from WSL can copy the script to the host, execute it there, and fetch the results in one step.
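One possible shape for such a wrapper is sketched here. It assumes key-based SSH access as root and that the script writes `homelab-export-*` into the remote working directory, as in Option 2. The function name `collect_from_proxmox` and the `/tmp` staging path are illustrative choices, not part of the collection script.

```bash
#!/bin/bash
# Hypothetical wrapper, run from WSL: copy the script, execute it remotely, fetch results.
# Usage: collect_from_proxmox <proxmox-host> [extra script options...]
collect_from_proxmox() {
    local host="$1"; shift
    local script="collect-homelab-config.sh"
    local remote="root@${host}"

    # 1. Copy the script to the Proxmox host
    scp "${script}" "${remote}:/tmp/${script}" || return 1

    # 2. Run it from /root; extra options are passed through as one string,
    #    so options containing spaces are not supported by this sketch
    ssh "${remote}" "cd /root && bash /tmp/${script} $*" || return 1

    # 3. Copy the results back into the current directory
    #    (the glob is quoted so that the remote side expands it)
    scp -r "${remote}:/root/homelab-export-*" ./ || return 1
}
```

After sourcing the file, a call such as `collect_from_proxmox <proxmox-ip> --level standard` would run all three steps and stop at the first one that fails.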
## Quick Start
### 1. Make the Script Executable
```bash
chmod +x collect-homelab-config.sh
```
### 2. Run with Default Settings
On your Proxmox host (omit `sudo` if you are already logged in as root):
```bash
sudo ./collect-homelab-config.sh
```
This will:
- Use **standard** collection level
- Sanitize passwords and tokens (but preserve IPs)
- Create output in `./homelab-export-<timestamp>/`
- Generate a compressed `.tar.gz` archive
### 3. Review the Results
```bash
# View the summary
cat homelab-export-*/SUMMARY.md
# Check for any errors
cat homelab-export-*/collection.log
# Browse the collected configs
ls -R homelab-export-*/
```
## Collection Levels
The script supports four collection levels:
### Basic
**What it collects:**
- System information (CPU, memory, disk, network)
- Proxmox core configurations (datacenter, storage, users)
- VM and container configurations
- Network configurations
**Use case:** Quick snapshot for documentation
**Example:**
```bash
./collect-homelab-config.sh --level basic
```
### Standard (Default)
**What it collects:**
- Everything in Basic
- Storage pool configurations and status
- Backup job definitions
- Cluster information and resources
- Guest VM/container details
**Use case:** Regular infrastructure snapshots
**Example:**
```bash
./collect-homelab-config.sh --level standard
```
### Full
**What it collects:**
- Everything in Standard
- System service configurations
- Detailed service status
- Extended system state
**Use case:** Comprehensive documentation before major changes
**Example:**
```bash
./collect-homelab-config.sh --level full
```
### Paranoid
**What it collects:**
- Everything possible (experimental features included)
**Use case:** Maximum detail for disaster recovery planning
**Example:**
```bash
./collect-homelab-config.sh --level paranoid
```
## Sanitization Options
Protect sensitive information in your exports:
### Default Sanitization
Sanitizes passwords and tokens, preserves IP addresses:
```bash
./collect-homelab-config.sh
```
### Full Sanitization
Sanitizes everything (IPs, passwords, tokens):
```bash
./collect-homelab-config.sh --sanitize all
```
### IP Sanitization Only
```bash
./collect-homelab-config.sh --sanitize ips
```
### No Sanitization
**Warning:** Only use if storing in a secure location
```bash
./collect-homelab-config.sh --sanitize none
```
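Before sharing an export created with `--sanitize all`, it can be worth checking that no addresses slipped through. The sketch below is an independent check, not a feature of the collection script. The helper name `find_ipv4_strings` is hypothetical, and the pattern is deliberately loose, so it also flags version strings such as `1.2.3.4`.

```bash
#!/bin/bash
# Hypothetical helper: list lines in an export that still contain IPv4-looking strings.
# Prints "file:line:content" for each hit; returns non-zero when nothing matches.
# Usage: find_ipv4_strings <export-dir>
find_ipv4_strings() {
    local dir="$1"
    grep -rnE '([0-9]{1,3}\.){3}[0-9]{1,3}' "${dir}"
}
```

For example, `find_ipv4_strings homelab-export-*/ || echo "no IPv4-looking strings found"` prints either the suspicious lines or a short all-clear message.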
## Command Reference
### Common Usage Patterns
**Daily snapshot with compression:**
```bash
./collect-homelab-config.sh -l standard -o /backup/homelab/daily
```
**Pre-maintenance full backup:**
```bash
./collect-homelab-config.sh -l full -o /backup/pre-maintenance-$(date +%Y%m%d)
```
**Documentation export (sanitized for sharing):**
```bash
./collect-homelab-config.sh -l standard --sanitize all -o ./docs-export
```
**Verbose output for troubleshooting:**
```bash
./collect-homelab-config.sh -v
```
**Quick collection without compression:**
```bash
./collect-homelab-config.sh --no-compress
```
### All Options
```
-l, --level LEVEL      Collection level: basic, standard, full, paranoid
-o, --output DIR       Output directory
-s, --sanitize WHAT    Sanitization: all, ips, none
-c, --compress         Compress output (default: true)
    --no-compress      Skip compression
-v, --verbose          Verbose logging
-h, --help             Show help
```
## Output Structure
After running the script, you'll have this structure:
```
homelab-export-<timestamp>/
├── README.md                    # Complete documentation of the export
├── SUMMARY.md                   # Collection statistics and overview
├── collection.log               # Detailed collection log
├── configs/
│   ├── proxmox/                 # Proxmox VE configurations
│   │   ├── datacenter.cfg
│   │   ├── storage.cfg
│   │   ├── user.cfg
│   │   └── firewall-cluster.fw
│   ├── vms/                     # VM configs (VMID-name.conf)
│   │   ├── 100-docker-hub.conf
│   │   ├── 101-gitlab.conf
│   │   └── ...
│   ├── lxc/                     # Container configs (CTID-name.conf)
│   │   ├── 102-nginx.conf
│   │   ├── 103-netbox.conf
│   │   └── ...
│   ├── storage/                 # Storage configurations and status
│   ├── network/                 # Network interface configs
│   ├── backup/                  # Backup job definitions
│   └── services/                # System service configs (full/paranoid)
├── exports/
│   ├── system/                  # System information
│   │   ├── pve-version.txt
│   │   ├── hostname.txt
│   │   ├── cpuinfo.txt
│   │   ├── meminfo.txt
│   │   └── ...
│   ├── cluster/                 # Cluster status and resources
│   │   ├── cluster-status.txt
│   │   └── cluster-resources.json
│   └── guests/                  # VM/CT lists and details
│       ├── vm-list.txt
│       ├── container-list.txt
│       └── all-guests.json
├── docs/                        # (Empty - for manual documentation)
├── scripts/                     # (Empty - for automation scripts)
└── diagrams/                    # (Empty - for network diagrams)
```
## Automated Collection
### Setting Up Regular Snapshots
Create a cron job on your Proxmox host to collect weekly snapshots:
```bash
# Edit root's crontab
sudo crontab -e
# Add weekly collection every Sunday at 3 AM
0 3 * * 0 /root/collect-homelab-config.sh -l standard -o /backup/homelab/weekly-$(date +\%Y\%U) >> /var/log/homelab-collection.log 2>&1
```
### Retention Policy Example
Keep 4 weekly backups and clean up old ones:
```bash
#!/bin/bash
# /root/homelab-collection-cleanup.sh
BACKUP_DIR="/backup/homelab"
KEEP_WEEKS=4
# Remove archives older than KEEP_WEEKS
find "${BACKUP_DIR}" -name "homelab-export-*.tar.gz" -mtime +$((KEEP_WEEKS * 7)) -delete
# -maxdepth 1 keeps find from descending into directories it has just removed
find "${BACKUP_DIR}" -maxdepth 1 -name "weekly-*" -type d -mtime +$((KEEP_WEEKS * 7)) -exec rm -rf {} +
```
Add to crontab:
```bash
0 4 * * 0 /root/homelab-collection-cleanup.sh
```
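Before scheduling the cleanup, it can help to preview what it would remove. The following dry-run sketch applies the same name and age tests but only prints the matches. The function name `list_expired_backups` is hypothetical.

```bash
#!/bin/bash
# Hypothetical dry run: print what the cleanup would delete, without deleting anything.
# Usage: list_expired_backups <backup-dir> <keep-weeks>
list_expired_backups() {
    local backup_dir="$1"
    local keep_weeks="$2"
    local max_age_days=$((keep_weeks * 7))

    # Match old archives and old weekly-* directories, then print instead of delete
    find "${backup_dir}" \
        \( -name "homelab-export-*.tar.gz" -o \( -name "weekly-*" -type d \) \) \
        -mtime +"${max_age_days}" -print
}
```

Running `list_expired_backups /backup/homelab 4` lists every archive and weekly directory last modified more than 28 days ago.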
## Integration with Git
### Initial Repository Setup
```bash
# On your Proxmox host or after copying to WSL
cd homelab-export-<timestamp>/
# Initialize repository
git init
# Create .gitignore for sensitive files
cat > .gitignore <<'EOF'
# Exclude logs
*.log
# Exclude any unsanitized exports if keeping both
*-unsanitized/
# Exclude compressed archives (large files)
*.tar.gz
*.zip
EOF
# Initial commit
git add .
git commit -m "Initial homelab infrastructure snapshot $(date +%Y-%m-%d)"
# Add remote (replace with your repository)
git remote add origin git@your-git-server:homelab/infrastructure.git
# Push (rename the branch first, since `git init` may have created `master`)
git branch -M main
git push -u origin main
```
### Tracking Changes Over Time
```bash
# After subsequent collections
cd /path/to/homelab-export-<new-timestamp>/
# Copy to your git repo
cp -r configs/ /path/to/git-repo/
cp -r exports/ /path/to/git-repo/
cd /path/to/git-repo/
git add .
git diff --staged # Review what changed
git commit -m "Infrastructure snapshot $(date +%Y-%m-%d)"
git push
```
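One caveat with plain `cp -r`: files for guests that were deleted since the last snapshot stay in the repository, because copying never removes anything. A minimal sketch of one way around this is to replace the two directories wholesale before staging. The function name `replace_snapshot` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: replace the repo's configs/ and exports/ with a new snapshot,
# so that files for deleted guests disappear from the repo as well.
# Usage: replace_snapshot <export-dir> <git-repo-dir>
replace_snapshot() {
    local export_dir="$1"
    local repo_dir="$2"
    local sub

    for sub in configs exports; do
        # Skip anything the export does not contain, leaving the repo copy alone
        [ -d "${export_dir}/${sub}" ] || continue
        # ${repo_dir:?} aborts if the variable is empty, so this can never hit "/configs"
        rm -rf "${repo_dir:?}/${sub}"
        cp -r "${export_dir}/${sub}" "${repo_dir}/${sub}"
    done
}
```

With this in place, `git add .` followed by `git diff --staged` shows removed guests as deleted files rather than hiding them.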
## Using Collected Data
### 1. Documentation
The collected configs serve as a point-in-time reference for your infrastructure:
```bash
# Review your VM configurations
cat configs/vms/*.conf
# Check storage setup
cat configs/proxmox/storage.cfg
# Understand network topology
cat configs/network/interfaces
```
### 2. Disaster Recovery
In case of catastrophic failure:
1. Reinstall Proxmox VE on new hardware
2. Reference `configs/network/interfaces` to recreate network configuration
3. Reference `configs/proxmox/storage.cfg` to recreate storage pools
4. Use VM/container configs to understand what to restore
5. Restore the actual disk images from your Proxmox Backup Server (the export contains configuration only, not guest disks)
### 3. Infrastructure as Code Development
Use the collected configs as a reference when writing Terraform or Ansible code:
**Example: Convert to Terraform**
```bash
# Review a VM config
cat configs/vms/100-docker-hub.conf
# Create Terraform resource based on it
# (Use the config as reference for memory, CPU, disks, etc.)
```
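To pull individual values such as memory or core count out of a collected config, a small helper can save some scrolling. This sketch assumes the collected files keep Proxmox's usual `key: value` layout, with snapshot sections introduced by a `[name]` header. The function name `conf_value` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: print the value of one key from a Proxmox guest config.
# Stops at the first "[...]" header so values from snapshot sections are ignored.
# Usage: conf_value <file> <key>
conf_value() {
    local file="$1"
    local key="$2"
    awk -v key="${key}" '
        /^\[/ { exit }                      # a section header ends the main config
        index($0, key ": ") == 1 {          # line begins with "key: "
            print substr($0, length(key) + 3)
            exit
        }
    ' "${file}"
}
```

For example, `conf_value configs/vms/100-docker-hub.conf memory` prints the memory setting, which can then be copied into the matching Terraform resource by hand.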
**Example: Create Ansible Playbook**
```bash
# Use the exports to understand current state
cat exports/guests/vm-list.txt
# Build Ansible inventory
# Create playbooks to manage these VMs
```
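As a starting point for an inventory, the `VMID-name.conf` file names already carry each guest's ID and name. The sketch below turns them into lines in the style of an Ansible INI inventory. The function name `guests_from_configs` and the `vmid` variable name are illustrative, and guest names are not necessarily resolvable hostnames, so the output will usually need editing.

```bash
#!/bin/bash
# Hypothetical helper: print "name vmid=<id>" for each <ID>-<name>.conf file in a directory.
# Usage: guests_from_configs <dir>
guests_from_configs() {
    local dir="$1"
    local path base id name
    for path in "${dir}"/*.conf; do
        [ -e "${path}" ] || continue        # directory has no .conf files
        base=$(basename "${path}" .conf)    # e.g. 100-docker-hub
        id=${base%%-*}                      # text before the first "-"
        name=${base#*-}                     # text after the first "-"
        printf '%s vmid=%s\n' "${name}" "${id}"
    done
}
```

Calling `guests_from_configs configs/vms` and `guests_from_configs configs/lxc` gives one line per guest that can be pasted under separate inventory groups.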
### 4. Change Tracking
```bash
# Compare two exports
diff -u homelab-export-20240101-120000/configs/vms/ \
        homelab-export-20240201-120000/configs/vms/
# What changed in storage?
diff homelab-export-20240101-120000/configs/proxmox/storage.cfg \
     homelab-export-20240201-120000/configs/proxmox/storage.cfg
```
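When the question is simply which guests appeared or disappeared between two exports, comparing file names is often enough. A minimal sketch, with the hypothetical function name `guest_changes`:

```bash
#!/bin/bash
# Hypothetical helper: report config files that exist in only one of two directories.
# Usage: guest_changes <old-dir> <new-dir>
guest_changes() {
    local old_dir="$1"
    local new_dir="$2"
    local old_list new_list
    old_list=$(mktemp)
    new_list=$(mktemp)
    ls "${old_dir}" | sort > "${old_list}"
    ls "${new_dir}" | sort > "${new_list}"
    # comm -23: names only in the old list; comm -13: names only in the new list
    comm -23 "${old_list}" "${new_list}" | sed 's/^/removed: /'
    comm -13 "${old_list}" "${new_list}" | sed 's/^/added: /'
    rm -f "${old_list}" "${new_list}"
}
```

Pointing it at the `configs/vms/` directories of two exports prints one `removed:` or `added:` line per guest config that exists on only one side.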
## Troubleshooting
### Permission Denied Errors
**Problem:** Script fails with permission errors
**Solution:** Run as root or with sudo
```bash
sudo ./collect-homelab-config.sh
```
### Some Items Skipped
**Problem:** SUMMARY.md shows skipped items
**Cause:** Likely normal - some features may not be enabled on your system
**Action:** Review `collection.log` to see what was skipped. Common skipped items:
- ZFS tools (if you're not using ZFS)
- Cluster configs (if running single node)
- HA configs (if HA is not configured)
This is typically not an error.
### SSH Connection Issues (Remote Execution)
**Problem:** Cannot connect to Proxmox from WSL
**Solution:**
```bash
# Test SSH connection
ssh root@<proxmox-ip> 'hostname'
# If prompted for password, set up key-based auth
ssh-keygen -t ed25519
ssh-copy-id root@<proxmox-ip>
```
### Disk Space
**Problem:** Out of disk space
**Solution:**
- Use `--no-compress` to skip compression
- Use `-l basic` for smaller exports
- Specify output directory on a larger filesystem: `-o /path/to/larger/disk/`
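To catch this before a run rather than halfway through it, the free space on the target filesystem can be checked first. This is a sketch with a hypothetical function name, `has_free_kb`. The threshold is up to you, since the size of an export depends on the collection level and the number of guests.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the filesystem holding <dir> has at least <min-kb> KiB free.
# Usage: has_free_kb <dir> <min-kb>
has_free_kb() {
    local dir="$1"
    local min_kb="$2"
    local avail_kb
    # df -Pk prints one POSIX-format line per filesystem; column 4 is the available KiB
    avail_kb=$(df -Pk "${dir}" | awk 'NR == 2 { print $4 }')
    [ "${avail_kb}" -ge "${min_kb}" ]
}
```

A guard such as `has_free_kb /backup/homelab 1048576 || { echo "less than 1 GiB free"; exit 1; }` at the top of a wrapper script stops the run early.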
### Script Doesn't Execute
**Problem:** `bash: ./collect-homelab-config.sh: Permission denied`
**Solution:**
```bash
chmod +x collect-homelab-config.sh
```
## Security Best Practices
1. **Storage:** Keep exports in secure locations
```bash
# Set restrictive permissions
chmod 700 homelab-export-*/
```
2. **Git:** Use private repositories for unsanitized exports
```bash
# Never push unsanitized configs to public repos
# Always review before pushing:
git diff --staged
```
3. **Sanitization:** Use `--sanitize all` for any exports that leave your network
4. **Encryption:** Encrypt sensitive exports
```bash
# Encrypt the archive
gpg --symmetric --cipher-algo AES256 homelab-export-*.tar.gz
# Decrypt when needed
gpg --decrypt homelab-export-*.tar.gz.gpg > homelab-export.tar.gz
```
5. **Cleanup:** Remove old exports
```bash
# After archiving to secure backup storage
rm -rf homelab-export-20240101-*/
```
## Advanced Usage
### Custom Collection Script
Create a wrapper for your specific needs:
```bash
#!/bin/bash
# /root/my-homelab-backup.sh
BACKUP_ROOT="/backup/homelab"
DATE=$(date +%Y%m%d)
# Run collection
/root/collect-homelab-config.sh \
    --level full \
    --output "${BACKUP_ROOT}/export-${DATE}" \
    --sanitize all \
    --verbose
# Copy to NFS share
cp "${BACKUP_ROOT}/export-${DATE}.tar.gz" /mnt/backups/
# Upload to off-site (example)
# rclone copy "${BACKUP_ROOT}/export-${DATE}.tar.gz" remote:homelab-backups/
# Cleanup old exports (keep 30 days)
find "${BACKUP_ROOT}" -name "export-*.tar.gz" -mtime +30 -delete
echo "Backup completed: ${DATE}"
```
### Selective Collection
To collect only specific components, modify the script or extract them manually:
```bash
# Just VM configs
mkdir -p /tmp/just-vms
cp homelab-export-*/configs/vms/* /tmp/just-vms/
# Just network info
cp -r homelab-export-*/configs/network /tmp/network-backup/
```
### Integration with Monitoring
Alert if collection fails:
```bash
#!/bin/bash
# Wrapper with monitoring
if /root/collect-homelab-config.sh -l standard -o /backup/daily; then
    echo "Homelab collection successful" | mail -s "Backup OK" admin@example.com
else
    echo "Homelab collection FAILED - check logs" | mail -s "Backup FAILED" admin@example.com
fi
```
## Next Steps
1. **Run your first collection:**
```bash
./collect-homelab-config.sh --level standard --verbose
```
2. **Review the output:**
```bash
cat homelab-export-*/SUMMARY.md
less homelab-export-*/collection.log
```
3. **Set up automated collections** (see Automated Collection section)
4. **Initialize a Git repository** for your exports (see Integration with Git section)
5. **Create documentation** in the `docs/` folder within your exports
6. **Build network diagrams** and save them in `diagrams/`
7. **Use the configs** to start building Infrastructure as Code (Terraform, Ansible)
## Support and Issues
If you encounter issues:
1. Run with `--verbose` flag for detailed output
2. Review `collection.log` for error messages
3. Check that you're running as root/sudo
4. Verify you're on a Proxmox VE host (`cat /etc/pve/.version`)
5. Ensure adequate disk space (`df -h`)
---
**Prepared by:** Your homelab automation assistant, Steve
**Script Version:** 1.0.0
**Last Updated:** 2024-11-28