docs(security): comprehensive security audit and remediation documentation

- Add SECURITY.md policy with credential management, Docker security, SSL/TLS guidance
- Add security audit report (2025-12-20) with 31 findings across 4 severity levels
- Add pre-deployment security checklist template
- Update CLAUDE_STATUS.md with security audit initiative
- Expand services/README.md with comprehensive security sections
- Add script validation report and container name fix guide

Audit identified 6 CRITICAL, 3 HIGH, 2 MEDIUM findings
4-phase remediation roadmap created (estimated 6-13 min downtime)
All security scripts validated and ready for execution

Related: Security Audit Q4 2025, CRITICAL-001 through CRITICAL-006

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
864
SECURITY.md
Normal file
@@ -0,0 +1,864 @@
# Security Policy

**Version**: 1.0
**Last Updated**: 2025-12-20
**Effective Date**: 2025-12-20

## Overview

This document establishes the security policy and best practices for the homelab infrastructure environment running on Proxmox VE. The policy applies to all virtual machines (VMs), LXC containers, Docker services, and network resources deployed within the homelab.

## Scope

This security policy covers:
- Proxmox VE infrastructure (serviceslab node at 192.168.2.200)
- All virtual machines and LXC containers
- Docker containers and compose stacks
- Network services and reverse proxies
- Authentication and access control systems
- Data storage and backup systems
- Monitoring and logging infrastructure

## Vulnerability Disclosure

### Reporting Security Issues

Security vulnerabilities should be reported immediately to the infrastructure maintainer:

**Contact**: jramos
**Repository**: http://192.168.2.102:3060/jramos/homelab
**Documentation**: `/home/jramos/homelab/troubleshooting/`

### Disclosure Process

1. **Report**: Submit vulnerability details via secure channel
2. **Acknowledge**: Receipt confirmation within 24 hours
3. **Investigate**: Assessment and validation within 72 hours
4. **Remediate**: Fix deployment based on severity (see SLA below)
5. **Document**: Post-remediation documentation in `/troubleshooting/`
6. **Review**: Security audit update and lessons learned

### Severity Classification

| Severity | Response Time | Example |
|----------|---------------|---------|
| CRITICAL | < 4 hours | Docker socket exposure, root credential leaks |
| HIGH | < 24 hours | Unencrypted credentials, missing authentication |
| MEDIUM | < 72 hours | Weak passwords, missing SSL/TLS |
| LOW | < 7 days | Informational findings, optimization opportunities |
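
The response-time targets in the table can be turned into a concrete deadline when a finding is logged. The following is a minimal sketch, not part of the repository's scripts; the function names `sla_hours` and `sla_deadline` are hypothetical, and the hour values are taken directly from the table (LOW is 7 days, i.e. 168 hours).

```bash
#!/usr/bin/env bash
# Map a finding's severity to its response-time target in hours,
# using the values from the Severity Classification table.
sla_hours() {
  case "$1" in
    CRITICAL) echo 4 ;;
    HIGH)     echo 24 ;;
    MEDIUM)   echo 72 ;;
    LOW)      echo 168 ;;   # 7 days
    *)        echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Compute the response deadline (epoch seconds) from the report time.
# Both the input time and the result are plain integers, so no date parsing is needed.
sla_deadline() {
  local severity=$1 reported_epoch=$2 hours
  hours=$(sla_hours "$severity") || return 1
  echo $(( reported_epoch + hours * 3600 ))
}
```

For example, `sla_deadline CRITICAL "$(date +%s)"` prints the epoch time by which a CRITICAL finding reported now should be remediated.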

## Security Best Practices

### 1. Credential Management

#### 1.1 Password Requirements

**Minimum Standards**:
- Length: 16+ characters for administrative accounts
- Complexity: Mixed case, numbers, special characters
- Uniqueness: No password reuse across services
- Rotation: Every 90 days for privileged accounts

**Prohibited Practices**:
- Default passwords (e.g., `admin/admin`, `password`, `changeme`)
- Hardcoded credentials in docker-compose files
- Plaintext passwords in configuration files
- Credentials committed to version control
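
The prohibition on hardcoded credentials can be checked mechanically. The sketch below is an illustration rather than one of the repository's security scripts: the function name `find_hardcoded_secrets` and the keyword list (PASSWORD, SECRET, TOKEN, API_KEY) are assumptions, and because it is a plain pattern match it will also flag harmless entries such as `*_PASSWORD_FILE` paths.

```bash
#!/usr/bin/env bash
# Print lines (with line numbers) in a compose file that assign a literal
# value to a credential-like variable. Values written as ${VAR} references
# are skipped because the character after "=" or ":" is "$".
# The keyword list is an illustrative assumption, not an exhaustive one.
find_hardcoded_secrets() {
  local file=$1
  grep -nE '(PASSWORD|SECRET|TOKEN|API_KEY)[A-Z_]*[=:][[:space:]]*[^[:space:]$]' "$file"
}
```

Running `find_hardcoded_secrets docker-compose.yml` exits non-zero when nothing is found, so it can gate a pre-commit hook by inverting the result.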

#### 1.2 Secrets Management

**Docker Secrets Strategy**:
```yaml
# BAD: Hardcoded in docker-compose.yml
environment:
  - POSTGRES_PASSWORD=mypassword123

# GOOD: Value substituted from an environment file (.env)
environment:
  - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}

# BETTER: Docker secrets (native in swarm mode; file-based secrets also work in Compose)
secrets:
  - postgres_password
```

**Environment File Protection**:
```bash
# Ensure .env files are gitignored
echo "*.env" >> .gitignore
echo ".env.*" >> .gitignore

# Set restrictive permissions
chmod 600 /path/to/service/.env
chown root:root /path/to/service/.env
```
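
Permissions tend to drift after files are edited or restored, so it can help to verify them rather than only set them. A small sketch follows, assuming the `.env` files live somewhere under a directory you pass in; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Succeed only if the file's permission bits are exactly 600 (owner read/write).
# `find -perm 600` is used instead of `stat` because its syntax is portable.
env_file_is_private() {
  local file=$1
  [ -f "$file" ] && [ -n "$(find "$file" -prune -perm 600)" ]
}

# Print every environment file under a directory that is not mode 600.
# The directory argument is whatever path holds the compose stacks.
report_loose_env_files() {
  local dir=$1
  find "$dir" -type f \( -name '.env' -o -name '*.env' -o -name '.env.*' \) ! -perm 600 -print
}
```

An empty result from `report_loose_env_files /path/to/services` means every environment file found matches the `chmod 600` requirement above. Ownership is not checked in this sketch.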

**Credential Storage Locations**:
- Docker service secrets: `/path/to/service/.env` (gitignored)
- Proxmox credentials: Stored in Proxmox secret storage or `.env` files
- Database passwords: Environment variables, rotated quarterly
- API tokens: Environment variables, scoped to minimum permissions

#### 1.3 Credential Rotation

**Rotation Schedule**:

| Credential Type | Frequency | Tool/Script |
|-----------------|-----------|-------------|
| Proxmox root/API users | 90 days | `scripts/security/rotate-pve-credentials.sh` |
| Database passwords | 90 days | `scripts/security/rotate-paperless-password.sh` |
| JWT secrets | 90 days | `scripts/security/rotate-bytestash-jwt.sh` |
| Service passwords | 90 days | `scripts/security/rotate-logward-credentials.sh` |
| SSH keys | 365 days | Manual rotation via Ansible |

**Rotation Workflow**:
1. **Backup**: Create full backup before rotation (`scripts/security/backup-before-remediation.sh`)
2. **Generate**: Create new credential using password manager or `openssl rand -base64 32`
3. **Update**: Modify `.env` file or service configuration
4. **Restart**: Restart affected service: `docker compose restart <service>`
5. **Verify**: Test service functionality post-rotation
6. **Document**: Record rotation in `/troubleshooting/` log file
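
Steps 2 and 3 of the workflow above (generate, then update the `.env` file) can be sketched as follows. This is an illustrative outline and not the content of the `rotate-*.sh` scripts listed in the table; the function names are hypothetical. `awk` is used instead of `sed` so that the `/`, `+` and `=` characters produced by base64 need no escaping.

```bash
#!/usr/bin/env bash
# Step 2: generate a new random credential (same command the workflow names).
generate_secret() {
  openssl rand -base64 32
}

# Step 3: set KEY=VALUE in an env file, replacing an existing KEY line or
# appending one if the key is absent. The file is rewritten via a temporary
# file in the same directory and ends up with mode 600.
# Note: awk -v interprets backslash escapes, which base64 output never contains.
rotate_env_value() {
  local file=$1 key=$2 value=$3 tmp
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && chmod 600 "$tmp" && mv "$tmp" "$file"
}
```

A rotation would then look like `rotate_env_value /path/to/service/.env POSTGRES_PASSWORD "$(generate_secret)"` followed by the restart and verification steps. Note that many database images read the password variable only on first initialization, so the credential usually has to be changed inside the database as well.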

### 2. Docker Security

#### 2.1 Docker Socket Protection

**CRITICAL**: The Docker socket (`/var/run/docker.sock`) provides root-level access to the host system.

**Current Exposures** (as of 2025-12-20 audit):
- Portainer: Direct socket mount
- Nginx Proxy Manager: Direct socket mount
- Speedtest Tracker: Direct socket mount

**Remediation Strategy**:
```yaml
# INSECURE: Direct socket mount
volumes:
  - /var/run/docker.sock:/var/run/docker.sock

# SECURE: Use docker-socket-proxy
services:
  socket-proxy:
    image: tecnativa/docker-socket-proxy
    environment:
      - CONTAINERS=1
      - NETWORKS=1
      - SERVICES=1
      - TASKS=0
      - POST=0
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    restart: unless-stopped

  portainer:
    image: portainer/portainer-ce
    environment:
      - DOCKER_HOST=tcp://socket-proxy:2375
    # No direct socket mount
```

**Implementation Guide**: See `scripts/security/docker-socket-proxy/README.md`

#### 2.2 Container User Privileges

**Principle**: Containers should run as non-root users whenever possible.

**Current Issues** (2025-12-20 audit):
- Multiple containers running as root (UID 0)
- Missing `user:` directive in docker-compose files

**Remediation**:
```yaml
# Add to docker-compose.yml
services:
  myapp:
    image: myapp:latest
    user: "1000:1000"  # Run as non-root user
    # OR use image-specific variables
    environment:
      - PUID=1000
      - PGID=1000
```

**Verification**:
```bash
# Check running container user
docker exec <container> id

# Should show non-root user:
# uid=1000(appuser) gid=1000(appuser)
```
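
Checking containers one at a time with `docker exec` does not scale to the whole host. The sketch below lists running containers whose configured user is root, using `docker inspect`; the function names are hypothetical. One caveat: images that take `PUID`/`PGID` typically start as root and drop privileges later, so they will still be reported here.

```bash
#!/usr/bin/env bash
# Decide whether a Docker "Config.User" value means the container runs as root.
# An empty value means no user was configured, which defaults to root (UID 0).
is_root_user() {
  case "$1" in
    ""|0|root|0:*|root:*) return 0 ;;
    *) return 1 ;;
  esac
}

# List running containers whose configured user is root, one name per line.
# Requires access to the Docker CLI on the host being audited.
list_root_containers() {
  local id name user
  for id in $(docker ps -q); do
    name=$(docker inspect --format '{{.Name}}' "$id")
    user=$(docker inspect --format '{{.Config.User}}' "$id")
    if is_root_user "$user"; then
      echo "${name#/}"   # strip the leading slash Docker adds to names
    fi
  done
}
```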

#### 2.3 Container Hardening

**Security Checklist**:
- [ ] Run as non-root user
- [ ] Use read-only root filesystem where possible: `read_only: true`
- [ ] Drop unnecessary capabilities: `cap_drop: [ALL]`
- [ ] Limit resources: `mem_limit`, `cpus`
- [ ] Enable no-new-privileges: `security_opt: [no-new-privileges:true]`
- [ ] Use minimal base images (Alpine, distroless)
- [ ] Scan images for vulnerabilities: `trivy image <image>` (the older `docker scan` command has been retired from the Docker CLI)

**Example Hardened Service**:
```yaml
services:
  secure-app:
    image: secure-app:latest
    user: "1000:1000"
    read_only: true
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # Only if needed
    mem_limit: 512m
    cpus: 0.5
    tmpfs:
      - /tmp:size=100M,mode=1777
```
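
A quick way to see which checklist options a compose file lacks is a plain text search. This is a rough sketch under an obvious limitation: it is not a YAML parser, so it reports only whether an option appears anywhere in the file, not whether every service has it. The function name and the choice of four options are assumptions.

```bash
#!/usr/bin/env bash
# Print the hardening options from the checklist that do not appear anywhere
# in a compose file. This is a fixed-string search, not a YAML parser, so it
# cannot tell which service an option belongs to.
missing_hardening_options() {
  local file=$1 key
  for key in 'user:' 'read_only:' 'cap_drop:' 'no-new-privileges'; do
    grep -qF -- "$key" "$file" || echo "$key"
  done
}
```

No output from `missing_hardening_options docker-compose.yml` means all four options occur at least once in the file.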

#### 2.4 Image Security

**Best Practices**:
1. **Pin image versions**: Use specific tags, not `latest`
   ```yaml
   image: nginx:1.25.3-alpine  # GOOD
   image: nginx:latest         # BAD
   ```

2. **Verify image signatures**: Enable Docker Content Trust
   ```bash
   export DOCKER_CONTENT_TRUST=1
   ```

3. **Scan for vulnerabilities**: Use Trivy or Grype
   ```bash
   # Run Trivy from its container image (no local install needed)
   docker run --rm aquasec/trivy image nginx:1.25.3-alpine
   ```

4. **Use official images**: Prefer verified publishers from Docker Hub

5. **Regular updates**: Monthly image update cycle
   ```bash
   docker compose pull
   docker compose up -d
   ```
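
The pinning rule in item 1 can be checked per image reference. Below is a minimal sketch; the function name `is_pinned_image` is hypothetical. It treats a digest reference or any tag other than `latest` as pinned, and it strips the registry part first so that a registry port such as `registry.local:5000` is not mistaken for a tag.

```bash
#!/usr/bin/env bash
# Succeed if an image reference is pinned to a digest or a non-"latest" tag.
is_pinned_image() {
  local ref=$1 name tag
  # A digest reference (name@sha256:...) is the strongest form of pinning.
  case "$ref" in *@sha256:*) return 0 ;; esac
  name=${ref##*/}              # drop registry host and path, e.g. "nginx:1.25.3-alpine"
  case "$name" in
    *:*) tag=${name##*:} ;;
    *)   return 1 ;;           # no tag at all means Docker pulls "latest"
  esac
  [ -n "$tag" ] && [ "$tag" != "latest" ]
}
```

To audit running containers, the image names from `docker ps --format '{{.Image}}'` can be fed through this function one at a time.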

### 3. SSL/TLS Configuration

#### 3.1 Certificate Management

**Nginx Proxy Manager (NPM)**:
- Primary SSL termination point for external services
- Let's Encrypt integration for automatic certificate renewal
- Deployed on CT 102 (192.168.2.101)

**Certificate Lifecycle**:
1. **Generation**: Use Let's Encrypt via NPM UI (http://192.168.2.101:81)
2. **Deployment**: Automatic via NPM
3. **Renewal**: Automatic via NPM (Let's Encrypt certificates are valid for 90 days; renewal normally happens about 30 days before expiry)
4. **Monitoring**: Check NPM dashboard for expiry warnings
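
The monitoring step can also be scripted for certificates stored on disk, instead of relying only on the dashboard. The sketch below uses `openssl x509 -checkend`; the function name is hypothetical, and the path to the certificate file is whatever location the service actually uses.

```bash
#!/usr/bin/env bash
# Succeed if the certificate expires within the given number of days.
# `openssl x509 -checkend N` exits 0 when the certificate is still valid
# N seconds from now, so the result is inverted here. An unreadable file
# also counts as "expiring", which errs on the side of raising an alert.
cert_expires_within() {
  local cert_file=$1 days=$2
  ! openssl x509 -checkend $(( days * 86400 )) -noout -in "$cert_file" >/dev/null
}
```

For example, `cert_expires_within /path/to/service/certs/cert.pem 14 && echo "renew soon"` prints a warning when fewer than 14 days of validity remain.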

**Manual Certificate Installation** (if needed):
```bash
# Copy certificate to service
cp /path/to/cert.pem /path/to/service/certs/
cp /path/to/key.pem /path/to/service/certs/

# Set permissions
chmod 644 /path/to/service/certs/cert.pem
chmod 600 /path/to/service/certs/key.pem
```

#### 3.2 SSL/TLS Best Practices

**Current Gaps** (2025-12-20 audit):
- Internal services using HTTP (Grafana, Prometheus, PVE Exporter)
- Missing HSTS headers on some NPM proxies
- No TLS 1.3 enforcement

**Remediation Checklist**:
- [ ] Enable SSL for all web UIs (Grafana, Prometheus, Portainer)
- [ ] Configure NPM to force HTTPS redirects
- [ ] Enable HSTS headers: `Strict-Transport-Security: max-age=31536000`
- [ ] Disable TLS 1.0 and 1.1 (use TLS 1.2+ only)
- [ ] Use strong cipher suites (Mozilla Intermediate configuration)
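
Whether a proxy host actually sends the HSTS header from the checklist can be verified from its response headers. A minimal sketch follows, assuming the headers are piped in from a tool such as `curl -sI`; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Read HTTP response headers on stdin and succeed if a Strict-Transport-Security
# header is present with a max-age of at least one year (31536000 seconds).
has_strong_hsts() {
  local max_age
  max_age=$(tr -d '\r' \
    | grep -i '^strict-transport-security:' \
    | grep -oiE 'max-age=[0-9]+' \
    | head -n 1 \
    | cut -d= -f2)
  [ -n "$max_age" ] && [ "$max_age" -ge 31536000 ]
}
```

Usage would look like `curl -sI https://<service-hostname> | has_strong_hsts || echo "HSTS missing or too short"`, where the hostname is a placeholder for one of the proxied services.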

**NPM SSL Configuration**:
```nginx
# Custom Nginx Configuration (NPM Advanced tab)
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;

ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256';
ssl_prefer_server_ciphers on;
```

#### 3.3 Internal Service SSL

**Grafana HTTPS**:
```ini
# /etc/grafana/grafana.ini
[server]
protocol = https
cert_file = /etc/grafana/certs/cert.pem
cert_key = /etc/grafana/certs/key.pem
```

**Prometheus HTTPS**:
```yaml
# web-config.yml (passed to Prometheus with --web.config.file; TLS is not configured in prometheus.yml)
tls_server_config:
  cert_file: /etc/prometheus/certs/cert.pem
  key_file: /etc/prometheus/certs/key.pem
```

### 4. Network Security

#### 4.1 Network Segmentation

**Current Architecture**:
- Single flat network: 192.168.2.0/24
- All VMs and containers on same subnet

**Recommended Segmentation**:
```
Management VLAN (VLAN 10): 192.168.10.0/24
  - Proxmox node (192.168.10.200)
  - Ansible-Control (192.168.10.106)

Services VLAN (VLAN 20): 192.168.20.0/24
  - Web servers (109, 110)
  - Database server (111)
  - Docker services

DMZ VLAN (VLAN 30): 192.168.30.0/24
  - Nginx Proxy Manager (exposed to internet)
  - Public-facing services

Monitoring VLAN (VLAN 40): 192.168.40.0/24
  - Grafana, Prometheus, PVE Exporter
  - Logging services
```

**Implementation**: Use Proxmox VLANs and firewall rules (Phase 4 remediation)

#### 4.2 Firewall Rules

**Proxmox Firewall Best Practices**:
```bash
# The firewall is managed through the API (pvesh) or /etc/pve/firewall/cluster.fw;
# pveum only handles users and permissions.

# Allow management access (add ACCEPT rules first to avoid locking yourself out)
pvesh create /cluster/firewall/rules --type in --action ACCEPT --proto tcp --dport 8006 --source 192.168.2.0/24 --enable 1

# Allow SSH (key-based only)
pvesh create /cluster/firewall/rules --type in --action ACCEPT --proto tcp --dport 22 --source 192.168.2.0/24 --enable 1

# Default deny incoming
pvesh set /cluster/firewall/options --policy_in DROP

# Enable Proxmox firewall
pvesh set /cluster/firewall/options --enable 1
```

**Docker Network Isolation**:
```yaml
# Create isolated networks per service
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true  # No external access

services:
  web:
    networks:
      - frontend
      - backend

  db:
    networks:
      - backend  # Database not exposed to frontend
```

#### 4.3 Rate Limiting & DDoS Protection

**Current Gaps**:
- No rate limiting on NPM proxies
- No fail2ban deployment
- No intrusion detection system (IDS)

**NPM Rate Limiting**:
```nginx
# Custom Nginx Configuration (NPM)
# limit_req_zone is only valid in the http context, so define the zones in an
# http-level include rather than in a proxy host's Advanced tab.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=web_limit:10m rate=100r/s;

location /api/ {
    limit_req zone=api_limit burst=20 nodelay;
}

location / {
    limit_req zone=web_limit burst=50 nodelay;
}
```

**Fail2ban Deployment** (Phase 3 remediation):
```bash
# Install on NPM container or host
apt-get install fail2ban

# Configure jail for NPM
# The "npm" filter must also be defined in /etc/fail2ban/filter.d/npm.conf,
# and logpath must point at the log file NPM actually writes.
cat > /etc/fail2ban/jail.d/npm.conf << 'EOF'
[npm]
enabled = true
port = http,https
filter = npm
logpath = /var/log/nginx/error.log
maxretry = 5
bantime = 3600
EOF
```

### 5. Access Control

#### 5.1 Authentication

**Multi-Factor Authentication (MFA)**:
- **Proxmox**: Enable 2FA via TOTP (Google Authenticator, Authy)
  ```bash
  # TOTP is enrolled in the web UI (Datacenter > Permissions > Two Factor).
  # List the TFA entries configured for a user to confirm enrollment:
  pveum user tfa list <user@pam>
  ```
- **Portainer**: Enable MFA in Portainer settings
- **Grafana**: Enable TOTP 2FA in user preferences
- **NPM**: No native MFA (use reverse proxy authentication)

**SSO Integration**:
- TinyAuth (CT 115) provides SSO for NetBox
- Extend to other services using OAuth2/OIDC (Phase 4)

#### 5.2 Authorization

**Principle of Least Privilege**:
- Grant minimum required permissions
- Use role-based access control (RBAC) where available
- Regular access reviews (quarterly)

**Proxmox Roles**:
```bash
# Create limited user for monitoring
pveum user add monitor@pve
pveum acl modify / --user monitor@pve --role PVEAuditor
```

**Docker/Portainer Roles**:
- Admin: Full access to all stacks
- User: Access to specific stacks only
- Read-only: View-only access for monitoring

#### 5.3 SSH Access

**SSH Hardening**:
```bash
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
# Consider a non-standard port (comments go on their own line; trailing
# comments are not accepted by every OpenSSH version)
Port 22
AllowUsers jramos ansible-user
MaxAuthTries 3
ClientAliveInterval 300
ClientAliveCountMax 2
```
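
After editing `sshd_config`, it is worth confirming that the file really contains the intended values before reloading the daemon. The following sketch checks a single keyword in a config file; the function name is hypothetical. It does not follow `Include` directives or `Match` blocks, so `sshd -T` (which prints the effective configuration) remains the authoritative check on the host itself.

```bash
#!/usr/bin/env bash
# Succeed if the first uncommented occurrence of a keyword in an sshd_config
# file has the expected value. sshd uses the first value it reads for most
# keywords, and keyword names are case-insensitive.
sshd_setting_is() {
  local file=$1 keyword=$2 expected=$3 actual
  actual=$(awk -v k="$keyword" '
    tolower($1) == tolower(k) { print $2; exit }
  ' "$file")
  [ "$actual" = "$expected" ]
}
```

For example, `sshd_setting_is /etc/ssh/sshd_config PermitRootLogin no || echo "root login still allowed"`.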

**SSH Key Management**:
- Use ED25519 keys: `ssh-keygen -t ed25519 -C "your_email@example.com"`
- Rotate keys annually
- Store private keys securely (password manager, SSH agent)
- Distribute public keys via Ansible

### 6. Logging and Monitoring

#### 6.1 Centralized Logging

**Current State**:
- Individual service logs: `docker compose logs`
- No centralized log aggregation

**Recommended Stack** (Phase 4):
- **Loki**: Log aggregation
- **Promtail**: Log shipping
- **Grafana**: Log visualization

**Implementation**:
```yaml
# loki/docker-compose.yml
services:
  loki:
    image: grafana/loki:latest  # pin to a specific version in practice (see 2.4)
    # Point Loki at the mounted file; the image default is a different path
    command: -config.file=/etc/loki/loki-config.yml
    ports:
      - 3100:3100
    volumes:
      - ./loki-config.yml:/etc/loki/loki-config.yml
      - loki-data:/loki

  promtail:
    image: grafana/promtail:latest  # pin to a specific version in practice (see 2.4)
    command: -config.file=/etc/promtail/promtail-config.yml
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail-config.yml:/etc/promtail/promtail-config.yml

# Named volumes must be declared at the top level
volumes:
  loki-data:
```

#### 6.2 Security Monitoring

**Key Metrics to Monitor**:
- Failed authentication attempts (Proxmox, SSH, services)
- Docker socket access events
- Privilege escalation attempts
- Network traffic anomalies
- Resource exhaustion (CPU, memory, disk)

**Alerting Rules** (Prometheus):
```yaml
# alerts.yml
# Note: ssh_failed_login_total and docker_socket_access_total are placeholder
# metric names; they assume a custom exporter or log-derived metrics.
groups:
  - name: security
    rules:
      - alert: HighFailedSSHLogins
        expr: rate(ssh_failed_login_total[5m]) > 5
        for: 5m
        annotations:
          summary: "High rate of failed SSH logins"

      - alert: DockerSocketAccess
        expr: increase(docker_socket_access_total[1h]) > 100
        annotations:
          summary: "Unusual Docker socket activity"
```

#### 6.3 Audit Logging

**Proxmox Audit Log**:
```bash
# View Proxmox audit log
cat /var/log/pve/tasks/index

# Monitor in real-time
tail -f /var/log/pve/tasks/index
```

**Docker Audit Logging**:
```yaml
# docker-compose.yml
services:
  myapp:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        labels: "service,environment"
```

### 7. Backup and Recovery

#### 7.1 Backup Strategy

**Current Implementation**:
- Proxmox Backup Server (PBS) at 28.27% utilization
- Automated daily incremental backups
- Weekly full backups

**Backup Scope**:
- All VMs and LXC containers
- Docker volumes (manual backup via scripts)
- Configuration files (version controlled in Git)

**Backup Verification**:
```bash
# Pre-remediation backup
/home/jramos/homelab/scripts/security/backup-before-remediation.sh

# List backup groups to confirm the backup exists
# (listing does not check integrity; use a PBS verification job for that)
proxmox-backup-client list --repository <repo>
```

#### 7.2 Encryption at Rest

**Current Gaps** (2025-12-20 audit):
- PBS backups not encrypted
- Docker volumes not encrypted
- Sensitive configuration files unencrypted

**Remediation** (Phase 4):
```bash
# Enable PBS encryption: create a client-side key once, then pass it to each backup
proxmox-backup-client key create /path/to/pbs-encryption.key
proxmox-backup-client backup ... --keyfile /path/to/pbs-encryption.key

# LUKS encryption for sensitive volumes (luksFormat erases all data on the device)
cryptsetup luksFormat /dev/sdb
cryptsetup luksOpen /dev/sdb encrypted-volume
mkfs.ext4 /dev/mapper/encrypted-volume
```

#### 7.3 Disaster Recovery

**Recovery Time Objective (RTO)**: 4 hours
**Recovery Point Objective (RPO)**: 24 hours
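
An RPO of 24 hours means the newest usable backup should never be older than 24 hours. That condition is simple arithmetic on timestamps, sketched below; the function name is hypothetical, and how the timestamp of the latest backup is obtained (for example from the PBS snapshot list) is left to the caller.

```bash
#!/usr/bin/env bash
# Succeed if the newest backup is recent enough to satisfy the RPO.
# Arguments: backup time and current time as epoch seconds, RPO in hours
# (defaults to the 24 hours stated in this policy).
backup_within_rpo() {
  local backup_epoch=$1 now_epoch=$2 rpo_hours=${3:-24}
  local age_seconds=$(( now_epoch - backup_epoch ))
  [ "$age_seconds" -ge 0 ] && [ "$age_seconds" -le $(( rpo_hours * 3600 )) ]
}
```

For example, `backup_within_rpo "$last_backup_epoch" "$(date +%s)" || echo "RPO violated"`, where `last_backup_epoch` is a variable the caller fills in.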

**Recovery Procedure**:
1. **Assess Damage**: Identify failed components
2. **Restore Infrastructure**: Rebuild Proxmox node if needed
3. **Restore VMs/Containers**: Use PBS restore
4. **Restore Data**: Mount backup volumes
5. **Verify Functionality**: Test all services
6. **Document Incident**: Post-mortem in `/troubleshooting/`

**Recovery Testing**: Quarterly DR drills

### 8. Vulnerability Management

#### 8.1 Vulnerability Scanning

**Container Scanning**:
```bash
# Install Trivy (apt-key is deprecated, so the key goes into a dedicated keyring)
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | gpg --dearmor | sudo tee /usr/share/keyrings/trivy.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install trivy

# Scan the image of every running container (each distinct image once)
docker ps --format '{{.Image}}' | sort -u | xargs -I {} trivy image {}

# Scan docker-compose stack
trivy config docker-compose.yml
```

**Host Scanning**:
```bash
# Install OpenSCAP
apt-get install libopenscap8 openscap-scanner

# Run CIS benchmark scan
oscap xccdf eval --profile cis --results scan-results.xml /usr/share/xml/scap/ssg/content/ssg-ubuntu2004-xccdf.xml
```

#### 8.2 Patch Management

**Update Schedule**:
- **Proxmox VE**: Monthly (during maintenance window)
- **VMs/Containers**: Bi-weekly (automated via Ansible)
- **Docker Images**: Monthly (CI/CD pipeline)
- **Host OS**: Weekly (security patches only)

**Ansible Patch Playbook**:
```yaml
# playbooks/patch-systems.yml
- hosts: all
  become: yes
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes

    - name: Upgrade all packages
      apt:
        upgrade: dist

    # Debian/Ubuntu create this file when an installed update needs a reboot
    - name: Check whether a reboot is required
      stat:
        path: /var/run/reboot-required
      register: reboot_required_file

    - name: Reboot if required
      reboot:
        msg: "Rebooting after patching"
      when: reboot_required_file.stat.exists
```

#### 8.3 Security Baseline Compliance

**CIS Docker Benchmark**:
- See audit report: `/home/jramos/homelab/troubleshooting/SECURITY_AUDIT_2025-12-20.md`
- Current compliance: ~40% (as of 2025-12-20)
- Target compliance: 80% (by Q1 2026)

**NIST Cybersecurity Framework**:
- **Identify**: Asset inventory (CLAUDE_STATUS.md)
- **Protect**: Access control, encryption (this document)
- **Detect**: Monitoring, logging (Grafana, Prometheus)
- **Respond**: Incident response plan (Section 9)
- **Recover**: Backup and DR (Section 7)

## 9. Incident Response

### 9.1 Incident Classification

| Severity | Definition | Examples |
|----------|------------|----------|
| P1 - Critical | Service outage, data breach | Proxmox node failure, credential leak |
| P2 - High | Degraded service, security vulnerability | Single VM down, HIGH severity finding |
| P3 - Medium | Non-critical issue | SSL certificate expiry warning |
| P4 - Low | Informational, enhancement | Log rotation, optimization |

### 9.2 Response Procedure

**Phase 1: Detection**
- Monitor alerts from Grafana/Prometheus
- Review logs for anomalies
- User-reported issues

**Phase 2: Containment**
- Isolate affected systems (firewall rules, network disconnect)
- Preserve evidence (logs, disk images)
- Prevent spread (patch vulnerable services)

**Phase 3: Eradication**
- Remove malware/backdoors
- Patch vulnerabilities
- Reset compromised credentials

**Phase 4: Recovery**
- Restore from clean backups
- Verify service functionality
- Monitor for recurrence

**Phase 5: Post-Incident**
- Document incident in `/troubleshooting/`
- Update security controls
- Conduct lessons learned review

### 9.3 Communication Plan

**Internal Communication**:
- Incident lead: jramos
- Status updates: CLAUDE_STATUS.md
- Documentation: `/troubleshooting/INCIDENT-YYYY-MM-DD.md`

**External Communication**:
- For homelab: Not applicable (internal environment)
- For production: Define stakeholder notification procedure

## 10. Compliance and Auditing

### 10.1 Security Audits

**Audit Schedule**:
- **Quarterly**: Internal security review
- **Annually**: Comprehensive security audit
- **Ad-hoc**: After major infrastructure changes

**Audit Scope**:
- Credential management practices
- Docker security configuration
- SSL/TLS certificate status
- Access control policies
- Backup and recovery procedures
- Vulnerability scan results

**Audit Documentation**:
- Location: `/home/jramos/homelab/troubleshooting/SECURITY_AUDIT_*.md`
- Latest Audit: 2025-12-20 (31 findings)
- Next Audit: 2026-03-20 (Q1 2026)

### 10.2 Compliance Standards

**Applicable Standards** (for reference/practice):
- CIS Docker Benchmark v1.6.0
- NIST Cybersecurity Framework v1.1
- OWASP Top 10 (for web services)
- PCI-DSS v4.0 (if handling payment data - N/A for homelab)

**Compliance Tracking**:
- Checklist: `/home/jramos/homelab/templates/SECURITY_CHECKLIST.md`
- Status: CLAUDE_STATUS.md (Security Status section)
- Evidence: `/troubleshooting/` and `/scripts/security/`

### 10.3 Documentation Requirements

**Required Security Documentation**:
- [x] Security Policy (this document)
- [x] Security Audit Reports (`/troubleshooting/SECURITY_AUDIT_*.md`)
- [x] Pre-Deployment Security Checklist (`/templates/SECURITY_CHECKLIST.md`)
- [x] Credential Rotation Procedures (`/scripts/security/*.sh`)
- [x] Incident Response Plan (Section 9 of this document)
- [ ] Network Topology Diagram (TBD in Phase 4)
- [ ] Data Flow Diagrams (TBD in Phase 4)
- [ ] Risk Assessment Matrix (TBD in Q1 2026)

## 11. Security Checklists

### Pre-Deployment Security Checklist

See comprehensive checklist: `/home/jramos/homelab/templates/SECURITY_CHECKLIST.md`

**Quick Validation**:
```bash
# Run quick security check
# The validation script is embedded in the checklist; bash cannot execute a
# Markdown file or a #anchor. Open the checklist and run the commands from
# its quick validation script section:
less /home/jramos/homelab/templates/SECURITY_CHECKLIST.md
```

### Quarterly Security Review Checklist

- [ ] Review and rotate all service credentials
- [ ] Scan all containers for vulnerabilities (Trivy)
- [ ] Update all Docker images to latest versions
- [ ] Review Proxmox audit logs for anomalies
- [ ] Verify backup integrity and test restore
- [ ] Review firewall rules and network ACLs
- [ ] Update SSL certificates (if manual)
- [ ] Review user access and permissions (RBAC)
- [ ] Patch Proxmox VE, VMs, and containers
- [ ] Update security documentation (this file)
- [ ] Conduct penetration testing (if applicable)
- [ ] Review and update incident response plan

## 12. Security Resources

### Internal Documentation

- **Security Audit Report**: `/home/jramos/homelab/troubleshooting/SECURITY_AUDIT_2025-12-20.md`
- **Security Scripts**: `/home/jramos/homelab/scripts/security/`
- **Security Checklist**: `/home/jramos/homelab/templates/SECURITY_CHECKLIST.md`
- **Infrastructure Status**: `/home/jramos/homelab/CLAUDE_STATUS.md`
- **Service Documentation**: `/home/jramos/homelab/services/README.md`

### External Resources

**Docker Security**:
- [Docker Security Best Practices](https://docs.docker.com/engine/security/)
- [CIS Docker Benchmark](https://www.cisecurity.org/benchmark/docker)
- [OWASP Docker Security Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html)

**Proxmox Security**:
- [Proxmox VE Security Guide](https://pve.proxmox.com/wiki/Security)
- [Proxmox Firewall](https://pve.proxmox.com/wiki/Firewall)
- [Proxmox User Management](https://pve.proxmox.com/wiki/User_Management)

**General Security**:
- [NIST Cybersecurity Framework](https://www.nist.gov/cyberframework)
- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
- [Mozilla SSL Configuration Generator](https://ssl-config.mozilla.org/)

**Security Tools**:
- [Trivy Container Scanner](https://github.com/aquasecurity/trivy)
- [Docker Bench Security](https://github.com/docker/docker-bench-security)
- [Lynis Security Auditing Tool](https://cisofy.com/lynis/)

## 13. Change Log

| Date | Version | Changes | Author |
|------|---------|---------|--------|
| 2025-12-20 | 1.0 | Initial security policy creation following comprehensive security audit | jramos / Claude Sonnet 4.5 |

---

**Document Owner**: jramos
**Review Frequency**: Quarterly
**Next Review**: 2026-03-20
**Classification**: Internal Use
**Repository**: http://192.168.2.102:3060/jramos/homelab