homelab/troubleshooting/loki-stack-bugfix.md
Jordan Ramos 892684c46e feat(monitoring): resolve Loki-stack syslog ingestion with rsyslog filter fix
Fixed critical issue preventing UniFi router logs from reaching Loki/Promtail/Grafana.

Root Cause:
- rsyslog filter in /etc/rsyslog.d/unifi-router.conf filtered for 192.168.1.1
- VM 101 on VLAN 2, actual source IP is 192.168.2.1 (VLAN 2 gateway)
- Filter silently rejected all incoming syslog traffic

Solution:
- Updated rsyslog filter from 192.168.1.1 to 192.168.2.1
- Logs now flow: UniFi → rsyslog → Promtail → Loki → Grafana

Changes:
- Add services/loki-stack/* - Complete Loki/Promtail/Grafana stack configs
- Add services/logward/* - Logward service configuration
- Update troubleshooting/loki-stack-bugfix.md - Complete 5-phase resolution
- Update CLAUDE_STATUS.md - Document 2025-12-11 resolution
- Update sub-agents/scribe.md - Agent improvements
- Remove services/promtail-config.yml - Duplicate file cleanup

Status: Monitoring stack fully operational, syslog ingestion active

Technical Details: See troubleshooting/loki-stack-bugfix.md for complete analysis

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 13:56:27 -07:00


Here is a summary of the troubleshooting session to build your centralized logging stack.

1. The Objective

Create a monitoring stack on Proxmox using Loki (log database) and Promtail (log collector) to ingest logs from:

Proxmox Host: Via TCP (Reliable).

UniFi Dream Router: Via UDP (Legacy RFC3164 format).

2. The Final Architecture

Because Promtail strictly enforces modern log standards (RFC5424) and UniFi sends "dirty" legacy logs (RFC3164), we adopted a "Translator" Architecture.

UniFi Router: Sends UDP logs to the Host VM.

Host Rsyslog: Catches UDP, rewrites each message as RFC5424, and forwards it over TCP to Promtail in Docker.

Promtail: Receives clean TCP logs and pushes them to Loki.
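To make the translation step concrete, here is a minimal Python sketch of the kind of rewrite that happens when a legacy RFC3164 line is re-emitted in RFC5424 layout. It is an illustration only, not rsyslog's implementation: the regular expression, the `to_rfc5424` name, the sample hostname `udr`, and the choice to supply the year and assume UTC are all assumptions made for this example.

```python
import re
from datetime import datetime, timezone

# Legacy BSD syslog layout: "<PRI>Mmm dd hh:mm:ss HOST TAG[PID]: MSG"
RFC3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<ts>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: ?"
    r"(?P<msg>.*)$"
)


def to_rfc5424(line: str, year: int) -> str:
    """Rewrite one RFC3164 line as RFC5424 (no structured data)."""
    m = RFC3164.match(line)
    if m is None:
        raise ValueError("not an RFC3164 line")
    # RFC3164 timestamps carry neither year nor zone, so the caller supplies
    # the year and UTC is assumed. %b expects English month names.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    ts = ts.replace(tzinfo=timezone.utc)
    pid = m["pid"] or "-"
    # Field order: PRI VERSION TIMESTAMP HOST APP PROCID MSGID SD MSG
    return f"<{m['pri']}>1 {ts.isoformat()} {m['host']} {m['tag']} {pid} - - {m['msg']}"
```

For example, `to_rfc5424("<30>Dec 11 13:56:27 udr kernel: link up", 2025)` returns `<30>1 2025-12-11T13:56:27+00:00 udr kernel - - - link up`, which is the shape Promtail's syslog receiver expects.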

3. Troubleshooting Timeline

Phase 1: Loki Instability

The Issue: Loki kept crashing with "Schema" and "Compactor" errors.

The Cause: You were using a legacy configuration file with the modern Loki v3.0 image.

The Fix: Updated the Loki config to use schema: v13, tsdb, and added the required delete_request_store.

Phase 2: Proxmox Log Ingestion (TCP)

The Issue: Promtail threw "Parsing Errors" when receiving logs from Proxmox.

The Cause: Proxmox defaults to an older syslog format.

The Fix: Reconfigured Proxmox (/etc/rsyslog.conf) to use the template RSYSLOG_SyslogProtocol23Format (RFC5424).

Phase 3: The UniFi UDP Saga (The Main Blocker)

The Issue: Promtail rejected UniFi logs.

Attempt 1: We added format: rfc3164 to the Promtail config.

Result: Crash (field format not found).

Attempt 2: We upgraded Promtail from v2.9 to v3.0.

Result: Crash persisted.

Discovery: Promtail v3.0 still does not support legacy format toggles in the syslog receiver.

The Final Fix: We moved the UDP listener out of Docker and onto the Host OS (rsyslog), letting the Host handle the "dirty" UDP work and forward clean TCP to Promtail.

Phase 4: The "Ghost" Configuration

The Issue: Promtail logs showed it trying to connect to 192.168.2.25 even though your config file said http://loki:3100.

The Cause: Docker was holding onto an old version of the configuration file.

The Fix: Used docker-compose down followed by docker-compose up -d (instead of just restart) to force a refresh of the volume mounts.
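A stale mount like this can be detected by comparing the config file on the host with the file the running container actually sees. The sketch below is one hedged way to do that in Python. It assumes the container is named `promtail`, that the image ships a `cat` binary, and that the paths match the volume mapping used in this stack; the function names are hypothetical.

```python
import hashlib
import subprocess
from pathlib import Path


def digest(data: bytes) -> str:
    """SHA-256 hex digest, used to compare file contents."""
    return hashlib.sha256(data).hexdigest()


def container_config(container: str = "promtail",
                     path: str = "/etc/promtail/config.yaml") -> bytes:
    """Read the config as the running container sees it (assumes `cat` exists)."""
    result = subprocess.run(
        ["docker", "exec", container, "cat", path],
        check=True,
        capture_output=True,
    )
    return result.stdout


def is_stale(host_file: str = "./promtail-config.yaml") -> bool:
    """True when the container's copy differs from the file on the host."""
    return digest(Path(host_file).read_bytes()) != digest(container_config())
```

If `is_stale()` returns `True`, the container is still reading an older copy, and recreating it (down, then up) is the remedy that worked here.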

4. The "Golden State" Configuration

These are the settings that finally worked.

A. Docker Compose (docker-compose.yml)

Promtail Ports: Only TCP 1514:1514 mapped (UDP removed to prevent conflicts).

Volumes: Confirmed mapping ./promtail-config.yaml:/etc/promtail/config.yaml.

B. Promtail Config (promtail-config.yaml)

Clients: url: http://loki:3100/loki/api/v1/push (Using internal Docker DNS).

Scrape Config: Single job listening on tcp.

syslog:
  listen_address: 0.0.0.0:1514
  listen_protocol: tcp

C. Host Rsyslog (/etc/rsyslog.conf)

Inputs: imudp enabled on port 1514.

Forwarding: Rule added to send all UDP traffic to 127.0.0.1:1514 via TCP.


FINAL RESOLUTION - 2025-12-11

Root Cause Identified

IP address mismatch in rsyslog forwarding filter

Problem: /etc/rsyslog.d/unifi-router.conf on VM 101 was filtering for the wrong source IP

  • Filter was configured for: 192.168.1.1 (incorrect)
  • Actual source IP: 192.168.2.1 (VLAN 2 gateway interface)

Explanation: VM 101 is on VLAN 2 (192.168.2.x subnet). When the UniFi router sends syslog to VM 101 (192.168.2.114), it uses its VLAN 2 interface IP (192.168.2.1) as the source address. The filter therefore never matched, so rsyslog forwarded nothing to Promtail and reported no error.
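The mismatch can be sanity-checked with a few lines of Python. This is a sketch under one assumption: that VLAN 2 is a /24, as the "192.168.2.x" notation suggests. The `on_vm_subnet` helper is hypothetical. When the router talks directly to a host on a VLAN, it normally sources traffic from its own interface on that VLAN, so a filter address outside the VM's subnet is a strong hint that the filter is wrong.

```python
import ipaddress

# Assumption: VLAN 2 is a /24, as "192.168.2.x" suggests.
VM_SUBNET = ipaddress.ip_network("192.168.2.0/24")


def on_vm_subnet(source_ip: str) -> bool:
    """True if source_ip lies inside VM 101's subnet."""
    return ipaddress.ip_address(source_ip) in VM_SUBNET


for candidate in ("192.168.1.1", "192.168.2.1"):
    print(candidate, on_vm_subnet(candidate))
```

The old filter address (192.168.1.1) falls outside the subnet, while the corrected one (192.168.2.1) falls inside it.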

Solution Implemented

File Modified: /etc/rsyslog.d/unifi-router.conf on VM 101

Change:

# Before (WRONG):
if $fromhost-ip == '192.168.1.1' then {

# After (CORRECT):
if $fromhost-ip == '192.168.2.1' then {

Complete corrected configuration:

# UniFi Router - VLAN 2 interface
if $fromhost-ip == '192.168.2.1' then {
    action(type="omfwd" Target="127.0.0.1" Port="1514" Protocol="tcp" Template="RSYSLOG_SyslogProtocol23Format")
    stop
}

Service restart:

sudo systemctl restart rsyslog
sudo systemctl status rsyslog

Result: Logs immediately began flowing: UniFi router → rsyslog → Promtail → Loki → Grafana

Verification Steps

# 1. Verify UDP listener (rsyslog)
sudo ss -ulnp | grep 1514
# Expected: udp UNCONN users:(("rsyslogd"))

# 2. Verify TCP listener (Promtail, via docker-proxy)
sudo ss -tlnp | grep 1514
# Expected: tcp LISTEN users:(("docker-proxy"))

# 3. Monitor Promtail ingestion
docker logs promtail --tail 50 -f
# Expected: "Successfully sent batch" messages

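# 4. Test log injection straight into Promtail (TCP, RFC5424)
# Without -T, logger tries UDP first; that would hit rsyslog from 127.0.0.1 and not match the router filter.
logger -n 127.0.0.1 -P 1514 -T --rfc5424 "Test from monitoring-docker host"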
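A TCP test message can also be sent from Python, which makes the wire format explicit. The sketch below builds an RFC5424 line and sends it with octet-counted framing (RFC 6587), a framing that Promtail's syslog receiver documents support for. The app name `smoketest`, the default hostname, and the function names are assumptions for illustration.

```python
import socket
from datetime import datetime, timezone


def build_rfc5424(msg: str, host: str = "monitoring-docker",
                  app: str = "smoketest", pri: int = 14, now=None) -> str:
    """Build a minimal RFC5424 line. PRI 14 = facility user(1) * 8 + severity info(6)."""
    now = now or datetime.now(timezone.utc)
    # PROCID, MSGID and STRUCTURED-DATA are left as the nil value "-".
    return f"<{pri}>1 {now.isoformat()} {host} {app} - - - {msg}"


def octet_frame(line: str) -> bytes:
    """Octet-counted framing: '<byte length> <message>'."""
    data = line.encode("utf-8")
    return str(len(data)).encode("ascii") + b" " + data


def send_tcp(line: str, addr=("127.0.0.1", 1514)) -> None:
    """Send one framed message to the Promtail TCP listener."""
    with socket.create_connection(addr, timeout=5) as sock:
        sock.sendall(octet_frame(line))
```

Calling `send_tcp(build_rfc5424("Test from monitoring-docker host"))` on the VM should produce one entry in Loki if the Promtail listener is healthy.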

Troubleshooting Phases Summary

This was a 5-phase troubleshooting effort:

  1. Phase 1: Fixed Loki schema errors (v13, tsdb, delete_request_store)
  2. Phase 2: Fixed Proxmox log parsing (RSYSLOG_SyslogProtocol23Format)
  3. Phase 3: Moved UDP listener from Docker to Host rsyslog (Promtail doesn't support RFC3164)
  4. Phase 4: Fixed "ghost" configuration (192.168.2.25 stale config in Docker volumes)
  5. Phase 5: Corrected rsyslog filter IP from 192.168.1.1 to 192.168.2.1

Data Flow Diagram

UniFi Router (192.168.2.1)
    ↓ UDP syslog port 1514
Host rsyslog (192.168.2.114:1514 UDP)
    ↓ TCP forward (RFC5424 format)
Docker Promtail (127.0.0.1:1514 TCP)
    ↓ HTTP push
Loki (loki:3100)
    ↓ Query
Grafana (192.168.2.114:3000)
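The last hop of this flow can be checked without Grafana by querying Loki's HTTP API directly. The following is a minimal sketch using only the standard library. It assumes Loki's port 3100 is reachable from where the script runs (the stack may only expose it on the internal Docker network) and that the Promtail job attaches a label such as `job="syslog"`; neither is confirmed by the configuration shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def loki_query_url(base_url: str, logql: str, limit: int = 5) -> str:
    """Build a query_range URL; Loki defaults to the last hour when no start/end is given."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


def recent_lines(base_url: str, logql: str, limit: int = 5) -> list:
    """Return the log lines from a streams-type query result."""
    with urlopen(loki_query_url(base_url, logql, limit), timeout=10) as resp:
        body = json.load(resp)
    # Each stream holds [timestamp, line] pairs under "values".
    return [line for stream in body["data"]["result"] for _, line in stream["values"]]
```

For example, `recent_lines("http://localhost:3100", '{job="syslog"}')` would list a few recent router messages if ingestion is working and the label assumption holds.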

Key Technical Details

  • VLAN Topology: VM 101 on VLAN 2, router uses 192.168.2.1 interface for that subnet
  • rsyslog Template: RSYSLOG_SyslogProtocol23Format (RFC5424) - required by Promtail
  • Port Binding: UDP 1514 (rsyslog) and TCP 1514 (Promtail) coexist on same port number, different protocols
  • Stop Directive: Prevents duplicate logging to local files after forwarding
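The port-binding point is easy to demonstrate: TCP and UDP have separate port spaces, so one port number can be bound once per protocol. The Python sketch below uses loopback and an OS-assigned port rather than 1514, so that it runs without privileges or conflicts; the `bind_both` name is hypothetical.

```python
import socket


def bind_both(host: str = "127.0.0.1", port: int = 0):
    """Bind a TCP listener and a UDP socket to the same port number."""
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.bind((host, port))
    tcp.listen()
    chosen = tcp.getsockname()[1]  # the port the OS actually assigned
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        udp.bind((host, chosen))
    except OSError:
        # Only fails if some other process already holds this UDP port.
        tcp.close()
        udp.close()
        raise
    return tcp, udp


tcp_sock, udp_sock = bind_both()
print(tcp_sock.getsockname()[1] == udp_sock.getsockname()[1])
tcp_sock.close()
udp_sock.close()
```

This mirrors the setup above, where rsyslog holds UDP 1514 and docker-proxy holds TCP 1514 on the same host.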

Status

  • Monitoring Stack: Fully operational
  • Log Ingestion: Active
  • Grafana Dashboards: Receiving data
  • Resolution Date: 2025-12-11