docs: add Postgres migration plan and Kiro spec

- docs/guides/postgres-migration-plan.md: full migration manual with
  phases, port allocation, rollback plan, and timeline
- .kiro/specs/postgres-migration/: requirements, design, and tasks
- Replaces findings_json blob with individual indexed rows
- Enables per-BU closed counts via SQL queries
- Uses existing Postgres instance (port 5432), new cve_dashboard DB
- Testing on port 3003, cutover to 3001 with 30s downtime
Jordan Ramos
2026-05-05 15:04:14 -06:00
parent bd5fcccacf
commit 5cdca09f40
3 changed files with 295 additions and 3 deletions


@@ -0,0 +1,260 @@
# Postgres Migration Plan
## Overview
Migrate the STEAM Security Dashboard from SQLite (`cve_database.db`) to PostgreSQL. This eliminates the JSON blob performance bottleneck, enables per-BU closed finding counts, and gives the multi-tenancy feature indexed storage that can be read while a sync is writing.
## Current State
- **Database**: SQLite 3, single file `backend/cve_database.db` (13MB)
- **Performance bottleneck**: `ivanti_findings_cache.findings_json` — a 2.6MB TEXT column holding all findings as serialized JSON, parsed on every API request
- **Limitation**: No per-BU closed finding data (only a global count)
- **Concurrency**: SQLite single-writer lock blocks reads during sync writes
## Target State
- **Database**: PostgreSQL 16 (Docker container on port 5433)
- **Findings storage**: Individual rows in `ivanti_findings` table with indexed columns
- **Closed findings**: Stored as rows with `state = 'closed'` and `bu_ownership` column
- **Per-BU counts**: Simple `SELECT COUNT(*) FROM ivanti_findings WHERE state = $1 AND bu_ownership ILIKE $2`
- **Concurrency**: Connection pool (10 connections), reads never blocked by writes
## Infrastructure
### Port Allocation
| Port | Service | Status |
|------|---------|--------|
| 3000 | Frontend (production) | In use |
| 3001 | Backend (production) | In use |
| 3002 | Other project (Python) | In use — do not touch |
| 3003 | Test backend (temporary, during migration dev) | Available |
| 5000 | Other project (Python) | In use — do not touch |
| 5432 | Other project (Postgres) | In use — do not touch |
| 5433 | CVE Dashboard Postgres (Docker) | Available — ours |
### Docker Setup
```bash
docker run -d --name steam-postgres \
--restart unless-stopped \
-e POSTGRES_DB=cve_dashboard \
-e POSTGRES_USER=steam \
-e POSTGRES_PASSWORD=<generated-password> \
-p 5433:5432 \
-v steam-pgdata:/var/lib/postgresql/data \
postgres:16-alpine
```
### Connection String
```
postgresql://steam:<password>@localhost:5433/cve_dashboard
```
Added to `backend/.env` as `DATABASE_URL`.
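
As a rough sketch of how `DATABASE_URL` could feed the connection pool, the helper below validates the URL at startup and returns a config object for node-postgres. The function name `buildPoolConfig` and both timeout values are assumptions made for illustration; only `max: 10` comes from this plan.

```js
// Hypothetical helper for backend/db.js. The name and the timeout values
// are illustrative assumptions, not part of the plan.
function buildPoolConfig(env) {
  const connectionString = env.DATABASE_URL;
  if (!connectionString) {
    throw new Error('DATABASE_URL is not set');
  }
  // Parse once so a malformed URL fails at startup, not on the first query.
  const url = new URL(connectionString);
  if (url.protocol !== 'postgresql:' && url.protocol !== 'postgres:') {
    throw new Error(`Unexpected protocol in DATABASE_URL: ${url.protocol}`);
  }
  return {
    connectionString,
    max: 10,                       // pool size from the Target State section
    idleTimeoutMillis: 30000,      // assumed value
    connectionTimeoutMillis: 5000, // assumed value
  };
}

// With node-postgres installed this would be used as:
//   const { Pool } = require('pg');
//   const pool = new Pool(buildPoolConfig(process.env));
const config = buildPoolConfig({
  DATABASE_URL: 'postgresql://steam:secret@localhost:5433/cve_dashboard',
});
console.log(config.max, new URL(config.connectionString).port);
```

Failing fast on a missing or malformed URL keeps a bad `.env` edit from surfacing only after the cutover.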
## Migration Strategy
### Approach: Blue-Green on Same Box
1. Production stays on SQLite (port 3001) throughout development
2. New Postgres backend tested on port 3003
3. Cutover: stop old backend, start new backend on port 3001
4. Rollback: stop new backend, start old SQLite backend
### Branch Strategy
All work happens on the `feature/multi-tenancy` branch (the same branch as the multi-BU work). The Postgres migration is the infrastructure that makes multi-BU tenancy performant.
## Schema Design
### Key Changes from SQLite
| SQLite | PostgreSQL |
|--------|-----------|
| `findings_json` TEXT blob (2.6MB) | `ivanti_findings` table — one row per finding |
| Single `ivanti_counts_cache` row | Derived from `ivanti_findings` via queries |
| `TEXT` for everything | Proper types: `INTEGER`, `NUMERIC`, `TIMESTAMPTZ`, `TEXT[]` |
| No concurrent writes | Connection pool, MVCC |
| File-based | Docker volume `steam-pgdata` |
### New `ivanti_findings` Table
```sql
CREATE TABLE ivanti_findings (
id TEXT PRIMARY KEY, -- Ivanti finding ID
host_id INTEGER,
title TEXT NOT NULL DEFAULT '',
severity NUMERIC(4,2) NOT NULL DEFAULT 0,
vrr_group TEXT NOT NULL DEFAULT '',
host_name TEXT NOT NULL DEFAULT '',
ip_address TEXT NOT NULL DEFAULT '',
dns TEXT NOT NULL DEFAULT '',
status TEXT NOT NULL DEFAULT '',
sla_status TEXT NOT NULL DEFAULT '',
due_date DATE,
last_found_on DATE,
bu_ownership TEXT NOT NULL DEFAULT '',
cves TEXT[] DEFAULT '{}',
workflow_id TEXT,
workflow_state TEXT,
workflow_type TEXT,
state TEXT NOT NULL DEFAULT 'open' CHECK (state IN ('open', 'closed')),
note TEXT NOT NULL DEFAULT '',
override_host_name TEXT,
override_dns TEXT,
synced_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_findings_state ON ivanti_findings(state);
CREATE INDEX idx_findings_bu ON ivanti_findings(bu_ownership);
CREATE INDEX idx_findings_severity ON ivanti_findings(severity);
CREATE INDEX idx_findings_state_bu ON ivanti_findings(state, bu_ownership);
```
### Per-BU Counts (No Separate Table Needed)
```sql
-- Open count for STEAM
SELECT COUNT(*) FROM ivanti_findings WHERE state = 'open' AND bu_ownership ILIKE '%STEAM%';
-- Closed count for STEAM
SELECT COUNT(*) FROM ivanti_findings WHERE state = 'closed' AND bu_ownership ILIKE '%STEAM%';
-- All BU counts in one query
SELECT
bu_ownership,
state,
COUNT(*) as count
FROM ivanti_findings
GROUP BY bu_ownership, state;
```
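
The grouped query above returns one row per `(bu_ownership, state)` pair, so the backend would likely need to pivot it before returning counts. Below is a minimal sketch, assuming a hypothetical helper `pivotBuCounts` and made-up sample rows. It converts `count` with `Number()` because node-postgres returns `COUNT(*)` (a `bigint`) as a string by default.

```js
// Pivot rows from the "all BU counts" query into { [bu]: { open, closed } }.
// Hypothetical helper; the name is an assumption for illustration.
function pivotBuCounts(rows) {
  const result = {};
  for (const { bu_ownership: bu, state, count } of rows) {
    if (!result[bu]) result[bu] = { open: 0, closed: 0 };
    // node-postgres hands back bigint counts as strings.
    result[bu][state] = Number(count);
  }
  return result;
}

// Sample rows are invented for illustration only.
const sampleRows = [
  { bu_ownership: 'STEAM', state: 'open', count: '42' },
  { bu_ownership: 'STEAM', state: 'closed', count: '17' },
  { bu_ownership: 'OTHER', state: 'open', count: '5' },
];
console.log(pivotBuCounts(sampleRows));
```

Note that `GROUP BY bu_ownership` groups on the exact column value, while the `ILIKE '%STEAM%'` queries match substrings. If a finding can list several BUs in one `bu_ownership` value, the two approaches will not produce the same numbers.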
## Data Migration Script
A one-time script (`backend/scripts/migrate-to-postgres.js`) that:
1. Opens the SQLite database (read-only)
2. Connects to Postgres
3. Creates all tables (idempotent — `IF NOT EXISTS`)
4. Copies data table by table:
   - `users` → `users` (direct copy + `bu_teams`)
   - `sessions` → `sessions`
   - `cves` → `cves`
   - `documents` → `documents`
   - `jira_tickets` → `jira_tickets`
   - `archer_tickets` → `archer_tickets`
   - `knowledge_base` → `knowledge_base`
   - `audit_logs` → `audit_logs`
   - `compliance_uploads` → `compliance_uploads`
   - `compliance_items` → `compliance_items`
   - `compliance_notes` → `compliance_notes`
   - `ivanti_findings_cache.findings_json` → individual rows in `ivanti_findings` (state='open')
   - `ivanti_finding_notes` → merged into `ivanti_findings.note`
   - `ivanti_finding_overrides` → merged into `ivanti_findings.override_*`
   - `ivanti_counts_history` → `ivanti_counts_history`
   - `ivanti_finding_archives` → `ivanti_finding_archives`
   - `ivanti_archive_transitions` → `ivanti_archive_transitions`
   - `ivanti_sync_anomaly_log` → `ivanti_sync_anomaly_log`
   - `ivanti_finding_bu_history` → `ivanti_finding_bu_history`
   - `atlas_action_plans_cache` → `atlas_action_plans_cache`
   - `ivanti_fp_submissions` → `ivanti_fp_submissions`
   - `ivanti_fp_submission_history` → `ivanti_fp_submission_history`
   - `ivanti_todo_queue` → `ivanti_todo_queue`
5. Verifies row counts match
6. Prints summary
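
Two of the steps above are worth sketching: flattening the `findings_json` blob and verifying row counts. The sketch covers only the pure data handling and leaves out database I/O. The input field names (`id`, `title`, `severity`, `buOwnership`, `cves`) and both function names are assumptions; the real Ivanti payload may use different keys.

```js
// Flatten the findings_json blob into row objects for ivanti_findings.
// Input field names are assumed for illustration and may not match the
// real payload.
function findingsJsonToRows(findingsJson) {
  const findings = JSON.parse(findingsJson);
  return findings.map((f) => ({
    id: String(f.id),
    title: f.title ?? '',
    severity: Number(f.severity ?? 0),
    bu_ownership: f.buOwnership ?? '',
    cves: Array.isArray(f.cves) ? f.cves : [],
    state: 'open', // rows migrated from the blob are written as open
  }));
}

// Compare per-table row counts and return the tables that do not match.
function findCountMismatches(sqliteCounts, postgresCounts) {
  return Object.keys(sqliteCounts).filter(
    (table) => sqliteCounts[table] !== postgresCounts[table]
  );
}

// Invented sample data for illustration only.
const blob = JSON.stringify([
  { id: 101, title: 'Example', severity: '9.1', buOwnership: 'STEAM', cves: ['CVE-2024-0001'] },
  { id: 'abc' },
]);
const rows = findingsJsonToRows(blob);
const mismatches = findCountMismatches({ users: 3, cves: 10 }, { users: 3, cves: 9 });
console.log(rows.length, mismatches);
```

For `ivanti_findings` itself, the count to compare is the number of elements in the parsed blob against the number of inserted rows, since the source is a single cached row and not a table of findings.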
## Code Changes
### Backend
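1. **New dependency**: `pg` (node-postgres) replaces `sqlite3` in the backend request path; the one-time migration script still needs a SQLite driver to read the old file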
2. **Connection pool**: `backend/db.js` — creates and exports a `Pool` instance
3. **Query pattern change**:
```js
// Before (SQLite callback):
db.get('SELECT * FROM users WHERE id = ?', [id], (err, row) => { ... });
// After (Postgres async):
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);
const row = rows[0];
```
4. **Placeholder syntax**: `?` → `$1, $2, $3...`
5. **Findings sync**: Write individual rows via `INSERT ... ON CONFLICT (id) DO UPDATE`
6. **Closed findings sync**: Same pattern — upsert with `state = 'closed'`
7. **Counts**: Derived queries instead of a cache table
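
Items 4 and 5 can be illustrated with two small helpers. This is a sketch under stated assumptions, not the planned implementation: both function names are hypothetical, and the column list is a shortened subset of the `ivanti_findings` schema. The returned `{ text, values }` object matches the query-config shape that `pool.query()` accepts in node-postgres.

```js
// Rewrite SQLite-style "?" placeholders as $1, $2, ... for Postgres.
// Skips question marks inside single-quoted string literals.
function toPgPlaceholders(sql) {
  let index = 0;
  let inString = false;
  let out = '';
  for (const ch of sql) {
    if (ch === "'") inString = !inString; // '' toggles twice, so state is kept
    if (ch === '?' && !inString) {
      index += 1;
      out += `$${index}`;
    } else {
      out += ch;
    }
  }
  return out;
}

// Shortened column subset, chosen for illustration.
const COLUMNS = ['id', 'title', 'severity', 'bu_ownership', 'state'];

// Build an upsert for one finding; state is 'open' or 'closed'.
function buildFindingUpsert(finding, state) {
  const row = { ...finding, state };
  const placeholders = COLUMNS.map((_, i) => `$${i + 1}`).join(', ');
  const updates = COLUMNS.filter((c) => c !== 'id')
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(', ');
  const text = [
    `INSERT INTO ivanti_findings (${COLUMNS.join(', ')})`,
    `VALUES (${placeholders})`,
    `ON CONFLICT (id) DO UPDATE SET ${updates}, synced_at = NOW()`,
  ].join(' ');
  return { text, values: COLUMNS.map((c) => row[c]) };
}

const converted = toPgPlaceholders('SELECT * FROM users WHERE id = ? AND role = ?');
const upsert = buildFindingUpsert(
  { id: 'F-1', title: 'Example', severity: 7.5, bu_ownership: 'STEAM' },
  'closed'
);
console.log(converted);
console.log(upsert.text, upsert.values);
```

In a route this would be called as `await pool.query(buildFindingUpsert(finding, 'open'))`. Keeping `id` out of the `SET` list and refreshing `synced_at` on every upsert leaves `created_at` untouched for existing rows.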
### Frontend
No changes needed — the API contract stays the same. The frontend already does client-side filtering.
## Cutover Procedure
```bash
# 1. Final sync on production (SQLite) to get latest data
curl -X POST http://localhost:3001/api/ivanti/findings/sync
# 2. Run migration script (copies SQLite → Postgres)
node backend/scripts/migrate-to-postgres.js
# 3. Stop production backend
systemctl stop cve-backend # or kill the process
# 4. Update .env to use Postgres
# DATABASE_URL=postgresql://steam:<pass>@localhost:5433/cve_dashboard
# DB_TYPE=postgres
# 5. Start new backend on same port
systemctl start cve-backend # now uses Postgres
# 6. Verify
curl http://localhost:3001/api/auth/me # should work
```
### Rollback (if needed)
```bash
# 1. Stop new backend
systemctl stop cve-backend
# 2. Revert .env
# DB_TYPE=sqlite (or remove DATABASE_URL)
# 3. Start old backend
systemctl start cve-backend
```
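
The cutover and rollback steps both depend on the backend choosing its driver from `.env`. One way that switch might look is sketched below; the function name `resolveDbBackend` and the exact precedence rules are assumptions, chosen so that either rollback option (setting `DB_TYPE=sqlite` or removing `DATABASE_URL`) falls back to SQLite.

```js
// Hypothetical driver switch. Postgres is used only when it is explicitly
// requested AND a connection string is present; anything else is SQLite.
function resolveDbBackend(env) {
  const type = (env.DB_TYPE || '').toLowerCase();
  if (type === 'postgres' && env.DATABASE_URL) return 'postgres';
  return 'sqlite';
}

const url = 'postgresql://steam:secret@localhost:5433/cve_dashboard';
console.log(resolveDbBackend({ DB_TYPE: 'postgres', DATABASE_URL: url })); // after cutover
console.log(resolveDbBackend({ DB_TYPE: 'sqlite', DATABASE_URL: url }));   // rollback option 1
console.log(resolveDbBackend({ DB_TYPE: 'postgres' }));                    // rollback option 2
```

Defaulting to SQLite when the configuration is incomplete means a half-edited `.env` starts the known-good backend.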
## Timeline Estimate
| Phase | Effort | Description |
|-------|--------|-------------|
| Docker setup | 5 min | One command |
| Schema creation | 1 hour | SQL DDL for all tables |
| DB abstraction layer | 2-3 hours | `backend/db.js` pool + query helpers |
| Route migration | 4-6 hours | Update all routes from sqlite3 callbacks to pg async |
| Findings redesign | 2-3 hours | New sync logic writing individual rows |
| Closed findings | 1-2 hours | Store closed findings, per-BU count queries |
| Data migration script | 1-2 hours | SQLite → Postgres copy |
| Testing | 2-3 hours | Verify all endpoints, sync, UI |
| Cutover | 30 min | Stop/start + verify |
| **Total** | **~15-20 hours** | |
## Risks and Mitigations
| Risk | Mitigation |
|------|-----------|
| Docker container crashes | `--restart unless-stopped` flag |
| Data loss during cutover | SQLite file is left untouched and kept indefinitely as a backup; writes made after cutover exist only in Postgres |
| Postgres disk fills up | Docker volume on main disk; monitor with `df` |
| Connection pool exhaustion | Pool max = 10, with queue; log warnings at 8 |
| Migration script bugs | Run against dev DB first; verify row counts |
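
The "log warnings at 8" mitigation could be implemented with the counters that a node-postgres `Pool` exposes (`totalCount`, `idleCount`, `waitingCount`). The sketch below assumes a hypothetical helper `checkPoolPressure` called periodically or per request, and is exercised here with plain stub objects in place of a real pool.

```js
// Warn when the number of checked-out clients reaches the threshold or
// when callers are already queued. Hypothetical helper for illustration.
function checkPoolPressure(pool, warnAt = 8, log = console.warn) {
  const busy = pool.totalCount - pool.idleCount;
  if (busy >= warnAt || pool.waitingCount > 0) {
    log(`pg pool pressure: ${busy} busy, ${pool.waitingCount} waiting`);
    return true;
  }
  return false;
}

// Stub objects standing in for a real Pool instance.
const messages = [];
const record = (msg) => messages.push(msg);
const calm = checkPoolPressure({ totalCount: 5, idleCount: 3, waitingCount: 0 }, 8, record);
const hot = checkPoolPressure({ totalCount: 9, idleCount: 1, waitingCount: 0 }, 8, record);
const queued = checkPoolPressure({ totalCount: 10, idleCount: 0, waitingCount: 2 }, 8, record);
console.log(calm, hot, queued, messages);
```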
## Post-Migration Benefits
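- **Fast BU filtering**: Filters on `state` and exact-match `bu_ownership` use the new indexes. A substring match such as `ILIKE '%STEAM%'` cannot use a B-tree index because of the leading wildcard, so it scans the table; that still avoids JSON parsing, and a `pg_trgm` GIN index can be added if the scan becomes slow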
- **Per-BU closed counts**: No more "N/A" — real numbers per team
- **No JSON parsing**: Findings are rows, not a blob
- **Concurrent access**: Multiple users can read while sync writes
- **Future-proof**: Easy to add full-text search, materialized views, partitioning