
# Design Document: VCL Multi-Vertical Upload
## Overview
This feature adds a multi-file upload flow to the VCL reporting page. The flow accepts per-vertical compliance xlsx files, stores them with vertical-scoped resolution logic, and generates cross-organizational executive reports with drill-down by vertical and metric. It is designed as a proof of concept (POC) for the compliance team to evaluate before eventual CyberMetrics API integration.
The feature is architecturally separate from the existing single-file AEO compliance upload. It reuses the same Python parser and database schema (with additions), but introduces vertical-scoped commit logic and a new set of API endpoints prefixed with `/api/compliance/vcl-multi/`.
## Architecture
```mermaid
sequenceDiagram
participant U as Compliance Analyst
participant FE as React Frontend
participant BE as Express Backend
participant PY as Python Parser
participant DB as PostgreSQL
Note over U,DB: Multi-File Upload Flow
U->>FE: Drop/select 114 xlsx files
FE->>FE: Extract vertical + date from each filename
FE->>BE: POST /api/compliance/vcl-multi/preview (multipart, multiple files)
loop For each file
BE->>PY: parse_compliance_xlsx.py <file>
PY-->>BE: { items, summary, report_date, total }
BE->>DB: Query active items WHERE vertical = X
BE->>BE: Compute scoped diff (new/recurring/resolved within vertical)
end
BE-->>FE: { files: [{ vertical, report_date, diff, total_items, tempFile }] }
FE->>FE: Display batch preview table
U->>FE: Confirm batch
FE->>BE: POST /api/compliance/vcl-multi/commit { files: [...] }
loop For each file (single transaction)
BE->>DB: Upsert items for vertical X
BE->>DB: Resolve missing items WHERE vertical = X only
BE->>DB: Update/create snapshot for vertical X
end
BE-->>FE: { committed: [...] }
Note over FE,DB: VCL Multi-Vertical Report Load
FE->>BE: GET /api/compliance/vcl-multi/stats
BE->>DB: Aggregate across all verticals
BE-->>FE: { stats, vertical_breakdown, donut }
FE->>BE: GET /api/compliance/vcl-multi/vertical/:code/metrics
BE->>DB: Per-metric breakdown for vertical
BE-->>FE: { metrics: [...] }
FE->>BE: GET /api/compliance/vcl-multi/vertical/:code/metric/:metricId/devices
BE->>DB: Device list for vertical + metric
BE-->>FE: { devices: [...] }
```
### Data Flow Summary
1. **Upload** — Multiple files uploaded simultaneously. Each file is parsed independently. Vertical identity comes from the filename, not from inside the xlsx.
2. **Scoped resolution** — Each file's commit only resolves items within its own vertical. Other verticals are untouched.
3. **Aggregation** — VCL stats endpoints aggregate across all verticals for the executive view.
4. **Drill-down** — Vertical → Metric → Device hierarchy for investigation.
5. **Burndown** — Computed from `resolution_date` values on non-compliant devices, bucketed by month per vertical.
## Data Model
### Schema Changes
#### New column on `compliance_items`
```sql
ALTER TABLE compliance_items ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL;
CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical ON compliance_items(vertical);
CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical_status ON compliance_items(vertical, status);
```
The `vertical` column stores the organizational vertical code (NTS_AEO, SDIT_CISO, etc.) extracted from the filename at upload time. Existing items (from the old single-file flow) will have `vertical = NULL` — they continue to work with the existing AEO compliance page unchanged.
#### New column on `compliance_uploads`
```sql
ALTER TABLE compliance_uploads ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL;
```
Tags each upload record with its vertical so we can query upload history per vertical.
#### New table: `vcl_multi_vertical_summary`
Stores the parsed Summary sheet data per vertical per upload for metric-level reporting.
```sql
CREATE TABLE IF NOT EXISTS vcl_multi_vertical_summary (
    id              SERIAL PRIMARY KEY,
    upload_id       INTEGER NOT NULL REFERENCES compliance_uploads(id) ON DELETE CASCADE,
    vertical        TEXT NOT NULL,
    metric_id       TEXT NOT NULL,
    metric_desc     TEXT DEFAULT '',
    category        TEXT DEFAULT 'Other',
    team            TEXT DEFAULT '',
    priority        TEXT DEFAULT '',
    non_compliant   INTEGER DEFAULT 0,
    compliant       INTEGER DEFAULT 0,
    total           INTEGER DEFAULT 0,
    compliance_pct  NUMERIC(5,2) DEFAULT 0,
    target          NUMERIC(5,2) DEFAULT 0,
    status          TEXT DEFAULT '',
    created_at      TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_vertical
    ON vcl_multi_vertical_summary(vertical);
CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_upload
    ON vcl_multi_vertical_summary(upload_id);
```
#### Updated `compliance_snapshots`
The existing snapshots table already has a `vertical` column. Multi-vertical uploads will create snapshots keyed on the vertical code (NTS_AEO, SDIT_CISO) rather than the team name (STEAM, ACCESS-ENG). An additional `_ALL` aggregate snapshot is created for the trend chart.
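To illustrate the keying, the sketch below builds one snapshot row per vertical code plus the `_ALL` aggregate for a given month. The helper name `buildSnapshotRows` and the row fields (`month`, `compliant`, `non_compliant`) are assumptions for illustration only; the actual `compliance_snapshots` columns are not listed in this document.

```javascript
// Hypothetical helper: field names are assumptions, not the real snapshot schema.
function buildSnapshotRows(month, perVertical) {
  const rows = perVertical.map((v) => ({
    vertical: v.vertical, // vertical code such as NTS_AEO, not a team name
    month,
    compliant: v.compliant,
    non_compliant: v.non_compliant,
  }));

  // The _ALL row sums every vertical so the trend chart can read one series.
  const all = { vertical: '_ALL', month, compliant: 0, non_compliant: 0 };
  for (const row of rows) {
    all.compliant += row.compliant;
    all.non_compliant += row.non_compliant;
  }
  return [...rows, all];
}
```

Keeping the aggregate as an ordinary row under a reserved key means the trend query does not need to re-sum verticals at read time, at the cost of one extra write per commit.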
### Entity Relationships
```
compliance_uploads (1) ──── (N) compliance_items
        │                             │
        │ vertical                    │ vertical
        │                             │
        └──── (N) vcl_multi_vertical_summary
compliance_snapshots ─────────────────┘ (keyed on vertical + month)
```
### Vertical Identification Logic
```javascript
/**
 * Extracts vertical code and report date from a filename.
 * Pattern: <VERTICAL>_YYYY_MM_DD.xlsx
 * Examples:
 *   NTS_AEO_2026_05_11.xlsx   → { vertical: 'NTS_AEO', date: '2026-05-11' }
 *   SDIT_CISO_2026_05_11.xlsx → { vertical: 'SDIT_CISO', date: '2026-05-11' }
 *   SR_2026_05_11.xlsx        → { vertical: 'SR', date: '2026-05-11' }
 *   AllOthers_2026_05_11.xlsx → { vertical: 'AllOthers', date: '2026-05-11' }
 */
function parseVerticalFilename(filename) {
  const stem = filename.replace(/\.xlsx$/i, '');
  // Lazy first group so the trailing _YYYY_MM_DD is always taken as the date
  const match = stem.match(/^(.+?)_(\d{4})_(\d{2})_(\d{2})$/);
  if (!match) return null;
  return {
    vertical: match[1],
    date: `${match[2]}-${match[3]}-${match[4]}`,
  };
}
```
## API Endpoints
### Upload Flow
**`POST /api/compliance/vcl-multi/preview`**
Accepts multiple xlsx files via multipart form data. Parses each, computes per-vertical scoped diffs.
- Auth: `requireAuth()`, `requireGroup('Admin', 'Standard_User')`
- Body: multipart/form-data with field `files` (array of xlsx files)
- Response:
```json
{
  "files": [
    {
      "filename": "NTS_AEO_2026_05_11.xlsx",
      "vertical": "NTS_AEO",
      "report_date": "2026-05-11",
      "total_items": 342,
      "diff": { "new_count": 12, "recurring_count": 320, "resolved_count": 8 },
      "summary_entries": 24,
      "tempFile": "/path/to/temp.json"
    }
  ],
  "unrecognized": ["weird_file.xlsx"]
}
```
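The `diff` object in the preview response can be derived with plain set comparisons. The following is a minimal sketch, not the actual backend code: the function name `computeScopedDiff` and the `hostname|metric_id` key format are assumptions, and the second argument is expected to contain only the active items already filtered to the file's vertical.

```javascript
// Assumed key format: one item is identified by hostname + metric_id.
const itemKey = (item) => `${item.hostname}|${item.metric_id}`;

function computeScopedDiff(parsedItems, activeItemsForVertical) {
  const newKeys = new Set(parsedItems.map(itemKey));
  const activeKeys = new Set(activeItemsForVertical.map(itemKey));

  let newCount = 0;
  let recurringCount = 0;
  for (const key of newKeys) {
    if (activeKeys.has(key)) recurringCount += 1; // already active in this vertical
    else newCount += 1; // first time seen in this vertical
  }

  // Active in the vertical but absent from the upload: would be resolved on commit.
  let resolvedCount = 0;
  for (const key of activeKeys) {
    if (!newKeys.has(key)) resolvedCount += 1;
  }

  return { new_count: newCount, recurring_count: recurringCount, resolved_count: resolvedCount };
}
```

Because the active set is filtered by vertical before the comparison, items belonging to other verticals can never show up in `resolved_count`.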
**`POST /api/compliance/vcl-multi/commit`**
Commits all previewed files in a single transaction.
- Auth: `requireAuth()`, `requireGroup('Admin', 'Standard_User')`
- Body: `{ files: [{ tempFile, vertical, report_date, filename }] }`
- Response:
```json
{
  "committed": [
    { "vertical": "NTS_AEO", "upload_id": 45, "new_count": 12, "recurring_count": 320, "resolved_count": 8 }
  ],
  "total_new": 85,
  "total_resolved": 42
}
```
### Reporting
**`GET /api/compliance/vcl-multi/stats`**
Aggregated cross-vertical executive summary.
- Response:
```json
{
  "stats": {
    "total_devices": 4200,
    "compliant": 3800,
    "non_compliant": 400,
    "compliance_pct": 90,
    "target_pct": 95
  },
  "donut": {
    "blocked": { "count": 120, "pct": 30 },
    "in_progress": { "count": 280, "pct": 70 }
  },
  "vertical_breakdown": [
    {
      "vertical": "NTS_AEO",
      "total_devices": 800,
      "compliant": 720,
      "non_compliant": 80,
      "compliance_pct": 90,
      "blockers": 25,
      "forecast_burndown": { "2026-06": 20, "2026-07": 35 },
      "last_upload": "2026-05-11"
    }
  ],
  "last_upload_date": "2026-05-11"
}
```
**`GET /api/compliance/vcl-multi/trend`**
Monthly trend data for the overview chart.
- Response:
```json
{
  "months": [
    {
      "month": "2026-03",
      "compliance_pct": 85,
      "compliant": 3400,
      "non_compliant": 600,
      "forecast_pct": null,
      "target_pct": 95
    }
  ]
}
```
**`GET /api/compliance/vcl-multi/vertical/:code/metrics`**
Per-metric breakdown for a specific vertical.
- Response:
```json
{
  "vertical": "NTS_AEO",
  "metrics": [
    {
      "metric_id": "5.2.4",
      "metric_desc": "MFA enforcement on privileged accounts",
      "category": "Access & MFA",
      "non_compliant": 15,
      "compliant": 785,
      "total": 800,
      "compliance_pct": 98.1,
      "target": 100
    }
  ],
  "categories": [
    { "category": "Access & MFA", "non_compliant": 45, "compliance_pct": 94.4 }
  ]
}
```
**`GET /api/compliance/vcl-multi/vertical/:code/metric/:metricId/devices`**
Device list for a specific vertical + metric combination.
- Response:
```json
{
  "devices": [
    {
      "hostname": "srv-nts-001",
      "ip_address": "10.1.2.3",
      "device_type": "Router",
      "team": "STEAM",
      "seen_count": 4,
      "first_seen": "2026-03-15",
      "last_seen": "2026-05-11",
      "resolution_date": "2026-07-01",
      "remediation_plan": "Scheduled for next maintenance window"
    }
  ]
}
```
**`GET /api/compliance/vcl-multi/vertical/:code/burndown`**
Burndown forecast for a specific vertical.
- Response:
```json
{
  "vertical": "NTS_AEO",
  "total_non_compliant": 80,
  "blockers": 25,
  "with_dates": 55,
  "monthly_forecast": { "2026-06": 20, "2026-07": 35 },
  "projected_clear_date": "2026-08"
}
```
## Frontend Components
### New Page: `VCLMultiVerticalPage.js`
Top-level page accessible from the nav drawer. Contains:
| Component | Purpose |
|---|---|
| `MultiVerticalUploadModal` | Multi-file drag-drop, filename parsing, batch preview, commit |
| `VCLMultiStatsBar` | Aggregated stats across all verticals |
| `VCLMultiVerticalTable` | Breakdown table with one row per vertical, clickable for drill-down |
| `VCLMultiTrendChart` | Monthly compliance trend with forecast line |
| `VCLMultiDonutChart` | Blocked vs In-Progress donut |
| `VerticalDetailView` | Per-metric breakdown when a vertical is selected |
| `MetricDeviceList` | Device list when a metric is selected within a vertical |
| `VerticalBurndownChart` | Per-vertical burndown projection |
### Navigation
- New entry in `NavDrawer.js`: "VCL Multi-Vertical" (or "CCP Metrics")
- Separate from existing "Compliance" and "VCL Report" entries
- Icon: `BarChart3` or `Building2` from lucide-react
### Drill-down UX Flow
```
VCL Multi-Vertical Overview
├── Stats Bar (aggregated)
├── Trend Chart (aggregated)
├── Donut Chart (aggregated)
└── Vertical Breakdown Table
    ├── NTS_AEO (90%) → click
    │   ├── Metric Breakdown
    │   │   ├── 5.2.4 — Access & MFA (98.1%) → click
    │   │   │   └── Device List (15 devices)
    │   │   ├── 1.1.1 — Logging & Monitoring (85%) → click
    │   │   │   └── Device List (120 devices)
    │   │   └── ...
    │   └── Burndown Chart (vertical-specific)
    ├── SDIT_CISO (92%) → click
    │   └── ...
    └── ...
```
## Scoped Resolution Logic
This is the core architectural change from the existing upload flow.
### Current behavior (single-file)
```javascript
// Resolves ALL active items not in the upload (global scope)
for (const [key, row] of Object.entries(activeMap)) {
  if (!newKeys.has(key)) {
    await client.query(`UPDATE compliance_items SET status = 'resolved' WHERE id = $1`, [row.id]);
  }
}
```
### New behavior (multi-vertical)
```javascript
// Resolves only items within the same vertical (scoped)
const { rows: activeRows } = await client.query(
  `SELECT id, hostname, metric_id, seen_count FROM compliance_items
   WHERE status = 'active' AND vertical = $1`,
  [vertical]
);
// activeMap is assumed to be built from activeRows, keyed the same way as newKeys
for (const [key, row] of Object.entries(activeMap)) {
  if (!newKeys.has(key)) {
    await client.query(
      `UPDATE compliance_items SET status = 'resolved', resolved_upload_id = $1 WHERE id = $2`,
      [uploadId, row.id]
    );
  }
}
```
The essential difference is the `AND vertical = $1` filter on the active-items query, which ensures that uploading NTS_AEO data never touches SDIT_CISO items. The scoped snippet additionally sets `resolved_upload_id` on each row it resolves.
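The isolation guarantee can be exercised without a database by modelling the resolve step as a pure function over rows. This is an illustrative sketch only: `resolveMissingInVertical` is a hypothetical name, the rows are plain objects carrying the `compliance_items` columns used above, and the `hostname|metric_id` key format is an assumption.

```javascript
// In-memory model of the scoped resolve step (not the real SQL path).
function resolveMissingInVertical(items, vertical, newKeys, uploadId) {
  return items.map((row) => {
    const inScope = row.status === 'active' && row.vertical === vertical;
    const key = `${row.hostname}|${row.metric_id}`; // assumed key format
    if (inScope && !newKeys.has(key)) {
      return { ...row, status: 'resolved', resolved_upload_id: uploadId };
    }
    return row; // other verticals, and items still present, are left untouched
  });
}

// Example: an NTS_AEO upload that no longer contains host "a".
const before = [
  { id: 1, vertical: 'NTS_AEO', hostname: 'a', metric_id: '5.2.4', status: 'active' },
  { id: 2, vertical: 'NTS_AEO', hostname: 'b', metric_id: '5.2.4', status: 'active' },
  { id: 3, vertical: 'SDIT_CISO', hostname: 'a', metric_id: '5.2.4', status: 'active' },
];
const after = resolveMissingInVertical(before, 'NTS_AEO', new Set(['b|5.2.4']), 45);
```

Here only row 1 is resolved. Row 3 shares the same hostname and metric but belongs to SDIT_CISO, so it stays active.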
## Burndown Forecast Computation
### Per-vertical burndown
For each vertical, the burndown is computed from `resolution_date` values on active non-compliant items:
```javascript
// Assumes each item's resolution_date is either null or an ISO 'YYYY-MM-DD' string.
function computeVerticalBurndown(items) {
  const total = items.length;
  const withDates = items.filter(i => i.resolution_date != null);
  const blockers = items.filter(i => i.resolution_date == null);

  // Bucket dated items by month
  const monthly = {};
  for (const item of withDates) {
    const month = item.resolution_date.slice(0, 7); // YYYY-MM
    monthly[month] = (monthly[month] || 0) + 1;
  }

  // Cumulative projection: `remaining` starts at the full total, so blockers stay counted
  let remaining = total;
  const projection = {};
  for (const month of Object.keys(monthly).sort()) {
    remaining -= monthly[month];
    projection[month] = { remediated: monthly[month], remaining };
  }

  return { total, blockers: blockers.length, with_dates: withDates.length, monthly, projection };
}
```
### Aggregated trend forecast
The trend chart forecast uses linear regression on the last 3+ monthly snapshots to project forward. This reuses the same approach as the existing VCL trend endpoint.
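As a rough illustration of that projection, the sketch below fits an ordinary least-squares line to equally spaced monthly `compliance_pct` values and extends it forward. The function name `forecastCompliance`, the clamp to the 0 to 100 range, and the empty result for fewer than three points are assumptions; the existing VCL trend endpoint may differ in these details.

```javascript
// Least-squares line through (0, y0), (1, y1), ... then extrapolated monthsAhead steps.
function forecastCompliance(history, monthsAhead) {
  const n = history.length;
  if (n < 3) return []; // assumption: no forecast until 3 snapshots exist

  const meanX = (n - 1) / 2;
  const meanY = history.reduce((sum, y) => sum + y, 0) / n;

  let num = 0;
  let den = 0;
  history.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  const slope = num / den;
  const intercept = meanY - slope * meanX;

  const forecast = [];
  for (let k = 0; k < monthsAhead; k += 1) {
    const y = intercept + slope * (n + k);
    forecast.push(Math.min(100, Math.max(0, y))); // assumption: clamp to a valid percentage
  }
  return forecast;
}
```

For a history of 85, 87, 89 the fitted slope is 2 points per month, so the next two projected values are 91 and 93.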
## Migration Script
```javascript
// backend/migrations/add_vcl_multi_vertical.js
const pool = require('../db');

async function run() {
  console.log('Starting VCL multi-vertical migration...');
  try {
    // Add vertical column to compliance_items
    await pool.query(`ALTER TABLE compliance_items ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL`);
    console.log('✓ vertical column added to compliance_items');
    await pool.query(`CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical ON compliance_items(vertical)`);
    console.log('✓ idx_compliance_items_vertical index created');
    await pool.query(`CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical_status ON compliance_items(vertical, status)`);
    console.log('✓ idx_compliance_items_vertical_status index created');

    // Add vertical column to compliance_uploads
    await pool.query(`ALTER TABLE compliance_uploads ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL`);
    console.log('✓ vertical column added to compliance_uploads');

    // Create summary table for per-vertical metric data
    await pool.query(`
      CREATE TABLE IF NOT EXISTS vcl_multi_vertical_summary (
        id              SERIAL PRIMARY KEY,
        upload_id       INTEGER NOT NULL REFERENCES compliance_uploads(id) ON DELETE CASCADE,
        vertical        TEXT NOT NULL,
        metric_id       TEXT NOT NULL,
        metric_desc     TEXT DEFAULT '',
        category        TEXT DEFAULT 'Other',
        team            TEXT DEFAULT '',
        priority        TEXT DEFAULT '',
        non_compliant   INTEGER DEFAULT 0,
        compliant       INTEGER DEFAULT 0,
        total           INTEGER DEFAULT 0,
        compliance_pct  NUMERIC(5,2) DEFAULT 0,
        target          NUMERIC(5,2) DEFAULT 0,
        status          TEXT DEFAULT '',
        created_at      TIMESTAMPTZ DEFAULT NOW()
      )
    `);
    console.log('✓ vcl_multi_vertical_summary table created');
    await pool.query(`CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_vertical ON vcl_multi_vertical_summary(vertical)`);
    console.log('✓ idx_vcl_multi_summary_vertical index created');
    await pool.query(`CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_upload ON vcl_multi_vertical_summary(upload_id)`);
    console.log('✓ idx_vcl_multi_summary_upload index created');
  } catch (err) {
    console.error('Migration error:', err.message);
    process.exit(1);
  }
  console.log('Migration complete.');
  process.exit(0);
}

run();
```
## Correctness Properties
### Property 1: Vertical-Scoped Resolution Isolation
*For any* set of active compliance items across N verticals, committing an upload for vertical X must only resolve items where `vertical = X`. The count of active items for all other verticals must remain unchanged before and after the commit.
### Property 2: Filename Parsing Completeness
*For any* filename matching the pattern `<VERTICAL>_YYYY_MM_DD.xlsx` where VERTICAL contains only alphanumeric characters and underscores, `parseVerticalFilename` must return a non-null result with the correct vertical code and ISO date string.
### Property 3: Aggregated Stats Consistency
*For any* set of per-vertical stats, the aggregated `total_devices` must equal the sum of all vertical `total_devices`, `compliant` must equal the sum of all vertical `compliant`, and `compliance_pct` must equal `Math.round((sum_compliant / sum_total) * 100)`.
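A small sketch of an aggregation that satisfies this property, using the field names from the `/stats` response. The function name `aggregateStats` is hypothetical, and returning 0 when there are no devices is an assumption added to avoid a division by zero, which the property itself does not address.

```javascript
// Sums per-vertical counts and derives the rounded overall percentage.
function aggregateStats(verticalBreakdown) {
  let total = 0;
  let compliant = 0;
  let nonCompliant = 0;
  for (const v of verticalBreakdown) {
    total += v.total_devices;
    compliant += v.compliant;
    nonCompliant += v.non_compliant;
  }
  return {
    total_devices: total,
    compliant,
    non_compliant: nonCompliant,
    // Assumption: report 0 for an empty data set instead of NaN.
    compliance_pct: total === 0 ? 0 : Math.round((compliant / total) * 100),
  };
}
```

Note that the overall percentage is computed from the summed counts, not by averaging the per-vertical percentages, so large verticals weigh proportionally more.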
### Property 4: Burndown Forecast Conservation
*For any* set of non-compliant items with resolution dates, the sum of all monthly burndown bucket counts must equal the count of items with non-null resolution dates. No item is double-counted or lost.
### Property 5: Idempotent Re-upload
*For any* vertical, uploading the same file twice on the same day must produce the same final state as uploading it once. Specifically: same active item set, same seen_counts, same resolved set.
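One way this property could be met is to count a sighting at most once per report date. The in-memory model below is a sketch under that assumption and is not taken from the existing commit logic: `applyUpload`, the `last_seen` guard, and the choice to restart `seen_count` at 1 when a resolved item reappears are all illustrative.

```javascript
// State is a Map keyed by "vertical|itemKey"; each call returns a new Map.
function applyUpload(state, vertical, reportDate, itemKeys) {
  const next = new Map(state);
  const incoming = new Set(itemKeys);

  for (const itemKey of incoming) {
    const id = `${vertical}|${itemKey}`;
    const prev = next.get(id);
    if (!prev || prev.status !== 'active') {
      // New (or previously resolved) item starts a fresh active record.
      next.set(id, { vertical, itemKey, status: 'active', seen_count: 1, last_seen: reportDate });
    } else if (prev.last_seen !== reportDate) {
      next.set(id, { ...prev, seen_count: prev.seen_count + 1, last_seen: reportDate });
    }
    // Otherwise this report date was already applied, so the record is left as is.
  }

  // Resolve active items of this vertical that are absent from the upload.
  for (const [id, row] of next) {
    if (row.vertical === vertical && row.status === 'active' && !incoming.has(row.itemKey)) {
      next.set(id, { ...row, status: 'resolved' });
    }
  }
  return next;
}
```

Applying the same keys with the same report date a second time takes neither branch that modifies a record, so the resulting state is identical to the state after the first application.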
## Error Handling
| Condition | Behavior |
|---|---|
| Filename doesn't match pattern | File flagged as "unrecognized" in preview; user can assign vertical manually |
| Duplicate vertical in batch | Reject — only one file per vertical per batch |
| Parser failure on one file | That file is marked as errored; other files in batch can still proceed |
| Transaction failure during commit | Full rollback of entire batch — no partial commits |
| File exceeds 10MB | Rejected by multer before parsing |
| No items parsed from file | Warning in preview; user can still commit (creates upload record with 0 items) |
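
The first two rows of the table can be checked before any file is parsed. The sketch below classifies a batch of filenames into recognized, unrecognized, and duplicate-vertical groups. `classifyBatch` is a hypothetical helper; it inlines the filename pattern from `parseVerticalFilename` (with the `.xlsx` extension folded into the regex) so that it runs on its own, and it only reports duplicates, leaving the decision to reject the batch to the caller.

```javascript
// Filename pattern from the design, with the extension folded in.
const FILENAME_RE = /^(.+?)_(\d{4})_(\d{2})_(\d{2})\.xlsx$/i;

function classifyBatch(filenames) {
  const recognized = [];
  const unrecognized = [];
  const duplicates = [];
  const firstByVertical = new Map(); // vertical -> first filename seen

  for (const filename of filenames) {
    const m = filename.match(FILENAME_RE);
    if (!m) {
      unrecognized.push(filename); // user may assign a vertical manually
      continue;
    }
    const vertical = m[1];
    if (firstByVertical.has(vertical)) {
      duplicates.push({ vertical, first: firstByVertical.get(vertical), duplicate: filename });
      continue;
    }
    firstByVertical.set(vertical, filename);
    recognized.push({ filename, vertical, report_date: `${m[2]}-${m[3]}-${m[4]}` });
  }
  return { recognized, unrecognized, duplicates };
}
```

A caller would typically refuse to proceed to preview while `duplicates` is non-empty, since only one file per vertical is allowed in a batch.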
## Deployment Considerations
- The feature is self-contained behind `/api/compliance/vcl-multi/` endpoints
- Can be deployed on a separate instance with its own database
- No changes to existing AEO compliance upload flow
- Feature flag not needed — the nav entry and endpoints simply exist or don't
- Environment variable `VCL_TARGET_PCT` (default 95) applies to multi-vertical reporting as well
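
For the `VCL_TARGET_PCT` setting, a defensive read might look like the sketch below. The helper name `readTargetPct` is hypothetical, and falling back to 95 for non-numeric or out-of-range values is an assumption; the document only specifies the default of 95.

```javascript
// Reads VCL_TARGET_PCT, falling back to 95 when unset or not a valid percentage.
function readTargetPct(env = process.env) {
  const parsed = Number.parseFloat(env.VCL_TARGET_PCT);
  if (!Number.isFinite(parsed) || parsed < 0 || parsed > 100) {
    return 95; // documented default
  }
  return parsed;
}
```

Passing the environment as a parameter keeps the function easy to test without mutating `process.env`.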