# Design Document: VCL Multi-Vertical Upload

## Overview

This feature adds a multi-file upload flow to the VCL reporting page that accepts per-vertical compliance xlsx files, stores them with vertical-scoped resolution logic, and generates cross-organizational executive reports with drill-down capability by vertical and metric. It is designed as a POC for the compliance team to evaluate before eventual CyberMetrics API integration.

The feature is architecturally separate from the existing single-file AEO compliance upload. It reuses the same Python parser and database schema (with additions), but introduces vertical-scoped commit logic and a new set of API endpoints prefixed with `/api/compliance/vcl-multi/`.

## Architecture

```mermaid
sequenceDiagram
    participant U as Compliance Analyst
    participant FE as React Frontend
    participant BE as Express Backend
    participant PY as Python Parser
    participant DB as PostgreSQL

    Note over U,DB: Multi-File Upload Flow
    U->>FE: Drop/select 1–14 xlsx files
    FE->>FE: Extract vertical + date from each filename
    FE->>BE: POST /api/compliance/vcl-multi/preview (multipart, multiple files)

    loop For each file
        BE->>PY: parse_compliance_xlsx.py <file>
        PY-->>BE: { items, summary, report_date, total }
        BE->>DB: Query active items WHERE vertical = X
        BE->>BE: Compute scoped diff (new/recurring/resolved within vertical)
    end

    BE-->>FE: { files: [{ vertical, report_date, diff, total_items, tempFile }] }
    FE->>FE: Display batch preview table
    U->>FE: Confirm batch
    FE->>BE: POST /api/compliance/vcl-multi/commit { files: [...] }

    loop For each file (single transaction)
        BE->>DB: Upsert items for vertical X
        BE->>DB: Resolve missing items WHERE vertical = X only
        BE->>DB: Update/create snapshot for vertical X
    end

    BE-->>FE: { committed: [...] }

    Note over FE,DB: VCL Multi-Vertical Report Load
    FE->>BE: GET /api/compliance/vcl-multi/stats
    BE->>DB: Aggregate across all verticals
    BE-->>FE: { stats, vertical_breakdown, donut }

    FE->>BE: GET /api/compliance/vcl-multi/vertical/:code/metrics
    BE->>DB: Per-metric breakdown for vertical
    BE-->>FE: { metrics: [...] }

    FE->>BE: GET /api/compliance/vcl-multi/vertical/:code/metric/:metricId/devices
    BE->>DB: Device list for vertical + metric
    BE-->>FE: { devices: [...] }
```

### Data Flow Summary

1. **Upload**: Multiple files are uploaded simultaneously. Each file is parsed independently. Vertical identity comes from the filename, not from inside the xlsx.
2. **Scoped resolution**: Each file's commit only resolves items within its own vertical. Other verticals are untouched.
3. **Aggregation**: VCL stats endpoints aggregate across all verticals for the executive view.
4. **Drill-down**: Vertical → Metric → Device hierarchy for investigation.
5. **Burndown**: Computed from `resolution_date` values on non-compliant devices, bucketed by month per vertical.

## Data Model

### Schema Changes

#### New column on `compliance_items`

```sql
ALTER TABLE compliance_items ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL;
CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical ON compliance_items(vertical);
CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical_status ON compliance_items(vertical, status);
```

The `vertical` column stores the organizational vertical code (NTS_AEO, SDIT_CISO, etc.) extracted from the filename at upload time. Existing items (from the old single-file flow) will have `vertical = NULL`; they continue to work with the existing AEO compliance page unchanged.

#### New column on `compliance_uploads`

```sql
ALTER TABLE compliance_uploads ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL;
```

This tags each upload record with its vertical so that upload history can be queried per vertical.

#### New table: `vcl_multi_vertical_summary`

Stores the parsed Summary sheet data per vertical per upload for metric-level reporting.

```sql
CREATE TABLE IF NOT EXISTS vcl_multi_vertical_summary (
  id SERIAL PRIMARY KEY,
  upload_id INTEGER NOT NULL REFERENCES compliance_uploads(id) ON DELETE CASCADE,
  vertical TEXT NOT NULL,
  metric_id TEXT NOT NULL,
  metric_desc TEXT DEFAULT '',
  category TEXT DEFAULT 'Other',
  team TEXT DEFAULT '',
  priority TEXT DEFAULT '',
  non_compliant INTEGER DEFAULT 0,
  compliant INTEGER DEFAULT 0,
  total INTEGER DEFAULT 0,
  compliance_pct NUMERIC(5,2) DEFAULT 0,
  target NUMERIC(5,2) DEFAULT 0,
  status TEXT DEFAULT '',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_vertical
  ON vcl_multi_vertical_summary(vertical);

CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_upload
  ON vcl_multi_vertical_summary(upload_id);
```

#### Updated `compliance_snapshots`

The existing snapshots table already has a `vertical` column. Multi-vertical uploads will create snapshots keyed on the vertical code (NTS_AEO, SDIT_CISO) rather than the team name (STEAM, ACCESS-ENG). An additional `_ALL` aggregate snapshot is created for the trend chart.

### Entity Relationships

```
compliance_uploads (1) ──── (N) compliance_items
        │                             │
        │ vertical                    │ vertical
        │                             │
        └──── (N) vcl_multi_vertical_summary
                                      │
compliance_snapshots ─────────────────┘  (keyed on vertical + month)
```

### Vertical Identification Logic

```javascript
/**
 * Extracts vertical code and report date from a filename.
 * Pattern: <VERTICAL>_YYYY_MM_DD.xlsx
 * Examples:
 *   NTS_AEO_2026_05_11.xlsx → { vertical: 'NTS_AEO', date: '2026-05-11' }
 *   SDIT_CISO_2026_05_11.xlsx → { vertical: 'SDIT_CISO', date: '2026-05-11' }
 *   SR_2026_05_11.xlsx → { vertical: 'SR', date: '2026-05-11' }
 *   AllOthers_2026_05_11.xlsx → { vertical: 'AllOthers', date: '2026-05-11' }
 */
function parseVerticalFilename(filename) {
  const stem = filename.replace(/\.xlsx$/i, '');
  const match = stem.match(/^(.+?)_(\d{4})_(\d{2})_(\d{2})$/);
  if (!match) return null;
  return {
    vertical: match[1],
    date: `${match[2]}-${match[3]}-${match[4]}`,
  };
}
```
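
Property 2 in the Correctness Properties section restricts vertical codes to alphanumeric characters and underscores, while the regex above accepts any characters in the vertical and any digits in the date. A stricter variant might look like the sketch below. The name `parseVerticalFilenameStrict` and the calendar-date check are assumptions for illustration, not part of the current design.

```javascript
// Hypothetical stricter variant of parseVerticalFilename (a sketch only).
// Assumptions: vertical codes contain only letters, digits, and underscores,
// and the date portion must be a real calendar date.
function parseVerticalFilenameStrict(filename) {
  const stem = filename.replace(/\.xlsx$/i, '');
  const match = stem.match(/^([A-Za-z0-9_]+?)_(\d{4})_(\d{2})_(\d{2})$/);
  if (!match) return null;

  const [, vertical, yyyy, mm, dd] = match;
  const year = Number(yyyy);
  const month = Number(mm);
  const day = Number(dd);

  // Date.UTC normalizes overflow (for example, Feb 30 becomes Mar 2), so a
  // round-trip comparison rejects dates that do not exist.
  const probe = new Date(Date.UTC(year, month - 1, day));
  const isRealDate =
    probe.getUTCFullYear() === year &&
    probe.getUTCMonth() === month - 1 &&
    probe.getUTCDate() === day;
  if (!isRealDate) return null;

  return { vertical, date: `${yyyy}-${mm}-${dd}` };
}
```

With this variant, a name such as `SR_2026_13_40.xlsx` would be treated as unrecognized instead of producing an invalid `report_date`.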

## API Endpoints

### Upload Flow

**`POST /api/compliance/vcl-multi/preview`**

Accepts multiple xlsx files via multipart form data. Parses each, computes per-vertical scoped diffs.

- Auth: `requireAuth()`, `requireGroup('Admin', 'Standard_User')`
- Body: multipart/form-data with field `files` (array of xlsx files)
- Response:
```json
{
  "files": [
    {
      "filename": "NTS_AEO_2026_05_11.xlsx",
      "vertical": "NTS_AEO",
      "report_date": "2026-05-11",
      "total_items": 342,
      "diff": { "new_count": 12, "recurring_count": 320, "resolved_count": 8 },
      "summary_entries": 24,
      "tempFile": "/path/to/temp.json"
    }
  ],
  "unrecognized": ["weird_file.xlsx"]
}
```

**`POST /api/compliance/vcl-multi/commit`**

Commits all previewed files in a single transaction.

- Auth: `requireAuth()`, `requireGroup('Admin', 'Standard_User')`
- Body: `{ files: [{ tempFile, vertical, report_date, filename }] }`
- Response:
```json
{
  "committed": [
    { "vertical": "NTS_AEO", "upload_id": 45, "new_count": 12, "recurring_count": 320, "resolved_count": 8 }
  ],
  "total_new": 85,
  "total_resolved": 42
}
```

### Reporting

**`GET /api/compliance/vcl-multi/stats`**

Aggregated cross-vertical executive summary.

- Response:
```json
{
  "stats": {
    "total_devices": 4200,
    "compliant": 3800,
    "non_compliant": 400,
    "compliance_pct": 90,
    "target_pct": 95
  },
  "donut": {
    "blocked": { "count": 120, "pct": 30 },
    "in_progress": { "count": 280, "pct": 70 }
  },
  "vertical_breakdown": [
    {
      "vertical": "NTS_AEO",
      "total_devices": 800,
      "compliant": 720,
      "non_compliant": 80,
      "compliance_pct": 90,
      "blockers": 25,
      "forecast_burndown": { "2026-06": 20, "2026-07": 35 },
      "last_upload": "2026-05-11"
    }
  ],
  "last_upload_date": "2026-05-11"
}
```

**`GET /api/compliance/vcl-multi/trend`**

Monthly trend data for the overview chart.

- Response:
```json
{
  "months": [
    {
      "month": "2026-03",
      "compliance_pct": 85,
      "compliant": 3400,
      "non_compliant": 600,
      "forecast_pct": null,
      "target_pct": 95
    }
  ]
}
```

**`GET /api/compliance/vcl-multi/vertical/:code/metrics`**

Per-metric breakdown for a specific vertical.

- Response:
```json
{
  "vertical": "NTS_AEO",
  "metrics": [
    {
      "metric_id": "5.2.4",
      "metric_desc": "MFA enforcement on privileged accounts",
      "category": "Access & MFA",
      "non_compliant": 15,
      "compliant": 785,
      "total": 800,
      "compliance_pct": 98.1,
      "target": 100
    }
  ],
  "categories": [
    { "category": "Access & MFA", "non_compliant": 45, "compliance_pct": 94.4 }
  ]
}
```

**`GET /api/compliance/vcl-multi/vertical/:code/metric/:metricId/devices`**

Device list for a specific vertical + metric combination.

- Response:
```json
{
  "devices": [
    {
      "hostname": "srv-nts-001",
      "ip_address": "10.1.2.3",
      "device_type": "Router",
      "team": "STEAM",
      "seen_count": 4,
      "first_seen": "2026-03-15",
      "last_seen": "2026-05-11",
      "resolution_date": "2026-07-01",
      "remediation_plan": "Scheduled for next maintenance window"
    }
  ]
}
```

**`GET /api/compliance/vcl-multi/vertical/:code/burndown`**

Burndown forecast for a specific vertical.

- Response:
```json
{
  "vertical": "NTS_AEO",
  "total_non_compliant": 80,
  "blockers": 25,
  "with_dates": 55,
  "monthly_forecast": { "2026-06": 20, "2026-07": 35 },
  "projected_clear_date": "2026-08"
}
```

## Frontend Components

### New Page: `VCLMultiVerticalPage.js`

Top-level page accessible from the nav drawer. Contains:

| Component | Purpose |
|---|---|
| `MultiVerticalUploadModal` | Multi-file drag-drop, filename parsing, batch preview, commit |
| `VCLMultiStatsBar` | Aggregated stats across all verticals |
| `VCLMultiVerticalTable` | Breakdown table with one row per vertical, clickable for drill-down |
| `VCLMultiTrendChart` | Monthly compliance trend with forecast line |
| `VCLMultiDonutChart` | Blocked vs In-Progress donut |
| `VerticalDetailView` | Per-metric breakdown when a vertical is selected |
| `MetricDeviceList` | Device list when a metric is selected within a vertical |
| `VerticalBurndownChart` | Per-vertical burndown projection |

### Navigation

- New entry in `NavDrawer.js`: "VCL Multi-Vertical" (or "CCP Metrics")
- Separate from existing "Compliance" and "VCL Report" entries
- Icon: `BarChart3` or `Building2` from lucide-react

### Drill-down UX Flow

```
VCL Multi-Vertical Overview
├── Stats Bar (aggregated)
├── Trend Chart (aggregated)
├── Donut Chart (aggregated)
└── Vertical Breakdown Table
    ├── NTS_AEO (90%) → click
    │   ├── Metric Breakdown
    │   │   ├── 5.2.4: Access & MFA (98.1%) → click
    │   │   │   └── Device List (15 devices)
    │   │   ├── 1.1.1: Logging & Monitoring (85%) → click
    │   │   │   └── Device List (120 devices)
    │   │   └── ...
    │   └── Burndown Chart (vertical-specific)
    ├── SDIT_CISO (92%) → click
    │   └── ...
    └── ...
```

## Scoped Resolution Logic

This is the core architectural change from the existing upload flow.

### Current behavior (single-file)

```javascript
// Resolves ALL active items not in the upload (global scope)
for (const [key, row] of Object.entries(activeMap)) {
  if (!newKeys.has(key)) {
    await client.query(`UPDATE compliance_items SET status = 'resolved' WHERE id = $1`, [row.id]);
  }
}
```

### New behavior (multi-vertical)

```javascript
// Resolves only items within the same vertical (scoped)
const { rows: activeRows } = await client.query(
  `SELECT id, hostname, metric_id, seen_count FROM compliance_items
   WHERE status = 'active' AND vertical = $1`,
  [vertical]
);

// activeMap is built from activeRows, keyed the same way as newKeys (construction not shown)
for (const [key, row] of Object.entries(activeMap)) {
  if (!newKeys.has(key)) {
    await client.query(
      `UPDATE compliance_items SET status = 'resolved', resolved_upload_id = $1 WHERE id = $2`,
      [uploadId, row.id]
    );
  }
}
```

The key difference is the `AND vertical = $1` filter on the active items query; the update also records `resolved_upload_id`. The filter ensures that uploading NTS_AEO data never touches SDIT_CISO items.
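
To make the scoping concrete, the diff can be modeled in memory without a database. The sketch below assumes an item is identified by `hostname` plus `metric_id` (the columns selected in the query above); the actual key format and the `computeScopedDiff` name are assumptions for illustration.

```javascript
// In-memory sketch of the vertical-scoped diff (hypothetical helper, no database).
// Assumption: an item is identified by hostname + metric_id.
const itemKey = (item) => `${item.hostname}|${item.metric_id}`;

function computeScopedDiff(activeItems, uploadedItems, vertical) {
  // Only active items in the uploaded file's vertical are candidates for resolution.
  const scoped = activeItems.filter((i) => i.vertical === vertical);
  const activeKeys = new Set(scoped.map(itemKey));
  const newKeys = new Set(uploadedItems.map(itemKey));

  return {
    new: [...newKeys].filter((k) => !activeKeys.has(k)),
    recurring: [...newKeys].filter((k) => activeKeys.has(k)),
    resolved: [...activeKeys].filter((k) => !newKeys.has(k)),
  };
}

// Example: an NTS_AEO upload never sees the SDIT_CISO item.
const active = [
  { hostname: 'srv-a', metric_id: '5.2.4', vertical: 'NTS_AEO' },
  { hostname: 'srv-b', metric_id: '5.2.4', vertical: 'NTS_AEO' },
  { hostname: 'srv-c', metric_id: '1.1.1', vertical: 'SDIT_CISO' },
];
const uploaded = [
  { hostname: 'srv-a', metric_id: '5.2.4' },
  { hostname: 'srv-d', metric_id: '1.1.1' },
];
const diff = computeScopedDiff(active, uploaded, 'NTS_AEO');
console.log(diff);
```

In this example `srv-b` is resolved, `srv-d` is new, and `srv-c` is not part of the diff at all because it belongs to a different vertical.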

## Burndown Forecast Computation

### Per-vertical burndown

For each vertical, the burndown is computed from `resolution_date` values on active non-compliant items:

```javascript
// Assumes resolution_date is an ISO 'YYYY-MM-DD' string or null.
function computeVerticalBurndown(items) {
  const total = items.length;
  const withDates = items.filter(i => i.resolution_date != null);
  // Items without a resolution date are treated as blockers.
  const blockers = items.filter(i => i.resolution_date == null);

  // Bucket by month
  const monthly = {};
  for (const item of withDates) {
    const month = item.resolution_date.slice(0, 7); // YYYY-MM
    monthly[month] = (monthly[month] || 0) + 1;
  }

  // Cumulative projection; blockers stay in `remaining` because they have no date
  let remaining = total;
  const projection = {};
  for (const month of Object.keys(monthly).sort()) {
    remaining -= monthly[month];
    projection[month] = { remediated: monthly[month], remaining };
  }

  return { total, blockers: blockers.length, with_dates: withDates.length, monthly, projection };
}
```

### Aggregated trend forecast

The trend chart forecast uses linear regression on the last 3+ monthly snapshots to project forward. This reuses the same approach as the existing VCL trend endpoint.
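
The regression step can be sketched as an ordinary least-squares fit over (month index, `compliance_pct`) pairs. The function name `forecastCompliance`, the clamping to the 0 to 100 range, and the rounding to one decimal place are assumptions here; the existing VCL trend endpoint may differ in detail.

```javascript
// Sketch of a least-squares trend projection (hypothetical helper).
// history: compliance_pct values for consecutive months, oldest first.
// monthsAhead: how many future months to project.
function forecastCompliance(history, monthsAhead) {
  const n = history.length;
  if (n < 3) return []; // Assumption: fewer than 3 snapshots is too little to fit a trend.

  // x is the month index 0..n-1, y is the compliance percentage.
  const meanX = (n - 1) / 2;
  const meanY = history.reduce((sum, y) => sum + y, 0) / n;

  let covXY = 0;
  let varX = 0;
  history.forEach((y, x) => {
    covXY += (x - meanX) * (y - meanY);
    varX += (x - meanX) ** 2;
  });

  const slope = covXY / varX;
  const intercept = meanY - slope * meanX;

  const forecast = [];
  for (let step = 1; step <= monthsAhead; step++) {
    const x = n - 1 + step;
    const pct = intercept + slope * x;
    // Clamp to the valid percentage range and round to one decimal place.
    forecast.push(Math.round(Math.min(100, Math.max(0, pct)) * 10) / 10);
  }
  return forecast;
}

console.log(forecastCompliance([85, 87, 89], 2));
```

Each projected value could populate `forecast_pct` for a future month in the trend response, while months that already have a snapshot keep `forecast_pct: null`.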

## Migration Script

```javascript
// backend/migrations/add_vcl_multi_vertical.js
const pool = require('../db');

async function run() {
  console.log('Starting VCL multi-vertical migration...');
  try {
    // Add vertical column to compliance_items
    await pool.query(`ALTER TABLE compliance_items ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL`);
    console.log('✓ vertical column added to compliance_items');

    await pool.query(`CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical ON compliance_items(vertical)`);
    console.log('✓ idx_compliance_items_vertical index created');

    await pool.query(`CREATE INDEX IF NOT EXISTS idx_compliance_items_vertical_status ON compliance_items(vertical, status)`);
    console.log('✓ idx_compliance_items_vertical_status index created');

    // Add vertical column to compliance_uploads
    await pool.query(`ALTER TABLE compliance_uploads ADD COLUMN IF NOT EXISTS vertical TEXT DEFAULT NULL`);
    console.log('✓ vertical column added to compliance_uploads');

    // Create summary table for per-vertical metric data
    await pool.query(`
      CREATE TABLE IF NOT EXISTS vcl_multi_vertical_summary (
        id SERIAL PRIMARY KEY,
        upload_id INTEGER NOT NULL REFERENCES compliance_uploads(id) ON DELETE CASCADE,
        vertical TEXT NOT NULL,
        metric_id TEXT NOT NULL,
        metric_desc TEXT DEFAULT '',
        category TEXT DEFAULT 'Other',
        team TEXT DEFAULT '',
        priority TEXT DEFAULT '',
        non_compliant INTEGER DEFAULT 0,
        compliant INTEGER DEFAULT 0,
        total INTEGER DEFAULT 0,
        compliance_pct NUMERIC(5,2) DEFAULT 0,
        target NUMERIC(5,2) DEFAULT 0,
        status TEXT DEFAULT '',
        created_at TIMESTAMPTZ DEFAULT NOW()
      )
    `);
    console.log('✓ vcl_multi_vertical_summary table created');

    await pool.query(`CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_vertical ON vcl_multi_vertical_summary(vertical)`);
    console.log('✓ idx_vcl_multi_summary_vertical index created');

    await pool.query(`CREATE INDEX IF NOT EXISTS idx_vcl_multi_summary_upload ON vcl_multi_vertical_summary(upload_id)`);
    console.log('✓ idx_vcl_multi_summary_upload index created');

  } catch (err) {
    console.error('Migration error:', err.message);
    process.exit(1);
  }
  console.log('Migration complete.');
  process.exit(0);
}

run();
```

## Correctness Properties

### Property 1: Vertical-Scoped Resolution Isolation

*For any* set of active compliance items across N verticals, committing an upload for vertical X must only resolve items where `vertical = X`. The count of active items for all other verticals must remain unchanged before and after the commit.

### Property 2: Filename Parsing Completeness

*For any* filename matching the pattern `<VERTICAL>_YYYY_MM_DD.xlsx` where VERTICAL contains only alphanumeric characters and underscores, `parseVerticalFilename` must return a non-null result with the correct vertical code and ISO date string.

### Property 3: Aggregated Stats Consistency

*For any* set of per-vertical stats, the aggregated `total_devices` must equal the sum of all vertical `total_devices`, `compliant` must equal the sum of all vertical `compliant`, and `compliance_pct` must equal `Math.round((sum_compliant / sum_total) * 100)`.
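
A rollup that satisfies this property can be sketched as follows. `aggregateVerticalStats` is a hypothetical name, and returning 0% when there are no devices is an assumption added to avoid dividing by zero.

```javascript
// Sketch of the cross-vertical rollup described by Property 3 (hypothetical helper).
function aggregateVerticalStats(verticals) {
  const totals = verticals.reduce(
    (acc, v) => ({
      total_devices: acc.total_devices + v.total_devices,
      compliant: acc.compliant + v.compliant,
      non_compliant: acc.non_compliant + v.non_compliant,
    }),
    { total_devices: 0, compliant: 0, non_compliant: 0 }
  );

  // Assumption: with no devices at all, report 0% instead of dividing by zero.
  const compliance_pct =
    totals.total_devices === 0
      ? 0
      : Math.round((totals.compliant / totals.total_devices) * 100);

  return { ...totals, compliance_pct };
}

const rollup = aggregateVerticalStats([
  { vertical: 'NTS_AEO', total_devices: 800, compliant: 720, non_compliant: 80 },
  { vertical: 'SDIT_CISO', total_devices: 500, compliant: 460, non_compliant: 40 },
]);
console.log(rollup);
```

The aggregate percentage is computed from the summed counts, not by averaging the per-vertical percentages, so larger verticals carry proportionally more weight.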

### Property 4: Burndown Forecast Conservation

*For any* set of non-compliant items with resolution dates, the sum of all monthly burndown bucket counts must equal the count of items with non-null resolution dates. No item is double-counted or lost.
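
A check for this property might look like the sketch below. The helper names `bucketByMonth` and `isBurndownConserved` are hypothetical, and the code assumes `resolution_date` is an ISO `YYYY-MM-DD` string or `null`.

```javascript
// Sketch of a Property 4 check (hypothetical test helpers).
// Assumption: resolution_date is an ISO 'YYYY-MM-DD' string or null.
function bucketByMonth(items) {
  const monthly = {};
  for (const item of items) {
    if (item.resolution_date == null) continue; // blockers are not bucketed
    const month = item.resolution_date.slice(0, 7); // YYYY-MM
    monthly[month] = (monthly[month] || 0) + 1;
  }
  return monthly;
}

function isBurndownConserved(items) {
  const monthly = bucketByMonth(items);
  const bucketed = Object.values(monthly).reduce((sum, n) => sum + n, 0);
  const withDates = items.filter((i) => i.resolution_date != null).length;
  return bucketed === withDates;
}

const sampleItems = [
  { hostname: 'srv-a', resolution_date: '2026-06-15' },
  { hostname: 'srv-b', resolution_date: '2026-06-30' },
  { hostname: 'srv-c', resolution_date: '2026-07-01' },
  { hostname: 'srv-d', resolution_date: null },
];
console.log(bucketByMonth(sampleItems), isBurndownConserved(sampleItems));
```

In a test suite, the same check could be run against randomly generated item lists instead of a fixed sample.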

### Property 5: Idempotent Re-upload

*For any* vertical, uploading the same file twice on the same day must produce the same final state as uploading it once. Specifically: same active item set, same seen_counts, same resolved set.
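
One way to obtain this behavior is to increment `seen_count` only when the report date differs from the item's last recorded date. The in-memory model below illustrates that idea; `commitVertical`, the `Map`-based state, and the `last_seen` comparison are all assumptions, not the committed design.

```javascript
// In-memory sketch of an idempotent, vertical-scoped commit (hypothetical model).
// Assumption: seen_count increments only when the report date differs from
// last_seen, which is one way to make a same-day re-upload a no-op.
function commitVertical(state, vertical, reportDate, uploadedKeys) {
  const incoming = new Set(uploadedKeys);

  for (const key of incoming) {
    const id = `${vertical}|${key}`;
    const existing = state.get(id);
    if (!existing) {
      state.set(id, { vertical, key, status: 'active', seen_count: 1, last_seen: reportDate });
      continue;
    }
    if (existing.last_seen !== reportDate) {
      existing.seen_count += 1;
      existing.last_seen = reportDate;
    }
    existing.status = 'active'; // an item present in the upload is always active
  }

  // Scoped resolution: only active items of this vertical can be resolved.
  for (const item of state.values()) {
    if (item.vertical === vertical && item.status === 'active' && !incoming.has(item.key)) {
      item.status = 'resolved';
    }
  }
  return state;
}

const state = new Map();
commitVertical(state, 'NTS_AEO', '2026-05-04', ['srv-a|5.2.4', 'srv-b|5.2.4']);
commitVertical(state, 'NTS_AEO', '2026-05-11', ['srv-a|5.2.4', 'srv-c|1.1.1']);
const once = JSON.stringify([...state]);
commitVertical(state, 'NTS_AEO', '2026-05-11', ['srv-a|5.2.4', 'srv-c|1.1.1']);
const twice = JSON.stringify([...state]);
console.log(once === twice);
```

Comparing the serialized state before and after the repeated commit is a direct way to test the property.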

## Error Handling

| Condition | Behavior |
|---|---|
| Filename doesn't match pattern | File flagged as "unrecognized" in preview; user can assign vertical manually |
| Duplicate vertical in batch | Reject: only one file per vertical per batch |
| Parser failure on one file | That file is marked as errored; other files in batch can still proceed |
| Transaction failure during commit | Full rollback of entire batch; no partial commits |
| File exceeds 10MB | Rejected by multer before parsing |
| No items parsed from file | Warning in preview; user can still commit (creates upload record with 0 items) |
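
The filename, duplicate, and size rules in the table can be sketched as a single classification pass over the batch. `classifyBatch` and its input shape are hypothetical, and 10MB is interpreted here as 10 × 1024 × 1024 bytes. The table does not say whether a duplicate rejects the whole batch or only the later file, so the sketch only reports duplicates and leaves that decision to the caller. In the real flow the size limit is enforced by multer; it is included here only to keep the example complete.

```javascript
// Sketch of batch validation for the preview step (hypothetical helper).
// Input: [{ filename, vertical, size }] where vertical is null when the
// filename did not match the expected pattern.
const MAX_BYTES = 10 * 1024 * 1024; // Assumption: "10MB" means 10 MiB.

function classifyBatch(files) {
  const result = { accepted: [], unrecognized: [], duplicates: [], tooLarge: [] };
  const seen = new Set();

  for (const file of files) {
    if (file.size > MAX_BYTES) {
      result.tooLarge.push(file.filename);
    } else if (file.vertical == null) {
      result.unrecognized.push(file.filename);
    } else if (seen.has(file.vertical)) {
      // Only one file per vertical per batch; later files are reported.
      result.duplicates.push(file.filename);
    } else {
      seen.add(file.vertical);
      result.accepted.push(file.filename);
    }
  }
  return result;
}

const batch = classifyBatch([
  { filename: 'NTS_AEO_2026_05_11.xlsx', vertical: 'NTS_AEO', size: 120000 },
  { filename: 'NTS_AEO_2026_05_12.xlsx', vertical: 'NTS_AEO', size: 130000 },
  { filename: 'weird_file.xlsx', vertical: null, size: 5000 },
  { filename: 'SR_2026_05_11.xlsx', vertical: 'SR', size: 11 * 1024 * 1024 },
]);
console.log(batch);
```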

## Deployment Considerations

- The feature is self-contained behind `/api/compliance/vcl-multi/` endpoints
- Can be deployed on a separate instance with its own database
- No changes to existing AEO compliance upload flow
- Feature flag not needed: the nav entry and endpoints simply exist or don't
- Environment variable `VCL_TARGET_PCT` (default 95) applies to multi-vertical reporting as well