This document summarizes the design decisions and architectural choices for the VCL Multi-Vertical Upload feature. It is intended as a reference for presenting the approach to stakeholders and the compliance team.
---
## What We Are Building
A new upload flow on the STEAM Security Dashboard that accepts multiple per-vertical compliance xlsx files (one per organizational vertical), ingests them with vertical-scoped resolution logic, and generates an executive-level VCL compliance report across all organizations — with drill-down by vertical and by metric.
This is a POC. The compliance team currently exports data from CyberMetrics as xlsx files on a 24-hour cycle. This feature lets them upload those files and generate the same reports they currently build manually in PowerPoint/Excel for senior leadership.
---
## The Problem It Solves
Today the compliance team has 14 separate xlsx files — one per vertical (NTS_AEO, SDIT_CISO, TSI, etc.). The existing dashboard upload flow accepts a single consolidated file and treats it as the complete compliance state. If you upload just one vertical's file, the system incorrectly marks every device from the other 13 verticals as "resolved."
There is no automated way to:
- Ingest all 14 files and produce a unified report
- Drill down from the organizational view into specific metrics and devices
- Generate burndown forecasts across verticals
---
## Key Architectural Decisions
### 1. Vertical-Scoped Resolution
**Decision:** When a file for vertical X is committed, only items belonging to vertical X are evaluated for resolution. All other verticals are untouched.
**Why:** This is the fundamental change that makes per-vertical uploads safe. Without it, uploading one file would destroy data from the other 13 verticals.
**Implication:** Verticals are independent. You can upload NTS_AEO on Monday and SDIT_CISO on Wednesday without interference. This also supports the daily upload cadence the compliance team wants.
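A minimal sketch of what vertical-scoped resolution could look like. The `Item` class, the `resolve_missing` helper, and the hostnames are hypothetical and exist only for illustration; the real `compliance_items` schema is not shown in this document.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Item:
    vertical: Optional[str]  # None models legacy AEO rows (vertical = NULL)
    hostname: str
    metric_id: str
    resolved: bool = False


def resolve_missing(items: List[Item], vertical: str,
                    seen: Set[Tuple[str, str]]) -> int:
    """Resolve open items of `vertical` that are absent from the upload.

    `seen` holds the (hostname, metric_id) pairs found in the uploaded file.
    Items belonging to any other vertical are skipped entirely.
    """
    resolved = 0
    for item in items:
        if item.vertical != vertical or item.resolved:
            continue  # out of scope, or already resolved
        if (item.hostname, item.metric_id) not in seen:
            item.resolved = True
            resolved += 1
    return resolved


items = [
    Item("NTS_AEO", "rtr-01", "5.5.4i"),
    Item("NTS_AEO", "rtr-02", "5.5.4i"),
    Item("SDIT_CISO", "srv-09", "7.1.1"),
]
# The NTS_AEO file only lists rtr-01, so rtr-02 is resolved.
# The SDIT_CISO item is not evaluated at all.
count = resolve_missing(items, "NTS_AEO", {("rtr-01", "5.5.4i")})
```

Because the scope check comes first, an upload for one vertical cannot resolve items that belong to another.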
### 2. Vertical Identity Comes From the Filename
**Decision:** The vertical code is extracted from the filename pattern `<VERTICAL>_YYYY_MM_DD.xlsx`, not from data inside the xlsx.
**Why:** The internal xlsx structure is identical across verticals — same Summary sheet, same metric detail sheets, same columns. The only differentiator is the filename. This also means the Python parser requires zero changes.
**Implication:** Filenames must follow the convention. If they don't, the system flags them as "unrecognized" and the user can manually assign a vertical. This is a reasonable tradeoff for a POC.
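The filename convention can be checked with a single regular expression. The sketch below is an assumption about how that extraction might be written: `parse_upload_filename` and `ParsedName` are hypothetical names, and upper-casing the vertical code is a choice made here, not something the source specifies. Vertical codes such as NTS_AEO contain underscores themselves, so the pattern anchors the date at the end of the name.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Everything before the trailing _YYYY_MM_DD.xlsx is treated as the vertical code.
FILENAME_RE = re.compile(
    r"^(?P<vertical>[A-Za-z0-9][A-Za-z0-9_-]*?)"
    r"_(?P<y>\d{4})_(?P<m>\d{2})_(?P<d>\d{2})\.xlsx$",
    re.IGNORECASE,
)


class ParsedName(NamedTuple):
    vertical: str
    report_date: date


def parse_upload_filename(filename: str) -> Optional[ParsedName]:
    """Return the vertical and report date, or None if the name is unrecognized."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    try:
        report_date = date(int(match["y"]), int(match["m"]), int(match["d"]))
    except ValueError:  # e.g. month 13 or day 40
        return None
    return ParsedName(match["vertical"].upper(), report_date)


ok = parse_upload_filename("NTS_AEO_2024_03_15.xlsx")
bad_date = parse_upload_filename("TSI_2024_13_40.xlsx")
no_date = parse_upload_filename("quarterly_report.xlsx")
```

A `None` result corresponds to the "unrecognized" state, where the user assigns the vertical manually.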
### 3. Separate From Existing AEO Upload
**Decision:** This is a new flow with its own endpoints (`/api/compliance/vcl-multi/...`), its own UI page, and its own nav entry. The existing AEO compliance upload is unchanged.
**Why:**
- The existing flow works for the STEAM/ACCESS-ENG team's day-to-day operations
- The compliance team may deploy this on a separate instance to experiment without affecting production
- Different user groups with different needs — engineers vs. compliance analysts vs. senior leadership
**Implication:** There are now two ways to upload compliance data. They coexist via the `vertical` column — existing AEO data has `vertical = NULL`, multi-vertical data has a vertical code. The VCL report page can aggregate either or both.
### 4. Two-Dimensional Grouping (Vertical + Team)
**Decision:** `vertical` and `team` are separate fields. Vertical is the organizational unit (NTS_AEO, SDIT_CISO). Team is the sub-team within a vertical (STEAM, ACCESS-ENG, ACCESS-OPS).
**Why:** NTS_AEO contains multiple sub-teams. Senior leadership wants to see the vertical-level view. The STEAM team wants to see their team-level view. Both are valid groupings on the same data.
**Implication:** The cross-organizational report groups by vertical. Drilling into NTS_AEO still shows the STEAM/ACCESS-ENG/ACCESS-OPS breakdown because that data exists in the "Team" column inside the xlsx.
### 5. Summary Sheet Data Stored Separately
**Decision:** The parsed Summary sheet (metric-level health data) is stored in a dedicated `vcl_multi_vertical_summary` table, not just as JSON on the upload record.
**Why:** The metric drill-down view needs to query per-metric compliance percentages and targets efficiently. Storing structured rows enables filtering, sorting, and aggregation at the database level rather than parsing JSON blobs in application code.
**Implication:** Slightly more storage, but queries like "show me all metrics below target across all verticals" run as ordinary SQL (and can be indexed) instead of requiring the application to load and parse every upload's JSON.
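As an illustration of the kind of query this enables, here is a sketch using an in-memory SQLite database. Only the table name comes from this document. The column names, the `ALL: TSI` rollup label, and the sample values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; compliance and target are stored as decimals in [0, 1].
conn.execute(
    """
    CREATE TABLE vcl_multi_vertical_summary (
        vertical TEXT NOT NULL,
        metric_id TEXT NOT NULL,
        team TEXT NOT NULL,
        current_compliance REAL NOT NULL,
        target REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO vcl_multi_vertical_summary VALUES (?, ?, ?, ?, ?)",
    [
        ("NTS_AEO", "5.5.4i", "ALL: NTS-AEO", 0.85, 0.95),
        ("NTS_AEO", "7.1.1", "ALL: NTS-AEO", 0.97, 0.95),
        ("NTS_AEO", "5.5.4i", "STEAM", 0.80, 0.95),
        ("TSI", "7.1.1", "ALL: TSI", 0.70, 0.80),
    ],
)


def metrics_below_target(conn: sqlite3.Connection) -> list:
    """Rollup rows whose current compliance is under the per-metric target."""
    return conn.execute(
        """
        SELECT vertical, metric_id, current_compliance, target
        FROM vcl_multi_vertical_summary
        WHERE team LIKE 'ALL:%' AND current_compliance < target
        ORDER BY vertical, metric_id
        """
    ).fetchall()


below = metrics_below_target(conn)
```

The filter and sort happen in the database, so the application only receives the rows it needs to display.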
### 6. Batch Upload With Atomic Commit
**Decision:** All files in a batch are committed in a single database transaction. If any file fails, the entire batch rolls back.
**Why:** Partial commits would leave the report in an inconsistent state — some verticals updated, others stale. The compliance team uploads all 14 files together as a reporting cycle. It should either all succeed or all fail.
**Implication:** If one file has a parsing error, the user is shown the error in the preview phase (before commit). They can remove that file from the batch and commit the rest. Once they hit "Commit," it's all-or-nothing.
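The all-or-nothing behavior can be sketched with a single database transaction. The example below uses SQLite and a simplified, hypothetical `compliance_items` schema. `commit_batch` is an illustrative name, and the production database and columns may differ.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple

Row = Tuple[Optional[str], str]  # (hostname, metric_id)


def commit_batch(conn: sqlite3.Connection, batch: Dict[str, List[Row]]) -> None:
    """Insert every file's rows in one transaction.

    Used as a context manager, the connection commits if the block finishes
    and rolls back if an exception escapes it.
    """
    with conn:
        for vertical, rows in batch.items():
            conn.executemany(
                "INSERT INTO compliance_items (vertical, hostname, metric_id) "
                "VALUES (?, ?, ?)",
                [(vertical, host, metric) for host, metric in rows],
            )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compliance_items ("
    "vertical TEXT, hostname TEXT NOT NULL, metric_id TEXT NOT NULL)"
)
conn.commit()

# The second file contains a row with no hostname, which violates NOT NULL.
batch = {
    "NTS_AEO": [("rtr-01", "5.5.4i")],
    "TSI": [(None, "7.1.1")],
}
try:
    commit_batch(conn, batch)
except sqlite3.IntegrityError:
    pass  # the whole batch was rolled back

remaining = conn.execute("SELECT COUNT(*) FROM compliance_items").fetchone()[0]
```

Even though the NTS_AEO rows were inserted successfully before the failure, none of them remain after the rollback.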
### 7. Daily Upload Support (Idempotent)
**Decision:** Re-uploading the same vertical on the same day produces the same final state as uploading it once. The system doesn't create duplicate records.
**Why:** CyberMetrics refreshes on a 24-hour cycle. The compliance team may want to upload daily to track movement. They shouldn't have to worry about "did I already upload today?"
**Implication:** The resolution logic uses `vertical + hostname + metric_id` as the identity key. Recurring items get their `seen_count` incremented and metadata updated. New items are inserted. Missing items are resolved. Same logic as today, just scoped to the vertical.
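A sketch of the identity-key upsert, under stated assumptions. `Record`, `apply_upload`, and the in-memory `store` are hypothetical stand-ins for the database. The source does not say how `seen_count` stays stable on a same-day re-upload. The guard used here (increment only when the report date advances) is one possible approach, not a confirmed design. Resolution of missing items is left out to keep the example short.

```python
import copy
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str, str]  # (vertical, hostname, metric_id)


@dataclass
class Record:
    first_seen: date
    last_seen: date
    seen_count: int = 1


def apply_upload(store: Dict[Key, Record], vertical: str,
                 rows: Iterable[Tuple[str, str]], report_date: date) -> None:
    """Insert new items and update recurring ones, keyed by identity."""
    for hostname, metric_id in set(rows):  # de-duplicate rows within the file
        key = (vertical, hostname, metric_id)
        record = store.get(key)
        if record is None:
            store[key] = Record(first_seen=report_date, last_seen=report_date)
        elif report_date > record.last_seen:
            # Assumption: only a newer report date counts as a new sighting.
            record.seen_count += 1
            record.last_seen = report_date


store: Dict[Key, Record] = {}
rows = [("rtr-01", "5.5.4i"), ("rtr-02", "7.1.1")]

apply_upload(store, "NTS_AEO", rows, date(2024, 3, 15))
after_first = copy.deepcopy(store)
apply_upload(store, "NTS_AEO", rows, date(2024, 3, 15))  # same-day re-upload
same_day_unchanged = store == after_first

apply_upload(store, "NTS_AEO", rows, date(2024, 3, 16))  # next day's upload
```

Keying the store on the identity tuple is what prevents duplicate records. The date guard is what keeps the second same-day upload from changing any counts.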
Each vertical's xlsx file contains two types of data:
1. **Summary sheet**: one row per metric per sub-team, with pre-calculated totals (compliant, non-compliant, total, compliance %, target). This is the source of truth for aggregate numbers.
2. **Detail sheets**: one sheet per metric, listing individual non-compliant devices (hostname, IP, device type, team). These feed the device-level drill-down.
### The Double-Counting Problem (and How We Solve It)
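The Summary sheet contains **two levels of rows** for each metric:
| Row type | Team value (example) | Contents |
|---|---|---|
| Sub-team row | STEAM | Totals for one sub-team for that metric |
| Rollup row | ALL: NTS-AEO | Sum of all sub-teams for that metric |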
The rollup row already includes all sub-team totals. If you sum all rows naively, you count every device twice.
**Solution:** All aggregate calculations (stats bar, vertical breakdown, category totals, snapshots) use **only the ALL: rollup rows**. Sub-team rows are stored for drill-down display but never included in totals.
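The effect of the rollup-only rule can be shown with a small worked example. The `SummaryRow` type, the `vertical_totals` helper, and the numbers are made up for illustration. Only the `ALL:` prefix convention comes from the Summary sheet as described here.

```python
from typing import Iterable, NamedTuple, Optional


class SummaryRow(NamedTuple):
    metric_id: str
    team: str
    compliant: int
    total: int


class Totals(NamedTuple):
    compliant: int
    non_compliant: int
    total: int
    compliance_pct: Optional[float]  # None when nothing was evaluated


def vertical_totals(rows: Iterable[SummaryRow]) -> Totals:
    """Aggregate using only the ALL: rollup rows."""
    rollups = [r for r in rows if r.team.startswith("ALL:")]
    compliant = sum(r.compliant for r in rollups)
    total = sum(r.total for r in rollups)
    pct = 100 * compliant / total if total else None
    return Totals(compliant, total - compliant, total, pct)


rows = [
    SummaryRow("5.5.4i", "STEAM", 40, 50),
    SummaryRow("5.5.4i", "ACCESS-ENG", 45, 50),
    SummaryRow("5.5.4i", "ALL: NTS-AEO", 85, 100),
]

naive_total = sum(r.total for r in rows)  # 200: every device counted twice
totals = vertical_totals(rows)            # total is 100
```

Summing every row reports 200 evaluations for a metric that covers 100, while the rollup-only calculation matches the spreadsheet.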
### What Each Number Means
| Metric | Source | Meaning |
|---|---|---|
| **Total Devices** | Sum of `total` from ALL: rows across all metrics for a vertical | Total device-metric pairs evaluated (a device appears once per metric it's measured against) |
| **Compliant** | Sum of `compliant` from ALL: rows | Device-metric pairs that pass the compliance check |
| **Non-Compliant** | Sum of `non_compliant` from ALL: rows | Device-metric pairs that fail |
| **Compliance %** | `compliant / total * 100` | Percentage of device-metric pairs passing |
| **Target %** | Per-metric value from the spreadsheet (e.g., 95%, 80%, 75%) | The threshold set by the compliance program |
| **Blockers** | Non-compliant devices in `compliance_items` with no `resolution_date` | Devices with no committed remediation timeline |
| **In-Progress** | Non-compliant devices with a `resolution_date` set | Devices with a planned fix date |
### Important: "Total Devices" Is Not Unique Devices
A single physical device (hostname) can appear in multiple metrics. For example, one router might be measured against metric 5.5.4i (vulnerability scanning), 7.1.1 (logging), and 2.3.6i (patching). The "Total Devices" count is the sum of all device-metric evaluations, not unique hostnames.
This matches how CyberMetrics reports — each metric has its own scope of applicable devices, and the overall compliance percentage reflects performance across all metrics.
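The difference between the two counts is easy to see with a few hypothetical evaluation records (the hostnames are invented, and the metric IDs are the ones used in the example above):

```python
# Each tuple is one device-metric evaluation: (hostname, metric_id).
evaluations = [
    ("rtr-01", "5.5.4i"),
    ("rtr-01", "7.1.1"),
    ("rtr-01", "2.3.6i"),
    ("sw-07", "7.1.1"),
]

total_devices = len(evaluations)                          # device-metric pairs
unique_hostnames = len({host for host, _ in evaluations})  # physical devices
```

Here "Total Devices" is 4 even though only 2 physical devices exist, because `rtr-01` is measured against three metrics.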
### Per-Metric Compliance Percentage
Each metric row shows its own compliance percentage, which comes directly from the Summary sheet's "Current Compliance" column. This is a decimal between 0 and 1 (displayed as 0–100% in the UI). The target is also per-metric — some metrics have a 95% target, others 80% or 75%, depending on the compliance program's priorities.
### Category Aggregation
Metrics are grouped into categories (Logging & Monitoring, Vulnerability Management, Access & MFA, Endpoint Protection, etc.) based on a static mapping in `compliance_config.json`. The category cards in the drill-down view show the aggregate compliance % across all metrics in that category, using only rollup rows.
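A sketch of the category roll-up, with several assumptions: the `metric_categories` key and the flat metric-to-category layout are guesses at the shape of `compliance_config.json`, the specific metric assignments are illustrative, and the aggregate is computed as a ratio of sums (total compliant over total evaluated) rather than an average of per-metric percentages.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical config shape; the real file may be structured differently.
CONFIG_JSON = """
{
  "metric_categories": {
    "7.1.1": "Logging & Monitoring",
    "5.5.4i": "Vulnerability Management",
    "2.3.6i": "Vulnerability Management"
  }
}
"""


def category_compliance(
    rollup_rows: Iterable[Tuple[str, int, int]],
    metric_categories: Dict[str, str],
) -> Dict[str, Optional[float]]:
    """Compliance % per category from (metric_id, compliant, total) rollup rows."""
    sums = defaultdict(lambda: [0, 0])  # category -> [compliant, total]
    for metric_id, compliant, total in rollup_rows:
        category = metric_categories.get(metric_id, "Uncategorized")
        sums[category][0] += compliant
        sums[category][1] += total
    return {cat: (100 * c / t if t else None) for cat, (c, t) in sums.items()}


mapping = json.loads(CONFIG_JSON)["metric_categories"]
rollups = [("7.1.1", 90, 100), ("5.5.4i", 85, 100), ("2.3.6i", 35, 100)]
by_category = category_compliance(rollups, mapping)
```

With these made-up numbers, Vulnerability Management comes out at 60% (120 of 200) and Logging & Monitoring at 90%.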
---
## Sub-Team Drill-Down
### How It Works
When you click into a vertical (e.g., NTS_AEO), the metrics table shows the **rollup totals** by default — one row per metric with the ALL: numbers. Two mechanisms expose sub-team data:
**1. Expand/Collapse (▸ arrow)**
Click the arrow on any metric row to reveal sub-team rows inline beneath it. Each sub-team row shows that team's compliant/non-compliant/total/% for that specific metric. The sub-team rows are visually indented and teal-highlighted.
This is useful for: "Which team is dragging down metric 5.5.4i?"
**2. Team Filter Buttons**
A row of filter buttons appears above the metrics table showing all teams in that vertical (e.g., ACCESS-ENG, ACCESS-OPS, INTELDEV, STEAM). Click one to filter the entire table to show only that team's numbers per metric. The "All (Rollup)" button returns to the aggregated view.
This is useful for: "Show me STEAM's compliance across all metrics."
### What "(Other)" Means
Some metrics have a team value of `(Other)` in the Summary sheet. This represents devices that don't map to a named sub-team. These are included in the ALL: rollup total but are not shown as a separate sub-team in the UI — they're noise for the compliance team's purposes.
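The two views and the `(Other)` rule can be sketched as simple row filters. The row layout and function names below are hypothetical, and the sample numbers are invented.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, str, int, int]  # (metric_id, team, compliant, total)

ROLLUP_PREFIX = "ALL:"
OTHER_TEAM = "(Other)"


def is_named_team(team: str) -> bool:
    """True for real sub-teams; False for rollup rows and (Other)."""
    return not team.startswith(ROLLUP_PREFIX) and team != OTHER_TEAM


def team_filter_options(rows: List[Row]) -> List[str]:
    """Team names to show as filter buttons, in alphabetical order."""
    return sorted({team for _, team, _, _ in rows if is_named_team(team)})


def rows_for_view(rows: List[Row], team: Optional[str] = None) -> List[Row]:
    """Rollup rows for the default view, or one team's rows when filtered."""
    if team is None:
        return [r for r in rows if r[1].startswith(ROLLUP_PREFIX)]
    return [r for r in rows if r[1] == team]


rows = [
    ("5.5.4i", "ALL: NTS-AEO", 90, 110),
    ("5.5.4i", "STEAM", 40, 50),
    ("5.5.4i", "ACCESS-ENG", 45, 50),
    ("5.5.4i", "(Other)", 5, 10),
]

buttons = team_filter_options(rows)
default_view = rows_for_view(rows)
steam_view = rows_for_view(rows, "STEAM")
```

The `(Other)` row still contributes to the rollup (90 of 110 here), but it never appears as a filter button or an expandable sub-team row.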
Clicking a sub-team row in the metric sub-team view navigates to the device list: the individual non-compliant hostnames for that vertical + metric + team combination. The list is filtered to devices belonging to the selected team. This data comes from the detail sheets (not the Summary sheet), which list each device's hostname, IP, device type, and team.
If a metric has no sub-team breakdown (e.g., only an "(Other)" team), a "View All Devices" button is shown instead, which loads the full unfiltered device list for that metric.
The feature leaves the following largely unchanged:
- Existing `compliance_items` table structure: the only change is a new nullable `vertical` column
- Python parser: reused as-is, no modifications
- Auth model: the same groups (Admin, Standard_User) are required for upload
---
## Deployment Options
| Option | Description |
|---|---|
| Same instance | Add the feature to the existing dashboard. Multi-vertical data coexists with AEO data via the `vertical` column. |
| Separate instance | Deploy a fresh instance with its own database. Compliance team experiments freely. No risk to dev/production data. |
| Later: API integration | Replace xlsx upload with direct CyberMetrics API calls. Backend endpoints stay the same — just a different client pushing data. |
The architecture supports all three without backend code changes; the API option only adds a new client that pushes data to the same endpoints. The `vertical` column and scoped resolution logic work regardless of deployment topology.
---
## Open Questions for the Meeting
1. **Vertical list**: Are the 14 verticals in the screenshot the complete set, or do new verticals get added periodically? (Affects whether we hardcode a list or keep it dynamic.)
2. **Target % per vertical**: Is the 95% target uniform across all verticals, or do different verticals have different targets?
3. **Access control**: Should the compliance team have their own user accounts with a specific role, or do they use existing Admin/Standard_User groups?
4. **Naming**: What should this page be called in the nav? "CCP Metrics", "VCL Multi-Vertical", "Compliance Reporting", something else?
5. **Retention**: How long should historical upload data be kept? (Affects trend chart depth and storage.)