Files
cve-dashboard/.kiro/specs/compliance-duplicate-failing-metrics/design.md
2026-05-19 15:01:25 -06:00

512 lines
47 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Compliance Duplicate Failing Metrics Bugfix Design
## Overview
Five compliance endpoints in `backend/routes/compliance.js` (`GET /items`, `GET /items/:hostname`, `GET /vcl/stats`, `GET /mttr`) and the `compliance_snapshots` block inside `persistUpload()` all share the same root cause: each one reads `compliance_items` with either no vertical filter or a filter that admits both legacy `vertical IS NULL` rows and multi-vertical `vertical = 'NTS_AEO'` rows, but does not deduplicate on `(hostname, metric_id)`. When the same `(hostname, metric_id)` pair exists in two verticals (the documented multi-vertical workflow), the duplicate row distorts the response — it duplicates a metric chip in the UI, double-counts a device across teams, inflates aging buckets, inflates forecast row counts, and (in the snapshot block) lets a single hostname appear in both `compliant` and `non_compliant` columns of the same `(snapshot_month, vertical)` row.
The fix is uniform across endpoints: dedupe at the SQL layer using `DISTINCT ON (hostname, metric_id)` with a deterministic `ORDER BY` (highest `seen_count`, then most recent `upload_id`) so each unique violation contributes exactly one row to the aggregation, and rewrite the snapshot query so each hostname is classified by its `MIN(status)` (active wins over resolved) inside a CTE before the count. The `groupByHostname` helper and the `/items/:hostname` response builder also gain a defensive in-memory dedupe keyed by `metric_id` (or `(metric_id, status)` for the detail endpoint), so a duplicate row that slips through any unforeseen code path still cannot duplicate a chip in the UI.
The implementation is intentionally minimal: each fix changes a single SQL statement and (for two endpoints) a small JavaScript loop. No schema migration, no new column, no frontend change. The frontend `ComplianceDetailPanel` already keys metrics on `metric_id` for chip rendering and will render correctly once the API stops returning duplicate rows.
## Glossary
- **Bug_Condition (C)**: The condition that triggers the bug — two or more rows in `compliance_items` share the same `(hostname, metric_id)` pair across verticals (i.e., one row with `vertical IS NULL` and another with `vertical = 'NTS_AEO'`, or two non-null verticals that both pass an endpoint's `WHERE` clause).
- **Property (P)**: The desired behavior when C holds — each affected endpoint returns or counts each `(hostname, metric_id)` exactly once, using the row with the highest `seen_count` (with most recent `upload_id` as tiebreak) as the representative.
- **Preservation**: Behavior on rows where `(hostname, metric_id)` is unique across verticals, on the empty-data response shape, and on unrelated query parameters (e.g., `team` and `status` on `/items`) — all must be byte-for-byte unchanged.
- **vertical**: `TEXT` column on `compliance_items` and `compliance_uploads` identifying which xlsx (NTS_AEO, SDIT_CISO, TSI) the row originated from. `NULL` indicates a legacy AEO-only upload. The bug class is specifically about rows that share `(hostname, metric_id)` but differ in `vertical`.
- **groupByHostname(rows, noteHostnames)**: Helper in `backend/routes/compliance.js` (lines ~213230) that flattens a list of joined `compliance_items` rows into one device object per hostname, pushing each row's `metric_id` onto `failing_metrics`. It performs no deduplication.
- **bucketAgingItems(items)**: Helper in `backend/routes/compliance.js` (lines ~234254) that places each item into one of four `seen_count` buckets per team. It iterates rows directly, so duplicate rows produce double-counted buckets.
- **persistUpload()**: Function in `backend/routes/compliance.js` (lines ~81192) that writes a parsed upload to the DB and then writes per-vertical rows into `compliance_snapshots`. The snapshot query at the end of this function counts hostnames per team using `COUNT(DISTINCT CASE WHEN status = 'X' THEN hostname END)`, which double-counts a hostname into both the `compliant` and `non_compliant` columns when the hostname has both `active` and `resolved` rows across verticals.
- **representative row**: For a duplicated `(hostname, metric_id)`, the row chosen by `DISTINCT ON (hostname, metric_id) ... ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC`. This is deterministic and aligns with the "use the row with the highest `seen_count` or most recent `upload_id`" rule from the requirements.
## Bug Details
### Bug Condition
The bug manifests when two or more rows in `compliance_items` share the same `(hostname, metric_id)` pair across different `vertical` values. This happens whenever the same device fails the same metric in more than one vertical's xlsx — including the originally reported case where a legacy `vertical IS NULL` AEO upload and a newer `vertical = 'NTS_AEO'` multi-vertical upload both contain `STEAM-INTERSIGHT` failing `7.1.1`.
**Formal Specification:**
```
FUNCTION isBugCondition(items)
INPUT: items — list of compliance_items rows
OUTPUT: boolean
// The bug condition is triggered for any (hostname, metric_id) pair with more than one row
GROUP items BY (hostname, metric_id) INTO groups
RETURN EXISTS group IN groups WHERE COUNT(group) > 1
END FUNCTION
```
For a single endpoint response to be considered buggy, the API output must additionally fail one of the following invariants (the per-endpoint manifestation of the same root cause):
```
FUNCTION isBuggyResponse(endpoint, response)
CASE endpoint OF
'/items': RETURN EXISTS device IN response.devices WHERE
COUNT(device.failing_metrics) != COUNT(DISTINCT metric_id IN device.failing_metrics)
'/items/:hostname': RETURN EXISTS (metric_id, status) WITH COUNT(*) > 1 IN response.metrics
'/vcl/stats heavy_hitters': RETURN SUM(hh.non_compliant FOR hh IN response.heavy_hitters) >
COUNT(DISTINCT hostname WHERE has_active_violation)
'/vcl/stats forecast': RETURN response.vertical_breakdown[i].blockers < 0
FOR SOME i WITH duplicated (hostname, metric_id) AND non-null resolution_date
'/mttr': RETURN SUM(b.total FOR b IN response.aging) >
COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')
'persistUpload snapshot': RETURN EXISTS row IN compliance_snapshots WHERE
compliant + non_compliant > total_devices
END CASE
END FUNCTION
```
### Examples
The originally reported case (GitLab issue #13, `STEAM-INTERSIGHT` / 172.16.30.40 / metric `7.1.1`) and the four sibling manifestations:
- **`/items`** — `STEAM-INTERSIGHT` has two active rows for `metric_id = '7.1.1'` (one with `vertical IS NULL`, one with `vertical = 'NTS_AEO'`). The handler's `WHERE` clause admits both via `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')`, and `groupByHostname()` does an unconditional `dev.failing_metrics.push(...)` per row. The device's `failing_metrics` array contains two entries with `metric_id = '7.1.1'`. Expected: one entry per unique `metric_id`.
- **`/items/:hostname`** — Same hostname, but the detail query has no vertical filter at all (just `WHERE ci.hostname = $1`). Both rows come back, the response builder maps `metricRows.map(...)` over both, and the frontend's `MetricChip` renders `7.1.1` twice in the "Failing Metrics" section. Expected: one entry per `(metric_id, status)` pair.
- **`/vcl/stats` heavy-hitters and per-team totals** — A device whose `team` differs between verticals (e.g., `STEAM` in the legacy row and `ACCESS-ENG` in the NTS_AEO row) is counted under both teams by `COUNT(DISTINCT hostname) ... GROUP BY team`. The sum across `heavy_hitters[*].non_compliant` exceeds the dashboard's `stats.non_compliant`. Expected: the hostname is assigned to exactly one team — the team from its representative row — and the per-team sums reconcile with the global non-compliant count.
- **`/vcl/stats` forecast-burndown** — For a team with one duplicated `(hostname, metric_id)` whose `resolution_date` is non-null in both rows, the forecast query `SELECT resolution_date FROM compliance_items WHERE status = 'active' AND team = $1 AND resolution_date IS NOT NULL` returns two rows. `forecastItems.length` is 2, but `teamNonCompliant` (from the de-team-counted DISTINCT-hostname query) is 1, so `blockers = 1 - 2 = -1`. The route then clamps to 0, hiding the underlying inconsistency. Expected: forecast is deduped by `(hostname, metric_id)` so the count matches `teamNonCompliant`'s scoping and `blockers` is non-negative.
- **`/mttr`** — Same duplicated `(hostname, metric_id)`. `bucketAgingItems()` receives both rows, increments the bucket twice. The team total for `STEAM` (or whichever team appears in the duplicate row) is inflated. Expected: each unique `(hostname, metric_id)` contributes to exactly one bucket using a single representative `seen_count`.
- **Edge case — `persistUpload()` snapshot** — A hostname has two rows: one with `status = 'resolved'` (legacy vertical) and one with `status = 'active'` (NTS_AEO vertical). The snapshot query
```sql
COUNT(DISTINCT CASE WHEN status = 'resolved' THEN hostname END) AS compliant,
COUNT(DISTINCT CASE WHEN status = 'active' THEN hostname END) AS non_compliant
```
counts the hostname once in `compliant` and once in `non_compliant`, so `compliant + non_compliant > total_devices`. Expected: the hostname is classified by its worst-case status (active wins over resolved) inside a CTE so it appears in exactly one column.
## Expected Behavior
### Preservation Requirements
**Unchanged Behaviors:**
- Rows where `(hostname, metric_id)` is unique across verticals: every endpoint returns the same numbers, in the same shape, in the same order as before the fix.
- Empty-data responses: `/items` returns `{ devices: [], team, status }`, `/items/:hostname` returns `404` for unknown hostnames, `/vcl/stats` returns its zero-state shape, `/mttr` returns `{ aging: [] }`.
- `/items` `team` and `status` query parameters: still validated against `ALLOWED_TEAMS` and `['active', 'resolved']`, still reject invalid values with HTTP 400.
- `/items` `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')` predicate: kept as-is so AEO-only legacy data continues to surface alongside NTS_AEO multi-vertical data. The fix only adds dedup on top, it does not narrow the set of admitted verticals.
- `/items/:hostname` ordering by `status DESC, metric_id`: unchanged so `active` metrics remain listed before `resolved` ones.
- `/vcl/stats` donut categorization (`blocked` / `in_progress` by `MAX(resolution_date)` per hostname): already deduped by `GROUP BY hostname` and stays unchanged.
- `/vcl/stats` global `stats.compliant` / `stats.non_compliant`: already use `COUNT(DISTINCT hostname)` and stay unchanged.
- Frontend components (`ComplianceDetailPanel`, `ComplianceCharts`, `CompliancePage`): no changes required. They already render one chip per `metric_id`; the bug was purely an upstream data issue.
- `persistUpload()` error handling: snapshot creation remains wrapped in `try/catch` that logs but does not fail the upload commit.
**Scope:**
All endpoint inputs that do not involve cross-vertical duplicate `(hostname, metric_id)` rows must be byte-for-byte identical to the pre-fix output. The fix only changes what happens when two or more `compliance_items` rows share `(hostname, metric_id)` across verticals.
## Hypothesized Root Cause
All five sites have the same shape of bug — missing dedup on `(hostname, metric_id)` after the multi-vertical migration admitted two-row scenarios — but with slightly different mechanics. Listing them explicitly so the test plan can confirm or refute each one:
1. **`/items` — `groupByHostname()` pushes every row.** The handler's `WHERE` clause `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')` admits both verticals' rows. `groupByHostname()` then iterates each row and unconditionally calls `dev.failing_metrics.push({ metric_id, ... })`. There is no `Set`/`Map` keyed on `metric_id`, so a duplicate row produces a duplicate metric in the array.
2. **`/items/:hostname` — no vertical filter, no dedup.** The detail query is `WHERE ci.hostname = $1` with no `vertical` predicate at all, so every vertical's row for that hostname is returned. The response builder does `metricRows.map(r => ({ ...r }))` — there is no dedup step. Every duplicate row becomes a duplicate entry in `metrics`.
3. **`/vcl/stats` heavy-hitters and per-team totals — team chosen per row, not per hostname.** The `GROUP BY team ... COUNT(DISTINCT hostname)` query is correct for "how many distinct hostnames does each team see," but a hostname with rows under two different teams (because `team` differs across verticals) is counted in both groups. The dashboard's global `stats.non_compliant` (a single `COUNT(DISTINCT hostname)` with no team scoping) does not match `SUM(heavy_hitters[*].non_compliant)`.
4. **`/vcl/stats` forecast-burndown — duplicate rows inflate forecast count.** The query `SELECT resolution_date FROM compliance_items WHERE status = 'active' AND team = $1 AND resolution_date IS NOT NULL` returns one row per `compliance_items` row, not one row per `(hostname, metric_id)`. The downstream `blockers = teamNonCompliant - forecastItems.length` calculation can go negative because `forecastItems.length` is inflated relative to the deduped `teamNonCompliant`.
5. **`/mttr` — `bucketAgingItems()` iterates rows directly.** The query `SELECT seen_count, team FROM compliance_items WHERE status = 'active'` returns every active row, and `bucketAgingItems()` does `for (const item of items) { buckets[label].total += 1; ... }`. There is no `Set` keyed on `(hostname, metric_id)`, so each duplicate row increments its bucket twice.
6. **`persistUpload()` snapshot — `CASE WHEN status =` double-counts.** The snapshot query
```sql
COUNT(DISTINCT CASE WHEN status = 'resolved' THEN hostname END) AS compliant,
COUNT(DISTINCT CASE WHEN status = 'active' THEN hostname END) AS non_compliant
```
counts a hostname in `compliant` if any of its rows is `resolved` AND in `non_compliant` if any of its rows is `active`. With duplicate rows that disagree on status across verticals, the same hostname lands in both columns.
The common structural cause is that the multi-vertical migration (`add_vcl_multi_vertical.js`) added a `vertical` column to `compliance_items` but did not retrofit existing read queries — which assumed `(hostname, metric_id)` was effectively a unique key — to either dedupe explicitly or scope to a single vertical.
## Correctness Properties
Property 1: Bug Condition — `/items` failing_metrics array contains at most one entry per metric_id
_For any_ set of `compliance_items` rows including cross-vertical duplicates of `(hostname, metric_id)`, the response from `GET /items` SHALL contain, for every device in `response.devices`, exactly one entry per unique `metric_id` in `device.failing_metrics`. Formally: `device.failing_metrics.length == new Set(device.failing_metrics.map(m => m.metric_id)).size`.
**Validates: Requirements 2.1**
Property 2: Bug Condition — `/items/:hostname` returns one metric per (metric_id, status)
_For any_ device with cross-vertical duplicate `(hostname, metric_id)` rows, the response from `GET /items/:hostname` SHALL contain exactly one entry per unique `(metric_id, status)` pair in `response.metrics`. Each entry's `seen_count` SHALL equal the maximum `seen_count` across the duplicate rows for that pair.
**Validates: Requirements 2.2, 2.3**
Property 3: Bug Condition — `/vcl/stats` per-team device counts equal COUNT(DISTINCT hostname)
_For any_ set of `compliance_items` rows including cross-vertical duplicates where a hostname's `team` differs across verticals, the response from `GET /vcl/stats` SHALL satisfy `SUM(heavy_hitters[*].non_compliant) == stats.non_compliant`. Each hostname SHALL be assigned to exactly one team — the team from its representative row (highest `seen_count`, then most recent `upload_id`) — regardless of how many verticals contain rows for that hostname.
**Validates: Requirements 2.5**
Property 4: Bug Condition — `/vcl/stats` forecast-burndown is deduped by (hostname, metric_id) and blockers is non-negative
_For any_ team with cross-vertical duplicate `(hostname, metric_id)` rows where both rows have a non-null `resolution_date`, the deduped forecast row count for that team SHALL contribute at most one entry per unique `(hostname, metric_id)`, and `blockers = teamNonCompliant - dedupedForecastCount` SHALL be `>= 0` (no clamp required to satisfy the invariant).
**Validates: Requirements 2.6**
Property 5: Bug Condition — `/mttr` aging buckets count each unique (hostname, metric_id) exactly once
_For any_ set of active `compliance_items` rows including cross-vertical duplicates, the response from `GET /mttr` SHALL satisfy `SUM(aging[*].total) == COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')`. Each unique `(hostname, metric_id)` SHALL be bucketed exactly once using a single representative `seen_count`.
**Validates: Requirements 2.7**
Property 6: Bug Condition — `persistUpload()` snapshot rows satisfy compliant + non_compliant <= total_devices
_For any_ `persistUpload()` invocation, every row written into `compliance_snapshots` for the current `(snapshot_month, vertical)` pair SHALL satisfy `compliant + non_compliant <= total_devices`. A hostname with both `active` and `resolved` rows for the same team SHALL be classified as `non_compliant` (active wins over resolved) and SHALL appear in exactly one of the two columns.
**Validates: Requirements 2.4**
Property 7: Preservation — non-duplicated rows are unchanged
_For any_ set of `compliance_items` rows where every `(hostname, metric_id)` is unique across verticals, the responses from `/items`, `/items/:hostname`, `/vcl/stats`, `/mttr`, and the `compliance_snapshots` rows written by `persistUpload()` SHALL be identical to the pre-fix output for the same input. The fix SHALL NOT change behavior on the unique-`(hostname, metric_id)` case.
**Validates: Requirements 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7**
Property 8: Preservation — `seen_count`, `first_seen`, and `last_seen` are correctly aggregated for the representative row
_For any_ duplicated `(hostname, metric_id)`, the surviving entry SHALL carry `seen_count = MAX(seen_count)` across the duplicates, `first_seen = MIN(first_seen)` across the duplicates, and `last_seen = MAX(last_seen)` across the duplicates. Active/resolved status separation SHALL be preserved (a metric still appears in the `active` section if any of its duplicate rows is `active`).
**Validates: Requirements 3.2, 3.3**
## Fix Implementation
### Changes Required
All changes are in `backend/routes/compliance.js`. No schema migration, no new column, no frontend change. SQL-level dedup is preferred wherever possible because it avoids materialising duplicates into Node memory and centralises the representative-row policy at the data layer.
#### Fix 1: `GET /items` — `DISTINCT ON (hostname, metric_id)` in SQL, defensive dedupe in `groupByHostname`
**File**: `backend/routes/compliance.js`
**Function**: `router.get('/items', ...)` (around line 535) and helper `groupByHostname` (around line 213)
**Specific Changes**:
1. Rewrite the items query to use `DISTINCT ON (hostname, metric_id)` so each unique violation contributes exactly one row, choosing the representative row by highest `seen_count` and most recent `upload_id`:
```sql
SELECT DISTINCT ON (ci.hostname, ci.metric_id)
ci.hostname, ci.ip_address, ci.device_type, ci.team,
ci.metric_id, ci.metric_desc, ci.category, ci.status, ci.seen_count,
fu.report_date AS first_seen,
lu.report_date AS last_seen,
ru.report_date AS resolved_on
FROM compliance_items ci
LEFT JOIN compliance_uploads fu ON ci.first_seen_upload_id = fu.id
LEFT JOIN compliance_uploads lu ON ci.upload_id = lu.id
LEFT JOIN compliance_uploads ru ON ci.resolved_upload_id = ru.id
WHERE ci.team = $1
AND ci.status = $2
AND (ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')
ORDER BY ci.hostname, ci.metric_id, ci.seen_count DESC, ci.upload_id DESC
```
The `ORDER BY` lead expressions match the `DISTINCT ON` columns; the trailing expressions select the representative.
2. Add a defensive in-memory dedupe in `groupByHostname()` keyed on `metric_id` per device, so any future code path that bypasses the SQL dedupe still cannot duplicate a chip:
```javascript
function groupByHostname(rows, noteHostnames) {
const deviceMap = {};
for (const row of rows) {
if (!deviceMap[row.hostname]) {
deviceMap[row.hostname] = {
hostname: row.hostname, ip_address: row.ip_address || '', device_type: row.device_type || '',
team: row.team || '', status: row.status, failing_metrics: [],
_seenMetricIds: new Set(),
seen_count: row.seen_count || 1, first_seen: row.first_seen || null,
last_seen: row.last_seen || null, resolved_on: row.resolved_on || null,
has_notes: noteHostnames.has(row.hostname),
};
}
const dev = deviceMap[row.hostname];
if (!dev._seenMetricIds.has(row.metric_id)) {
dev._seenMetricIds.add(row.metric_id);
dev.failing_metrics.push({ metric_id: row.metric_id, metric_desc: row.metric_desc || '', category: row.category || '' });
}
if ((row.seen_count || 1) > dev.seen_count) dev.seen_count = row.seen_count;
if (row.first_seen && (!dev.first_seen || row.first_seen < dev.first_seen)) dev.first_seen = row.first_seen;
if (row.last_seen && (!dev.last_seen || row.last_seen > dev.last_seen)) dev.last_seen = row.last_seen;
}
return Object.values(deviceMap).map(({ _seenMetricIds, ...dev }) => dev);
}
```
The `_seenMetricIds` set is stripped before the device is returned so the response shape is unchanged.
#### Fix 2: `GET /items/:hostname` — `DISTINCT ON (metric_id, status)` in SQL
**File**: `backend/routes/compliance.js`
**Function**: `router.get('/items/:hostname', ...)` (around line 575)
**Specific Changes**:
1. Rewrite the metrics query to dedupe on `(metric_id, status)` so a metric that appears as both `active` and `resolved` is preserved (one row each), but cross-vertical duplicates of the same `(metric_id, status)` collapse to one:
```sql
SELECT DISTINCT ON (ci.metric_id, ci.status)
ci.metric_id, ci.metric_desc, ci.category, ci.status,
ci.ip_address, ci.device_type, ci.team, ci.seen_count, ci.extra_json,
ci.resolution_date, ci.remediation_plan,
fu.report_date AS first_seen, fu.uploaded_at AS first_seen_at,
lu.report_date AS last_seen, lu.uploaded_at AS last_seen_at,
ru.report_date AS resolved_on
FROM compliance_items ci
LEFT JOIN compliance_uploads fu ON ci.first_seen_upload_id = fu.id
LEFT JOIN compliance_uploads lu ON ci.upload_id = lu.id
LEFT JOIN compliance_uploads ru ON ci.resolved_upload_id = ru.id
WHERE ci.hostname = $1
ORDER BY ci.metric_id, ci.status, ci.seen_count DESC, ci.upload_id DESC
```
2. The existing post-query sort (`status DESC` for grouping, then `metric_id`) is rebuilt in JavaScript on the deduped result to preserve the response ordering:
```javascript
metricRows.sort((a, b) => {
if (a.status !== b.status) return b.status.localeCompare(a.status); // 'resolved' < 'active' alphabetically; we want 'active' first
return a.metric_id.localeCompare(b.metric_id);
});
```
(Note: the original SQL used `ORDER BY ci.status DESC, ci.metric_id`, which placed `resolved` before `active` because of `DESC` on the alphabetic order. The post-fix sort reproduces that exact ordering on the deduped rows.)
3. The `identity` lookup `metricRows.find(r => r.status === 'active') || metricRows[0]` is unchanged — it still picks the active representative for `ip_address`, `device_type`, `team`, `resolution_date`, and `remediation_plan`.
#### Fix 3: `GET /vcl/stats` heavy-hitters and per-team totals — dedupe to one team per hostname via CTE
**File**: `backend/routes/compliance.js`
**Function**: `router.get('/vcl/stats', ...)` (around line 990, the `teamRows` query)
**Specific Changes**:
1. Replace the heavy-hitters query with a CTE that first picks one representative row per hostname (using the same `DISTINCT ON (hostname)` policy: highest `seen_count`, then most recent `upload_id`), then groups by the representative's team:
```sql
WITH device_team AS (
SELECT DISTINCT ON (hostname)
hostname,
COALESCE(team, 'Unknown') AS team,
resolution_date
FROM compliance_items
WHERE status = 'active'
ORDER BY hostname, seen_count DESC, upload_id DESC
)
SELECT team,
COUNT(DISTINCT hostname)::int AS non_compliant,
MAX(resolution_date) AS compliance_date
FROM device_team
GROUP BY team
ORDER BY COUNT(DISTINCT hostname) DESC
```
Because the CTE already collapses each hostname to one row, `COUNT(DISTINCT hostname)` per team is equivalent to `COUNT(*)` per team, but `DISTINCT` is kept for defensive symmetry.
2. The dashboard's existing `stats.non_compliant` query (also `COUNT(DISTINCT hostname)` over all active rows, no team scoping) is unchanged — but now the property `SUM(heavy_hitters[*].non_compliant) == stats.non_compliant` holds because both numerators and denominators agree on "one hostname, one team."
3. The per-team-total query inside the `for (const teamRow of teamRows)` loop (`SELECT COUNT(DISTINCT hostname) AS total FROM compliance_items WHERE COALESCE(team, 'Unknown') = $1`) is similarly rewritten to use the same `device_team`-style CTE so the team's `total_devices` matches the team's `non_compliant`. The simplest form:
```sql
WITH device_team AS (
SELECT DISTINCT ON (hostname)
hostname,
COALESCE(team, 'Unknown') AS team
FROM compliance_items
ORDER BY hostname, seen_count DESC, upload_id DESC
)
SELECT COUNT(*)::int AS total FROM device_team WHERE team = $1
```
Note: the loop runs once per team, so this CTE is computed N times — acceptable given the small team count (4 teams). If profiling shows this is a hot path, hoist the `device_team` CTE out of the loop and compute totals in a single grouped query.
#### Fix 4: `GET /vcl/stats` forecast-burndown — dedupe by `(hostname, metric_id)` in SQL
**File**: `backend/routes/compliance.js`
**Function**: `router.get('/vcl/stats', ...)` (around line 1015, the `forecastItems` query)
**Specific Changes**:
1. Rewrite the forecast query to `DISTINCT ON (hostname, metric_id)` so each unique violation contributes at most one `resolution_date` to the forecast bucketing:
```sql
SELECT DISTINCT ON (hostname, metric_id) resolution_date
FROM compliance_items
WHERE status = 'active'
AND COALESCE(team, 'Unknown') = $1
AND resolution_date IS NOT NULL
ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC
```
2. The downstream `blockers = teamNonCompliant - forecastItems.length` calculation is unchanged. With both sides now deduped consistently — `teamNonCompliant` from the `device_team` CTE in Fix 3, `forecastItems.length` from the deduped query above — the difference is non-negative without any clamp.
3. The `Math.max(blockers, 0)` clamp at the existing call site is left in place as a belt-and-braces safeguard. It SHOULD be a no-op after the fix; if a regression introduces inconsistency again, the property test for Property 4 will catch it.
#### Fix 5: `GET /mttr` — dedupe by `(hostname, metric_id)` in SQL
**File**: `backend/routes/compliance.js`
**Function**: `router.get('/mttr', ...)` (around line 824)
**Specific Changes**:
1. Rewrite the query to `DISTINCT ON (hostname, metric_id)` so each unique active violation contributes exactly one `(seen_count, team)` row to `bucketAgingItems()`:
```sql
SELECT DISTINCT ON (hostname, metric_id)
COALESCE(seen_count, 1) AS seen_count,
team
FROM compliance_items
WHERE status = 'active'
ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC
```
2. `bucketAgingItems()` itself does not change. It already does the right thing when fed one row per unique violation.
3. SQL-level dedup is preferred over a JavaScript-side `Set` here because the helper is also called from `/vcl/stats` and changing its contract risks regressions in the other caller. Pushing the dedup into SQL keeps `bucketAgingItems()` pure and reusable.
#### Fix 6: `persistUpload()` snapshot block — classify hostnames via `MIN(status)` in a CTE
**File**: `backend/routes/compliance.js`
**Function**: `persistUpload()` (lines 81192), specifically the `verticalStats` query at line 157
**Specific Changes**:
1. Rewrite the snapshot query so each hostname is classified by its worst-case status (active wins over resolved) inside a CTE before counting. Because `'active' < 'resolved'` lexicographically, `MIN(status)` returns `'active'` for a hostname that has any active row and `'resolved'` only if all rows are `'resolved'`:
```sql
WITH hostname_status AS (
SELECT team,
hostname,
MIN(status) AS status -- 'active' beats 'resolved' alphabetically
FROM compliance_items
WHERE team IS NOT NULL
GROUP BY team, hostname
)
SELECT team AS vertical,
COUNT(*)::int AS total_devices,
COUNT(*) FILTER (WHERE status = 'resolved')::int AS compliant,
COUNT(*) FILTER (WHERE status = 'active')::int AS non_compliant
FROM hostname_status
GROUP BY team
```
Each hostname appears exactly once in `hostname_status`, so `compliant + non_compliant == total_devices` is structurally guaranteed.
2. The downstream `INSERT ... ON CONFLICT (snapshot_month, vertical) DO UPDATE` block is unchanged. It already keys snapshots on `vertical`, and the per-team counts now satisfy `compliant + non_compliant <= total_devices`.
3. The `compliance_pct` calculation `Math.round((vs.compliant / total) * 100 * 100) / 100` is unchanged. It is now defended against the previous off-by-one: with the old query a hostname could be in both `compliant` and `non_compliant`, double-counting itself in `compliant`'s numerator while only contributing once to `total_devices`'s denominator. The fix removes that.
> Note: Fix 6 is conceptually adjacent to but mechanically distinct from the `persistUpload` fix in `compliance-duplicate-chart-entries`. That spec adds a `WHERE vertical = $1` filter to scope the snapshot to one vertical at a time. This spec adds the `hostname_status` CTE so that within whichever vertical is snapshotted, each hostname is classified once. Both fixes are independently necessary; landing them together is preferred but each is correct on its own.
## Testing Strategy
### Validation Approach
The bug condition is straightforward to construct: insert two `compliance_items` rows for the same `(hostname, metric_id)` with different `vertical` values, then call each affected endpoint. The two-phase approach is to first run the tests against the unfixed code to confirm the duplication counterexamples, then run the same tests against the fixed code and add property-based tests that explore the input space more broadly.
### Exploratory Bug Condition Checking
**Goal**: Surface counterexamples that demonstrate each of the six manifestations BEFORE implementing the fix. Confirm or refute the root cause analysis for each endpoint independently — they share a structural cause but the SQL details differ.
**Test Plan**: Seed a clean test database with a fixture representing the original GitLab #13 scenario (`STEAM-INTERSIGHT` failing `7.1.1` in both `vertical IS NULL` and `vertical = 'NTS_AEO'`), plus targeted variants for each sibling endpoint. Call each affected endpoint and assert the buggy invariants. Run on UNFIXED code first.
**Test Cases**:
1. **`/items` Duplicate Failing Metric Test** — Insert two rows for `(STEAM-INTERSIGHT, 7.1.1)`: one with `vertical IS NULL, status = 'active', team = 'STEAM'`, another with `vertical = 'NTS_AEO', status = 'active', team = 'STEAM'`. Call `GET /items?team=STEAM&status=active`. Assert that `response.devices[0].failing_metrics.filter(m => m.metric_id === '7.1.1').length === 1`. (will fail on unfixed code — returns 2)
2. **`/items/:hostname` Duplicate Metric Entry Test** — Same fixture. Call `GET /items/STEAM-INTERSIGHT`. Assert `response.metrics.filter(m => m.metric_id === '7.1.1' && m.status === 'active').length === 1`. Assert `response.metrics[0].seen_count === MAX(seen_count across the duplicate rows)`. (will fail on unfixed code — returns 2)
3. **`/vcl/stats` Cross-Team Hostname Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)`: one with `team = 'STEAM', vertical IS NULL`, another with `team = 'ACCESS-ENG', vertical = 'NTS_AEO'`. Call `GET /vcl/stats`. Assert `SUM(heavy_hitters[*].non_compliant) === stats.non_compliant`. (will fail on unfixed code — sum exceeds the global count by 1)
4. **`/vcl/stats` Forecast Negative Blockers Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)` both with `team = 'STEAM', status = 'active'`, both with `resolution_date = '2025-09-30'`, but different verticals. Call `GET /vcl/stats`. Assert `vertical_breakdown.find(v => v.team === 'STEAM').blockers >= 0` AND that the unclamped `teamNonCompliant - dedupedForecastCount` equals the reported `blockers`. (will fail on unfixed code — clamped from `-1` to `0`, hiding the inconsistency; the unclamped check makes the failure visible)
5. **`/mttr` Inflated Bucket Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)` with `seen_count = 5, team = 'STEAM', status = 'active'`, different verticals. Compare `SUM(aging[*].total)` to `COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')`. Assert equality. (will fail on unfixed code — bucket total exceeds distinct count by 1)
6. **`persistUpload()` Snapshot Inconsistency Test** — Pre-populate `compliance_items` with `(SOME-DEVICE, 7.1.1)` having one `status = 'active'` row and one `status = 'resolved'` row, both `team = 'STEAM'`, different verticals. Call `persistUpload()` with a no-op upload. Read back the `compliance_snapshots` row for the current month and `STEAM`. Assert `compliant + non_compliant <= total_devices`. (will fail on unfixed code — the hostname is counted in both columns so the sum exceeds `total_devices` by 1)
7. **Edge Case — Single-Vertical Regression Test** — Insert a fixture where every `(hostname, metric_id)` is unique (no cross-vertical duplicates). Call all five read endpoints and capture responses. Apply the fix, re-run, and assert response equality (byte-for-byte). Run `persistUpload()` on a single-vertical fixture and assert the resulting `compliance_snapshots` rows are identical pre-fix and post-fix. (should pass on unfixed code; will pass on fixed code; protects the preservation property)
**Expected Counterexamples**:
- `/items` returns `failing_metrics` arrays where `length > Set(metric_ids).size`. Cause: `groupByHostname()` pushes per row instead of per unique `metric_id`.
- `/items/:hostname` returns `metrics` arrays where `(metric_id, status)` collisions exist. Cause: no vertical filter and no dedup in the detail query or response builder.
- `/vcl/stats` returns `heavy_hitters` whose `non_compliant` values sum to more than the global `stats.non_compliant`. Cause: a hostname's `team` differs across verticals and the per-team `COUNT(DISTINCT hostname)` counts it under both teams.
- `/vcl/stats` reports `blockers` clamped to 0 when the unclamped expression is negative. Cause: the forecast query is not deduped on `(hostname, metric_id)` so its row count exceeds `teamNonCompliant`.
- `/mttr` returns `aging` whose total exceeds the distinct-violation count. Cause: `bucketAgingItems()` iterates rows directly with no dedup.
- `persistUpload()` writes `compliance_snapshots` rows where `compliant + non_compliant > total_devices`. Cause: the snapshot query's `CASE WHEN status = 'X'` pattern lets a hostname appear in both columns.
### Fix Checking
**Goal**: Verify that for all inputs where the bug condition holds (any `(hostname, metric_id)` shared by two or more rows across verticals), each fixed endpoint produces the expected deduped result.
**Pseudocode:**
```
FOR ALL items WHERE EXISTS (h, m) WITH COUNT(items WHERE hostname = h AND metric_id = m) > 1 DO
items_response := GET_items_fixed(items)
detail_response := GET_items_hostname_fixed(items, some_hostname_with_dups)
stats_response := GET_vcl_stats_fixed(items)
mttr_response := GET_mttr_fixed(items)
snapshot_rows := persistUpload_fixed(no_op_upload, items)
ASSERT no_duplicate_metrics_per_device(items_response.devices)
ASSERT no_duplicate_metric_status_pairs(detail_response.metrics)
ASSERT sum_of_team_counts_equals_global(stats_response)
ASSERT forecast_blockers_non_negative(stats_response.vertical_breakdown)
ASSERT mttr_total_equals_distinct_violations(mttr_response.aging, items)
ASSERT snapshot_invariant_holds(snapshot_rows)
END FOR
```
### Preservation Checking
**Goal**: Verify that for all inputs where the bug condition does NOT hold (every `(hostname, metric_id)` is unique across verticals), the fixed endpoints produce results identical to the original endpoints.
**Pseudocode:**
```
FOR ALL items WHERE FORALL (h, m), COUNT(items WHERE hostname = h AND metric_id = m) <= 1 DO
ASSERT GET_items_original(items) = GET_items_fixed(items)
ASSERT GET_items_hostname_original(items) = GET_items_hostname_fixed(items)
ASSERT GET_vcl_stats_original(items) = GET_vcl_stats_fixed(items)
ASSERT GET_mttr_original(items) = GET_mttr_fixed(items)
ASSERT persistUpload_original(items).snapshots = persistUpload_fixed(items).snapshots
END FOR
```
**Testing Approach**: Property-based testing is the right fit for preservation checking here:
- The unique-`(hostname, metric_id)` input space is large (any number of hostnames, any combination of metrics, any team and vertical mix) and exhaustive enumeration is impractical.
- The preservation property is strict equality, which is well-suited to PBT shrinking — any counterexample is a small fixture demonstrating a behavior change.
- The legacy AEO-only data shape (`vertical IS NULL`) must be exercised, which falls naturally out of generators that include `null` verticals.
**Test Plan**: Capture responses from the unfixed code on unique-`(hostname, metric_id)` fixtures (snapshot tests). After applying the fix, re-run the same fixtures and assert equality. Then run a property-based generator that produces random unique-key scenarios and asserts the same equality.
**Test Cases**:
1. **Snapshot Equality — Empty State** — Empty `compliance_items`. `/items?team=STEAM` returns `{ devices: [], team: 'STEAM', status: 'active' }`. `/items/:hostname` returns 404. `/vcl/stats` returns its zero-state shape. `/mttr` returns `{ aging: [] }`. `persistUpload()` writes no snapshot rows. Snapshot-test before and after the fix.
2. **Snapshot Equality — Single AEO-Only Items** — Items with `vertical IS NULL` only. Capture pre-fix responses, apply fix, assert equality across all five endpoints.
3. **Snapshot Equality — Multiple Unique Verticals** — A mix of items with `vertical IS NULL` and `vertical = 'NTS_AEO'`, but no `(hostname, metric_id)` collision across verticals. Capture pre-fix responses, apply fix, assert equality.
4. **`/items` Team and Status Filter Preservation** — Active and resolved items mixed across teams. Assert `?team=STEAM&status=active`, `?team=STEAM&status=resolved`, and `?team=ACCESS-ENG&status=active` each return the same devices pre-fix and post-fix on a unique-key fixture. Assert non-`ALLOWED_TEAMS` value (e.g., `?team=OTHER`) returns HTTP 400.
5. **`/items/:hostname` Active-Then-Resolved Ordering Preservation** — A device with both active and resolved metrics. Assert the response's `metrics` array has all `active` entries before all `resolved` entries, sorted by `metric_id` within each group, identical to pre-fix.
6. **`/vcl/stats` Donut Categorization Preservation** — Active items with mixed null/non-null `resolution_date`. Assert `donut.blocked` and `donut.in_progress` counts match pre-fix on a unique-key fixture (this is already deduped via `GROUP BY hostname` in the existing query).
7. **`persistUpload()` Snapshot Equality — Single-Status-Per-Hostname Fixture** — Pre-populate `compliance_items` so every hostname has rows of only one status (all active or all resolved within a team). Run `persistUpload()`. Assert `compliance_snapshots` rows are identical pre-fix and post-fix.
### Test Fixtures Needed
The following fixtures must be reusable across unit, integration, and property-based tests. Place them under `backend/__tests__/fixtures/compliance-duplicate-failing-metrics/` (or inline as factory functions in the test file, matching the convention used in `vcl-compliance-reporting.test.js`).
1. **`fixtureCrossVerticalDuplicateActive`** — A single hostname with two rows for the same `metric_id`, both `status = 'active'`, both `team = 'STEAM'`, one with `vertical IS NULL` and one with `vertical = 'NTS_AEO'`. Different `seen_count` (e.g., 3 and 5) so the representative-row policy is exercised. Used by `/items`, `/items/:hostname`, `/mttr`, and the `/vcl/stats` forecast tests.
2. **`fixtureCrossVerticalTeamMismatch`** — A single hostname with two active rows for the same `metric_id` but different `team` across verticals (`STEAM` in legacy, `ACCESS-ENG` in NTS_AEO). Used by the `/vcl/stats` heavy-hitters and per-team-totals test (Property 3).
3. **`fixtureCrossVerticalStatusMismatch`** — A single hostname with two rows for the same `metric_id` and same team, but different `status` across verticals (`active` in legacy, `resolved` in NTS_AEO). Used by the `persistUpload()` snapshot test (Property 6).
4. **`fixtureMultiHostnameMixed`** — A combination of unique-key hostnames and one duplicated `(hostname, metric_id)`. Used to verify that the dedup applies only to the duplicates and leaves unique entries untouched.
5. **`fixtureLegacyOnly`** — All rows with `vertical IS NULL`. Used for preservation checks that confirm legacy data flows are unchanged.
6. **`fixtureNTSAEOOnly`** — All rows with `vertical = 'NTS_AEO'`. Used for preservation checks that confirm multi-vertical-only data flows are unchanged.
7. **`fixtureForecastDuplicateResolutionDate`** — A duplicated `(hostname, metric_id)` where both rows have a non-null `resolution_date` (same value). Used to verify Property 4 (`blockers >= 0`).
8. **`fixtureSeenCountTiebreak`** — A duplicated `(hostname, metric_id)` where both rows have the same `seen_count` but different `upload_id`. Used to verify the `upload_id DESC` tiebreaker in the `DISTINCT ON ... ORDER BY` policy.
### Unit Tests
- `/items`: insert `fixtureCrossVerticalDuplicateActive`, assert `failing_metrics.length === 1` and `seen_count === 5`.
- `/items/:hostname`: insert `fixtureCrossVerticalDuplicateActive`, assert `metrics.length === 1` for `(7.1.1, active)`.
- `/items/:hostname` ordering: insert a fixture with one active and one resolved metric, assert `metrics[0].status === 'active'` and `metrics[1].status === 'resolved'`.
- `/vcl/stats` heavy-hitters: insert `fixtureCrossVerticalTeamMismatch`, assert hostname counted exactly once in the team derived from the representative row.
- `/vcl/stats` forecast: insert `fixtureForecastDuplicateResolutionDate`, assert `blockers >= 0` and `forecastItems.length === teamNonCompliant`.
- `/mttr`: insert `fixtureCrossVerticalDuplicateActive` with `seen_count = 5`, assert exactly one increment in the `46 cycles` bucket for `STEAM`.
- `persistUpload()` snapshot: insert `fixtureCrossVerticalStatusMismatch`, run `persistUpload()`, assert the resulting snapshot row has `compliant === 0`, `non_compliant === 1`, `total_devices === 1`.
- `groupByHostname()` direct: pass a list with two rows for the same `(hostname, metric_id)`, assert one entry in `failing_metrics`. This protects the helper's contract independently of the SQL dedup.
### Property-Based Tests
Use `fast-check` (already in use in `backend/__tests__/*.property.test.js`). Generators should mix `vertical IS NULL` and `vertical = 'NTS_AEO'` rows, with controllable rates of `(hostname, metric_id)` collision.
- **Property 1 — `/items`**: For any list of `compliance_items` rows including cross-vertical duplicates, assert that for every device in the response, `device.failing_metrics.length === new Set(device.failing_metrics.map(m => m.metric_id)).size`.
- **Property 2 — `/items/:hostname`**: For any duplicated `(hostname, metric_id, status)`, assert `response.metrics.filter(m => m.metric_id === target_metric && m.status === target_status).length === 1`. Assert the surviving entry's `seen_count` is the maximum across duplicates.
- **Property 3 — `/vcl/stats`**: For any input including team-mismatched cross-vertical duplicates, assert `SUM(heavy_hitters[*].non_compliant) === stats.non_compliant`. Generate inputs that vary the rate of team mismatch from 0% to 100%.
- **Property 4 — `/vcl/stats` forecast**: For any input including duplicated `(hostname, metric_id)` with non-null resolution dates, assert `vertical_breakdown[*].blockers >= 0` and assert the unclamped expression equals `teamNonCompliant - dedupedForecastCount`.
- **Property 5 — `/mttr`**: For any input including cross-vertical duplicate active rows, assert `SUM(aging[*].total) === COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')` over the input. Per-team totals: assert `SUM(aging[*][team])` over each team equals the distinct count for that team.
- **Property 6 — `persistUpload()` snapshot invariant**: For any combination of cross-vertical status mismatches, assert every row written into `compliance_snapshots` satisfies `compliant + non_compliant <= total_devices` and `compliant + non_compliant === total_devices` (equality, since every hostname is classified into exactly one column).
- **Property 7 — Preservation**: For any input where every `(hostname, metric_id)` is unique across verticals, the responses from all five endpoints SHALL equal the responses on the same input from the original (unfixed) implementations. Use fast-check's shrinking to surface the smallest counterexample if the property fails.
- **Property 8 — Representative-row policy**: For any duplicated `(hostname, metric_id)`, the surviving entry's `seen_count` SHALL equal `max(duplicates.seen_count)`, and ties SHALL be broken by `max(duplicates.upload_id)`.
### Integration Tests
- Full flow: upload a legacy AEO xlsx, then upload an NTS_AEO multi-vertical xlsx that overlaps on at least one `(hostname, metric_id)`. Call `/items`, `/items/:hostname`, `/vcl/stats`, `/mttr`, and read `compliance_snapshots` for the current month. Assert all six post-fix invariants hold.
- Cross-page consistency: load `/vcl/stats` and `/items?team=STEAM` for the same backing data. Assert that the device count from `/vcl/stats` heavy-hitters for `STEAM` equals the number of devices in `/items?team=STEAM&status=active`.
- ComplianceDetailPanel render: load a device with cross-vertical duplicates via `/items/:hostname`, render `ComplianceDetailPanel`, assert exactly one `MetricChip` with the duplicated `metric_id` is present in the "Failing Metrics" section.
- `persistUpload()` end-to-end: with cross-vertical status mismatch present, run a fresh upload and assert the new `compliance_snapshots` row's invariant holds.