cve-dashboard/.kiro/specs/compliance-duplicate-failing-metrics/design.md

# Compliance Duplicate Failing Metrics Bugfix Design

## Overview

Five compliance endpoints in `backend/routes/compliance.js` (`GET /items`, `GET /items/:hostname`, `GET /vcl/stats`, `GET /mttr`) and the `compliance_snapshots` block inside `persistUpload()` all share the same root cause: each one reads `compliance_items` with either no vertical filter or a filter that admits both legacy `vertical IS NULL` rows and multi-vertical `vertical = 'NTS_AEO'` rows, but does not deduplicate on `(hostname, metric_id)`. When the same `(hostname, metric_id)` pair exists in two verticals (the documented multi-vertical workflow), the duplicate row distorts the response — it duplicates a metric chip in the UI, double-counts a device across teams, inflates aging buckets, inflates forecast row counts, and (in the snapshot block) lets a single hostname appear in both `compliant` and `non_compliant` columns of the same `(snapshot_month, vertical)` row.

The fix is uniform across endpoints: dedupe at the SQL layer using `DISTINCT ON (hostname, metric_id)` with a deterministic `ORDER BY` (highest `seen_count`, then most recent `upload_id`) so each unique violation contributes exactly one row to the aggregation, and rewrite the snapshot query so each hostname is classified by its `MIN(status)` (active wins over resolved) inside a CTE before the count. The `groupByHostname` helper and the `/items/:hostname` response builder also gain a defensive in-memory dedupe keyed by `metric_id` (or `(metric_id, status)` for the detail endpoint), so a duplicate row that slips through any unforeseen code path still cannot duplicate a chip in the UI.

The implementation is intentionally minimal: each fix changes a single SQL statement and (for two endpoints) a small JavaScript loop. No schema migration, no new column, no frontend change. The frontend `ComplianceDetailPanel` already keys metrics on `metric_id` for chip rendering and will render correctly once the API stops returning duplicate rows.

## Glossary

- **Bug_Condition (C)**: The condition that triggers the bug — two or more rows in `compliance_items` share the same `(hostname, metric_id)` pair across verticals (i.e., one row with `vertical IS NULL` and another with `vertical = 'NTS_AEO'`, or two non-null verticals that both pass an endpoint's `WHERE` clause).
- **Property (P)**: The desired behavior when C holds — each affected endpoint returns or counts each `(hostname, metric_id)` exactly once, using the row with the highest `seen_count` (with most recent `upload_id` as tiebreak) as the representative.
- **Preservation**: Behavior on rows where `(hostname, metric_id)` is unique across verticals, on the empty-data response shape, and on unrelated query parameters (e.g., `team` and `status` on `/items`) — all must be byte-for-byte unchanged.
- **vertical**: `TEXT` column on `compliance_items` and `compliance_uploads` identifying which xlsx (NTS_AEO, SDIT_CISO, TSI) the row originated from. `NULL` indicates a legacy AEO-only upload. The bug class is specifically about rows that share `(hostname, metric_id)` but differ in `vertical`.
- **groupByHostname(rows, noteHostnames)**: Helper in `backend/routes/compliance.js` (lines ~213–230) that flattens a list of joined `compliance_items` rows into one device object per hostname, pushing each row's `metric_id` onto `failing_metrics`. It performs no deduplication.
- **bucketAgingItems(items)**: Helper in `backend/routes/compliance.js` (lines ~234–254) that places each item into one of four `seen_count` buckets per team. It iterates rows directly, so duplicate rows produce double-counted buckets.
- **persistUpload()**: Function in `backend/routes/compliance.js` (lines ~81–192) that writes a parsed upload to the DB and then writes per-vertical rows into `compliance_snapshots`. The snapshot query at the end of this function counts hostnames per team using `COUNT(DISTINCT CASE WHEN status = 'X' THEN hostname END)`, which double-counts a hostname into both the `compliant` and `non_compliant` columns when the hostname has both `active` and `resolved` rows across verticals.
- **representative row**: For a duplicated `(hostname, metric_id)`, the row chosen by `DISTINCT ON (hostname, metric_id) ... ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC`. This is deterministic and aligns with the "use the row with the highest `seen_count` or most recent `upload_id`" rule from the requirements.

## Bug Details

### Bug Condition

The bug manifests when two or more rows in `compliance_items` share the same `(hostname, metric_id)` pair across different `vertical` values. This happens whenever the same device fails the same metric in more than one vertical's xlsx — including the originally reported case where a legacy `vertical IS NULL` AEO upload and a newer `vertical = 'NTS_AEO'` multi-vertical upload both contain `STEAM-INTERSIGHT` failing `7.1.1`.

**Formal Specification:**
```
FUNCTION isBugCondition(items)
  INPUT: items — list of compliance_items rows
  OUTPUT: boolean

  // The bug condition is triggered for any (hostname, metric_id) pair with more than one row
  GROUP items BY (hostname, metric_id) INTO groups
  RETURN EXISTS group IN groups WHERE COUNT(group) > 1
END FUNCTION
```

For a single endpoint response to be considered buggy, the API output must additionally fail one of the following invariants (the per-endpoint manifestation of the same root cause):

```
FUNCTION isBuggyResponse(endpoint, response)
  CASE endpoint OF
    '/items':              RETURN EXISTS device IN response.devices WHERE
                                  COUNT(device.failing_metrics) != COUNT(DISTINCT metric_id IN device.failing_metrics)
    '/items/:hostname':    RETURN EXISTS (metric_id, status) WITH COUNT(*) > 1 IN response.metrics
    '/vcl/stats heavy_hitters': RETURN SUM(hh.non_compliant FOR hh IN response.heavy_hitters) >
                                       COUNT(DISTINCT hostname WHERE has_active_violation)
    '/vcl/stats forecast': RETURN response.vertical_breakdown[i].blockers < 0
                                  FOR SOME i WITH duplicated (hostname, metric_id) AND non-null resolution_date
    '/mttr':               RETURN SUM(b.total FOR b IN response.aging) >
                                  COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')
    'persistUpload snapshot': RETURN EXISTS row IN compliance_snapshots WHERE
                                     compliant + non_compliant > total_devices
  END CASE
END FUNCTION
```

### Examples

The originally reported case (GitLab issue #13, `STEAM-INTERSIGHT` / 172.16.30.40 / metric `7.1.1`) and the four sibling manifestations:

- **`/items`** — `STEAM-INTERSIGHT` has two active rows for `metric_id = '7.1.1'` (one with `vertical IS NULL`, one with `vertical = 'NTS_AEO'`). The handler's `WHERE` clause admits both via `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')`, and `groupByHostname()` does an unconditional `dev.failing_metrics.push(...)` per row. The device's `failing_metrics` array contains two entries with `metric_id = '7.1.1'`. Expected: one entry per unique `metric_id`.

- **`/items/:hostname`** — Same hostname, but the detail query has no vertical filter at all (just `WHERE ci.hostname = $1`). Both rows come back, the response builder maps `metricRows.map(...)` over both, and the frontend's `MetricChip` renders `7.1.1` twice in the "Failing Metrics" section. Expected: one entry per `(metric_id, status)` pair.

- **`/vcl/stats` heavy-hitters and per-team totals** — A device whose `team` differs between verticals (e.g., `STEAM` in the legacy row and `ACCESS-ENG` in the NTS_AEO row) is counted under both teams by `COUNT(DISTINCT hostname) ... GROUP BY team`. The sum across `heavy_hitters[*].non_compliant` exceeds the dashboard's `stats.non_compliant`. Expected: the hostname is assigned to exactly one team — the team from its representative row — and the per-team sums reconcile with the global non-compliant count.

- **`/vcl/stats` forecast-burndown** — For a team with one duplicated `(hostname, metric_id)` whose `resolution_date` is non-null in both rows, the forecast query `SELECT resolution_date FROM compliance_items WHERE status = 'active' AND team = $1 AND resolution_date IS NOT NULL` returns two rows. `forecastItems.length` is 2, but `teamNonCompliant` (from the de-team-counted DISTINCT-hostname query) is 1, so `blockers = 1 - 2 = -1`. The route then clamps to 0, hiding the underlying inconsistency. Expected: forecast is deduped by `(hostname, metric_id)` so the count matches `teamNonCompliant`'s scoping and `blockers` is non-negative.

- **`/mttr`** — Same duplicated `(hostname, metric_id)`. `bucketAgingItems()` receives both rows, increments the bucket twice. The team total for `STEAM` (or whichever team appears in the duplicate row) is inflated. Expected: each unique `(hostname, metric_id)` contributes to exactly one bucket using a single representative `seen_count`.

- **Edge case — `persistUpload()` snapshot** — A hostname has two rows: one with `status = 'resolved'` (legacy vertical) and one with `status = 'active'` (NTS_AEO vertical). The snapshot query
  ```sql
  COUNT(DISTINCT CASE WHEN status = 'resolved' THEN hostname END) AS compliant,
  COUNT(DISTINCT CASE WHEN status = 'active'   THEN hostname END) AS non_compliant
  ```
  counts the hostname once in `compliant` and once in `non_compliant`, so `compliant + non_compliant > total_devices`. Expected: the hostname is classified by its worst-case status (active wins over resolved) inside a CTE so it appears in exactly one column.

## Expected Behavior

### Preservation Requirements

**Unchanged Behaviors:**
- Rows where `(hostname, metric_id)` is unique across verticals: every endpoint returns the same numbers, in the same shape, in the same order as before the fix.
- Empty-data responses: `/items` returns `{ devices: [], team, status }`, `/items/:hostname` returns `404` for unknown hostnames, `/vcl/stats` returns its zero-state shape, `/mttr` returns `{ aging: [] }`.
- `/items` `team` and `status` query parameters: still validated against `ALLOWED_TEAMS` and `['active', 'resolved']`, still reject invalid values with HTTP 400.
- `/items` `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')` predicate: kept as-is so AEO-only legacy data continues to surface alongside NTS_AEO multi-vertical data. The fix only adds dedup on top, it does not narrow the set of admitted verticals.
- `/items/:hostname` ordering by `status DESC, metric_id`: unchanged so `active` metrics remain listed before `resolved` ones.
- `/vcl/stats` donut categorization (`blocked` / `in_progress` by `MAX(resolution_date)` per hostname): already deduped by `GROUP BY hostname` and stays unchanged.
- `/vcl/stats` global `stats.compliant` / `stats.non_compliant`: already use `COUNT(DISTINCT hostname)` and stay unchanged.
- Frontend components (`ComplianceDetailPanel`, `ComplianceCharts`, `CompliancePage`): no changes required. They already render one chip per `metric_id`; the bug was purely an upstream data issue.
- `persistUpload()` error handling: snapshot creation remains wrapped in `try/catch` that logs but does not fail the upload commit.

**Scope:**
All endpoint inputs that do not involve cross-vertical duplicate `(hostname, metric_id)` rows must be byte-for-byte identical to the pre-fix output. The fix only changes what happens when two or more `compliance_items` rows share `(hostname, metric_id)` across verticals.

## Hypothesized Root Cause

All five sites have the same shape of bug — missing dedup on `(hostname, metric_id)` after the multi-vertical migration admitted two-row scenarios — but with slightly different mechanics. Listing them explicitly so the test plan can confirm or refute each one:

1. **`/items` — `groupByHostname()` pushes every row.** The handler's `WHERE` clause `(ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')` admits both verticals' rows. `groupByHostname()` then iterates each row and unconditionally calls `dev.failing_metrics.push({ metric_id, ... })`. There is no `Set`/`Map` keyed on `metric_id`, so a duplicate row produces a duplicate metric in the array.

2. **`/items/:hostname` — no vertical filter, no dedup.** The detail query is `WHERE ci.hostname = $1` with no `vertical` predicate at all, so every vertical's row for that hostname is returned. The response builder does `metricRows.map(r => ({ ...r }))` — there is no dedup step. Every duplicate row becomes a duplicate entry in `metrics`.

3. **`/vcl/stats` heavy-hitters and per-team totals — team chosen per row, not per hostname.** The `GROUP BY team ... COUNT(DISTINCT hostname)` query is correct for "how many distinct hostnames does each team see," but a hostname with rows under two different teams (because `team` differs across verticals) is counted in both groups. The dashboard's global `stats.non_compliant` (a single `COUNT(DISTINCT hostname)` with no team scoping) does not match `SUM(heavy_hitters[*].non_compliant)`.

4. **`/vcl/stats` forecast-burndown — duplicate rows inflate forecast count.** The query `SELECT resolution_date FROM compliance_items WHERE status = 'active' AND team = $1 AND resolution_date IS NOT NULL` returns one row per `compliance_items` row, not one row per `(hostname, metric_id)`. The downstream `blockers = teamNonCompliant - forecastItems.length` calculation can go negative because `forecastItems.length` is inflated relative to the deduped `teamNonCompliant`.

5. **`/mttr` — `bucketAgingItems()` iterates rows directly.** The query `SELECT seen_count, team FROM compliance_items WHERE status = 'active'` returns every active row, and `bucketAgingItems()` does `for (const item of items) { buckets[label].total += 1; ... }`. There is no `Set` keyed on `(hostname, metric_id)`, so each duplicate row increments its bucket twice.

6. **`persistUpload()` snapshot — `CASE WHEN status =` double-counts.** The snapshot query
   ```sql
   COUNT(DISTINCT CASE WHEN status = 'resolved' THEN hostname END) AS compliant,
   COUNT(DISTINCT CASE WHEN status = 'active'   THEN hostname END) AS non_compliant
   ```
   counts a hostname in `compliant` if any of its rows is `resolved` AND in `non_compliant` if any of its rows is `active`. With duplicate rows that disagree on status across verticals, the same hostname lands in both columns.

The common structural cause is that the multi-vertical migration (`add_vcl_multi_vertical.js`) added a `vertical` column to `compliance_items` but did not retrofit existing read queries — which assumed `(hostname, metric_id)` was effectively a unique key — to either dedupe explicitly or scope to a single vertical.

## Correctness Properties

Property 1: Bug Condition — `/items` failing_metrics array contains at most one entry per metric_id

_For any_ set of `compliance_items` rows including cross-vertical duplicates of `(hostname, metric_id)`, the response from `GET /items` SHALL contain, for every device in `response.devices`, exactly one entry per unique `metric_id` in `device.failing_metrics`. Formally: `device.failing_metrics.length == new Set(device.failing_metrics.map(m => m.metric_id)).size`.

**Validates: Requirements 2.1**

Property 2: Bug Condition — `/items/:hostname` returns one metric per (metric_id, status)

_For any_ device with cross-vertical duplicate `(hostname, metric_id)` rows, the response from `GET /items/:hostname` SHALL contain exactly one entry per unique `(metric_id, status)` pair in `response.metrics`. Each entry's `seen_count` SHALL equal the maximum `seen_count` across the duplicate rows for that pair.

**Validates: Requirements 2.2, 2.3**

Property 3: Bug Condition — `/vcl/stats` per-team device counts equal COUNT(DISTINCT hostname)

_For any_ set of `compliance_items` rows including cross-vertical duplicates where a hostname's `team` differs across verticals, the response from `GET /vcl/stats` SHALL satisfy `SUM(heavy_hitters[*].non_compliant) == stats.non_compliant`. Each hostname SHALL be assigned to exactly one team — the team from its representative row (highest `seen_count`, then most recent `upload_id`) — regardless of how many verticals contain rows for that hostname.

**Validates: Requirements 2.5**

Property 4: Bug Condition — `/vcl/stats` forecast-burndown is deduped by (hostname, metric_id) and blockers is non-negative

_For any_ team with cross-vertical duplicate `(hostname, metric_id)` rows where both rows have a non-null `resolution_date`, the deduped forecast row count for that team SHALL contribute at most one entry per unique `(hostname, metric_id)`, and `blockers = teamNonCompliant - dedupedForecastCount` SHALL be `>= 0` (no clamp required to satisfy the invariant).

**Validates: Requirements 2.6**

Property 5: Bug Condition — `/mttr` aging buckets count each unique (hostname, metric_id) exactly once

_For any_ set of active `compliance_items` rows including cross-vertical duplicates, the response from `GET /mttr` SHALL satisfy `SUM(aging[*].total) == COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')`. Each unique `(hostname, metric_id)` SHALL be bucketed exactly once using a single representative `seen_count`.

**Validates: Requirements 2.7**

Property 6: Bug Condition — `persistUpload()` snapshot rows satisfy compliant + non_compliant <= total_devices

_For any_ `persistUpload()` invocation, every row written into `compliance_snapshots` for the current `(snapshot_month, vertical)` pair SHALL satisfy `compliant + non_compliant <= total_devices`. A hostname with both `active` and `resolved` rows for the same team SHALL be classified as `non_compliant` (active wins over resolved) and SHALL appear in exactly one of the two columns.

**Validates: Requirements 2.4**

Property 7: Preservation — non-duplicated rows are unchanged

_For any_ set of `compliance_items` rows where every `(hostname, metric_id)` is unique across verticals, the responses from `/items`, `/items/:hostname`, `/vcl/stats`, `/mttr`, and the `compliance_snapshots` rows written by `persistUpload()` SHALL be identical to the pre-fix output for the same input. The fix SHALL NOT change behavior on the unique-`(hostname, metric_id)` case.

**Validates: Requirements 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7**

Property 8: Preservation — `seen_count`, `first_seen`, and `last_seen` are correctly aggregated for the representative row

_For any_ duplicated `(hostname, metric_id)`, the surviving entry SHALL carry `seen_count = MAX(seen_count)` across the duplicates, `first_seen = MIN(first_seen)` across the duplicates, and `last_seen = MAX(last_seen)` across the duplicates. Active/resolved status separation SHALL be preserved (a metric still appears in the `active` section if any of its duplicate rows is `active`).

**Validates: Requirements 3.2, 3.3**

## Fix Implementation

### Changes Required

All changes are in `backend/routes/compliance.js`. No schema migration, no new column, no frontend change. SQL-level dedup is preferred wherever possible because it avoids materialising duplicates into Node memory and centralises the representative-row policy at the data layer.

#### Fix 1: `GET /items` — `DISTINCT ON (hostname, metric_id)` in SQL, defensive dedupe in `groupByHostname`

**File**: `backend/routes/compliance.js`

**Function**: `router.get('/items', ...)` (around line 535) and helper `groupByHostname` (around line 213)

**Specific Changes**:
1. Rewrite the items query to use `DISTINCT ON (hostname, metric_id)` so each unique violation contributes exactly one row, choosing the representative row by highest `seen_count` and most recent `upload_id`:
   ```sql
   SELECT DISTINCT ON (ci.hostname, ci.metric_id)
          ci.hostname, ci.ip_address, ci.device_type, ci.team,
          ci.metric_id, ci.metric_desc, ci.category, ci.status, ci.seen_count,
          fu.report_date AS first_seen,
          lu.report_date AS last_seen,
          ru.report_date AS resolved_on
   FROM compliance_items ci
   LEFT JOIN compliance_uploads fu ON ci.first_seen_upload_id = fu.id
   LEFT JOIN compliance_uploads lu ON ci.upload_id = lu.id
   LEFT JOIN compliance_uploads ru ON ci.resolved_upload_id = ru.id
   WHERE ci.team = $1
     AND ci.status = $2
     AND (ci.vertical IS NULL OR ci.vertical = 'NTS_AEO')
   ORDER BY ci.hostname, ci.metric_id, ci.seen_count DESC, ci.upload_id DESC
   ```
   The `ORDER BY` lead expressions match the `DISTINCT ON` columns; the trailing expressions select the representative.
2. Add a defensive in-memory dedupe in `groupByHostname()` keyed on `metric_id` per device, so any future code path that bypasses the SQL dedupe still cannot duplicate a chip:
   ```javascript
   function groupByHostname(rows, noteHostnames) {
       const deviceMap = {};
       for (const row of rows) {
           if (!deviceMap[row.hostname]) {
               deviceMap[row.hostname] = {
                   hostname: row.hostname, ip_address: row.ip_address || '', device_type: row.device_type || '',
                   team: row.team || '', status: row.status, failing_metrics: [],
                   _seenMetricIds: new Set(),
                   seen_count: row.seen_count || 1, first_seen: row.first_seen || null,
                   last_seen: row.last_seen || null, resolved_on: row.resolved_on || null,
                   has_notes: noteHostnames.has(row.hostname),
               };
           }
           const dev = deviceMap[row.hostname];
           if (!dev._seenMetricIds.has(row.metric_id)) {
               dev._seenMetricIds.add(row.metric_id);
               dev.failing_metrics.push({ metric_id: row.metric_id, metric_desc: row.metric_desc || '', category: row.category || '' });
           }
           if ((row.seen_count || 1) > dev.seen_count) dev.seen_count = row.seen_count;
           if (row.first_seen && (!dev.first_seen || row.first_seen < dev.first_seen)) dev.first_seen = row.first_seen;
           if (row.last_seen && (!dev.last_seen || row.last_seen > dev.last_seen)) dev.last_seen = row.last_seen;
       }
       return Object.values(deviceMap).map(({ _seenMetricIds, ...dev }) => dev);
   }
   ```
   The `_seenMetricIds` set is stripped before the device is returned so the response shape is unchanged.

#### Fix 2: `GET /items/:hostname` — `DISTINCT ON (metric_id, status)` in SQL

**File**: `backend/routes/compliance.js`

**Function**: `router.get('/items/:hostname', ...)` (around line 575)

**Specific Changes**:
1. Rewrite the metrics query to dedupe on `(metric_id, status)` so a metric that appears as both `active` and `resolved` is preserved (one row each), but cross-vertical duplicates of the same `(metric_id, status)` collapse to one:
   ```sql
   SELECT DISTINCT ON (ci.metric_id, ci.status)
          ci.metric_id, ci.metric_desc, ci.category, ci.status,
          ci.ip_address, ci.device_type, ci.team, ci.seen_count, ci.extra_json,
          ci.resolution_date, ci.remediation_plan,
          fu.report_date AS first_seen, fu.uploaded_at AS first_seen_at,
          lu.report_date AS last_seen,  lu.uploaded_at AS last_seen_at,
          ru.report_date AS resolved_on
   FROM compliance_items ci
   LEFT JOIN compliance_uploads fu ON ci.first_seen_upload_id = fu.id
   LEFT JOIN compliance_uploads lu ON ci.upload_id = lu.id
   LEFT JOIN compliance_uploads ru ON ci.resolved_upload_id = ru.id
   WHERE ci.hostname = $1
   ORDER BY ci.metric_id, ci.status, ci.seen_count DESC, ci.upload_id DESC
   ```
2. The existing post-query sort (`status DESC` for grouping, then `metric_id`) is rebuilt in JavaScript on the deduped result to preserve the response ordering:
   ```javascript
   metricRows.sort((a, b) => {
       if (a.status !== b.status) return b.status.localeCompare(a.status);  // 'resolved' < 'active' alphabetically; we want 'active' first
       return a.metric_id.localeCompare(b.metric_id);
   });
   ```
   (Note: the original SQL used `ORDER BY ci.status DESC, ci.metric_id`, which placed `resolved` before `active` because of `DESC` on the alphabetic order. The post-fix sort reproduces that exact ordering on the deduped rows.)
3. The `identity` lookup `metricRows.find(r => r.status === 'active') || metricRows[0]` is unchanged — it still picks the active representative for `ip_address`, `device_type`, `team`, `resolution_date`, and `remediation_plan`.

#### Fix 3: `GET /vcl/stats` heavy-hitters and per-team totals — dedupe to one team per hostname via CTE

**File**: `backend/routes/compliance.js`

**Function**: `router.get('/vcl/stats', ...)` (around line 990, the `teamRows` query)

**Specific Changes**:
1. Replace the heavy-hitters query with a CTE that first picks one representative row per hostname (using the same `DISTINCT ON (hostname)` policy: highest `seen_count`, then most recent `upload_id`), then groups by the representative's team:
   ```sql
   WITH device_team AS (
       SELECT DISTINCT ON (hostname)
              hostname,
              COALESCE(team, 'Unknown') AS team,
              resolution_date
       FROM compliance_items
       WHERE status = 'active'
       ORDER BY hostname, seen_count DESC, upload_id DESC
   )
   SELECT team,
          COUNT(DISTINCT hostname)::int AS non_compliant,
          MAX(resolution_date)          AS compliance_date
   FROM device_team
   GROUP BY team
   ORDER BY COUNT(DISTINCT hostname) DESC
   ```
   Because the CTE already collapses each hostname to one row, `COUNT(DISTINCT hostname)` per team is equivalent to `COUNT(*)` per team, but `DISTINCT` is kept for defensive symmetry.
2. The dashboard's existing `stats.non_compliant` query (also `COUNT(DISTINCT hostname)` over all active rows, no team scoping) is unchanged — but now the property `SUM(heavy_hitters[*].non_compliant) == stats.non_compliant` holds because both numerators and denominators agree on "one hostname, one team."
3. The per-team-total query inside the `for (const teamRow of teamRows)` loop (`SELECT COUNT(DISTINCT hostname) AS total FROM compliance_items WHERE COALESCE(team, 'Unknown') = $1`) is similarly rewritten to use the same `device_team`-style CTE so the team's `total_devices` matches the team's `non_compliant`. The simplest form:
   ```sql
   WITH device_team AS (
       SELECT DISTINCT ON (hostname)
              hostname,
              COALESCE(team, 'Unknown') AS team
       FROM compliance_items
       ORDER BY hostname, seen_count DESC, upload_id DESC
   )
   SELECT COUNT(*)::int AS total FROM device_team WHERE team = $1
   ```
   Note: the loop runs once per team, so this CTE is computed N times — acceptable given the small team count (4 teams). If profiling shows this is a hot path, hoist the `device_team` CTE out of the loop and compute totals in a single grouped query.

#### Fix 4: `GET /vcl/stats` forecast-burndown — dedupe by `(hostname, metric_id)` in SQL

**File**: `backend/routes/compliance.js`

**Function**: `router.get('/vcl/stats', ...)` (around line 1015, the `forecastItems` query)

**Specific Changes**:
1. Rewrite the forecast query to `DISTINCT ON (hostname, metric_id)` so each unique violation contributes at most one `resolution_date` to the forecast bucketing:
   ```sql
   SELECT DISTINCT ON (hostname, metric_id) resolution_date
   FROM compliance_items
   WHERE status = 'active'
     AND COALESCE(team, 'Unknown') = $1
     AND resolution_date IS NOT NULL
   ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC
   ```
2. The downstream `blockers = teamNonCompliant - forecastItems.length` calculation is unchanged. With both sides now deduped consistently — `teamNonCompliant` from the `device_team` CTE in Fix 3, `forecastItems.length` from the deduped query above — the difference is non-negative without any clamp.
3. The `Math.max(blockers, 0)` clamp at the existing call site is left in place as a belt-and-braces safeguard. It SHOULD be a no-op after the fix; if a regression introduces inconsistency again, the property test for Property 4 will catch it.

#### Fix 5: `GET /mttr` — dedupe by `(hostname, metric_id)` in SQL

**File**: `backend/routes/compliance.js`

**Function**: `router.get('/mttr', ...)` (around line 824)

**Specific Changes**:
1. Rewrite the query to `DISTINCT ON (hostname, metric_id)` so each unique active violation contributes exactly one `(seen_count, team)` row to `bucketAgingItems()`:
   ```sql
   SELECT DISTINCT ON (hostname, metric_id)
          COALESCE(seen_count, 1) AS seen_count,
          team
   FROM compliance_items
   WHERE status = 'active'
   ORDER BY hostname, metric_id, seen_count DESC, upload_id DESC
   ```
2. `bucketAgingItems()` itself does not change. It already does the right thing when fed one row per unique violation.
3. SQL-level dedup is preferred over a JavaScript-side `Set` here because the helper is also called from `/vcl/stats` and changing its contract risks regressions in the other caller. Pushing the dedup into SQL keeps `bucketAgingItems()` pure and reusable.

#### Fix 6: `persistUpload()` snapshot block — classify hostnames via `MIN(status)` in a CTE

**File**: `backend/routes/compliance.js`

**Function**: `persistUpload()` (lines 81–192), specifically the `verticalStats` query at line 157

**Specific Changes**:
1. Rewrite the snapshot query so each hostname is classified by its worst-case status (active wins over resolved) inside a CTE before counting. Because `'active' < 'resolved'` lexicographically, `MIN(status)` returns `'active'` for a hostname that has any active row and `'resolved'` only if all rows are `'resolved'`:
   ```sql
   WITH hostname_status AS (
       SELECT team,
              hostname,
              MIN(status) AS status  -- 'active' beats 'resolved' alphabetically
       FROM compliance_items
       WHERE team IS NOT NULL
       GROUP BY team, hostname
   )
   SELECT team AS vertical,
          COUNT(*)::int                                                 AS total_devices,
          COUNT(*) FILTER (WHERE status = 'resolved')::int              AS compliant,
          COUNT(*) FILTER (WHERE status = 'active')::int                AS non_compliant
   FROM hostname_status
   GROUP BY team
   ```
   Each hostname appears exactly once in `hostname_status`, so `compliant + non_compliant == total_devices` is structurally guaranteed.
2. The downstream `INSERT ... ON CONFLICT (snapshot_month, vertical) DO UPDATE` block is unchanged. It already keys snapshots on `vertical`, and the per-team counts now satisfy `compliant + non_compliant <= total_devices`.
3. The `compliance_pct` calculation `Math.round((vs.compliant / total) * 100 * 100) / 100` is unchanged. It is now defended against the previous off-by-one: with the old query a hostname could be in both `compliant` and `non_compliant`, double-counting itself in `compliant`'s numerator while only contributing once to `total_devices`'s denominator. The fix removes that.

> Note: Fix 6 is conceptually adjacent to but mechanically distinct from the `persistUpload` fix in `compliance-duplicate-chart-entries`. That spec adds a `WHERE vertical = $1` filter to scope the snapshot to one vertical at a time. This spec adds the `hostname_status` CTE so that within whichever vertical is snapshotted, each hostname is classified once. Both fixes are independently necessary; landing them together is preferred but each is correct on its own.

## Testing Strategy

### Validation Approach

The bug condition is straightforward to construct: insert two `compliance_items` rows for the same `(hostname, metric_id)` with different `vertical` values, then call each affected endpoint. The two-phase approach is to first run the tests against the unfixed code to confirm the duplication counterexamples, then run the same tests against the fixed code and add property-based tests that explore the input space more broadly.

### Exploratory Bug Condition Checking

**Goal**: Surface counterexamples that demonstrate each of the six manifestations BEFORE implementing the fix. Confirm or refute the root cause analysis for each endpoint independently — they share a structural cause but the SQL details differ.

**Test Plan**: Seed a clean test database with a fixture representing the original GitLab #13 scenario (`STEAM-INTERSIGHT` failing `7.1.1` in both `vertical IS NULL` and `vertical = 'NTS_AEO'`), plus targeted variants for each sibling endpoint. Call each affected endpoint and assert the buggy invariants. Run on UNFIXED code first.

**Test Cases**:

1. **`/items` Duplicate Failing Metric Test** — Insert two rows for `(STEAM-INTERSIGHT, 7.1.1)`: one with `vertical IS NULL, status = 'active', team = 'STEAM'`, another with `vertical = 'NTS_AEO', status = 'active', team = 'STEAM'`. Call `GET /items?team=STEAM&status=active`. Assert that `response.devices[0].failing_metrics.filter(m => m.metric_id === '7.1.1').length === 1`. (will fail on unfixed code — returns 2)

2. **`/items/:hostname` Duplicate Metric Entry Test** — Same fixture. Call `GET /items/STEAM-INTERSIGHT`. Assert `response.metrics.filter(m => m.metric_id === '7.1.1' && m.status === 'active').length === 1`. Assert `response.metrics[0].seen_count === MAX(seen_count across the duplicate rows)`. (will fail on unfixed code — returns 2)

3. **`/vcl/stats` Cross-Team Hostname Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)`: one with `team = 'STEAM', vertical IS NULL`, another with `team = 'ACCESS-ENG', vertical = 'NTS_AEO'`. Call `GET /vcl/stats`. Assert `SUM(heavy_hitters[*].non_compliant) === stats.non_compliant`. (will fail on unfixed code — sum exceeds the global count by 1)

4. **`/vcl/stats` Forecast Negative Blockers Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)` both with `team = 'STEAM', status = 'active'`, both with `resolution_date = '2025-09-30'`, but different verticals. Call `GET /vcl/stats`. Assert `vertical_breakdown.find(v => v.team === 'STEAM').blockers >= 0` AND that the unclamped `teamNonCompliant - dedupedForecastCount` equals the reported `blockers`. (will fail on unfixed code — clamped from `-1` to `0`, hiding the inconsistency; the unclamped check makes the failure visible)

5. **`/mttr` Inflated Bucket Test** — Insert two rows for `(SOME-DEVICE, 7.1.1)` with `seen_count = 5, team = 'STEAM', status = 'active'`, different verticals. Compare `SUM(aging[*].total)` to `COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')`. Assert equality. (will fail on unfixed code — bucket total exceeds distinct count by 1)

6. **`persistUpload()` Snapshot Inconsistency Test** — Pre-populate `compliance_items` with `(SOME-DEVICE, 7.1.1)` having one `status = 'active'` row and one `status = 'resolved'` row, both `team = 'STEAM'`, different verticals. Call `persistUpload()` with a no-op upload. Read back the `compliance_snapshots` row for the current month and `STEAM`. Assert `compliant + non_compliant <= total_devices`. (will fail on unfixed code — the hostname is counted in both columns so the sum exceeds `total_devices` by 1)

7. **Edge Case — Single-Vertical Regression Test** — Insert a fixture where every `(hostname, metric_id)` is unique (no cross-vertical duplicates). Call all five read endpoints and capture responses. Apply the fix, re-run, and assert response equality (byte-for-byte). Run `persistUpload()` on a single-vertical fixture and assert the resulting `compliance_snapshots` rows are identical pre-fix and post-fix. (should pass on unfixed code; will pass on fixed code; protects the preservation property)

**Expected Counterexamples**:
- `/items` returns `failing_metrics` arrays where `length > Set(metric_ids).size`. Cause: `groupByHostname()` pushes per row instead of per unique `metric_id`.
- `/items/:hostname` returns `metrics` arrays where `(metric_id, status)` collisions exist. Cause: no vertical filter and no dedup in the detail query or response builder.
- `/vcl/stats` returns `heavy_hitters` whose `non_compliant` values sum to more than the global `stats.non_compliant`. Cause: a hostname's `team` differs across verticals and the per-team `COUNT(DISTINCT hostname)` counts it under both teams.
- `/vcl/stats` reports `blockers` clamped to 0 when the unclamped expression is negative. Cause: the forecast query is not deduped on `(hostname, metric_id)` so its row count exceeds `teamNonCompliant`.
- `/mttr` returns `aging` whose total exceeds the distinct-violation count. Cause: `bucketAgingItems()` iterates rows directly with no dedup.
- `persistUpload()` writes `compliance_snapshots` rows where `compliant + non_compliant > total_devices`. Cause: the snapshot query's `CASE WHEN status = 'X'` pattern lets a hostname appear in both columns.

### Fix Checking

**Goal**: Verify that for all inputs where the bug condition holds (any `(hostname, metric_id)` shared by two or more rows across verticals), each fixed endpoint produces the expected deduped result.

**Pseudocode:**
```
FOR ALL items WHERE EXISTS (h, m) WITH COUNT(items WHERE hostname = h AND metric_id = m) > 1 DO
  items_response   := GET_items_fixed(items)
  detail_response  := GET_items_hostname_fixed(items, some_hostname_with_dups)
  stats_response   := GET_vcl_stats_fixed(items)
  mttr_response    := GET_mttr_fixed(items)
  snapshot_rows    := persistUpload_fixed(no_op_upload, items)

  ASSERT no_duplicate_metrics_per_device(items_response.devices)
  ASSERT no_duplicate_metric_status_pairs(detail_response.metrics)
  ASSERT sum_of_team_counts_equals_global(stats_response)
  ASSERT forecast_blockers_non_negative(stats_response.vertical_breakdown)
  ASSERT mttr_total_equals_distinct_violations(mttr_response.aging, items)
  ASSERT snapshot_invariant_holds(snapshot_rows)
END FOR
```

### Preservation Checking

**Goal**: Verify that for all inputs where the bug condition does NOT hold (every `(hostname, metric_id)` is unique across verticals), the fixed endpoints produce results identical to the original endpoints.

**Pseudocode:**
```
FOR ALL items WHERE FORALL (h, m), COUNT(items WHERE hostname = h AND metric_id = m) <= 1 DO
  ASSERT GET_items_original(items)          = GET_items_fixed(items)
  ASSERT GET_items_hostname_original(items) = GET_items_hostname_fixed(items)
  ASSERT GET_vcl_stats_original(items)      = GET_vcl_stats_fixed(items)
  ASSERT GET_mttr_original(items)           = GET_mttr_fixed(items)
  ASSERT persistUpload_original(items).snapshots = persistUpload_fixed(items).snapshots
END FOR
```

**Testing Approach**: Property-based testing is the right fit for preservation checking here:
- The unique-`(hostname, metric_id)` input space is large (any number of hostnames, any combination of metrics, any team and vertical mix) and exhaustive enumeration is impractical.
- The preservation property is strict equality, which is well-suited to PBT shrinking — any counterexample is a small fixture demonstrating a behavior change.
- The legacy AEO-only data shape (`vertical IS NULL`) must be exercised, which falls naturally out of generators that include `null` verticals.

**Test Plan**: Capture responses from the unfixed code on unique-`(hostname, metric_id)` fixtures (snapshot tests). After applying the fix, re-run the same fixtures and assert equality. Then run a property-based generator that produces random unique-key scenarios and asserts the same equality.

**Test Cases**:
1. **Snapshot Equality — Empty State** — Empty `compliance_items`. `/items?team=STEAM` returns `{ devices: [], team: 'STEAM', status: 'active' }`. `/items/:hostname` returns 404. `/vcl/stats` returns its zero-state shape. `/mttr` returns `{ aging: [] }`. `persistUpload()` writes no snapshot rows. Snapshot-test before and after the fix.
2. **Snapshot Equality — Single AEO-Only Items** — Items with `vertical IS NULL` only. Capture pre-fix responses, apply fix, assert equality across all five endpoints.
3. **Snapshot Equality — Multiple Unique Verticals** — A mix of items with `vertical IS NULL` and `vertical = 'NTS_AEO'`, but no `(hostname, metric_id)` collision across verticals. Capture pre-fix responses, apply fix, assert equality.
4. **`/items` Team and Status Filter Preservation** — Active and resolved items mixed across teams. Assert `?team=STEAM&status=active`, `?team=STEAM&status=resolved`, and `?team=ACCESS-ENG&status=active` each return the same devices pre-fix and post-fix on a unique-key fixture. Assert non-`ALLOWED_TEAMS` value (e.g., `?team=OTHER`) returns HTTP 400.
5. **`/items/:hostname` Active-Then-Resolved Ordering Preservation** — A device with both active and resolved metrics. Assert the response's `metrics` array has all `active` entries before all `resolved` entries, sorted by `metric_id` within each group, identical to pre-fix.
6. **`/vcl/stats` Donut Categorization Preservation** — Active items with mixed null/non-null `resolution_date`. Assert `donut.blocked` and `donut.in_progress` counts match pre-fix on a unique-key fixture (this is already deduped via `GROUP BY hostname` in the existing query).
7. **`persistUpload()` Snapshot Equality — Single-Status-Per-Hostname Fixture** — Pre-populate `compliance_items` so every hostname has rows of only one status (all active or all resolved within a team). Run `persistUpload()`. Assert `compliance_snapshots` rows are identical pre-fix and post-fix.

### Test Fixtures Needed

The following fixtures must be reusable across unit, integration, and property-based tests. Place them under `backend/__tests__/fixtures/compliance-duplicate-failing-metrics/` (or inline as factory functions in the test file, matching the convention used in `vcl-compliance-reporting.test.js`).

1. **`fixtureCrossVerticalDuplicateActive`** — A single hostname with two rows for the same `metric_id`, both `status = 'active'`, both `team = 'STEAM'`, one with `vertical IS NULL` and one with `vertical = 'NTS_AEO'`. Different `seen_count` (e.g., 3 and 5) so the representative-row policy is exercised. Used by `/items`, `/items/:hostname`, `/mttr`, and the `/vcl/stats` forecast tests.

2. **`fixtureCrossVerticalTeamMismatch`** — A single hostname with two active rows for the same `metric_id` but different `team` across verticals (`STEAM` in legacy, `ACCESS-ENG` in NTS_AEO). Used by the `/vcl/stats` heavy-hitters and per-team-totals test (Property 3).

3. **`fixtureCrossVerticalStatusMismatch`** — A single hostname with two rows for the same `metric_id` and same team, but different `status` across verticals (`active` in legacy, `resolved` in NTS_AEO). Used by the `persistUpload()` snapshot test (Property 6).

4. **`fixtureMultiHostnameMixed`** — A combination of unique-key hostnames and one duplicated `(hostname, metric_id)`. Used to verify that the dedup applies only to the duplicates and leaves unique entries untouched.

5. **`fixtureLegacyOnly`** — All rows with `vertical IS NULL`. Used for preservation checks that confirm legacy data flows are unchanged.

6. **`fixtureNTSAEOOnly`** — All rows with `vertical = 'NTS_AEO'`. Used for preservation checks that confirm multi-vertical-only data flows are unchanged.

7. **`fixtureForecastDuplicateResolutionDate`** — A duplicated `(hostname, metric_id)` where both rows have a non-null `resolution_date` (same value). Used to verify Property 4 (`blockers >= 0`).

8. **`fixtureSeenCountTiebreak`** — A duplicated `(hostname, metric_id)` where both rows have the same `seen_count` but different `upload_id`. Used to verify the `upload_id DESC` tiebreaker in the `DISTINCT ON ... ORDER BY` policy.

### Unit Tests

- `/items`: insert `fixtureCrossVerticalDuplicateActive`, assert `failing_metrics.length === 1` and `seen_count === 5`.
- `/items/:hostname`: insert `fixtureCrossVerticalDuplicateActive`, assert `metrics.length === 1` for `(7.1.1, active)`.
- `/items/:hostname` ordering: insert a fixture with one active and one resolved metric, assert `metrics[0].status === 'active'` and `metrics[1].status === 'resolved'`.
- `/vcl/stats` heavy-hitters: insert `fixtureCrossVerticalTeamMismatch`, assert hostname counted exactly once in the team derived from the representative row.
- `/vcl/stats` forecast: insert `fixtureForecastDuplicateResolutionDate`, assert `blockers >= 0` and `forecastItems.length === teamNonCompliant`.
- `/mttr`: insert `fixtureCrossVerticalDuplicateActive` with `seen_count = 5`, assert exactly one increment in the `4–6 cycles` bucket for `STEAM`.
- `persistUpload()` snapshot: insert `fixtureCrossVerticalStatusMismatch`, run `persistUpload()`, assert the resulting snapshot row has `compliant === 0`, `non_compliant === 1`, `total_devices === 1`.
- `groupByHostname()` direct: pass a list with two rows for the same `(hostname, metric_id)`, assert one entry in `failing_metrics`. This protects the helper's contract independently of the SQL dedup.

### Property-Based Tests

Use `fast-check` (already in use in `backend/__tests__/*.property.test.js`). Generators should mix `vertical IS NULL` and `vertical = 'NTS_AEO'` rows, with controllable rates of `(hostname, metric_id)` collision.

- **Property 1 — `/items`**: For any list of `compliance_items` rows including cross-vertical duplicates, assert that for every device in the response, `device.failing_metrics.length === new Set(device.failing_metrics.map(m => m.metric_id)).size`.
- **Property 2 — `/items/:hostname`**: For any duplicated `(hostname, metric_id, status)`, assert `response.metrics.filter(m => m.metric_id === target_metric && m.status === target_status).length === 1`. Assert the surviving entry's `seen_count` is the maximum across duplicates.
- **Property 3 — `/vcl/stats`**: For any input including team-mismatched cross-vertical duplicates, assert `SUM(heavy_hitters[*].non_compliant) === stats.non_compliant`. Generate inputs that vary the rate of team mismatch from 0% to 100%.
- **Property 4 — `/vcl/stats` forecast**: For any input including duplicated `(hostname, metric_id)` with non-null resolution dates, assert `vertical_breakdown[*].blockers >= 0` and assert the unclamped expression equals `teamNonCompliant - dedupedForecastCount`.
- **Property 5 — `/mttr`**: For any input including cross-vertical duplicate active rows, assert `SUM(aging[*].total) === COUNT(DISTINCT (hostname, metric_id) WHERE status = 'active')` over the input. Per-team totals: assert `SUM(aging[*][team])` over each team equals the distinct count for that team.
- **Property 6 — `persistUpload()` snapshot invariant**: For any combination of cross-vertical status mismatches, assert every row written into `compliance_snapshots` satisfies `compliant + non_compliant <= total_devices` and `compliant + non_compliant === total_devices` (equality, since every hostname is classified into exactly one column).
- **Property 7 — Preservation**: For any input where every `(hostname, metric_id)` is unique across verticals, the responses from all five endpoints SHALL equal the responses on the same input from the original (unfixed) implementations. Use fast-check's shrinking to surface the smallest counterexample if the property fails.
- **Property 8 — Representative-row policy**: For any duplicated `(hostname, metric_id)`, the surviving entry's `seen_count` SHALL equal `max(duplicates.seen_count)`, and ties SHALL be broken by `max(duplicates.upload_id)`.

### Integration Tests

- Full flow: upload a legacy AEO xlsx, then upload an NTS_AEO multi-vertical xlsx that overlaps on at least one `(hostname, metric_id)`. Call `/items`, `/items/:hostname`, `/vcl/stats`, `/mttr`, and read `compliance_snapshots` for the current month. Assert all six post-fix invariants hold.
- Cross-page consistency: load `/vcl/stats` and `/items?team=STEAM` for the same backing data. Assert that the device count from `/vcl/stats` heavy-hitters for `STEAM` equals the number of devices in `/items?team=STEAM&status=active`.
- ComplianceDetailPanel render: load a device with cross-vertical duplicates via `/items/:hostname`, render `ComplianceDetailPanel`, assert exactly one `MetricChip` with the duplicated `metric_id` is present in the "Failing Metrics" section.
- `persistUpload()` end-to-end: with cross-vertical status mismatch present, run a fresh upload and assert the new `compliance_snapshots` row's invariant holds.