Bugfix Requirements Document
Introduction
Several Compliance backend endpoints select from compliance_items without scoping to a single vertical, so when the same (hostname, metric_id) exists in both a legacy vertical IS NULL upload and a multi-vertical vertical = 'NTS_AEO' upload, the duplicate rows distort the response. The originally reported symptom is on the device-level violation view: hostname STEAM-INTERSIGHT (IP 172.16.30.40) shows metric 7.1.1 listed twice in its failing metrics list (GitLab issue #13, reported by nkapur). Investigation found the same duplication pattern in additional endpoints that drive per-team and per-vertical reporting.
This spec covers the full class of "duplicate (hostname, metric_id) rows across verticals" bugs in backend/routes/compliance.js. The affected surfaces are:
GET /items - failing metrics list per device (originally reported)
GET /items/:hostname - device detail metrics array (originally reported)
persistUpload() compliance_snapshots creation block - per-vertical compliant/non-compliant counts
GET /vcl/stats - heavy-hitters team counts, per-team device totals, and forecast-burndown row counts
GET /mttr - per-team aging bucket counts
Root Cause: Each affected query selects from compliance_items either with a vertical filter that admits both legacy and multi-vertical rows (vertical IS NULL OR vertical = 'NTS_AEO') or with no vertical filter at all. Some queries dedupe at the hostname level via COUNT(DISTINCT hostname). That keeps a hostname from being counted more than once within a single group, but it does not protect aggregations that depend on (hostname, metric_id) uniqueness, status uniqueness per hostname, or team uniqueness per hostname. The groupByHostname helper and the /items/:hostname query, by contrast, have no deduplication at all, so every duplicate row becomes a duplicate metric in the response.
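The duplication path through groupByHostname can be reproduced with a minimal sketch. The real helper is not shown in this document, so groupByHostnameNaive below is a hypothetical stand-in that assumes only that the helper appends each row's metric_id to a per-host list. The row shape is likewise an assumption.

```javascript
// Hypothetical stand-in for groupByHostname: one push per row, no dedup.
function groupByHostnameNaive(rows) {
  const byHost = new Map();
  for (const row of rows) {
    if (!byHost.has(row.hostname)) {
      byHost.set(row.hostname, { hostname: row.hostname, failing_metrics: [] });
    }
    byHost.get(row.hostname).failing_metrics.push(row.metric_id);
  }
  return [...byHost.values()];
}

// Two rows for the same (hostname, metric_id): one legacy, one NTS_AEO.
const rows = [
  { hostname: "STEAM-INTERSIGHT", metric_id: "7.1.1", vertical: null },
  { hostname: "STEAM-INTERSIGHT", metric_id: "7.1.1", vertical: "NTS_AEO" },
];

const grouped = groupByHostnameNaive(rows);
console.log(grouped[0].failing_metrics); // metric 7.1.1 appears twice
```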
Bug Analysis
Current Behavior (Defect)
1.1 WHEN a device has compliance_items rows for the same (hostname, metric_id) pair across multiple verticals (e.g., one row with vertical IS NULL and another with vertical = 'NTS_AEO') THEN the /items endpoint returns both rows and the groupByHostname function adds the same metric_id to failing_metrics multiple times
1.2 WHEN the /items/:hostname detail endpoint is called for a device that has compliance_items rows across multiple verticals THEN the system returns duplicate metric entries in the metrics array because the query has no vertical filter or deduplication
1.3 WHEN the ComplianceDetailPanel renders the metrics array for a device with duplicate entries THEN the same metric_id chip appears multiple times in the "Failing Metrics" section, confusing users about the actual number of distinct violations
1.4 WHEN persistUpload() builds per-team rows for compliance_snapshots and the same hostname has compliance_items rows in both a legacy vertical IS NULL upload and a vertical = 'NTS_AEO' upload with different statuses (e.g., active in one vertical and resolved in the other) THEN the snapshot query counts that hostname in BOTH the compliant and non_compliant columns for the team, inflating per-team totals and producing a row where compliant + non_compliant > total_devices
1.5 WHEN /vcl/stats computes the heavy-hitters table and per-team totals and the same hostname has rows in two verticals where the team column differs (e.g., team = 'STEAM' in the legacy row and team = 'ACCESS-ENG' in the NTS_AEO row) THEN the COUNT(DISTINCT hostname) aggregate counts the hostname under both team groups, double-counting the device across teams
1.6 WHEN /vcl/stats builds the forecast-burndown for a team by selecting resolution_date rows from compliance_items without DISTINCT AND the same (hostname, metric_id) has duplicate active rows across verticals, both with a non-null resolution_date THEN the forecast row count is inflated and the blockers = teamNonCompliant - forecastItems.length calculation can go negative or report a misleadingly low blocker count
1.7 WHEN /mttr selects seen_count, team from active compliance_items without deduplication AND the same (hostname, metric_id) has duplicate active rows across verticals THEN each duplicate row is bucketed independently in bucketAgingItems, inflating per-team aging totals for that team
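The arithmetic behind 1.4 and 1.6 can be checked with a small in-memory sketch. It assumes the snapshot counts distinct hostnames per status and that teamNonCompliant is a distinct-hostname count. The actual SQL is not shown in this document, and the hostnames, dates, and the helper countHostsWithStatus are made up for illustration.

```javascript
// Count distinct hostnames that have at least one row with the given status.
function countHostsWithStatus(rows, status) {
  const hosts = rows.filter((r) => r.status === status).map((r) => r.hostname);
  return new Set(hosts).size;
}

// 1.4: host-a is active in the legacy upload and resolved in NTS_AEO.
const teamRows = [
  { hostname: "host-a", metric_id: "7.1.1", vertical: null, status: "active" },
  { hostname: "host-a", metric_id: "7.1.1", vertical: "NTS_AEO", status: "resolved" },
  { hostname: "host-b", metric_id: "7.2.1", vertical: "NTS_AEO", status: "resolved" },
];
const totalDevices = new Set(teamRows.map((r) => r.hostname)).size; // 2
const nonCompliant = countHostsWithStatus(teamRows, "active"); // 1
const compliant = countHostsWithStatus(teamRows, "resolved"); // 2, host-a counted again
console.log(compliant + nonCompliant > totalDevices); // true

// 1.6: one violation duplicated across verticals, both rows carrying a date.
const activeRows = [
  { hostname: "host-c", metric_id: "7.1.1", vertical: null, status: "active", resolution_date: "2025-01-15" },
  { hostname: "host-c", metric_id: "7.1.1", vertical: "NTS_AEO", status: "active", resolution_date: "2025-01-15" },
];
const teamNonCompliant = countHostsWithStatus(activeRows, "active"); // 1
const forecastItems = activeRows.filter((r) => r.resolution_date !== null); // 2 rows
const blockers = teamNonCompliant - forecastItems.length;
console.log(blockers); // -1
```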
Expected Behavior (Correct)
2.1 WHEN a device has compliance_items rows for the same (hostname, metric_id) pair across multiple verticals THEN the /items endpoint SHALL return only one entry per unique (hostname, metric_id) combination in the failing_metrics array, using the row with the highest seen_count or most recent upload_id as the representative
2.2 WHEN the /items/:hostname detail endpoint is called for a device with rows across multiple verticals THEN the system SHALL return only one metric entry per unique (metric_id, status) combination, preferring the row with the highest seen_count or most recent data
2.3 WHEN the ComplianceDetailPanel renders the metrics for a device THEN each distinct metric_id SHALL appear exactly once in the "Failing Metrics" section regardless of how many underlying compliance_items rows exist for that metric across verticals
2.4 WHEN persistUpload() writes a per-team row to compliance_snapshots THEN the system SHALL count each unique hostname at most once across the (compliant, non_compliant) columns for that team, classifying a hostname as non_compliant if it has any active row in any vertical for the team and compliant only if all of its rows for the team are resolved, so that compliant + non_compliant ≤ total_devices always holds
2.5 WHEN /vcl/stats computes heavy-hitters and per-team totals THEN the system SHALL count each unique hostname under exactly one team — the team derived from the most recent (or otherwise canonical) compliance_items row for that hostname across all verticals — so that summing non_compliant across teams equals the total non-compliant device count
2.6 WHEN /vcl/stats builds the forecast-burndown for a team THEN the forecast row count SHALL be deduplicated by (hostname, metric_id) so that cross-vertical duplicate rows contribute at most one entry per unique violation, and blockers = teamNonCompliant - dedupedForecastCount SHALL never be negative
2.7 WHEN /mttr computes aging buckets per team THEN each unique (hostname, metric_id) active violation SHALL be bucketed exactly once using a single representative seen_count value, regardless of how many duplicate rows exist across verticals
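One possible shape for the deduplication in 2.1, 2.6, and 2.7 and the classification in 2.4 is sketched below. This is not the actual fix. The helper names dedupeByHostMetric and classifyHosts are hypothetical, the tie-break order (seen_count first, then upload_id) is an assumption since the spec allows either, and the dedup helper is meant to be applied to rows of a single status (for example active rows only).

```javascript
// Keep one representative row per (hostname, metric_id).
// Assumed tie-break: higher seen_count wins, then higher upload_id.
function dedupeByHostMetric(rows) {
  const best = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row.hostname, row.metric_id]);
    const cur = best.get(key);
    const better =
      !cur ||
      row.seen_count > cur.seen_count ||
      (row.seen_count === cur.seen_count && row.upload_id > cur.upload_id);
    if (better) best.set(key, row);
  }
  return [...best.values()];
}

// Classify each hostname once: non_compliant if any row is active,
// compliant only if every row is resolved (per 2.4).
function classifyHosts(rows) {
  const verdict = new Map();
  for (const row of rows) {
    if (row.status === "active") {
      verdict.set(row.hostname, "non_compliant");
    } else if (!verdict.has(row.hostname)) {
      verdict.set(row.hostname, "compliant");
    }
  }
  let compliant = 0;
  let nonCompliant = 0;
  for (const v of verdict.values()) {
    if (v === "compliant") compliant += 1;
    else nonCompliant += 1;
  }
  return { total_devices: verdict.size, compliant, non_compliant: nonCompliant };
}

const activeRows = [
  { hostname: "host-a", metric_id: "7.1.1", vertical: null, seen_count: 3, upload_id: 10 },
  { hostname: "host-a", metric_id: "7.1.1", vertical: "NTS_AEO", seen_count: 5, upload_id: 12 },
  { hostname: "host-a", metric_id: "7.2.1", vertical: "NTS_AEO", seen_count: 1, upload_id: 12 },
];
const deduped = dedupeByHostMetric(activeRows); // 2 rows; 7.1.1 keeps seen_count 5

const teamRows = [
  { hostname: "host-a", status: "active" },
  { hostname: "host-a", status: "resolved" },
  { hostname: "host-b", status: "resolved" },
];
const snapshot = classifyHosts(teamRows); // compliant + non_compliant equals total_devices
console.log(deduped.length, snapshot);
```

In practice the same effect would more likely be pushed into SQL; the in-memory version is shown only because it makes the selection rule easy to read.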
Unchanged Behavior (Regression Prevention)
3.1 WHEN a device has multiple distinct failing metric_ids (e.g., 7.1.1 and 7.2.1) THEN the system SHALL CONTINUE TO display each distinct metric_id separately in the failing metrics list
3.2 WHEN a device has both active and resolved entries for the same metric_id THEN the system SHALL CONTINUE TO show the metric in the appropriate section (active or resolved) based on its status
3.3 WHEN only one compliance upload exists per vertical for a device (no cross-vertical duplication) THEN the system SHALL CONTINUE TO display metrics unchanged with correct seen_count, first_seen, and last_seen values
3.4 WHEN the /items list endpoint is called with a team filter THEN the system SHALL CONTINUE TO return all devices for that team with their correct (now deduplicated) failing metrics and accurate seen_count values
3.5 WHEN persistUpload() builds compliance_snapshots for a team whose devices exist in only one vertical THEN per-team total_devices, compliant, non_compliant, and compliance_pct SHALL CONTINUE TO match their pre-fix values
3.6 WHEN /vcl/stats computes overall stats, donut categorization, heavy-hitters, per-team totals, and forecast-burndown for devices that exist in only one vertical THEN every field in the response SHALL CONTINUE TO match its pre-fix value
3.7 WHEN /mttr computes aging buckets for teams whose active items exist in only one vertical THEN per-team and total bucket counts SHALL CONTINUE TO match their pre-fix values
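Clauses 3.3 and 3.5 through 3.7 apply only to data with no cross-vertical overlap, so regression fixtures need a way to confirm they fall in that class. The sketch below is one hedged way to do that. hostsInMultipleVerticals is a hypothetical helper, and treating a null vertical (legacy upload) as its own distinct value is an assumption.

```javascript
// Return the hostnames that appear under more than one vertical.
// A null vertical (legacy upload) is treated as its own distinct value.
function hostsInMultipleVerticals(rows) {
  const verticalsByHost = new Map();
  for (const row of rows) {
    if (!verticalsByHost.has(row.hostname)) {
      verticalsByHost.set(row.hostname, new Set());
    }
    verticalsByHost.get(row.hostname).add(row.vertical ?? null);
  }
  const overlapping = [];
  for (const [hostname, verticals] of verticalsByHost) {
    if (verticals.size > 1) overlapping.push(hostname);
  }
  return overlapping;
}

// Single-vertical fixture: pre-fix and post-fix responses should match.
const singleVertical = [
  { hostname: "host-a", metric_id: "7.1.1", vertical: "NTS_AEO" },
  { hostname: "host-a", metric_id: "7.2.1", vertical: "NTS_AEO" },
];

// Adding a legacy row for the same host moves it into the bug-condition class.
const crossVertical = [
  ...singleVertical,
  { hostname: "host-a", metric_id: "7.1.1", vertical: null },
];

console.log(hostsInMultipleVerticals(singleVertical)); // empty list
console.log(hostsInMultipleVerticals(crossVertical)); // contains host-a
```

A regression test could then assert that the fixed endpoints return the same payload as the pre-fix code for any fixture where this list is empty.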