cve-dashboard/.kiro/specs/sync-anomaly-detection/design.md
root 6ee68f5521 Add sync anomaly detection, BU drift monitoring, and findings count investigation
- Add BU drift checker that classifies archived findings as BU reassignment,
  severity drift, closure, or decommission via unfiltered Ivanti API queries
- Add post-sync anomaly summary with significance threshold and classification
  breakdown stored in ivanti_sync_anomaly_log table
- Add per-finding BU tracking that detects BU changes across syncs and records
  them in ivanti_finding_bu_history table
- Add drift guard that skips trend history writes when total drops more than 50%
- Add CLOSED_GONE archive state for findings that vanish from the closed set
- Add anomaly banner UI on Vulnerability Triage page for significant sync changes
- Add API endpoints for anomaly latest/history and BU change tracking
- Add diagnostic scripts for drift checking and BU reassignment verification
- Add investigation document and xlsx export for the April 2026 BU reassignment
  incident where 109 findings were moved to SDIT-CSD-ITLS-PIES
- Migrations required: add_closed_gone_state.js, add_sync_anomaly_tables.js
2026-04-24 20:34:34 +00:00

# Design Document: Sync Anomaly Detection and BU Drift Monitoring
## Overview
This feature extends the Ivanti sync pipeline to automatically classify why findings disappear from filtered sync results. The current archive system detects disappearances but labels them all as `severity_score_drift` — a default that proved incorrect during the April 2026 incident where 109 findings silently disappeared due to a bulk BU reassignment.
The design adds three capabilities to the existing `ivantiFindings.js` sync pipeline:
1. **BU Drift Checker** — a post-sync step that queries the Ivanti API without BU/severity filters for newly archived finding IDs, classifying each disappearance as `bu_reassignment`, `severity_drift`, `closed_on_platform`, or `decommissioned`.
2. **Sync Anomaly Summary** — a structured report computed after each sync that breaks down count changes by cause and stores the result in a new `ivanti_sync_anomaly_log` table.
3. **Finding-Level BU Tracking** — per-finding BU comparison during `syncFindings()` that detects BU changes across syncs and records them in a new `ivanti_finding_bu_history` table.
The approach formalizes the ad-hoc diagnostic patterns from `drift-check.js` and `bu-reassignment-check.js` into the automated sync pipeline, with results surfaced through new API endpoints and an anomaly banner on the Vulnerability Triage page.
---
## Architecture
The feature integrates into the existing sync pipeline as post-sync steps, keeping the core sync logic unchanged. No new route modules are created — all new endpoints and logic live within the existing `ivantiFindings.js` module and its factory-pattern router.
```mermaid
flowchart TD
A[syncFindings - fetch all pages] --> B[Compare previous vs current findings]
B --> C[detectArchiveChanges - existing]
B --> D[BU comparison - new]
D --> E[Insert BU changes into ivanti_finding_bu_history]
C --> F[syncClosedCount - existing]
F --> G[detectClosedFindings - existing]
G --> H[detectClosedGoneFindings - existing]
H --> I[runBUDriftChecker - new]
I --> J[Batch unfiltered queries for newly archived IDs]
J --> K[Classify each: bu_reassignment / severity_drift / closed_on_platform / decommissioned]
K --> L[Update archive transition reasons]
L --> M[computeAnomalySummary - new]
M --> N[Insert row into ivanti_sync_anomaly_log]
style I fill:#F59E0B,color:#000
style D fill:#F59E0B,color:#000
style M fill:#F59E0B,color:#000
```
**Key design decisions:**
- **Post-sync, not inline**: The BU drift checker runs after all existing sync steps complete. This means a sync failure does not block drift checking of previously archived findings, and drift checking failures do not block the sync.
- **Same module, no new route file**: The anomaly and BU history endpoints are added to the existing `createIvantiFindingsRouter`. This keeps the Ivanti findings API surface in one place and avoids a new factory-pattern module for four endpoints.
- **Batched unfiltered queries**: Finding IDs are chunked into groups of 50 for the unfiltered Ivanti API call, matching the pattern proven in `bu-reassignment-check.js`. This stays within API limits while keeping the number of HTTP calls manageable.
- **BU comparison in syncFindings**: The per-finding BU comparison happens during the existing previous-vs-current comparison in `syncFindings()`, before the cache is overwritten. This is the only point where both the old and new BU values are available in memory.
---
## Components and Interfaces
### 1. BU Drift Checker (`runBUDriftChecker`)
A new async function added to `ivantiFindings.js` that runs after `detectClosedGoneFindings()` in the sync pipeline.
**Signature:**
```javascript
async function runBUDriftChecker(db, newlyArchivedIds, apiKey, clientId, skipTls)
```
**Parameters:**
- `db` — SQLite database instance
- `newlyArchivedIds` — array of finding ID strings that were newly archived in this sync cycle (from `detectArchiveChanges`)
- `apiKey`, `clientId`, `skipTls` — Ivanti API credentials (same as existing sync functions)
**Behavior:**
1. If `newlyArchivedIds` is empty, return immediately (no API calls).
2. Chunk the IDs into batches of 50.
3. For each batch, call `ivantiPost()` with a filter on `id` field only (no BU, severity, or state filters) — the same unfiltered query pattern used in `bu-reassignment-check.js`.
4. For each finding ID, classify the result:
   - **Found, BU differs from expected** → `bu_reassignment`
   - **Found, BU matches, severity < 8.5** → `severity_drift`
   - **Found, BU matches, severity >= 8.5, state is Closed** → `closed_on_platform`
   - **Not found** → `decommissioned`
5. Update the corresponding `ivanti_archive_transitions` row's `reason` field with the classification.
6. Return a classification summary object: `{ bu_reassignment: N, severity_drift: N, closed_on_platform: N, decommissioned: N }`.
**Expected BUs** are the same values used in `FINDINGS_FILTERS`: `NTS-AEO-ACCESS-ENG` and `NTS-AEO-STEAM`.
**Error handling:** If an individual batch API call fails, log the error and skip that batch. The findings in the failed batch retain their default `severity_score_drift` reason. The function never throws — it returns whatever partial results it collected.
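The classification rules above can be sketched as a pure function. This is a minimal illustration, not the module's actual code: the name `classifyArchivedFinding`, the `severity` and `state` property names, and the case-insensitive state comparison are assumptions. The expected BUs and the 8.5 cutoff are taken from the text.

```javascript
// Expected BUs and the severity cutoff come from the design text;
// the finding shape (buOwnership, severity, state) is assumed for illustration.
const EXPECTED_BUS = new Set(['NTS-AEO-ACCESS-ENG', 'NTS-AEO-STEAM']);
const SEVERITY_THRESHOLD = 8.5;

/**
 * Classify one newly archived finding.
 * @param {object|undefined} finding Result of the unfiltered lookup,
 *   or undefined when the API did not return the ID.
 * @returns {{classification: string, detail?: string|number}|null}
 */
function classifyArchivedFinding(finding) {
  if (!finding) {
    return { classification: 'decommissioned' };
  }
  if (!EXPECTED_BUS.has(finding.buOwnership)) {
    return { classification: 'bu_reassignment', detail: finding.buOwnership };
  }
  if (finding.severity < SEVERITY_THRESHOLD) {
    return { classification: 'severity_drift', detail: finding.severity };
  }
  if (String(finding.state).toLowerCase() === 'closed') {
    return { classification: 'closed_on_platform' };
  }
  // Not covered by the four rules: the caller keeps the default reason.
  return null;
}
```

Returning `null` for an uncovered combination (BU matches, severity at or above 8.5, not closed) is a choice made for this sketch; the design does not say how that case is handled.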
### 2. Anomaly Summary Computation (`computeAnomalySummary`)
A new async function that runs after the BU drift checker completes.
**Signature:**
```javascript
async function computeAnomalySummary(db, openCountDelta, closedCountDelta, newlyArchivedCount, returnedCount, classificationBreakdown)
```
**Parameters:**
- `db` — SQLite database instance
- `openCountDelta` — integer, current open count minus previous open count
- `closedCountDelta` — integer, current closed count minus previous closed count
- `newlyArchivedCount` — integer, number of findings archived in this sync
- `returnedCount` — integer, number of findings that returned in this sync
- `classificationBreakdown` — object from `runBUDriftChecker`, e.g. `{ bu_reassignment: 38, severity_drift: 5, ... }`
**Behavior:**
1. Determine `is_significant`: true if `newlyArchivedCount > 5`.
2. Insert a row into `ivanti_sync_anomaly_log` with all fields.
3. Log the summary to console.
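As a rough sketch of step 1 and the row assembled for step 2, the helper below builds the values to insert without touching the database. `buildAnomalyRow` is a hypothetical name; the column names follow the `ivanti_sync_anomaly_log` table described in this document.

```javascript
// Threshold from the design: a sync is significant when more than 5 findings were archived.
const SIGNIFICANCE_THRESHOLD = 5;

// Hypothetical helper: turns the computeAnomalySummary inputs into a row object.
function buildAnomalyRow({
  openCountDelta,
  closedCountDelta,
  newlyArchivedCount,
  returnedCount,
  classificationBreakdown,
}) {
  return {
    open_count_delta: openCountDelta,
    closed_count_delta: closedCountDelta,
    newly_archived_count: newlyArchivedCount,
    returned_count: returnedCount,
    // Stored as TEXT; an empty breakdown serializes to '{}'.
    classification_json: JSON.stringify(classificationBreakdown || {}),
    // SQLite has no boolean type, so the flag is stored as 1 or 0.
    is_significant: newlyArchivedCount > SIGNIFICANCE_THRESHOLD ? 1 : 0,
  };
}
```

Keeping the row construction separate from the insert makes the significance threshold and delta fields easy to check in isolation.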
### 3. Finding-Level BU Comparison
Integrated into `syncFindings()` between reading previous findings and writing the new cache. Uses the existing `previousFindings` and `allFindings` arrays.
**Logic:**
```
for each finding in allFindings:
    previousFinding = previousMap.get(finding.id)
    if previousFinding exists AND previousFinding.buOwnership !== finding.buOwnership
       AND both values are non-empty:
        INSERT into ivanti_finding_bu_history
```
The `buOwnership` field is already extracted by `extractFinding()` from `assetCustomAttributes['1550_host_1']`. No changes to `extractFinding()` are needed — it already stores `buOwnership` on each finding object.
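A runnable version of the comparison logic might look like the following. It returns the rows to insert rather than writing to the database. The function name `detectBuChanges` and the `title` and `hostName` property names are assumptions for illustration; only `id` and `buOwnership` are named in this document.

```javascript
// Hypothetical helper: compares previous and current findings and returns
// one change record per finding whose BU differs between syncs.
function detectBuChanges(previousFindings, allFindings) {
  const previousMap = new Map(previousFindings.map((f) => [f.id, f]));
  const changes = [];
  for (const finding of allFindings) {
    const previous = previousMap.get(finding.id);
    if (!previous) continue; // first appearance: nothing to compare against
    const oldBu = previous.buOwnership;
    const newBu = finding.buOwnership;
    if (!oldBu || !newBu) continue; // skip when either value is empty or undefined
    if (oldBu === newBu) continue; // unchanged
    changes.push({
      finding_id: String(finding.id),
      finding_title: finding.title || '', // assumed property name
      host_name: finding.hostName || '', // assumed property name
      previous_bu: oldBu,
      new_bu: newBu,
    });
  }
  return changes;
}
```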
### 4. New API Endpoints
All endpoints are added to the existing `createIvantiFindingsRouter` and require authentication via `requireAuth(db)`.
| Method | Path | Description |
|---|---|---|
| GET | `/api/ivanti/findings/anomaly/latest` | Returns the most recent anomaly summary row |
| GET | `/api/ivanti/findings/anomaly/history` | Returns anomaly history (last 30 or date-filtered) |
| GET | `/api/ivanti/findings/bu-changes` | Returns all BU change events, newest first |
| GET | `/api/ivanti/findings/:findingId/bu-history` | Returns BU change history for a specific finding |
**GET /anomaly/latest response:**
```json
{
"anomaly": {
"id": 1,
"sync_timestamp": "2026-04-24T12:00:00",
"open_count_delta": -45,
"closed_count_delta": -94,
"newly_archived_count": 45,
"returned_count": 0,
"classification": {
"bu_reassignment": 38,
"severity_drift": 1,
"closed_on_platform": 4,
"decommissioned": 2
},
"is_significant": true
}
}
```
Returns `{ anomaly: null }` if no anomaly records exist.
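One way to map a stored row to this response shape is sketched below. `toAnomalyResponse` is a hypothetical helper, and falling back to an empty object when `classification_json` cannot be parsed is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: converts an ivanti_sync_anomaly_log row (or undefined)
// into the /anomaly/latest response body.
function toAnomalyResponse(row) {
  if (!row) {
    return { anomaly: null };
  }
  let classification = {};
  try {
    classification = JSON.parse(row.classification_json || '{}');
  } catch (err) {
    classification = {}; // assumed fallback for a corrupt JSON column
  }
  return {
    anomaly: {
      id: row.id,
      sync_timestamp: row.sync_timestamp,
      open_count_delta: row.open_count_delta,
      closed_count_delta: row.closed_count_delta,
      newly_archived_count: row.newly_archived_count,
      returned_count: row.returned_count,
      classification,
      is_significant: row.is_significant === 1, // stored as 0/1, returned as boolean
    },
  };
}
```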
**GET /anomaly/history query parameters:**
- `from` (optional) — ISO date string, inclusive start
- `to` (optional) — ISO date string, inclusive end
- If neither provided, returns last 30 rows
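The parameter handling could be sketched as a small query builder. This is an illustration under assumptions: `buildHistoryQuery` and `isParsableDate` are hypothetical names, `Date.parse` is more lenient than a strict ISO check, and using a single valid bound on its own when the other is missing or invalid is one possible reading of the fallback rule.

```javascript
const DEFAULT_HISTORY_LIMIT = 30; // "last 30 rows" from the design

// Lenient validity check: accepts anything Date.parse can read.
function isParsableDate(value) {
  return typeof value === 'string' && value.length > 0 && !Number.isNaN(Date.parse(value));
}

// Hypothetical helper: builds a parameterized SQL query from req.query.
function buildHistoryQuery(query = {}) {
  const from = isParsableDate(query.from) ? query.from : null;
  const to = isParsableDate(query.to) ? query.to : null;

  if (!from && !to) {
    return {
      sql: 'SELECT * FROM ivanti_sync_anomaly_log ORDER BY sync_timestamp DESC LIMIT ?',
      params: [DEFAULT_HISTORY_LIMIT],
    };
  }

  const clauses = [];
  const params = [];
  if (from) {
    clauses.push('sync_timestamp >= ?');
    params.push(from);
  }
  if (to) {
    clauses.push('sync_timestamp <= ?');
    params.push(to);
  }
  return {
    sql: `SELECT * FROM ivanti_sync_anomaly_log WHERE ${clauses.join(' AND ')} ORDER BY sync_timestamp DESC`,
    params,
  };
}
```

The sketch passes the bounds through unchanged. Because the comparison is on stored text, a date-only `to` value such as `2026-04-24` would sort before any timestamp on that day, so an implementation that wants an inclusive end date would likely need to normalize the bound first.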
**GET /bu-changes response:**
```json
{
"changes": [
{
"id": 1,
"finding_id": "2687687777",
"finding_title": "OpenSSH regreSSHion",
"host_name": "syn-098-120-000-078",
"previous_bu": "NTS-AEO-STEAM",
"new_bu": "SDIT-CSD-ITLS-PIES",
"detected_at": "2026-04-24T12:00:00"
}
]
}
```
**GET /:findingId/bu-history response:**
```json
{
"finding_id": "2687687777",
"history": [
{
"previous_bu": "NTS-AEO-STEAM",
"new_bu": "SDIT-CSD-ITLS-PIES",
"detected_at": "2026-04-24T12:00:00"
}
]
}
```
### 5. Anomaly Banner Component (`AnomalyBanner.js`)
A new React component placed in `frontend/src/components/pages/AnomalyBanner.js`, rendered on the Vulnerability Triage page above the `IvantiCountsChart`.
**Props:** None — fetches its own data from `/api/ivanti/findings/anomaly/latest`.
**Behavior:**
1. On mount, fetch the latest anomaly summary.
2. If `is_significant` is false or no anomaly exists, render nothing.
3. If `is_significant` is true, render a warning banner with:
   - Amber background tint (`rgba(245, 158, 11, 0.15)`) with amber border (`rgba(245, 158, 11, 0.3)`)
   - `AlertTriangle` icon from lucide-react
   - Summary text: "45 findings archived — 38 BU reassignment, 5 severity drift, 2 decommissioned"
   - Expandable detail section (click to toggle) showing affected findings grouped by classification
   - Dismiss button (X icon) that hides the banner for the current session via `useState`
4. Uses monospace typography and dark theme colors per `DESIGN_SYSTEM.md`.
**Session dismiss:** Uses React state only — no localStorage. The banner reappears on page reload, which is appropriate since the anomaly data persists until the next sync produces a non-significant result.
```mermaid
stateDiagram-v2
[*] --> Loading: Component mounts
Loading --> Hidden: No anomaly or not significant
Loading --> Visible: Significant anomaly
Visible --> Expanded: Click breakdown text
Expanded --> Visible: Click breakdown text
Visible --> Dismissed: Click dismiss
Expanded --> Dismissed: Click dismiss
Dismissed --> [*]
```
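The banner's summary line could be produced by a small formatting helper kept separate from the React component. The sketch below is an assumption-laden illustration: `formatAnomalySummary` is a hypothetical name, the label for `closed_on_platform` is invented here, and omitting zero-count categories is inferred from the example summary text rather than stated in the design.

```javascript
// Display labels; only the first, second and fourth appear in the design's example text.
const CLASSIFICATION_LABELS = [
  ['bu_reassignment', 'BU reassignment'],
  ['severity_drift', 'severity drift'],
  ['closed_on_platform', 'closed on platform'], // assumed label
  ['decommissioned', 'decommissioned'],
];

// Returns the banner text, or null when the banner should render nothing.
function formatAnomalySummary(anomaly) {
  if (!anomaly || !anomaly.is_significant) {
    return null;
  }
  const counts = anomaly.classification || {};
  const parts = CLASSIFICATION_LABELS
    .filter(([key]) => (counts[key] || 0) > 0)
    .map(([key, label]) => `${counts[key]} ${label}`);
  const head = `${anomaly.newly_archived_count} findings archived`;
  // \u2014 is the dash used in the design's example summary text.
  return parts.length > 0 ? `${head} \u2014 ${parts.join(', ')}` : head;
}
```

A component could call this in its render path and return `null` when the helper does, which matches the Hidden state in the diagram above.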
### 6. Migration Script
Located at `backend/migrations/add_sync_anomaly_tables.js`. Uses the same pattern as existing migrations (`add_closed_gone_state.js`): standalone Node script, opens the database directly, uses `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` for idempotency.
---
## Data Models
### New Table: `ivanti_sync_anomaly_log`
Stores one row per sync cycle with the anomaly summary.
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique row identifier |
| `sync_timestamp` | DATETIME | NOT NULL DEFAULT CURRENT_TIMESTAMP | When the sync completed |
| `open_count_delta` | INTEGER | NOT NULL DEFAULT 0 | Current open count minus previous open count |
| `closed_count_delta` | INTEGER | NOT NULL DEFAULT 0 | Current closed count minus previous closed count |
| `newly_archived_count` | INTEGER | NOT NULL DEFAULT 0 | Number of findings archived in this sync |
| `returned_count` | INTEGER | NOT NULL DEFAULT 0 | Number of findings that returned in this sync |
| `classification_json` | TEXT | NOT NULL DEFAULT '{}' | JSON object: `{ bu_reassignment, severity_drift, closed_on_platform, decommissioned }` |
| `is_significant` | INTEGER | NOT NULL DEFAULT 0 | 1 if `newly_archived_count > 5`, else 0 |
| `created_at` | DATETIME | DEFAULT CURRENT_TIMESTAMP | Row creation timestamp |
**Indexes:**
- `idx_anomaly_sync_timestamp` on `sync_timestamp` — for efficient latest-record and date-range queries
### New Table: `ivanti_finding_bu_history`
Stores BU change events detected during sync.
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT | Unique row identifier |
| `finding_id` | TEXT | NOT NULL | Ivanti finding identifier |
| `finding_title` | TEXT | NOT NULL DEFAULT '' | Finding title at time of detection |
| `host_name` | TEXT | NOT NULL DEFAULT '' | Host name at time of detection |
| `previous_bu` | TEXT | NOT NULL | BU value from previous sync |
| `new_bu` | TEXT | NOT NULL | BU value from current sync |
| `detected_at` | DATETIME | NOT NULL DEFAULT CURRENT_TIMESTAMP | When the change was detected |
| `created_at` | DATETIME | DEFAULT CURRENT_TIMESTAMP | Row creation timestamp |
**Indexes:**
- `idx_bu_history_finding_id` on `finding_id` — for per-finding history lookups
- `idx_bu_history_detected_at` on `detected_at` — for chronological queries
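The two tables and three indexes above translate into DDL along these lines. This is a sketch, not the contents of `add_sync_anomaly_tables.js`: the `runMigration` function and its `exec` callback are hypothetical, standing in for whichever SQLite driver call the existing migrations use.

```javascript
// DDL derived from the column tables above; every statement is idempotent.
const MIGRATION_STATEMENTS = [
  `CREATE TABLE IF NOT EXISTS ivanti_sync_anomaly_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sync_timestamp DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    open_count_delta INTEGER NOT NULL DEFAULT 0,
    closed_count_delta INTEGER NOT NULL DEFAULT 0,
    newly_archived_count INTEGER NOT NULL DEFAULT 0,
    returned_count INTEGER NOT NULL DEFAULT 0,
    classification_json TEXT NOT NULL DEFAULT '{}',
    is_significant INTEGER NOT NULL DEFAULT 0,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
  )`,
  `CREATE INDEX IF NOT EXISTS idx_anomaly_sync_timestamp
    ON ivanti_sync_anomaly_log (sync_timestamp)`,
  `CREATE TABLE IF NOT EXISTS ivanti_finding_bu_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    finding_id TEXT NOT NULL,
    finding_title TEXT NOT NULL DEFAULT '',
    host_name TEXT NOT NULL DEFAULT '',
    previous_bu TEXT NOT NULL,
    new_bu TEXT NOT NULL,
    detected_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
  )`,
  `CREATE INDEX IF NOT EXISTS idx_bu_history_finding_id
    ON ivanti_finding_bu_history (finding_id)`,
  `CREATE INDEX IF NOT EXISTS idx_bu_history_detected_at
    ON ivanti_finding_bu_history (detected_at)`,
];

// `exec` is a hypothetical callback that runs one SQL statement
// against an open database connection.
function runMigration(exec) {
  for (const sql of MIGRATION_STATEMENTS) {
    exec(sql);
  }
  return MIGRATION_STATEMENTS.length;
}
```

Because each statement uses `IF NOT EXISTS`, calling `runMigration` a second time against the same database should be a no-op.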
### Modified: `ivanti_archive_transitions.reason` field
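No schema change is needed: the `reason` column is already `TEXT NOT NULL DEFAULT ''`. Only the values written to it change. The table lists the previous and new value sets side by side; its rows are not one-to-one mappings.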
| Previous values | New values |
|---|---|
| `severity_score_drift` | `bu_reassignment:<new_bu>` |
| `reappeared_in_sync` | `severity_drift:<new_severity>` |
| `remediated_in_ivanti` | `closed_on_platform` |
| `disappeared_from_closed_set` | `decommissioned` |
Existing rows with `severity_score_drift` are not modified — the enhanced reasons apply only to transitions created after deployment.
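The new reason strings could be produced by a formatter like the one below. `formatArchiveReason` is a hypothetical name, and returning the existing default for an unrecognized classification is an assumption made for this sketch.

```javascript
const DEFAULT_REASON = 'severity_score_drift'; // existing default named in this document

// Hypothetical helper: builds the ivanti_archive_transitions.reason value.
// `detail` is the new BU for bu_reassignment and the new severity for severity_drift.
function formatArchiveReason(classification, detail) {
  switch (classification) {
    case 'bu_reassignment':
      return `bu_reassignment:${detail}`;
    case 'severity_drift':
      return `severity_drift:${detail}`;
    case 'closed_on_platform':
    case 'decommissioned':
      return classification; // fixed strings, no suffix
    default:
      return DEFAULT_REASON; // assumed fallback
  }
}
```

For example, `formatArchiveReason('bu_reassignment', 'SDIT-CSD-ITLS-PIES')` yields `bu_reassignment:SDIT-CSD-ITLS-PIES`.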
### Existing: `ivanti_findings_cache.findings_json`
No schema change. The `buOwnership` field is already present in each finding object within the JSON array, extracted by `extractFinding()` from `assetCustomAttributes['1550_host_1']`.
---
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Classification correctness
*For any* finding returned by an unfiltered Ivanti API query, the BU drift classifier SHALL produce the correct classification based on the combination of BU value, severity, and state:
- BU not in {NTS-AEO-ACCESS-ENG, NTS-AEO-STEAM} → `bu_reassignment`
- BU matches expected, severity < 8.5 → `severity_drift`
- BU matches expected, severity >= 8.5, state is Closed → `closed_on_platform`
- Finding not returned by API → `decommissioned`
**Validates: Requirements 1.2, 1.3, 1.4, 1.5**
### Property 2: Archive transition reason formatting
*For any* classification result, the archive transition reason field SHALL be formatted correctly:
- `bu_reassignment` classification with BU value B → reason is `bu_reassignment:B`
- `severity_drift` classification with severity S → reason is `severity_drift:S`
- `closed_on_platform` → reason is `closed_on_platform`
- `decommissioned` → reason is `decommissioned`
**Validates: Requirements 6.1, 6.2, 6.3, 6.4**
### Property 3: Batch size constraint
*For any* list of finding IDs of length N, the BU drift checker SHALL partition them into ceil(N/50) batches where each batch contains at most 50 IDs and the union of all batches equals the original list.
**Validates: Requirements 1.7**
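A chunking helper that satisfies this property can be written in a few lines. The name `chunkIds` is hypothetical; the batch size of 50 comes from the design.

```javascript
const BATCH_SIZE = 50; // batch size from the design

// Splits ids into consecutive batches of at most `size` elements,
// preserving order and dropping nothing.
function chunkIds(ids, size = BATCH_SIZE) {
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}
```

For the 109 findings from the April 2026 incident, this would give three batches of 50, 50, and 9 IDs.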
### Property 4: Significance threshold
*For any* non-negative integer `newly_archived_count`, the anomaly summary's `is_significant` flag SHALL be true if and only if `newly_archived_count > 5`.
**Validates: Requirements 2.7**
### Property 5: Count delta computation
*For any* pair of non-negative integers (previous_count, current_count), the anomaly summary SHALL compute the delta as `current_count - previous_count` for both open and closed counts.
**Validates: Requirements 2.1**
### Property 6: BU extraction preservation
*For any* raw Ivanti finding object with a non-empty `assetCustomAttributes['1550_host_1']` array, `extractFinding` SHALL produce a finding object whose `buOwnership` field equals the first element of that array.
**Validates: Requirements 3.1**
### Property 7: BU change detection and recording
*For any* finding that appears in both the previous and current sync results with different non-empty `buOwnership` values, the sync pipeline SHALL insert exactly one row into `ivanti_finding_bu_history` with the correct `finding_id`, `previous_bu`, and `new_bu`. *For any* finding that appears for the first time (no previous entry) or has the same BU value, no history row SHALL be inserted.
**Validates: Requirements 3.2, 3.3, 3.6**
### Property 8: Latest anomaly returns most recent
*For any* non-empty sequence of anomaly summary rows with distinct timestamps, the `/anomaly/latest` endpoint SHALL return the row with the maximum `sync_timestamp`.
**Validates: Requirements 2.5**
### Property 9: Anomaly history ordering and limit
*For any* set of N anomaly summary rows, the `/anomaly/history` endpoint (without date parameters) SHALL return min(N, 30) rows ordered by `sync_timestamp` descending.
**Validates: Requirements 2.6, 7.2**
### Property 10: Date-range filtering with complete response shape
*For any* date range [from, to] and set of anomaly summary rows, the `/anomaly/history` endpoint SHALL return only rows whose `sync_timestamp` falls within the range (inclusive), ordered by `sync_timestamp` descending. Each returned row SHALL include `sync_timestamp`, `open_count_delta`, `closed_count_delta`, `newly_archived_count`, `returned_count`, `classification` (parsed as an object from `classification_json`), and `is_significant`.
**Validates: Requirements 7.1, 7.4**
### Property 11: BU changes endpoint ordering
*For any* set of BU change history rows, the `/bu-changes` endpoint SHALL return all rows ordered by `detected_at` descending.
**Validates: Requirements 3.4**
### Property 12: Per-finding BU history filtering
*For any* finding ID F and set of BU history rows across multiple findings, the `/:findingId/bu-history` endpoint SHALL return only rows where `finding_id = F`, ordered by `detected_at` descending.
**Validates: Requirements 3.5**
---
## Error Handling
### BU Drift Checker Errors
- **Individual batch API failure**: Log the error with the batch range, skip the batch, continue with remaining batches. Findings in the failed batch retain the default `severity_score_drift` reason. The function returns partial results.
- **All batches fail**: The classification breakdown will be all zeros. The anomaly summary is still written with `newly_archived_count` reflecting the archive detection results (which don't depend on the drift checker).
- **API timeout**: The existing 15-second timeout in `ivantiPost()` applies. Timed-out batches are treated as failed batches.
- **Malformed API response**: If `JSON.parse` fails on the response body, treat the batch as failed. Log the raw response length for debugging.
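The skip-and-continue behavior described above can be sketched as a loop that catches per-batch failures. This is an illustration only: `fetchInBatches` is a hypothetical name, and `fetchBatch` is an assumed wrapper around `ivantiPost()` that resolves to an array of finding objects and rejects on HTTP errors, timeouts, or unparseable responses.

```javascript
// Runs fetchBatch for each batch in order, never throwing.
// Returns the findings that were retrieved plus the IDs from failed batches,
// so callers can tell "not found" apart from "not checked".
async function fetchInBatches(batches, fetchBatch, log = console.error) {
  const found = new Map();
  const failedIds = [];
  for (const [index, batch] of batches.entries()) {
    try {
      const findings = await fetchBatch(batch);
      for (const finding of findings) {
        found.set(String(finding.id), finding);
      }
    } catch (err) {
      log(`[bu-drift] batch ${index + 1}/${batches.length} failed: ${err.message}`);
      failedIds.push(...batch);
    }
  }
  return { found, failedIds };
}
```

Tracking `failedIds` separately matters because an ID missing from `found` otherwise looks the same as a decommissioned finding; IDs in failed batches should keep the default `severity_score_drift` reason instead.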
### Anomaly Summary Errors
- **Database write failure**: Log the error. The sync itself has already completed successfully — the anomaly summary is informational. Do not retry.
- **Missing previous counts**: If no previous anomaly row exists (first sync after deployment), use 0 for previous counts. The first anomaly row will have deltas equal to the current counts.
### BU Comparison Errors
- **Database insert failure**: Log the error for the specific finding, continue processing remaining findings. BU comparison failures are non-fatal.
- **Missing buOwnership field**: If either the previous or current finding has an empty/undefined `buOwnership`, skip the comparison for that finding (per requirement 3.6).
### API Endpoint Errors
- **Database read failure**: Return 500 with a generic error message. Do not expose internal error details.
- **Invalid date parameters**: If `from` or `to` are not valid ISO date strings, ignore them and fall back to the default last-30 behavior. Log a warning.
- **Authentication failure**: Handled by existing `requireAuth(db)` middleware — returns 401.
### Migration Errors
- **Table already exists**: `CREATE TABLE IF NOT EXISTS` handles this silently.
- **Index already exists**: `CREATE INDEX IF NOT EXISTS` handles this silently.
- **Database locked**: The migration script opens its own connection. If the server is running, SQLite's WAL mode allows concurrent reads. If a write lock conflict occurs, the migration will fail with a clear error message and can be retried.
---
## Testing Strategy
### Property-Based Tests
Property-based testing is appropriate for this feature because the core logic involves classification functions, data transformations, and query behaviors that have clear input/output relationships and universal properties.
**Library:** [fast-check](https://github.com/dubzzz/fast-check), a widely used property-based testing (PBT) library for JavaScript/Node.js.
**Configuration:**
- Minimum 100 iterations per property test
- Each test tagged with: `Feature: sync-anomaly-detection, Property {N}: {title}`
**Properties to implement:**
| Property | Test approach |
|---|---|
| 1: Classification correctness | Generate random {bu, severity, state, found} tuples, verify classifier output |
| 2: Reason formatting | Generate random classification results, verify reason string format |
| 3: Batch size constraint | Generate random-length ID arrays, verify chunking |
| 4: Significance threshold | Generate random integers, verify is_significant flag |
| 5: Delta computation | Generate random count pairs, verify subtraction |
| 6: BU extraction | Generate random raw finding objects, verify buOwnership extraction |
| 7: BU change detection | Generate random previous/current finding pairs, verify history insertion |
| 8-12: API query properties | Generate random DB state, verify endpoint responses |
### Unit Tests (Example-Based)
Unit tests cover specific scenarios, edge cases, and integration points not suited for PBT:
- **Migration idempotency**: Run migration twice, verify no errors on second run (Req 4.6)
- **API error resilience**: Mock `ivantiPost` to return errors, verify drift checker doesn't throw (Req 1.6)
- **Anomaly banner rendering**: Mock API response, verify banner shows/hides based on `is_significant` (Req 5.2, 5.3)
- **Banner dismiss**: Click dismiss button, verify banner hidden (Req 5.4)
- **Banner expand/collapse**: Click breakdown text, verify detail section toggles (Req 5.7)
- **Authentication enforcement**: Unauthenticated requests return 401 (Req 7.3)
- **Fixed reason strings**: Verify `decommissioned` and `closed_on_platform` are exact strings (Req 6.3, 6.4)
- **Backward compatibility**: Existing `severity_score_drift` rows are not modified (Req 6.5)
### Integration Tests
- **End-to-end sync with drift checker**: Mock Ivanti API, run full sync pipeline, verify anomaly log and BU history tables are populated correctly
- **API endpoint responses**: Seed database, call each endpoint, verify response shape and content
### Test File Locations
- `backend/__tests__/bu-drift-classification.property.test.js` — Properties 1-6
- `backend/__tests__/anomaly-api.property.test.js` — Properties 7-12
- `backend/__tests__/sync-anomaly-detection.test.js` — Unit and integration tests
- `frontend/src/components/pages/__tests__/AnomalyBanner.test.js` — UI component tests