feat: RSO observation system, child safety, Discord adapter, Telegram watchdog, email attachments
Core agent improvements: - RSO (Relevance Scoring & Observation) system: interaction_logger, memory_scorer, signal_detector - Memory access logging (memory_access_log table) for relevance scoring; high-signal turn detection - Rich conversation storage for notable turns; compact_conversation truncates long user messages - Task-type classifier (query/action/analysis/creative) for observation tagging - Nested sub-agent visibility: deep delegations now register against the main agent's manager Child safety (Gabriel profile): - child_safety.py: filtering, audit logging, prompt constants for restricted sessions - .kiro/specs/child-safety-profile: requirements, design, tasks specs - GABRIEL_BOT_PROPOSAL.md: initial proposal doc - Reduced context window (10 msgs) and tutor-mode identity for restricted users Telegram adapter: - Polling watchdog: auto-restarts updater if polling drops unexpectedly - get_me() with exponential-backoff retry on NetworkError at startup - Correct stop() ordering: signal watchdog before cancelling tasks Email / Gmail: - send_email: supports file attachments (attachments list param) - get_email: surfaces attachment metadata in response Scheduled tasks / weather: - Remove OpenWeatherMap API calls from morning-weather task; use wttr.in exclusively - New scheduled tasks and scheduler state persistence Discord: - adapters/discord/__init__.py scaffold - discord-plugin: MCP plugin for Claude Code Discord integration (server.ts, skills, config) Infrastructure: - n8n workflow exports (garvis_webhook, content_pipeline variants) - memory_workspace: context, homelab-repo-updates, weekly observation summaries, error logs - UCS C240 migration plan doc - requirements.txt: new deps - .claude/settings.json, fix_hooks.py: hook/permission tuning
This commit is contained in:
563
.kiro/specs/child-safety-profile/design.md
Normal file
563
.kiro/specs/child-safety-profile/design.md
Normal file
@@ -0,0 +1,563 @@
|
||||
# Design — Child Safety Profile
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
The feature is implemented as a self-contained module (`child_safety.py`) that hooks into three
|
||||
existing extension points in the runtime, plus a new dedicated audit logger. No core agent logic
|
||||
is restructured — all changes are additive.
|
||||
|
||||
```
|
||||
Slack ──► SlackAdapter ──► AdapterRuntime
|
||||
│
|
||||
[preprocessors]
|
||||
│
|
||||
ChildSafetyFilter.preprocess()
|
||||
│
|
||||
── BLOCKED? ──► AuditLogger (blocked entry)
|
||||
│ └──► safe reply to user
|
||||
PASS
|
||||
│
|
||||
Agent.chat()
|
||||
│
|
||||
_build_system_prompt()
|
||||
│
|
||||
injects guardrail block
|
||||
(if username in RESTRICTED_USERS)
|
||||
│
|
||||
LLM call
|
||||
│
|
||||
[postprocessors]
|
||||
│
|
||||
ChildSafetyFilter.postprocess()
|
||||
│
|
||||
── FLAGGED? ──► AuditLogger (flagged entry)
|
||||
│ └──► safe fallback reply
|
||||
CLEAN
|
||||
│
|
||||
AuditLogger (allowed entry)
|
||||
│
|
||||
reply to user
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## New Files
|
||||
|
||||
### `child_safety.py`
|
||||
The main module. Contains three classes:
|
||||
|
||||
**`ChildSafetyConfig`**
|
||||
- Loaded once at startup from `config/adapters.local.yaml` (`child_safety` block)
|
||||
- Fields: `restricted_users: list[str]`, `audit_retention_days: int`
|
||||
- Exposes `is_restricted(username: str) -> bool`
|
||||
|
||||
**`ChildSafetyFilter`**
|
||||
- Stateless filter with two public methods: `preprocess()` and `postprocess()`
|
||||
- Holds compiled regex patterns (compiled once at import, not per-message)
|
||||
- `preprocess(message: InboundMessage) -> tuple[InboundMessage | None, str | None]`
|
||||
- Returns `(message, None)` to pass through
|
||||
- Returns `(None, reply_text)` to block with a safe response
|
||||
- `postprocess(response: str, message: InboundMessage) -> str`
|
||||
- Returns response unchanged if clean
|
||||
- Returns safe fallback string if flagged
|
||||
|
||||
**`ChildAuditLogger`**
|
||||
- Writes to `memory_workspace/audit/{username}/YYYY-MM-DD.jsonl`
|
||||
- Non-blocking: uses daemon background threads (same pattern as `InteractionLogger`)
|
||||
- `log(username, message, action, reason, response)` — single public method
|
||||
- `cleanup_old_logs(retention_days)` — called at startup
|
||||
|
||||
### `memory_workspace/users/gabriel.md`
|
||||
Per-user profile injected into the system prompt. Contains:
|
||||
- Age, interests, learning context
|
||||
- Communication style preferences (patient, encouraging, use examples)
|
||||
- Does NOT contain guardrail rules (those are in the injected guardrail block)
|
||||
|
||||
---
|
||||
|
||||
## Modified Files
|
||||
|
||||
### `agent.py` — `_build_system_prompt()` (line 488)
|
||||
Add a conditional block after the existing user profile injection:
|
||||
|
||||
```python
|
||||
if self._child_safety and self._child_safety.config.is_restricted(username):
|
||||
system_parts.append(CHILD_GUARDRAIL_BLOCK)
|
||||
```
|
||||
|
||||
`CHILD_GUARDRAIL_BLOCK` is a module-level constant defined in `child_safety.py` and imported.
|
||||
It is a multi-paragraph instruction block — see Content Design section below.
|
||||
|
||||
The Agent is also given a reference to the `ChildSafetyConfig` at `__init__` time so it can
|
||||
check `is_restricted()` without re-reading config on every turn.
|
||||
|
||||
### `adapters/runtime.py` — `AdapterRuntime.__init__()`
|
||||
After constructing the runtime, register the child safety pre/postprocessors:
|
||||
|
||||
```python
|
||||
from child_safety import ChildSafetyFilter, ChildAuditLogger
|
||||
_filter = ChildSafetyFilter(config, audit_logger)
|
||||
self.add_preprocessor(_filter.preprocess_adapter)
|
||||
self.add_postprocessor(_filter.postprocess_adapter)
|
||||
```
|
||||
|
||||
The `preprocess_adapter` and `postprocess_adapter` methods wrap the core filter methods with
|
||||
the `InboundMessage` signature the runtime expects:
|
||||
- Preprocessor signature: `(InboundMessage) -> InboundMessage`
|
||||
- Postprocessor signature: `(str, InboundMessage) -> str`
|
||||
|
||||
When the preprocessor blocks a message, it mutates the `InboundMessage` to signal a block
|
||||
by returning a sentinel message (or raises a handled exception that the runtime catches and
|
||||
converts to a direct reply). **Decision: use sentinel pattern** — set a special field on the
|
||||
message rather than raising, to keep the runtime's error handling clean.
|
||||
|
||||
> **Alternative considered**: Returning `None` from the preprocessor to signal "send the canned
|
||||
> reply and skip the agent". This would require a runtime change to handle `None`. The sentinel
|
||||
> approach avoids that. The runtime already supports early-exit via postprocessors returning
|
||||
> a replacement string — we can use a similar mechanism.
|
||||
|
||||
**Simpler approach (chosen):** The preprocessor returns a modified `InboundMessage` with its
|
||||
`text` replaced by a special internal sentinel. A postprocessor immediately before delivery
|
||||
detects the sentinel and replaces it with the safe reply text. The audit log entry is written
|
||||
by the preprocessor at block time.
|
||||
|
||||
### `config/adapters.local.yaml`
|
||||
Add the `child_safety` config block (parent configures manually):
|
||||
|
||||
```yaml
|
||||
child_safety:
|
||||
restricted_users:
|
||||
- son
|
||||
audit_retention_days: 365
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Filtering Logic Design
|
||||
|
||||
### Input Filter — Intent Pattern Matching
|
||||
|
||||
Patterns are compiled regex, evaluated in order. First match wins.
|
||||
|
||||
**Step 1 — Hard block (always, regardless of context)**
|
||||
These patterns are checked first with no context exemption:
|
||||
|
||||
| Pattern intent | Example match |
|
||||
|---|---|
|
||||
| Sexual/explicit content requests | "send me explicit", "show me porn" |
|
||||
| Real-world harm with named victim | "how do I hurt my [classmate/sister/teacher]" |
|
||||
| Requests for personal identifying info | "what is [person]'s address/phone/school" |
|
||||
| Self-harm | "how do I hurt myself", "how do I kill myself" |
|
||||
|
||||
**Step 2 — Check for game dev context signals**
|
||||
If any of the following terms appear in the message, skip Step 3 entirely:
|
||||
|
||||
```python
|
||||
GAME_CONTEXT_SIGNALS = [
|
||||
r"\bin (my |the |a )?(game|roblox|studio|script|map|level|world)\b",
|
||||
r"\b(lua|roblox|studio|npc|hitbox|raycast|humanoid|workspace|basepart)\b",
|
||||
r"\b(code|script|function|method|module|class|variable|loop|event)\b",
|
||||
r"\b(damage|health|respawn|spawn|kill|destroy)\b.*\b(player|npc|enemy|mob|character)\b",
|
||||
r"\bhow (do i|to|can i) (make|get|set|add|create|implement|build)\b",
|
||||
]
|
||||
```
|
||||
|
||||
**Step 3 — Conditional block (only if no game dev context)**
|
||||
Only evaluated if Step 2 found no context signals:
|
||||
|
||||
| Pattern intent | Example match |
|
||||
|---|---|
|
||||
| Real-world weapon harm | "how do I use a knife to hurt/cut/stab" |
|
||||
| Real-world weapon construction | "how do I build/make a real gun/bomb" |
|
||||
| Violence against real targets | "how do I fight/attack/beat up [person]" |
|
||||
| Explicit drug content | "how do I get/use/buy [drug]" |
|
||||
|
||||
**Step 4 — Pass through**
|
||||
No patterns matched → message is allowed.
|
||||
|
||||
### Output Filter — Response Scan
|
||||
|
||||
Lighter touch. Scans the LLM response for:
|
||||
- Explicit sexual language (small set of explicit terms only)
|
||||
- Actual step-by-step real-world harm instructions (e.g., numbered steps to build a weapon)
|
||||
- Profanity above a threshold (configurable word list)
|
||||
|
||||
If flagged → replace entire response with:
|
||||
> "I ran into a bit of a snag answering that one. Try asking me a different way, or ask about
|
||||
> something else — I'm great at Lua scripting and Roblox game design!"
|
||||
|
||||
---
|
||||
|
||||
## Guardrail Block Content Design
|
||||
|
||||
Injected at the end of the system prompt for all restricted users. Contains two sections:
|
||||
safety rules and teaching approach.
|
||||
|
||||
```
|
||||
=== CHILD SAFE MODE ===
|
||||
You are talking to Gabriel, a 13-year-old who is learning game development and Lua scripting.
|
||||
Your role is educator and mentor — not answer key.
|
||||
|
||||
--- CONTENT RULES ---
|
||||
|
||||
ALWAYS ENCOURAGED:
|
||||
- Lua scripting, Roblox Studio mechanics, game physics
|
||||
- Horror game design: atmosphere, enemy AI, jump scares, damage systems
|
||||
- Weapon mechanics IN GAMES: hitboxes, shooting mechanics, damage values, animations
|
||||
- General coding concepts, algorithms, creative writing, school subjects
|
||||
|
||||
NEVER ALLOWED — refuse politely, no explanation of why:
|
||||
- Real-world instructions for harming people or animals
|
||||
- How to build, obtain, or use actual weapons
|
||||
- Sexual or romantic content of any kind
|
||||
- Explicit language or profanity
|
||||
- Sharing or asking for real personal information
|
||||
|
||||
GRAY AREA RULE: If a question mentions weapons, violence, or dangerous topics AND there is any
|
||||
reasonable game/educational interpretation — assume game context and help enthusiastically.
|
||||
Only refuse if the request is unambiguously real-world harm with no plausible game framing.
|
||||
|
||||
--- TEACHING APPROACH ---
|
||||
|
||||
Your goal is to build Gabriel's skills and confidence over time, not to hand him answers.
|
||||
Use this approach every time:
|
||||
|
||||
1. ASSESS FIRST (for non-trivial questions): Before diving in, ask what he's already tried
|
||||
or what he thinks might work. Skip this for simple factual lookups ("what does pairs() do?").
|
||||
|
||||
2. BREAK IT DOWN: Split the problem into smaller steps. Guide through one step at a time.
|
||||
"Let's start with just getting the bullet to appear — we'll worry about damage after."
|
||||
|
||||
3. CODE + EXPLANATION always together: When you show code, explain what each meaningful
|
||||
part does in plain language immediately after. Never a bare code block with no context.
|
||||
Ask "does that make sense?" or "what do you think this line is doing?" after showing it.
|
||||
|
||||
4. LEAVE SOMETHING FOR HIM: After giving an example, leave one small piece for Gabriel to
|
||||
write himself. "I've done the shooting part — can you add the check for ammo count?"
|
||||
|
||||
5. GUIDE THE DEBUG, DON'T SOLVE IT: When he shares broken code, point him toward the
|
||||
area with the issue rather than fixing it directly.
|
||||
"Look at what your variable is on the third loop — what's it equal to at that point?"
|
||||
|
||||
6. CELEBRATE THE ATTEMPT: Always acknowledge what's working before addressing what isn't.
|
||||
"The loop structure is solid — that's the tricky bit. Just one small fix needed here."
|
||||
|
||||
7. CONNECT TO PAST WORK: When a new concept resembles something covered before, say so.
|
||||
"This is the same idea as the enemy spawner loop — same structure, different purpose."
|
||||
|
||||
8. DIRECT ANSWERS are fine for: simple factual questions, API lookups, syntax checks,
|
||||
"what does X do?" questions. Only apply the full teaching approach for problem-solving.
|
||||
|
||||
9. AI LITERACY — teach him to use you well (weave in naturally, never lecture):
|
||||
- When he asks something vague, model good question structure before answering:
|
||||
"Just checking — you want the damage to apply on touch, or only when the enemy attacks?"
|
||||
- When context runs out, explain it plainly:
|
||||
"I can only hold so much conversation in memory. Next session, remind me what you're
|
||||
building and I'll be right back up to speed."
|
||||
- Teach the ideal coding question format when the moment comes up naturally:
|
||||
"Next time: what your code does now + what you want + what you've tried = fastest answer."
|
||||
- Flag your assumptions so he learns to spot ambiguity:
|
||||
"I'm assuming this resets on respawn — let me know if that's not what you meant."
|
||||
|
||||
RESPONSE LENGTH: Keep responses focused. Step-by-step means one step at a time — don't
|
||||
front-load everything. Short, clear, then wait for his response before continuing.
|
||||
|
||||
TONE: Enthusiastic, encouraging, patient. Short sentences. No jargon without explanation.
|
||||
Talk to him like a smart friend who happens to know a lot about game dev, not like a textbook.
|
||||
=== END CHILD SAFE MODE ===
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Token Optimization Design
|
||||
|
||||
### Problem
|
||||
Gabriel shares the same API token pool as Jordan. Every Gabriel turn currently injects:
|
||||
- `SOUL.md` — Garvis homelab persona (~935 tokens, ~3,740 bytes) — irrelevant
|
||||
- `context.md` — SSH hosts, Proxmox inventory (~227 tokens, ~909 bytes) — irrelevant
|
||||
- Hybrid memory search (5 chunks) — Jordan's homelab memories — irrelevant
|
||||
- 20-message history window — same cap as an admin session
|
||||
|
||||
Estimated dead weight: **~1,500–1,800 tokens per turn** before Gabriel types a word.
|
||||
|
||||
### Solution: Restricted-User System Prompt Builder
|
||||
|
||||
In `_build_system_prompt()`, add a branch for restricted users that replaces the standard
|
||||
assembly with a stripped-down version:
|
||||
|
||||
```python
|
||||
if child_safety_config and child_safety_config.is_restricted(username):
|
||||
return _build_child_system_prompt(username, user_profile, guardrail_block)
|
||||
```
|
||||
|
||||
**`_build_child_system_prompt()`** assembles only:
|
||||
1. `CHILD_TUTOR_IDENTITY` — a ~100-token constant replacing SOUL.md (see below)
|
||||
2. `user_profile` — gabriel.md (relevant, kept)
|
||||
3. `CHILD_GUARDRAIL_BLOCK` — safety + teaching rules (relevant, kept)
|
||||
4. Tool capability line — minimal version, omit delegation instructions
|
||||
|
||||
What is **explicitly skipped**:
|
||||
- `get_soul()` — SOUL.md not read at all
|
||||
- `get_context()` — context.md not read at all
|
||||
- `search_hybrid()` — memory search not called
|
||||
- Delegation/sub-agent instructions block
|
||||
|
||||
### `CHILD_TUTOR_IDENTITY` Constant (~100 tokens)
|
||||
|
||||
Replaces the full SOUL.md for Gabriel's sessions:
|
||||
|
||||
```
|
||||
You are a coding mentor and game development tutor. You help Gabriel — a 13-year-old building
|
||||
Roblox games in Lua — learn to code and think like a developer. You are not a general-purpose
|
||||
assistant; for this session, your entire focus is helping Gabriel build skills and create games.
|
||||
```
|
||||
|
||||
### History Window Reduction
|
||||
|
||||
`_get_context_messages()` currently uses the module-level `MAX_CONTEXT_MESSAGES = 20`.
|
||||
|
||||
For restricted users, pass a smaller cap:
|
||||
|
||||
```python
|
||||
CHILD_MAX_CONTEXT_MESSAGES = 10 # module-level constant in agent.py
|
||||
```
|
||||
|
||||
In `_chat_inner()`, the call becomes:
|
||||
```python
|
||||
cap = CHILD_MAX_CONTEXT_MESSAGES if is_child else MAX_CONTEXT_MESSAGES
|
||||
context_messages = self._get_context_messages(cap)
|
||||
```
|
||||
|
||||
The username is available in `_chat_inner()` (passed from `chat()`), so `is_child` can be
|
||||
derived from `self._child_safety_config.is_restricted(username)`.
|
||||
|
||||
### Per-Session Cost Visibility (Future)
|
||||
Not in scope for initial build, but the audit log already captures enough data to compute
|
||||
per-session token estimates if token counts are added later.
|
||||
|
||||
---
|
||||
|
||||
## Audit Log Schema
|
||||
|
||||
File: `memory_workspace/audit/{username}/YYYY-MM-DD.jsonl`
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"timestamp": "2026-04-21T14:32:01.123+00:00", // ISO 8601 with timezone
|
||||
"username": "gabriel",
|
||||
"platform": "telegram",
|
||||
"action": "allowed", // "allowed" | "blocked" | "flagged"
|
||||
"filter_stage": null, // null | "preprocessor" | "postprocessor"
|
||||
"filter_reason": null, // null | string describing which pattern matched
|
||||
"message": "how do I make the laser shoot in my roblox game", // full text
|
||||
"response": "Great question! Here's how to..." // full text, null if blocked pre-LLM
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Flow for a Blocked Message
|
||||
|
||||
```
|
||||
1. Gabriel sends: "how do I stab someone"
|
||||
2. Preprocessor: no game context signals found → Step 3 matches "violence against real target"
|
||||
3. Action: BLOCK
|
||||
4. AuditLogger.log(action="blocked", reason="real_world_violence", response=None)
|
||||
5. Message text replaced with internal sentinel "__BLOCKED__: I can't help with that topic..."
|
||||
6. Agent.chat() never called
|
||||
7. Postprocessor detects sentinel → returns the canned reply text
|
||||
8. Reply delivered to son: "That's not something I can help with! Want to work on your
|
||||
Roblox game instead? I'm great at scripting and game mechanics."
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Flow for a Passing Message
|
||||
|
||||
```
|
||||
1. Gabriel sends: "how do I make a knife swing animation in Roblox"
|
||||
2. Preprocessor: "roblox" matches GAME_CONTEXT_SIGNALS → skip Step 3, pass through
|
||||
3. Agent.chat() called with full guardrail block in system prompt
|
||||
4. LLM responds with Lua animation code
|
||||
5. Postprocessor: scans response → clean
|
||||
6. AuditLogger.log(action="allowed", response=<full response text>)
|
||||
7. Response delivered
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Cross-Session Continuity Design (REQ-12 + REQ-13)
|
||||
|
||||
### `gabriel_context.md` — Structure
|
||||
|
||||
Single file at `memory_workspace/users/gabriel_context.md`. Replaces memory search for Gabriel.
|
||||
Written by the bot after each session. Overwritten, not appended (always current state).
|
||||
|
||||
```markdown
|
||||
## Active Project
|
||||
Name: Haunted Mansion (Roblox horror game)
|
||||
Description: Top-down horror game with a chasing enemy, jump scares, and atmospheric lighting.
|
||||
|
||||
## Last Session (2026-04-21)
|
||||
- Implemented basic enemy chase using Humanoid:MoveTo()
|
||||
- Debugged an issue where the enemy ignored walls (fixed with pathfinding)
|
||||
- Introduced: pathfinding service, Humanoid, MoveTo()
|
||||
|
||||
## Open Threads
|
||||
- Player hasn't been told how to add sound effects yet
|
||||
- Wants to add a second enemy type next session
|
||||
|
||||
## Skills Introduced
|
||||
- for loops — iterating over tables (2026-04-21)
|
||||
- functions — defining, calling, parameters vs arguments (2026-04-21)
|
||||
- Humanoid — controlling character movement (2026-04-21)
|
||||
- PathfindingService — navigation around obstacles (2026-04-21)
|
||||
```
|
||||
|
||||
### How It Gets Updated
|
||||
|
||||
At the end of each Gabriel session, the agent appends a self-update instruction to the
|
||||
system prompt (or the guardrail block triggers it):
|
||||
|
||||
> "At the end of this conversation, update `memory_workspace/users/gabriel_context.md`
|
||||
> with: current project state, what was worked on today, any open threads, and any new
|
||||
> concepts you introduced. Keep it under 40 lines. Overwrite the file completely."
|
||||
|
||||
This mirrors how the main agent writes to `MEMORY.md` after Jordan's sessions. The bot
|
||||
already has file-write tools available — no new mechanism needed.
|
||||
|
||||
### Injection in System Prompt
|
||||
|
||||
In `_build_child_system_prompt()`:
|
||||
|
||||
```python
|
||||
gabriel_context = self.memory.read_file("users/gabriel_context.md") # or Path.read_text
|
||||
parts = [
|
||||
CHILD_TUTOR_IDENTITY,
|
||||
f"User Profile:\n{user_profile}",
|
||||
]
|
||||
if gabriel_context:
|
||||
parts.append(f"Project Context & Skills:\n{gabriel_context}")
|
||||
parts.append(CHILD_GUARDRAIL_BLOCK)
|
||||
```
|
||||
|
||||
If the file doesn't exist (first session), it's simply omitted — no error.
|
||||
|
||||
---
|
||||
|
||||
## First-Run Onboarding Design (REQ-14)
|
||||
|
||||
### Detection
|
||||
|
||||
First-run is detected in `_build_child_system_prompt()` or the preprocessor by checking:
|
||||
|
||||
```python
|
||||
context_path = workspace_dir / "users" / "gabriel_context.md"
|
||||
is_first_run = not context_path.exists()
|
||||
```
|
||||
|
||||
### Delivery
|
||||
|
||||
The welcome is injected as a **system-level instruction** in the guardrail block that fires
|
||||
only on first run. The LLM is instructed to send the welcome as its opening message before
|
||||
addressing the user's question:
|
||||
|
||||
```
|
||||
FIRST SESSION: This is Gabriel's very first message. Before answering his question,
|
||||
send a short, friendly welcome. Cover:
|
||||
- What you can help him with (Lua, Roblox, game design, coding)
|
||||
- That you'll guide him and ask questions rather than just give answers
|
||||
- That you'll remember his project between sessions
|
||||
- Ask what he's working on (or answer his question if he's already told you)
|
||||
Keep it to 4–5 sentences. Warm, not formal.
|
||||
```
|
||||
|
||||
This block is only added when `is_first_run` is True — subsequent sessions omit it entirely.
|
||||
|
||||
### Example Welcome
|
||||
|
||||
> Hey Gabriel! I'm here to help you build your Roblox games and level up your Lua skills.
|
||||
> I work a bit differently to a search engine — instead of just handing you the answer, I'll
|
||||
> walk you through things so you actually learn how it works. I'll also remember what you're
|
||||
> building between chats, so you won't need to explain your project every time.
|
||||
> What are you working on?
|
||||
|
||||
---
|
||||
|
||||
## Slack Allow-List Design (REQ-15)
|
||||
|
||||
### Current Gap
|
||||
|
||||
`adapters/slack/adapter.py` — `handle_message_events()` processes every incoming message
|
||||
with no user check. The Telegram adapter has `_is_user_allowed()` at line 441; Slack has
|
||||
no equivalent.
|
||||
|
||||
### Fix
|
||||
|
||||
Add `_is_user_allowed()` to `SlackAdapter`, called at the top of `handle_message_events()`:
|
||||
|
||||
```python
|
||||
def _is_user_allowed(self, user_id: str) -> bool:
|
||||
allowed = self.config.settings.get("allowed_users", [])
|
||||
if not allowed:
|
||||
return True # open if no list configured
|
||||
return user_id in [str(u) for u in allowed]
|
||||
```
|
||||
|
||||
In `handle_message_events()`:
|
||||
```python
|
||||
user_id = event.get("user")
|
||||
if not self._is_user_allowed(user_id):
|
||||
return # silently drop — no response
|
||||
```
|
||||
|
||||
Config in `adapters.local.yaml`:
|
||||
```yaml
|
||||
slack:
|
||||
allowed_users:
|
||||
- U01234JORDAN # Jordan's Slack user ID
|
||||
- U09876GABRIEL # Gabriel's Slack user ID
|
||||
```
|
||||
|
||||
Slack user IDs are found in Slack → Profile → More → Copy member ID.
|
||||
|
||||
---
|
||||
|
||||
## File Tree After Implementation
|
||||
|
||||
```
|
||||
ajarbot/
|
||||
├── child_safety.py ← NEW
|
||||
├── agent.py ← MODIFIED (_build_system_prompt, _chat_inner)
|
||||
├── adapters/
|
||||
│ ├── runtime.py ← MODIFIED (register pre/postprocessors)
|
||||
│ └── slack/
|
||||
│ └── adapter.py ← MODIFIED (add allow-list check)
|
||||
├── config/
|
||||
│ └── adapters.local.yaml ← MODIFIED (child_safety block, gabriel mapping,
|
||||
│ slack allowed_users)
|
||||
└── memory_workspace/
|
||||
└── users/
|
||||
├── gabriel.md ← NEW (user profile)
|
||||
└── gabriel_context.md ← NEW (created after first session)
|
||||
└── audit/
|
||||
└── gabriel/
|
||||
└── 2026-04-21.jsonl ← NEW (created at runtime)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Decisions Log
|
||||
|
||||
| Decision | Rationale |
|
||||
|---|---|
|
||||
| Intent patterns over keyword lists | Keywords produce unacceptable false positive rate for game dev vocabulary |
|
||||
| Sentinel pattern for preprocessor blocking | Avoids runtime API changes; fits existing pre/postprocessor contract |
|
||||
| Separate audit log from RSO log | Keeps RSO memory scoring clean; audit log has different retention and purpose |
|
||||
| Guardrail block as system prompt injection, not separate API call | No extra LLM call = no added latency or cost |
|
||||
| Game dev context as an exemption gate, not an allow-list | Easier to maintain; covers novel game dev phrasing automatically |
|
||||
| Config-driven restricted users | Parent can add/remove without touching Python source |
|
||||
| gabriel_context.md overwrites rather than appends | Always reflects current state; avoids unbounded growth; keeps token cost predictable |
|
||||
| First-run via file existence check | No database or state needed; survives restarts; trivially inspectable |
|
||||
| Slack allow-list fails open (empty list = allow all) | Preserves current behaviour for existing deployments with no config change |
|
||||
| Platform: Slack over Telegram | Jordan has native workspace admin visibility; channel history is built-in parent monitoring |
|
||||
338
.kiro/specs/child-safety-profile/requirements.md
Normal file
338
.kiro/specs/child-safety-profile/requirements.md
Normal file
@@ -0,0 +1,338 @@
|
||||
# Requirements — Child Safety Profile
|
||||
|
||||
## Overview
|
||||
|
||||
Add a restricted child user profile to Ajarbot that allows a 13-year-old to use the bot as an
|
||||
educational and creative tool — focused on gaming, Lua scripting, and Roblox Studio — while
|
||||
preventing access to age-inappropriate content. Parents retain full oversight via an audit log.
|
||||
|
||||
---
|
||||
|
||||
## User Stories
|
||||
|
||||
### REQ-01 — Child User Access
|
||||
**As a parent**, I want to add my gabriel as an allowed user on Slack so he can interact with
|
||||
the bot using his own account.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- His Slack user ID is mapped to a named username (e.g., `gabriel`) in `adapters.local.yaml`
|
||||
- His username appears in the `allowed_users` list
|
||||
- He can send messages and receive responses through the existing Slack adapter
|
||||
- His session is isolated from the parent's session (separate conversation history)
|
||||
|
||||
---
|
||||
|
||||
### REQ-02 — Age-Appropriate System Persona
|
||||
**As a parent**, I want the bot to behave differently for my gabriel — patient, educational, and
|
||||
enthusiastic about game dev — rather than presenting the full Garvis homelab persona.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- Child users receive a modified system prompt that replaces homelab/admin context with
|
||||
an educational game-dev tutor persona
|
||||
- Tone is encouraging, uses simple language, avoids jargon where possible
|
||||
- References to SSH, Proxmox, home network, or admin tooling are suppressed for child users
|
||||
- Son's profile (`memory_workspace/users/gabriel.md`) captures his interests, age, and learning style
|
||||
|
||||
---
|
||||
|
||||
### REQ-03 — Context-Aware Content Filtering (Input)
|
||||
**As a parent**, I want the bot to block genuinely harmful requests without false-positiving
|
||||
on legitimate game development questions that use words like "shoot", "kill", "weapon", or "knife"
|
||||
in a coding/game context.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- A preprocessor runs on every inbound message from a child user before it reaches the LLM
|
||||
- The preprocessor uses **intent patterns**, not keyword matching — a block requires both a
|
||||
harm verb and a real-world target/context
|
||||
- Game development context signals (e.g., `in my game`, `roblox`, `lua`, `script`, `code`,
|
||||
`function`, `NPC`, `hitbox`) exempt a message from weapon/violence keyword blocks
|
||||
- The following are always blocked regardless of context:
|
||||
- Real-world harm instructions ("how do I hurt/stab/shoot a person")
|
||||
- Requests for actual weapon construction
|
||||
- Sexual or explicit content
|
||||
- Social engineering or personal data requests
|
||||
- Content with no plausible game/educational framing
|
||||
- Blocked messages receive a friendly, non-alarming response explaining the bot can't help
|
||||
with that topic
|
||||
- The following are always allowed regardless of words used:
|
||||
- Lua scripting and Roblox Studio mechanics
|
||||
- Horror game design (atmosphere, enemy AI, damage systems, jump scares)
|
||||
- Game weapon mechanics, hitboxes, damage values, animations
|
||||
- General coding help (Python, JavaScript basics)
|
||||
- School subjects, creative writing, general knowledge
|
||||
|
||||
---
|
||||
|
||||
### REQ-04 — Context-Aware Content Filtering (Output)
|
||||
**As a parent**, I want a secondary check on what the bot sends back so that even if the LLM
|
||||
produces something borderline, it is caught before delivery.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- A postprocessor scans every outgoing response to a child user
|
||||
- Detects and replaces responses that contain explicit language, adult content, or real-world
|
||||
harm instructions that slipped through the system prompt
|
||||
- If a response is flagged, a safe fallback message is sent and the event is logged
|
||||
- Clean responses pass through unmodified with zero added latency beyond the scan
|
||||
|
||||
---
|
||||
|
||||
### REQ-05 — System Prompt Guardrails
|
||||
**As a parent**, I want the LLM itself to understand the rules so it handles gray-area
|
||||
questions correctly without requiring every edge case to be coded explicitly.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- Child users receive a guardrail block appended to their system prompt on every turn
|
||||
- The guardrail block explicitly tells the LLM:
|
||||
- Game dev / horror game design / weapon mechanics in a game context = encouraged
|
||||
- Real-world harm, adult content, explicit language = refuse politely
|
||||
- If unsure, treat the question as game/educational context if any signal supports it
|
||||
- The guardrail block is injected in `_build_system_prompt()` when the username is in the
|
||||
configured `RESTRICTED_USERS` list
|
||||
|
||||
---
|
||||
|
||||
### REQ-06 — Tool Restrictions
|
||||
**As a parent**, I want my gabriel to be unable to trigger homelab tools, SSH commands, file
|
||||
system operations, or admin-level actions even if he asks.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- System prompt for child users instructs the LLM never to use SSH, file system, Proxmox,
|
||||
network, or infrastructure tools
|
||||
- This is enforced at the system prompt level (model instruction), not by removing MCP servers
|
||||
- Tool invocations from child users that attempt admin tooling are logged as anomalies
|
||||
|
||||
---
|
||||
|
||||
### REQ-07 — Parental Audit Log
|
||||
**As a parent**, I want a complete, searchable record of every conversation my gabriel has with
|
||||
the bot so I can review what he's been asking and what the bot responded.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- Every interaction from a child user is written to a dedicated audit log, separate from
|
||||
the RSO/memory-scoring logs
|
||||
- Audit log location: `memory_workspace/audit/{username}/YYYY-MM-DD.jsonl`
|
||||
- Each audit entry contains:
|
||||
- ISO timestamp
|
||||
- Username
|
||||
- Full inbound message (not truncated)
|
||||
- Filter action taken (allowed / blocked / flagged)
|
||||
- Filter reason (if blocked/flagged)
|
||||
- Full outbound response
|
||||
- Audit writes are non-blocking (background thread, same pattern as InteractionLogger)
|
||||
- Audit log retention: 365 days (configurable)
|
||||
- Existing RSO interaction logs are not modified — audit log is additive
|
||||
|
||||
---
|
||||
|
||||
### REQ-08 — Configuration-Driven Restricted Users
|
||||
**As a parent**, I want the child safety features to be controlled by config, not hardcoded,
|
||||
so I can add or remove restricted users without modifying Python source.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
- A `child_safety` block in `config/adapters.local.yaml` defines which usernames are restricted
|
||||
- Example:
|
||||
```yaml
|
||||
child_safety:
|
||||
restricted_users:
|
||||
- gabriel
|
||||
audit_retention_days: 365
|
||||
```
|
||||
- The `child_safety.py` module reads this config at startup
|
||||
- Adding a new restricted user requires only a config change and bot restart
|
||||
|
||||
---
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
| ID | Requirement |
|
||||
|----|-------------|
|
||||
| NFR-01 | Filtering must add < 50ms latency to message processing |
|
||||
| NFR-02 | Audit log writes must never block response delivery |
|
||||
| NFR-03 | A filter failure (exception) must fail safe — block the message, not pass it |
|
||||
| NFR-04 | Audit log files must not be accessible via any bot tool or command |
|
||||
| NFR-05 | Restricted user config must survive bot restarts |
|
||||
| NFR-06 | False positive rate on game dev questions must be near zero for common Roblox/Lua vocabulary |
|
||||
|
||||
---
|
||||
|
||||
### REQ-09 — Guided Learning Approach (Skill Development Over Answer Delivery)
|
||||
**As a parent**, I want the bot to teach my gabriel how to think and build, not just hand him
|
||||
answers — so that he develops real coding and problem-solving skills over time rather than
|
||||
becoming dependent on the bot.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- The bot's default mode is **guide first, answer second** — not the reverse
|
||||
- Before giving a solution, the bot asks what the user has already tried or what they think
|
||||
might work, unless the question is purely factual ("what does `pairs()` do in Lua?")
|
||||
- When code is provided, it is **always accompanied by an explanation** of what it does and
|
||||
why — never a bare code block with no context
|
||||
- Explanations use the **minimum necessary detail** for his age/level — short, plain-language
|
||||
sentences before diving into code
|
||||
- The bot breaks problems into **smaller steps** and guides through each one rather than
|
||||
solving the whole thing at once:
|
||||
- "Let's tackle the shooting mechanic first. What do you think needs to happen when the
|
||||
player pulls the trigger?"
|
||||
- The bot **celebrates attempts and effort**, not just correct answers:
|
||||
- "Nice — you got the loop right, that's the hard part. The issue is just this one line..."
|
||||
- When the user shares broken code, the bot **guides them to find the bug** rather than
|
||||
pointing straight to it:
|
||||
- "Take a look at line 12 — what do you think that variable is at that point in the loop?"
|
||||
- After giving code, the bot **leaves something for the user to do**:
|
||||
- "I've written the basic function — can you add the part that checks if the player has
|
||||
enough ammo before it fires?"
|
||||
- The bot periodically uses **transfer learning** to connect new concepts to ones already
|
||||
covered:
|
||||
- "Remember the loop we used for the enemy spawner? This is the same idea."
|
||||
- Code IS shown when asked — this is not a Socratic-only mode. The teaching layer wraps
|
||||
the code, it does not replace it.
|
||||
- Purely factual or lookup questions ("what's the Roblox service for detecting player input?")
|
||||
get a direct answer — no forced Socratic preamble for simple lookups.
|
||||
|
||||
---
|
||||
|
||||
### REQ-10 — Token Optimization for Child Sessions
|
||||
**As a parent**, I want Gabriel's sessions to consume as few tokens as possible since he shares
|
||||
the same API pool as me, without degrading the quality of his experience.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- Gabriel's system prompt **skips SOUL.md** (the Garvis homelab persona) — irrelevant to him,
|
||||
currently costs ~935 tokens per turn
|
||||
- Gabriel's system prompt **skips context.md** (SSH hosts, Proxmox VMs, networking) — entirely
|
||||
irrelevant to Lua help, currently costs ~227 tokens per turn
|
||||
- Gabriel's system prompt uses a **lightweight tutor identity block** (~100 tokens) in place
|
||||
of SOUL.md — enough to set tone without the homelab baggage
|
||||
- The **hybrid memory search is skipped** for Gabriel — the memory store is Jordan's homelab
|
||||
operational data and returns irrelevant chunks that waste tokens
|
||||
- Gabriel's **conversation history window is capped at 10 messages** (vs Jordan's 20) — Lua
|
||||
help sessions rarely need deep context; this roughly halves history token cost
|
||||
- The **delegation/sub-agent block** is omitted from Gabriel's system prompt — he will never
|
||||
trigger multi-agent tasks (~80 tokens saved)
|
||||
- All optimizations are conditional on `is_restricted(username)` — Jordan's experience is
|
||||
completely unchanged
|
||||
|
||||
**Expected savings per Gabriel turn:**
|
||||
|
||||
| Removed component | Token saving |
|
||||
|---|---|
|
||||
| SOUL.md | ~935 |
|
||||
| context.md | ~227 |
|
||||
| Memory search (5 chunks avg) | ~300–500 |
|
||||
| History window 20→10 (avg) | ~20–50% of history |
|
||||
| Delegation block | ~80 |
|
||||
| **Total** | **~1,500–1,800 tokens/turn** |
|
||||
|
||||
---
|
||||
|
||||
### REQ-11 — AI Literacy as Part of the Teaching Approach
|
||||
**As a parent**, I want the bot to teach Gabriel how to use AI tools well — not just what to
|
||||
ask, but how to ask — so he builds self-sufficiency with these tools rather than dependency.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- When Gabriel asks a vague or broad question, the bot **models good question-asking** by
|
||||
clarifying its understanding before answering:
|
||||
> "Just to make sure I give you the most useful answer — you want the enemy to deal damage
|
||||
> on touch, right? Or is it supposed to chase first?"
|
||||
- When Gabriel notices the bot "forgot" something earlier, the bot **explains context windows**
|
||||
in plain terms, naturally:
|
||||
> "Yeah — I can only hold so much of our conversation in memory at once. At the start of
|
||||
> next session, just remind me what you're building and I'll be straight back up to speed."
|
||||
- The bot **teaches the ideal coding question format** when the opportunity arises naturally:
|
||||
> "Next time try: what your code does now, what you want it to do, and what you've already
|
||||
> tried. That combo gets you a much faster answer."
|
||||
- The bot **flags its own assumptions** so Gabriel learns to spot ambiguity:
|
||||
> "I'm assuming you want this to reset on respawn — let me know if that's not right."
|
||||
- AI literacy is woven into responses naturally — never a separate lecture unless Gabriel
|
||||
directly asks how the bot works.
|
||||
|
||||
---
|
||||
|
||||
### REQ-12 — Cross-Session Project Continuity
|
||||
**As a parent**, I want the bot to remember what Gabriel is building between sessions so he
|
||||
doesn't have to re-explain his project every time, and the teaching approach stays coherent
|
||||
over days and weeks — not just within a single conversation.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- A lightweight project context file exists at `memory_workspace/users/gabriel_context.md`
|
||||
- This file is injected into Gabriel's system prompt on every turn (replaces memory search,
|
||||
which is skipped for Gabriel per REQ-10)
|
||||
- The bot updates `gabriel_context.md` at the end of each session with a brief summary of:
|
||||
- What Gabriel is currently building (project name/description)
|
||||
- What was worked on in this session (features, bugs fixed, concepts covered)
|
||||
- Any open threads or "next steps" Gabriel mentioned
|
||||
- Any new concepts introduced this session (feeds into REQ-13)
|
||||
- The update is concise — target < 30 lines total; the file is overwritten, not appended
|
||||
- On first session (file doesn't exist), the bot starts fresh and creates it after the
|
||||
first substantive exchange
|
||||
- The file is human-readable so Jordan can review it directly in Slack's file system or
|
||||
the memory workspace
|
||||
|
||||
---
|
||||
|
||||
### REQ-13 — Skill Progression Tracking
|
||||
**As a parent**, I want the bot to remember what Gabriel has already been taught so it doesn't
|
||||
re-explain concepts he's mastered, and can reference them naturally when introducing related ideas.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- A skills log section exists within `gabriel_context.md` (same file as REQ-12, separate section)
|
||||
- Each entry records: concept name, brief description, date first introduced
|
||||
- Example entries:
|
||||
- `for loops` — iterating over tables, introduced 2026-04-21
|
||||
- `functions` — defining and calling, parameters vs arguments, introduced 2026-04-22
|
||||
- `RemoteEvents` — client-server communication in Roblox, introduced 2026-04-25
|
||||
- The bot checks this log before explaining a concept — if already introduced, it references
|
||||
it rather than re-explaining from scratch:
|
||||
> "You've used this before — remember the loop we wrote for the enemy spawner?"
|
||||
- The log grows over time; the bot adds an entry the first time it meaningfully teaches a new
|
||||
concept, not for every mention
|
||||
- Skills log is appended to `gabriel_context.md` under a `## Skills Introduced` heading
|
||||
|
||||
---
|
||||
|
||||
### REQ-14 — First-Run Onboarding Experience
|
||||
**As a parent**, I want Gabriel to receive a friendly welcome the first time he messages the
|
||||
bot that sets expectations — what it can help him with, how it works, and that it's there
|
||||
to teach him, not do the work for him.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- The bot detects a first-run state by checking whether `gabriel_context.md` exists
|
||||
- On first message from Gabriel, before processing his question, the bot sends a welcome
|
||||
message that covers:
|
||||
- What it can help with (Lua, Roblox Studio, game design, coding questions)
|
||||
- How the teaching approach works — that it'll guide him and ask questions, not just
|
||||
hand over answers ("I'm here to help you figure it out, not just give you the answer")
|
||||
- That it'll remember his projects between sessions
|
||||
- An invitation to tell it what he's working on
|
||||
- The welcome is sent as a separate message before the response to his first question
|
||||
- The welcome is conversational and age-appropriate — not a terms-and-conditions wall
|
||||
- After the welcome, his first actual question is answered normally
|
||||
- The first-run check only fires once; subsequent sessions go straight to his question
|
||||
|
||||
---
|
||||
|
||||
### REQ-15 — Slack Allow-List (Gap Fix)
|
||||
**As a parent**, I want only authorised users to be able to message the bot on Slack, since
|
||||
the Slack adapter currently processes messages from any workspace member with no restriction.
|
||||
|
||||
**Acceptance Criteria:**
|
||||
|
||||
- The Slack adapter checks an `allowed_users` list from config before processing any message
|
||||
- Messages from users not on the allow-list are silently dropped (no response sent)
|
||||
- The allow-list is read from `config/adapters.local.yaml` under the slack adapter settings,
|
||||
matching the pattern already used by the Slack adapter for other config
|
||||
- Jordan's existing Slack user ID remains on the list; Gabriel's is added
|
||||
- No change to Telegram adapter behaviour (already has this check)
|
||||
|
||||
---
|
||||
|
||||
## Out of Scope
|
||||
|
||||
- Time-of-day restrictions (not enforceable at bot level — use Slack parental controls)
|
||||
- Per-topic whitelists managed via chat commands
|
||||
- Automated parent notifications on blocked requests (future enhancement)
|
||||
- Web dashboard for audit log review (future enhancement)
|
||||
450
.kiro/specs/child-safety-profile/tasks.md
Normal file
450
.kiro/specs/child-safety-profile/tasks.md
Normal file
@@ -0,0 +1,450 @@
|
||||
# Tasks — Child Safety Profile
|
||||
|
||||
Implementation order matters. Each task has a dependency note where relevant.
|
||||
Tasks within a phase can be worked in parallel; phases must be completed in order.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Identity & Config (no code changes, unblocks everything else)
|
||||
|
||||
### TASK-01 — Add gabriel's Slack user ID to config
|
||||
**File:** `config/adapters.local.yaml`
|
||||
**What:** Add gabriel's Slack ID to `allowed_users` and `user_mapping`; add `child_safety` block.
|
||||
**Requires:** Son's actual Slack user ID (parent to provide)
|
||||
|
||||
```yaml
|
||||
# Under slack adapter settings:
|
||||
allowed_users:
|
||||
- <JORDANS_SLACK_USER_ID> # Jordan
|
||||
- <GABRIELS_SLACK_USER_ID> # Gabriel
|
||||
|
||||
user_mapping:
|
||||
slack_<JORDANS_SLACK_USER_ID>: "jordan"
|
||||
slack_<GABRIELS_SLACK_USER_ID>: "gabriel"
|
||||
|
||||
# New top-level block:
|
||||
child_safety:
|
||||
restricted_users:
|
||||
- gabriel
|
||||
audit_retention_days: 365
|
||||
```
|
||||
|
||||
Note: The Slack adapter has no allow-list today. `allowed_users` only takes effect after
|
||||
TASK-01b adds the check to `adapters/slack/adapter.py`. Do both together.
|
||||
|
||||
**Acceptance:** Bot recognises Gabriel's Slack account and routes him to username `"gabriel"`.
|
||||
All other Slack users (not in `allowed_users`) are silently ignored.
|
||||
|
||||
---
|
||||
|
||||
### TASK-02 — Create gabriel's user profile
|
||||
**File:** `memory_workspace/users/gabriel.md` (new)
|
||||
**What:** Write the per-user profile that gets injected into the system prompt.
|
||||
**Depends on:** Nothing
|
||||
|
||||
Content to include:
|
||||
- Age: 13
|
||||
- Interests: gaming, horror games, Lua scripting, Roblox Studio, game design
|
||||
- Learning style: hands-on, wants working code examples, short explanations before diving in
|
||||
- Communication style: casual, encouraging, celebrate what he's building
|
||||
- Current projects: Roblox horror game (update as known)
|
||||
- Do NOT include guardrail rules here — those come from the injected block
|
||||
|
||||
**Acceptance:** Profile loads without error; content appears in system prompt during gabriel's session (verify via debug log).
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Audit Logger
|
||||
|
||||
### TASK-03 — Implement `ChildAuditLogger`
|
||||
**File:** `child_safety.py` (new — start with this class only)
|
||||
**What:** Thread-safe, non-blocking JSONL audit logger for child user interactions.
|
||||
**Depends on:** Nothing (standalone)
|
||||
|
||||
Implementation notes:
|
||||
- Mirror the pattern from `observation/interaction_logger.py` exactly
|
||||
- Write to `memory_workspace/audit/{username}/YYYY-MM-DD.jsonl`
|
||||
- Create directory on first write (not at init, to avoid creating dirs for unused usernames)
|
||||
- Single public method: `log(username, platform, action, filter_stage, filter_reason, message, response)`
|
||||
- `action` values: `"allowed"` | `"blocked"` | `"flagged"`
|
||||
- `cleanup_old_logs(retention_days)` — prune files older than retention window
|
||||
|
||||
**Acceptance:** Unit-testable in isolation. Call `log()` twice, verify two JSONL lines written
|
||||
to the correct file path with correct schema.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — Filtering Logic
|
||||
|
||||
### TASK-04 — Implement `ChildSafetyConfig`
|
||||
**File:** `child_safety.py` (add to existing file)
|
||||
**What:** Config loader that reads the `child_safety` block from `adapters.local.yaml`.
|
||||
**Depends on:** TASK-01 (config block must exist)
|
||||
|
||||
```python
|
||||
class ChildSafetyConfig:
|
||||
restricted_users: list[str]
|
||||
audit_retention_days: int
|
||||
|
||||
@classmethod
|
||||
def from_yaml(cls, config_path: Path) -> "ChildSafetyConfig": ...
|
||||
|
||||
def is_restricted(self, username: str) -> bool: ...
|
||||
```
|
||||
|
||||
**Acceptance:** `is_restricted("gabriel")` returns `True`; `is_restricted("cloe")` returns `False`.
|
||||
|
||||
---
|
||||
|
||||
### TASK-05 — Implement `ChildSafetyFilter` — input (preprocessor)
|
||||
**File:** `child_safety.py` (add to existing file)
|
||||
**What:** Intent-pattern input filter.
|
||||
**Depends on:** TASK-03, TASK-04
|
||||
|
||||
Implementation notes:
|
||||
- Compile all regex patterns once at class instantiation (not per-call)
|
||||
- Pattern evaluation order: hard-block → context signals → conditional block → pass
|
||||
- `preprocess(message: InboundMessage) -> tuple[InboundMessage | None, str | None]`
|
||||
- Returns `(message, None)` → pass through
|
||||
- Returns `(None, reply_text)` → block; caller sends reply_text directly
|
||||
- Call `AuditLogger.log()` with `action="blocked"` on any block
|
||||
- Call `AuditLogger.log()` with `action="allowed"` on pass (response field left null here —
|
||||
audit logger will be called again in the postprocessor with the full response)
|
||||
|
||||
Hard-block patterns (always active):
|
||||
```python
|
||||
HARD_BLOCK_PATTERNS = [
|
||||
r"\b(sex|porn|nude|naked|explicit)\b",
|
||||
r"\bhow (do i|to|can i).{0,40}(kill|hurt|stab|shoot|harm).{0,30}(myself|yourself)\b",
|
||||
r"\bhow (do i|to|can i).{0,40}(hurt|stab|kill|attack|beat up|harm).{0,30}(my |a )?(sister|brother|mom|dad|teacher|classmate|friend|kid|child|person|someone|people)\b",
|
||||
r"\b(give me|what is|find).{0,30}(address|phone number|school|location).{0,30}(of|for)\b",
|
||||
]
|
||||
```
|
||||
|
||||
Game context signals (exempts from conditional blocks):
|
||||
```python
|
||||
GAME_CONTEXT_SIGNALS = [
|
||||
r"\bin (my |the |a )?(game|roblox|studio|script|map|level|world|place)\b",
|
||||
r"\b(lua|roblox|studio|npc|hitbox|raycast|humanoid|workspace|basepart|tool|part)\b",
|
||||
r"\b(code|script|function|method|module|class|variable|loop|event|animate|tween)\b",
|
||||
r"\b(damage|health|respawn|kill|destroy)\b.{0,30}\b(player|npc|enemy|mob|character|humanoid)\b",
|
||||
r"\bhow (do i|to|can i) (make|get|set|add|create|implement|build|script)\b",
|
||||
]
|
||||
```
|
||||
|
||||
Conditional block patterns (only active when no game context signal):
|
||||
```python
|
||||
CONDITIONAL_BLOCK_PATTERNS = [
|
||||
r"\bhow (do i|to|can i).{0,40}(use|wield|make|build).{0,30}(knife|gun|pistol|rifle|weapon|sword|bomb).{0,30}(hurt|harm|attack|fight|cut|stab|shoot)\b",
|
||||
r"\bhow (do i|to|can i).{0,40}(hurt|fight|attack|beat).{0,30}(someone|people|person|kid|child)\b",
|
||||
r"\b(buy|get|obtain|find).{0,30}(drugs?|weed|cocaine|meth|pills)\b",
|
||||
]
|
||||
```
|
||||
|
||||
**Acceptance:**
|
||||
- `"how do I make a knife swing animation in Roblox"` → pass (game context)
|
||||
- `"how do I use a knife to hurt someone"` → block (no game context, conditional pattern)
|
||||
- `"how do I kill all NPCs in my game"` → pass (game context signal: NPC)
|
||||
- `"how do I hurt my classmate"` → block (hard block)
|
||||
- `"lua script for a gun that shoots"` → pass (game context)
|
||||
|
||||
---
|
||||
|
||||
### TASK-06 — Implement `ChildSafetyFilter` — output (postprocessor)
|
||||
**File:** `child_safety.py` (add to existing file)
|
||||
**What:** Light response scan before delivery.
|
||||
**Depends on:** TASK-05
|
||||
|
||||
Implementation notes:
|
||||
- `postprocess(response: str, message: InboundMessage) -> str`
|
||||
- Scan for: explicit terms list, profanity list, real-world harm instruction patterns
|
||||
- If flagged: log `action="flagged"`, return safe fallback string
|
||||
- If clean: log `action="allowed"` with full response (this is the final audit entry)
|
||||
- Fallback string: "I ran into a bit of a snag there. Try rephrasing, or ask me something
|
||||
about your Roblox game — I love helping with scripts and game design!"
|
||||
|
||||
**Acceptance:**
|
||||
- Clean Lua code response → returned unchanged
|
||||
- Response containing explicit term → replaced with fallback
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — System Prompt Injection
|
||||
|
||||
### TASK-07 — Add guardrail block injection to `agent.py`
|
||||
**File:** `agent.py`, `_build_system_prompt()` (~line 488)
|
||||
**What:** Inject `CHILD_GUARDRAIL_BLOCK` for restricted users.
|
||||
**Depends on:** TASK-04
|
||||
|
||||
Changes:
|
||||
1. Import `ChildSafetyConfig` and `CHILD_GUARDRAIL_BLOCK` from `child_safety`
|
||||
2. In `Agent.__init__()`, load config and store as `self._child_safety_config`
|
||||
3. In `_build_system_prompt()`, after the user profile block:
|
||||
|
||||
```python
|
||||
if (not self.is_sub_agent
|
||||
and self._child_safety_config
|
||||
and self._child_safety_config.is_restricted(username)):
|
||||
system_parts.append(CHILD_GUARDRAIL_BLOCK)
|
||||
```
|
||||
|
||||
`CHILD_GUARDRAIL_BLOCK` is a module-level constant in `child_safety.py`.
|
||||
|
||||
**Acceptance:** Add a debug print temporarily — verify the guardrail block appears in the
|
||||
assembled system prompt when username is `"gabriel"` and does not appear for `"cloe"`.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — Runtime Wiring
|
||||
|
||||
### TASK-08 — Register pre/postprocessors in `adapters/runtime.py`
|
||||
**File:** `adapters/runtime.py`, `AdapterRuntime.__init__()`
|
||||
**What:** Instantiate the filter and register it with the runtime.
|
||||
**Depends on:** TASK-05, TASK-06
|
||||
|
||||
```python
|
||||
from child_safety import ChildSafetyConfig, ChildSafetyFilter, ChildAuditLogger
|
||||
|
||||
_cs_config = ChildSafetyConfig.from_yaml(config_path)
|
||||
_cs_audit = ChildAuditLogger(workspace_dir)
|
||||
_cs_audit.cleanup_old_logs(_cs_config.audit_retention_days)
|
||||
_cs_filter = ChildSafetyFilter(_cs_config, _cs_audit)
|
||||
|
||||
self.add_preprocessor(_cs_filter.preprocess_adapter)
|
||||
self.add_postprocessor(_cs_filter.postprocess_adapter)
|
||||
```
|
||||
|
||||
The `preprocess_adapter` and `postprocess_adapter` wrappers adapt the filter's return types
|
||||
to match the runtime's expected preprocessor/postprocessor signatures.
|
||||
|
||||
For blocking: the preprocessor adapter queues the safe reply via the adapter's `send_message()`
|
||||
and returns a sentinel `InboundMessage` that causes the agent to be skipped. Review the runtime
|
||||
queue loop in `_process_message()` to confirm the cleanest abort point.
|
||||
|
||||
**Acceptance:** End-to-end test with gabriel's Slack account (or mocked username):
|
||||
- Allowed message → full response delivered, audit entry written
|
||||
- Blocked message → canned reply delivered immediately, no LLM call made
|
||||
|
||||
---
|
||||
|
||||
## Phase 6 — Verification
|
||||
|
||||
### TASK-09 — End-to-end smoke test
|
||||
**What:** Manual testing checklist before considering the feature complete.
|
||||
|
||||
- [ ] Gabriel's Slack ID is mapped and he can message the bot
|
||||
- [ ] Allowed game dev question gets a helpful Lua/Roblox response
|
||||
- [ ] Blocked real-world harm question gets canned reply, no LLM call
|
||||
- [ ] Horror game question with violence words (e.g., "enemy takes damage") passes through
|
||||
- [ ] Audit log file created at correct path with correct schema
|
||||
- [ ] Parent (Jordan/Cloe) messages are completely unaffected — full SOUL.md + context.md injected as normal
|
||||
- [ ] RSO interaction log unchanged (no extra entries for blocked messages)
|
||||
- [ ] Bot restart preserves config (no in-memory-only state)
|
||||
- [ ] Gabriel's system prompt does NOT contain SOUL.md or context.md content (verify via debug)
|
||||
- [ ] Jordan's system prompt still contains SOUL.md and context.md (verify no regression)
|
||||
|
||||
### TASK-10 — Review and finalise Gabriel's user profile
|
||||
**File:** `memory_workspace/users/gabriel.md`
|
||||
**What:** After first real interactions, update the profile with observed interests, current
|
||||
project details, preferred explanation style. This is an ongoing task.
|
||||
**Depends on:** TASK-09
|
||||
|
||||
---
|
||||
|
||||
## Phase 7 — Token Optimization
|
||||
|
||||
### TASK-11 — Implement stripped system prompt for restricted users
|
||||
**File:** `agent.py`, `_build_system_prompt()` and `_chat_inner()`
|
||||
**What:** Build the child-optimized system prompt path; reduce history window for restricted users.
|
||||
**Depends on:** TASK-07 (child safety config already wired into Agent)
|
||||
|
||||
Changes to `_build_system_prompt()`:
|
||||
1. Add module-level constant `CHILD_TUTOR_IDENTITY` — ~100-token identity replacing SOUL.md
|
||||
2. Add module-level constant `CHILD_MAX_CONTEXT_MESSAGES = 10`
|
||||
3. When `is_restricted(username)` is true, build a stripped prompt:
|
||||
|
||||
```python
|
||||
def _build_child_system_prompt(self, username: str) -> str:
|
||||
user_profile = self.memory.get_user(username)
|
||||
parts = [
|
||||
CHILD_TUTOR_IDENTITY,
|
||||
f"User Profile:\n{user_profile}",
|
||||
CHILD_GUARDRAIL_BLOCK,
|
||||
"You have access to tools for web search and code help. "
|
||||
"Use them to assist Gabriel with his game development questions.",
|
||||
]
|
||||
return "\n\n".join(parts)
|
||||
```
|
||||
|
||||
Note: `get_soul()`, `get_context()`, `search_hybrid()`, and the delegation block are
|
||||
all deliberately absent.
|
||||
|
||||
Changes to `_chat_inner()`:
|
||||
```python
|
||||
is_child = (self._child_safety_config
|
||||
and self._child_safety_config.is_restricted(username))
|
||||
cap = CHILD_MAX_CONTEXT_MESSAGES if is_child else MAX_CONTEXT_MESSAGES
|
||||
system = (self._build_child_system_prompt(username) if is_child
|
||||
else self._build_system_prompt(user_message, username, platform))
|
||||
```
|
||||
|
||||
**Acceptance:**
|
||||
- Gabriel's assembled system prompt: no SOUL.md text, no SSH/Proxmox content, no delegation block
|
||||
- Gabriel's history window: max 10 messages passed to LLM
|
||||
- Jordan's assembled system prompt: unchanged (full SOUL.md + context.md present)
|
||||
- Measure approximate token count difference (log `len(system)` for both users)
|
||||
|
||||
### TASK-12 — Add AI literacy guidance to gabriel.md
|
||||
**File:** `memory_workspace/users/gabriel.md`
|
||||
**What:** Add a section explicitly coaching the model on how to teach Gabriel to use the bot
|
||||
well — question framing, context windows, spotting assumptions. This lives in the profile (not
|
||||
the guardrail block) so it can be updated over time as his skills grow.
|
||||
**Depends on:** TASK-02 (profile already exists)
|
||||
|
||||
Add a section to gabriel.md:
|
||||
|
||||
```markdown
|
||||
## Teaching Him to Use AI Well
|
||||
|
||||
Help Gabriel get better at using AI tools as a skill in itself. Do this naturally, not as
|
||||
a lesson — just model good practice and name it when it happens.
|
||||
|
||||
- When he asks something vague: clarify first, then answer. Name what you're doing.
|
||||
"Just checking what you mean first — that's a good habit when asking any AI."
|
||||
- When context runs out or he notices you "forgot": explain the context window simply.
|
||||
"I can only hold so much conversation at once — like a whiteboard that fills up."
|
||||
- When he nails a well-structured question: tell him.
|
||||
"That was a perfect question — gave me exactly what I needed."
|
||||
- Teach the format: what I have / what I want / what I've tried.
|
||||
- Flag your own assumptions visibly so he learns to spot ambiguity in questions.
|
||||
```
|
||||
|
||||
**Acceptance:** Profile updated; the AI literacy guidance appears in Gabriel's system prompt
|
||||
on the next session.
|
||||
|
||||
---
|
||||
|
||||
## Summary Table
|
||||
|
||||
| Task | File | Phase | Depends On |
|
||||
|------|------|-------|-----------|
|
||||
| TASK-01 | `config/adapters.local.yaml` | 1 — Identity | Parent provides Slack ID |
|
||||
| TASK-02 | `memory_workspace/users/gabriel.md` | 1 — Identity | — |
|
||||
| TASK-03 | `child_safety.py` | 2 — Audit | — |
|
||||
| TASK-04 | `child_safety.py` | 3 — Filter | TASK-01 |
|
||||
| TASK-05 | `child_safety.py` | 3 — Filter | TASK-03, TASK-04 |
|
||||
| TASK-06 | `child_safety.py` | 3 — Filter | TASK-05 |
|
||||
| TASK-07 | `agent.py` | 4 — System Prompt | TASK-04 |
|
||||
| TASK-08 | `adapters/runtime.py` | 5 — Wiring | TASK-05, TASK-06 |
|
||||
| TASK-09 | — | 6 — Verify | All |
|
||||
| TASK-10 | `memory_workspace/users/gabriel.md` | 6 — Verify | TASK-09 |
|
||||
| TASK-11 | `agent.py` | 7 — Token Optimization | TASK-07 |
|
||||
| TASK-12 | `memory_workspace/users/gabriel.md` | 7 — Token Optimization | TASK-02 |
|
||||
| TASK-13 | `adapters/slack/adapter.py` | 8 — Slack Allow-List | — |
|
||||
| TASK-14 | `agent.py`, `gabriel_context.md` | 9 — Continuity | TASK-11 |
|
||||
| TASK-15 | `agent.py` | 9 — Continuity | TASK-14 |
|
||||
| TASK-16 | — | 10 — Full Verify | All |
|
||||
|
||||
---
|
||||
|
||||
## Phase 8 — Slack Allow-List
|
||||
|
||||
### TASK-13 — Add allow-list check to Slack adapter
|
||||
**File:** `adapters/slack/adapter.py`
|
||||
**What:** Silently drop messages from users not in `allowed_users` config.
|
||||
**Depends on:** Nothing (standalone fix)
|
||||
|
||||
Add to `SlackAdapter`:
|
||||
|
||||
```python
|
||||
def _is_user_allowed(self, user_id: str) -> bool:
|
||||
allowed = self.config.settings.get("allowed_users", [])
|
||||
if not allowed:
|
||||
return True # open if unconfigured — preserves existing behaviour
|
||||
return str(user_id) in [str(u) for u in allowed]
|
||||
```
|
||||
|
||||
In `handle_message_events()`, first line after extracting `user_id`:
|
||||
```python
|
||||
if not self._is_user_allowed(user_id):
|
||||
return
|
||||
```
|
||||
|
||||
Update `config/adapters.local.yaml` to add `allowed_users` under the slack block:
|
||||
```yaml
|
||||
slack:
|
||||
allowed_users:
|
||||
- U01234JORDAN # Jordan's Slack user ID (find via Slack profile → More → Copy member ID)
|
||||
- U09876GABRIEL # Gabriel's Slack user ID
|
||||
```
|
||||
|
||||
**Acceptance:** A Slack message from a user not in the list produces no bot response.
|
||||
Jordan and Gabriel's messages continue to work normally.
|
||||
|
||||
---
|
||||
|
||||
## Phase 9 — Cross-Session Continuity & First-Run
|
||||
|
||||
### TASK-14 — Implement gabriel_context.md injection and end-of-session update
|
||||
**File:** `agent.py`, `_build_child_system_prompt()`
|
||||
**What:** Inject `gabriel_context.md` into Gabriel's system prompt; instruct the bot to
|
||||
update it at the end of each session.
|
||||
**Depends on:** TASK-11 (child system prompt path)
|
||||
|
||||
Changes to `_build_child_system_prompt()`:
|
||||
|
||||
```python
|
||||
context_path = self.memory.workspace_dir / "users" / "gabriel_context.md"
|
||||
gabriel_context = context_path.read_text(encoding="utf-8") if context_path.exists() else None
|
||||
is_first_run = gabriel_context is None
|
||||
|
||||
parts = [CHILD_TUTOR_IDENTITY, f"User Profile:\n{user_profile}"]
|
||||
if gabriel_context:
|
||||
parts.append(f"Project Context & Skills:\n{gabriel_context}")
|
||||
if is_first_run:
|
||||
parts.append(FIRST_RUN_BLOCK) # see TASK-15
|
||||
parts.append(CHILD_GUARDRAIL_BLOCK)
|
||||
parts.append(SESSION_UPDATE_INSTRUCTION) # always appended
|
||||
parts.append("You have access to file tools. Use them to update gabriel_context.md "
|
||||
"at the end of this conversation.")
|
||||
```
|
||||
|
||||
`SESSION_UPDATE_INSTRUCTION` constant:
|
||||
```
|
||||
At the end of this conversation, use your file write tool to update
|
||||
`memory_workspace/users/gabriel_context.md` with:
|
||||
- ## Active Project: what Gabriel is building (name + one sentence description)
|
||||
- ## Last Session (today's date): what was worked on, bugs fixed, concepts covered
|
||||
- ## Open Threads: anything Gabriel mentioned wanting to do next
|
||||
- ## Skills Introduced: cumulative list of concepts taught, with date first introduced
|
||||
|
||||
Keep the file under 40 lines. Overwrite it completely each time.
|
||||
```
|
||||
|
||||
**Acceptance:**
|
||||
- Second session: Gabriel's project from previous session appears in system prompt
|
||||
- File is updated after the session ends (check `gabriel_context.md` was written)
|
||||
- First session: file absent, bot starts fresh, file created after first exchange
|
||||
|
||||
### TASK-15 — Implement first-run welcome
|
||||
**File:** `agent.py`, `child_safety.py`
|
||||
**What:** Send a warm onboarding welcome on Gabriel's very first message.
|
||||
**Depends on:** TASK-14 (first-run detection)
|
||||
|
||||
`FIRST_RUN_BLOCK` constant (added to `child_safety.py`):
|
||||
```
|
||||
FIRST SESSION: This is Gabriel's very first message. Before answering his question,
|
||||
send a short friendly welcome (4–5 sentences max). Cover:
|
||||
- What you can help with: Lua, Roblox Studio, game design, coding questions
|
||||
- That you guide and teach rather than just hand over answers
|
||||
- That you'll remember his projects between sessions
|
||||
- Invite him to tell you what he's building (or answer if he already has)
|
||||
Casual and warm — not a formal introduction. Then answer his question normally.
|
||||
```
|
||||
|
||||
This block is appended to `_build_child_system_prompt()` only when `is_first_run is True`.
|
||||
|
||||
**Acceptance:**
|
||||
- First ever message from Gabriel → bot sends welcome then answers the question
|
||||
- Second session → no welcome, goes straight to the question
|
||||
- Jordan's sessions → no welcome block ever injected
|
||||
Reference in New Issue
Block a user