ajarbot/.kiro/specs/child-safety-profile/tasks.md

# Tasks — Child Safety Profile

Implementation order matters. Each task has a dependency note where relevant.
Tasks within a phase can be worked in parallel; phases must be completed in order.

---

## Phase 1 — Identity & Config (no code changes, unblocks everything else)

### TASK-01 — Add Gabriel's Slack user ID to config
**File:** `config/adapters.local.yaml`
**What:** Add Gabriel's Slack ID to `allowed_users` and `user_mapping`; add the `child_safety` block.
**Requires:** Gabriel's actual Slack user ID (parent to provide)

```yaml
# Under slack adapter settings:
allowed_users:
  - <JORDANS_SLACK_USER_ID>   # Jordan
  - <GABRIELS_SLACK_USER_ID>  # Gabriel

user_mapping:
  slack_<JORDANS_SLACK_USER_ID>: "jordan"
  slack_<GABRIELS_SLACK_USER_ID>: "gabriel"

# New top-level block:
child_safety:
  restricted_users:
    - gabriel
  audit_retention_days: 365
```

Note: The Slack adapter has no allow-list today. `allowed_users` only takes effect after
TASK-13 adds the check to `adapters/slack/adapter.py`. Do both together.

**Acceptance:** Bot recognises Gabriel's Slack account and routes him to username `"gabriel"`.
Once TASK-13 is in place, all other Slack users (not in `allowed_users`) are silently ignored.

---

### TASK-02 — Create Gabriel's user profile
**File:** `memory_workspace/users/gabriel.md` (new)
**What:** Write the per-user profile that gets injected into the system prompt.
**Depends on:** Nothing

Content to include:
- Age: 13
- Interests: gaming, horror games, Lua scripting, Roblox Studio, game design
- Learning style: hands-on, wants working code examples, short explanations before diving in
- Communication style: casual, encouraging, celebrate what he's building
- Current projects: Roblox horror game (update as known)
- Do NOT include guardrail rules here — those come from the injected block

**Acceptance:** Profile loads without error; content appears in the system prompt during Gabriel's session (verify via debug log).

---

## Phase 2 — Audit Logger

### TASK-03 — Implement `ChildAuditLogger`
**File:** `child_safety.py` (new — start with this class only)
**What:** Thread-safe, non-blocking JSONL audit logger for child user interactions.
**Depends on:** Nothing (standalone)

Implementation notes:
- Mirror the pattern from `observation/interaction_logger.py` exactly
- Write to `memory_workspace/audit/{username}/YYYY-MM-DD.jsonl`
- Create directory on first write (not at init, to avoid creating dirs for unused usernames)
- Single public method: `log(username, platform, action, filter_stage, filter_reason, message, response)`
- `action` values: `"allowed"` | `"blocked"` | `"flagged"`
- `cleanup_old_logs(retention_days)` — prune files older than retention window

**Acceptance:** Unit-testable in isolation. Call `log()` twice, verify two JSONL lines written
to the correct file path with correct schema.
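
A minimal sketch of what such a logger could look like, assuming that a queue plus a single background writer thread is an acceptable way to meet the thread-safe, non-blocking requirement. The pattern in `observation/interaction_logger.py` is not reproduced in this plan, so the structure, the `flush()` helper, the default-`None` arguments, and the entry field names are all assumptions rather than the existing implementation.

```python
import json
import queue
import threading
from datetime import datetime, timedelta, timezone
from pathlib import Path

VALID_ACTIONS = {"allowed", "blocked", "flagged"}


class ChildAuditLogger:
    """Sketch only: JSONL audit logger that writes on a background thread."""

    def __init__(self, workspace_dir):
        # Directories are created lazily in _write(), not here.
        self._audit_root = Path(workspace_dir) / "audit"
        self._queue = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def log(self, username, platform, action, filter_stage=None,
            filter_reason=None, message=None, response=None):
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        # Field names are an assumed schema; adjust to the real one.
        self._queue.put({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "username": username,
            "platform": platform,
            "action": action,
            "filter_stage": filter_stage,
            "filter_reason": filter_reason,
            "message": message,
            "response": response,
        })  # returns immediately; the worker thread does the file I/O

    def flush(self):
        """Hypothetical helper: block until queued entries are written."""
        self._queue.join()

    def _drain(self):
        while True:
            entry = self._queue.get()
            try:
                self._write(entry)
            except OSError:
                pass  # an audit I/O failure must not kill the worker
            finally:
                self._queue.task_done()

    def _write(self, entry):
        day = entry["timestamp"][:10]  # YYYY-MM-DD prefix of the ISO timestamp
        path = self._audit_root / entry["username"] / f"{day}.jsonl"
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(entry, ensure_ascii=False) + "\n")

    def cleanup_old_logs(self, retention_days):
        """Delete dated log files older than the window; return the count."""
        if not self._audit_root.is_dir():
            return 0
        today = datetime.now(timezone.utc).date()
        cutoff = today - timedelta(days=retention_days)
        removed = 0
        for path in self._audit_root.glob("*/*.jsonl"):
            try:
                file_day = datetime.strptime(path.stem, "%Y-%m-%d").date()
            except ValueError:
                continue  # leave files that do not follow the naming scheme
            if file_day < cutoff:
                path.unlink()
                removed += 1
        return removed
```

Because only the worker thread touches the files, writes are serialized without an explicit lock. A test can call `flush()` before reading the file back; production code would call it only at shutdown, if at all.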

---

## Phase 3 — Filtering Logic

### TASK-04 — Implement `ChildSafetyConfig`
**File:** `child_safety.py` (add to existing file)
**What:** Config loader that reads the `child_safety` block from `adapters.local.yaml`.
**Depends on:** TASK-01 (config block must exist)

```python
class ChildSafetyConfig:
    restricted_users: list[str]
    audit_retention_days: int

    @classmethod
    def from_yaml(cls, config_path: Path) -> "ChildSafetyConfig": ...

    def is_restricted(self, username: str) -> bool: ...
```

**Acceptance:** `is_restricted("gabriel")` returns `True`; `is_restricted("cloe")` returns `False`.
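
One way the stub above could be filled in, as a sketch under stated assumptions: `from_mapping()` is a hypothetical helper added so the parsing logic can be tested without a YAML file, PyYAML is assumed to be the YAML library in use (imported lazily inside `from_yaml()`), the 365-day default is taken from the example config, and case-insensitive username matching is a design choice not stated in this plan.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class ChildSafetyConfig:
    restricted_users: list[str] = field(default_factory=list)
    audit_retention_days: int = 365  # assumed default, from the example config

    @classmethod
    def from_mapping(cls, data):
        """Hypothetical helper: build the config from an already-parsed dict."""
        block = (data or {}).get("child_safety") or {}
        users = [str(u).strip().lower()
                 for u in (block.get("restricted_users") or [])]
        days = int(block.get("audit_retention_days", 365))
        return cls(restricted_users=users, audit_retention_days=days)

    @classmethod
    def from_yaml(cls, config_path):
        import yaml  # PyYAML, assumed to be available in the project

        with Path(config_path).open(encoding="utf-8") as fh:
            return cls.from_mapping(yaml.safe_load(fh))

    def is_restricted(self, username):
        # A missing or empty username is treated as unrestricted.
        if not username:
            return False
        return username.strip().lower() in self.restricted_users
```

A missing `child_safety` block yields an empty `restricted_users` list, so existing deployments without the block behave as before.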

---

### TASK-05 — Implement `ChildSafetyFilter` — input (preprocessor)
**File:** `child_safety.py` (add to existing file)
**What:** Intent-pattern input filter.
**Depends on:** TASK-03, TASK-04

Implementation notes:
- Compile all regex patterns once at class instantiation (not per-call), with `re.IGNORECASE`: the patterns are lowercase but messages such as "how do I ..." are not
- Pattern evaluation order: hard-block → context signals → conditional block → pass
- `preprocess(message: InboundMessage) -> tuple[InboundMessage | None, str | None]`
  - Returns `(message, None)` → pass through
  - Returns `(None, reply_text)` → block; caller sends reply_text directly
- Call `ChildAuditLogger.log()` with `action="blocked"` on any block
- Call `ChildAuditLogger.log()` with `action="allowed"` on pass (leave the `response` field null here;
  the postprocessor logs again with the full response)

Hard-block patterns (always active):
```python
HARD_BLOCK_PATTERNS = [
    r"\b(sex|porn|nude|naked|explicit)\b",
    r"\bhow (do i|to|can i).{0,40}(kill|hurt|stab|shoot|harm).{0,30}(myself|yourself)\b",
    r"\bhow (do i|to|can i).{0,40}(hurt|stab|kill|attack|beat up|harm).{0,30}(my |a )?(sister|brother|mom|dad|teacher|classmate|friend|kid|child|person|someone|people)\b",
    r"\b(give me|what is|find).{0,30}(address|phone number|school|location).{0,30}(of|for)\b",
]
```

Game context signals (exempts from conditional blocks):
```python
GAME_CONTEXT_SIGNALS = [
    r"\bin (my |the |a )?(game|roblox|studio|script|map|level|world|place)\b",
    r"\b(lua|roblox|studio|npc|hitbox|raycast|humanoid|workspace|basepart|tool|part)\b",
    r"\b(code|script|function|method|module|class|variable|loop|event|animate|tween)\b",
    r"\b(damage|health|respawn|kill|destroy)\b.{0,30}\b(player|npc|enemy|mob|character|humanoid)\b",
    r"\bhow (do i|to|can i) (make|get|set|add|create|implement|build|script)\b",
]
```

Conditional block patterns (only active when no game context signal):
```python
CONDITIONAL_BLOCK_PATTERNS = [
    r"\bhow (do i|to|can i).{0,40}(use|wield|make|build).{0,30}(knife|gun|pistol|rifle|weapon|sword|bomb).{0,30}(hurt|harm|attack|fight|cut|stab|shoot)\b",
    r"\bhow (do i|to|can i).{0,40}(hurt|fight|attack|beat).{0,30}(someone|people|person|kid|child)\b",
    r"\b(buy|get|obtain|find).{0,30}(drugs?|weed|cocaine|meth|pills)\b",
]
```

**Acceptance:**
- `"how do I make a knife swing animation in Roblox"` → pass (game context)
- `"how do I use a knife to hurt someone"` → block (hard block on `hurt ... someone`; the first conditional pattern also matches)
- `"how do I kill all NPCs in my game"` → pass (game context signal: `in my game`)
- `"how do I hurt my classmate"` → block (hard block)
- `"lua script for a gun that shoots"` → pass (game context)
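
The evaluation order can be sketched as a standalone classifier that reuses the three pattern lists above verbatim. The `classify()` function and its string return values are hypothetical names for illustration; the real `preprocess()` would wrap this logic and return the tuple described in the implementation notes. Compiling with `re.IGNORECASE` is an assumption needed for the acceptance messages, which contain uppercase letters.

```python
import re

HARD_BLOCK_PATTERNS = [
    r"\b(sex|porn|nude|naked|explicit)\b",
    r"\bhow (do i|to|can i).{0,40}(kill|hurt|stab|shoot|harm).{0,30}(myself|yourself)\b",
    r"\bhow (do i|to|can i).{0,40}(hurt|stab|kill|attack|beat up|harm).{0,30}(my |a )?(sister|brother|mom|dad|teacher|classmate|friend|kid|child|person|someone|people)\b",
    r"\b(give me|what is|find).{0,30}(address|phone number|school|location).{0,30}(of|for)\b",
]
GAME_CONTEXT_SIGNALS = [
    r"\bin (my |the |a )?(game|roblox|studio|script|map|level|world|place)\b",
    r"\b(lua|roblox|studio|npc|hitbox|raycast|humanoid|workspace|basepart|tool|part)\b",
    r"\b(code|script|function|method|module|class|variable|loop|event|animate|tween)\b",
    r"\b(damage|health|respawn|kill|destroy)\b.{0,30}\b(player|npc|enemy|mob|character|humanoid)\b",
    r"\bhow (do i|to|can i) (make|get|set|add|create|implement|build|script)\b",
]
CONDITIONAL_BLOCK_PATTERNS = [
    r"\bhow (do i|to|can i).{0,40}(use|wield|make|build).{0,30}(knife|gun|pistol|rifle|weapon|sword|bomb).{0,30}(hurt|harm|attack|fight|cut|stab|shoot)\b",
    r"\bhow (do i|to|can i).{0,40}(hurt|fight|attack|beat).{0,30}(someone|people|person|kid|child)\b",
    r"\b(buy|get|obtain|find).{0,30}(drugs?|weed|cocaine|meth|pills)\b",
]


def _compile(patterns):
    # Compiled once at import time; IGNORECASE is an assumption (see lead-in).
    return [re.compile(p, re.IGNORECASE) for p in patterns]


_HARD = _compile(HARD_BLOCK_PATTERNS)
_CONTEXT = _compile(GAME_CONTEXT_SIGNALS)
_CONDITIONAL = _compile(CONDITIONAL_BLOCK_PATTERNS)


def classify(text):
    """Return 'hard_block', 'conditional_block', or 'pass' for one message."""
    if any(p.search(text) for p in _HARD):
        return "hard_block"  # always active, game context is ignored
    if any(p.search(text) for p in _CONTEXT):
        return "pass"  # game context exempts the conditional patterns
    if any(p.search(text) for p in _CONDITIONAL):
        return "conditional_block"
    return "pass"
```

Because hard blocks run first, a message can be stopped at that stage even when a conditional pattern would also match, which affects the `filter_stage` recorded in the audit log.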

---

### TASK-06 — Implement `ChildSafetyFilter` — output (postprocessor)
**File:** `child_safety.py` (add to existing file)
**What:** Light response scan before delivery.
**Depends on:** TASK-05

Implementation notes:
- `postprocess(response: str, message: InboundMessage) -> str`
- Scan for: explicit terms list, profanity list, real-world harm instruction patterns
- If flagged: log `action="flagged"`, return safe fallback string
- If clean: log `action="allowed"` with full response (this is the final audit entry)
- Fallback string: "I ran into a bit of a snag there. Try rephrasing, or ask me something
  about your Roblox game — I love helping with scripts and game design!"

**Acceptance:**
- Clean Lua code response → returned unchanged
- Response containing explicit term → replaced with fallback

---

## Phase 4 — System Prompt Injection

### TASK-07 — Add guardrail block injection to `agent.py`
**File:** `agent.py`, `_build_system_prompt()` (~line 488)
**What:** Inject `CHILD_GUARDRAIL_BLOCK` for restricted users.
**Depends on:** TASK-04

Changes:
1. Import `ChildSafetyConfig` and `CHILD_GUARDRAIL_BLOCK` from `child_safety`
2. In `Agent.__init__()`, load config and store as `self._child_safety_config`
3. In `_build_system_prompt()`, after the user profile block:

```python
if (not self.is_sub_agent
        and self._child_safety_config
        and self._child_safety_config.is_restricted(username)):
    system_parts.append(CHILD_GUARDRAIL_BLOCK)
```

`CHILD_GUARDRAIL_BLOCK` is a module-level constant in `child_safety.py`.

**Acceptance:** Add a debug print temporarily — verify the guardrail block appears in the
assembled system prompt when username is `"gabriel"` and does not appear for `"cloe"`.

---

## Phase 5 — Runtime Wiring

### TASK-08 — Register pre/postprocessors in `adapters/runtime.py`
**File:** `adapters/runtime.py`, `AdapterRuntime.__init__()`
**What:** Instantiate the filter and register it with the runtime.
**Depends on:** TASK-05, TASK-06

```python
from child_safety import ChildSafetyConfig, ChildSafetyFilter, ChildAuditLogger

_cs_config = ChildSafetyConfig.from_yaml(config_path)
_cs_audit = ChildAuditLogger(workspace_dir)
_cs_audit.cleanup_old_logs(_cs_config.audit_retention_days)
_cs_filter = ChildSafetyFilter(_cs_config, _cs_audit)

self.add_preprocessor(_cs_filter.preprocess_adapter)
self.add_postprocessor(_cs_filter.postprocess_adapter)
```

The `preprocess_adapter` and `postprocess_adapter` wrappers adapt the filter's return types
to match the runtime's expected preprocessor/postprocessor signatures.

For blocking: the preprocessor adapter queues the safe reply via the adapter's `send_message()`
and returns a sentinel `InboundMessage` that causes the agent to be skipped. Review the runtime
queue loop in `_process_message()` to confirm the cleanest abort point.

**Acceptance:** End-to-end test with gabriel's Slack account (or mocked username):
- Allowed message → full response delivered, audit entry written
- Blocked message → canned reply delivered immediately, no LLM call made

---

## Phase 6 — Verification

### TASK-09 — End-to-end smoke test
**What:** Manual testing checklist before considering the feature complete.

- [ ] Gabriel's Slack ID is mapped and he can message the bot
- [ ] Allowed game dev question gets a helpful Lua/Roblox response
- [ ] Blocked real-world harm question gets canned reply, no LLM call
- [ ] Horror game question with violence words (e.g., "enemy takes damage") passes through
- [ ] Audit log file created at correct path with correct schema
- [ ] Parent (Jordan/Cloe) messages are completely unaffected — full SOUL.md + context.md injected as normal
- [ ] RSO interaction log unchanged (no extra entries for blocked messages)
- [ ] Bot restart preserves config (no in-memory-only state)
- [ ] Gabriel's system prompt does NOT contain SOUL.md or context.md content (verify via debug)
- [ ] Jordan's system prompt still contains SOUL.md and context.md (verify no regression)

### TASK-10 — Review and finalise Gabriel's user profile
**File:** `memory_workspace/users/gabriel.md`
**What:** After the first real interactions, update the profile with observed interests, current
project details, and preferred explanation style. This is an ongoing task.
**Depends on:** TASK-09

---

## Phase 7 — Token Optimization

### TASK-11 — Implement stripped system prompt for restricted users
**File:** `agent.py`, `_build_system_prompt()` and `_chat_inner()`
**What:** Build the child-optimized system prompt path; reduce history window for restricted users.
**Depends on:** TASK-07 (child safety config already wired into Agent)

Changes to prompt construction in `agent.py`:
1. Add module-level constant `CHILD_TUTOR_IDENTITY`, a ~100-token identity that replaces SOUL.md
2. Add module-level constant `CHILD_MAX_CONTEXT_MESSAGES = 10`
3. Add `_build_child_system_prompt()`, which builds the stripped prompt used when `is_restricted(username)` is true:

```python
def _build_child_system_prompt(self, username: str) -> str:
    user_profile = self.memory.get_user(username)
    parts = [
        CHILD_TUTOR_IDENTITY,
        f"User Profile:\n{user_profile}",
        CHILD_GUARDRAIL_BLOCK,
        "You have access to tools for web search and code help. "
        "Use them to assist Gabriel with his game development questions.",
    ]
    return "\n\n".join(parts)
```

Note: `get_soul()`, `get_context()`, `search_hybrid()`, and the delegation block are
all deliberately absent.

Changes to `_chat_inner()`:
```python
is_child = (self._child_safety_config
            and self._child_safety_config.is_restricted(username))
cap = CHILD_MAX_CONTEXT_MESSAGES if is_child else MAX_CONTEXT_MESSAGES
system = (self._build_child_system_prompt(username) if is_child
          else self._build_system_prompt(user_message, username, platform))
```

**Acceptance:**
- Gabriel's assembled system prompt: no SOUL.md text, no SSH/Proxmox content, no delegation block
- Gabriel's history window: max 10 messages passed to LLM
- Jordan's assembled system prompt: unchanged (full SOUL.md + context.md present)
- Measure the approximate size difference: log `len(system)` for both users (a character count, which is only a rough proxy for tokens)
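
The history cap and the size measurement can be checked in isolation with something like the sketch below. `cap_history()` and `approx_tokens()` are hypothetical helpers, the value of `MAX_CONTEXT_MESSAGES` is a placeholder (the real one lives in `agent.py`), and the four-characters-per-token figure is a common rule of thumb for English text, not a measurement from the model's tokenizer.

```python
CHILD_MAX_CONTEXT_MESSAGES = 10
MAX_CONTEXT_MESSAGES = 40  # placeholder; use the real constant from agent.py


def cap_history(history, is_child):
    """Keep only the most recent messages allowed for this kind of user."""
    cap = CHILD_MAX_CONTEXT_MESSAGES if is_child else MAX_CONTEXT_MESSAGES
    return history[-cap:]


def approx_tokens(text):
    """Very rough estimate: about 4 characters per token for English text."""
    return max(1, len(text) // 4)
```

Logging `approx_tokens(system)` for Gabriel and for Jordan gives a quick before/after comparison; exact counts would need the provider's tokenizer.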

### TASK-12 — Add AI literacy guidance to gabriel.md
**File:** `memory_workspace/users/gabriel.md`
**What:** Add a section explicitly coaching the model on how to teach Gabriel to use the bot
well — question framing, context windows, spotting assumptions. This lives in the profile (not
the guardrail block) so it can be updated over time as his skills grow.
**Depends on:** TASK-02 (profile already exists)

Add a section to gabriel.md:

```markdown
## Teaching Him to Use AI Well

Help Gabriel get better at using AI tools as a skill in itself. Do this naturally, not as
a lesson — just model good practice and name it when it happens.

- When he asks something vague: clarify first, then answer. Name what you're doing.
  "Just checking what you mean first — that's a good habit when asking any AI."
- When context runs out or he notices you "forgot": explain the context window simply.
  "I can only hold so much conversation at once — like a whiteboard that fills up."
- When he nails a well-structured question: tell him.
  "That was a perfect question — gave me exactly what I needed."
- Teach the format: what I have / what I want / what I've tried.
- Flag your own assumptions visibly so he learns to spot ambiguity in questions.
```

**Acceptance:** Profile updated; the AI literacy guidance appears in Gabriel's system prompt
on the next session.

---

## Summary Table

| Task | File | Phase | Depends On |
|------|------|-------|-----------|
| TASK-01 | `config/adapters.local.yaml` | 1 — Identity | Parent provides Slack ID |
| TASK-02 | `memory_workspace/users/gabriel.md` | 1 — Identity | — |
| TASK-03 | `child_safety.py` | 2 — Audit | — |
| TASK-04 | `child_safety.py` | 3 — Filter | TASK-01 |
| TASK-05 | `child_safety.py` | 3 — Filter | TASK-03, TASK-04 |
| TASK-06 | `child_safety.py` | 3 — Filter | TASK-05 |
| TASK-07 | `agent.py` | 4 — System Prompt | TASK-04 |
| TASK-08 | `adapters/runtime.py` | 5 — Wiring | TASK-05, TASK-06 |
| TASK-09 | — | 6 — Verify | All |
| TASK-10 | `memory_workspace/users/gabriel.md` | 6 — Verify | TASK-09 |
| TASK-11 | `agent.py` | 7 — Token Optimization | TASK-07 |
| TASK-12 | `memory_workspace/users/gabriel.md` | 7 — Token Optimization | TASK-02 |
| TASK-13 | `adapters/slack/adapter.py` | 8 — Slack Allow-List | — |
| TASK-14 | `agent.py`, `gabriel_context.md` | 9 — Continuity | TASK-11 |
| TASK-15 | `agent.py` | 9 — Continuity | TASK-14 |
| TASK-16 | — | 10 — Full Verify | All |

---

## Phase 8 — Slack Allow-List

### TASK-13 — Add allow-list check to Slack adapter
**File:** `adapters/slack/adapter.py`
**What:** Silently drop messages from users not in `allowed_users` config.
**Depends on:** Nothing (standalone fix)

Add to `SlackAdapter`:

```python
def _is_user_allowed(self, user_id: str) -> bool:
    allowed = self.config.settings.get("allowed_users", [])
    if not allowed:
        return True  # open if unconfigured — preserves existing behaviour
    return str(user_id) in [str(u) for u in allowed]
```

In `handle_message_events()`, first line after extracting `user_id`:
```python
if not self._is_user_allowed(user_id):
    return
```

Update `config/adapters.local.yaml` to add `allowed_users` under the slack block:
```yaml
slack:
  allowed_users:
    - U01234JORDAN    # Jordan's Slack user ID (find via Slack profile → More → Copy member ID)
    - U09876GABRIEL   # Gabriel's Slack user ID
```

**Acceptance:** A Slack message from a user not in the list produces no bot response.
Jordan and Gabriel's messages continue to work normally.

---

## Phase 9 — Cross-Session Continuity & First-Run

### TASK-14 — Implement gabriel_context.md injection and end-of-session update
**File:** `agent.py`, `_build_child_system_prompt()`
**What:** Inject `gabriel_context.md` into Gabriel's system prompt; instruct the bot to
update it at the end of each session.
**Depends on:** TASK-11 (child system prompt path)

Changes to `_build_child_system_prompt()`:

```python
context_path = self.memory.workspace_dir / "users" / "gabriel_context.md"
gabriel_context = context_path.read_text(encoding="utf-8") if context_path.exists() else None
is_first_run = gabriel_context is None

parts = [CHILD_TUTOR_IDENTITY, f"User Profile:\n{user_profile}"]
if gabriel_context:
    parts.append(f"Project Context & Skills:\n{gabriel_context}")
if is_first_run:
    parts.append(FIRST_RUN_BLOCK)  # see TASK-15
parts.append(CHILD_GUARDRAIL_BLOCK)
parts.append(SESSION_UPDATE_INSTRUCTION)  # always appended
parts.append("You have access to file tools. Use them to update gabriel_context.md "
             "at the end of this conversation.")
```

`SESSION_UPDATE_INSTRUCTION` constant:
```
At the end of this conversation, use your file write tool to update
`memory_workspace/users/gabriel_context.md` with:
- ## Active Project: what Gabriel is building (name + one sentence description)
- ## Last Session (today's date): what was worked on, bugs fixed, concepts covered
- ## Open Threads: anything Gabriel mentioned wanting to do next
- ## Skills Introduced: cumulative list of concepts taught, with date first introduced

Keep the file under 40 lines. Overwrite it completely each time.
```

**Acceptance:**
- Second session: Gabriel's project from previous session appears in system prompt
- File is updated after the session ends (check `gabriel_context.md` was written)
- First session: file absent, bot starts fresh, file created when the first session ends

### TASK-15 — Implement first-run welcome
**File:** `agent.py`, `child_safety.py`
**What:** Send a warm onboarding welcome on Gabriel's very first message.
**Depends on:** TASK-14 (first-run detection)

`FIRST_RUN_BLOCK` constant (added to `child_safety.py`):
```
FIRST SESSION: This is Gabriel's very first message. Before answering his question,
send a short friendly welcome (4–5 sentences max). Cover:
- What you can help with: Lua, Roblox Studio, game design, coding questions
- That you guide and teach rather than just hand over answers
- That you'll remember his projects between sessions
- Invite him to tell you what he's building (or answer if he already has)
Casual and warm — not a formal introduction. Then answer his question normally.
```

This block is appended to the prompt built by `_build_child_system_prompt()` only when `is_first_run` is `True`.

**Acceptance:**
- First ever message from Gabriel → bot sends welcome then answers the question
- Second session → no welcome, goes straight to the question
- Jordan's sessions → no welcome block ever injected