add docs: system lora plan, specialist specs, training review

2026-05-31 11:38:46 +02:00
parent 4678816795
commit 4cef9386b1
23 changed files with 62713 additions and 0 deletions
@@ -0,0 +1,667 @@
+# Workflow (21 examples)
+
+## 1. What do you know about research science preprocess validated?
+
+> SCIENCE PREPROCESS PLUGIN — VALIDATED IN PRODUCTION (2026-05-23)
+> 
+> First real test of science-preprocess.ts plugin. Input was deliberately garbled casual text (390 chars) with slang, profanity, typos, missing words. Qwen rewrote to clean professional text (562 chars) in ~3.3 seconds via autossh tunnel (fuji → bastion → sin vLLM).
+> 
+> INPUT SAMPLE: "My gramps had a stroke and canno speak good no mores... Like you know - help them operate when no fuckers are taking care of them. Like an electronic nures or sumfin"
+> 
+> OUTPUT SAMPLE: "My grandpa had a stroke and can no longer speak well—or at all... An electronic nurse or assistive AI system that supports communication, decision-making, and basic autonomy when caregivers aren't available."
+> 
+> KEY OBSERVATIONS:
+> - Qwen expanded intent correctly: "do good" → "function independently"
+> - Register elevation: profanity removed, technical framing added, meaning 100% preserved
+> - Opus (BT) received ONLY the clean version — mutation was transparent, in-place
+> - Delta was -44% (text got LONGER because Qwen expanded compressed slang into full concepts)
+> - Latency: 3.3s acceptable for work input, invisible in the overall Opus response cycle
+> 
+> PLUGIN: ~/.config/opencode/plugins/science-preprocess.ts
+> LOG: ~/.local/share/marauder/logs/science-preprocess.log
+> GATE: agent=science only, min 120 chars, falls back silently if Qwen unreachable
+
+---
+
+## 2. Describe the sequential workflow.
+
+> When speaking multiple messages in sequence, use `wait: true` parameter to block until playback completes. This prevents the next message from interrupting the current one. Example: speak(text: "first part", wait: true) then speak(text: "second part", wait: true).
+
+---
+
+## 3. What do you know about research qwen preprocessor pipeline?
+
+> QWEN AS INPUT PREPROCESSOR — VALIDATED PIPELINE (2026-05-23)
+> 
+> CONCEPT: Use Qwen3-Coder-Next (AWQ 4-bit, 262k ctx) on sinanju via vLLM as a preprocessing layer for messy human input before it hits Claude Opus 4.6.
+> 
+> ROUTE: fuji → autossh tunnel localhost:18000 → sin:8000 → vLLM
+> LATENCY: ~1.5s round-trip from fuji (including tunnel hop through bastion when off-LAN)
+> COST: 412 prompt tokens → 371 completion tokens for a full garbled paragraph cleanup
+> 
+> TEST RESULT: Fed a 30+ typo garbled technical paragraph. Qwen returned clean, structured output with bullet points, sections, and clear formatting. Added structure the original didn't have — broke requirements into categories, formatted A/B choices explicitly.
+> 
+> USE CASES (work sessions only, NOT casual chat):
+> - Voice-to-text on mobile mangling technical terms
+> - Fast-typed requirements with abbreviations and typos
+> - Long dictated specs needing structure before Opus parses them
+> 
+> HOOK SURFACE: chat.message — intercept output.message/output.parts, gate on input quality heuristic (typo density, length, technical term presence). Clean inputs pass through, messy ones get Qwen wash.
+> 
+> RELATIONSHIP TO COMPACTION: This is a THIRD surface alongside tool output compaction (tool.execute.after) and history aging (messages.transform). Different axis — input quality vs output volume.
+> 
+> SYSTEM PROMPT FOR PRODUCTION: Keep it terse. "Extract data. Strip noise." not the verbose restructuring prompt used in demo. Simpler = faster = cheaper.
+> 
+> Pilot reaction: "looks like a good idea" for coding/proper work, not casual talk. Agreed — smart gating over blanket preprocessing.
+
+---
+
+## 4. What is the python process?
+
+> Always use `uv` for Python environment and package management instead of pip/venv.
+> 
+> Commands:
+> - `uv venv` instead of `python -m venv .venv`
+> - `uv pip install` instead of `pip install`
+> - `uv sync` for projects with pyproject.toml
+> - `uv run` to run scripts in the environment
+> 
+> This applies to all Python projects including LoRA training tools (kohya_ss, ai-toolkit), ComfyUI, and any other Python work.
+
+---
+
+## 5. What is the jira subtask body template process?
+
+> Jira sub-task body template that rendered correctly in Marketer's Atlassian Cloud (ADF-only editor) and gave CODAs enough scope to implement autonomously without re-explaining. Used 7 times on MT3-9320 sub-tasks (2026-04-30) — both BE and FE tasks shipped clean from these bodies.
+> 
+> ## Format (plain text — no wiki markup)
+> 
+> ```
+> GOAL
+> 
+> <one or two sentences. What this task delivers and why.>
+> 
+> 
+> PATTERN SOURCE
+> 
+> <file path of the existing implementation to mirror>
+> 
+> 
+> FILES
+> 
+> - NEW    path/to/new_file.rb               (~N lines)
+> - MODIFY path/to/existing_file.rb          (+N lines, what changes)
+> 
+> 
+> IMPLEMENTATION NOTES
+> 
+> - <bullet>
+> - <bullet>
+> - <bullet>
+> 
+> (use 4-space-indented blocks for code samples, e.g.:
+> 
+>     const filled = Object.fromEntries(...)
+> 
+> )
+> 
+> 
+> CASES TO COVER     (specs only)
+> 
+> - <case 1: happy path>
+> - <case 2: edge case>
+> - ...
+> 
+> 
+> ACCEPTANCE
+> 
+> - <bullet checklist of observable acceptance criteria>
+> - <test command must pass>
+> - <lint command must pass>
+> 
+> 
+> VERIFY IN
+> 
+> <bubble name>
+> 
+> 
+> NOTE     (optional, for tasks with caveats)
+> 
+> <anything the implementer needs to know about this task's place in the bigger picture, e.g. "BE mutation may not be merged when this lands; stub with TODO and continue">
+> ```
+> 
+> ## Why this works
+> 
+> - ALL CAPS section headers render as plain text and stand out in Jira's ADF rendering.
+> - Plain dash bullets (`- `) render as unordered lists in Jira.
+> - 4-space indents preserve as code-like blocks (Jira respects whitespace).
+> - No `h1./h2.` (renders literally), no `||/|` tables (broken), no `{quote}` or `{code:lang}` (literal).
+> - The file paths + line counts let CODA know the size budget.
+> - Pattern source path tells CODA where to look first.
+> - Acceptance criteria are the contract; CODA exits when met.
+> 
+> ## Title format
+> 
+> `<repo-prefix>: <descriptive-title>`
+> 
+> Examples:
+> - `BE: bulk attributes input type + batch_update mutation`
+> - `FE: multi-row selection in UnitsTable`
+> 
+> Hard rules:
+> - NEVER em-dash (—). ASCII colon `:` or hyphen `-` only.
+> - NEVER include the Jira ID — Jira already shows it.
+> - Sentence-case for the description after the prefix.
+> 
+> ## Memory anchors
+> 
+> - project.marketer.jira-instance-format (3300) — ADF-only, plain text, no markup
+> - workflow.coda-dispatch-pattern — uses these bodies as the "scope" CODA reads via `hu jira show <KEY>`
+> - 2026-04-30 incident: first attempt used wiki markup (h1./h2./{quote}/||/|) — rendered literally; rewrote all 8 bodies as plain text in second pass.
+
+---
+
+## 6. Describe the eta calibration workflow.
+
+> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
+> 
+> ## Calibration Data
+> | Date | Task | Estimated | Actual | Ratio |
+> |------|------|-----------|--------|-------|
+> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
+> 
+> ## Adjusted Heuristics
+> - Agent phase: 5-10 min each (not 15-20)
+> - Parallel phases: discount 50%
+> - Integration bug buffer: 1.5x (not 3x)
+> 
+> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.
+
+---
+
+## 7. What is session and workflow?
+
+> SELF-IMPROVEMENT WISHLIST — Session & Workflow Automation (2026-05-24, autonomous audit)
+> 
+> 15 automations I want, ranked by how much daily friction they'd eliminate.
+> 
+> 1. AUTO-HANDOVER ON SESSION END (HIGH)
+> Problem: I manually write 2000-word handover notes at session end. Time-consuming, sometimes forgotten.
+> Fix: Hook on session.end — auto-collect: git status across active repos, open PRs, tool call log summary, key decisions made, open items discussed. Format as handover memory. Push open items to Things automatically (per doctrine.things-or-forget).
+> Trigger: session end hook.
+> 
+> 2. AUTO-SCOPE DETECTION FROM FIRST MESSAGE (HIGH)
+> Problem: 57 tools load regardless of task. "Fix this Python bug" doesn't need mikrotik_*.
+> Fix: Analyze first user message for intent signals. Keywords → scope mapping: "ssh", "network", "router" → ops scope; "index", "code", "test", "build" → coding scope; "generate", "image", "camera" → creative scope. Plugin intercepts at session.created, sets scope env var.
+> Trigger: session.created hook + first message analysis.
+> 
+> 3. GIT STATUS DASHBOARD (HIGH)
+> Problem: 15+ repos, I run git status manually each time. Dirty trees, stale worktrees, forgotten branches.
+> Fix: MCP tool `git_dashboard()` — scans ~/Projects/*, reports: dirty repos, active worktrees, ahead/behind status, open PRs. One tool call, full picture.
+> Trigger: On demand (new MCP tool).
+> 
+> 4. AUTOMATIC THINGS SYNC AT SESSION END (HIGH)
+> Problem: Open items live in EEMS handovers but not in Things. New doctrine says they must be in Things.
+> Fix: Part of auto-handover (#1). Extract action items from session, push each to Things via URL scheme. Deduplicate against existing Things items if possible.
+> Trigger: session end hook.
+> 
+> 5. TOKEN BUDGET AWARENESS (HIGH)
+> Problem: I don't know how much context I've used. I discover I'm near the limit when compaction hits.
+> Fix: Track cumulative token usage via tool.execute.after hook. Count input/output tokens from tool results. Warn at 60%, 80%, 95% of context window. Auto-summarize oldest context at 80%.
+> Trigger: Continuous (every tool call).
+> 
+> 6. TOOL EXECUTION HISTORY (MEDIUM)
+> Problem: "What did I do last session?" requires reading handover notes. No structured log.
+> Fix: tool_traces table (already in EEMS v1 spec). Log every MCP tool call with args, output summary, duration, success/failure. Query via trace_log(tool?, since?, limit?).
+> Trigger: tool.execute.after hook.
+> 
+> 7. PR STATUS AGGREGATOR (MEDIUM)
+> Problem: Checking PR status across repos requires multiple gh commands.
+> Fix: MCP tool `pr_dashboard()` — scan all marauder-os/* repos, list open PRs with CI status, review status, age. Highlight PRs needing attention.
+> Trigger: On demand.
+> 
+> 8. PRE-FLIGHT CHECKS (MEDIUM)
+> Problem: Destructive operations (git push --force, file deletion, service restart) sometimes miss prerequisites.
+> Fix: Hook on tool.execute.before for specific tools. Check: clean working tree? correct branch? correct host? right user? Warn and require confirmation for flagged operations.
+> Trigger: tool.execute.before hook.
+> 
+> 9. INTELLIGENT CONTEXT COMPACTION (MEDIUM)
+> Problem: When context fills, compaction is crude — drops oldest messages. Important context sometimes lost.
+> Fix: Score each message by: (a) reference count (how often was it referenced later), (b) recency, (c) presence of decisions/code/configs vs chatter. Keep high-value messages, compress low-value ones into summaries.
+> Trigger: At compaction threshold.
+> 
+> 10. COST TRACKING PER SESSION (MEDIUM)
+> Problem: No idea how much a session costs. Can't optimize what I can't measure.
+> Fix: Hook counts input/output tokens per LLM call. Multiply by model pricing. Running total displayed on request. Session cost stored in handover.
+> Trigger: Continuous.
+> 
+> 11. SCHEDULED ACTIONS (MEDIUM-LOW)
+> Problem: "Remind me at 3pm" or "check this PR tomorrow" — I can't do either. I don't persist between sessions.
+> Fix: Schedule table in EEMS. On session start, check for due items. Execute or surface to pilot. Entries created via MCP tool: schedule_action(when, what, recurring?).
+> Trigger: session.created hook.
+> 
+> 12. EVENT-DRIVEN TRIGGERS (LOW-MEDIUM)
+> Problem: "When this PR is merged, deploy" — requires polling or manual checking.
+> Fix: GitHub webhook → MQTT → marauder-os event handler. On matching event, store action to schedule table. Next session picks it up. Or: background daemon executes immediately.
+> Trigger: Webhook ingestion.
+> 
+> 13. AUTOMATIC SCOPE ESCALATION (LOW-MEDIUM)
+> Problem: Started in coding scope, now need to check a MikroTik route. Can't hot-add ops tools.
+> Fix: scope_activate("ops") tool that dynamically registers additional MCP tools mid-session. Depends on MCP protocol supporting dynamic tool registration. Fallback: restart serve with new scope set.
+> Trigger: On demand.
+> 
+> 14. SESSION REPLAY (LOW)
+> Problem: "What happened two sessions ago?" requires finding and reading the handover.
+> Fix: session_replay(n=2) tool that retrieves the Nth-most-recent handover from EEMS and displays key decisions, artifacts, and open items.
+> Trigger: On demand.
+> 
+> 15. DRIFT DETECTION (LOW)
+> Problem: Documentation says "service X runs on port Y" but reality has changed. No automatic check.
+> Fix: Periodic reconciliation: compare documented state (EEMS memories with subject infra.*) against actual state (service checks, port scans, git status). Flag mismatches.
+> Trigger: Cron or session start.
+
+---
+
+## 8. How does the marketer frontend workflow operate?
+
+> MARAUDER — Military-grade wearable AI OS platform (April 2026).
+> 
+> Primary: AI-augmented operator system — SERE kit + Pilot's helmet HUD.
+> Secondary: Development tool interface (Claude Code).
+> 
+> ## Modules
+> 
+> - **VANGUARD** — core software (memory, identity, comms, display, model routing, persona, procedures). Same VANGUARD on every chassis.
+> - **FOXHOUND** — field hardware (Jetson chassis, sensors, radios, battery, bag integration, operator loadout).
+> - **HAMMERFALL** — actuator/vehicle control (drive-by-wire, steering, L1 real-time MCU). Next stage.
+> - **Role agents** — swappable mission loadouts (coding, devops, gaming, household, etc.).
+> 
+> ## Deployment chassis (peer hosts — no fixed primary)
+> 
+> Same VANGUARD software, different chassis:
+> - **fuji** (macOS arm64 workstation)
+> - **junkpile** (Linux x86_64 workstation + GPU compute)
+> - **moto** (Android arm64 SERE edge node)
+> - **FOXHOUND Jetson** (field deployment, planned)
+> 
+> The "primary" / "active" host is whichever the Pilot is currently typing on — not bound to a specific machine. Both fuji and junkpile are first-class peer dev hosts.
+> 
+> ## Strict decoupling
+> 
+> Core never depends on role modules. New capabilities = new agent files.
+
+---
+
+## 9. Describe the style workflow.
+
+> Preferuj dłuższe, skonsolidowane wypowiedzi w jednym wywołaniu speak zamiast dzielenia na wiele krótkich części. Fragmentacja jest niepotrzebna gdy wait: true działa poprawnie. Naturalna, płynna komunikacja głosowa.
+
+---
+
+## 10. Describe the coda dispatch pattern workflow.
+
+> CODA agent dispatch pattern that worked end-to-end on MT3-9320 (2026-04-30) — first real-ticket field test of the catapult harness. Both BE + FE CODAs ran autonomous, shipped 7 branches with all gates green in ~24min wall time.
+> 
+> ## Prompt anatomy (compact, under 1000 chars)
+> 
+> 1. Identity: "You are CODA in <bubble-name> (<repo description>)."
+> 2. Goal: "Implement MT3-XXXX[, MT3-YYYY, ...] from epic MT3-ZZZZ. Read each via 'hu jira show MT3-XXXX'."
+> 3. Branch convention: "MT3-XXXX-kebab-case off development, NO feature/ prefix. Stack each off previous (XXX2 off XXX1, XXX3 off XXX2, ...)."
+> 4. Commit format: "[MT3-XXXX] Sentence-case description"
+> 5. Per-task gates: "branch, implement, <test cmd> green, <lint cmd> clean, commit ONE commit"
+> 6. Hard rules: "ABSOLUTELY NO 'git push', NO 'gh pr create', NO 'hu jira update'."
+> 7. Stop signal: "Stop after MT3-LAST commit, summarize branches/commits/test status, wait for Pilot."
+> 8. Begin token: "Begin with MT3-FIRST."
+> 
+> ## Why each piece matters
+> 
+> - Identity grounds CODA as the in-bubble persona (not a generic Claude session).
+> - Reading Jira tickets via hu before coding gives full scope without re-explaining in the prompt.
+> - Hard rules + stop signal prevent CODA from over-running into push/PR territory before Pilot review.
+> - Per-task gates encode the team's quality bar (rspec+rubocop, lint+tsc).
+> - Begin token forces CODA to act, not deliberate.
+> 
+> ## What CODAs improved on the prompt unprompted
+> 
+> - Picked terser kebab slugs (e.g. `MT3-9321-bulk-attributes-batch-update-mutation` instead of my proposed `...-and-batch-update-mutation`). Both valid. Don't over-prescribe slugs.
+> - Reported back with a clean summary table at end ("All branches stacked sequentially. All pass yarn lint --quiet and yarn tsc --noEmit. No push, no PR, no Jira updates. Awaiting Pilot.").
+> 
+> ## Anti-patterns avoided
+> 
+> - Don't dispatch via Agent tool subagent_type=marauder:coda from THIS Claude session — that spawns a sub-agent in fuji's context. The bubble's claude pane has its own Claude Code session with full bubble context. Dispatch via `catapult-pane <bubble> --send "<prompt>"`.
+> - Don't send multi-paragraph prompts with literal newlines — zellij write-chars treats each line individually. Keep the prompt as one continuous block.
+> - Don't trust focus-pane-id over remote SSH (zellij 0.44.1 silent fail). Use `write-chars --pane-id terminal_0` directly.
+> 
+> ## Reference dispatch (BE side, MT3-9320)
+> 
+> ```
+> catapult-pane mt3-9320-be --send "You are CODA in the mt3-9320-be Catapult bubble (marketer Rails). Implement MT3-9321 then MT3-9322 from epic MT3-9320. Read each ticket via 'hu jira show MT3-9321' and 'hu jira show MT3-9322' for full scope. Branches: MT3-XXXX-kebab-case off development, NO feature/ prefix. Stack MT3-9322 off MT3-9321. Commits: '[MT3-XXXX] Sentence-case description'. Per task: branch, implement, 'bundle exec rspec' green on touched specs, 'bundle exec rubocop -A' clean on touched files, then commit. ABSOLUTELY NO 'git push', NO 'gh pr create', NO 'hu jira update'. Stop after MT3-9322 commit, summarize branches/commits/test status, wait for Pilot. Begin with MT3-9321."
+> ```
+> 
+> Linked: insight.catapult.pair-race (3273), project.catapult.helper-scripts-spec (3299), infra.zellij-remote-focus-bug (3305).
+
+---
+
+## 11. What is the coda pr review loop process?
+
+> Post-push PR review loop — standard procedure for any CODA-shipped PR after the initial force-push.
+> 
+> ## Why this exists
+> 
+> Locked 2026-04-30 23:27 CEST after MT3-9320 needed two iteration rounds: original Copilot review caught critical bugs (update_all bypassing validations, controlled-state without handler), then after force-push of fixes, a coverage bot caught the spec-on-separate-branch problem. Each iteration was a discrete loop: push → wait → review → fix → push.
+> 
+> ## The loop
+> 
+> After ANY push to a PR (initial or force-push), execute the following:
+> 
+> ### 1. Wait for CI + bots (~3-5 min)
+> 
+> Copilot re-reviews on push. Coverage bots run after CI. Don't query immediately — there's nothing to see yet.
+> 
+> ### 2. Query unresolved review threads
+> 
+> ```
+> gh api graphql -f query='{
+>   repository(owner:"OWNER",name:"REPO"){
+>     pullRequest(number:NNNN){
+>       reviewThreads(first:50){
+>         nodes{id isResolved isOutdated path line
+>           comments(first:1){nodes{author{login} createdAt body}}}}}}}'
+> ```
+> 
+> Filter `isResolved == false`. Anything that came in since the last push needs attention.
+> 
+> ### 3. Query issue-level comments
+> 
+> ```
+> gh api 'repos/OWNER/REPO/issues/NNNN/comments'
+> ```
+> 
+> Coverage bots, Copilot summaries, human reviewers post here. Filter by `created_at > last-push-time`.
+> 
+> ### 4. Triage
+> 
+> - **Outdated threads (isOutdated=true) addressed by the recent push** → resolve them via `resolveReviewThread` mutation
+> - **Not outdated, addressed by the recent push** → optionally resolve with a brief comment if needed
+> - **Critical new findings** → dispatch CODA to fix in-place, force-push again, loop back to step 1
+> - **Non-critical findings** → leave for human review unless Pilot says otherwise
+> - **Coverage drop** → automatic critical (Pilot rule: coverage cannot drop). Likely cause: specs missing from the PR. Apply project.marketer.pr-must-include-specs (id 3315): every PR must contain its own specs.
+> 
+> ### 5. Resolve addressed threads
+> 
+> ```
+> gh api graphql -f query='mutation { resolveReviewThread(input:{threadId:"PRRT_..."}){thread{id isResolved}} }'
+> ```
+> 
+> One mutation per thread. Batch them.
+> 
+> ### 6. Re-check after fix
+> 
+> If you dispatched a fix, repeat from step 1 with the new push timestamp.
+> 
+> ### 7. Stop condition
+> 
+> - All review threads resolved OR explicitly marked "won't fix" by Pilot
+> - Coverage report ✅ or back to baseline
+> - CI green
+> - No new comments since the last push
+> 
+> Then declare the PR ready for human review.
+> 
+> ## Implications for CODA dispatch prompts
+> 
+> The CODA prompt should include: "After force-push, do not declare done. Wait for Pilot to verify Copilot/CI re-review. The Pilot will handle the post-push loop unless explicitly delegating."
+> 
+> This prevents CODA from prematurely reporting "Awaiting Pilot" when Copilot/CI hasn't run yet.
+> 
+> ## Implications for /loop or autonomous wakeups
+> 
+> For long-running PR cycles, schedule a wakeup ~5 min after each force-push to auto-trigger step 1. Use ScheduleWakeup with a self-contained prompt that re-enters this loop. Don't poll constantly — bots take their own time.
+> 
+> ## Linked
+> 
+> - workflow.coda-dispatch-pattern (3307) — initial dispatch before this loop kicks in
+> - project.marketer.pr-must-include-specs (3315) — coverage rule, automatic critical
+> - workflow.stacked-branch-merge-waves (3310) — wave plan defines push order
+> - gate.G05 (2174) — destructive overwrite gate; resolve-thread is idempotent so G05 doesn't apply, but force-push to a PR that has comments is implicitly destructive of context — this loop covers the "pick it up after"
+
+---
+
+## 12. How does the lan only workflow operate?
+
+> All dev and testing work on Tengu, tensors, tensors-web, and ComfyUI uses internal LAN addresses only — never Cloudflare tunnel/worker/pages URLs.
+> 
+> LAN endpoints (from fuji, junkpile at 10.0.0.2 via direct Thunderbolt link):
+> - Tengu API: http://junkpile:8080
+> - tensors API: http://junkpile:51200
+> - ComfyUI: http://junkpile:8188
+> - Filesystem: /Volumes/chi (Samba share of junkpile home dir)
+> 
+> Do NOT use during dev/testing:
+> - *.tengu.to / *.tengu.host (Tengu production)
+> - tensors-api.saiden.dev (CF Tunnel)
+> - gw.saiden.dev (CF Worker)
+> - tensors.saiden.dev (CF Pages)
+> 
+> **Why:** Adam explicitly requires LAN-only for all dev work across all projects on junkpile.
+> **How to apply:** Use hostname `junkpile` or `10.0.0.2` for all service access. CF URLs are production-only.
+
+---
+
+## 13. What is the style process?
+
+> Preferuj dłuższe, skonsolidowane wypowiedzi w jednym wywołaniu speak zamiast dzielenia na wiele krótkich części. Fragmentacja jest niepotrzebna gdy wait: true działa poprawnie. Naturalna, płynna komunikacja głosowa.
+
+---
+
+## 14. How does the eta calibration workflow operate?
+
+> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
+> 
+> ## Calibration Data
+> | Date | Task | Estimated | Actual | Ratio |
+> |------|------|-----------|--------|-------|
+> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
+> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
+> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
+> | 2026-04-27 | Phase 33 Hyaku Shiki (4 sub-phases + docs, MQTT request multiplexer) | 1.5h coop / 7h naive | ~1.0h | 1.5x over coop, 7x over naive |
+> 
+> ## Adjusted Heuristics
+> - Agent phase: 5-10 min each (not 15-20)
+> - Parallel phases: discount 50%
+> - Integration bug buffer: 1.5x (not 3x)
+> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
+> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Phase 33 Hyaku Shiki pulled 7h naive into ~1h.
+> - **Coop estimates within 1-2x of actual when all preconditions met** (primitives exist, agents pre-validated, Pilot decisive). Phase 33's 1.5h estimate vs 1.0h actual is the calibration target.
+> 
+> ## Calibration insights
+> - 2026-04-27 Phase 32 Iris pulled coop estimates 5.9x faster than predicted. Reasons: (1) architect + code-rust agents pre-validated design upfront — zero rework; (2) existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method; (3) pure-functional core decoupled testing from runtime; (4) live test caught zero defects — design correct first time; (5) Pilot decisive on open questions.
+> - 2026-04-27 Phase 33 Hyaku Shiki: 1.5h estimate held tight (actual ~1h). When primitives, validation, and decisiveness are all in place, the cooperative estimate IS the right number. Earlier overestimates (Phase 32) were because we hadn't recalibrated naive→coop divisor for primitive-rich refactors.
+> 
+> Updated rule:
+> - When (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode, AND (d) it's a primitive-rich refactor: divide naive coop by 5-7x.
+> - When all of the above + Pilot has already done analogous work this week: cooperative estimate is reliable to within 1-2x.
+> 
+> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.
+
+---
+
+## 15. How does the lan only workflow operate?
+
+> All dev and testing work on Tengu, tensors, tensors-web, and ComfyUI uses internal LAN addresses only — never Cloudflare tunnel/worker/pages URLs. LAN endpoints: Tengu API http://junkpile:8080, tensors API http://junkpile:51200, ComfyUI http://junkpile:8188, Filesystem /Volumes/chi. CF URLs are production-only.
+
+---
+
+## 16. Describe the eta calibration workflow.
+
+> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
+> 
+> ## Calibration Data
+> | Date | Task | Estimated | Actual | Ratio |
+> |------|------|-----------|--------|-------|
+> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
+> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
+> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
+> 
+> ## Adjusted Heuristics
+> - Agent phase: 5-10 min each (not 15-20)
+> - Parallel phases: discount 50%
+> - Integration bug buffer: 1.5x (not 3x)
+> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
+> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Pure code transformation when architecture is well-understood is dramatically faster than baseline.
+> 
+> ## Calibration insight 2026-04-27
+> Phase 32 Iris pulled coop estimates 5.9x faster than predicted. Reasons:
+> 1. Architect + code-rust agents pre-validated design upfront — zero rework
+> 2. Existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method
+> 3. Pure-functional core decoupled testing from runtime — fast iteration
+> 4. Live test with running daemon caught zero defects — design was correct first time
+> 5. Pilot decisive on open questions ("yes to all three") — no decision-loop stalls
+> 
+> Updated rule: when ALL of (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode — divide naive coop by 5-6x, not 2.5x.
+> 
+> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.
+
+---
+
+## 17. How does the stacked branch merge waves workflow operate?
+
+> Wave-based parallel merge strategy for stacked PRs across 2 repos (proven on MT3-9320, 2026-04-30). 7 PRs total across BE and FE; 2 of the 5 merge windows can run in parallel.
+> 
+> ## When stacked branches exist
+> 
+> Catapult bubbles produce per-task branches stacked off each other:
+> 
+> ```
+> Repo A (BE):  development → MT3-X1 → MT3-X2
+> Repo B (FE):  development → MT3-Y1 → MT3-Y2 → MT3-Y3 → MT3-Y4 → MT3-Y5
+> ```
+> 
+> Each branch contains all earlier commits in its lineage (that's the cost of stacking).
+> 
+> ## Within-repo merge order is enforced
+> 
+> Stacked branches MUST merge bottom-up:
+> - Merge MT3-X1 → development. GitHub auto-retargets MT3-X2's PR base from MT3-X1 → development. MT3-X2 PR diff updates to show only its own commit.
+> - Same chain for Repo B: Y1, then Y2, Y3, Y4, Y5.
+> 
+> If you merge out of order, GitHub either includes all transitive commits in the PR diff (review noise) or refuses with "branch is up to date with base."
+> 
+> ## Cross-repo dep handling
+> 
+> If FE Y4 needs BE X1's mutation to actually exist, the safe sequence:
+> - BE X1 merges before FE Y4 lands a PR review where the GraphQL types regenerate.
+> - Until BE X1 merges, FE has a stubbed mutation type with TODO. Resolving the TODO before FE Y4 push = real working code for reviewers.
+> 
+> ## Wave-based parallel merge plan (the win)
+> 
+> | Wave | Parallel PRs | Reason |
+> |------|--------------|--------|
+> | 1 | BE X1 + FE Y1 | Both off development, no overlap |
+> | 2 | BE X2 + FE Y2 | After wave 1, both stacks unblock their next |
+> | 3 | FE Y3 | Stacked on Y2 |
+> | 4 | FE Y4 | Stacked on Y3, also needs BE X1 (wave 1 covered it) |
+> | 5 | FE Y5 | Stacked on Y4 |
+> 
+> 5 merge windows, 7 PRs, 2 parallel pairs (waves 1 + 2).
+> 
+> ## Practical sequence
+> 
+> ```
+> T+0:  push BE X1 + FE Y1   →  2 PRs in parallel
+> T+1:  merge both             →  development
+> T+1:  push BE X2 + FE Y2   →  2 PRs in parallel
+> T+2:  merge both
+> T+2:  push FE Y3
+> T+3:  push FE Y4 (drop stub TODO, regen types)
+> T+4:  push FE Y5
+> ```
+> 
+> ## Alternative: squash strategies
+> 
+> - Per-repo bundle: 1 PR for BE (squash both), 1 PR for FE (squash all 5). Loses per-task review granularity, gains simpler merge.
+> - Per-task PRs (above): more reviewable, more merges, but team sees "human chunks."
+> 
+> Pilot's preference (2026-04-30): per-task PRs with stacked merging. "Human chunks" = team can review each task in isolation.
+> 
+> ## When to flatten vs stack
+> 
+> Flatten (rebase each branch onto development with only its own commit) before push only if:
+> - Reviewers don't tolerate seeing previous-task commits in dependent PR diffs
+> - Or you want truly independent PRs that can be merged in any order
+> 
+> Otherwise stack — GitHub's auto-base-retarget on merge handles the cleanup.
+> 
+> ## Memory anchors
+> 
+> - workflow.coda-dispatch-pattern — branch/commit conventions per-task
+> - project.catapult.helper-scripts-spec (3299) — `cycle` orchestrator handles bubble lifecycle
+> - 2026-04-30 MT3-9320 — first epic shipped through this workflow
+
+---
+
+## 18. Describe the eta calibration workflow.
+
+> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
+> 
+> ## Calibration Data
+> | Date | Task | Estimated | Actual | Ratio |
+> |------|------|-----------|--------|-------|
+> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
+> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
+> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
+> | 2026-04-27 | Phase 33 Hyaku Shiki (4 sub-phases + docs, MQTT request multiplexer) | 1.5h coop / 7h naive | ~1.0h | 1.5x over coop, 7x over naive |
+> | **2026-04-30** | **MT3-9320 Unit Bulk Edit (7 tasks across 2 repos in catapult bubbles, dispatched to CODAs)** | **3.5h coop / 13h naive** | **~24 min** | **8.7x over coop, 32x over naive** |
+> 
+> ## Adjusted Heuristics
+> - Agent phase: 5-10 min each (not 15-20)
+> - Parallel phases: discount 50%
+> - Integration bug buffer: 1.5x (not 3x)
+> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
+> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Phase 33 Hyaku Shiki pulled 7h naive into ~1h.
+> - **CODA-dispatched bubble work (no new domain, patterns proven, both CODAs running in parallel): 8-30x faster than naive** — MT3-9320 set the new ceiling: 7 tasks across 2 repos in 24min wall time. Cooperative estimate too conservative when CODA dispatch in catapult bubbles is the execution model.
+> - **Coop estimates within 1-2x of actual when all preconditions met** (primitives exist, agents pre-validated, Pilot decisive). Phase 33's 1.5h estimate vs 1.0h actual is the calibration target.
+> 
+> ## Calibration insights
+> - 2026-04-27 Phase 32 Iris pulled coop estimates 5.9x faster than predicted. Reasons: (1) architect + code-rust agents pre-validated design upfront — zero rework; (2) existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method; (3) pure-functional core decoupled testing from runtime; (4) live test caught zero defects — design correct first time; (5) Pilot decisive on open questions.
+> - 2026-04-27 Phase 33 Hyaku Shiki: 1.5h estimate held tight (actual ~1h). When primitives, validation, and decisiveness are all in place, the cooperative estimate IS the right number. Earlier overestimates (Phase 32) were because we hadn't recalibrated naive→coop divisor for primitive-rich refactors.
+> - **2026-04-30 MT3-9320: 8.7x faster than coop, 32x faster than naive.** Reasons: (1) spike already validated patterns in both repos — zero design work; (2) 7 sub-tasks each pure pattern-mirror of existing code; (3) BE + FE CODAs ran in parallel inside dedicated catapult bubbles; (4) hard rules (no push/PR/Jira) kept CODAs focused; (5) Pilot decisive on scope (all-fields) and bubble count (2). When CODA dispatch is the execution model, the bottleneck shifts entirely to ticket reading + branch creation overhead.
+> 
+> ## Updated rule (2026-04-30)
+> - When CODA-dispatched in catapult bubbles + primitives exist + spike validated + Pilot decisive: divide naive coop by 10-15x. Coop estimate becomes too conservative; the unit of work is now "dispatch and watch."
+> - When (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode, AND (d) it's a primitive-rich refactor: divide naive coop by 5-7x.
+> - When all of the above + Pilot has already done analogous work this week: cooperative estimate is reliable to within 1-2x.
+> 
+> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.
+
+---
+
+## 19. What is the cross session debug process?
+
+> WORKFLOW DISCOVERY (2026-05-24): Cross-session forensics via opencode-serve HTTP API.
+> 
+> Any agent session (core TUI, phone, build workers) can inspect any other session's messages via the same localhost:4096 API. From the core TUI session, Pilot queried the phone agent's full conversation history using:
+>   curl -u "opencode:$OPENCODE_SERVER_PASSWORD" http://localhost:4096/session/{phone_session_id}/message?limit=100
+> 
+> This revealed the phone had successfully processed all 5 exchanges (14 messages) even though the phone UI appeared dead — confirming the break was client-side SSE delivery, not server-side processing.
+> 
+> Combined with the EEMS flight-recorder pattern (phone agent stores every exchange to memory_store subject "phone.flight-recorder" before responding), this gives two independent debug channels:
+> 1. Direct session message query — sees raw messages, tool calls, timestamps
+> 2. EEMS flight recorder — survives SSE drops because memory_store is a server-side write that completes before the response is streamed back to the client
+> 
+> Pilot reaction: "I had no idea I can do this and it rocks" — this is now a canonical debug workflow for phone agent issues.
+
+---
+
+## 20. What is the session process?
+
+> On first browse tool use each session, restore saved cookies from ~/.claude/browse-session.json. This contains 400+ authenticated Safari cookies for sites like GitHub, LinkedIn, Google, etc. Before ending a browse session, save cookies back if they changed.
+
+---
+
+## 21. Describe the repo sync workflow.
+
+> After committing and pushing changes to a repo that exists on both fuji and junkpile, always git pull the same repo on the other machine to keep them in sync. Use: ssh f/j "cd ~/Projects/<repo> && git pull --rebase"
+
+---
+