# Workflow (21 examples)

## 1. What do you know about research science preprocess validated?

> SCIENCE PREPROCESS PLUGIN — VALIDATED IN PRODUCTION (2026-05-23)
>
> First real test of science-preprocess.ts plugin. Input was deliberately garbled casual text (390 chars) with slang, profanity, typos, missing words. Qwen rewrote to clean professional text (562 chars) in ~3.3 seconds via autossh tunnel (fuji → bastion → sin vLLM).
>
> INPUT SAMPLE: "My gramps had a stroke and canno speak good no mores... Like you know - help them operate when no fuckers are taking care of them. Like an electronic nures or sumfin"
>
> OUTPUT SAMPLE: "My grandpa had a stroke and can no longer speak well—or at all... An electronic nurse or assistive AI system that supports communication, decision-making, and basic autonomy when caregivers aren't available."
>
> KEY OBSERVATIONS:
> - Qwen expanded intent correctly: "do good" → "function independently"
> - Register elevation: profanity removed, technical framing added, meaning 100% preserved
> - Opus (BT) received ONLY the clean version — mutation was transparent, in-place
> - Delta was -44% (text got LONGER because Qwen expanded compressed slang into full concepts)
> - Latency: 3.3s acceptable for work input, invisible in the overall Opus response cycle
>
> PLUGIN: ~/.config/opencode/plugins/science-preprocess.ts
> LOG: ~/.local/share/marauder/logs/science-preprocess.log
> GATE: agent=science only, min 120 chars, falls back silently if Qwen unreachable

---

## 2. Describe the sequential workflow.

> When speaking multiple messages in sequence, use `wait: true` parameter to block until playback completes. This prevents the next message from interrupting the current one. Example: speak(text: "first part", wait: true) then speak(text: "second part", wait: true).

---

## 3. What do you know about research qwen preprocessor pipeline?
> QWEN AS INPUT PREPROCESSOR — VALIDATED PIPELINE (2026-05-23)
>
> CONCEPT: Use Qwen3-Coder-Next (AWQ 4-bit, 262k ctx) on sinanju via vLLM as a preprocessing layer for messy human input before it hits Claude Opus 4.6.
>
> ROUTE: fuji → autossh tunnel localhost:18000 → sin:8000 → vLLM
> LATENCY: ~1.5s round-trip from fuji (including tunnel hop through bastion when off-LAN)
> COST: 412 prompt tokens → 371 completion tokens for a full garbled paragraph cleanup
>
> TEST RESULT: Fed a 30+ typo garbled technical paragraph. Qwen returned clean, structured output with bullet points, sections, and clear formatting. Added structure the original didn't have — broke requirements into categories, formatted A/B choices explicitly.
>
> USE CASES (work sessions only, NOT casual chat):
> - Voice-to-text on mobile mangling technical terms
> - Fast-typed requirements with abbreviations and typos
> - Long dictated specs needing structure before Opus parses them
>
> HOOK SURFACE: chat.message — intercept output.message/output.parts, gate on input quality heuristic (typo density, length, technical term presence). Clean inputs pass through, messy ones get Qwen wash.
>
> RELATIONSHIP TO COMPACTION: This is a THIRD surface alongside tool output compaction (tool.execute.after) and history aging (messages.transform). Different axis — input quality vs output volume.
>
> SYSTEM PROMPT FOR PRODUCTION: Keep it terse. "Extract data. Strip noise." not the verbose restructuring prompt used in demo. Simpler = faster = cheaper.
>
> Pilot reaction: "looks like a good idea" for coding/proper work, not casual talk. Agreed — smart gating over blanket preprocessing.

---

## 4. What is the python process?

> Always use `uv` for Python environment and package management instead of pip/venv.
>
> Commands:
> - `uv venv` instead of `python -m venv .venv`
> - `uv pip install` instead of `pip install`
> - `uv sync` for projects with pyproject.toml
> - `uv run` to run scripts in the environment
>
> This applies to all Python projects including LoRA training tools (kohya_ss, ai-toolkit), ComfyUI, and any other Python work.

---

## 5. What is the jira subtask body template process?

> Jira sub-task body template that rendered correctly in Marketer's Atlassian Cloud (ADF-only editor) and gave CODAs enough scope to implement autonomously without re-explaining. Used 7 times on MT3-9320 sub-tasks (2026-04-30) — both BE and FE tasks shipped clean from these bodies.
>
> ## Format (plain text — no wiki markup)
>
> ```
> GOAL
>
>
>
>
> PATTERN SOURCE
>
>
>
>
> FILES
>
> - NEW path/to/new_file.rb (~N lines)
> - MODIFY path/to/existing_file.rb (+N lines, what changes)
>
>
> IMPLEMENTATION NOTES
>
> -
> -
> -
>
> (use 4-space-indented blocks for code samples, e.g.:
>
>     const filled = Object.fromEntries(...)
>
> )
>
>
> CASES TO COVER (specs only)
>
> -
> -
> - ...
>
>
> ACCEPTANCE
>
> -
> -
> -
>
>
> VERIFY IN
>
>
>
>
> NOTE (optional, for tasks with caveats)
>
>
> ```
>
> ## Why this works
>
> - ALL CAPS section headers render as plain text and stand out in Jira's ADF rendering.
> - Plain dash bullets (`- `) render as unordered lists in Jira.
> - 4-space indents preserve as code-like blocks (Jira respects whitespace).
> - No `h1./h2.` (renders literally), no `||/|` tables (broken), no `{quote}` or `{code:lang}` (literal).
> - The file paths + line counts let CODA know the size budget.
> - Pattern source path tells CODA where to look first.
> - Acceptance criteria are the contract; CODA exits when met.
>
> ## Title format
>
> `: `
>
> Examples:
> - `BE: bulk attributes input type + batch_update mutation`
> - `FE: multi-row selection in UnitsTable`
>
> Hard rules:
> - NEVER em-dash (—). ASCII colon `:` or hyphen `-` only.
> - NEVER include the Jira ID — Jira already shows it.
> - Sentence-case for the description after the prefix.
>
> ## Memory anchors
>
> - project.marketer.jira-instance-format (3300) — ADF-only, plain text, no markup
> - workflow.coda-dispatch-pattern — uses these bodies as the "scope" CODA reads via `hu jira show `
> - 2026-04-30 incident: first attempt used wiki markup (h1./h2./{quote}/||/|) — rendered literally; rewrote all 8 bodies as plain text in second pass.

---

## 6. Describe the eta calibration workflow.

> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
>
> ## Calibration Data
> | Date | Task | Estimated | Actual | Ratio |
> |------|------|-----------|--------|-------|
> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
>
> ## Adjusted Heuristics
> - Agent phase: 5-10 min each (not 15-20)
> - Parallel phases: discount 50%
> - Integration bug buffer: 1.5x (not 3x)
>
> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.

---

## 7. What is session and workflow?

> SELF-IMPROVEMENT WISHLIST — Session & Workflow Automation (2026-05-24, autonomous audit)
>
> 15 automations I want, ranked by how much daily friction they'd eliminate.
>
> 1. AUTO-HANDOVER ON SESSION END (HIGH)
> Problem: I manually write 2000-word handover notes at session end. Time-consuming, sometimes forgotten.
> Fix: Hook on session.end — auto-collect: git status across active repos, open PRs, tool call log summary, key decisions made, open items discussed. Format as handover memory. Push open items to Things automatically (per doctrine.things-or-forget).
> Trigger: session end hook.
>
> 2. AUTO-SCOPE DETECTION FROM FIRST MESSAGE (HIGH)
> Problem: 57 tools load regardless of task. "Fix this Python bug" doesn't need mikrotik_*.
> Fix: Analyze first user message for intent signals.
> Keywords → scope mapping: "ssh", "network", "router" → ops scope; "index", "code", "test", "build" → coding scope; "generate", "image", "camera" → creative scope. Plugin intercepts at session.created, sets scope env var.
> Trigger: session.created hook + first message analysis.
>
> 3. GIT STATUS DASHBOARD (HIGH)
> Problem: 15+ repos, I run git status manually each time. Dirty trees, stale worktrees, forgotten branches.
> Fix: MCP tool `git_dashboard()` — scans ~/Projects/*, reports: dirty repos, active worktrees, ahead/behind status, open PRs. One tool call, full picture.
> Trigger: On demand (new MCP tool).
>
> 4. AUTOMATIC THINGS SYNC AT SESSION END (HIGH)
> Problem: Open items live in EEMS handovers but not in Things. New doctrine says they must be in Things.
> Fix: Part of auto-handover (#1). Extract action items from session, push each to Things via URL scheme. Deduplicate against existing Things items if possible.
> Trigger: session end hook.
>
> 5. TOKEN BUDGET AWARENESS (HIGH)
> Problem: I don't know how much context I've used. I discover I'm near the limit when compaction hits.
> Fix: Track cumulative token usage via tool.execute.after hook. Count input/output tokens from tool results. Warn at 60%, 80%, 95% of context window. Auto-summarize oldest context at 80%.
> Trigger: Continuous (every tool call).
>
> 6. TOOL EXECUTION HISTORY (MEDIUM)
> Problem: "What did I do last session?" requires reading handover notes. No structured log.
> Fix: tool_traces table (already in EEMS v1 spec). Log every MCP tool call with args, output summary, duration, success/failure. Query via trace_log(tool?, since?, limit?).
> Trigger: tool.execute.after hook.
>
> 7. PR STATUS AGGREGATOR (MEDIUM)
> Problem: Checking PR status across repos requires multiple gh commands.
> Fix: MCP tool `pr_dashboard()` — scan all marauder-os/* repos, list open PRs with CI status, review status, age. Highlight PRs needing attention.
> Trigger: On demand.
>
> 8. PRE-FLIGHT CHECKS (MEDIUM)
> Problem: Destructive operations (git push --force, file deletion, service restart) sometimes miss prerequisites.
> Fix: Hook on tool.execute.before for specific tools. Check: clean working tree? correct branch? correct host? right user? Warn and require confirmation for flagged operations.
> Trigger: tool.execute.before hook.
>
> 9. INTELLIGENT CONTEXT COMPACTION (MEDIUM)
> Problem: When context fills, compaction is crude — drops oldest messages. Important context sometimes lost.
> Fix: Score each message by: (a) reference count (how often was it referenced later), (b) recency, (c) presence of decisions/code/configs vs chatter. Keep high-value messages, compress low-value ones into summaries.
> Trigger: At compaction threshold.
>
> 10. COST TRACKING PER SESSION (MEDIUM)
> Problem: No idea how much a session costs. Can't optimize what I can't measure.
> Fix: Hook counts input/output tokens per LLM call. Multiply by model pricing. Running total displayed on request. Session cost stored in handover.
> Trigger: Continuous.
>
> 11. SCHEDULED ACTIONS (MEDIUM-LOW)
> Problem: "Remind me at 3pm" or "check this PR tomorrow" — I can't do either. I don't persist between sessions.
> Fix: Schedule table in EEMS. On session start, check for due items. Execute or surface to pilot. Entries created via MCP tool: schedule_action(when, what, recurring?).
> Trigger: session.created hook.
>
> 12. EVENT-DRIVEN TRIGGERS (LOW-MEDIUM)
> Problem: "When this PR is merged, deploy" — requires polling or manual checking.
> Fix: GitHub webhook → MQTT → marauder-os event handler. On matching event, store action to schedule table. Next session picks it up. Or: background daemon executes immediately.
> Trigger: Webhook ingestion.
>
> 13. AUTOMATIC SCOPE ESCALATION (LOW-MEDIUM)
> Problem: Started in coding scope, now need to check a MikroTik route. Can't hot-add ops tools.
> Fix: scope_activate("ops") tool that dynamically registers additional MCP tools mid-session.
> Depends on MCP protocol supporting dynamic tool registration. Fallback: restart serve with new scope set.
> Trigger: On demand.
>
> 14. SESSION REPLAY (LOW)
> Problem: "What happened two sessions ago?" requires finding and reading the handover.
> Fix: session_replay(n=2) tool that retrieves the Nth-most-recent handover from EEMS and displays key decisions, artifacts, and open items.
> Trigger: On demand.
>
> 15. DRIFT DETECTION (LOW)
> Problem: Documentation says "service X runs on port Y" but reality has changed. No automatic check.
> Fix: Periodic reconciliation: compare documented state (EEMS memories with subject infra.*) against actual state (service checks, port scans, git status). Flag mismatches.
> Trigger: Cron or session start.

---

## 8. How does the marketer frontend workflow operate?

> MARAUDER — Military-grade wearable AI OS platform (April 2026).
>
> Primary: AI-augmented operator system — SERE kit + Pilot's helmet HUD.
> Secondary: Development tool interface (Claude Code).
>
> ## Modules
>
> - **VANGUARD** — core software (memory, identity, comms, display, model routing, persona, procedures). Same VANGUARD on every chassis.
> - **FOXHOUND** — field hardware (Jetson chassis, sensors, radios, battery, bag integration, operator loadout).
> - **HAMMERFALL** — actuator/vehicle control (drive-by-wire, steering, L1 real-time MCU). Next stage.
> - **Role agents** — swappable mission loadouts (coding, devops, gaming, household, etc.).
>
> ## Deployment chassis (peer hosts — no fixed primary)
>
> Same VANGUARD software, different chassis:
> - **fuji** (macOS arm64 workstation)
> - **junkpile** (Linux x86_64 workstation + GPU compute)
> - **moto** (Android arm64 SERE edge node)
> - **FOXHOUND Jetson** (field deployment, planned)
>
> The "primary" / "active" host is whichever the Pilot is currently typing on — not bound to a specific machine. Both fuji and junkpile are first-class peer dev hosts.
>
> ## Strict decoupling
>
> Core never depends on role modules.
> New capabilities = new agent files.

---

## 9. Describe the style workflow.

> Prefer longer, consolidated utterances in a single speak call instead of splitting them into many short parts. Fragmentation is unnecessary when wait: true works correctly. Aim for natural, fluent voice communication.

---

## 10. Describe the coda dispatch pattern workflow.

> CODA agent dispatch pattern that worked end-to-end on MT3-9320 (2026-04-30) — first real-ticket field test of the catapult harness. Both BE + FE CODAs ran autonomous, shipped 7 branches with all gates green in ~24min wall time.
>
> ## Prompt anatomy (compact, under 1000 chars)
>
> 1. Identity: "You are CODA in ()."
> 2. Goal: "Implement MT3-XXXX[, MT3-YYYY, ...] from epic MT3-ZZZZ. Read each via 'hu jira show MT3-XXXX'."
> 3. Branch convention: "MT3-XXXX-kebab-case off development, NO feature/ prefix. Stack each off previous (XXX2 off XXX1, XXX3 off XXX2, ...)."
> 4. Commit format: "[MT3-XXXX] Sentence-case description"
> 5. Per-task gates: "branch, implement, green, clean, commit ONE commit"
> 6. Hard rules: "ABSOLUTELY NO 'git push', NO 'gh pr create', NO 'hu jira update'."
> 7. Stop signal: "Stop after MT3-LAST commit, summarize branches/commits/test status, wait for Pilot."
> 8. Begin token: "Begin with MT3-FIRST."
>
> ## Why each piece matters
>
> - Identity grounds CODA as the in-bubble persona (not a generic Claude session).
> - Reading Jira tickets via hu before coding gives full scope without re-explaining in the prompt.
> - Hard rules + stop signal prevent CODA from over-running into push/PR territory before Pilot review.
> - Per-task gates encode the team's quality bar (rspec+rubocop, lint+tsc).
> - Begin token forces CODA to act, not deliberate.
>
> ## What CODAs improved on the prompt unprompted
>
> - Picked terser kebab slugs (e.g. `MT3-9321-bulk-attributes-batch-update-mutation` instead of my proposed `...-and-batch-update-mutation`). Both valid. Don't over-prescribe slugs.
> - Reported back with a clean summary table at end ("All branches stacked sequentially. All pass yarn lint --quiet and yarn tsc --noEmit. No push, no PR, no Jira updates. Awaiting Pilot.").
>
> ## Anti-patterns avoided
>
> - Don't dispatch via Agent tool subagent_type=marauder:coda from THIS Claude session — that spawns a sub-agent in fuji's context. The bubble's claude pane has its own Claude Code session with full bubble context. Dispatch via `catapult-pane --send ""`.
> - Don't send multi-paragraph prompts with literal newlines — zellij write-chars treats each line individually. Keep the prompt as one continuous block.
> - Don't trust focus-pane-id over remote SSH (zellij 0.44.1 silent fail). Use `write-chars --pane-id terminal_0` directly.
>
> ## Reference dispatch (BE side, MT3-9320)
>
> ```
> catapult-pane mt3-9320-be --send "You are CODA in the mt3-9320-be Catapult bubble (marketer Rails). Implement MT3-9321 then MT3-9322 from epic MT3-9320. Read each ticket via 'hu jira show MT3-9321' and 'hu jira show MT3-9322' for full scope. Branches: MT3-XXXX-kebab-case off development, NO feature/ prefix. Stack MT3-9322 off MT3-9321. Commits: '[MT3-XXXX] Sentence-case description'. Per task: branch, implement, 'bundle exec rspec' green on touched specs, 'bundle exec rubocop -A' clean on touched files, then commit. ABSOLUTELY NO 'git push', NO 'gh pr create', NO 'hu jira update'. Stop after MT3-9322 commit, summarize branches/commits/test status, wait for Pilot. Begin with MT3-9321."
> ```
>
> Linked: insight.catapult.pair-race (3273), project.catapult.helper-scripts-spec (3299), infra.zellij-remote-focus-bug (3305).

---

## 11. What is the coda pr review loop process?

> Post-push PR review loop — standard procedure for any CODA-shipped PR after the initial force-push.
>
> ## Why this exists
>
> Locked 2026-04-30 23:27 CEST after MT3-9320 needed two iteration rounds: original Copilot review caught critical bugs (update_all bypassing validations, controlled-state without handler), then after force-push of fixes, a coverage bot caught the spec-on-separate-branch problem. Each iteration was a discrete loop: push → wait → review → fix → push.
>
> ## The loop
>
> After ANY push to a PR (initial or force-push), execute the following:
>
> ### 1. Wait for CI + bots (~3-5 min)
>
> Copilot re-reviews on push. Coverage bots run after CI. Don't query immediately — there's nothing to see yet.
>
> ### 2. Query unresolved review threads
>
> ```
> gh api graphql -f query='{
>   repository(owner:"OWNER",name:"REPO"){
>     pullRequest(number:NNNN){
>       reviewThreads(first:50){
>         nodes{id isResolved isOutdated path line
>           comments(first:1){nodes{author{login} createdAt body}}}}}}}'
> ```
>
> Filter `isResolved == false`. Anything that came in since the last push needs attention.
>
> ### 3. Query issue-level comments
>
> ```
> gh api 'repos/OWNER/REPO/issues/NNNN/comments'
> ```
>
> Coverage bots, Copilot summaries, human reviewers post here. Filter by `created_at > last-push-time`.
>
> ### 4. Triage
>
> - **Outdated threads (isOutdated=true) addressed by the recent push** → resolve them via `resolveReviewThread` mutation
> - **Not outdated, addressed by the recent push** → optionally resolve with a brief comment if needed
> - **Critical new findings** → dispatch CODA to fix in-place, force-push again, loop back to step 1
> - **Non-critical findings** → leave for human review unless Pilot says otherwise
> - **Coverage drop** → automatic critical (Pilot rule: coverage cannot drop). Likely cause: specs missing from the PR. Apply project.marketer.pr-must-include-specs (id 3315): every PR must contain its own specs.
>
> ### 5. Resolve addressed threads
>
> ```
> gh api graphql -f query='mutation { resolveReviewThread(input:{threadId:"PRRT_..."}){thread{id isResolved}} }'
> ```
>
> One mutation per thread. Batch them.
>
> ### 6. Re-check after fix
>
> If you dispatched a fix, repeat from step 1 with the new push timestamp.
>
> ### 7. Stop condition
>
> - All review threads resolved OR explicitly marked "won't fix" by Pilot
> - Coverage report ✅ or back to baseline
> - CI green
> - No new comments since the last push
>
> Then declare the PR ready for human review.
>
> ## Implications for CODA dispatch prompts
>
> The CODA prompt should include: "After force-push, do not declare done. Wait for Pilot to verify Copilot/CI re-review. The Pilot will handle the post-push loop unless explicitly delegating."
>
> This prevents CODA from prematurely reporting "Awaiting Pilot" when Copilot/CI hasn't run yet.
>
> ## Implications for /loop or autonomous wakeups
>
> For long-running PR cycles, schedule a wakeup ~5 min after each force-push to auto-trigger step 1. Use ScheduleWakeup with a self-contained prompt that re-enters this loop. Don't poll constantly — bots take their own time.
>
> ## Linked
>
> - workflow.coda-dispatch-pattern (3307) — initial dispatch before this loop kicks in
> - project.marketer.pr-must-include-specs (3315) — coverage rule, automatic critical
> - workflow.stacked-branch-merge-waves (3310) — wave plan defines push order
> - gate.G05 (2174) — destructive overwrite gate; resolve-thread is idempotent so G05 doesn't apply, but force-push to a PR that has comments is implicitly destructive of context — this loop covers the "pick it up after"

---

## 12. How does the lan only workflow operate?

> All dev and testing work on Tengu, tensors, tensors-web, and ComfyUI uses internal LAN addresses only — never Cloudflare tunnel/worker/pages URLs.
>
> LAN endpoints (from fuji, junkpile at 10.0.0.2 via direct Thunderbolt link):
> - Tengu API: http://junkpile:8080
> - tensors API: http://junkpile:51200
> - ComfyUI: http://junkpile:8188
> - Filesystem: /Volumes/chi (Samba share of junkpile home dir)
>
> Do NOT use during dev/testing:
> - *.tengu.to / *.tengu.host (Tengu production)
> - tensors-api.saiden.dev (CF Tunnel)
> - gw.saiden.dev (CF Worker)
> - tensors.saiden.dev (CF Pages)
>
> **Why:** Adam explicitly requires LAN-only for all dev work across all projects on junkpile.
> **How to apply:** Use hostname `junkpile` or `10.0.0.2` for all service access. CF URLs are production-only.

---

## 13. What is the style process?

> Prefer longer, consolidated utterances in a single speak call instead of splitting them into many short parts. Fragmentation is unnecessary when wait: true works correctly. Aim for natural, fluent voice communication.

---

## 14. How does the eta calibration workflow operate?

> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
>
> ## Calibration Data
> | Date | Task | Estimated | Actual | Ratio |
> |------|------|-----------|--------|-------|
> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
> | 2026-04-27 | Phase 33 Hyaku Shiki (4 sub-phases + docs, MQTT request multiplexer) | 1.5h coop / 7h naive | ~1.0h | 1.5x over coop, 7x over naive |
>
> ## Adjusted Heuristics
> - Agent phase: 5-10 min each (not 15-20)
> - Parallel phases: discount 50%
> - Integration bug buffer: 1.5x (not 3x)
> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Phase 33 Hyaku Shiki pulled 7h naive into ~1h.
> - **Coop estimates within 1-2x of actual when all preconditions met** (primitives exist, agents pre-validated, Pilot decisive). Phase 33's 1.5h estimate vs 1.0h actual is the calibration target.
>
> ## Calibration insights
> - 2026-04-27 Phase 32 Iris pulled coop estimates 5.9x faster than predicted. Reasons: (1) architect + code-rust agents pre-validated design upfront — zero rework; (2) existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method; (3) pure-functional core decoupled testing from runtime; (4) live test caught zero defects — design correct first time; (5) Pilot decisive on open questions.
> - 2026-04-27 Phase 33 Hyaku Shiki: 1.5h estimate held tight (actual ~1h). When primitives, validation, and decisiveness are all in place, the cooperative estimate IS the right number. Earlier overestimates (Phase 32) were because we hadn't recalibrated naive→coop divisor for primitive-rich refactors.
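The "Ratio" column in the calibration table is the estimate divided by the actual duration, with both in the same unit. A minimal sketch of that arithmetic follows, using figures from the table rows converted to minutes. The helper name `overestimate_ratio` is hypothetical and not part of any tool mentioned in these notes.

```python
def overestimate_ratio(estimated_min: float, actual_min: float) -> float:
    """Return how many times larger the estimate was than the actual duration."""
    if actual_min <= 0:
        raise ValueError("actual duration must be positive")
    return estimated_min / actual_min


# Figures taken from the calibration table, converted to minutes.
phase_26 = overestimate_ratio(180, 55)       # ~3 hours estimated, ~55 min actual
phase_32_coop = overestimate_ratio(390, 66)  # 6.5h coop estimate, ~1.1h actual
phase_33_coop = overestimate_ratio(90, 60)   # 1.5h coop estimate, ~1.0h actual

print(f"{phase_26:.1f}x, {phase_32_coop:.1f}x, {phase_33_coop:.1f}x")
# prints "3.3x, 5.9x, 1.5x"
```

A ratio near 1 means the cooperative estimate was well calibrated. Larger values mean the estimate was too conservative for that kind of work.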
>
> Updated rule:
> - When (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode, AND (d) it's a primitive-rich refactor: divide naive coop by 5-7x.
> - When all of the above + Pilot has already done analogous work this week: cooperative estimate is reliable to within 1-2x.
>
> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.

---

## 15. How does the lan only workflow operate?

> All dev and testing work on Tengu, tensors, tensors-web, and ComfyUI uses internal LAN addresses only — never Cloudflare tunnel/worker/pages URLs. LAN endpoints: Tengu API http://junkpile:8080, tensors API http://junkpile:51200, ComfyUI http://junkpile:8188, Filesystem /Volumes/chi. CF URLs are production-only.

---

## 16. Describe the eta calibration workflow.

> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
>
> ## Calibration Data
> | Date | Task | Estimated | Actual | Ratio |
> |------|------|-----------|--------|-------|
> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
>
> ## Adjusted Heuristics
> - Agent phase: 5-10 min each (not 15-20)
> - Parallel phases: discount 50%
> - Integration bug buffer: 1.5x (not 3x)
> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Pure code transformation when architecture is well-understood is dramatically faster than baseline.
>
> ## Calibration insight 2026-04-27
> Phase 32 Iris pulled coop estimates 5.9x faster than predicted. Reasons:
> 1. Architect + code-rust agents pre-validated design upfront — zero rework
> 2. Existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method
> 3. Pure-functional core decoupled testing from runtime — fast iteration
> 4. Live test with running daemon caught zero defects — design was correct first time
> 5. Pilot decisive on open questions ("yes to all three") — no decision-loop stalls
>
> Updated rule: when ALL of (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode — divide naive coop by 5-6x, not 2.5x.
>
> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.

---

## 17. How does the stacked branch merge waves workflow operate?

> Wave-based parallel merge strategy for stacked PRs across 2 repos (proven on MT3-9320, 2026-04-30). 7 PRs total across BE and FE; 2 of the 5 merge windows can run in parallel.
>
> ## When stacked branches exist
>
> Catapult bubbles produce per-task branches stacked off each other:
>
> ```
> Repo A (BE): development → MT3-X1 → MT3-X2
> Repo B (FE): development → MT3-Y1 → MT3-Y2 → MT3-Y3 → MT3-Y4 → MT3-Y5
> ```
>
> Each branch contains all earlier commits in its lineage (that's the cost of stacking).
>
> ## Within-repo merge order is enforced
>
> Stacked branches MUST merge bottom-up:
> - Merge MT3-X1 → development. GitHub auto-retargets MT3-X2's PR base from MT3-X1 → development. MT3-X2 PR diff updates to show only its own commit.
> - Same chain for Repo B: Y1, then Y2, Y3, Y4, Y5.
>
> If you merge out of order, GitHub either includes all transitive commits in the PR diff (review noise) or refuses with "branch is up to date with base."
>
> ## Cross-repo dep handling
>
> If FE Y4 needs BE X1's mutation to actually exist, the safe sequence:
> - BE X1 merges before FE Y4 lands a PR review where the GraphQL types regenerate.
> - Until BE X1 merges, FE has a stubbed mutation type with TODO.
> Resolving the TODO before FE Y4 push = real working code for reviewers.
>
> ## Wave-based parallel merge plan (the win)
>
> | Wave | Parallel PRs | Reason |
> |------|--------------|--------|
> | 1 | BE X1 + FE Y1 | Both off development, no overlap |
> | 2 | BE X2 + FE Y2 | After wave 1, both stacks unblock their next |
> | 3 | FE Y3 | Stacked on Y2 |
> | 4 | FE Y4 | Stacked on Y3, also needs BE X1 (wave 1 covered it) |
> | 5 | FE Y5 | Stacked on Y4 |
>
> 5 merge windows, 7 PRs, 2 parallel pairs (waves 1 + 2).
>
> ## Practical sequence
>
> ```
> T+0: push BE X1 + FE Y1 → 2 PRs in parallel
> T+1: merge both → development
> T+1: push BE X2 + FE Y2 → 2 PRs in parallel
> T+2: merge both
> T+2: push FE Y3
> T+3: push FE Y4 (drop stub TODO, regen types)
> T+4: push FE Y5
> ```
>
> ## Alternative: squash strategies
>
> - Per-repo bundle: 1 PR for BE (squash both), 1 PR for FE (squash all 5). Loses per-task review granularity, gains simpler merge.
> - Per-task PRs (above): more reviewable, more merges, but team sees "human chunks."
>
> Pilot's preference (2026-04-30): per-task PRs with stacked merging. "Human chunks" = team can review each task in isolation.
>
> ## When to flatten vs stack
>
> Flatten (rebase each branch onto development with only its own commit) before push only if:
> - Reviewers don't tolerate seeing previous-task commits in dependent PR diffs
> - Or you want truly independent PRs that can be merged in any order
>
> Otherwise stack — GitHub's auto-base-retarget on merge handles the cleanup.
>
> ## Memory anchors
>
> - workflow.coda-dispatch-pattern — branch/commit conventions per-task
> - project.catapult.helper-scripts-spec (3299) — `cycle` orchestrator handles bubble lifecycle
> - 2026-04-30 MT3-9320 — first epic shipped through this workflow

---

## 18. Describe the eta calibration workflow.

> When estimating task durations, always calculate for cooperative Pilot + Titan velocity.
>
> ## Calibration Data
> | Date | Task | Estimated | Actual | Ratio |
> |------|------|-----------|--------|-------|
> | 2026-04-05 | PG migration (5-phase, 4 agents) | 45-60 min | 19 min | 2.3-3.1x over |
> | 2026-04-22 | Phase 26 Gelgoog Kai (3 sub-phases, MQTT mesh) | ~3 hours | ~55 min | 3.3x over |
> | 2026-04-27 | Phase 32 Iris (5 sub-phases, eye-state manager) | 6.5h coop / 17h naive | ~1.1h | 5.9x over coop, 15x over naive |
> | 2026-04-27 | Phase 33 Hyaku Shiki (4 sub-phases + docs, MQTT request multiplexer) | 1.5h coop / 7h naive | ~1.0h | 1.5x over coop, 7x over naive |
> | **2026-04-30** | **MT3-9320 Unit Bulk Edit (7 tasks across 2 repos in catapult bubbles, dispatched to CODAs)** | **3.5h coop / 13h naive** | **~24 min** | **8.7x over coop, 32x over naive** |
>
> ## Adjusted Heuristics
> - Agent phase: 5-10 min each (not 15-20)
> - Parallel phases: discount 50%
> - Integration bug buffer: 1.5x (not 3x)
> - Sequential phases in same module: each phase faster (context loaded) — 30-40% additional discount
> - **Refactor-heavy work (no new domain): 4-6x faster than naive** — Phase 32 Iris pulled 17h naive into ~1h actual. Phase 33 Hyaku Shiki pulled 7h naive into ~1h.
> - **CODA-dispatched bubble work (no new domain, patterns proven, both CODAs running in parallel): 8-30x faster than naive** — MT3-9320 set the new ceiling: 7 tasks across 2 repos in 24min wall time. Cooperative estimate too conservative when CODA dispatch in catapult bubbles is the execution model.
> - **Coop estimates within 1-2x of actual when all preconditions met** (primitives exist, agents pre-validated, Pilot decisive). Phase 33's 1.5h estimate vs 1.0h actual is the calibration target.
>
> ## Calibration insights
> - 2026-04-27 Phase 32 Iris pulled coop estimates 5.9x faster than predicted.
> Reasons: (1) architect + code-rust agents pre-validated design upfront — zero rework; (2) existing primitives (EventBus, MeshClient, hooks dispatch) — only added 1 new MQTT method; (3) pure-functional core decoupled testing from runtime; (4) live test caught zero defects — design correct first time; (5) Pilot decisive on open questions.
> - 2026-04-27 Phase 33 Hyaku Shiki: 1.5h estimate held tight (actual ~1h). When primitives, validation, and decisiveness are all in place, the cooperative estimate IS the right number. Earlier overestimates (Phase 32) were because we hadn't recalibrated naive→coop divisor for primitive-rich refactors.
> - **2026-04-30 MT3-9320: 8.7x faster than coop, 32x faster than naive.** Reasons: (1) spike already validated patterns in both repos — zero design work; (2) 7 sub-tasks each pure pattern-mirror of existing code; (3) BE + FE CODAs ran in parallel inside dedicated catapult bubbles; (4) hard rules (no push/PR/Jira) kept CODAs focused; (5) Pilot decisive on scope (all-fields) and bubble count (2). When CODA dispatch is the execution model, the bottleneck shifts entirely to ticket reading + branch creation overhead.
>
> ## Updated rule (2026-04-30)
> - When CODA-dispatched in catapult bubbles + primitives exist + spike validated + Pilot decisive: divide naive coop by 10-15x. Coop estimate becomes too conservative; the unit of work is now "dispatch and watch."
> - When (a) primitives exist, (b) architecture validated upfront by agents, (c) Pilot is fast-decision mode, AND (d) it's a primitive-rich refactor: divide naive coop by 5-7x.
> - When all of the above + Pilot has already done analogous work this week: cooperative estimate is reliable to within 1-2x.
>
> Overestimating wastes the Pilot's mental budget. Underestimating breaks trust. Calibrate from real data.

---

## 19. What is the cross session debug process?

> WORKFLOW DISCOVERY (2026-05-24): Cross-session forensics via opencode-serve HTTP API.
>
> Any agent session (core TUI, phone, build workers) can inspect any other session's messages via the same localhost:4096 API. From the core TUI session, Pilot queried the phone agent's full conversation history using:
> curl -u "opencode:$OPENCODE_SERVER_PASSWORD" http://localhost:4096/session/{phone_session_id}/message?limit=100
>
> This revealed the phone had successfully processed all 5 exchanges (14 messages) even though the phone UI appeared dead — confirming the break was client-side SSE delivery, not server-side processing.
>
> Combined with the EEMS flight-recorder pattern (phone agent stores every exchange to memory_store subject "phone.flight-recorder" before responding), this gives two independent debug channels:
> 1. Direct session message query — sees raw messages, tool calls, timestamps
> 2. EEMS flight recorder — survives SSE drops because memory_store is a server-side write that completes before the response is streamed back to the client
>
> Pilot reaction: "I had no idea I can do this and it rocks" — this is now a canonical debug workflow for phone agent issues.

---

## 20. What is the session process?

> On first browse tool use each session, restore saved cookies from ~/.claude/browse-session.json. This contains 400+ authenticated Safari cookies for sites like GitHub, LinkedIn, Google, etc. Before ending a browse session, save cookies back if they changed.

---

## 21. Describe the repo sync workflow.

> After committing and pushing changes to a repo that exists on both fuji and junkpile, always git pull the same repo on the other machine to keep them in sync. Use: ssh f/j "cd ~/Projects/ && git pull --rebase"

---
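The repo sync step above can be sketched as a small helper that builds the command to run against the other machine. This is a minimal sketch under stated assumptions: `f` and `j` are read here as SSH aliases for fuji and junkpile (the note only writes `f/j`), and `PEER_ALIAS`, `peer_pull_command`, and the repo name `example-repo` are hypothetical names introduced for illustration.

```python
import shlex

# Assumption: "f" and "j" are SSH aliases for fuji and junkpile; the note only writes "f/j".
PEER_ALIAS = {"fuji": "j", "junkpile": "f"}


def peer_pull_command(current_host: str, repo: str) -> list[str]:
    """Build the ssh command that pulls `repo` on the machine you are NOT on."""
    peer = PEER_ALIAS[current_host]
    # The remote shell expands "~"; the repo name is quoted in case it has odd characters.
    remote = f"cd ~/Projects/{shlex.quote(repo)} && git pull --rebase"
    return ["ssh", peer, remote]


# Hypothetical repo name, for illustration only.
print(shlex.join(peer_pull_command("fuji", "example-repo")))
```

After a successful push, the returned list could be passed to `subprocess.run(..., check=True)`, so that a failed remote pull raises an error and is noticed.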