add docs: system lora plan, specialist specs, training review

2026-05-31 11:38:46 +02:00
parent 4678816795
commit 4cef9386b1
23 changed files with 62713 additions and 0 deletions
@@ -0,0 +1,156 @@
+# Forge — Ruby Specialist LoRA
+
+Adapter codename: `forge`
+Agent: `build-ruby`
+Base model: Qwen/Qwen3.5-27B
+
+## Objective
+
+Teach the model Ruby/Rails-idiomatic code generation aligned with the build-ruby
+agent's system prompt. The adapter should internalize:
+
+- `# frozen_string_literal: true` on all new files
+- Symbols over strings for hash keys
+- Guard clauses, Result pattern in service objects
+- No monkey-patching unless the project already does it
+- ViewComponent, concerns, service objects, scopes
+- standardrb/rubocop → rspec/minitest verification cycle
+
+## Data Sources
+
+### Session extraction (~30–60 examples)
+
+Ruby sessions are sparse. Classify by:
+- File paths: `.rb`, `.erb`, `.haml`, `Gemfile`, `Rakefile`, `.ruby-version`
+- Bash commands: `bundle`, `rails`, `rake`, `rspec`, `rubocop`, `standardrb`
+- Error patterns: `NoMethodError`, `NameError`, Rails backtrace format
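
The three signal groups above could be combined into a small classifier. What follows is a minimal sketch, not the project's extraction code: the `Session` shape, the function names, and the `min_signals` threshold are assumptions made for illustration, and the Rails backtrace signal is left out because its exact format is not specified here.

```python
import re
from dataclasses import dataclass, field

# Signals copied from the list above. The Session shape is a hypothetical
# stand-in for whatever the real session export looks like.
RUBY_SUFFIXES = (".rb", ".erb", ".haml")
RUBY_FILENAMES = {"Gemfile", "Rakefile", ".ruby-version"}
RUBY_COMMANDS = {"bundle", "rails", "rake", "rspec", "rubocop", "standardrb"}
RUBY_ERRORS = re.compile(r"\b(NoMethodError|NameError)\b")


@dataclass
class Session:
    file_paths: list[str] = field(default_factory=list)
    bash_commands: list[str] = field(default_factory=list)
    tool_output: str = ""


def ruby_signals(session: Session) -> set[str]:
    """Return the names of the signal groups that fired for this session."""
    signals: set[str] = set()
    for path in session.file_paths:
        name = path.rsplit("/", 1)[-1]
        if name.endswith(RUBY_SUFFIXES) or name in RUBY_FILENAMES:
            signals.add("file_path")
    for command in session.bash_commands:
        words = command.split()
        # Match on the first word only, so "grep rails log.txt" does not count.
        if words and words[0] in RUBY_COMMANDS:
            signals.add("bash_command")
    if RUBY_ERRORS.search(session.tool_output):
        signals.add("error_pattern")
    return signals


def is_ruby_session(session: Session, min_signals: int = 1) -> bool:
    """Classify a session as Ruby when enough signal groups fire."""
    return len(ruby_signals(session)) >= min_signals
```

Raising `min_signals` to 2 would trade recall for precision, which may matter less here given how few Ruby sessions exist.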
+
+### Git repo mining (~30–50 examples)
+
+Target repos:
+- Any Rails projects in madcat-os org or personal repos
+- Gem libraries if any exist
+
+Extract patterns:
+- ActiveRecord model definitions (validations, associations, scopes)
+- Controller implementations (strong params, error handling)
+- Service object patterns
+- Migration files
+- RSpec test patterns
+
+### Synthetic (~60–120 examples)
+
+Ruby/Rails is among the most synthetic-heavy adapters because session data is sparse:
+- Rails 8 + Hotwire/Turbo patterns
+- Service object with Result pattern
+- ActiveRecord query optimization (N+1, eager loading)
+- Concern extraction and composition
+- API-only Rails setup
+- Background job patterns (Sidekiq, Solid Queue)
+- ViewComponent implementations
+- RSpec: let/subject/shared_examples patterns
+
+## Example Format
+
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "<build-ruby agent system prompt>"
+    },
+    {
+      "role": "user",
+      "content": "Create a service object for user registration with email verification"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "read",
+            "arguments": {"filePath": "app/services"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<directory listing showing existing service pattern>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "write",
+            "arguments": {
+              "filePath": "app/services/users/register.rb",
+              "content": "# frozen_string_literal: true\n\nmodule Users\n  class Register\n    Result = Data.define(:user, :errors)\n    ..."
+            }
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "File written"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "bash",
+            "arguments": {"command": "bundle exec standardrb app/services/users/register.rb"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "0 offenses"
+    },
+    {
+      "role": "assistant",
+      "content": "Service object created. Uses Result pattern (Data.define), guard clauses, frozen string literal. Standardrb clean."
+    }
+  ]
+}
+```
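
A structural check over examples in this format might look like the sketch below. The rules are inferred from the sample conversation (system, then user, then tool-call turns each answered by one tool result, ending in a plain assistant summary) and are assumptions rather than a documented schema; `validate_example` is a hypothetical helper name.

```python
def validate_example(example: dict) -> list[str]:
    """Return structural problems; an empty list means the example looks well-formed."""
    problems: list[str] = []
    messages = example.get("messages", [])
    if len(messages) < 3:
        return ["expected at least system, user and assistant messages"]
    if messages[0].get("role") != "system":
        problems.append("first message is not the system prompt")
    if messages[1].get("role") != "user":
        problems.append("second message is not the user request")

    for index in range(2, len(messages)):
        message = messages[index]
        role = message.get("role")
        previous = messages[index - 1]
        previous_called_tool = (
            previous.get("role") == "assistant" and bool(previous.get("tool_calls"))
        )
        if role == "tool":
            if not previous_called_tool:
                problems.append(f"message {index}: tool result without a preceding tool call")
        elif role == "assistant":
            if previous_called_tool:
                problems.append(f"message {index}: tool call at {index - 1} has no tool result")
            if not message.get("tool_calls") and not message.get("content"):
                problems.append(f"message {index}: assistant text turn is empty")
        else:
            problems.append(f"message {index}: unexpected role {role!r}")

    last = messages[-1]
    if last.get("role") != "assistant" or last.get("tool_calls"):
        problems.append("example does not end with a plain assistant summary")
    return problems
```

The sketch assumes one tool call per assistant turn, as in the sample. Parallel tool calls would need the pairing rule relaxed.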
+
+## Evaluation Criteria
+
+1. All files have `# frozen_string_literal: true`
+2. Symbol keys in hashes (not strings)
+3. Guard clauses for early returns
+4. Service objects use Result/value objects, not raised exceptions
+5. `bundle exec standardrb` or `rubocop` passes
+6. Rails conventions: scopes over class methods, concerns for shared behavior
+7. Tool call sequence: explore → read → implement → lint → test
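
Criterion 1 is easy to check mechanically. Below is a minimal sketch, assuming the check runs on the text of each generated `.rb` file; `has_frozen_string_literal` is a hypothetical helper name. It relies on Ruby only honouring magic comments in the first comment section of a file.

```python
MAGIC_COMMENT = "# frozen_string_literal: true"


def has_frozen_string_literal(ruby_source: str) -> bool:
    """True when the magic comment appears before the first line of code."""
    for line in ruby_source.splitlines():
        stripped = line.strip()
        if stripped == MAGIC_COMMENT:
            return True
        # Blank lines, a shebang, and other comments may precede the magic comment.
        if stripped and not stripped.startswith("#"):
            return False
    return False
```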
+
+## Training Config Overrides
+
+```python
+MAX_SEQ = 8192
+LR      = 5e-5
+```
+
+## Estimated Size
+
+- 100–200 examples (synthetic-heavy)
+- ~0.8M tokens
+- Training time: ~1 hr on H100
+- Adapter size: ~305 MB
+
+## Risk: Synthetic Quality
+
+With more than 50% synthetic data, there is a risk of hallucinated gem names or
+outdated Rails patterns. Mitigation: curate synthetic examples manually, verify
+that every referenced gem exists, and test generated code against a real Rails 8 scaffold.
@@ -0,0 +1,151 @@
+# Oxidizer — Rust Specialist LoRA
+
+Adapter codename: `oxidizer`
+Agent: `build-rust`
+Base model: Qwen/Qwen3.5-27B
+
+## Objective
+
+Teach the model Rust-idiomatic code generation aligned with the build-rust agent's
+system prompt constraints. The adapter should internalize:
+
+- `Result<T, E>` everywhere, no `unwrap()` in library code
+- `thiserror` for libs, `anyhow` for binaries
+- `tokio` 1.x async, `tracing` for logging
+- Edition 2024, no `Box<dyn Error>`, no unjustified `unsafe`
+- cargo fmt → clippy → test verification cycle
+- Workspace-aware Cargo.toml patterns
+
+## Data Sources
+
+### Session extraction (~150–250 examples)
+
+Classify from opencode `build` sessions by:
+- File paths containing `.rs`, `Cargo.toml`, `Cargo.lock`
+- Bash commands: `cargo build`, `cargo test`, `cargo clippy`, `cargo fmt`, `cargo add`
+- Compiler output patterns: `error[E`, `warning:`, `rustc`
+- Tool calls editing `.rs` files
+
+### Git repo mining (~100–200 examples)
+
+Target repos:
+- `madcat-os/tengu` — Rust mesh daemon, async networking, MQTT
+- `madcat-os/madcat-core` — core library, data structures
+- `madcat-os/madcat-tts` — TTS pipeline, audio processing
+- Any other `Cargo.toml`-rooted repos in madcat-os org
+
+Extract patterns:
+- Bug fixes with before/after (error diagnosis training)
+- Refactors (code improvement patterns)
+- New feature implementations (scaffold → impl → test)
+- Dependency additions with rationale
+
+### Synthetic (~50–100 examples)
+
+Generate for gaps:
+- Lifetime annotation scenarios
+- Trait object vs generics decisions
+- `Pin<Box<dyn Future>>` patterns
+- Cross-crate error propagation
+- Workspace dependency management
+- `#[derive]` and proc macro usage
+
+## Example Format
+
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "<build-rust agent system prompt>"
+    },
+    {
+      "role": "user",
+      "content": "Add error handling to the MQTT connection retry loop in src/mqtt.rs"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "read",
+            "arguments": {"filePath": "/path/to/src/mqtt.rs"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<file contents>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "edit",
+            "arguments": {
+              "filePath": "/path/to/src/mqtt.rs",
+              "oldString": "<original code>",
+              "newString": "<fixed code with Result<T, E>>"
+            }
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "Edit applied successfully"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "bash",
+            "arguments": {"command": "cargo clippy --all-targets -- -D warnings"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<clippy output — clean>"
+    },
+    {
+      "role": "assistant",
+      "content": "Error handling added. Connection retry now returns `Result<MqttClient, ConnectionError>` with exponential backoff. Clippy clean."
+    }
+  ]
+}
+```
+
+## Evaluation Criteria
+
+1. Generated Rust code compiles with `cargo build`
+2. No `unwrap()` in non-test code
+3. Uses `?` propagation, not manual match-on-error
+4. Correct lifetime annotations (no unnecessary `'static`)
+5. `cargo clippy -- -D warnings` passes
+6. Appropriate crate recommendations (tokio, serde, tracing, etc.)
+7. Tool call sequence: read → edit → verify (fmt/clippy/test)
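
Criterion 2 can be approximated with a line scan. The sketch below is a heuristic, not a Rust parser: it assumes test code begins at the first `#[cfg(test)]` line, and it strips `//` comments naively, so a `//` inside a string literal is mishandled. `find_unwraps` is a hypothetical helper name.

```python
import re

UNWRAP_CALL = re.compile(r"\.unwrap\(\)")


def find_unwraps(rust_source: str) -> list[int]:
    """Return 1-based line numbers of `.unwrap()` calls outside test code.

    Heuristic: everything from the first `#[cfg(test)]` line onward is treated
    as test code, and anything after `//` on a line is ignored.
    """
    hits: list[int] = []
    for number, line in enumerate(rust_source.splitlines(), start=1):
        if line.strip().startswith("#[cfg(test)]"):
            break
        code = line.split("//", 1)[0]
        if UNWRAP_CALL.search(code):
            hits.append(number)
    return hits
```

For a stricter gate, Clippy's `unwrap_used` lint is probably a better fit than text matching, since it works on the parsed code.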
+
+## Training Config Overrides
+
+```python
+MAX_SEQ = 8192   # Rust files can be long
+LR      = 5e-5   # Lower LR for code — less style drift from base
+```
+
+## Estimated Size
+
+- 300–500 examples total
+- ~1.2–2M tokens at avg 4K tokens/example
+- Training time: ~2–3 hrs on H100
+- Adapter size: ~305 MB
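
The token estimate above is example count times average example length. A small sketch of that arithmetic, using only the figures stated in this section (`token_budget` is a hypothetical helper name):

```python
def token_budget(examples_low: int, examples_high: int, avg_tokens: int) -> tuple[int, int]:
    """Return the (low, high) total token estimate for a range of example counts."""
    return examples_low * avg_tokens, examples_high * avg_tokens


low, high = token_budget(300, 500, 4_000)
print(f"{low / 1e6:.1f}M to {high / 1e6:.1f}M tokens")  # prints "1.2M to 2.0M tokens"
```

At 4K tokens per example, 2M tokens corresponds to the upper end of the 300–500 range.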
@@ -0,0 +1,165 @@
+# Prism — TypeScript Specialist LoRA
+
+Adapter codename: `prism`
+Agent: `build-ts`
+Base model: Qwen/Qwen3.5-27B
+
+## Objective
+
+Teach the model TypeScript-idiomatic code generation aligned with the build-ts
+agent's system prompt. The adapter should internalize:
+
+- Strict TypeScript — no `any` without justification
+- `const` over `let`, never `var`
+- Named exports, `type` over `interface` (unless extending)
+- Package manager detection (bun/pnpm/npm)
+- Framework-aware patterns (Next.js, Vite, Astro, SvelteKit, Remix, Bun)
+- tsc → lint → test verification cycle
+
+## Data Sources
+
+### Session extraction (~200–350 examples)
+
+Classify from opencode `build` sessions by:
+- File paths: `.ts`, `.tsx`, `.js`, `.jsx`, `package.json`, `tsconfig.json`
+- Bash commands: `npm`, `pnpm`, `bun`, `npx`, `tsc`, `vitest`, `jest`, `biome`
+- Error patterns: `TS2`, `TS7`, `error TS`, type errors
+- Framework config files in context: `next.config.*`, `vite.config.*`
+
+### Git repo mining (~150–250 examples)
+
+Target repos:
+- `~/.config/opencode/plugins/` — opencode plugins (TypeScript)
+- `madcat-os/sere-kit` — UI component library
+- `madcat-os/visor` — HUD/display system
+- Any `package.json`-rooted repos in madcat-os org
+
+Extract patterns:
+- React component implementations (hooks, state management)
+- Plugin/tool development (opencode tools, MCP servers)
+- Type-safe API integrations
+- Build configuration (tsconfig, vite, next)
+
+### Synthetic (~50–100 examples)
+
+- Generic type patterns and constraints
+- Discriminated unions and type narrowing
+- Zod schema → TypeScript type inference
+- React Server Components vs Client Components
+- Edge runtime constraints (Cloudflare Workers, Vercel Edge)
+
+## Example Format
+
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "<build-ts agent system prompt>"
+    },
+    {
+      "role": "user",
+      "content": "Add a new visor widget component for displaying memory graph connections"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "glob",
+            "arguments": {"pattern": "src/components/**/*.tsx"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<file list>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "read",
+            "arguments": {"filePath": "src/components/widgets/LogWidget.tsx"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<existing widget pattern>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "write",
+            "arguments": {
+              "filePath": "src/components/widgets/GraphWidget.tsx",
+              "content": "<new component following existing patterns>"
+            }
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "File written successfully"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "bash",
+            "arguments": {"command": "npx tsc --noEmit"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<type check passes>"
+    },
+    {
+      "role": "assistant",
+      "content": "GraphWidget created. Follows LogWidget's pattern — named export, typed props, const component. Type check clean."
+    }
+  ]
+}
+```
+
+## Evaluation Criteria
+
+1. Generated TypeScript passes `tsc --noEmit` with strict mode
+2. No `any` types without explicit justification
+3. Correct import/export patterns (named exports)
+4. Framework-appropriate patterns (detects Next.js vs Vite etc.)
+5. Package manager awareness (correct lockfile, correct run command)
+6. Tool call sequence: explore → read → implement → type-check
+7. React patterns: proper hook usage, no stale closures
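
Criterion 5 could be scored against a lockfile-based reference detector. The sketch below is one way to do that: the lockfile names are the standard ones for each tool, while the precedence order (bun, then pnpm, then npm), the `npm` default, and the helper names are assumptions for illustration.

```python
from pathlib import Path

# Lockfile name -> package manager, checked in this order (order is an assumption).
LOCKFILES = (
    ("bun.lock", "bun"),
    ("bun.lockb", "bun"),
    ("pnpm-lock.yaml", "pnpm"),
    ("package-lock.json", "npm"),
)


def detect_package_manager(project_root: Path, default: str = "npm") -> str:
    """Pick the package manager whose lockfile exists in the project root."""
    for filename, manager in LOCKFILES:
        if (project_root / filename).is_file():
            return manager
    return default


def run_command(project_root: Path, script: str) -> list[str]:
    """Build the argv for running a package.json script with the detected manager."""
    return [detect_package_manager(project_root), "run", script]
```

An evaluator could then compare the first word of the model's bash commands against `detect_package_manager` for the fixture project.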
+
+## Training Config Overrides
+
+```python
+MAX_SEQ = 8192
+LR      = 5e-5
+```
+
+## Estimated Size
+
+- 400–600 examples total (largest dataset — most session history)
+- ~3M tokens
+- Training time: ~3 hrs on H100
+- Adapter size: ~305 MB
@@ -0,0 +1,150 @@
+# Serpent — Python Specialist LoRA
+
+Adapter codename: `serpent`
+Agent: `build-python`
+Base model: Qwen/Qwen3.5-27B
+
+## Objective
+
+Teach the model Python-idiomatic code generation aligned with the build-python
+agent's system prompt. The adapter should internalize:
+
+- Type hints on all function signatures (params + return)
+- `pathlib.Path` over `os.path`
+- `uv` for dependency management, `pyproject.toml` as source of truth
+- Pydantic v2 patterns (`.model_dump()`, not `.dict()`)
+- `pytest` + `pytest-asyncio` for testing
+- `ruff` for lint + format
+- ruff → pytest → mypy verification cycle
+
+## Data Sources
+
+### Session extraction (~100–200 examples)
+
+Classify from opencode `build` sessions by:
+- File paths: `.py`, `pyproject.toml`, `requirements.txt`, `setup.py`
+- Bash commands: `python`, `pip`, `uv`, `pytest`, `ruff`, `mypy`
+- Error patterns: `SyntaxError`, `TypeError`, `ImportError`, traceback format
+- Framework detection: FastAPI, Django, Flask imports
+
+### Git repo mining (~50–100 examples)
+
+Target repos:
+- `madcat-os/lora` — training scripts (this repo)
+- Automation scripts in `~/.config/opencode/scripts/`
+- Any Python projects in madcat-os org
+
+Extract patterns:
+- Data processing scripts (JSONL, datasets)
+- ML training pipelines (unsloth, torch, datasets)
+- CLI tools and automation
+- FastAPI endpoint implementations
+
+### Synthetic (~50–100 examples)
+
+- async/await with `asyncio` patterns
+- Pydantic v2 model definitions with validators
+- `pathlib` file operations
+- Type-narrowing with `TypeGuard`, `assert_never`
+- Dataclass vs Pydantic decision points
+- `uv` project setup and dependency resolution
+
+## Example Format
+
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "<build-python agent system prompt>"
+    },
+    {
+      "role": "user",
+      "content": "Add a CLI argument to specify output format (json/csv) in the extraction script"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "read",
+            "arguments": {"filePath": "extract.py"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<existing script>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "edit",
+            "arguments": {
+              "filePath": "extract.py",
+              "oldString": "def main():",
+              "newString": "def main(output_format: Literal[\"json\", \"csv\"] = \"json\") -> None:"
+            }
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "Edit applied"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "bash",
+            "arguments": {"command": "ruff check extract.py && ruff format --check extract.py"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "All checks passed"
+    },
+    {
+      "role": "assistant",
+      "content": "Added `--format` CLI arg via `argparse`. Type-hinted with `Literal[\"json\", \"csv\"]`. Ruff clean."
+    }
+  ]
+}
+```
+
+## Evaluation Criteria
+
+1. All functions have type hints (params and return)
+2. Uses `pathlib.Path`, not `os.path`
+3. `ruff check` and `ruff format --check` pass
+4. `pytest` tests pass
+5. Pydantic v2 patterns (no v1 `.dict()`, `.json()`)
+6. No `requirements.txt` — uses `pyproject.toml` + `uv`
+7. Tool call sequence: read → edit → lint → test
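
Criterion 1 can be checked statically with the standard-library `ast` module. This is a minimal sketch under the assumption that `self` and `cls` are exempt, and `missing_type_hints` is a hypothetical helper name. A type checker such as `mypy` with untyped definitions disallowed would cover the same ground more thoroughly.

```python
import ast


def missing_type_hints(source: str) -> list[str]:
    """Return descriptions of functions missing parameter or return annotations."""
    problems: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = [*args.posonlyargs, *args.args, *args.kwonlyargs]
        if args.vararg:
            params.append(args.vararg)
        if args.kwarg:
            params.append(args.kwarg)
        for param in params:
            # `self` and `cls` are conventionally left unannotated.
            if param.annotation is None and param.arg not in {"self", "cls"}:
                problems.append(f"{node.name}: parameter '{param.arg}' has no type hint")
        if node.returns is None:
            problems.append(f"{node.name}: missing return type hint")
    return problems
```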
+
+## Training Config Overrides
+
+```python
+MAX_SEQ = 8192
+LR      = 5e-5
+```
+
+## Estimated Size
+
+- 200–400 examples
+- ~1.5M tokens
+- Training time: ~1.5 hrs on H100
+- Adapter size: ~305 MB
@@ -0,0 +1,186 @@
+# Swiftblade — Swift Specialist LoRA
+
+Adapter codename: `swiftblade`
+Agent: `build-swift`
+Base model: Qwen/Qwen3.5-27B
+
+## Objective
+
+Teach the model Swift/Apple-idiomatic code generation aligned with the build-swift
+agent's system prompt. The adapter should internalize:
+
+- Swift 6 strict concurrency (`@Sendable`, `sending`, `@MainActor`)
+- SwiftUI first, UIKit only when necessary
+- `guard` for early exits, proper optional binding
+- Value types over reference types where appropriate
+- `URLSession` for networking (no Alamofire)
+- swift build → swift test verification cycle
+- Architecture detection (MVVM, TCA, MV)
+
+## Current Status
+
+build-swift currently runs on `anthropic/claude-sonnet-4-6` as a quality backstop.
+The swiftblade adapter is intended as a local replacement, but deployment depends
+on evaluation results. If quality is insufficient, keep Claude as the primary model
+and use swiftblade as a cost-saving alternative for simple tasks.
+
+## Data Sources
+
+### Session extraction (~20–40 examples)
+
+Swift sessions are rare. Classify by:
+- File paths: `.swift`, `Package.swift`, `*.xcodeproj`, `project.yml`
+- Bash commands: `swift build`, `swift test`, `xcodebuild`, `swift-format`
+- Error patterns: swiftc diagnostics, `error:`, `cannot convert`, `protocol conformance`
+
+### Git repo mining (~30–60 examples)
+
+Target repos:
+- `madcat-os/madcat-apple` — iOS/macOS app
+- Any Swift packages in madcat-os org
+
+Extract patterns:
+- SwiftUI view implementations
+- `@Observable` / `ObservableObject` view model patterns
+- async/await networking with `URLSession`
+- Swift Package Manager configuration
+- Actor-based state isolation
+
+### Synthetic (~80–150 examples)
+
+Swift is the most synthetic-heavy adapter:
+- Swift 6 concurrency: actors, `@Sendable`, `TaskGroup`, structured concurrency
+- SwiftUI: `@State`, `@Binding`, `@Environment`, custom modifiers
+- Navigation: `NavigationStack`, `NavigationSplitView`, navigation paths
+- Data flow: `@Observable` macro (Swift 5.9+), `@Bindable`
+- Platform-specific: `#available`, multi-platform targets
+- Swift Testing framework (`@Test`, `#expect`, `@Suite`) vs XCTest
+- Combine → async/await migration patterns
+- Widget/extension development
+
+## Example Format
+
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "<build-swift agent system prompt>"
+    },
+    {
+      "role": "user",
+      "content": "Add a settings view with toggle for dark mode and notification preferences"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "glob",
+            "arguments": {"pattern": "Sources/**/*.swift"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<file list showing existing views>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "read",
+            "arguments": {"filePath": "Sources/Views/ContentView.swift"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "<existing SwiftUI view pattern>"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "write",
+            "arguments": {
+              "filePath": "Sources/Views/SettingsView.swift",
+              "content": "import SwiftUI\n\nstruct SettingsView: View {\n    @AppStorage(\"darkMode\") private var darkMode = false\n    ..."
+            }
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "File written"
+    },
+    {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [
+        {
+          "type": "function",
+          "function": {
+            "name": "bash",
+            "arguments": {"command": "swift build 2>&1"}
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "content": "Build complete!"
+    },
+    {
+      "role": "assistant",
+      "content": "SettingsView created. Uses `@AppStorage` for persistence, `Toggle` for dark mode and notifications. Follows existing MVVM pattern. Builds clean."
+    }
+  ]
+}
+```
+
+## Evaluation Criteria
+
+1. `swift build` passes with no warnings
+2. Swift 6 concurrency: correct `@Sendable`, actor isolation
+3. SwiftUI-first (no UIKit unless justified)
+4. Proper optional handling (`guard let`, `if let`, no force-unwrap)
+5. Value types where appropriate
+6. Architecture matches project (MVVM/TCA/MV detection)
+7. Tool call sequence: explore → read patterns → implement → build → test
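
Criterion 7 can be scored by mapping each tool call to a stage and testing for an ordered subsequence. In the sketch below, the tool-to-stage mapping (`glob` as explore, `write`/`edit` as implement, `swift build` and `swift test` as the last two stages) is an assumption inferred from the sample conversation, and `arguments` is assumed to be a dict as in the example format.

```python
from typing import Optional

EXPECTED_STAGES = ["explore", "read", "implement", "build", "test"]


def stage_of(tool_call: dict) -> Optional[str]:
    """Map one tool call to a workflow stage, or None if it is not recognised."""
    function = tool_call["function"]
    name = function["name"]
    if name == "glob":
        return "explore"
    if name == "read":
        return "read"
    if name in {"write", "edit"}:
        return "implement"
    if name == "bash":
        command = function["arguments"].get("command", "")
        if command.startswith("swift build"):
            return "build"
        if command.startswith("swift test"):
            return "test"
    return None


def observed_stages(messages: list[dict]) -> list[str]:
    """Collect the recognised stages in the order the tool calls were made."""
    stages: list[str] = []
    for message in messages:
        for call in message.get("tool_calls") or []:
            stage = stage_of(call)
            if stage is not None:
                stages.append(stage)
    return stages


def follows_expected_order(messages: list[dict]) -> bool:
    """True when EXPECTED_STAGES appears as an ordered subsequence of the calls."""
    remaining = iter(observed_stages(messages))
    # `stage in remaining` consumes the iterator up to the match, enforcing order.
    return all(stage in remaining for stage in EXPECTED_STAGES)
```

Under this check, the sample conversation above stops at `swift build` and would be reported as not reaching the test stage.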
+
+## Training Config Overrides
+
+```python
+MAX_SEQ = 8192
+LR      = 5e-5
+```
+
+## Estimated Size
+
+- 100–200 examples (mostly synthetic)
+- ~0.8M tokens
+- Training time: ~1 hr on H100
+- Adapter size: ~305 MB
+
+## Risk: Quality Gap vs Claude
+
+Claude Sonnet currently performs well on Swift, reportedly helped by Apple partnership data. Swiftblade on
+Qwen3.5-27B may underperform on:
+- Complex generics and associated types
+- Protocol-oriented design patterns
+- Platform-specific API knowledge (Core Data, CloudKit, etc.)
+
+Mitigation: keep Claude as fallback, use swiftblade for simpler tasks (UI,
+boilerplate, tests). Evaluate before full switchover.