3.2 KiB
3.2 KiB
Coding Specialist LoRA Training Plan
Overview
LoRA fine-tune Qwen3-Coder-Next with per-language specialist adapters. Single vLLM instance on sin, multiple LoRA adapters, zero extra base model cost.
Adapters
| Adapter | Base | Target data | Source |
|---|---|---|---|
build-rust |
Qwen3-Coder-Next | ~300-500 examples | opencode sessions + madcat-os repos |
build-ts |
Qwen3-Coder-Next | ~400-600 examples | opencode sessions + plugins/visor repos |
build-python |
Qwen3-Coder-Next | ~200-400 examples | opencode sessions + lora/training scripts |
build-ruby |
Qwen3-Coder-Next | ~100-200 examples | opencode sessions + Rails projects |
build-swift |
Qwen3-Coder-Next | ~50-100 examples | opencode sessions + madcat-apple |
Data Sources
1. opencode session history
64 build agent sessions, 13,598 messages in ~/.local/share/opencode/opencode.db.
Subagent types (build-rust, etc.) don't exist in history yet — all coding
work is under generic build agent. Must classify by content.
Classification signals:
- File extensions in tool calls (
.rs,.ts,.py,.rb,.swift) - Bash commands (
cargo,npm,pip,bundle,swift build) - Tool output content (compiler errors, test output)
2. Git repo diffs
Mine actual commit history for style-consistent examples:
git log --patch→ extract user-intent + diff pairs- Real bug fixes, refactors, feature implementations
- Preserves Pilot's code style per language
Target repos:
- Rust: marauder-os, madcat-os/*, tengu
- TypeScript: opencode config, plugins, visor, sere-kit
- Python: lora training scripts, automation
- Ruby: any Rails projects on mesh
- Swift: madcat-apple
3. Synthetic augmentation
For underrepresented languages (Ruby, Swift), generate synthetic pairs:
- Take real code from repos
- Generate "implement X" / "fix Y" / "refactor Z" prompts
- Pair with actual code as response
Training Config
Same as bt7274 v2:
- LoRA r=16, alpha=16, dropout=0
- Target modules: q/k/v/o_proj, gate/up/down_proj
- bf16, gradient checkpointing
- adamw_8bit optimizer
- 3 epochs, batch 1, grad_accum 8
Adjustments per specialist:
- MAX_SEQ=8192 for code (longer than chat)
- LR=1e-4 (lower for code, less style drift)
Serving
Single vLLM instance on sin:
python -m vllm.entrypoints.openai.api_server \
--model Qwen3-Coder-Next \
--enable-lora \
--lora-modules \
build-rust=/path/to/lora-rust \
build-ts=/path/to/lora-ts \
build-python=/path/to/lora-python \
build-ruby=/path/to/lora-ruby \
build-swift=/path/to/lora-swift \
--max-lora-rank 16 \
--port 8000
opencode Integration
"build-rust": { "model": "vllm/build-rust" },
"build-ts": { "model": "vllm/build-ts" },
"build-python": { "model": "vllm/build-python" },
"build-ruby": { "model": "vllm/build-ruby" },
"build-swift": { "model": "vllm/build-swift" }
Pipeline
extract-training-data.py— pull from opencode DB, classify by languagemine-repos.py— extract git diffs as training pairstrain.py— per-specialist training (reuse justfile tasks)just train-specialist LANG=rust— one command per adapter