Coding Specialist LoRA Training Plan

Overview

LoRA fine-tune Qwen3-Coder-Next with per-language specialist adapters. Single vLLM instance on sin, multiple LoRA adapters, zero extra base model cost.

Adapters

Adapter	Base	Target data	Source
`build-rust`	Qwen3-Coder-Next	~300-500 examples	opencode sessions + madcat-os repos
`build-ts`	Qwen3-Coder-Next	~400-600 examples	opencode sessions + plugins/visor repos
`build-python`	Qwen3-Coder-Next	~200-400 examples	opencode sessions + lora/training scripts
`build-ruby`	Qwen3-Coder-Next	~100-200 examples	opencode sessions + Rails projects
`build-swift`	Qwen3-Coder-Next	~50-100 examples	opencode sessions + madcat-apple

Data Sources

1. opencode session history

64 build agent sessions, 13,598 messages in ~/.local/share/opencode/opencode.db. Subagent types (build-rust, etc.) don't exist in history yet — all coding work is under generic build agent. Must classify by content.

Classification signals:

File extensions in tool calls (.rs, .ts, .py, .rb, .swift)
Bash commands (cargo, npm, pip, bundle, swift build)
Tool output content (compiler errors, test output)

2. Git repo diffs

Mine actual commit history for style-consistent examples:

git log --patch → extract user-intent + diff pairs
Real bug fixes, refactors, feature implementations
Preserves Pilot's code style per language

Target repos:

Rust: marauder-os, madcat-os/*, tengu
TypeScript: opencode config, plugins, visor, sere-kit
Python: lora training scripts, automation
Ruby: any Rails projects on mesh
Swift: madcat-apple

3. Synthetic augmentation

For underrepresented languages (Ruby, Swift), generate synthetic pairs:

Take real code from repos
Generate "implement X" / "fix Y" / "refactor Z" prompts
Pair with actual code as response

Training Config

Same as bt7274 v2:

LoRA r=16, alpha=16, dropout=0
Target modules: q/k/v/o_proj, gate/up/down_proj
bf16, gradient checkpointing
adamw_8bit optimizer
3 epochs, batch 1, grad_accum 8

Adjustments per specialist:

MAX_SEQ=8192 for code (longer than chat)
LR=1e-4 (lower for code, less style drift)

Serving

Single vLLM instance on sin:

python -m vllm.entrypoints.openai.api_server \
  --model Qwen3-Coder-Next \
  --enable-lora \
  --lora-modules \
    build-rust=/path/to/lora-rust \
    build-ts=/path/to/lora-ts \
    build-python=/path/to/lora-python \
    build-ruby=/path/to/lora-ruby \
    build-swift=/path/to/lora-swift \
  --max-lora-rank 16 \
  --port 8000

opencode Integration

"build-rust": { "model": "vllm/build-rust" },
"build-ts": { "model": "vllm/build-ts" },
"build-python": { "model": "vllm/build-python" },
"build-ruby": { "model": "vllm/build-ruby" },
"build-swift": { "model": "vllm/build-swift" }

Pipeline

extract-training-data.py — pull from opencode DB, classify by language
mine-repos.py — extract git diffs as training pairs
train.py — per-specialist training (reuse justfile tasks)
just train-specialist LANG=rust — one command per adapter

3.2 KiB Raw Blame History