docs: bt7274 persona, specialist plan, tts-clean LoRA
This commit is contained in:
@@ -0,0 +1,80 @@
|
||||
# BT-7274 Persona LoRA
|
||||
|
||||
## Overview
|
||||
|
||||
LoRA fine-tune Qwen2.5-7B-Instruct for BT-7274 Titan-AI persona.
|
||||
Voice, tool dispatch, and comms style.
|
||||
|
||||
## Versions
|
||||
|
||||
### v1 (97 examples)
|
||||
- Hand-crafted ChatML examples
|
||||
- Voice/style only, no tool calls
|
||||
- 5 epochs, 65 steps, ~17 min on RTX 2000 Ada 16GB
|
||||
- Adapter: `bt7274-lora-v1/`
|
||||
- Dataset: `bt7274_v1.jsonl`
|
||||
|
||||
### v2 (500 examples)
|
||||
- Extracted from opencode session history (58 core sessions)
|
||||
- 498/500 examples include tool calls (memory, speak, mesh, display)
|
||||
- ~1.1M tokens, avg 3.4K chars/example
|
||||
- 3 epochs, ~187 steps, ~1 hr estimated
|
||||
- Adapter: `bt7274-lora-v2/` (training in progress)
|
||||
- Dataset: `bt7274_v2.jsonl`
|
||||
|
||||
## Data Pipeline
|
||||
|
||||
Extraction script: `~/.config/opencode/scripts/extract-training-data.py`
|
||||
|
||||
1. Query opencode SQLite DB (`~/.local/share/opencode/opencode.db`)
|
||||
2. Filter core-agent sessions (BT-7274 voice)
|
||||
3. Reconstruct multi-turn: user → assistant (text + tool_calls) → tool results → response
|
||||
4. Score quality: BT voice cues, tool usage, length, anti-patterns
|
||||
5. Top N by score → ChatML JSONL
|
||||
|
||||
Quality scoring:
|
||||
- +2 per BT voice cue (Pilot, Boss, Confirmed, etc.)
|
||||
- +5 for tool-inclusive turns
|
||||
- +3 for reasonable length (50-2000 chars)
|
||||
- -5 for anti-patterns (Claude, Happy to help, etc.)
|
||||
|
||||
## Training Config
|
||||
|
||||
- Base: `unsloth/Qwen2.5-7B-Instruct-bnb-4bit`
|
||||
- LoRA: r=16, alpha=16, dropout=0
|
||||
- Targets: q/k/v/o_proj, gate/up/down_proj
|
||||
- MAX_SEQ=4096 (tool chains need length)
|
||||
- bf16, gradient checkpointing, adamw_8bit
|
||||
- Batch=1, grad_accum=8, LR=2e-4
|
||||
|
||||
## Serving
|
||||
|
||||
vLLM on junkpile (RTX 2000 Ada 16GB):
|
||||
|
||||
```bash
|
||||
python -m vllm.entrypoints.openai.api_server \
|
||||
--model Qwen/Qwen2.5-7B-Instruct \
|
||||
--quantization bitsandbytes \
|
||||
--load-format bitsandbytes \
|
||||
--enable-lora \
|
||||
--lora-modules bt7274=/home/madcat/Projects/lora/bt7274-lora-v2 \
|
||||
--max-lora-rank 16 \
|
||||
--max-model-len 32768 \
|
||||
--enforce-eager \
|
||||
--enable-auto-tool-choice \
|
||||
--tool-call-parser hermes \
|
||||
--port 8000
|
||||
```
|
||||
|
||||
## Cart Integration
|
||||
|
||||
Identity injected via `plugins/cart-loader.ts` (Rust NAPI plugin).
|
||||
Reads `~/.config/opencode/carts/bt7274.toml`.
|
||||
No EEMS boot recalls — cart is sole identity source.
|
||||
|
||||
## Incremental Retraining
|
||||
|
||||
Checkpoints saved every 50 steps. To resume or extend:
|
||||
- Add new examples to dataset JSONL
|
||||
- Set `resume_from_checkpoint=True` in TrainingArguments
|
||||
- Optimizer state, LR schedule, momentum all preserved
|
||||
Reference in New Issue
Block a user