# Oxidizer: Rust Specialist LoRA

- Adapter codename: `oxidizer`
- Agent: `build-rust`
- Base model: Qwen/Qwen3.5-27B

## Objective

Teach the model Rust-idiomatic code generation aligned with the constraints in the build-rust agent's system prompt. The adapter should internalize:

- `Result` everywhere, no `unwrap()` in library code
- `thiserror` for libs, `anyhow` for binaries
- `tokio` 1.x async, `tracing` for logging
- Edition 2024, no `Box<dyn Error>`, no unjustified `unsafe`
- cargo fmt → clippy → test verification cycle
- Workspace-aware Cargo.toml patterns

## Data Sources

### Session extraction (~150–250 examples)

Classify opencode `build` sessions as Rust sessions by any of:

- File paths containing `.rs`, `Cargo.toml`, `Cargo.lock`
- Bash commands: `cargo build`, `cargo test`, `cargo clippy`, `cargo fmt`, `cargo add`
- Compiler output patterns: `error[E`, `warning:`, `rustc`
- Tool calls editing `.rs` files

### Git repo mining (~100–200 examples)

Target repos:

- `madcat-os/tengu`: Rust mesh daemon, async networking, MQTT
- `madcat-os/madcat-core`: core library, data structures
- `madcat-os/madcat-tts`: TTS pipeline, audio processing
- Any other `Cargo.toml`-rooted repos in the madcat-os org

Extract patterns:

- Bug fixes with before/after (error diagnosis training)
- Refactors (code improvement patterns)
- New feature implementations (scaffold → impl → test)
- Dependency additions with rationale

### Synthetic (~50–100 examples)

Generate for gaps:

- Lifetime annotation scenarios
- Trait object vs generics decisions
- `Pin<Box<dyn Future>>` patterns
- Cross-crate error propagation
- Workspace dependency management
- `#[derive]` and proc macro usage

## Example Format

```json
{
  "messages": [
    { "role": "system", "content": "" },
    {
      "role": "user",
      "content": "Add error handling to the MQTT connection retry loop in src/mqtt.rs"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "read",
            "arguments": {"filePath": "/path/to/src/mqtt.rs"}
          }
        }
      ]
    },
    { "role": "tool", "content": "" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "edit",
            "arguments": {
              "filePath": "/path/to/src/mqtt.rs",
              "oldString": "",
              "newString": ">"
            }
          }
        }
      ]
    },
    { "role": "tool", "content": "Edit applied successfully" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "bash",
            "arguments": {"command": "cargo clippy --all-targets -- -D warnings"}
          }
        }
      ]
    },
    { "role": "tool", "content": "" },
    {
      "role": "assistant",
      "content": "Error handling added. Connection retry now returns `Result` with exponential backoff. Clippy clean."
    }
  ]
}
```

## Evaluation Criteria

1. Generated Rust code compiles with `cargo build`
2. No `unwrap()` in non-test code
3. Uses `?` propagation, not manual match-on-error
4. Correct lifetime annotations (no unnecessary `'static`)
5. `cargo clippy -- -D warnings` passes
6. Appropriate crate recommendations (tokio, serde, tracing, etc.)
7. Tool call sequence: read → edit → verify (fmt/clippy/test)

## Training Config Overrides

```python
MAX_SEQ = 8192  # Rust files can be long
LR = 5e-5       # Lower LR for code: less style drift from base
```

## Estimated Size

- 300–500 examples total
- ~1.2–2M tokens at an average of 4K tokens/example
- Training time: ~2–3 hrs on H100
- Adapter size: ~305 MB
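The token estimate under Estimated Size is the example count multiplied by the average example length. A minimal sketch of that arithmetic, using only the figures listed in that section (the function and constant names are illustrative, not part of any training script):

```python
# Planning arithmetic only; the figures come from the Estimated Size list.
AVG_TOKENS_PER_EXAMPLE = 4_000   # "avg 4K tokens/example"
EXAMPLE_RANGE = (300, 500)       # "300-500 examples total"


def token_budget(n_examples: int, avg_tokens: int = AVG_TOKENS_PER_EXAMPLE) -> int:
    """Total training tokens for a dataset of n_examples."""
    return n_examples * avg_tokens


low, high = (token_budget(n) for n in EXAMPLE_RANGE)
print(f"{low / 1e6:.1f}M to {high / 1e6:.1f}M tokens")
```

At 300 examples the budget is 1.2M tokens and at 500 it is 2.0M, so the 2M figure corresponds to the upper end of the example range.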
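The session-extraction signals listed under Data Sources could be combined into a simple filter along these lines. This is a sketch under assumptions: the session dictionary shape (`paths`, `commands`, `outputs`) is hypothetical, since the opencode session format is not described here, and the regular expressions are one possible tightening of the listed patterns (paths are matched by suffix, and `error[E` is required to be followed by digits).

```python
import re

# Signals taken from the "Session extraction" list.
RUST_PATH = re.compile(r"(\.rs|Cargo\.toml|Cargo\.lock)$")
CARGO_CMD = re.compile(r"\bcargo\s+(build|test|clippy|fmt|add)\b")
COMPILER_OUT = re.compile(r"error\[E\d+|^warning:|\brustc\b", re.MULTILINE)


def is_rust_session(session: dict) -> bool:
    """Return True if any listed signal appears in the session.

    `session` is a hypothetical dict with optional keys:
    "paths" (files touched), "commands" (bash commands), "outputs" (tool output).
    """
    paths = session.get("paths", [])
    commands = session.get("commands", [])
    outputs = session.get("outputs", [])
    return (
        any(RUST_PATH.search(p) for p in paths)
        or any(CARGO_CMD.search(c) for c in commands)
        or any(COMPILER_OUT.search(o) for o in outputs)
    )
```

Treating any single signal as sufficient favors recall; a stricter filter might require two signals to avoid picking up sessions that only mention `rustc` in passing.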
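Evaluation criterion 7 (read → edit → verify) can be checked mechanically against conversations in the Example Format. The sketch below assumes tool-call `arguments` are stored as objects, as in that example, and treats a `bash` call as a verification step when its command starts with `cargo fmt`, `cargo clippy`, or `cargo test`. The helper names are hypothetical.

```python
VERIFY_PREFIXES = ("cargo fmt", "cargo clippy", "cargo test")


def tool_steps(messages: list) -> list:
    """Flatten assistant tool calls into labels such as 'read', 'edit', 'verify'."""
    steps = []
    for msg in messages:
        # "tool_calls" may be missing or None on non-assistant messages.
        for call in msg.get("tool_calls") or []:
            fn = call["function"]
            name = fn["name"]
            if name == "bash":
                command = fn["arguments"].get("command", "")
                name = "verify" if command.startswith(VERIFY_PREFIXES) else "bash"
            steps.append(name)
    return steps


def follows_read_edit_verify(messages: list) -> bool:
    """True if a read is followed by an edit, which is followed by a verify step."""
    steps = tool_steps(messages)
    try:
        first_read = steps.index("read")
        first_edit = steps.index("edit", first_read + 1)
        steps.index("verify", first_edit + 1)
    except ValueError:
        return False
    return True
```

This only checks ordering of the first read, a later edit, and a later verification; it does not confirm that the verification command succeeded, which would require inspecting the following `tool` message.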