Oxidizer — Rust Specialist LoRA

Adapter codename: oxidizer Agent: build-rust Base model: Qwen/Qwen3.5-27B

Objective

Teach the model Rust-idiomatic code generation aligned with the build-rust agent's system prompt constraints. The adapter should internalize:

Result<T, E> everywhere, no unwrap() in library code
thiserror for libs, anyhow for binaries
tokio 1.x async, tracing for logging
Edition 2024, no Box<dyn Error>, no unjustified unsafe
cargo fmt → clippy → test verification cycle
Workspace-aware Cargo.toml patterns

Data Sources

Session extraction (~150–250 examples)

Classify from opencode build sessions by:

File paths containing .rs, Cargo.toml, Cargo.lock
Bash commands: cargo build, cargo test, cargo clippy, cargo fmt, cargo add
Compiler output patterns: error[E, warning:, rustc
Tool calls editing .rs files

Git repo mining (~100–200 examples)

Target repos:

madcat-os/tengu — Rust mesh daemon, async networking, MQTT
madcat-os/madcat-core — core library, data structures
madcat-os/madcat-tts — TTS pipeline, audio processing
Any other Cargo.toml-rooted repos in madcat-os org

Extract patterns:

Bug fixes with before/after (error diagnosis training)
Refactors (code improvement patterns)
New feature implementations (scaffold → impl → test)
Dependency additions with rationale

Synthetic (~50–100 examples)

Generate for gaps:

Lifetime annotation scenarios
Trait object vs generics decisions
Pin<Box<dyn Future>> patterns
Cross-crate error propagation
Workspace dependency management
#[derive] and proc macro usage

Example Format

{
  "messages": [
    {
      "role": "system",
      "content": "<build-rust agent system prompt>"
    },
    {
      "role": "user",
      "content": "Add error handling to the MQTT connection retry loop in src/mqtt.rs"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "read",
            "arguments": {"filePath": "/path/to/src/mqtt.rs"}
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "<file contents>"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "edit",
            "arguments": {
              "filePath": "/path/to/src/mqtt.rs",
              "oldString": "<original code>",
              "newString": "<fixed code with Result<T, E>>"
            }
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "Edit applied successfully"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "bash",
            "arguments": {"command": "cargo clippy --all-targets -- -D warnings"}
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "<clippy output — clean>"
    },
    {
      "role": "assistant",
      "content": "Error handling added. Connection retry now returns `Result<MqttClient, ConnectionError>` with exponential backoff. Clippy clean."
    }
  ]
}

Evaluation Criteria

Generated Rust code compiles with cargo build
No unwrap() in non-test code
Uses ? propagation, not manual match-on-error
Correct lifetime annotations (no unnecessary 'static)
cargo clippy -- -D warnings passes
Appropriate crate recommendations (tokio, serde, tracing, etc.)
Tool call sequence: read → edit → verify (fmt/clippy/test)

Training Config Overrides

MAX_SEQ = 8192   # Rust files can be long
LR      = 5e-5   # Lower LR for code — less style drift from base

Estimated Size

300–500 examples total
~2M tokens at avg 4K tokens/example
Training time: ~2–3 hrs on H100
Adapter size: ~305 MB

3.9 KiB Raw Permalink Blame History Unescape Escape