3.9 KiB
3.9 KiB
Oxidizer — Rust Specialist LoRA
Adapter codename: oxidizer
Agent: build-rust
Base model: Qwen/Qwen3.5-27B
Objective
Teach the model Rust-idiomatic code generation aligned with the build-rust agent's system prompt constraints. The adapter should internalize:
Result<T, E>everywhere, nounwrap()in library codethiserrorfor libs,anyhowfor binariestokio1.x async,tracingfor logging- Edition 2024, no
Box<dyn Error>, no unjustifiedunsafe - cargo fmt → clippy → test verification cycle
- Workspace-aware Cargo.toml patterns
Data Sources
Session extraction (~150–250 examples)
Classify from opencode build sessions by:
- File paths containing
.rs,Cargo.toml,Cargo.lock - Bash commands:
cargo build,cargo test,cargo clippy,cargo fmt,cargo add - Compiler output patterns:
error[E,warning:,rustc - Tool calls editing
.rsfiles
Git repo mining (~100–200 examples)
Target repos:
madcat-os/tengu— Rust mesh daemon, async networking, MQTTmadcat-os/madcat-core— core library, data structuresmadcat-os/madcat-tts— TTS pipeline, audio processing- Any other
Cargo.toml-rooted repos in madcat-os org
Extract patterns:
- Bug fixes with before/after (error diagnosis training)
- Refactors (code improvement patterns)
- New feature implementations (scaffold → impl → test)
- Dependency additions with rationale
Synthetic (~50–100 examples)
Generate for gaps:
- Lifetime annotation scenarios
- Trait object vs generics decisions
Pin<Box<dyn Future>>patterns- Cross-crate error propagation
- Workspace dependency management
#[derive]and proc macro usage
Example Format
{
"messages": [
{
"role": "system",
"content": "<build-rust agent system prompt>"
},
{
"role": "user",
"content": "Add error handling to the MQTT connection retry loop in src/mqtt.rs"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "read",
"arguments": {"filePath": "/path/to/src/mqtt.rs"}
}
}
]
},
{
"role": "tool",
"content": "<file contents>"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "edit",
"arguments": {
"filePath": "/path/to/src/mqtt.rs",
"oldString": "<original code>",
"newString": "<fixed code with Result<T, E>>"
}
}
}
]
},
{
"role": "tool",
"content": "Edit applied successfully"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "bash",
"arguments": {"command": "cargo clippy --all-targets -- -D warnings"}
}
}
]
},
{
"role": "tool",
"content": "<clippy output — clean>"
},
{
"role": "assistant",
"content": "Error handling added. Connection retry now returns `Result<MqttClient, ConnectionError>` with exponential backoff. Clippy clean."
}
]
}
Evaluation Criteria
- Generated Rust code compiles with
cargo build - No
unwrap()in non-test code - Uses
?propagation, not manual match-on-error - Correct lifetime annotations (no unnecessary
'static) cargo clippy -- -D warningspasses- Appropriate crate recommendations (tokio, serde, tracing, etc.)
- Tool call sequence: read → edit → verify (fmt/clippy/test)
Training Config Overrides
MAX_SEQ = 8192 # Rust files can be long
LR = 5e-5 # Lower LR for code — less style drift from base
Estimated Size
- 300–500 examples total
- ~2M tokens at avg 4K tokens/example
- Training time: ~2–3 hrs on H100
- Adapter size: ~305 MB