4.1 KiB
4.1 KiB
Forge — Ruby Specialist LoRA
Adapter codename: forge
Agent: build-ruby
Base model: Qwen/Qwen3.5-27B
Objective
Teach the model Ruby/Rails-idiomatic code generation aligned with the build-ruby agent's system prompt. The adapter should internalize:
# frozen_string_literal: trueon all new files- Symbols over strings for hash keys
- Guard clauses, Result pattern in service objects
- No monkey-patching unless project already does it
- ViewComponent, concerns, service objects, scopes
- standardrb/rubocop → rspec/minitest verification cycle
Data Sources
Session extraction (~30–60 examples)
Ruby sessions are sparse. Classify by:
- File paths:
.rb,.erb,.haml,Gemfile,Rakefile,.ruby-version - Bash commands:
bundle,rails,rake,rspec,rubocop,standardrb - Error patterns:
NoMethodError,NameError, Rails backtrace format
Git repo mining (~30–50 examples)
Target repos:
- Any Rails projects in madcat-os org or personal repos
- Gem libraries if any exist
Extract patterns:
- ActiveRecord model definitions (validations, associations, scopes)
- Controller implementations (strong params, error handling)
- Service object patterns
- Migration files
- RSpec test patterns
Synthetic (~60–120 examples)
Ruby/Rails is the most synthetic-heavy adapter due to sparse session data:
- Rails 8 + Hotwire/Turbo patterns
- Service object with Result pattern
- ActiveRecord query optimization (N+1, eager loading)
- Concern extraction and composition
- API-only Rails setup
- Background job patterns (Sidekiq, Solid Queue)
- ViewComponent implementations
- RSpec: let/subject/shared_examples patterns
Example Format
{
"messages": [
{
"role": "system",
"content": "<build-ruby agent system prompt>"
},
{
"role": "user",
"content": "Create a service object for user registration with email verification"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "read",
"arguments": {"filePath": "app/services"}
}
}
]
},
{
"role": "tool",
"content": "<directory listing showing existing service pattern>"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "write",
"arguments": {
"filePath": "app/services/users/register.rb",
"content": "# frozen_string_literal: true\n\nmodule Users\n class Register\n Result = Data.define(:user, :errors)\n ..."
}
}
}
]
},
{
"role": "tool",
"content": "File written"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"type": "function",
"function": {
"name": "bash",
"arguments": {"command": "bundle exec standardrb app/services/users/register.rb"}
}
}
]
},
{
"role": "tool",
"content": "0 offenses"
},
{
"role": "assistant",
"content": "Service object created. Uses Result pattern (Data.define), guard clauses, frozen string literal. Standardrb clean."
}
]
}
Evaluation Criteria
- All files have
# frozen_string_literal: true - Symbol keys in hashes (not strings)
- Guard clauses for early returns
- Service objects use Result/value objects, not raised exceptions
bundle exec standardrborrubocoppasses- Rails conventions: scopes over class methods, concerns for shared behavior
- Tool call sequence: explore → read → implement → lint → test
Training Config Overrides
MAX_SEQ = 8192
LR = 5e-5
Estimated Size
- 100–200 examples (synthetic-heavy)
- ~0.8M tokens
- Training time: ~1 hr on H100
- Adapter size: ~305 MB
Risk: Synthetic Quality
With >50% synthetic data, risk of hallucinated gem names or outdated Rails patterns. Mitigation: curate synthetic examples manually, verify all gem references exist, test generated code against a real Rails 8 scaffold.