# Forge — Ruby Specialist LoRA

- Adapter codename: `forge`
- Agent: `build-ruby`
- Base model: Qwen/Qwen3.5-27B

## Objective

Teach the model Ruby/Rails-idiomatic code generation aligned with the build-ruby agent's system prompt. The adapter should internalize:

- `# frozen_string_literal: true` on all new files
- Symbols over strings for hash keys
- Guard clauses, Result pattern in service objects
- No monkey-patching unless project already does it
- ViewComponent, concerns, service objects, scopes
- standardrb/rubocop → rspec/minitest verification cycle

## Data Sources

### Session extraction (~30–60 examples)

Ruby sessions are sparse. Classify by:

- File paths: `.rb`, `.erb`, `.haml`, `Gemfile`, `Rakefile`, `.ruby-version`
- Bash commands: `bundle`, `rails`, `rake`, `rspec`, `rubocop`, `standardrb`
- Error patterns: `NoMethodError`, `NameError`, Rails backtrace format

### Git repo mining (~30–50 examples)

Target repos:

- Any Rails projects in madcat-os org or personal repos
- Gem libraries if any exist

Extract patterns:

- ActiveRecord model definitions (validations, associations, scopes)
- Controller implementations (strong params, error handling)
- Service object patterns
- Migration files
- RSpec test patterns

### Synthetic (~60–120 examples)

Ruby/Rails is the most synthetic-heavy adapter due to sparse session data:

- Rails 8 + Hotwire/Turbo patterns
- Service object with Result pattern
- ActiveRecord query optimization (N+1, eager loading)
- Concern extraction and composition
- API-only Rails setup
- Background job patterns (Sidekiq, Solid Queue)
- ViewComponent implementations
- RSpec: let/subject/shared_examples patterns

## Example Format

```json
{
  "messages": [
    { "role": "system", "content": "" },
    {
      "role": "user",
      "content": "Create a service object for user registration with email verification"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "read",
            "arguments": { "filePath": "app/services" }
          }
        }
      ]
    },
    { "role": "tool", "content": "" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "write",
            "arguments": {
              "filePath": "app/services/users/register.rb",
              "content": "# frozen_string_literal: true\n\nmodule Users\n  class Register\n    Result = Data.define(:user, :errors)\n    ..."
            }
          }
        }
      ]
    },
    { "role": "tool", "content": "File written" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "bash",
            "arguments": { "command": "bundle exec standardrb app/services/users/register.rb" }
          }
        }
      ]
    },
    { "role": "tool", "content": "0 offenses" },
    {
      "role": "assistant",
      "content": "Service object created. Uses Result pattern (Data.define), guard clauses, frozen string literal. Standardrb clean."
    }
  ]
}
```

## Evaluation Criteria

1. All files have `# frozen_string_literal: true`
2. Symbol keys in hashes (not strings)
3. Guard clauses for early returns
4. Service objects use Result/value objects, not raised exceptions
5. `bundle exec standardrb` or `rubocop` passes
6. Rails conventions: scopes over class methods, concerns for shared behavior
7. Tool call sequence: explore → read → implement → lint → test

## Training Config Overrides

```python
MAX_SEQ = 8192
LR = 5e-5
```

## Estimated Size

- 100–200 examples (synthetic-heavy)
- ~0.8M tokens
- Training time: ~1 hr on H100
- Adapter size: ~305 MB

## Risk: Synthetic Quality

With >50% synthetic data, risk of hallucinated gem names or outdated Rails patterns. Mitigation: curate synthetic examples manually, verify all gem references exist, test generated code against a real Rails 8 scaffold.
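
Part of the manual curation pass could be scripted. The following is a minimal sketch, not an existing tool: `ExampleLint` and its method names are hypothetical, and it assumes each example follows the message format under Example Format, with tool-call `arguments` stored as JSON objects rather than encoded strings. It flags written `.rb` files that lack the magic comment (evaluation criterion 1) and reports whether any `bash` call ran a linter (criterion 5).

```ruby
# frozen_string_literal: true

require "json"

# Hypothetical curation helper (an assumption, not part of any pipeline).
# Keys are strings because the examples come from JSON.parse.
module ExampleLint
  MAGIC_COMMENT = "# frozen_string_literal: true"
  LINTERS = %w[standardrb rubocop].freeze

  # Paths of .rb files written without the magic comment on the first line.
  def self.missing_magic_comment(example)
    example.fetch("messages", []).flat_map do |message|
      Array(message["tool_calls"]).filter_map do |call|
        function = call.fetch("function", {})
        next unless function["name"] == "write"

        args = function.fetch("arguments", {})
        path = args["filePath"].to_s
        next unless path.end_with?(".rb")

        path unless args["content"].to_s.start_with?(MAGIC_COMMENT)
      end
    end
  end

  # True when at least one bash tool call invokes standardrb or rubocop.
  def self.lint_step?(example)
    example.fetch("messages", []).any? do |message|
      Array(message["tool_calls"]).any? do |call|
        function = call.fetch("function", {})
        command = function.fetch("arguments", {})["command"].to_s
        function["name"] == "bash" && LINTERS.any? { |linter| command.include?(linter) }
      end
    end
  end
end

# Usage: a deliberately bad example with no magic comment and no lint step.
# The quoted heredoc keeps \n as a JSON escape instead of a Ruby one.
example = JSON.parse(<<~'JSON')
  {"messages": [
    {"role": "assistant", "content": null, "tool_calls": [
      {"type": "function", "function": {"name": "write", "arguments": {
        "filePath": "app/services/users/register.rb",
        "content": "module Users\nend\n"}}}]}
  ]}
JSON

p ExampleLint.missing_magic_comment(example) # => ["app/services/users/register.rb"]
p ExampleLint.lint_step?(example)            # => false
```

A check like this only covers the mechanical criteria. Hallucinated gem names and outdated Rails patterns still need the manual review and the Rails 8 scaffold test described above.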