# Forge — Ruby Specialist LoRA

Adapter codename: `forge`
Agent: `build-ruby`
Base model: Qwen/Qwen3.5-27B

## Objective

Teach the model to generate idiomatic Ruby/Rails code aligned with the build-ruby
agent's system prompt. The adapter should internalize:

- `# frozen_string_literal: true` on all new files
- Symbols over strings for hash keys
- Guard clauses, Result pattern in service objects
- No monkey-patching unless the project already does it
- ViewComponent, concerns, service objects, scopes
- standardrb/rubocop → rspec/minitest verification cycle
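
As a rough illustration of the target style, the sketch below combines several of the items above in plain Ruby, without Rails. The `repository` hash, the email regex, and the error messages are hypothetical stand-ins chosen for this example, and `Data.define` assumes Ruby 3.2 or newer.

```ruby
# frozen_string_literal: true

# Hypothetical example: the collaborators and validation rules are illustrative,
# not taken from a real project. Requires Ruby 3.2+ for Data.define.
module Users
  class Register
    Result = Data.define(:user, :errors) do
      def success? = errors.empty?
    end

    EMAIL_FORMAT = /\A[^@\s]+@[^@\s]+\z/

    def initialize(repository:)
      @repository = repository # any Hash-like store keyed by email
    end

    # Returns a Result instead of raising, so callers branch on success?.
    def call(email:, name:)
      # Guard clauses: reject bad input early and keep the happy path flat.
      return failure(email: "is invalid") unless email.to_s.match?(EMAIL_FORMAT)
      return failure(name: "can't be blank") if name.to_s.strip.empty?
      return failure(email: "is already taken") if @repository.key?(email)

      user = { email: email, name: name } # symbol keys, not string keys
      @repository[email] = user
      Result.new(user: user, errors: {})
    end

    private

    def failure(errors)
      Result.new(user: nil, errors: errors)
    end
  end
end
```

A caller would then write something like `result = Users::Register.new(repository: {}).call(email: "a@example.com", name: "Ada")` and check `result.success?` rather than rescuing an exception.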

## Data Sources

### Session extraction (~30–60 examples)

Ruby sessions are sparse. Classify by:
- File paths: `.rb`, `.erb`, `.haml`, `Gemfile`, `Rakefile`, `.ruby-version`
- Bash commands: `bundle`, `rails`, `rake`, `rspec`, `rubocop`, `standardrb`
- Error patterns: `NoMethodError`, `NameError`, Rails backtrace format
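
The three signals above could be combined roughly as follows. This is a minimal sketch under stated assumptions: the session shape (`paths`, `commands`, `output`) is hypothetical, since the real session schema is not described here, and treating any single signal as sufficient is a policy choice made for the example.

```ruby
# frozen_string_literal: true

# Hypothetical session shape: { paths: [...], commands: [...], output: "..." }.
# Adapt the keys to whatever the real session records look like.
module SessionClassifier
  RUBY_EXTENSIONS = %w[.rb .erb .haml].freeze
  RUBY_FILENAMES = %w[Gemfile Rakefile .ruby-version].freeze
  RUBY_COMMANDS = %w[bundle rails rake rspec rubocop standardrb].freeze
  # Only the two named errors are matched; Rails backtrace detection is omitted.
  RUBY_ERRORS = /\b(NoMethodError|NameError)\b/

  module_function

  def ruby_path?(path)
    RUBY_EXTENSIONS.include?(File.extname(path)) ||
      RUBY_FILENAMES.include?(File.basename(path))
  end

  def ruby_command?(command)
    # Compare only the executable name, so "bundle exec rspec" matches "bundle".
    RUBY_COMMANDS.include?(command.to_s.split.first)
  end

  # Collects which signals fired, so borderline sessions can be reviewed by hand.
  def signals(session)
    found = []
    found << :path if session.fetch(:paths, []).any? { |path| ruby_path?(path) }
    found << :command if session.fetch(:commands, []).any? { |cmd| ruby_command?(cmd) }
    found << :error if session.fetch(:output, "").match?(RUBY_ERRORS)
    found
  end

  def ruby_session?(session)
    signals(session).any?
  end
end
```

Returning the list of fired signals, rather than a bare boolean, makes it easier to audit weak matches. For instance, `NameError` also appears in Python tracebacks, so a session that fires only `:error` probably deserves a manual look.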

### Git repo mining (~30–50 examples)

Target repos:
- Any Rails projects in madcat-os org or personal repos
- Gem libraries if any exist

Extract patterns:
- ActiveRecord model definitions (validations, associations, scopes)
- Controller implementations (strong params, error handling)
- Service object patterns
- Migration files
- RSpec test patterns

### Synthetic (~60–120 examples)

Ruby/Rails is the most synthetic-heavy adapter due to sparse session data:
- Rails 8 + Hotwire/Turbo patterns
- Service object with Result pattern
- ActiveRecord query optimization (N+1, eager loading)
- Concern extraction and composition
- API-only Rails setup
- Background job patterns (Sidekiq, Solid Queue)
- ViewComponent implementations
- RSpec: let/subject/shared_examples patterns

## Example Format

```json
{
  "messages": [
    {
      "role": "system",
      "content": "<build-ruby agent system prompt>"
    },
    {
      "role": "user",
      "content": "Create a service object for user registration with email verification"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "read",
            "arguments": {"filePath": "app/services"}
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "<directory listing showing existing service pattern>"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "write",
            "arguments": {
              "filePath": "app/services/users/register.rb",
              "content": "# frozen_string_literal: true\n\nmodule Users\n  class Register\n    Result = Data.define(:user, :errors)\n    ..."
            }
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "File written"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "bash",
            "arguments": {"command": "bundle exec standardrb app/services/users/register.rb"}
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "0 offenses"
    },
    {
      "role": "assistant",
      "content": "Service object created. Uses Result pattern (Data.define), guard clauses, frozen string literal. Standardrb clean."
    }
  ]
}
```
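
Before training, it may be worth checking that every example has the same overall shape as the one above. The sketch below is one possible structural check. Its rules (system message first, every tool message preceded by an assistant tool call, a final assistant reply with text) are inferred from this single example rather than from a documented schema, and `ExampleValidator` is a hypothetical name.

```ruby
# frozen_string_literal: true

require "json"

# Hypothetical validator: the rules are inferred from the example format,
# not from a published schema, so treat failures as prompts for review.
module ExampleValidator
  module_function

  # Returns an array of problem descriptions; empty means the example passed.
  def problems(example)
    messages = example.fetch("messages", [])
    return ["no messages"] if messages.empty?

    found = []
    found << "first message must be system" unless messages.first["role"] == "system"
    found << "last message must be an assistant reply with text" unless final_reply?(messages.last)

    messages.each_cons(2) do |previous, current|
      next unless current["role"] == "tool"
      next if previous["role"] == "assistant" && previous["tool_calls"].is_a?(Array)

      found << "tool message without a preceding assistant tool call"
    end
    found
  end

  def final_reply?(message)
    message["role"] == "assistant" && message["content"].is_a?(String)
  end
end
```

Assuming one JSON object per line in the dataset file, a line could be checked with `ExampleValidator.problems(JSON.parse(line))`.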

## Evaluation Criteria

1. All files have `# frozen_string_literal: true`
2. Symbol keys in hashes (not strings)
3. Guard clauses for early returns
4. Service objects use Result/value objects, not raised exceptions
5. `bundle exec standardrb` or `rubocop` passes
6. Rails conventions: scopes over class methods, concerns for shared behavior
7. Tool call sequence: explore → read → implement → lint → test
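
Criteria 1 and 2 can be approximated with static checks on the generated source, as sketched below. This is a heuristic illustration only: `StaticChecks` is a hypothetical name, the string-key regex can misfire, and criteria 3 to 7 would need a linter run or the recorded tool-call trace instead.

```ruby
# frozen_string_literal: true

# Hypothetical static checks for criteria 1 and 2 only.
module StaticChecks
  MAGIC_COMMENT = /\A#\s*frozen_string_literal:\s*true\s*\z/
  # Heuristic: flags "key" => value pairs. It can misfire inside string
  # literals or on hashes that legitimately need string keys (HTTP headers).
  STRING_KEY = /["'][^"']+["']\s*=>/

  module_function

  def frozen_string_literal?(source)
    # The magic comment only takes effect in the file's leading comment block.
    header = source.each_line.take_while { |line| line.start_with?("#") || line.strip.empty? }
    header.any? { |line| line.strip.match?(MAGIC_COMMENT) }
  end

  # Returns the 1-based line numbers that appear to use string hash keys.
  def string_key_lines(source)
    source.each_line.with_index(1).select { |line, _| line.match?(STRING_KEY) }.map(&:last)
  end

  def report(source)
    {
      frozen_string_literal: frozen_string_literal?(source),
      string_key_lines: string_key_lines(source)
    }
  end
end
```

Running `StaticChecks.report(File.read(path))` over each file written during an evaluation episode would give a quick first pass, with flagged lines reviewed by hand rather than scored automatically.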

## Training Config Overrides

```python
MAX_SEQ = 8192  # maximum sequence length, in tokens
LR = 5e-5       # learning rate
```

## Estimated Size

- 100–200 examples (synthetic-heavy)
- ~0.8M tokens
- Training time: ~1 hr on H100
- Adapter size: ~305 MB

## Risk: Synthetic Quality

With more than 50% synthetic data, there is a risk of hallucinated gem names or outdated Rails patterns.
Mitigation: curate synthetic examples manually, verify that every referenced gem exists,
and test generated code against a real Rails 8 scaffold.