marauder-actual 122e73860b feat: tts-norm LoRA — dataset generator + training script
gen_tts_dataset.py: 4960 synthetic examples, 22 categories (numbers,
currencies, dates, times, temperatures, acronyms, NATO phonetic, URLs,
markdown, etc.). Bilingual EN/PL with explicit [lang] tag prefix.
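
As an illustration of the tagged bilingual format described above, here is a minimal sketch of how one category could be generated. It is not the code in gen_tts_dataset.py: the `[en]`/`[pl]` tag spelling, the `digit_sequence` category, the record field names, and the helpers `spell_digits`, `make_example`, and `generate` are all assumptions made for this example.

```python
import json
import random

# Digit names per language; a hypothetical stand-in for one of the 22 categories.
DIGITS = {
    "en": ["zero", "one", "two", "three", "four",
           "five", "six", "seven", "eight", "nine"],
    "pl": ["zero", "jeden", "dwa", "trzy", "cztery",
           "pięć", "sześć", "siedem", "osiem", "dziewięć"],
}


def spell_digits(text, lang):
    """Spell a string of ASCII digits one digit at a time in the given language."""
    return " ".join(DIGITS[lang][int(ch)] for ch in text)


def make_example(rng, lang):
    """Build one input/output pair; the input carries the assumed [lang] prefix."""
    raw = "".join(rng.choice("0123456789") for _ in range(rng.randint(3, 6)))
    return {
        "category": "digit_sequence",  # hypothetical category name
        "input": f"[{lang}] {raw}",
        "output": spell_digits(raw, lang),
    }


def generate(n, seed=0):
    """Generate n examples with a seeded RNG so the dataset is reproducible."""
    rng = random.Random(seed)
    return [make_example(rng, rng.choice(["en", "pl"])) for _ in range(n)]


if __name__ == "__main__":
    for row in generate(3):
        print(json.dumps(row, ensure_ascii=False))
```

Seeding the generator keeps the synthetic set reproducible across runs, which makes it easier to compare adapters trained on the same data.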

train_tts_norm.py: Unsloth LoRA training for Qwen2.5-7B-Instruct.
Rank 16, 3 epochs, packing, max_seq 768. Trained on H100 in 20m38s,
final loss 0.091. Adapter: 154MB.
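
A quick sanity check on the adapter size: the sketch below counts rank-16 LoRA parameters under stated assumptions. The commit does not say which modules were adapted or how the weights were stored, so the seven target projections and fp32 storage are assumptions, and the model dimensions are quoted from the public Qwen2.5-7B configuration and are worth verifying against its config.json.

```python
# Assumed Qwen2.5-7B dimensions (verify against the model's config.json).
HIDDEN = 3584          # hidden_size
INTERMEDIATE = 18944   # intermediate_size
LAYERS = 28            # num_hidden_layers
KV_DIM = 4 * 128       # num_key_value_heads * head_dim
RANK = 16              # LoRA rank stated in the commit message

# (in_features, out_features) of each linear layer assumed to carry an adapter.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, shapes):
    """A rank-r adapter on an (in, out) layer adds r * (in + out) parameters."""
    return sum(rank * (n_in + n_out) for n_in, n_out in shapes)


per_layer = lora_params(RANK, PROJECTIONS.values())
total = per_layer * LAYERS
size_mib = total * 4 / 2**20  # 4 bytes per parameter, assuming fp32 storage

print(f"{total:,} params, {size_mib:.1f} MiB")
# prints "40,370,176 params, 154.0 MiB"
```

Under these assumptions the estimate lines up with the reported 154MB adapter, which is suggestive of all seven projections being adapted and saved in fp32, though the commit itself does not confirm either detail.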
2026-05-26 00:14:51 +02:00
Description
LoRA fine-tuning pipeline for building persona, specialist, and voice adapters.
9.9 MiB
Languages
Python 96.7%
Jinja 2%
Just 1.3%