add v5 substrate training data (40 examples)

Hand-crafted examples for the generic substrate LoRA:
- 01: identity/self-model
- 02-18: tool categories (159 tools grouped by function)
- 19-26: tricky tool disambiguation
- 27-32: framing (substrate vs persona, HMT, psychoacoustic trust)
- 33-40: EEMS deep-dive (architecture, recall, store, graph, legacy)

Examples are stored as pretty-printed JSON and compiled to JSONL by the build pipeline.
The system prompt uses a generic assistant, not the BT persona.
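
The JSON-to-JSONL compile step could look roughly like the following. This is a minimal sketch, not the project's actual build pipeline: the function name `compile_jsonl`, the one-file-per-example directory layout, and the `messages` sanity check are all assumptions made for illustration.

```python
import json
from pathlib import Path


def compile_jsonl(src_dir: Path, out_path: Path) -> int:
    """Write one compact JSON line per *.json file in src_dir; return the count."""
    count = 0
    with out_path.open("w", encoding="utf-8") as out:
        # Sorted so numbered files (01, 02, ...) keep their order in the output.
        for path in sorted(src_dir.glob("*.json")):
            example = json.loads(path.read_text(encoding="utf-8"))
            # Hypothetical sanity check: every example must carry a "messages" list.
            if not isinstance(example, dict) or not isinstance(example.get("messages"), list):
                raise ValueError(f"{path.name}: missing 'messages' list")
            # No indent means one line per example; newlines inside strings stay escaped.
            # ensure_ascii=False keeps non-ASCII characters readable in the output.
            out.write(json.dumps(example, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Called as `compile_jsonl(Path("examples"), Path("train.jsonl"))` (both paths hypothetical), this would emit one line per example file, in filename order.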
marauder-actual
2026-05-31 14:32:19 +02:00
parent 175d28a5aa
commit 0e88c3c2ae
40 changed files with 1033 additions and 0 deletions
@@ -0,0 +1,16 @@
{
"messages": [
{
"role": "system",
"content": "You are a helpful AI assistant with access to tools. Use tools when they help answer the user's question. Be concise and accurate."
},
{
"role": "user",
"content": "Can you speak out loud? What TTS do you have?"
},
{
"role": "assistant",
"content": "<think>\nTTS tools — text-to-speech via the madcat-tts daemon. GPU-accelerated, multiple engines and voices.\n</think>\n\nYes. The TTS system is madcat-tts — a GPU-accelerated speech synthesis daemon. Five tools:\n\n- **tts_speak** — synthesize and play speech locally. Accepts text, optional voice ID or cart tag, language override, and format (wav/mp3/flac). Can wait for playback to finish or fire-and-forget.\n- **tts_list** — list all available carts and voices with their engines, languages, and tags.\n- **tts_active** — show the currently active cart and voice.\n- **tts_switch_voice** — change the active voice by ID or cart tag.\n- **tts_health** — check daemon health, engine availability, and active cart.\n\nThe system supports multiple TTS engines:\n- **Chatterbox** — voice cloning engine, high quality\n- **Piper** — fast CPU-based synthesis, low latency\n- **XTTS** — remote high-quality synthesis\n\nVoices are organized into carts (e.g. `bt7274` cart has `bt7274-en` and `bt7274-pl` voices). Each cart can have voices in different languages and engines. The voice parameter accepts a voice ID (e.g. `bt7274-pl`), a cart tag (e.g. `bt7274` to use its default), or omit for the currently active voice."
}
]
}