aladac/tensors

Commit Graph

Author

SHA1

Message

Date

aladac

2ca9003f86

style: clean lint warnings introduced by parallel-queue change

- Drop unused `import json` from new test module (F401).
- Remove unused `# noqa: BLE001` directives — project ruff config doesn't
  enable BLE001 so the suppressions were dead weight (RUF100 x3).
- Replace `×` (U+00D7) with ASCII `x` in console output (RUF001).
- Collapse seed-strategy if/else into ternary (SIM108).
- Use `enumerate(as_completed(...), start=1)` for completion counter
  instead of manual `completed = 0; completed += 1` (SIM113).
- Run `ruff format` on touched files.
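
For illustration, the SIM113 change in the list above can be sketched as follows. This is a minimal standalone example, not the project's code: `square` and `run_all` are hypothetical names, and only `ThreadPoolExecutor`, `as_completed`, and `enumerate` come from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square(n: int) -> int:
    """Hypothetical stand-in for one unit of work."""
    return n * n


def run_all(values, max_workers=2):
    """Run every value through the pool and record (completed, total, result)."""
    progress = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(square, v) for v in values]
        # enumerate(..., start=1) replaces the manual
        # `completed = 0; completed += 1` counter flagged by SIM113.
        for completed, future in enumerate(as_completed(futures), start=1):
            progress.append((completed, len(futures), future.result()))
    return progress
```

The counter follows completion order rather than submission order, so it always runs 1..N even when the results arrive out of order.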

Pre-existing lint errors on master (PLC0415/PLR0915/SIM113 in unrelated
commands) are untouched — separate cleanup PR if desired. Net delta of
this branch over master: 0 new lint errors.

All 374 tests still passing.

2026-05-18 23:34:22 +02:00

aladac

6ddcf84167

feat(generate): add --parallel-queue/-P for concurrent submissions

Mirrors the style-sweep --parallel-queue flag on the `generate` command.
When used with --count N > 1, splits the request into N independent
batch_size=1 jobs queued P-at-a-time via ThreadPoolExecutor instead of
a single ComfyUI batch.
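
A rough sketch of the fanout described above, under stated assumptions: `submit_job` and `generate_parallel` are hypothetical names, the job is a plain dict rather than a real ComfyUI request, and the legacy path is modelled as one batch of size `count`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def submit_job(index: int) -> dict:
    """Hypothetical stand-in for submitting one batch_size=1 job."""
    return {"index": index, "batch_size": 1}


def generate_parallel(count: int, parallel_queue: int) -> list[dict]:
    """Split `count` images into independent jobs, at most `parallel_queue` in flight."""
    if count <= 1 or parallel_queue <= 1:
        # Legacy path (assumed shape): a single batch covering all images.
        return [{"index": 0, "batch_size": count}]

    results = []
    # max_workers caps how many submissions are queued at the same time.
    with ThreadPoolExecutor(max_workers=parallel_queue) as pool:
        futures = [pool.submit(submit_job, i) for i in range(count)]
        for future in as_completed(futures):
            results.append(future.result())
    # Completion order is arbitrary, so restore submission order.
    return sorted(results, key=lambda job: job["index"])
```

With `count=3` and `parallel_queue=2`, this yields three single-image jobs, of which at most two are in flight at once.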

Each task receives a distinct seed (incrementing from --seed when set,
freshly randomized per task when --seed=-1) and a distinct output path
following the existing stem_NNN.suffix convention. The GPU still
processes one prompt at a time, but HTTP queueing, websocket polling,
and image-download phases pipeline across tasks for a meaningful
wall-clock speedup on warmed-up models (~30-50% in practice).
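
The per-task seed and output-path rules above might look roughly like this. The helper names, the 32-bit seed range, the three-digit zero padding, and numbering from 1 are all assumptions made for the sketch; the commit only states the `stem_NNN.suffix` convention and the two seed strategies.

```python
import random
from pathlib import Path

MAX_SEED = 2**32 - 1  # assumed upper bound, not taken from the project


def task_seeds(seed: int, count: int) -> list[int]:
    """Increment from `seed` when it is set; draw a fresh random seed per task for -1."""
    return [
        seed + i if seed != -1 else random.randint(0, MAX_SEED)
        for i in range(count)
    ]


def task_paths(output: Path, count: int) -> list[Path]:
    """Derive one stem_NNN.suffix path per task (numbering from 1 is assumed)."""
    return [
        output.with_name(f"{output.stem}_{i:03d}{output.suffix}")
        for i in range(1, count + 1)
    ]
```

For example, `task_seeds(42, 3)` gives `[42, 43, 44]`, and `task_paths(Path("out.png"), 2)` gives `out_001.png` and `out_002.png`. Independently drawn random seeds can in principle collide, so a stricter version would need to deduplicate them.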

Implementation notes:
- count=1 always takes the legacy sequential path regardless of -P.
- -P 1 is also sequential — identical behavior to pre-flag invocations.
- Bare model names (`-m lust_v10`) are resolved to canonical filenames
  ONCE in the parent before fanout, so worker tasks (which run with
  json_output=True path semantics for stdout) don't each duplicate the
  validation step or, worse, forward unresolved names to ComfyUI.
- --json + -P>1 is rejected up-front: the JSON path inside _run_generation
  short-circuits the disk-save block, which would silently produce zero
  files. Better to fail loudly than to save nothing.
- parallel_queue is plumbed through --input (JSON/YAML) like every other
  generate param, with the usual CLI-flag-wins precedence.
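
Two of the notes above, the CLI-flag-wins precedence and the up-front rejection of --json with -P>1, can be sketched as below. The function names, the dict-based parameter shape, the use of `None` for "flag not given", and the error message are illustrative assumptions, not the project's actual API.

```python
def merge_params(file_params: dict, cli_params: dict) -> dict:
    """Start from --input file values, then let explicitly set CLI flags win."""
    merged = dict(file_params)
    # None is assumed to mean "flag not given on the command line".
    merged.update({k: v for k, v in cli_params.items() if v is not None})
    return merged


def validate(params: dict) -> None:
    """Reject --json together with parallel_queue > 1 before any work starts."""
    if params.get("json_output") and params.get("parallel_queue", 1) > 1:
        raise ValueError("--json cannot be combined with --parallel-queue > 1")
```

Validating before the fanout means the command stops with an error instead of completing while writing no files.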

Tests: 15 new in tests/test_generate_parallel.py covering validation,
fanout topology, seed strategies, output naming, --input integration,
partial-failure exit code, and a concurrency assertion that confirms
threads actually overlap.
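
One way to write a concurrency assertion of the kind mentioned above is a barrier that only releases once every task is running at the same moment. This is a sketch of that general technique, not the test in tests/test_generate_parallel.py; `threads_overlap` and its parameters are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def threads_overlap(tasks: int, pool_size: int, timeout: float = 5.0) -> bool:
    """Return True only if all `tasks` were inside the worker body simultaneously."""
    barrier = threading.Barrier(tasks)

    def task() -> bool:
        try:
            # Blocks until `tasks` threads are waiting here at the same time.
            barrier.wait(timeout=timeout)
            return True
        except threading.BrokenBarrierError:
            # Raised on timeout, or immediately once the barrier is broken.
            return False

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(task) for _ in range(tasks)]
        return all(f.result() for f in futures)
```

A barrier avoids comparing timestamps: if the pool ran the tasks one after another, the first waiter would time out and the function would return False.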

Manual E2E against ComfyUI on sin: -c 3 -P 3 on FLUX produced 3 distinct
images in ~83s vs the ~195s a pure sequential run would take.
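
As a quick check of the figures above, the relative saving can be computed directly. The helper names are illustrative, and the inputs are the approximate timings quoted in the message.

```python
def time_saved_fraction(sequential_s: float, parallel_s: float) -> float:
    """Fraction of wall-clock time saved relative to the sequential run."""
    return (sequential_s - parallel_s) / sequential_s


def speedup_factor(sequential_s: float, parallel_s: float) -> float:
    """How many times faster the parallel run is."""
    return sequential_s / parallel_s


saved = time_saved_fraction(195, 83)   # about 0.57
factor = speedup_factor(195, 83)       # about 2.35
```

For this single FLUX run that works out to roughly 57% less wall-clock time, somewhat above the ~30-50% range quoted earlier in the message.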

2026-05-18 23:31:33 +02:00

2 Commits