aladac 1908cf91c4 feat(style-sweep): --parallel-queue/-P for concurrent ComfyUI submissions
Adds client-side concurrent queueing to style-sweep. -P N submits N
prompts to ComfyUI's HTTP queue concurrently via ThreadPoolExecutor.
The GPU still processes one prompt at a time (ComfyUI's queue is
single-worker), but the HTTP submission, websocket polling, image
download, and disk-write phases pipeline with the next prompt's
submission.

Expected speedup: 5-15% on a typical Flux sweep where per-image GPU
time is ~25-30s and overhead is ~3-5s. Real benefit grows with
slower networks or larger images.
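
As a rough sanity check on that estimate, the sketch below computes an idealized upper bound, assuming every second of client-side overhead overlaps with the next prompt's GPU time. The function name `pipelined_saving` is hypothetical and not part of style-sweep; the inputs are the per-image figures quoted above. Real savings should land at or below this bound, since some overhead will not overlap perfectly.

```python
def pipelined_saving(gpu_s: float, overhead_s: float) -> float:
    """Fraction of wall time saved per image if all overhead is hidden.

    Idealized assumption: the GPU is never idle once pipelining is on,
    so the per-image cost drops from (gpu + overhead) to gpu alone.
    """
    sequential = gpu_s + overhead_s  # submit, poll, download, write in series
    pipelined = gpu_s                # overhead overlaps the next GPU run
    return 1.0 - pipelined / sequential


# Smallest overhead against the slowest GPU time, and the reverse.
low = pipelined_saving(gpu_s=30.0, overhead_s=3.0)
high = pipelined_saving(gpu_s=25.0, overhead_s=5.0)
print(f"upper bound on saving: {low:.1%} .. {high:.1%}")
# prints "upper bound on saving: 9.1% .. 16.7%"
```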

Design choices:
- Default P=1 preserves the exact existing sequential behavior and
  log output (no "(submit #N)" suffix in messages).
- P>1 uses ThreadPoolExecutor.as_completed for completion-order
  reporting; the manifest is re-sorted to source-list order after.
- Skip-existing + dry-run cases are handled synchronously before the
  executor even starts (no point pipelining no-ops).
- --abort-on-error cannot be honored under -P>1 (in-flight workers
  can't be reliably stopped); we warn and continue.
- Per-task console output WILL interleave under -P>1 because
  _run_generation prints its own progress; users are pointed at the
  manifest for clean per-slug timing.
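
The design choices above can be sketched roughly as follows. This is a minimal illustration, not the actual style-sweep code: `run_one`, `run_contained`, and `sweep` are hypothetical names, `run_one` stands in for the real submit/poll/download/write cycle, and slugs are assumed to be unique. It shows completion-order reporting via `as_completed`, per-task failure containment, and the final re-sort to source-list order.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(slug: str) -> dict:
    """Hypothetical stand-in for one submit/poll/download/write cycle."""
    if slug == "bad":
        raise RuntimeError("simulated ComfyUI failure")
    time.sleep(0.01 * len(slug))  # varying durations force out-of-order completion
    return {"slug": slug, "status": "ok"}


def run_contained(slug: str) -> dict:
    """Turn a failure into a manifest entry so one bad style cannot kill the sweep."""
    try:
        return run_one(slug)
    except Exception as exc:
        return {"slug": slug, "status": "error", "error": str(exc)}


def sweep(slugs: list[str], parallel: int = 1) -> list[dict]:
    if parallel < 1:
        raise ValueError("parallel must be >= 1")
    if parallel == 1:
        # Sequential path: no executor involved at all.
        return [run_contained(slug) for slug in slugs]

    order = {slug: i for i, slug in enumerate(slugs)}  # assumes unique slugs
    manifest = []
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = [pool.submit(run_contained, slug) for slug in slugs]
        for future in as_completed(futures):  # yields in completion order
            entry = future.result()
            print(f"{entry['slug']}: {entry['status']}")
            manifest.append(entry)
    # Restore source-list order regardless of which task finished first.
    manifest.sort(key=lambda entry: order[entry["slug"]])
    return manifest


result = sweep(["long-style-name", "bad", "a"], parallel=3)
```

Containing the exception inside the worker keeps `future.result()` from raising in the reporting loop, which is one simple way to get individual failure containment.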

Why not full async multi-GPU-workflow parallelism:
- ComfyUI processes its queue strictly sequentially; we can't
  actually run two Flux UNets concurrently without a second ComfyUI
  instance, second port, second model dir, etc.
- Even with two instances on one GPU, the CUDA cores time-slice and
  you get ~1.1x not 2x.
- Memory math is tighter than it looks even on Spark's 80GB unified
  pool: two Flux dev instances = 64GB fixed before any activations.
- Maintenance burden is real; speed gain is marginal.

Client-side pipelining gets the practical wins (overhead hiding,
cleaner progress feedback for long sweeps) without the complexity
or OOM risk.

7 new tests covering: invalid P=0, P=1 equivalence with sequential,
multi-style execution, source-order manifest preservation under
chaotic completion, skip-existing in parallel mode, individual
failure containment, and abort-on-error warning.

267 -> 274 tests.
2026-05-17 18:44:15 +02:00