Adds client-side concurrent queueing to style-sweep. -P N submits N
prompts to ComfyUI's HTTP queue concurrently via ThreadPoolExecutor.
The GPU still processes one prompt at a time (ComfyUI's queue is
single-worker), but the HTTP submission, websocket polling, image
download, and disk-write phases pipeline with the next prompt's
submission.
Expected speedup: 5-15% on a typical Flux sweep where per-image GPU
time is ~25-30s and per-image overhead (submission, polling, download,
disk write) is ~3-5s. The benefit grows with slower networks or larger
images.
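
A rough back-of-envelope check of that estimate, as a sketch only. It
assumes an idealized pipeline in which the GPU never idles and only the
last image's overhead stays exposed; the function name and the 100-image
sweep size are illustrative, not taken from the code. Real sweeps hide
less than this, which is consistent with the lower 5-15% figure above.

```python
def ideal_wall_time_reduction(gpu_sec: float, overhead_sec: float, n_images: int) -> float:
    """Fraction of wall time saved if overhead overlaps perfectly with GPU work."""
    # Sequential: every image pays GPU time plus its own overhead.
    sequential = n_images * (gpu_sec + overhead_sec)
    # Idealized pipeline (assumption): only the final image's overhead is exposed.
    pipelined = n_images * gpu_sec + overhead_sec
    return 1.0 - pipelined / sequential


# Best and worst corners of the ranges quoted in the commit message.
for gpu, overhead in [(30.0, 3.0), (25.0, 5.0)]:
    saved = ideal_wall_time_reduction(gpu, overhead, n_images=100)
    print(f"gpu={gpu}s overhead={overhead}s -> up to {saved:.1%} saved")
```
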
Design choices:
- Default P=1 preserves the exact existing sequential behavior and
log output (no "(submit #N)" suffix in messages).
- P>1 uses concurrent.futures.as_completed for completion-order
reporting; the manifest is re-sorted to source-list order afterwards.
- Skip-existing + dry-run cases are handled synchronously before the
executor even starts (no point pipelining no-ops).
- --abort-on-error cannot be honored under -P>1 (in-flight workers
can't be stopped reliably); we warn and continue.
- Per-task console output WILL interleave under -P>1 because
_run_generation prints its own progress; users are pointed at the
manifest for clean per-slug timing.
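
The shape of the P>1 path can be sketched roughly as below. This is a
minimal illustration, not the code in this commit: run_one, safe_run,
and sweep are hypothetical names, and run_one is a stand-in for the
real generation call. It shows completion-order collection, per-task
failure containment, and the re-sort back to source order.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(slug: str) -> dict:
    # Hypothetical stand-in for the real per-style generation call.
    time.sleep(random.uniform(0.0, 0.02))  # simulate variable latency
    if slug == "broken":
        raise RuntimeError("simulated failure")
    return {"slug": slug, "success": True, "error": None}


def safe_run(slug: str) -> dict:
    # Contain individual failures so one bad style cannot sink the sweep.
    try:
        return run_one(slug)
    except Exception as exc:
        return {"slug": slug, "success": False, "error": str(exc)}


def sweep(slugs: list[str], parallel: int = 1) -> list[dict]:
    if parallel < 1:
        raise ValueError("parallel must be >= 1")
    if parallel == 1:
        # Sequential path: no executor involved at all.
        return [safe_run(s) for s in slugs]
    results = []
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = [pool.submit(safe_run, s) for s in slugs]
        for fut in as_completed(futures):  # yields in completion order
            results.append(fut.result())
    # Restore source-list order for the manifest.
    order = {slug: i for i, slug in enumerate(slugs)}
    results.sort(key=lambda r: order[r["slug"]])
    return results


manifest = sweep(["noir", "broken", "watercolor", "pixel"], parallel=3)
```

Keeping P=1 on a separate branch, rather than running an executor with
one worker, is one way to guarantee the sequential behavior stays
byte-for-byte unchanged.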
Why not full async multi-GPU-workflow parallelism:
- ComfyUI processes its queue strictly sequentially; we can't
actually run two Flux UNets concurrently without a second ComfyUI
instance, second port, second model dir, etc.
- Even with two instances on one GPU, the CUDA cores time-slice and
you get ~1.1x not 2x.
- Memory math is tighter than it looks even on Spark's 80GB unified
pool: two Flux dev instances = 64GB fixed before any activations.
- Maintenance burden is real; speed gain is marginal.
Client-side pipelining gets the practical wins (overhead hiding,
cleaner progress feedback for long sweeps) without the complexity
or OOM risk.
7 new tests cover: invalid P=0, P=1 equivalence with sequential,
multi-style execution, source-order manifest preservation under
out-of-order completion, skip-existing in parallel mode, individual
failure containment, and the abort-on-error warning.
Test count: 267 -> 274.
--list/-L prints the resolved styles list as a two-column rich table
(slug + suffix truncated to ~80 chars) and exits without generating.
Template becomes optional when --list is paired with an explicit
--styles source, so you can inspect any styles file standalone.
--style/-S SLUG selects a single style by exact slug match; repeatable
for multiple. Unknown slugs print a red error with the available slug
list. The filter applies before --limit and preserves the source
file's order.
Both flags compose with --limit and --dry-run; when filtering down to
a subset, the manifest is still written for the smaller run.
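
The filter-then-limit ordering described above could look something
like this. It is a sketch under assumptions: select_styles is a
hypothetical helper, styles are modeled as dicts with slug and suffix
keys, and a plain ValueError stands in for the red CLI error.

```python
def select_styles(styles, wanted_slugs=None, limit=None):
    """Filter by exact slug (keeping source order), then apply the limit."""
    if wanted_slugs:
        available = [s["slug"] for s in styles]
        unknown = [w for w in wanted_slugs if w not in available]
        if unknown:
            # Stand-in for the CLI's error that lists the available slugs.
            raise ValueError(f"unknown slug(s) {unknown}; available: {available}")
        wanted = set(wanted_slugs)
        # Iterate over the source list, not the request, to keep file order.
        styles = [s for s in styles if s["slug"] in wanted]
    if limit is not None:
        styles = styles[:limit]  # the limit sees the already-filtered list
    return styles


styles = [
    {"slug": "noir", "suffix": "film noir, high contrast"},
    {"slug": "pixel", "suffix": "16-bit pixel art"},
    {"slug": "watercolor", "suffix": "loose watercolor wash"},
]
picked = select_styles(styles, wanted_slugs=["watercolor", "noir"])
```
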
New `tsr style-sweep` command renders one image per style suffix from a
template JSON, composing prompt = template.prompt + ', ' + style.suffix
and writing to {output_dir}/{slug}.png.
- Template JSON mirrors `generate --input` keys plus output_dir + styles.
- Styles source can be a path or inline list/object on either CLI or
template. Relative styles paths in the template resolve against the
template's directory (so templates can ship with their styles file).
- Skips existing outputs by default (--no-skip-existing to force).
- --dry-run prints planned prompts/paths without invoking generate.
- --limit N caps the sweep for fast iteration.
- --continue-on-error keeps going on individual failures; the final exit
code is non-zero if any style failed, and the failed slugs are reported.
- --remote propagates to the underlying generation, same as `generate`.
- Writes a manifest {output_dir}/_sweep.json with per-style results
(slug, prompt, output, seed, duration_sec, success, error).
Delegates to the `_run_generation` helper extracted from `generate`.
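
For illustration, the prompt composition and manifest write might be
sketched as follows. plan_sweep and write_manifest are hypothetical
names, and the manifest is assumed here to be a JSON list of per-style
entries; the real manifest also records seed, duration_sec, success,
and error, which this planning-only sketch omits.

```python
import json
import tempfile
from pathlib import Path


def plan_sweep(template: dict, styles: list[dict]) -> list[dict]:
    """Compose one prompt and one output path per style."""
    out_dir = Path(template["output_dir"])
    return [
        {
            "slug": s["slug"],
            # prompt = template.prompt + ', ' + style.suffix
            "prompt": template["prompt"] + ", " + s["suffix"],
            "output": str(out_dir / f"{s['slug']}.png"),
        }
        for s in styles
    ]


def write_manifest(output_dir: Path, entries: list[dict]) -> Path:
    """Write the entries to {output_dir}/_sweep.json (layout is an assumption)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / "_sweep.json"
    path.write_text(json.dumps(entries, indent=2))
    return path


template = {"prompt": "a lighthouse at dusk", "output_dir": "out"}
styles = [
    {"slug": "noir", "suffix": "film noir, high contrast"},
    {"slug": "pixel", "suffix": "16-bit pixel art"},
]
plan = plan_sweep(template, styles)

# Write to a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = write_manifest(Path(tmp), plan)
    reloaded = json.loads(manifest_path.read_text())
```
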