marauder-actual
465e74f49e
fix: patch is_mllm_model for Qwen3.6 text-only model
...
AutoRound misidentifies Qwen3_5ForConditionalGeneration as a VLM
and tries to load a vision processor. Patch to force LLM mode.
2026-06-01 04:26:15 +02:00
marauder-actual
367ed705ab
fix: convert chat messages to text for AutoRound calibration
2026-06-01 04:23:28 +02:00
marauder-actual
4edaeeb21b
switch quantization from llm-compressor to AutoRound
...
llm-compressor pins transformers<=4.57.6, can't load Qwen3.6.
AutoRound (Intel) works with transformers 5.x and is already
installed as an llmcompressor dependency. Produces vLLM-compatible
INT4 output.
2026-06-01 04:22:07 +02:00
marauder-actual
934be8ce48
fix: load tokenizer from base repo for quant venv compat
...
Merged model has tokenizer_class=TokenizersBackend (transformers 5.x)
which is unknown to transformers 4.57.6 in the quant venv.
2026-06-01 04:16:32 +02:00
marauder-actual
0fa46c9fed
add AWQ quantization script (llm-compressor)
2026-06-01 04:15:15 +02:00