Optimization suggestions #60
Open
Have you considered the Tiny AutoEncoder for Hunyuan (Wan variant)? It's a direct drop-in for the VAE you use that takes a fraction of the memory and latency. I subbed it in myself with great success.
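A toy sketch of what "direct drop-in" means here: the tiny decoder exposes the same latent-in, pixels-out interface and output shape as the full one, just with far fewer parameters. Everything below is illustrative (the layer widths, latent channels, and per-pixel linear "convolutions" are made up for the demo, not taken from the actual models):

```python
import numpy as np

class ToyDecoder:
    """Toy stand-in for a VAE decoder: latents (H, W, C_lat) -> pixels (H, W, C_out)."""
    def __init__(self, hidden_dims, c_lat=16, c_out=3, seed=0):
        rng = np.random.default_rng(seed)
        dims = [c_lat, *hidden_dims, c_out]
        # Per-pixel linear maps (1x1-conv analogue), enough to show the interface.
        self.weights = [rng.standard_normal((i, o)) * 0.1
                        for i, o in zip(dims, dims[1:])]

    @property
    def n_params(self):
        return sum(w.size for w in self.weights)

    def decode(self, latents):
        x = latents
        for w in self.weights[:-1]:
            x = np.maximum(x @ w, 0.0)   # ReLU between layers
        return x @ self.weights[-1]      # (H, W, C_out)

full = ToyDecoder(hidden_dims=[512, 512, 256])  # stands in for the full VAE decoder
tiny = ToyDecoder(hidden_dims=[64])             # stands in for the tiny AE decoder

z = np.random.default_rng(1).standard_normal((60, 104, 16))
assert full.decode(z).shape == tiny.decode(z).shape  # identical interface
print(f"full: {full.n_params:,} params, tiny: {tiny.n_params:,} params")
```

Because the interfaces match, swapping decoders is a one-line change at the call site; the memory and latency savings come entirely from the smaller weight stack.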
Also, did you know there are GGUF quantization options for Wan-based models? I'm testing compatibility now, but I see no reason a Q4 quant wouldn't run.
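For a rough sense of the savings, here's the back-of-envelope math. GGUF's Q4_0 format stores each 32-weight block as 16 bytes of packed 4-bit values plus one fp16 scale (18 bytes total, i.e. 4.5 bits/weight) versus 16 bits/weight for FP16; the 14B parameter count below is just an assumed example, not the actual model size:

```python
# Back-of-envelope: GGUF Q4_0 footprint vs FP16 for a hypothetical 14B-param model.
# Q4_0 packs 32 weights into 18 bytes (16 bytes of nibbles + 2-byte fp16 scale).
N_PARAMS = 14e9  # assumed parameter count, for illustration only

fp16_gb = N_PARAMS * 2 / 1e9          # 2 bytes per weight
q4_0_gb = N_PARAMS * (18 / 32) / 1e9  # 18 bytes per 32-weight block

print(f"FP16 : {fp16_gb:.1f} GB")
print(f"Q4_0 : {q4_0_gb:.1f} GB  ({q4_0_gb / fp16_gb:.0%} of FP16)")
```

So a Q4 quant cuts weight memory to roughly 28% of FP16, which is what makes the consumer-GPU scale plausible.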
The combination of the two would likely bring real-time inference down to 4090 scale and trajectory rollout down to laptop scale.
Just thoughts, I appreciate the project regardless.