comfyanonymous | b8c7c770d3 | Enable bf16-vae by default on ampere and up. | 2023-08-27 23:06:19 -04:00 |
comfyanonymous | a57b0c797b | Fix lowvram model merging. | 2023-08-26 11:52:07 -04:00 |
comfyanonymous | f72780a7e3 | The new smart memory management makes this unnecessary. | 2023-08-25 18:02:15 -04:00 |
comfyanonymous | 30eb92c3cb | Code cleanups. | 2023-08-24 19:39:18 -04:00 |
comfyanonymous | 51dde87e97 | Try to free enough vram for control lora inference. | 2023-08-24 17:20:54 -04:00 |
comfyanonymous | cc44ade79e | Always shift text encoder to GPU when the device supports fp16. | 2023-08-23 21:45:00 -04:00 |
comfyanonymous | a6ef08a46a | Even with forced fp16 the cpu device should never use it. | 2023-08-23 21:38:28 -04:00 |
comfyanonymous | f081017c1a | Save memory by storing text encoder weights in fp16 in most situations. Do inference in fp32 to make sure quality stays the exact same. | 2023-08-23 01:08:51 -04:00 |
comfyanonymous | 0d7b0a4dc7 | Small cleanups. | 2023-08-20 14:56:47 -04:00 |
Simon Lui | 9225465975 | Further tuning and fix mem_free_total. | 2023-08-20 14:19:53 -04:00 |
Simon Lui | 2c096e4260 | Add ipex optimize and other enhancements for Intel GPUs based on recent memory changes. | 2023-08-20 14:19:51 -04:00 |
comfyanonymous | e9469e732d | --disable-smart-memory now disables loading model directly to vram. | 2023-08-20 04:00:53 -04:00 |
comfyanonymous | 3aee33b54e | Add --disable-smart-memory for those that want the old behaviour. | 2023-08-17 03:12:37 -04:00 |
comfyanonymous | 2be2742711 | Fix issue with regular torch version. | 2023-08-17 01:58:54 -04:00 |
comfyanonymous | 89a0767abf | Smarter memory management. Try to keep models on the vram when possible. Better lowvram mode for controlnets. | 2023-08-17 01:06:34 -04:00 |
comfyanonymous | 1ce0d8ad68 | Add CMP 30HX card to the nvidia_16_series list. | 2023-08-04 12:08:45 -04:00 |
comfyanonymous | 4a77fcd6ab | Only shift text encoder to vram when CPU cores are under 8. | 2023-07-31 00:08:54 -04:00 |
comfyanonymous | 3cd31d0e24 | Lower CPU thread check for running the text encoder on the CPU vs GPU. | 2023-07-30 17:18:24 -04:00 |
comfyanonymous | 22f29d66ca | Try to fix memory issue with lora. | 2023-07-22 21:38:56 -04:00 |
comfyanonymous | 4760c29380 | Merge branch 'fix-AttributeError-module-'torch'-has-no-attribute-'mps'' of https://github.com/KarryCharon/ComfyUI | 2023-07-20 00:34:54 -04:00 |
comfyanonymous | 18885f803a | Add MX450 and MX550 to list of cards with broken fp16. | 2023-07-19 03:08:30 -04:00 |
comfyanonymous | ff6b047a74 | Fix device print on old torch version. | 2023-07-17 15:18:58 -04:00 |
comfyanonymous | 1679abd86d | Add a command line argument to enable backend:cudaMallocAsync | 2023-07-17 11:00:14 -04:00 |
comfyanonymous | 5f57362613 | Lower lora ram usage when in normal vram mode. | 2023-07-16 02:59:04 -04:00 |
comfyanonymous | 490771b7f4 | Speed up lora loading a bit. | 2023-07-15 13:25:22 -04:00 |
KarryCharon | 3e2309f149 | fix mps miss import | 2023-07-12 10:06:34 +08:00 |
comfyanonymous | 0ae81c03bb | Empty cache after model unloading for normal vram and lower. | 2023-07-09 09:56:03 -04:00 |
comfyanonymous | e7bee85df8 | Add arguments to run the VAE in fp16 or bf16 for testing. | 2023-07-06 23:23:46 -04:00 |
comfyanonymous | ddc6f12ad5 | Disable autocast in unet for increased speed. | 2023-07-05 21:58:29 -04:00 |
comfyanonymous | 8d694cc450 | Fix issue with OSX. | 2023-07-04 02:09:02 -04:00 |
comfyanonymous | dc9d1f31c8 | Improvements for OSX. | 2023-07-03 00:08:30 -04:00 |
comfyanonymous | 2c4e0b49b7 | Switch to fp16 on some cards when the model is too big. | 2023-07-02 10:00:57 -04:00 |
comfyanonymous | 6f3d9f52db | Add a --force-fp16 argument to force fp16 for testing. | 2023-07-01 22:42:35 -04:00 |
comfyanonymous | 1c1b0e7299 | --gpu-only now keeps the VAE on the device. | 2023-07-01 15:22:40 -04:00 |
comfyanonymous | 3b6fe51c1d | Leave text_encoder on the CPU when it can handle it. | 2023-07-01 14:38:51 -04:00 |
comfyanonymous | b6a60fa696 | Try to keep text encoders loaded and patched to increase speed. load_model_gpu() is now used with the text encoder models instead of just the unet. | 2023-07-01 13:28:07 -04:00 |
comfyanonymous | 97ee230682 | Make highvram and normalvram shift the text encoders to vram and back. This is faster on big text encoder models than running it on the CPU. | 2023-07-01 12:37:23 -04:00 |
comfyanonymous | 62db11683b | Move unet to device right after loading on highvram mode. | 2023-06-29 20:43:06 -04:00 |
comfyanonymous | 8248babd44 | Use pytorch attention by default on nvidia when xformers isn't present. Add a new argument --use-quad-cross-attention | 2023-06-26 13:03:44 -04:00 |
comfyanonymous | f7edcfd927 | Add a --gpu-only argument to keep and run everything on the GPU. Make the CLIP model work on the GPU. | 2023-06-15 15:38:52 -04:00 |
comfyanonymous | fed0a4dd29 | Some comments to say what the vram state options mean. | 2023-06-04 17:51:04 -04:00 |
comfyanonymous | 0a5fefd621 | Cleanups and fixes for model_management.py. Hopefully fix regression on MPS and CPU. | 2023-06-03 11:05:37 -04:00 |
comfyanonymous | 67892b5ac5 | Refactor and improve model_management code related to free memory. | 2023-06-02 15:21:33 -04:00 |
space-nuko | 499641ebf1 | More accurate total | 2023-06-02 00:14:41 -05:00 |
space-nuko | b5dd15c67a | System stats endpoint | 2023-06-01 23:26:23 -05:00 |
comfyanonymous | 5c38958e49 | Tweak lowvram model memory so it's closer to what it was before. | 2023-06-01 04:04:35 -04:00 |
comfyanonymous | 94680732d3 | Empty cache on mps. | 2023-06-01 03:52:51 -04:00 |
comfyanonymous | eb448dd8e1 | Auto load model in lowvram if not enough memory. | 2023-05-30 12:36:41 -04:00 |
comfyanonymous | 3a1f47764d | Print the torch device that is used on startup. | 2023-05-13 17:11:27 -04:00 |
comfyanonymous | 6fc4917634 | Make maximum_batch_area take into account python2.0 attention function. More conservative xformers maximum_batch_area. | 2023-05-06 19:58:54 -04:00 |