ComfyUI

Commit Graph

Author	SHA1	Message	Date
comfyanonymous	c8013f73e5	Add some Quadro cards to the list of cards with broken fp16.	2023-10-16 16:48:46 -04:00
comfyanonymous	fd4c5f07e7	Add a --bf16-unet to test running the unet in bf16.	2023-10-13 14:51:10 -04:00
comfyanonymous	9a55dadb4c	Refactor code so model can be a dtype other than fp32 or fp16.	2023-10-13 14:41:17 -04:00
comfyanonymous	88733c997f	pytorch_attention_enabled can now return True when xformers is enabled.	2023-10-11 21:30:57 -04:00
comfyanonymous	20d3852aa1	Pull some small changes from the other repo.	2023-10-11 20:38:48 -04:00
Simon Lui	eec449ca8e	Allow Intel GPUs to LoRA cast on GPU since it supports BF16 natively.	2023-09-22 21:11:27 -07:00
comfyanonymous	1cdfb3dba4	Only do the cast on the device if the device supports it.	2023-09-20 17:52:41 -04:00
comfyanonymous	321c5fa295	Enable pytorch attention by default on xpu.	2023-09-17 04:09:19 -04:00
comfyanonymous	0966d3ce82	Don't run text encoders on xpu because there are issues.	2023-09-14 12:16:07 -04:00
comfyanonymous	1938f5c5fe	Add a force argument to soft_empty_cache to force a cache empty.	2023-09-04 00:58:18 -04:00
Simon Lui	4a0c4ce4ef	Some fixes to generalize CUDA specific functionality to Intel or other GPUs.	2023-09-02 18:22:10 -07:00
comfyanonymous	b8c7c770d3	Enable bf16-vae by default on ampere and up.	2023-08-27 23:06:19 -04:00
comfyanonymous	a57b0c797b	Fix lowvram model merging.	2023-08-26 11:52:07 -04:00
comfyanonymous	f72780a7e3	The new smart memory management makes this unnecessary.	2023-08-25 18:02:15 -04:00
comfyanonymous	30eb92c3cb	Code cleanups.	2023-08-24 19:39:18 -04:00
comfyanonymous	51dde87e97	Try to free enough vram for control lora inference.	2023-08-24 17:20:54 -04:00
comfyanonymous	cc44ade79e	Always shift text encoder to GPU when the device supports fp16.	2023-08-23 21:45:00 -04:00
comfyanonymous	a6ef08a46a	Even with forced fp16 the cpu device should never use it.	2023-08-23 21:38:28 -04:00
comfyanonymous	f081017c1a	Save memory by storing text encoder weights in fp16 in most situations. Do inference in fp32 to make sure quality stays the exact same.	2023-08-23 01:08:51 -04:00
comfyanonymous	0d7b0a4dc7	Small cleanups.	2023-08-20 14:56:47 -04:00
Simon Lui	9225465975	Further tuning and fix mem_free_total.	2023-08-20 14:19:53 -04:00
Simon Lui	2c096e4260	Add ipex optimize and other enhancements for Intel GPUs based on recent memory changes.	2023-08-20 14:19:51 -04:00
comfyanonymous	e9469e732d	--disable-smart-memory now disables loading model directly to vram.	2023-08-20 04:00:53 -04:00
comfyanonymous	3aee33b54e	Add --disable-smart-memory for those that want the old behaviour.	2023-08-17 03:12:37 -04:00
comfyanonymous	2be2742711	Fix issue with regular torch version.	2023-08-17 01:58:54 -04:00
comfyanonymous	89a0767abf	Smarter memory management. Try to keep models on the vram when possible. Better lowvram mode for controlnets.	2023-08-17 01:06:34 -04:00
comfyanonymous	1ce0d8ad68	Add CMP 30HX card to the nvidia_16_series list.	2023-08-04 12:08:45 -04:00
comfyanonymous	4a77fcd6ab	Only shift text encoder to vram when CPU cores are under 8.	2023-07-31 00:08:54 -04:00
comfyanonymous	3cd31d0e24	Lower CPU thread check for running the text encoder on the CPU vs GPU.	2023-07-30 17:18:24 -04:00
comfyanonymous	22f29d66ca	Try to fix memory issue with lora.	2023-07-22 21:38:56 -04:00
comfyanonymous	4760c29380	Merge branch 'fix-AttributeError-module-'torch'-has-no-attribute-'mps'' of https://github.com/KarryCharon/ComfyUI	2023-07-20 00:34:54 -04:00
comfyanonymous	18885f803a	Add MX450 and MX550 to list of cards with broken fp16.	2023-07-19 03:08:30 -04:00
comfyanonymous	ff6b047a74	Fix device print on old torch version.	2023-07-17 15:18:58 -04:00
comfyanonymous	1679abd86d	Add a command line argument to enable backend:cudaMallocAsync	2023-07-17 11:00:14 -04:00
comfyanonymous	5f57362613	Lower lora ram usage when in normal vram mode.	2023-07-16 02:59:04 -04:00
comfyanonymous	490771b7f4	Speed up lora loading a bit.	2023-07-15 13:25:22 -04:00
KarryCharon	3e2309f149	fix mps miss import	2023-07-12 10:06:34 +08:00
comfyanonymous	0ae81c03bb	Empty cache after model unloading for normal vram and lower.	2023-07-09 09:56:03 -04:00
comfyanonymous	e7bee85df8	Add arguments to run the VAE in fp16 or bf16 for testing.	2023-07-06 23:23:46 -04:00
comfyanonymous	ddc6f12ad5	Disable autocast in unet for increased speed.	2023-07-05 21:58:29 -04:00
comfyanonymous	8d694cc450	Fix issue with OSX.	2023-07-04 02:09:02 -04:00
comfyanonymous	dc9d1f31c8	Improvements for OSX.	2023-07-03 00:08:30 -04:00
comfyanonymous	2c4e0b49b7	Switch to fp16 on some cards when the model is too big.	2023-07-02 10:00:57 -04:00
comfyanonymous	6f3d9f52db	Add a --force-fp16 argument to force fp16 for testing.	2023-07-01 22:42:35 -04:00
comfyanonymous	1c1b0e7299	--gpu-only now keeps the VAE on the device.	2023-07-01 15:22:40 -04:00
comfyanonymous	3b6fe51c1d	Leave text_encoder on the CPU when it can handle it.	2023-07-01 14:38:51 -04:00
comfyanonymous	b6a60fa696	Try to keep text encoders loaded and patched to increase speed. load_model_gpu() is now used with the text encoder models instead of just the unet.	2023-07-01 13:28:07 -04:00
comfyanonymous	97ee230682	Make highvram and normalvram shift the text encoders to vram and back. This is faster on big text encoder models than running it on the CPU.	2023-07-01 12:37:23 -04:00
comfyanonymous	62db11683b	Move unet to device right after loading on highvram mode.	2023-06-29 20:43:06 -04:00
comfyanonymous	8248babd44	Use pytorch attention by default on nvidia when xformers isn't present. Add a new argument --use-quad-cross-attention	2023-06-26 13:03:44 -04:00
comfyanonymous	f7edcfd927	Add a --gpu-only argument to keep and run everything on the GPU. Make the CLIP model work on the GPU.	2023-06-15 15:38:52 -04:00
comfyanonymous	fed0a4dd29	Some comments to say what the vram state options mean.	2023-06-04 17:51:04 -04:00
comfyanonymous	0a5fefd621	Cleanups and fixes for model_management.py Hopefully fix regression on MPS and CPU.	2023-06-03 11:05:37 -04:00
comfyanonymous	67892b5ac5	Refactor and improve model_management code related to free memory.	2023-06-02 15:21:33 -04:00
space-nuko	499641ebf1	More accurate total	2023-06-02 00:14:41 -05:00
space-nuko	b5dd15c67a	System stats endpoint	2023-06-01 23:26:23 -05:00
comfyanonymous	5c38958e49	Tweak lowvram model memory so it's closer to what it was before.	2023-06-01 04:04:35 -04:00
comfyanonymous	94680732d3	Empty cache on mps.	2023-06-01 03:52:51 -04:00
comfyanonymous	eb448dd8e1	Auto load model in lowvram if not enough memory.	2023-05-30 12:36:41 -04:00
comfyanonymous	3a1f47764d	Print the torch device that is used on startup.	2023-05-13 17:11:27 -04:00
comfyanonymous	6fc4917634	Make maximum_batch_area take into account python2.0 attention function. More conservative xformers maximum_batch_area.	2023-05-06 19:58:54 -04:00
comfyanonymous	678f933d38	maximum_batch_area for xformers. Remove useless code.	2023-05-06 19:28:46 -04:00
comfyanonymous	cb1551b819	Lowvram mode for gligen and fix some lowvram issues.	2023-05-05 18:11:41 -04:00
comfyanonymous	6ee11d7bc0	Fix import.	2023-05-05 00:19:35 -04:00
comfyanonymous	bae4fb4a9d	Fix imports.	2023-05-04 18:10:29 -04:00
comfyanonymous	056e5545ff	Don't try to get vram from xpu or cuda when directml is enabled.	2023-04-29 00:28:48 -04:00
comfyanonymous	2ca934f7d4	You can now select the device index with: --directml id Like this for example: --directml 1	2023-04-28 16:51:35 -04:00
comfyanonymous	3baded9892	Basic torch_directml support. Use --directml to use it.	2023-04-28 14:28:57 -04:00
comfyanonymous	5282f56434	Implement Linear hypernetworks. Add a HypernetworkLoader node to use hypernetworks.	2023-04-23 12:35:25 -04:00
comfyanonymous	3696d1699a	Add support for GLIGEN textbox model.	2023-04-19 11:06:32 -04:00
comfyanonymous	deb2b93e79	Move code to empty gpu cache to model_management.py	2023-04-15 11:19:07 -04:00
comfyanonymous	1e1875f674	Print xformers version and warning about 0.0.18	2023-04-09 01:31:47 -04:00
comfyanonymous	64557d6781	Add a --force-fp32 argument to force fp32 for debugging.	2023-04-07 00:27:54 -04:00
comfyanonymous	bceccca0e5	Small refactor.	2023-04-06 23:53:54 -04:00
藍+85CD	05eeaa2de5	Merge branch 'master' into ipex	2023-04-07 09:11:30 +08:00
藍+85CD	3e2608e12b	Fix auto lowvram detection on CUDA	2023-04-06 15:44:05 +08:00
藍+85CD	7cb924f684	Use separate variables instead of `vram_state`	2023-04-06 14:24:47 +08:00
藍+85CD	84b9c0ac2f	Import intel_extension_for_pytorch as ipex	2023-04-06 12:27:22 +08:00
EllangoK	e5e587b1c0	seperates out arg parser and imports args	2023-04-05 23:41:23 -04:00
藍+85CD	37713e3b0a	Add basic XPU device support closed #387	2023-04-05 21:22:14 +08:00
comfyanonymous	e46b1c3034	Disable xformers in VAE when xformers == 0.0.18	2023-04-04 22:22:02 -04:00
Francesco Yoshi Gobbo	f55755f0d2	code cleanup	2023-03-27 06:48:09 +02:00
Francesco Yoshi Gobbo	cf0098d539	no lowvram state if cpu only	2023-03-27 04:51:18 +02:00
comfyanonymous	4adcea7228	I don't think controlnets were being handled correctly by MPS.	2023-03-24 14:33:16 -04:00
Yurii Mazurevich	fc71e7ea08	Fixed typo	2023-03-24 19:39:55 +02:00
Yurii Mazurevich	4b943d2b60	Removed unnecessary comment	2023-03-24 14:15:30 +02:00
Yurii Mazurevich	89fd5ed574	Added MPS device support	2023-03-24 14:12:56 +02:00
comfyanonymous	3ed4a4e4e6	Try again with vae tiled decoding if regular fails because of OOM.	2023-03-22 14:49:00 -04:00
comfyanonymous	9d0665c8d0	Add laptop quadro cards to fp32 list.	2023-03-21 16:57:35 -04:00
comfyanonymous	ee46bef03a	Make --cpu have priority over everything else.	2023-03-13 21:30:01 -04:00
comfyanonymous	83f23f82b8	Add pytorch attention support to VAE.	2023-03-13 12:45:54 -04:00
comfyanonymous	a256a2abde	--disable-xformers should not even try to import xformers.	2023-03-13 11:36:48 -04:00
comfyanonymous	0f3ba7482f	Xformers is now properly disabled when --cpu used. Added --windows-standalone-build option, currently it only opens makes the code open up comfyui in the browser.	2023-03-12 15:44:16 -04:00
comfyanonymous	afff30fc0a	Add --cpu to use the cpu for inference.	2023-03-06 10:50:50 -05:00
comfyanonymous	ebfcf0a9c9	Fix issue.	2023-03-03 13:18:01 -05:00
comfyanonymous	fed315a76a	To be really simple CheckpointLoaderSimple should pick the right type.	2023-03-03 11:07:10 -05:00
comfyanonymous	c1f5855ac1	Make some cross attention functions work on the CPU.	2023-03-03 03:27:33 -05:00
comfyanonymous	69cc75fbf8	Add a way to interrupt current processing in the backend.	2023-03-02 14:42:03 -05:00
comfyanonymous	2c5f0ec681	Small adjustment.	2023-02-27 20:04:18 -05:00
comfyanonymous	86721d5158	Enable highvram automatically when vram >> ram	2023-02-27 19:57:39 -05:00

159 Commits