Commit Graph

349 Commits

Author SHA1 Message Date
comfyanonymous e032ca6138 Fix ddim issue with older torch versions. 2023-07-19 10:16:00 -04:00
comfyanonymous 18885f803a Add MX450 and MX550 to list of cards with broken fp16. 2023-07-19 03:08:30 -04:00
comfyanonymous 9ba440995a It's actually possible to torch.compile the unet now. 2023-07-18 21:36:35 -04:00
comfyanonymous 51d5477579 Add key to indicate checkpoint is v_prediction when saving. 2023-07-18 00:25:53 -04:00
comfyanonymous ff6b047a74 Fix device print on old torch version. 2023-07-17 15:18:58 -04:00
comfyanonymous 9871a15cf9 Enable --cuda-malloc by default on torch 2.0 and up.
Add --disable-cuda-malloc to disable it.
2023-07-17 15:12:10 -04:00
comfyanonymous 55d0fca9fa --windows-standalone-build now enables --cuda-malloc 2023-07-17 14:10:36 -04:00
comfyanonymous 1679abd86d Add a command line argument to enable backend:cudaMallocAsync 2023-07-17 11:00:14 -04:00
comfyanonymous 3a150bad15 Only calculate randn in some samplers when it's actually being used. 2023-07-17 10:11:08 -04:00
comfyanonymous ee8f8ee07f Fix regression with ddim and uni_pc when batch size > 1. 2023-07-17 09:35:19 -04:00
comfyanonymous 3ded1a3a04 Refactor of sampler code to deal more easily with different model types. 2023-07-17 01:22:12 -04:00
comfyanonymous 5f57362613 Lower lora ram usage when in normal vram mode. 2023-07-16 02:59:04 -04:00
comfyanonymous 490771b7f4 Speed up lora loading a bit. 2023-07-15 13:25:22 -04:00
comfyanonymous 50b1180dde Fix CLIPSetLastLayer not reverting when removed. 2023-07-15 01:41:21 -04:00
comfyanonymous 6fb084f39d Reduce floating point rounding errors in loras. 2023-07-15 00:53:00 -04:00
comfyanonymous 91ed2815d5 Add a node to merge CLIP models. 2023-07-14 02:41:18 -04:00
comfyanonymous b2f03164c7 Prevent the clip_g position_ids key from being saved in the checkpoint.
This is to make it match the official checkpoint.
2023-07-12 20:15:02 -04:00
comfyanonymous 46dc050c9f Fix potential tensors being on different devices issues. 2023-07-12 19:29:27 -04:00
comfyanonymous 606a537090 Support SDXL embedding format with 2 CLIP. 2023-07-10 10:34:59 -04:00
comfyanonymous 6ad0a6d7e2 Don't patch weights when multiplier is zero. 2023-07-09 17:46:56 -04:00
comfyanonymous d5323d16e0 latent2rgb matrix for SDXL. 2023-07-09 13:59:09 -04:00
comfyanonymous 0ae81c03bb Empty cache after model unloading for normal vram and lower. 2023-07-09 09:56:03 -04:00
comfyanonymous d3f5998218 Support loading clip_g from diffusers in CLIP Loader nodes. 2023-07-09 09:33:53 -04:00
comfyanonymous a9a4ba7574 Fix merging not working when model2 of model merge node was a merge. 2023-07-08 22:31:10 -04:00
comfyanonymous bb5fbd29e9 Merge branch 'condmask-fix' of https://github.com/vmedea/ComfyUI 2023-07-07 01:52:25 -04:00
comfyanonymous e7bee85df8 Add arguments to run the VAE in fp16 or bf16 for testing. 2023-07-06 23:23:46 -04:00
comfyanonymous 608fcc2591 Fix bug with weights when prompt is long. 2023-07-06 02:43:40 -04:00
comfyanonymous ddc6f12ad5 Disable autocast in unet for increased speed. 2023-07-05 21:58:29 -04:00
comfyanonymous 603f02d613 Fix loras not working when loading checkpoint with config. 2023-07-05 19:42:24 -04:00
comfyanonymous af7a49916b Support loading unet files in diffusers format. 2023-07-05 17:38:59 -04:00
comfyanonymous e57cba4c61 Add gpu variations of the sde samplers that are less deterministic
but faster.
2023-07-05 01:39:38 -04:00
comfyanonymous f81b192944 Add logit scale parameter so it's present when saving the checkpoint. 2023-07-04 23:01:28 -04:00
comfyanonymous acf95191ff Properly support SDXL diffusers loras for unet. 2023-07-04 21:15:23 -04:00
mara c61a95f9f7 Fix size check for conditioning mask
The wrong dimensions were being checked, [1] and [2] are the image size.
not [2] and [3]. This results in an out-of-bounds error if one of them
actually matches.
2023-07-04 16:34:42 +02:00
comfyanonymous 8d694cc450 Fix issue with OSX. 2023-07-04 02:09:02 -04:00
comfyanonymous c3e96e637d Pass device to CLIP model. 2023-07-03 16:09:37 -04:00
comfyanonymous 5e6bc824aa Allow passing custom path to clip-g and clip-h. 2023-07-03 15:45:04 -04:00
comfyanonymous dc9d1f31c8 Improvements for OSX. 2023-07-03 00:08:30 -04:00
comfyanonymous 103c487a89 Cleanup. 2023-07-02 11:58:23 -04:00
comfyanonymous 2c4e0b49b7 Switch to fp16 on some cards when the model is too big. 2023-07-02 10:00:57 -04:00
comfyanonymous 6f3d9f52db Add a --force-fp16 argument to force fp16 for testing. 2023-07-01 22:42:35 -04:00
comfyanonymous 1c1b0e7299 --gpu-only now keeps the VAE on the device. 2023-07-01 15:22:40 -04:00
comfyanonymous ce35d8c659 Lower latency by batching some text encoder inputs. 2023-07-01 15:07:39 -04:00
comfyanonymous 3b6fe51c1d Leave text_encoder on the CPU when it can handle it. 2023-07-01 14:38:51 -04:00
comfyanonymous b6a60fa696 Try to keep text encoders loaded and patched to increase speed.
load_model_gpu() is now used with the text encoder models instead of just
the unet.
2023-07-01 13:28:07 -04:00
comfyanonymous 97ee230682 Make highvram and normalvram shift the text encoders to vram and back.
This is faster on big text encoder models than running it on the CPU.
2023-07-01 12:37:23 -04:00
comfyanonymous 5a9ddf94eb LoraLoader node now caches the lora file between executions. 2023-06-29 23:40:51 -04:00
comfyanonymous 9920367d3c Fix embeddings not working with --gpu-only 2023-06-29 20:43:06 -04:00
comfyanonymous 62db11683b Move unet to device right after loading on highvram mode. 2023-06-29 20:43:06 -04:00
comfyanonymous 4376b125eb Remove useless code. 2023-06-29 00:26:33 -04:00