Commit Graph

1274 Commits

Author SHA1 Message Date
comfyanonymous 471cd3eace fp8 casting is fast on GPUs that support fp8 compute. 2024-10-20 00:54:47 -04:00
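
The speedup noted above is gated on hardware with native fp8 tensor cores. A minimal capability-check sketch, assuming CUDA and an Ada (SM 8.9) cutoff; the helper name is illustrative, not necessarily ComfyUI's actual API:

```python
import torch

def supports_fp8_compute(device=None):
    # fp8 (e4m3/e5m2) tensor-core matmul needs an Ada (SM 8.9) or
    # Hopper (SM 9.0) class NVIDIA GPU; older cards only benefit from casting.
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability(device) >= (8, 9)
```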
comfyanonymous a68bbafddb Support diffusion models with scaled fp8 weights. 2024-10-19 23:47:42 -04:00
comfyanonymous 73e3a9e676 Clamp output when rounding weight to prevent NaN. 2024-10-19 19:07:10 -04:00
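
The NaN guard above amounts to clamping into float8's finite range before the cast, since e4m3fn has no inf to absorb overflow. A minimal sketch assuming PyTorch >= 2.1 float8 dtypes, not the repo's exact code:

```python
import torch

def round_weight_to_fp8(w: torch.Tensor) -> torch.Tensor:
    # e4m3fn tops out at +/-448; anything beyond that turns into NaN
    # on conversion, so clamp to the finite limits first.
    lim = torch.finfo(torch.float8_e4m3fn).max
    return w.clamp(-lim, lim).to(torch.float8_e4m3fn)
```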
comfyanonymous 67158994a4 Use the lowvram cast_to function for everything. 2024-10-17 17:25:56 -04:00
comfyanonymous 0bedfb26af Revert "Fix Transformers FutureWarning (#5140)"
This reverts commit 95b7cf9bbe.
2024-10-16 12:36:19 -04:00
comfyanonymous f584758271 Cleanup some useless lines. 2024-10-14 21:02:39 -04:00
svdc 95b7cf9bbe
Fix Transformers FutureWarning (#5140)
* Update sd1_clip.py

Fix Transformers FutureWarning

* Update sd1_clip.py

Fix comment
2024-10-14 20:12:20 -04:00
comfyanonymous 3c60ecd7a8 Fix fp8 ops staying enabled. 2024-10-12 14:10:13 -04:00
comfyanonymous 7ae6626723 Remove useless argument. 2024-10-12 07:16:21 -04:00
comfyanonymous 6632365e16 model_options consistency between functions.
weight_dtype -> dtype
2024-10-11 20:51:19 -04:00
Kadir Nar ad07796777
🐛 Add device to variable c (#5210) 2024-10-11 20:37:50 -04:00
Jedrzej Kosinski 1f8d9c040b Fixed models not being unloaded properly due to current_patcher reference; the current ComfyUI model cleanup code requires that nothing else has a reference to the ModelPatcher instances 2024-10-11 06:50:55 -05:00
comfyanonymous 1b80895285 Make clip loader nodes support loading sd3 t5xxl in lower precision.
Add attention mask support in the SD3 text encoder code.
2024-10-10 15:06:15 -04:00
Dr.Lt.Data 5f9d5a244b
Hotfix for the div zero occurrence when memory_used_encode is 0 (#5121)
https://github.com/comfyanonymous/ComfyUI/issues/5069#issuecomment-2382656368
2024-10-09 23:34:34 -04:00
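
The hotfix pattern for a division like this is a floor on the denominator. A hypothetical illustration with invented names, not the actual patch:

```python
def safe_batch_count(free_memory: int, memory_used_encode: int) -> int:
    # memory_used_encode can be 0 for some models; floor the denominator
    # and never return a batch count below 1.
    return max(1, free_memory // max(memory_used_encode, 1))
```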
Jonathan Avila 4b2f0d9413
Increase maximum macOS version to 15.0.1 when forcing upcast attention (#5191) 2024-10-09 22:21:41 -04:00
comfyanonymous e38c94228b Add a weight_dtype fp8_e4m3fn_fast to the Diffusion Model Loader node.
This is used to load weights in fp8 and use fp8 matrix multiplication.
2024-10-09 19:43:17 -04:00
comfyanonymous 203942c8b2 Fix flux doras with diffusers keys. 2024-10-08 19:03:40 -04:00
Jedrzej Kosinski 1e2777bab1 Added uuid to conds in CFGGuider and uuids to transformer_options to allow uniquely identifying conds in batches during sampling 2024-10-08 17:52:01 -05:00
comfyanonymous 8dfa0cc552 Make SD3 fast previews a little better. 2024-10-07 09:19:59 -04:00
Jedrzej Kosinski 4fdfe2f704 Merge branch 'master' into patch_hooks 2024-10-07 04:29:16 -05:00
comfyanonymous e5ecdfdd2d Make fast previews for SDXL a little better by adding a bias. 2024-10-06 19:27:04 -04:00
comfyanonymous 7d29fbf74b Slightly improve the fast previews for flux by adding a bias. 2024-10-06 17:55:46 -04:00
comfyanonymous 7d2467e830 Some minor cleanups. 2024-10-05 13:22:39 -04:00
Jedrzej Kosinski 06fbdb03ef Merge branch 'master' into patch_hooks 2024-10-05 06:43:43 -05:00
comfyanonymous 6f021d8aa0 Let --verbose have an argument for the log level. 2024-10-04 10:05:34 -04:00
comfyanonymous d854ed0bcf Allow using SD3 type te output on flux model. 2024-10-03 09:44:54 -04:00
comfyanonymous abcd006b8c Allow more permutations of clip/t5 in dual clip loader. 2024-10-03 09:26:11 -04:00
comfyanonymous d985d1d7dc CLIP Loader node now supports clip_l and clip_g only for SD3. 2024-10-02 04:25:17 -04:00
comfyanonymous d1cdf51e1b Refactor some of the TE detection code. 2024-10-01 07:08:41 -04:00
comfyanonymous b4626ab93e Add simpletuner lycoris format for SD unet. 2024-09-30 06:03:27 -04:00
comfyanonymous a9e459c2a4 Use torch.nn.functional.linear in RGB preview code.
Add an optional bias to the latent RGB preview code.
2024-09-29 11:27:49 -04:00
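
A latent RGB preview is just a per-pixel linear projection from latent channels to RGB, which is why F.linear (with an optional bias to shift brightness) fits. A minimal sketch under that reading of the commit; the factor values here are placeholders:

```python
import torch
import torch.nn.functional as F

def latent_to_rgb_preview(latent, rgb_factors, bias=None):
    # latent: (B, C, H, W); rgb_factors: (3, C); bias: (3,) or None.
    # Returns (B, H, W, 3), ready to rescale into image range.
    return F.linear(latent.movedim(1, -1), rgb_factors, bias)

latent = torch.randn(1, 4, 64, 64)          # SD1.5-style 4-channel latent
factors = torch.randn(3, 4) * 0.3           # placeholder projection matrix
print(latent_to_rgb_preview(latent, factors).shape)  # torch.Size([1, 64, 64, 3])
```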
comfyanonymous 3bb4dec720 Fix issue with loras, lowvram and --fast fp8. 2024-09-28 14:42:32 -04:00
City 8733191563
Flux torch.compile fix (#5082) 2024-09-27 22:07:51 -04:00
kosinkadink1@gmail.com 0c8bd63aa9 Added Combine versions of Cond/Cond Pair Set Props nodes, renamed Pair Cond to Cond Pair, fixed default conds never applying hooks (due to hooks key typo) 2024-09-27 14:42:50 +09:00
kosinkadink1@gmail.com 0f7d379d24 Refactored WrapperExecutor code to remove need for WrapperClassExecutor (now gone), added sampler.sample wrapper (pending review, will likely keep but will see what hacks this could currently let me get rid of in ACN/ADE) 2024-09-27 12:14:36 +09:00
kosinkadink1@gmail.com 09cbd69161 Added create_model_options_clone func, modified type annotations to use __future__ so that I can use the better type annotations 2024-09-25 20:29:49 +09:00
kosinkadink1@gmail.com fd2d572447 Modified ControlNet/T2IAdapter get_control function to receive transformer_options as additional parameter, made the model_options stored in extra_args in inner_sample be a clone of the original model_options instead of same ref 2024-09-25 19:46:33 +09:00
comfyanonymous bdd4a22a2e Fix flux TE not loading t5 embeddings. 2024-09-24 22:57:22 -04:00
kosinkadink1@gmail.com d3229cbba7 Implement basic MemoryCounter system for determining which cached weights due to hooks should be offloaded in hooks_backup 2024-09-24 17:28:18 +09:00
kosinkadink1@gmail.com c422553b0b Added get_attachment func on ModelPatcher 2024-09-24 16:20:53 +09:00
chaObserv 479a427a48
Add dpmpp_2m_cfg_pp (#4992) 2024-09-24 02:42:56 -04:00
kosinkadink1@gmail.com da6c0455cc Added forward_timestep_embed_patch type, added helper functions on ModelPatcher for emb_patch and forward_timestep_embed_patch, added helper functions for removing callbacks/wrappers/additional_models by key, added custom_should_register prop to hooks 2024-09-24 12:40:54 +09:00
comfyanonymous 3a0eeee320 Make --listen listen on both ipv4 and ipv6 at the same time by default. 2024-09-23 04:38:19 -04:00
comfyanonymous 9c41bc8d10 Remove useless line. 2024-09-23 02:32:29 -04:00
kosinkadink1@gmail.com 7c86407619 Refactored callbacks+wrappers to allow storing lists by id 2024-09-22 16:36:40 +09:00
comfyanonymous 7a415f47a9 Add an optional VAE input to the ControlNetApplyAdvanced node.
Deprecate the other controlnet nodes.
2024-09-22 01:24:52 -04:00
kosinkadink1@gmail.com a154d0df23 Merge branch 'master' into patch_hooks 2024-09-22 11:54:55 +09:00
kosinkadink1@gmail.com 5052a78be2 Added WrapperExecutor for non-classbound functions, added calc_cond_batch wrappers 2024-09-22 11:52:35 +09:00
kosinkadink1@gmail.com 298397d198 Updated clone_has_same_weights function to account for new ModelPatcher properties, improved AutoPatcherEjector usage in partially_load 2024-09-21 21:50:51 +09:00
comfyanonymous dc96a1ae19 Load controlnet in fp8 if weights are in fp8. 2024-09-21 04:50:12 -04:00
kosinkadink1@gmail.com f28d892c16 Fix skip_until_exit logic bug breaking injection after first run of model 2024-09-21 16:34:40 +09:00
comfyanonymous 2d810b081e Add load_controlnet_state_dict function. 2024-09-21 01:51:51 -04:00
comfyanonymous 9f7e9f0547 Add an error message when a controlnet needs a VAE but none is given. 2024-09-21 01:33:18 -04:00
kosinkadink1@gmail.com 5f450d3351 Started scaffolding for other hook types, refactored get_hooks_from_cond to organize hooks by type 2024-09-21 10:37:18 +09:00
kosinkadink1@gmail.com 59d72b4050 Added wrappers to ModelPatcher to facilitate standardized function wrapping 2024-09-20 20:05:29 +09:00
comfyanonymous 70a708d726 Fix model merging issue. 2024-09-20 02:31:44 -04:00
yoinked e7d4782736
add laplace scheduler [2407.03297] (#4990)
* add laplace scheduler [2407.03297]

* should be here instead lol

* better settings
2024-09-19 23:23:09 -04:00
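
The Laplace scheduler (arXiv 2407.03297) places log-sigmas via the inverse CDF of a Laplace(mu, beta) distribution, concentrating steps around mid noise levels. A sketch of that idea; the defaults and clamping here are assumptions and may differ from the merged implementation:

```python
import torch

def laplace_sigmas(steps, sigma_min=0.03, sigma_max=14.6, mu=0.0, beta=0.5):
    # Inverse CDF of Laplace(mu, beta) on an even grid yields log-sigmas
    # clustered around mu; exponentiate, clamp to the model's sigma range,
    # and append the terminal 0.
    eps = 1e-5  # keeps log(0) out of the endpoints
    t = torch.linspace(0, 1, steps)
    log_sigma = mu - beta * torch.sign(0.5 - t) * torch.log(1 - 2 * torch.abs(0.5 - t) + eps)
    sigmas = torch.exp(log_sigma).clamp(sigma_min, sigma_max)
    return torch.cat([sigmas, sigmas.new_zeros(1)])
```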
kosinkadink1@gmail.com 55014293b1 Added injections support to ModelPatcher + necessary bookkeeping, added additional_models support in ModelPatcher, conds, and hooks 2024-09-19 21:43:58 +09:00
comfyanonymous ad66f7c7d8 Add model_options to load_controlnet function. 2024-09-19 08:23:35 -04:00
Simon Lui de8e8e3b0d
Fix xpu Pytorch nightly build from calling optimize which doesn't exist. (#4978) 2024-09-19 05:11:42 -04:00
kosinkadink1@gmail.com e80dc96627 Fix incorrect ref to create_hook_patches_clone after moving function 2024-09-19 11:57:19 +09:00
kosinkadink1@gmail.com 787ef34842 Continued work on simpler Create Hook Model As LoRA node, started to implement ModelPatcher callbacks, attachments, and additional_models 2024-09-19 11:47:25 +09:00
pharmapsychotic 0b7dfa986d
Improve tiling calculations to reduce number of tiles that need to be processed. (#4944) 2024-09-17 03:51:10 -04:00
comfyanonymous d514bb38ee Add some options to model_options for the text encoder.
load_device, offload_device and the initial_device can now be set.
2024-09-17 03:49:54 -04:00
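
A hypothetical illustration of the three device options named above; the dict keys come from the commit message, while how the dict is threaded through the loader is assumed:

```python
import torch

model_options = {
    "load_device": torch.device("cuda:0"),    # where the text encoder runs
    "offload_device": torch.device("cpu"),    # where it parks when idle
    "initial_device": torch.device("cpu"),    # where weights land at load time
}
```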
kosinkadink1@gmail.com 6b14fc8795 Merge branch 'master' into patch_hooks 2024-09-17 15:31:03 +09:00
kosinkadink1@gmail.com c29006e669 Initial work on adding 'model_as_lora' lora type to calculate_weight 2024-09-17 15:30:33 +09:00
comfyanonymous 0849c80e2a get_key_patches now works without unloading the model. 2024-09-17 01:57:59 -04:00
kosinkadink1@gmail.com cfb145187d Made Set Clip Hooks node work with hooks from Create Hook nodes, began work on better Create Hook Model As LoRA node 2024-09-17 09:55:14 +09:00
kosinkadink1@gmail.com 4b472ba44c Added support for adding weight hooks that aren't registered on the ModelPatcher at sampling time 2024-09-17 06:22:41 +09:00
kosinkadink1@gmail.com f5c899f42a Fixed MaxSpeed and default conds implementations 2024-09-15 21:00:45 +09:00
comfyanonymous e813abbb2c Long CLIP L support for SDXL, SD3 and Flux.
Use the *CLIPLoader nodes.
2024-09-15 07:59:38 -04:00
kosinkadink1@gmail.com 5a9aa5817c Added initial hook scheduling nodes, small renaming/refactoring 2024-09-15 18:39:31 +09:00
kosinkadink1@gmail.com a5034df6db Made CLIP work with hook patches 2024-09-15 15:47:09 +09:00
kosinkadink1@gmail.com 9ded65a616 Added initial set of hook-related nodes, added code to register hooks for loras/model-as-loras, small renaming/refactoring 2024-09-15 08:33:17 +09:00
comfyanonymous f48e390032 Support AliMama SD3 and Flux inpaint controlnets.
Use the ControlNetInpaintingAliMamaApply node.
2024-09-14 09:05:16 -04:00
kosinkadink1@gmail.com f5abdc6f86 Merge branch 'master' into patch_hooks 2024-09-14 17:29:30 +09:00
kosinkadink1@gmail.com 5dadd97583 Added default_conds support in calc_cond_batch func 2024-09-14 17:21:50 +09:00
kosinkadink1@gmail.com f160d46340 Added call to initialize_timesteps on hooks in process_conds func, and added call to prepare current keyframe on hooks in calc_cond_batch 2024-09-14 16:10:42 +09:00
kosinkadink1@gmail.com 1268d04295 Consolidated add_hook_patches_as_diffs into add_hook_patches func, fixed fp8 support for model-as-lora feature 2024-09-14 14:09:43 +09:00
comfyanonymous cf80d28689 Support loading controlnets with different input. 2024-09-13 09:54:37 -04:00
kosinkadink1@gmail.com 9ae758175d Added current_patcher property to BaseModel 2024-09-13 21:35:35 +09:00
kosinkadink1@gmail.com 3cbd40ada3 Initial changes to calc_cond_batch to eventually support hook_patches 2024-09-13 18:31:52 +09:00
kosinkadink1@gmail.com 069ec7a64b Added hook_patches to ModelPatcher for weights (model) 2024-09-13 17:20:22 +09:00
Robin Huang b962db9952
Add cli arg to override user directory (#4856)
* Override user directory.

* Use overridden user directory.

* Remove prints.

* Remove references to global user_files.

* Remove unused replace_folder function.

* Remove newline.

* Remove global during get_user_directory.

* Add validation.
2024-09-12 08:10:27 -04:00
comfyanonymous 9d720187f1 types -> comfy_types to fix import issue. 2024-09-12 03:57:46 -04:00
comfyanonymous 9f4daca9d9 Doesn't really make sense for cfg_pp sampler to call regular one. 2024-09-11 02:51:36 -04:00
yoinked b5d0f2a908
Add CFG++ to DPM++ 2S Ancestral (#3871)
* Update sampling.py

* Update samplers.py

* my bad

* "fix" the sampler

* Update samplers.py

* i named it wrong

* minor sampling improvements

mainly using a dynamic rho value (hey this sounds a lot like smea!!!)

* revert rho change

rho? r? its just 1/2
2024-09-11 02:49:44 -04:00
comfyanonymous 9c5fca75f4 Fix lora issue. 2024-09-08 10:10:47 -04:00
comfyanonymous 32a60a7bac Support onetrainer text encoder Flux lora. 2024-09-08 09:31:41 -04:00
Jim Winkens bb52934ba4
Fix import issue (#4815) 2024-09-07 05:28:32 -04:00
comfyanonymous ea77750759 Support a generic Comfy format for text encoder loras.
This is a format with keys like:
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.lora_up.weight

Instead of waiting for me to add support for specific lora formats, you can
convert your text encoder loras to this format.

If you want to see an example save a text encoder lora with the SaveLora
node with the commit right after this one.
2024-09-07 02:20:39 -04:00
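
For concreteness, here is what one lora pair looks like in that key layout, built from the key the commit quotes; the lora_down counterpart, rank, and hidden size (768 for clip_l) are assumptions for illustration:

```python
import torch

prefix = ("text_encoders.clip_l.transformer.text_model"
          ".encoder.layers.9.self_attn.v_proj")
lora_sd = {
    f"{prefix}.lora_up.weight": torch.zeros(768, 4),    # (hidden, rank)
    f"{prefix}.lora_down.weight": torch.zeros(4, 768),  # (rank, hidden)
}
```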
comfyanonymous c27ebeb1c2 Fix onnx export not working on flux. 2024-09-06 03:21:52 -04:00
comfyanonymous 5cbaa9e07c Mistoline flux controlnet support. 2024-09-05 00:05:17 -04:00
comfyanonymous c7427375ee Prioritize freeing partially offloaded models first. 2024-09-04 19:47:32 -04:00
Jedrzej Kosinski f04229b84d
Add emb_patch support to UNetModel forward (#4779) 2024-09-04 14:35:15 -04:00
Silver f067ad15d1
Make live preview size a configurable launch argument (#4649)
* Make live preview size a configurable launch argument

* Remove import from testing phase

* Update cli_args.py
2024-09-03 19:16:38 -04:00
comfyanonymous 483004dd1d Support newer glora format. 2024-09-03 17:02:19 -04:00
comfyanonymous 00a5d08103 Lower fp8 lora memory usage. 2024-09-03 01:25:05 -04:00
comfyanonymous d043997d30 Flux onetrainer lora. 2024-09-02 08:22:15 -04:00
comfyanonymous 8d31a6632f Speed up inference on nvidia 10 series on Linux. 2024-09-01 17:29:31 -04:00
comfyanonymous b643eae08b Make minimum_inference_memory() depend on --reserve-vram 2024-09-01 01:18:34 -04:00
comfyanonymous 935ae153e1 Cleanup. 2024-08-30 12:53:59 -04:00
Chenlei Hu e91662e784
Get logs endpoint & system_stats additions (#4690)
* Add route for getting output logs

* Include ComfyUI version

* Move to own function

* Changed to memory logger

* Unify logger setup logic

* Fix get version git fallback

---------

Co-authored-by: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
2024-08-30 12:46:37 -04:00
comfyanonymous 63fafaef45 Fix potential issue with hydit controlnets. 2024-08-30 04:58:41 -04:00
comfyanonymous 6eb5d64522 Fix glora lowvram issue. 2024-08-29 19:07:23 -04:00
comfyanonymous 10a79e9898 Implement model part of flux union controlnet. 2024-08-29 18:41:22 -04:00
comfyanonymous ea3f39bd69 InstantX depth flux controlnet. 2024-08-29 02:14:19 -04:00
comfyanonymous b33cd61070 InstantX canny controlnet. 2024-08-28 19:02:50 -04:00
comfyanonymous d31e226650 Unify RMSNorm code. 2024-08-28 16:56:38 -04:00
comfyanonymous 38c22e631a Fix case where model was not properly unloaded in merging workflows. 2024-08-27 19:03:51 -04:00
Chenlei Hu 6bbdcd28ae
Support weight padding on diff weight patch (#4576) 2024-08-27 13:55:37 -04:00
comfyanonymous ab130001a8 Do RMSNorm in native type. 2024-08-27 02:41:56 -04:00
comfyanonymous 2ca8f6e23d Make the stochastic fp8 rounding reproducible. 2024-08-26 15:12:06 -04:00
comfyanonymous 7985ff88b9 Use less memory in float8 lora patching by doing calculations in fp16. 2024-08-26 14:45:58 -04:00
comfyanonymous c6812947e9 Fix potential memory leak. 2024-08-26 02:07:32 -04:00
comfyanonymous 9230f65823 Fix some controlnets OOMing when loading. 2024-08-25 05:54:29 -04:00
comfyanonymous 8ae23d8e80 Fix onnx export. 2024-08-23 17:52:47 -04:00
comfyanonymous 7df42b9a23 Fix dora. 2024-08-23 04:58:59 -04:00
comfyanonymous 5d8bbb7281 Cleanup. 2024-08-23 04:06:27 -04:00
comfyanonymous 2c1d2375d6 Fix. 2024-08-23 04:04:55 -04:00
Simon Lui 64ccb3c7e3
Rework IPEX check for future inclusion of XPU into Pytorch upstream and do a bit more optimization of ipex.optimize(). (#4562) 2024-08-23 03:59:57 -04:00
Scorpinaus 9465b23432
Added SD15_Inpaint_Diffusers model support for unet_config_from_diffusers_unet function (#4565) 2024-08-23 03:57:08 -04:00
comfyanonymous c0b0da264b Missing imports. 2024-08-22 17:20:51 -04:00
comfyanonymous c26ca27207 Move calculate function to comfy.lora 2024-08-22 17:12:00 -04:00
comfyanonymous 7c6bb84016 Code cleanups. 2024-08-22 17:05:12 -04:00
comfyanonymous c54d3ed5e6 Fix issue with models staying loaded in memory. 2024-08-22 15:58:20 -04:00
comfyanonymous c7ee4b37a1 Try to fix some lora issues. 2024-08-22 15:32:18 -04:00
David 7b70b266d8
Generalize MacOS version check for force-upcast-attention (#4548)
This code automatically forces upcasting attention for MacOS versions 14.5 and 14.6. My computer returns the string "14.6.1" for `platform.mac_ver()[0]`, so this generalizes the comparison to catch more versions.

I am running MacOS Sonoma 14.6.1 (latest version) and was seeing black image generation on previously functional workflows after recent software updates. This PR solved the issue for me.

See comfyanonymous/ComfyUI#3521
2024-08-22 13:24:21 -04:00
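
The generalization boils down to parsing the dotted version into an integer tuple so "14.6.1" compares correctly against "14.5". A sketch of that comparison; the exact bounds are assumptions (and were later widened to 15.0.1, per the commit further up):

```python
import platform

def macos_version():
    # platform.mac_ver()[0] is e.g. "14.6.1" on macOS, "" elsewhere.
    ver = platform.mac_ver()[0]
    return tuple(int(p) for p in ver.split(".")) if ver else ()

def force_upcast_attention():
    # Tuple comparison handles patch releases: (14, 6, 1) >= (14, 5).
    return (14, 5) <= macos_version() <= (15, 0, 1)
```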
comfyanonymous 8f60d093ba Fix issue. 2024-08-22 10:38:24 -04:00
comfyanonymous 843a7ff70c fp16 is actually faster than fp32 on a GTX 1080. 2024-08-21 23:23:50 -04:00
comfyanonymous a60620dcea Fix slow performance on 10 series Nvidia GPUs. 2024-08-21 16:39:02 -04:00
comfyanonymous 015f73dc49 Try a different type of flux fp16 fix. 2024-08-21 16:17:15 -04:00
comfyanonymous 904bf58e7d Make --fast work on pytorch nightly. 2024-08-21 14:01:41 -04:00
Svein Ove Aas 5f50263088
Replace use of .view with .reshape (#4522)
When generating images with fp8_e4m3fn Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view.

As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere.

Note that --fast is only marginally faster than the default.
2024-08-21 11:21:48 -04:00
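
The failure is easy to reproduce outside ComfyUI: .view needs a stride-compatible (typically contiguous) tensor, while .reshape silently copies when it must. A minimal standalone demonstration, not the Flux code itself:

```python
import torch

x = torch.arange(12).reshape(3, 4).t()  # transposing makes x non-contiguous
try:
    x.view(12)                          # incompatible strides -> RuntimeError
except RuntimeError as e:
    print("view failed:", e)
print(x.reshape(12).shape)              # reshape copies instead: torch.Size([12])
```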
comfyanonymous 03ec517afb Remove useless line, adjust windows default reserved vram. 2024-08-21 00:47:19 -04:00
comfyanonymous 510f3438c1 Speed up fp8 matrix mult by using better code. 2024-08-20 22:53:26 -04:00
comfyanonymous ea63b1c092 Simpletrainer lycoris format. 2024-08-20 12:05:13 -04:00
comfyanonymous 9953f22fce Add --fast argument to enable experimental optimizations.
Optimizations that might break things/lower quality will be put behind
this flag first and might be enabled by default in the future.

Currently the only optimization is float8_e4m3fn matrix multiplication on
4000/ADA series Nvidia cards or later. If you have one of these cards you
will see a speed boost when using fp8_e4m3fn flux for example.
2024-08-20 11:55:51 -04:00
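
A minimal sketch of what the fp8 fast path amounts to, using PyTorch's private torch._scaled_mm (signature as of torch 2.4; not a stable API) with unit scales on an Ada-or-newer GPU. This illustrates the operation, not ComfyUI's actual ops code:

```python
import torch

a = torch.randn(16, 64, device="cuda", dtype=torch.bfloat16)   # activations
w = torch.randn(32, 64, device="cuda", dtype=torch.bfloat16)   # (out, in) weight

a8 = a.to(torch.float8_e4m3fn)
w8 = w.to(torch.float8_e4m3fn)
one = torch.ones((), device="cuda")  # per-tensor scales; unit for simplicity

# mat2 must be column-major, hence the transpose of the row-major weight.
y = torch._scaled_mm(a8, w8.t(), scale_a=one, scale_b=one,
                     out_dtype=torch.bfloat16)
print(y.shape)  # torch.Size([16, 32])
```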
comfyanonymous d1a6bd6845 Support loading long clipl model with the CLIP loader node. 2024-08-20 10:46:36 -04:00
comfyanonymous 83dbac28eb Properly set whether clip text has a pooled projection instead of using a hack. 2024-08-20 10:46:36 -04:00
comfyanonymous 538cb068bc Make cast_to a nop if weight is already good. 2024-08-20 10:46:36 -04:00
comfyanonymous 1b3eee672c Fix potential issue with multi devices. 2024-08-20 10:46:36 -04:00
comfyanonymous 9eee470244 New load_text_encoder_state_dicts function.
Now you can load text encoders straight from a list of state dicts.
2024-08-19 17:36:35 -04:00
comfyanonymous 045377ea89 Add a --reserve-vram argument if you don't want comfy to use all of your vram.
--reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free.

This can also be useful if workflows are failing because of OOM errors but
in that case please report it if --reserve-vram improves your situation.
2024-08-19 17:16:18 -04:00
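
Conceptually the flag just subtracts a fixed reserve from what the memory estimator treats as available. A hypothetical illustration of that arithmetic, not the repo's model_management code:

```python
import torch

def usable_vram_bytes(reserve_vram_gb=1.0):
    # --reserve-vram 1.0 keeps ~1GB of device memory off limits.
    free, _total = torch.cuda.mem_get_info()
    return max(0, free - int(reserve_vram_gb * 1024**3))
```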
comfyanonymous 4d341b78e8 Bug fixes. 2024-08-19 16:28:55 -04:00
comfyanonymous 6138f92084 Use better dtype for the lowvram lora system. 2024-08-19 15:35:25 -04:00
comfyanonymous be0726c1ed Remove duplication. 2024-08-19 15:26:50 -04:00
comfyanonymous 4506ddc86a Better subnormal fp8 stochastic rounding. Thanks Ashen. 2024-08-19 13:38:03 -04:00
comfyanonymous 20ace7c853 Code cleanup. 2024-08-19 12:48:59 -04:00
comfyanonymous 22ec02afc0 Handle subnormal numbers in float8 rounding. 2024-08-19 05:51:08 -04:00
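
The rounding commits above combine three concerns: stochastic rounding (round up or down with probability proportional to distance), reproducibility (a seeded RNG), and subnormals (values below the minimum normal exponent share one fixed ULP). A minimal sketch that approximates all three with half-ULP jitter before a nearest-value cast; ComfyUI's actual implementation works at the bit level and this is not it:

```python
import torch

def stochastic_round_to_fp8(x, seed=0):
    # Jitter each value by up to half a ULP, then let the cast round to
    # nearest. A seeded generator makes the result reproducible; clamping
    # the exponent at e4m3's minimum (-6) gives subnormals the fixed
    # smallest ULP of 2**-9.
    MANT, EXP_MIN = 3, -6
    g = torch.Generator(device=x.device).manual_seed(seed)
    exp = torch.floor(torch.log2(x.abs().clamp_min(2.0 ** EXP_MIN)))
    ulp = torch.exp2(exp - MANT)
    noise = (torch.rand(x.shape, generator=g, device=x.device) - 0.5) * ulp
    lim = torch.finfo(torch.float8_e4m3fn).max
    return (x + noise).clamp(-lim, lim).to(torch.float8_e4m3fn)
```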