ComfyUI

Commit Graph

Author	SHA1	Message	Date
comfyanonymous	5d8bbb7281	Cleanup.	2024-08-23 04:06:27 -04:00
comfyanonymous	2c1d2375d6	Fix.	2024-08-23 04:04:55 -04:00
Simon Lui	64ccb3c7e3	Rework IPEX check for future inclusion of XPU into Pytorch upstream and do a bit more optimization of ipex.optimize(). (#4562)	2024-08-23 03:59:57 -04:00
Scorpinaus	9465b23432	Added SD15_Inpaint_Diffusers model support for unet_config_from_diffusers_unet function (#4565)	2024-08-23 03:57:08 -04:00
comfyanonymous	c0b0da264b	Missing imports.	2024-08-22 17:20:51 -04:00
comfyanonymous	c26ca27207	Move calculate function to comfy.lora	2024-08-22 17:12:00 -04:00
comfyanonymous	7c6bb84016	Code cleanups.	2024-08-22 17:05:12 -04:00
comfyanonymous	c54d3ed5e6	Fix issue with models staying loaded in memory.	2024-08-22 15:58:20 -04:00
comfyanonymous	c7ee4b37a1	Try to fix some lora issues.	2024-08-22 15:32:18 -04:00
David	7b70b266d8	Generalize MacOS version check for force-upcast-attention (#4548) This code automatically forces upcasting attention for MacOS versions 14.5 and 14.6. My computer returns the string "14.6.1" for `platform.mac_ver()[0]`, so this generalizes the comparison to catch more versions. I am running MacOS Sonoma 14.6.1 (latest version) and was seeing black image generation on previously functional workflows after recent software updates. This PR solved the issue for me. See comfyanonymous/ComfyUI#3521	2024-08-22 13:24:21 -04:00
comfyanonymous	8f60d093ba	Fix issue.	2024-08-22 10:38:24 -04:00
comfyanonymous	843a7ff70c	fp16 is actually faster than fp32 on a GTX 1080.	2024-08-21 23:23:50 -04:00
comfyanonymous	a60620dcea	Fix slow performance on 10 series Nvidia GPUs.	2024-08-21 16:39:02 -04:00
comfyanonymous	015f73dc49	Try a different type of flux fp16 fix.	2024-08-21 16:17:15 -04:00
comfyanonymous	904bf58e7d	Make --fast work on pytorch nightly.	2024-08-21 14:01:41 -04:00
Svein Ove Aas	5f50263088	Replace use of .view with .reshape (#4522) When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view. As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere. Note that --fast is only marginally faster than the default.	2024-08-21 11:21:48 -04:00
comfyanonymous	03ec517afb	Remove useless line, adjust windows default reserved vram.	2024-08-21 00:47:19 -04:00
comfyanonymous	510f3438c1	Speed up fp8 matrix mult by using better code.	2024-08-20 22:53:26 -04:00
comfyanonymous	ea63b1c092	Simpletrainer lycoris format.	2024-08-20 12:05:13 -04:00
comfyanonymous	9953f22fce	Add --fast argument to enable experimental optimizations. Optimizations that might break things/lower quality will be put behind this flag first and might be enabled by default in the future. Currently the only optimization is float8_e4m3fn matrix multiplication on 4000/ADA series Nvidia cards or later. If you have one of these cards you will see a speed boost when using fp8_e4m3fn flux for example.	2024-08-20 11:55:51 -04:00
comfyanonymous	d1a6bd6845	Support loading long clipl model with the CLIP loader node.	2024-08-20 10:46:36 -04:00
comfyanonymous	83dbac28eb	Properly set if clip text pooled projection instead of using hack.	2024-08-20 10:46:36 -04:00
comfyanonymous	538cb068bc	Make cast_to a nop if weight is already good.	2024-08-20 10:46:36 -04:00
comfyanonymous	1b3eee672c	Fix potential issue with multi devices.	2024-08-20 10:46:36 -04:00
comfyanonymous	9eee470244	New load_text_encoder_state_dicts function. Now you can load text encoders straight from a list of state dicts.	2024-08-19 17:36:35 -04:00
comfyanonymous	045377ea89	Add a --reserve-vram argument if you don't want comfy to use all of it. --reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free. This can also be useful if workflows are failing because of OOM errors but in that case please report it if --reserve-vram improves your situation.	2024-08-19 17:16:18 -04:00
comfyanonymous	4d341b78e8	Bug fixes.	2024-08-19 16:28:55 -04:00
comfyanonymous	6138f92084	Use better dtype for the lowvram lora system.	2024-08-19 15:35:25 -04:00
comfyanonymous	be0726c1ed	Remove duplication.	2024-08-19 15:26:50 -04:00
comfyanonymous	4506ddc86a	Better subnormal fp8 stochastic rounding. Thanks Ashen.	2024-08-19 13:38:03 -04:00
comfyanonymous	20ace7c853	Code cleanup.	2024-08-19 12:48:59 -04:00
comfyanonymous	22ec02afc0	Handle subnormal numbers in float8 rounding.	2024-08-19 05:51:08 -04:00
comfyanonymous	39f114c44b	Less broken non blocking?	2024-08-18 16:53:17 -04:00
comfyanonymous	6730f3e1a3	Disable non blocking. It fixed some perf issues but caused other issues that need to be debugged.	2024-08-18 14:38:09 -04:00
comfyanonymous	73332160c8	Enable non blocking transfers in lowvram mode.	2024-08-18 10:29:33 -04:00
comfyanonymous	2622c55aff	Automatically use RF variant of dpmpp_2s_ancestral if RF model.	2024-08-18 00:47:25 -04:00
Ashen	1beb348ee2	dpmpp_2s_ancestral_RF for rectified flow (Flux, SD3 and Auraflow).	2024-08-18 00:33:30 -04:00
comfyanonymous	d31df04c8a	Indentation.	2024-08-17 23:00:44 -04:00
Xrvk	e68763f40c	Add Flux model support for InstantX style controlnet residuals (#4444) * Add Flux model support for InstantX style controlnet residuals * Refactor Flux controlnet residual step to a separate method * Rollback minor change * New format for applying controlnet residuals: input->double_blocks, output->single_blocks * Adjust XLabs Flux controlnet to fit new syntax of applying Flux controlnet residuals * Remove unnecessary import and minor style change	2024-08-17 22:58:23 -04:00
comfyanonymous	4f7a3cb6fb	unet -> diffusion_models.	2024-08-17 21:31:04 -04:00
comfyanonymous	bb222ceddb	Fix loras having a weak effect when applied on fp8.	2024-08-17 15:20:17 -04:00
comfyanonymous	fca42836f2	Add model_options for text encoder.	2024-08-17 11:17:20 -04:00
comfyanonymous	cd5017c1c9	calculate_weight function to use a different dtype.	2024-08-17 01:06:08 -04:00
comfyanonymous	83f343146a	Fix potential lowvram issue.	2024-08-16 17:12:42 -04:00
Matthew Turnshek	1770fc77ed	Implement support for taef1 latent previews (#4409) * add taef1 handling to several places * remove guess_latent_channels and add latent_channels info directly to flux model * remove TODO * fix numbers	2024-08-16 12:53:13 -04:00
comfyanonymous	5960f946a9	Move a few files from comfy -> comfy_execution. Python code in the comfy folder should not import things from outside it.	2024-08-15 11:21:14 -04:00
guill	5cfe38f41c	Execution Model Inversion (#2666) * Execution Model Inversion This PR inverts the execution model -- from recursively calling nodes to using a topological sort of the nodes. This change allows for modification of the node graph during execution. This allows for two major advantages: 1. The implementation of lazy evaluation in nodes. For example, if a "Mix Images" node has a mix factor of exactly 0.0, the second image input doesn't even need to be evaluated (and vice versa if the mix factor is 1.0). 2. Dynamic expansion of nodes. This allows for the creation of dynamic "node groups". Specifically, custom nodes can return subgraphs that replace the original node in the graph. This is an incredibly powerful concept. Using this functionality, it was easy to implement: a. Components (a.k.a. node groups) b. Flow control (i.e. while loops) via tail recursion c. All-in-one nodes that replicate the WebUI functionality d. and more All of those were able to be implemented entirely via custom nodes, so those features are not a part of this PR. (There are some front-end changes that should occur before that functionality is made widely available, particularly around variant sockets.) The custom nodes associated with this PR can be found at: https://github.com/BadCafeCode/execution-inversion-demo-comfyui Note that some of them require that variant socket types ("") be enabled. Allow `input_info` to be of type `None` * Handle errors (like OOM) more gracefully * Add a command-line argument to enable variants This allows the use of nodes that have sockets of type '' without applying a patch to the code. Fix an overly aggressive assertion. This could happen when attempting to evaluate `IS_CHANGED` for a node during the creation of the cache (in order to create the cache key). * Fix Pyright warnings * Add execution model unit tests * Fix issue with unused literals Behavior should now match the master branch with regard to undeclared inputs. 
Undeclared inputs that are socket connections will be used while undeclared inputs that are literals will be ignored. * Make custom VALIDATE_INPUTS skip normal validation Additionally, if `VALIDATE_INPUTS` takes an argument named `input_types`, that variable will be a dictionary of the socket type of all incoming connections. If that argument exists, normal socket type validation will not occur. This removes the last hurdle for enabling variant types entirely from custom nodes, so I've removed that command-line option. I've added appropriate unit tests for these changes. * Fix example in unit test This wouldn't have caused any issues in the unit test, but it would have bugged the UI if someone copy+pasted it into their own node pack. * Use fstrings instead of '%' formatting syntax * Use custom exception types. * Display an error for dependency cycles Previously, dependency cycles that were created during node expansion would cause the application to quit (due to an uncaught exception). Now, we'll throw a proper error to the UI. We also make an attempt to 'blame' the most relevant node in the UI. * Add docs on when ExecutionBlocker should be used * Remove unused functionality * Rename ExecutionResult.SLEEPING to PENDING * Remove superfluous function parameter * Pass None for uneval inputs instead of default This applies to `VALIDATE_INPUTS`, `check_lazy_status`, and lazy values in evaluation functions. * Add a test for mixed node expansion This test ensures that a node that returns a combination of expanded subgraphs and literal values functions correctly. * Raise exception for bad get_node calls. 
* Minor refactor of IsChangedCache.get * Refactor `map_node_over_list` function * Fix ui output for duplicated nodes * Add documentation on `check_lazy_status` * Add file for execution model unit tests * Clean up Javascript code as per review * Improve documentation Converted some comments to docstrings as per review * Add a new unit test for mixed lazy results This test validates that when an output list is fed to a lazy node, the node will properly evaluate previous nodes that are needed by any inputs to the lazy node. No code in the execution model has been changed. The test already passes. * Allow kwargs in VALIDATE_INPUTS functions When kwargs are used, validation is skipped for all inputs as if they had been mentioned explicitly. * List cached nodes in `execution_cached` message This was previously just bugged in this PR.	2024-08-15 11:21:11 -04:00
comfyanonymous	0f9c2a7822	Try to fix SDXL OOM issue on some configurations.	2024-08-14 23:08:54 -04:00
comfyanonymous	f1d6cef71c	Revert "Disable cuda malloc by default." This reverts commit `50bf66e5c4`.	2024-08-14 08:38:07 -04:00
comfyanonymous	33fb282d5c	Fix issue.	2024-08-14 02:51:47 -04:00
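The Execution Model Inversion entry (5cfe38f41c) describes replacing recursive node calls with a topological sort, plus lazy evaluation in which a "Mix Images" node with a factor of exactly 0.0 never evaluates its second input. The sketch below illustrates that idea only; it is not ComfyUI's implementation. The graph layout, the node dictionaries, and the helpers `required_inputs` and `execute` are all hypothetical, and the pruning here is decided up front from literal values rather than dynamically during execution.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def required_inputs(node):
    """Return the upstream node names this (hypothetical) node really needs."""
    if node["type"] == "mix":
        if node["factor"] == 0.0:
            return [node["a"]]          # second input is never read
        if node["factor"] == 1.0:
            return [node["b"]]          # first input is never read
        return [node["a"], node["b"]]
    return []                           # constants have no inputs


def execute(graph, target):
    """Evaluate `target`, visiting only the nodes it needs, in dependency order."""
    # Walk back from the target, recording only the inputs each node requires.
    deps = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name in deps:
            continue
        deps[name] = required_inputs(graph[name])
        stack.extend(deps[name])

    results = {}
    evaluated = []
    # static_order() yields every node after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        node = graph[name]
        if node["type"] == "const":
            results[name] = node["value"]
        else:
            f = node["factor"]
            # A pruned input is absent from `results`; its weight is 0, so 0.0 is a safe stand-in.
            a = results.get(node["a"], 0.0)
            b = results.get(node["b"], 0.0)
            results[name] = (1.0 - f) * a + f * b
        evaluated.append(name)
    return results[target], evaluated


graph = {
    "img_a": {"type": "const", "value": 2.0},
    "img_b": {"type": "const", "value": 10.0},
    "mix": {"type": "mix", "a": "img_a", "b": "img_b", "factor": 0.0},
}
value, evaluated = execute(graph, "mix")
print(value, evaluated)
```

With a factor of 0.0, `img_b` never appears in `evaluated`. Changing the factor to a value strictly between 0 and 1 brings both inputs back into the sort.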

1056 Commits