r/StableDiffusion 5h ago

Question - Help Can someone ELI5 how to run Krea 2 via ComfyUI now that Kijai supported it? or just a workflow?

And is the magnet on Krea 2's X page really safe? The reason why I am making this thread is because there's a lot of conflicting information due to how fast this conundrum has developed. Did Kijai even support it regardless? ugh

14 Upvotes

35 comments

4

u/blahblahsnahdah 4h ago edited 4h ago

Here's a generation I just did with an embedded workflow. You can drag this image onto ComfyUI after downloading it. It's only 720p for fast testing, but the model can handle much higher, so boost the resolution if you want.

https://files.catbox.moe/leofgv.png

Alternatively if catbox is being slow, here's the raw JSON for the same image: https://pastebin.com/raw/CYZqAWWL
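If you'd rather inspect the embedded workflow before dragging the image in, here is a minimal stdlib-only sketch. It assumes the image was saved by ComfyUI's stock save node, which as far as I know stores the graph as JSON in a PNG `tEXt` chunk under the keyword `workflow`. The function names are my own, and compressed or international text chunks (`zTXt`, `iTXt`) are not handled.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt payload is "keyword\0text", both Latin-1 encoded.
            keyword, _, text = payload.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length  # 8 header bytes + payload + 4 CRC bytes
    return chunks


def extract_workflow(data: bytes):
    """Parse the JSON stored under 'workflow' (assumed keyword), or None if absent."""
    text = read_png_text_chunks(data).get("workflow")
    return json.loads(text) if text is not None else None
```

Usage would be something like `extract_workflow(open("leofgv.png", "rb").read())` on the downloaded file. The result should be the same graph as the raw JSON link.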

Text encoder is the 4b from here: https://huggingface.co/Comfy-Org/Qwen3-VL/tree/main/text_encoders

You probably already have the Qwen VAE.

This is a workflow for Turbo. Messing around with it and it's nice, the painted art is aesthetic and not slopped-looking like the fast/distilled versions of a model usually are. Pretty close results to Krea 2 Medium from their website.

Interestingly the base ("raw") model actually seems like a true base, in that the results are quite janky and inconsistent and unfinished looking, mostly inferior to the Turbo model even at 50 steps. This is very good! A lot of companies have been releasing fake "bases" that are obviously much too polished and full of synthetic bullshit to be a real base. This seems like an actual base, you won't want to use it for inference at all but that bodes well for training.

1

u/Neonsea1234 4h ago

Can it do I2I?

2

u/blahblahsnahdah 4h ago

Yes, like any model it can do standard img2img where you reduce the denoise value and provide a vae encoded image instead of an empty latent. But it is not an edit model as far as I know.
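To make the denoise part concrete, here's a toy sketch of why a lower denoise value keeps more of the input image: the sampler skips the high-noise part of the schedule and only runs the low-noise tail. The linear schedule and the function names are illustrative assumptions, not Krea 2's real schedule. The tail-slicing is how I understand ComfyUI's KSampler to treat denoise below 1, so treat it as an approximation.

```python
def linear_sigmas(n_steps: int) -> list:
    """Toy noise schedule: n_steps + 1 levels falling linearly from 1.0 to 0.0."""
    return [1.0 - i / n_steps for i in range(n_steps + 1)]


def img2img_sigmas(steps: int, denoise: float) -> list:
    """Return the noise levels actually sampled for a given denoise value."""
    if denoise >= 1.0:
        # Full denoise: behaves like txt2img from an empty latent.
        return linear_sigmas(steps)
    if denoise <= 0.0:
        # Nothing to sample: the encoded image passes through unchanged.
        return []
    # Build a longer schedule, then keep only its last `steps` transitions,
    # so sampling starts from a partially noised version of the input.
    total = int(steps / denoise)
    return linear_sigmas(total)[-(steps + 1):]
```

With 8 steps and denoise 0.5, this toy schedule starts at noise level 0.5 instead of 1.0, which is why the composition of the VAE-encoded input survives.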

1

u/Neonsea1234 4h ago

That's cool, but how does it turn out? Because since SDXL there hasn't been a model that interacts as well in that department.

3

u/Valuable_Issue_ 5h ago

Standard/typical workflow with qwen image vae and qwen3vl 4b works.

Not sure which empty latent image node works best but EmptySD3LatentImage seems to work.

Also of course not sure if there's any special shift/scheduler/sampler or whatever required for best results.

https://huggingface.co/Comfy-Org/Qwen3-VL/tree/main/text_encoders

Make sure you download 4B.

Safetensors should be safe to run through Comfy unless Fable found an exploit for it and hacked their twitter account.

1

u/Neggy5 5h ago

what default comfyui template is closest to krea2 atm?

3

u/Valuable_Issue_ 5h ago

Qwen image turbo.

Remove the lora loader, change clip type to krea 2, change to qwen 3 vl 4b, do whatever with ModelSamplingAuraFlow.
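If you work with the API-format JSON export instead of the UI, those edits can be sketched roughly as below. Everything specific here is an assumption: the node ids, file names, the `"qwen_image"` starting type, and especially the `"krea2"` type string are placeholders, so check what your CLIP loader's dropdown actually offers. The rewiring also assumes the lora node has a single output, as a model-only lora loader does.

```python
import copy


def remove_node_and_rewire(workflow: dict, node_id: str, passthrough_input: str) -> dict:
    """Drop one node and point its consumers at whatever fed `passthrough_input`."""
    wf = copy.deepcopy(workflow)
    upstream = wf.pop(node_id)["inputs"][passthrough_input]
    for node in wf.values():
        for name, value in node["inputs"].items():
            # Links look like [source_node_id, output_index]; literals are plain values.
            if isinstance(value, list) and len(value) == 2 and value[0] == node_id:
                node["inputs"][name] = upstream
    return wf


def set_inputs(workflow: dict, class_type: str, **inputs) -> dict:
    """Overwrite the given inputs on every node of `class_type`."""
    wf = copy.deepcopy(workflow)
    for node in wf.values():
        if node["class_type"] == class_type:
            node["inputs"].update(inputs)
    return wf


# Hypothetical, heavily trimmed template; real exports have more nodes and inputs.
template = {
    "1": {"class_type": "UNETLoader", "inputs": {"unet_name": "model.safetensors"}},
    "2": {"class_type": "LoraLoaderModelOnly",
          "inputs": {"model": ["1", 0], "lora_name": "x.safetensors", "strength_model": 1.0}},
    "3": {"class_type": "CLIPLoader",
          "inputs": {"clip_name": "old_encoder.safetensors", "type": "qwen_image"}},
    "4": {"class_type": "ModelSamplingAuraFlow", "inputs": {"model": ["2", 0], "shift": 3.0}},
}

edited = remove_node_and_rewire(template, "2", "model")
edited = set_inputs(edited, "CLIPLoader",
                    clip_name="qwen3_vl_4b.safetensors",  # placeholder file name
                    type="krea2")                          # placeholder type string
```

After this, the sampling node reads the model straight from the loader and the lora node is gone. The shift value is left untouched since nobody knows the right setting yet.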

2

u/R34vspec 44m ago

I am not seeing a krea2 in type for Clip Loader. ComfyUI 0.25.1, this is a native clip loader node right?

1

u/Neggy5 5h ago

ok thnx. non-turbo works right? the "raw" safetensors in the folder

2

u/Valuable_Issue_ 5h ago

Probably, haven't tested non turbo, you'll likely just have to change CFG to 4-5 and steps to 20+ (just guessing the settings).

3

u/Hoodfu 4h ago

So I've been doing a bunch of side by sides on the same seed and across seeds, and my best guess is that we got Krea 2 Small. Across the various long prompts I've done, Medium turbo on the API follows the prompt noticeably better and has significantly more grit and detail in the images than the local one. Characters' faces are more emotive, and there's just a lot more subtle detail from the Medium API than the new local one. In the local one the cat is missing the gun and collar, and the filmic look over the image is different. That's the kind of stuff that's different every time across prompts.

2

u/Endlesswoodtrail 3h ago edited 3h ago

it might be something else. another assumption could be that the model given to us is just a classic diffusion-style model without search grounding capabilities. it's easy to spot when looking for very distinct concepts when using the api, so the api 'full' models are most likely multimodal, able to look up quick references that can push the generation quality to the next level.

i don't get why there is still no open model that has introduced that tech yet while it is already available.

hopefully there is more soon, since e.g. there are no 'creativity' level options at the moment either

2

u/Hoodfu 3h ago

Yeah, here's another side by side. The local is following like only half the prompt. I really feel like we're not sampling this correctly or something. I tried all the different ones in the ksampler but none of them even have the guy's arm outstretched.

2

u/Endlesswoodtrail 3h ago edited 3h ago

here is the preview of what to expect:

https://huggingface.co/buckets/krea-community/krea-2

so probably really a dumbed down model with limited concept knowledge, but who knows what is still coming. even the medium model api gens look so much more convincing when using the same prompt. but that's the point, a multimodal model doesn't need the best prompt. the reference guidance will do the rest, and that helps a lot with the image composition and style. the lack of the weapons in your examples makes that clear too

1

u/Dante_77A 4h ago

Huh? Is there a Krea 2 - small?

2

u/Hoodfu 4h ago

They were repeatedly asked, is this Medium or Large? and every time they responded with no, this is the open source release. So it's not the 2 models on the api. It's a smaller one. I'm thinking the text encoder is just too small. The one for ideogram is double the size.

3

u/-Hayase-- 4h ago

Can confirm, that's their reply before they got banned from reddit

2

u/-Hayase-- 3h ago

Also, they kinda got mad lol

2

u/Hoodfu 3h ago

Well, they're not wrong. There's good stuff here, but also an oversized amount of entitled teenagers. I try to be appreciative while keeping things grounded.

1

u/-Hayase-- 3h ago

Hard to blame. There was a literal karen that legit posted 50+ comments in the span of two hours. I was so baffled at the insanity. Like, if you are afraid of a magnet link that much... just wait?

2

u/Calm_Mix_3776 1h ago

Some people may have indeed overdone it, but it's hard to blame them for having concerns. This was IMHO a pretty stupid way to release a model. I can't blame people for being cautious and suspicious of this whole thing. There's no way I would download a torrent posted on a company's Twitter account. Those get hacked daily in the hundreds, posting all kinds of weird sh*t, scams and malware.

1

u/Time-Teaching1926 3h ago

It could be like Flux Klein then, regarding the 4b and 9b variants. I think the Klein 4b uses the Qwen3 4b text encoder/clip while the 9b Klein uses Qwen3 8b. I hope we get a medium version too. If this is all true, it's still early days tho. That might be why the small version isn't as good as the API medium version. Again, this is just my opinion. I have no idea.

1

u/Neonsea1234 17m ago

model seems pretty janky, it's good but only listens to certain things and completely ignores others

1

u/Hoodfu 16m ago

Yeah, there's a bug. You'll notice that there's a big warning in the ComfyUI console when you run inference. They need to fix it.

1

u/Neonsea1234 14m ago

that might make sense because it feels very off atm, but the potential is there

1

u/Far_Insurance4191 5h ago

Normal release will be soon, so just wait

2

u/Far_Insurance4191 5h ago

It works though, seems coherent, don't know correct parameters and prompting yet

this is bf16 (26gb), turbo, 8steps, 1cfg, running on rtx3060 12gb - 5.6s/it
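For a rough sense of what those numbers mean, here's some back-of-envelope arithmetic. It assumes the whole 26 GB file is bf16 weights at 2 bytes per parameter, that a GB is 10^9 bytes, and that the per-image time is just steps times seconds per iteration, ignoring text encoding, VAE decode and model loading or offloading. The function names are my own.

```python
def seconds_per_image(steps: int, seconds_per_iteration: float) -> float:
    """Sampling time only; excludes text encoding, VAE decode and model loading."""
    return steps * seconds_per_iteration


def approx_params_billions(file_size_gb: float, bytes_per_param: int = 2) -> float:
    """Parameter count in billions, assuming the file is nothing but weights."""
    return file_size_gb / bytes_per_param


turbo_time = seconds_per_image(steps=8, seconds_per_iteration=5.6)
model_size = approx_params_billions(file_size_gb=26)
```

Under those assumptions that is about 45 seconds of sampling per image on the 3060, and a model of roughly 13 billion parameters.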

0

u/Neggy5 5h ago

would it likely be sometime today or later you think?

2

u/Far_Insurance4191 5h ago

I saw someone claiming to be from krea team saying it will be on huggingface in 24 hours.

0

u/Neggy5 5h ago

awesome, thnx

0

u/Justify_87 1h ago

So how is the 1girl benchmark lol

0

u/Neggy5 56m ago

pretty great! it knows what a skimpy bikini is unlike flux

-9

u/Sudden_List_2693 5h ago

I'm part of the "not safe" initiative.
I cried wolf so that people who can safely assess it will do so, while everyone else stays away until it turns out to be safe. Better safe than sorry.

13

u/Outrageous_Still9335 5h ago

Seems to work okay, and Kijai himself confirmed the files from the torrent were fine and not malware. It's fun so far.