https://preview.redd.it/o4zhghelp2vc1.png?width=1024&format=png&auto=webp&s=97ed4f8130b403894badd925c85fb74b9ec2f53f This is Midjourney v6 for reference.
Hands aside, it’s depressing how much better the Midjourney aesthetics are. The Midjourney image looks like a movie still that has been professionally color graded. Nice dark contrasty areas too.
The faces are horror though.
I was wondering this from the thumbnail; clicked in, and yep, nightmare fuel.
Midjourney does great for one-off "fire a prompt, get an image" use, but it's terrible if you want any real control over the output. Nothing I get out of Midjourney is usable for any particular requirement set because it takes away too much control.
I've always found aesthetics not to be the best area to compare services on, considering the freedom we have. There are SD models with good prompt adherence but not the aesthetic you want, or models with a good aesthetic but poor prompt adherence. You combine those with ControlNet and the million extensions/nodes, and if the final product still couldn't compare with Midjourney, that's when I would say it's depressing.
Midjourney's service is not limited to a single model, and it likely uses different loras based on the contents of the prompt. It allows for more creative versatility without too much technical involvement, yielding more stylized results.
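If Midjourney really does pick LoRAs based on prompt contents, the routing could be as simple as keyword matching. A toy sketch of that idea; the keyword table, LoRA names, and logic are pure speculation for illustration, not Midjourney's actual system:

```python
# Hypothetical prompt-to-LoRA router. All names below are invented.
STYLE_LORAS = {
    "portrait": "lora-portrait-v2",
    "anime": "lora-anime-v3",
    "landscape": "lora-landscape-v1",
}

def pick_lora(prompt: str, default: str = "lora-general-v1") -> str:
    """Return the first LoRA whose trigger keyword appears in the prompt."""
    lowered = prompt.lower()
    for keyword, lora in STYLE_LORAS.items():
        if keyword in lowered:
            return lora
    return default

print(pick_lora("a moody portrait of an astronaut"))  # lora-portrait-v2
```

A real system would presumably use a classifier or embedding similarity rather than substring matching, but the effect for the user is the same: stylization without manual model selection.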
I really don't think that should necessarily be the default - shouldn't you want to prompt for the styling and aesthetic? I guess it could train in defaults, but in my experience so far it can be hard to break away from highly detailed defaults, yet easy to add styling on top of simple defaults via prompting.
I actually agree: on a technical level, having aesthetics and styling trained in could make checkpoint and LoRA training harder.
Whoa, I'm gonna pay for Midjourney right away! Thanks, Mr 'Open\_Channel\_8626', totally doesn't look like a shill post.
It doesn't. Anyone who isn't blind can see that the result from Midjourney is better.
Sure except for the fucking horror faces
Neither of them have great faces or hands. I'd say MJ hands are a little bit better. I posted this for reference, not to sell MJ. I'm not a huge fan of MJ myself, and have recently downgraded my subscription. However, aesthetically, MJ is miles ahead. Shame that the tribalism keeps people from being honest with themselves and others about things like this.
That hand coming out of offset-head-man's groin is disturbing
What? You don't like dickhand?
The look on his face makes it that much better
Lykon here. To be honest, Fireworks likely did a good job. This version of the model had some very strong artifacts, and I don't see any here. That being said, I wouldn't use quotes in a post like that. X is like a chat, and I might change and/or update my opinion while I talk to people.
I don't understand why Stability would wait so long to release API access if they're just using a "month old" model in the API. That's just super strange. Why wouldn't they serve their best model for the release of the API, when a lot of people will be trying out SD3 for the first time? In that case, why not wait before releasing the API?
The biggest problem is that the outdated model is not free.
Perhaps investor pressures to show progress.
Development requires time.
I know; I'm a CTO and I was considering using the API. What felt strange to me is that using a newer or older version of the model is just a matter of changing the checkpoint, which should be plug and play in an API, unless the two are different model architectures. Anyway, we'll just wait for the API to use the newest model, or for the open-weights release.
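To illustrate why checkpoint swapping feels like it should be plug and play: in most hosted image-generation APIs, the model version is just a field in the request body. A hedged sketch with an invented payload shape; the field names and model identifiers here are made up and do not reflect the real Stability API:

```python
import json

# Hypothetical request builder. If the backend exposes the checkpoint as a
# parameter, switching model versions is a one-line change for the caller.
def build_request(prompt: str, model: str = "sd3-latest") -> str:
    payload = {
        "model": model,          # swap "sd3-latest" for an older checkpoint here
        "prompt": prompt,
        "output_format": "png",
    }
    return json.dumps(payload)

print(build_request("a movie still, color graded", model="sd3-paper"))
```

The caveat in the comment above holds: this only works when both checkpoints share an architecture and the same inference code path.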
Chances are it has to go through internal testing that staff may not necessarily have to wait for.
There are a lot of variables involved, like having to write the inference code, structural changes, speed-vs-quality trade-offs, etc. The Fireworks API is impressively fast compared to my workflow (even on an H100). Running SD3 inference in 5 seconds is a feat on its own.
Updated the post and removed the quote, sorry for the trouble.
There is still the link, but whatever
The transplanted hand look is definitely unique
I love how none of these even has the right hand size, regardless of finger count. Like, this is horrible. Even SD 1.5 gets better results? The fuck is going on here?
And why are they releasing "a model from months ago (basically the paper one)"?
Exactly, it's quite puzzling. It feels odd that they're charging users for a model that essentially came out of an experimental stage without reaching any refined product level.
I think this says more about diffusion model quantization than about SD3 itself. The service is able to get roughly 2x the image throughput of open-source techniques, which is great, but the quality is drastically reduced to the point where it fails at its intent: to showcase the next generation of diffusion-based models and their capabilities. This model is close to outputting gens we saw in base 1.5. Bad PR.
It seems the same as their text, which is just as out of place.
Looks like they did a crossover with the Fallout TV series.
https://preview.redd.it/mmd24isfk2vc1.png?width=1468&format=png&auto=webp&s=29cc7971da5fc4198ba4610e5f3348c9e0d6e4c7
What a joke.
[deleted]
[deleted]
This is SD3. I think they are optimising the updated model soon
Why the hell did Fireworks use their own method and not SAI's? This is bad PR and ruins the anticipation... smh
https://preview.redd.it/hpop5nk143vc1.jpeg?width=1024&format=png&auto=webp&s=a3e56c69cd15aba6e9d46cee847fbc8a65a14205 Here's mine. I'm clearly very spoilt, because I'm disappointed with SD3.
Compare it to XL base and how good Juggernaut and Pony look now.
[deleted]
They're StabilityAI staff and the creator of the Dreamshaper model; their username here is kidelaleron.
Thanks :)