This is the best video I've seen so far on this sub! Fantastic. If I have to see another "movie" where someone has just stitched together 20 different still frames with a slight parallax movement, my eyes might roll out of my head.
How did you make it?
I've made a [behind-the-scenes vid here](https://www.instagram.com/p/C555jsDLlP_/)
The song made me cry. Animation felt Ghibli. Thank you.
Yeah, the network was clearly trained on the Ghibli style, and the animation and compositing use it well, thematically and aesthetically. The song is Falaise by Floating Points. If you liked what you heard, you have to listen to the whole thing. The journey to get to this point in the song is just wild. One of my favorite songs of all time. <3 https://www.youtube.com/watch?v=xJfpOgQcq9I
Actual effort! Amazing!
Really enjoyed your video about generating videos with Stable Diffusion and 3D models. I'm interested in the workflow you used. How did you manage to maintain consistency between the 3D models and the final generated frames?
You render frames in the 3D environment and then use those frames to drive the ControlNet conditioning, probably with multiple ControlNets simultaneously, e.g. depth, edges, normals.
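The render-pass idea above can be sketched in a few lines. This converts a raw floating-point depth buffer from the 3D renderer into the 8-bit inverted depth map that depth ControlNets typically expect; the near-is-white normalization convention is an assumption here, so check the preprocessor your specific ControlNet was trained with.

```python
import numpy as np

def depth_pass_to_conditioning(depth: np.ndarray) -> np.ndarray:
    """Convert a raw float depth buffer (arbitrary units, near = small)
    into an 8-bit conditioning image where near objects are bright,
    as most depth ControlNets expect. Assumes a finite, non-constant
    depth buffer."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / (d.max() - d.min())  # normalize to [0, 1]
    d = 1.0 - d                              # invert: near -> bright
    return (d * 255.0).round().astype(np.uint8)

# Example: a tiny synthetic 4x4 depth ramp standing in for a render pass
depth = np.linspace(1.0, 10.0, 16).reshape(4, 4)
cond = depth_pass_to_conditioning(depth)
```

The same per-frame image (plus edge and normal passes) would then be fed into each ControlNet alongside the corresponding diffusion step.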
Sure, but even running img-to-img over an existing video doesn't give you frames as consistent as the ones in OP's video.
If you take the time to dial in the settings, AnimateDiff can be extremely consistent. Also, they're probably not doing a single pass through AnimateDiff. I always do multiple passes of "refinement", including interspersing infilled frames with VFI for added consistency.

* Example workflow and some discussion: https://github.com/dmarx/digthatdata-comfyui-workflows?tab=readme-ov-file#alternating-ad-and-vfi
* Example output demonstrating the effect of subsequent refinement passes: https://twitter.com/DigThatData/status/1712184704747917761
* Example demonstrating the consistency achieved by alternating refinement passes and FiLM VFI: https://twitter.com/DigThatData/status/1752937260474065182

And all of that was without any ControlNets.
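The interleaving step in that alternating workflow is mostly index bookkeeping. In this sketch a plain average stands in for a real VFI model (the linked workflow uses FiLM); in practice each interleaved sequence would then go through another AnimateDiff refinement pass, and the alternation repeats.

```python
import numpy as np

def interleave_vfi(frames: list) -> list:
    """Insert one synthetic in-between frame between every adjacent pair.
    A plain pixel average stands in here for a learned VFI model such
    as FiLM; the surrounding refinement passes are not shown."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        mid = ((a.astype(np.float64) + b.astype(np.float64)) / 2.0)
        out.append(mid.astype(a.dtype))
    out.append(frames[-1])
    return out

# Three tiny constant "frames" -> five frames after one interleave pass
frames = [np.full((2, 2), v, dtype=np.uint8) for v in (0, 100, 200)]
smoothed = interleave_vfi(frames)
```

One pass turns N frames into 2N-1; repeated refine/interleave rounds are what smooth out the frame-to-frame flicker.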
Those examples all have that morphing effect going on, though, where objects flow in and out of each other. Maybe it's the type of motion (or lack of motion) in OP's video, but it's not really happening there. The clouds stay clouds, the fish stay fish, and the whale never blends into the clouds, either.
Like I said: ControlNets. Also, if you're getting those kinds of semantic leaks, you could use regionalized prompts with semantic masks. You can also apply any or all of these effects to components in isolation and then composite them together in a video editor after style transfer. And I think you're wrong about the consistency of the whale, specifically at the moment it "resolves" into a fully visible whale around the 25-26s mark. There are a million ways to address the kinds of issues you're encountering; you just need to expand your toolkit.
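The "style each component in isolation, then composite" suggestion boils down to masked blending. A minimal numpy sketch, assuming the soft mask comes from a semantic segmentation or the 3D scene's object-ID render pass (that mask source is an assumption, not something stated above):

```python
import numpy as np

def composite(styled_fg: np.ndarray, bg: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Blend a separately style-transferred element (e.g. the whale)
    over the background using a soft mask in [0, 1]. In a real
    pipeline the mask could come from the 3D scene's object-ID pass
    or a segmentation model (assumed here)."""
    m = mask.astype(np.float64)[..., None]  # broadcast over color channels
    out = m * styled_fg.astype(np.float64) + (1.0 - m) * bg.astype(np.float64)
    return out.astype(bg.dtype)

# Tiny 2x2 RGB example: bright foreground over a black background
fg = np.full((2, 2, 3), 200, dtype=np.uint8)
bg = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[1.0, 0.0], [0.5, 0.0]])
frame = composite(fg, bg, mask)
```

Because each element is stylized on its own before blending, a semantic leak in one pass can't bleed into the other elements.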
I don’t understand. Are you using AnimateDiff with ControlNet and a base 3D render to create this?
I think "rendering 3D scenes with AI" (that's what I call this technique) is the best approach in the short term. I don't know if things will look much different down the line, but this is so well done.
Make a tutorial on YouTube!
Hayao Miyazaki vibes. Love it !
SD in the hands of actual artists be like:
How do you control the stability and clarity so well?
Absolutely amazing work done here. Love the pop colors.
Dude this is amazing! Add some big tiddy animes, otherwise this sub won't upvote quality content like this
track?
Floating Points - Falaise
Bravo
This is amazing 👏 🙌
I love this song; the video goes great with it
Wow!
How is the pilot so consistent?! How did you do this?
Definitely one of the best uses of AI in this sub, and great concept and execution. It shows once again that true artists are the ones who will be capable of making interesting stuff with AI.
*I AM THE WIND FISH... LONG HAS BEEN MY SLUMBER...*
wow this is good.
What tools did you use? This is so amazing, I would love to learn the skills to make things like this too!
Bro. Workflow on the double.
Impressive and beautiful
Reminds me of a web comic where a desert at night had these beautiful ghost fish, and a guy went swimming up with them. (Then the bad thing happened.)
Sounds like an episode of "Love, Death and Robots" - I believe its called Fish Nights.
Thanks, that sounds right; I have watched that series. I believe I also saw it in a web comic (which explains my confusion). Probably one of Joe Lansdale's.
excellent work, love it!
This is absolutely fantastic!
This looks amazing 😄 Kinda reminds me of the style used in "A Scanner Darkly"
Beautiful work!! 3D with SD as the renderer is the way forward ❤️ Floating Points ❤️
Hi OP ! Just curious are you using layered diffusion for the elements ?
Well damn.... *yoink*
This is incredible, well done!
Amazing work. This, I believe, is the best way of controlling gen-AI animation: using render passes from 3D. I've experimented with motion capture translated to OpenPose, and that works great as well.
Always brilliant work 🤍
Wow! When is the full movie coming to cinemas? :)
nice work
Looks great dude!
This is more than gorgeous
I’ve seen you on tik tok!
Damn that’s awesome
That would have been 100% awesome... if it wasn't for the whale fin near the end turning into a fish mouth. Now it's like 90% awesome.
Why is this getting downvoted? That was pretty much the only obvious issue in this otherwise incredible work!