

Used stock footage from Pexels. The character is in a studio-lit environment with multiple light sources, and there is a fair amount of specular and highlights on her, which made it more challenging for the AI to generate PBR passes. Used SwitchLight's AI background-removal tool to extract the foreground from the footage. There is some flicker, but a temporal-consistency slider can be tweaked to get a more stable output.

To generate the background, I used Leonardo AI with a few prompts. Nothing fancy there. The generated background was then fed into SwitchLight to produce an HDRI map using the open-source DiffusionLight ML model. The generated HDRI and the extracted foreground were put into SwitchLight's AI Virtual Production tool to relight the foreground. I generated PBR passes and brought them into Nuke to put together a base comp. The base comp was then fed back into Leonardo AI to generate ideas and details, which could be rolled back into Nuke with traditional compositing. Finally, I added some lens effects and final touches.
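The base-comp step above can be sketched in NumPy. This is a minimal, assumed recombination formula (beauty = albedo x diffuse + specular, premultiplied by alpha) standing in for the Nuke node graph; the exact passes SwitchLight exports may differ.

```python
import numpy as np

def combine_pbr(albedo, diffuse, specular, alpha):
    """Rebuild a beauty pass from relit PBR AOVs (assumed formula:
    beauty = albedo * diffuse + specular, premultiplied by alpha)."""
    beauty = albedo * diffuse + specular
    return beauty * alpha[..., None]

# Tiny 2x2 example with flat pass values.
albedo   = np.full((2, 2, 3), 0.5)
diffuse  = np.full((2, 2, 3), 0.8)
specular = np.full((2, 2, 3), 0.1)
alpha    = np.ones((2, 2))
out = combine_pbr(albedo, diffuse, specular, alpha)
```

In a real comp each multiply/plus becomes a Merge node, so individual passes stay gradeable before the final merge over the background.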


Rendered the CG with a single HDRI light source, outputting a PNG image sequence cropped to 512x512 for efficiency. Also rendered a depth pass from Blender, which was normalised in Nuke.
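The depth normalisation is a simple min/max remap to the 0-1 range; here is what the Nuke Grade/Expression setup amounts to, sketched in NumPy:

```python
import numpy as np

def normalize_depth(depth):
    """Remap a raw depth pass to 0-1 so it can be fed to a
    depth ControlNet (the same remap a Nuke Grade node does)."""
    d_min, d_max = depth.min(), depth.max()
    return (depth - d_min) / (d_max - d_min)

depth = np.array([[2.0, 4.0],
                  [6.0, 10.0]])
norm = normalize_depth(depth)
```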
Ran two ControlNets on the source, using Canny edges and the CG depth map, to keep the composition close to the source.
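The Canny preprocessor in ComfyUI wraps OpenCV's `cv2.Canny`; as a rough, dependency-free stand-in, a thresholded gradient magnitude shows the kind of edge map the ControlNet is conditioned on:

```python
import numpy as np

def edge_map(gray, threshold=0.2):
    """Rough stand-in for a Canny preprocessor: finite-difference
    gradient magnitude, thresholded to a binary edge image.
    (The real node uses cv2.Canny with hysteresis thresholds.)"""
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:] = np.diff(gray, axis=1)
    gy[1:, :] = np.diff(gray, axis=0)
    mag = np.hypot(gx, gy)
    return (mag > threshold).astype(np.float32)

# A hard vertical edge: left half black, right half white.
gray = np.zeros((4, 4))
gray[:, 2:] = 1.0
edges = edge_map(gray)
```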

Influenced the model with various IP-Adapters for each version, mostly at a strength of 0.40. Passed the model and latent image through the AnimateDiff node using the mm-Stabilized_high motion module; other models yielded a lot of inconsistency, and this one worked best. Kept the context length at 16 for a smooth blend between frames.
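The context length of 16 works by sliding an overlapping window across the clip and blending results where windows overlap. A sketch of how such windows could tile a 32-frame clip (the overlap value here is an assumption, not the workflow's actual setting):

```python
def context_windows(num_frames, context_length=16, overlap=4):
    """Sketch of AnimateDiff-style sliding contexts: overlapping
    windows of `context_length` frames covering the whole clip.
    Results are blended in the overlap regions for smooth motion."""
    step = context_length - overlap
    windows, start = [], 0
    while start + context_length < num_frames:
        windows.append(list(range(start, start + context_length)))
        start += step
    # Final window is pinned to the end of the clip.
    windows.append(list(range(num_frames - context_length, num_frames)))
    return windows

wins = context_windows(32)
```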

Used SD upscale and latent upscale to add resolution. Latent upscale gave the best results but added a lot of detail not present in the initial generation. Encoded the image sequence to H.264.
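Latent upscale enlarges the latent tensor itself and then re-denoises it, which is exactly where the extra, invented detail comes from. A minimal nearest-neighbour stand-in for the upscale half of that node:

```python
import numpy as np

def latent_upscale(latent, factor=2):
    """Nearest-neighbour upscale of a latent tensor (C, H, W) -- a
    minimal stand-in for a latent-upscale node. The sampler then
    re-denoises this, hallucinating detail the source never had."""
    return latent.repeat(factor, axis=1).repeat(factor, axis=2)

latent = np.arange(4.0).reshape(1, 2, 2)
up = latent_upscale(latent)
```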

AnimateDiff works great with a batch size of 16 and above but slows the sampler down considerably. The best approach is to generate at 512 and upscale. Frame interpolation could also be put to good use to fill in the frames in between for smoother results.
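As a toy illustration of what frame interpolation provides, here is a naive linear blend between two frames. Dedicated flow-based interpolators (RIFE, FILM) handle motion far better; this only shows the idea of synthesising in-betweens:

```python
import numpy as np

def interpolate_frames(a, b, n=1):
    """Naive linear cross-blend producing n in-between frames.
    A placeholder for proper optical-flow interpolation."""
    return [a + (b - a) * (i / (n + 1)) for i in range(1, n + 1)]

a = np.zeros((2, 2))
b = np.ones((2, 2))
mid = interpolate_frames(a, b, n=1)[0]
```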


Stable Diffusion Model: Stable Diffusion is a latent diffusion model used for image generation and manipulation. It learns to reverse a gradual noising process, which lets it generate realistic outputs while preserving detail, and it models the conditional distribution of images given guidance inputs such as text prompts, depth maps, or edges.
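The "noising process" the model learns to reverse can be written in one line. This is the standard forward-diffusion jump (x_t = sqrt(alpha_bar) x0 + sqrt(1 - alpha_bar) eps), shown here purely as illustration of the training objective, not as part of the workflow above:

```python
import numpy as np

def forward_diffuse(x0, alpha_bar, rng):
    """One jump of the forward (noising) process:
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps.
    A diffusion model is trained to undo this, step by step."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
x0 = np.zeros((8, 8))          # a blank "image"
x_t = forward_diffuse(x0, alpha_bar=0.5, rng=rng)
```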

Grayscale Render from Blender: Before delving into the AI part, you first need a grayscale render from Blender, the popular open-source 3D creation suite. Rendering in grayscale means the output is in shades of gray rather than full colour. This grayscale render acts as the guide, or reference, for the AI to understand the desired visual characteristics of the final image.
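If the render comes out in colour, reducing it to grayscale is a weighted luminance sum; the Rec. 709 weights below are the usual choice (an assumption here, since Blender can also output grayscale directly):

```python
import numpy as np

def to_grayscale(rgb):
    """Rec. 709 luminance: the standard weighted sum used to reduce
    an RGB render to a single-channel guide image."""
    return rgb @ np.array([0.2126, 0.7152, 0.0722])

rgb = np.ones((2, 2, 3))   # pure white stays white
gray = to_grayscale(rgb)
```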

Here is the link to the in-depth tutorial:


Dived deeper into Wonder Studio's capabilities and passed the output through some lighting tweaks and compositing. All of it was done in one day; Wonder took 45 minutes to process.

Started with a custom character. There is a character guideline on Wonder's website that goes through all the prep. There were quite a few intersections in the geo, so I had to adjust the mocap data in Blender. The rotomation was not that smooth either, so I did some cleanup as well. All of it was fairly easy.

Lately, the ability to export Blender scenes was unlocked, and I was able to use that to add some extra lights to bring the character up to the same light levels as the plate. It was quite interesting to learn that Wonder didn't camera-track the shot; it just solved a rotation and estimated the focal length for a stationary camera.
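Matching the character to the plate's light levels ultimately comes down to equalising exposure. A crude first-order version of that idea (a hypothetical gain match, not the actual Blender light setup used) looks like this:

```python
import numpy as np

def match_exposure(cg, plate_mean):
    """Crude exposure match: scale the CG luminance so its mean
    equals the plate's. A first-order stand-in for dialling extra
    lights until the character sits at the plate's light level."""
    gain = plate_mean / cg.mean()
    return cg * gain

cg = np.full((4, 4), 0.25)            # underexposed CG luminance
matched = match_exposure(cg, plate_mean=0.5)
```

In practice light direction and colour matter as much as overall level, which is why extra scene lights beat a flat gain.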

Here is a more in-depth tutorial:


I used Foundry Nuke to key the footage and generate the alpha. While SwitchLight has an AI keying option, I preferred to rely on traditional keying for better accuracy. I then fed 10 images for each environment into SwitchLight and performed the relighting.
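Traditional keyers like Keylight and IBK are far more sophisticated than this, but the core idea of pulling an alpha from screen-colour dominance can be shown in a toy screen-difference key:

```python
import numpy as np

def green_difference_key(rgb):
    """Toy screen-difference matte: alpha = 1 - clamp(G - max(R, B)).
    Real keyers (Keylight, IBK) add spill suppression, edge
    processing, and colour-space handling on top of this idea."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 1.0 - np.clip(g - np.maximum(r, b), 0.0, 1.0)

screen  = np.array([[[0.0, 1.0, 0.0]]])   # pure green screen
subject = np.array([[[0.8, 0.6, 0.5]]])   # skin tone
a_screen  = green_difference_key(screen)
a_subject = green_difference_key(subject)
```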

Afterward, I used the relit images as ground truth to train Foundry Nuke's CopyCat against the originals. The trained model was then applied to the entire sequence.

In-depth tutorial:


Used Luma AI's Gaussian-splatting capabilities and brought the data into Unreal Engine. Unreal was used to clean up the splats with cull and crop boxes and to create a dynamic camera move.
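A cull box simply discards any splat whose centre falls outside (or inside) a volume. The filtering it performs amounts to an axis-aligned box test over the splat positions, sketched here in NumPy (the Unreal plugin does this on the GPU):

```python
import numpy as np

def cull_points(points, box_min, box_max):
    """Keep only splat centres inside an axis-aligned box -- the
    essence of a cull volume used to clean up stray Gaussians."""
    inside = np.all((points >= box_min) & (points <= box_max), axis=1)
    return points[inside]

points = np.array([[0.5, 0.5, 0.5],   # inside the unit box
                   [5.0, 0.0, 0.0],   # stray splat, outside
                   [0.2, 0.9, 0.1]])  # inside
kept = cull_points(points, box_min=np.zeros(3), box_max=np.ones(3))
```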


Imported a basic 3D model, rendered it with an HDRI and a custom background, then used the render as a latent image in ComfyUI for image-to-image enhancement of textures and overall appearance. The refined render was then reprojected onto the geometry in Blender to compensate for slight camera-view discrepancies.
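Reprojection maps the refined image back onto the geometry through the render camera; the maths underneath is a pinhole projection from camera space to pixel coordinates. A minimal sketch, with an assumed focal length and principal point:

```python
import numpy as np

def project(points_cam, focal, cx, cy):
    """Pinhole projection of camera-space points (x, y, z) to pixel
    coordinates -- the core of projecting an image back onto
    geometry from the render camera."""
    x = focal * points_cam[:, 0] / points_cam[:, 2] + cx
    y = focal * points_cam[:, 1] / points_cam[:, 2] + cy
    return np.stack([x, y], axis=1)

# Two points 2 units in front of an assumed 100px-focal camera.
pts = np.array([[0.0, 0.0, 2.0],
                [1.0, 0.0, 2.0]])
uv = project(pts, focal=100.0, cx=256.0, cy=256.0)
```

Blender's UV Project modifier (or camera-projected texture coordinates) performs this mapping per vertex, which is what absorbs the small camera-view discrepancies.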
