A generative model that creates images by running diffusion in a compressed latent space.
LDM is like drawing on a compressed draft. It denoises the picture inside a small latent space first, then decodes it back into the full image.
It powers many sharp text-to-image tools. Because it works on far fewer values than a full-resolution image, it runs faster and uses less graphics memory than diffusion in pixel space.
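To give a rough sense of where the savings come from, the sketch below counts how many values a diffusion step has to handle in pixel space versus latent space. The 8x downsampling factor and the 4 latent channels are assumptions chosen for illustration (they are common in practice but are not stated above), and the helper names are hypothetical. The count covers only the tensor being denoised, not the network's own memory use.

```python
def count_values(height, width, channels):
    """Number of scalar values in a height x width x channels tensor."""
    return height * width * channels


def latent_shape(height, width, downsample=8, latent_channels=4):
    """Latent tensor shape for an image, under an ASSUMED encoder.

    downsample and latent_channels are illustrative defaults,
    not values taken from the text.
    """
    return height // downsample, width // downsample, latent_channels


# A 512 x 512 RGB image in pixel space.
pixel_values = count_values(512, 512, 3)

# The same image after the assumed 8x spatial compression.
lat_h, lat_w, lat_c = latent_shape(512, 512)
latent_values = count_values(lat_h, lat_w, lat_c)

ratio = pixel_values / latent_values
print(pixel_values, latent_values, ratio)  # prints: 786432 16384 48.0
```

Under these assumptions every denoising step touches 48 times fewer values, which is the intuition behind the speed and memory gains.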
Diffusion
LDM moves the diffusion process from pixels into latent space.
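In symbols, the move can be sketched as follows. The notation is the usual DDPM-style one and is an assumption here, since none of these symbols appear in the text: x is the image, E and D are the encoder and decoder, z_t is the noisy latent at step t, and alpha-bar_t is the cumulative noise schedule.

```latex
\begin{aligned}
z_0 &= \mathcal{E}(x)
  && \text{encode the image into latent space} \\
z_t &= \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
  \quad \epsilon \sim \mathcal{N}(0, I)
  && \text{add noise to the latent, not to the pixels} \\
\hat{x} &= \mathcal{D}(\hat{z}_0)
  && \text{decode the denoised latent back into an image}
\end{aligned}
```

The noising rule is the same one used for pixel-space diffusion. The only change is that it is applied to z_0 instead of x.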
VAE
In an LDM, a VAE is typically the component that compresses an image into latent space and decodes the latent back into an image.
Text-to-Image Generation
LDM is the efficient backbone behind many high-resolution text-to-image models.
U-Net
In an LDM, a U-Net is typically the network that predicts the noise in a latent so that it can be removed step by step.
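The step-by-step removal can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not an LDM implementation: it uses a deterministic DDIM-style update, a linear beta schedule, and a hypothetical oracle `predict_noise` function that stands in for a trained U-Net by computing the noise from a known clean latent. All function names and schedule values are illustrative.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an ASSUMED linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(z0, eps, abar):
    """Mix a clean latent with noise: sqrt(abar) * z0 + sqrt(1 - abar) * eps."""
    return [math.sqrt(abar) * z + math.sqrt(1.0 - abar) * e for z, e in zip(z0, eps)]


def ddim_step(z_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM-style update from noise level abar_t to abar_prev."""
    # Estimate the clean latent implied by the predicted noise.
    z0_pred = [
        (z - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
        for z, e in zip(z_t, eps_pred)
    ]
    # Re-noise that estimate to the next, lower noise level.
    return add_noise(z0_pred, eps_pred, abar_prev)


def denoise(predict_noise, z_start, alpha_bars):
    """Run the reverse loop from the noisiest step down to a clean latent."""
    z = list(z_start)
    for t in range(len(alpha_bars) - 1, -1, -1):
        abar_prev = alpha_bars[t - 1] if t > 0 else 1.0  # 1.0 means no noise left
        eps_pred = predict_noise(z, t)  # a trained U-Net would be called here
        z = ddim_step(z, eps_pred, alpha_bars[t], abar_prev)
    return z


def make_oracle(z0, alpha_bars):
    """Hypothetical stand-in for a U-Net: recovers the noise using the known z0."""
    def predict_noise(z_t, t):
        abar = alpha_bars[t]
        return [
            (z - math.sqrt(abar) * c) / math.sqrt(1.0 - abar)
            for z, c in zip(z_t, z0)
        ]
    return predict_noise


random.seed(0)
alpha_bars = make_alpha_bars(50)
clean_latent = [0.5, -1.0, 0.25, 2.0]  # toy 4-value "latent"
noise = [random.gauss(0.0, 1.0) for _ in clean_latent]
noisy_latent = add_noise(clean_latent, noise, alpha_bars[-1])

recovered = denoise(make_oracle(clean_latent, alpha_bars), noisy_latent, alpha_bars)
print([round(v, 6) for v in recovered])
```

Because the oracle's noise prediction is exact, the loop returns the clean latent up to floating-point error. In a real LDM the oracle would be replaced by a trained network, the loop would start from pure noise, and the final latent would be passed through the VAE decoder to produce the image.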