A text-to-image model that uses latent diffusion to make pictures.
Stable Diffusion is like an Etch A Sketch with an art goblin inside. You type “pirate cat,” and TV static turns into a picture.
It can make a fresh image from words. It can also turn a rough sketch into a finished picture or repaint part of a photo.
LDM
Stable Diffusion is a well-known latent diffusion model (LDM): it runs the diffusion process in a compressed latent space instead of on raw pixels.
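To make the latent-space idea concrete, here is a small sketch of the size savings. It assumes the commonly cited Stable Diffusion v1 setup, where the autoencoder shrinks each side of the image by a factor of 8 and keeps 4 latent channels. The function names are made up for illustration and are not part of any library.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the latent for a given image size.

    The defaults (factor 8, 4 channels) are assumptions based on the
    Stable Diffusion v1 autoencoder; other models may differ.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, image_channels=3,
                      downsample=8, latent_channels=4):
    """How many times fewer numbers the latent holds than the RGB image."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (height * width * image_channels) / (c * h * w)


shape = latent_shape(512, 512)        # (4, 64, 64)
ratio = compression_ratio(512, 512)   # 48.0
```

Under these assumptions, the denoiser works on about 48 times fewer values than it would in pixel space, which is the main reason latent diffusion is cheaper to run.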
Diffusion
Diffusion gives it the basic trick: start from random noise and remove it step by step until an image appears.
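The noise-to-image trick can be sketched with the standard closed-form noising rule from DDPM-style diffusion. This is a toy under stated assumptions, not Stable Diffusion's actual code: it uses a plain linear beta schedule (Stable Diffusion's own schedule differs), a four-number list stands in for an image, and the true noise stands in for the trained noise-predicting network.

```python
import math
import random


def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    The schedule values are assumptions borrowed from the DDPM setup.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, noise, alpha_bar):
    """Forward step: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    return [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
            for x, e in zip(x0, noise)]


def estimate_clean(x_t, predicted_noise, alpha_bar):
    """Invert the forward step, given a prediction of the noise."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(x_t, predicted_noise)]


rng = random.Random(0)
alpha_bars = alpha_bar_schedule()

x0 = [0.5, -0.25, 1.0, 0.0]              # stand-in for a tiny "image"
noise = [rng.gauss(0.0, 1.0) for _ in x0]

x_t = add_noise(x0, noise, alpha_bars[500])

# A trained model would predict the noise; here the true noise is used
# as a perfect "oracle" prediction so the example stays self-contained.
recovered = estimate_clean(x_t, noise, alpha_bars[500])
```

With a perfect noise prediction the original values come back exactly (up to rounding). A real model only approximates the noise, which is why generation takes many small denoising steps rather than one jump.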
Text-to-Image Generation
Stable Diffusion made text-to-image generation much more accessible to regular people, because its weights were released publicly and it can run on consumer GPUs.
CLIP
Stable Diffusion uses a CLIP text encoder to turn the prompt into embeddings that steer image generation toward what the words describe.
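CLIP itself is trained so that a caption and its matching image land close together in a shared embedding space, usually measured by cosine similarity. The sketch below shows that matching idea with tiny hand-made vectors. The numbers and labels are hypothetical; real CLIP embeddings have hundreds of dimensions and come from trained encoders, and Stable Diffusion uses only the text side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
text_embedding = [0.9, 0.1, 0.3]   # pretend this encodes "pirate cat"
image_embeddings = {
    "cat in a pirate hat": [0.8, 0.2, 0.4],
    "bowl of fruit": [0.1, 0.9, 0.2],
}

# Score every candidate image against the prompt and keep the closest one.
scores = {name: cosine_similarity(text_embedding, vec)
          for name, vec in image_embeddings.items()}
best_match = max(scores, key=scores.get)   # "cat in a pirate hat"
```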