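A rule that picks the model with the shortest total code length: the bits needed to describe the model plus the bits needed to describe the data given that model.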
MDL is like explaining a broken cookie jar to your mom. Mom counts the story words plus the excuse words. The shortest total wins her trust.
It helps choose between models. It adds model size to the cost of the model's mistakes, so a model that memorizes noise pays for its extra size.
Information Theory
MDL uses code length to measure both model size and mistakes.
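As a rough illustration of counting both parts in bits, here is a minimal sketch that picks a polynomial degree by an approximate two-part code length. The scoring formula (half of log2(n) bits per coefficient, plus a Gaussian-noise cost for the residuals, constants dropped) is one common approximation and is an assumption here, not the only way to define MDL. The function names `description_length` and `pick_degree` and the synthetic data are hypothetical.

```python
import numpy as np

def description_length(x, y, degree):
    """Approximate two-part code length (in bits) of a polynomial fit.

    Assumption: (k/2) * log2(n) bits for k coefficients, plus
    (n/2) * log2(RSS / n) bits for the residuals under Gaussian noise.
    Additive constants shared by all models are dropped.
    """
    n = len(x)
    k = degree + 1                               # number of coefficients
    coeffs = np.polyfit(x, y, degree)            # least-squares fit
    residuals = y - np.polyval(coeffs, x)
    rss = float(np.sum(residuals ** 2))
    model_bits = 0.5 * k * np.log2(n)            # cost of stating the model
    data_bits = 0.5 * n * np.log2(rss / n)       # cost of the mistakes
    return model_bits + data_bits

def pick_degree(x, y, max_degree=8):
    """Return the degree with the shortest total code, plus all scores."""
    lengths = {d: description_length(x, y, d) for d in range(max_degree + 1)}
    return min(lengths, key=lengths.get), lengths

# Synthetic data: a quadratic plus noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

best, lengths = pick_degree(x, y)
print("chosen degree:", best)
```

On data like this, the constant and linear fits score far worse than the quadratic because their mistakes are expensive to encode. Higher degrees shave the error only slightly while paying more bits for extra coefficients.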
Regularization
Regularization often puts the short-explanation idea into the training score.
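One way to see the link is to write the two-part code length next to a penalized training loss. The sketch below assumes a Gaussian prior on the parameters with variance \sigma^2 and a fixed parameter precision, so that the cost of discretizing the parameters is a constant. The symbols \theta, \sigma and \lambda are introduced here for illustration.

```latex
\begin{aligned}
L(M, D) &= L(M) + L(D \mid M) \\
        &= -\log p(\theta) - \log p(D \mid \theta) \\
        &= \frac{\lVert \theta \rVert^2}{2\sigma^2} - \log p(D \mid \theta) + \text{const}
           \qquad \text{for } p(\theta) \propto \exp\!\left(-\frac{\lVert \theta \rVert^2}{2\sigma^2}\right) \\
        &= \lambda \lVert \theta \rVert^2 - \log p(D \mid \theta) + \text{const},
           \qquad \lambda = \frac{1}{2\sigma^2}
\end{aligned}
```

Under these assumptions, and measuring code length in nats (natural logarithm), minimizing the total code length is the same as minimizing the negative log-likelihood with an L2 penalty.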
Bias-Variance Tradeoff
MDL penalizes a model that memorizes noise just to lower its errors, because the extra complexity costs more bits than the smaller errors save.
SLT
MDL is a model choice rule that prefers simple explanations.