TD-Gammon — Temporal Difference Gammon

Fact

A backgammon program, built by Gerald Tesauro at IBM in the early 1990s, that learned by playing against itself with TD learning.

TD-Gammon is like a kid playing both sides at lunch. It loses, groans, swaps seats, and gets smarter.

It used a neural network to score backgammon boards by how likely each one was to lead to a win. It showed that self-play could train game AI strong enough to challenge top human players.

TD Learning
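TD-Gammon used TD Learning to update its board scores. After each move, it nudged the score of the earlier board toward the score of the next one, and at the end of the game toward the actual result.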
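
The score update can be sketched in a few lines. This is a simplified illustration, not TD-Gammon's actual code: it uses a lookup table and the basic TD(0) rule, while TD-Gammon used a neural network and the TD(lambda) variant. The names td_update, values, episode, outcome, and alpha are made up for this example.

```python
def td_update(values, episode, outcome, alpha=0.1):
    """Apply a basic TD(0) update along one finished game.

    values:  dict mapping a board (any hashable key) to its estimated
             chance of winning; unseen boards start at 0.5
    episode: list of boards in the order they were visited
    outcome: 1.0 for a win, 0.0 for a loss
    alpha:   step size (hypothetical default, chosen for illustration)
    """
    for i, board in enumerate(episode):
        current = values.get(board, 0.5)
        if i + 1 < len(episode):
            # Mid-game: the target is the score of the next board.
            target = values.get(episode[i + 1], 0.5)
        else:
            # Last board: the target is the real result of the game.
            target = outcome
        # Move the current score a fraction alpha toward the target.
        values[board] = current + alpha * (target - current)
    return values


# Two won games through the same three boards (labels are placeholders).
demo = {}
td_update(demo, ["a", "b", "c"], outcome=1.0, alpha=0.5)
td_update(demo, ["a", "b", "c"], outcome=1.0, alpha=0.5)
print(demo)  # {'a': 0.5, 'b': 0.625, 'c': 0.875}
```

Notice that the win reaches board "c" first and only reaches "b" in the second game. Credit flows backward one step per game in this basic version, which is one reason many games of self-play are needed.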

RL
TD-Gammon was an early, famous success for RL (reinforcement learning).

Neural-network
TD-Gammon used a Neural-network to estimate its chance of winning from a given board.
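
A toy version of such an evaluator might look like this. It is a sketch under assumptions: the layer sizes, the random untrained weights, and the four-number board encoding are invented for illustration, and the names make_network and win_probability are hypothetical. TD-Gammon's real network took a much larger encoding of the board.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_network(n_inputs, n_hidden, seed=0):
    """Create small random weights for one hidden layer and one output.

    Biases are left out to keep the sketch short.
    """
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    return hidden, output


def win_probability(network, features):
    """Run the board features through the network; return a value in (0, 1)."""
    hidden_weights, output_weights = network
    # Each hidden unit takes a weighted sum of the inputs, then a sigmoid.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)))
              for row in hidden_weights]
    # The output unit does the same over the hidden units.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


# A made-up board described by four numbers (placeholder encoding).
net = make_network(n_inputs=4, n_hidden=3)
print(win_probability(net, [1.0, 0.0, 0.5, 0.0]))
```

With untrained weights the output sits close to 0.5, meaning "no idea who is winning". Training is what pushes these scores toward the results the program actually sees.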

AlphaZero
AlphaZero followed the self-play path that TD-Gammon helped pioneer. It also learned by playing games against itself.