Post-training — RL Glossary

The LLM training phase that takes a foundation model and makes it useful.

A pretrained model predicts the next token. It doesn’t follow instructions, answer questions, or refuse harmful requests. Post-training reshapes it: teaching it tasks, aligning it with human preferences, adapting it to specific domains.

Talk to an RL expert