Train on input-output examples to teach a model a task or format.
Pretraining is self-supervised. No human labels. SFT adds them back. Humans write the examples: “Summarize this article” → a good summary. “Translate to French” → a correct translation. Thousands of these pairs. The model adjusts its parameters so its outputs match the reference answers.
SFT on instruction-response pairs. The step that turns a foundation model into a chatbot. The model learns to follow directions: answer questions, refuse harmful requests, stay on topic. Most conversational AI starts here.
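A minimal sketch of how one SFT training example is typically prepared. The whitespace "tokenizer", the prompt template, and the function names here are illustrative stand-ins, not any specific library's API; the one real convention shown is masking prompt tokens with `-100` so the loss is computed only on the response.

```python
# Sketch: turning an instruction-response pair into an SFT training example.
IGNORE = -100  # conventional label value excluded from the loss

def tokenize(text):
    # Hypothetical stand-in: real SFT uses the model's own tokenizer.
    return text.split()

def build_sft_example(instruction, response):
    prompt_tokens = tokenize(f"User: {instruction} Assistant:")
    response_tokens = tokenize(response)
    input_ids = prompt_tokens + response_tokens
    # Loss only on the response: the model learns to produce the answer,
    # not to predict the prompt.
    labels = [IGNORE] * len(prompt_tokens) + response_tokens
    return input_ids, labels

ids, labels = build_sft_example("Translate to French: cat", "chat")
```

Thousands of examples like this, and gradient descent pulls the model's outputs toward the human-written responses.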