STaR - Self-Taught Reasoner

Prompting a model to think "step by step" makes it output a rationale before its answer. A user could then coax the model toward a better rationale by hand, but STaR automates this curation with fine-tuning over many questions: the model generates rationales, the ones that lead to correct answers are kept, and the model is fine-tuned on them. Because the reasoning the LLM used is made explicit, errors in its logic can be detected and repaired, so models trained with the STaR method output their reasoning, or chain of thought, along with their answers. STaR, described in the paper "STaR: Bootstrapping Reasoning With Reasoning" (Zelikman et al., 2022), is applied during the training of the model, not at inference time.
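A minimal sketch of the STaR loop, assuming hypothetical `generate` and `fine_tune` helpers that stand in for a real LLM sampling and fine-tuning stack; this is an illustration of the idea, not the paper's implementation.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical helpers (placeholders, not the paper's code):
#   generate(model, question, hint) -> (rationale, predicted_answer)
#   fine_tune(base_model, examples)  -> new_model
Example = Tuple[str, str, str]  # (question, rationale, answer)

def star(
    base_model: object,
    generate: Callable[[object, str, Optional[str]], Tuple[str, str]],
    fine_tune: Callable[[object, List[Example]], object],
    dataset: List[Tuple[str, str]],
    iterations: int = 3,
) -> object:
    model = base_model
    for _ in range(iterations):
        kept: List[Example] = []
        for question, answer in dataset:
            # Few-shot prompt the current model to reason "step by step".
            rationale, predicted = generate(model, question, None)
            if predicted != answer:
                # Rationalization: retry with the correct answer as a hint,
                # so the model can produce a rationale for answers it missed.
                rationale, predicted = generate(model, question, answer)
            if predicted == answer:
                # Keep only rationales that lead to the correct answer;
                # this filter is what repairs faulty reasoning over iterations.
                kept.append((question, rationale, answer))
        # The paper restarts fine-tuning from the original base model each
        # iteration, using the newly collected rationale dataset.
        model = fine_tune(base_model, kept)
    return model
```

The correctness filter is doing the work here: rationales are never graded directly, only by whether they end in the right answer, which is what lets the loop run without a human nursing each rationale.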
