STaR - Self-Taught Reasoner
By adding "step by step" to the prompt, the model outputs a rationale along with its answer. The user can then nudge the model toward a better rationale. STaR automates this process with fine-tuning: the model generates rationales for many questions, and it is fine-tuned on the rationales that lead to correct answers. Because the model reveals the reasoning it used, errors in its logic can be detected and repaired. So models trained with the STaR method output their reasoning, or chain of thought, alongside their answers. The method, described in the original STaR paper, is applied during the training of the model.
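The loop described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `generate_rationale` is a hypothetical stub standing in for an LLM prompted to think step by step, and the "fine-tune" step is reduced to collecting the kept examples.

```python
import random

def generate_rationale(question, a, b):
    # Stub standing in for an LLM prompted with "step by step".
    # It occasionally makes an arithmetic slip, mimicking a flawed rationale.
    answer = a + b + random.choice([0, 0, 0, 1])
    rationale = f"{a} plus {b} equals {answer}."
    return rationale, answer

def star_iteration(dataset):
    """One STaR iteration: sample a rationale per question, then keep
    only the (question, rationale, answer) triples whose answer is correct."""
    kept = []
    for question, a, b, gold in dataset:
        rationale, answer = generate_rationale(question, a, b)
        if answer == gold:  # filter: correct answer implies a usable rationale
            kept.append((question, rationale, answer))
    return kept  # in real STaR, the model is fine-tuned on `kept` and the loop repeats

random.seed(0)
data = [(f"What is {a} + {b}?", a, b, a + b) for a, b in [(2, 3), (4, 5), (7, 8)]]
training_examples = star_iteration(data)
for q, r, ans in training_examples:
    print(q, "->", r)
```

The key idea carried over from the text is the filter: rationales are kept only when they end in the right answer, so fine-tuning steers the model toward reasoning that works.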