[[File:Three-stage large language model training workflow.svg|thumb|Three-stage large language model training workflow]]
'''Reasoning language models''' ('''RLMs''') are [[large language model]]s that are further trained to solve tasks requiring several steps of [[reasoning]].<ref>{{cite arXiv |last1=Besta |first1=Maciej |last2=Barth |first2=Julia |last3=Schreiber |first3=Eric |last4=Kubicek |first4=Ales |last5=Catarino |first5=Afonso |last6=Gerstenberger |first6=Robert |last7=Nyczyk |first7=Piotr |last8=Iff |first8=Patrick |last9=Li |first9=Yueling |title=Reasoning Language Models: A Blueprint |date=2025-01-23 |arxiv=2501.11223 |class=cs.CL}}</ref> They tend to perform better than standard LLMs on logic, mathematics, and programming tasks; they can [[Backtracking|revisit and revise]] earlier reasoning steps; and they can use additional computation while answering as a further way to [[Neural scaling law|scale performance]], alongside the number of training examples, model parameters, and training compute.<ref name=":8">{{cite web |title=Learning to reason with LLMs |url=https://openai.com/index/learning-to-reason-with-llms/ |website=OpenAI |date=2024-09-12 |access-date=2025-07-26}}</ref>
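The use of extra answer-time computation can be illustrated with self-consistency sampling, a simple test-time scaling technique in which the same question is posed to the model several times and the most frequent final answer is kept. The following is a minimal sketch in Python; <code>sample_response</code> is a hypothetical toy stand-in for a call to an actual reasoning model.

<syntaxhighlight lang="python">
import random
from collections import Counter

def sample_response(prompt: str) -> str:
    """Toy stand-in for one stochastic sample from a reasoning model.
    A real system would call a model API here; this stub returns the
    correct answer only 60% of the time so the demo stays runnable."""
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

def self_consistency(prompt: str, n_samples: int = 16) -> str:
    """Pose the question n_samples times and keep the most frequent
    final answer. Raising n_samples spends more computation at answer
    time, trading test-time compute for accuracy."""
    answers = [sample_response(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]  # majority vote

print(self_consistency("What is 6 * 7?"))  # usually prints "42"
</syntaxhighlight>

With more samples, the majority vote becomes more reliable, which is why accuracy on such tasks can scale with computation spent at answer time rather than only with training-time resources.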