{{Short description|Language models designed for reasoning tasks}}
{{Merge to|Reflection (artificial intelligence)|date=April 2025}}
{{unreliable sources|date=January 2025}}
'''Reasoning language models''' ('''RLMs''') are [[large language model]]s that have been further trained to solve multi‑step reasoning tasks.<ref>{{cite arXiv |title=Reasoning Language Models: A Blueprint |last=Besta |first=Maciej |date=2025-01-23 |eprint=2501.11223 |class=cs.CL}}</ref> Compared with conventional LLMs, these models tend to perform better on logical, mathematical, and programming tasks, can [[Backtracking|backtrack]] and revise intermediate reasoning steps before committing to a final answer, and use test-time compute as an additional [[Neural scaling law|scaling axis]] alongside [[Training, validation, and test data sets|training data]], parameter count, and train-time compute.
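A simple illustration of test-time scaling is self-consistency sampling: the model generates several independent reasoning chains for the same question and the most common final answer is returned, so spending more compute at inference time can improve accuracy. The following minimal Python sketch assumes a hypothetical <code>generate_answer</code> function that queries a model once and returns its final answer as a string; it is illustrative only, not a reference implementation.

<syntaxhighlight lang="python">
import collections

def self_consistency(generate_answer, question, n_samples=8):
    """Sample several reasoning chains and return the majority answer.

    ``generate_answer`` is a hypothetical callable that queries a
    language model once and returns its final answer as a string.
    Increasing ``n_samples`` spends more test-time compute in
    exchange for (typically) higher accuracy.
    """
    answers = [generate_answer(question) for _ in range(n_samples)]
    # Majority vote over the sampled final answers.
    return collections.Counter(answers).most_common(1)[0][0]
</syntaxhighlight>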
== Supervised finetuning ==