Reasoning language model: Difference between revisions

{{Main|Benchmark (computing)|List of language model benchmarks}}
 
The reasoning ability of language models is usually tested on problems with unambiguous solutions that can be cheaply checked and that require reasoning when solved by a human. Such problems are usually in mathematics and [[competitive programming]]. The answer is usually an array of integers, a multiple-choice letter, or a program that passes [[Unit testing|unit tests]] within a limited runtime. Common benchmarks include:
 
* GSM8K (Grade School Math): 8.5K linguistically diverse [[Primary school|elementary school]] [[Word problem (mathematics education)|math word problems]] that require 2 to 8 basic arithmetic operations to solve.<ref name=":2" />
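
The three answer formats described above (an array of integers, a multiple-choice letter, or a program run against tests under a time limit) can each be checked mechanically. The following is a minimal illustrative sketch in Python, not the grading code of any particular benchmark; the function names, the stdin/stdout test format, and the default time limit are all assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def check_integer_answer(predicted, expected):
    """Exact match on a sequence of integers (hypothetical helper)."""
    return list(predicted) == list(expected)


def check_multiple_choice(predicted, expected):
    """Compare choice letters, ignoring case and surrounding whitespace."""
    return predicted.strip().upper() == expected.strip().upper()


def check_program(source, tests, timeout_s=5.0):
    """Run a Python solution against (stdin, expected stdout) pairs.

    Assumption: each test is a pair of strings and the time limit applies
    per test. Real graders also sandbox untrusted code; this sketch does not.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "solution.py"
        path.write_text(source)
        for stdin_text, expected_stdout in tests:
            try:
                result = subprocess.run(
                    [sys.executable, str(path)],
                    input=stdin_text,
                    capture_output=True,
                    text=True,
                    timeout=timeout_s,
                )
            except subprocess.TimeoutExpired:
                return False  # exceeded the runtime limit
            if result.returncode != 0:
                return False  # crashed or exited with an error
            if result.stdout.strip() != expected_stdout.strip():
                return False  # wrong output
    return True
```

In each case the check is a cheap comparison or a bounded program run, which is what makes such problems convenient for automated evaluation.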