{{Short description|Version of a transformer large language model}}
A '''1.58-bit Large Language Model''' ('''1.58-bit LLM''', also '''ternary LLM''') is a version of a [[Transformer (deep learning architecture)|transformer]] [[large language model]] whose weights take only three values: −1, 0, and +1. This restriction theoretically allows the model to replace costly multiplications with additions and to reduce its memory footprint. Since the end-task performance and [[Perplexity (LLM)|perplexity]] of 1.58-bit LLMs, at least at smaller model sizes (up to 3–4B parameters), are close to those of their "full-precision" (16-bit [[FP16]] or [[BF16]]) counterparts, this design can reach the same [[artificial intelligence]] goals with much lower hardware requirements, latency, and training effort.{{sfn|Ma|Wang|Ma|Wang|2024|p=1}}{{sfn|Friha|Amine Ferrag|Kantarci|Cakmak|2024|p=5822}}{{sfn|Hutson|2024}}
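
The replacement of multiplications by additions can be illustrated with a minimal sketch. The following Python function is an illustrative example only, not taken from any published implementation: it computes a matrix–vector product with ternary weights using nothing but additions, subtractions, and skipped terms.

<syntaxhighlight lang="python">
# Illustrative sketch (not a cited implementation): a matrix-vector
# product where every weight is -1, 0, or +1, so each output element
# is formed purely by additions and subtractions of activations.
from typing import List

def ternary_matvec(weights: List[List[int]], x: List[float]) -> List[float]:
    """Multiply a ternary weight matrix (entries in {-1, 0, +1}) by x
    without performing any weight multiplications."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:        # +1: add the activation
                acc += xi
            elif w == -1:     # -1: subtract the activation
                acc -= xi
            # 0: the term is skipped entirely (built-in sparsity)
        out.append(acc)
    return out

# Example: a 2x3 ternary weight matrix applied to a 3-vector.
W = [[ 1, 0, -1],
     [-1, 1,  1]]
print(ternary_matvec(W, [0.5, -2.0, 1.5]))  # [-1.0, -1.0]
</syntaxhighlight>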