* {{cite web |last=Morales |first=Jowi |title=Microsoft researchers build 1-bit AI LLM with 2B parameters |website=Tom's Hardware |date=2025-04-17 |url=https://www.tomshardware.com/tech-industry/artificial-intelligence/microsoft-researchers-build-1-bit-ai-llm-with-2b-parameters-model-small-enough-to-run-on-some-cpus |access-date=2025-04-21}}
* {{citation |last=Ouyang |first=Xu |last2=Ge |first2=Tao |last3=Hartvigsen |first3=Thomas |last4=Zhang |first4=Zhisong |last5=Mi |first5=Haitao |last6=Yu |first6=Dong |title=Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens |date=2024 |doi=10.48550/ARXIV.2411.17691 |doi-access=free |url=http://arxiv.org/pdf/2411.17691 |access-date=2025-04-22}}
* {{citation |last=Wang |first=Hongyu |last2=Ma |first2=Shuming |last3=Dong |first3=Li |last4=Huang |first4=Shaohan |last5=Wang |first5=Huaijie |last6=Ma |first6=Lingxiao |last7=Yang |first7=Fan |last8=Wang |first8=Ruiping |last9=Wu |first9=Yi |last10=Wei |first10=Furu |title=BitNet: Scaling 1-bit Transformers for Large Language Models |date=2023 |doi=10.48550/ARXIV.2310.11453 |doi-access=free |url=https://arxiv.org/abs/2310.11453 |access-date=2025-04-23}}
[[Category:Large language models]]