In 2025, Microsoft researchers released ''BitNet b1.58 2B4T'', a model with open weights and open inference code, trained on 4 trillion tokens with 2 billion parameters and demonstrating performance competitive with full-precision models of comparable size.{{sfn|Ma|Wang|Huang|Zhang|2025|p=}}
 
== Post-training quantization ==
BitNet derives its performance from being trained natively at 1.58 bits rather than being quantized from a full-precision model after training. Training from scratch, however, is expensive, so it would be desirable to convert an existing full-precision model to 1.58 bits instead. In 2024, Hugging Face reported a fine-tuning method that gradually ramps up the strength of the 1.58-bit quantization, converting an existing model down to 1.58 bits.<ref>{{cite web |title=Fine-tuning LLMs to 1.58bit: extreme quantization made easy |url=https://huggingface.co/blog/1_58_llm_extreme_quantization#pre-training-results-in-158b |website=Hugging Face}}</ref>
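
The gradual ramp-up can be illustrated with a minimal sketch (not the exact Hugging Face recipe; the absmean quantizer and the linear schedule below are illustrative assumptions): the effective weight is a blend of the full-precision weight and its ternary quantization, with the blend factor increased from 0 to 1 over fine-tuning and gradients passed to the full-precision weights through a straight-through estimator.

<syntaxhighlight lang="python">
import torch

def ternary_quantize(w: torch.Tensor) -> torch.Tensor:
    # Absmean quantization in the style of BitNet b1.58:
    # scale by the mean absolute value, round, and clip to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=1e-5)
    return (w / scale).round().clamp(-1, 1) * scale

def blended_weight(w: torch.Tensor, lam: float) -> torch.Tensor:
    # Blend the full-precision weight with its ternary version.
    # lam = 0.0 keeps full precision; lam = 1.0 is fully 1.58-bit.
    # The detach trick acts as a straight-through estimator, so gradients
    # still reach the underlying full-precision weight during fine-tuning.
    blended = (1.0 - lam) * w + lam * ternary_quantize(w)
    return w + (blended - w).detach()

def lam_schedule(step: int, warmup_steps: int = 1000) -> float:
    # Hypothetical linear ramp; the schedule used in practice may differ.
    return min(1.0, step / warmup_steps)
</syntaxhighlight>

In such a setup, each linear layer's forward pass would use <code>blended_weight(w, lam_schedule(step))</code> in place of the stored weight, so early fine-tuning steps behave like the original model while later steps see fully ternary weights.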
 
== Critique ==