In 2025, Microsoft researchers released ''BitNet b1.58 2B4T'', a model with open weights and open inference code, trained on 4 trillion tokens with 2 billion parameters and demonstrating performance competitive with full-precision models of comparable size.{{sfn|Ma|Wang|Huang|Zhang|2025|p=}}
 
== Post-training quantization ==
BitNet derives its performance from being trained natively at 1.58 bits rather than being quantized from a full-precision model after training. Training from scratch, however, is expensive, so it would be desirable to convert an existing full-precision model to 1.58 bits instead. In 2024, Hugging Face reported a fine-tuning method that gradually ramps up the strength of the 1.58-bit quantization, converting an existing model down to 1.58 bits.<ref>{{cite web |title=Fine-tuning LLMs to 1.58bit: extreme quantization made easy |url=https://huggingface.co/blog/1_58_llm_extreme_quantization#pre-training-results-in-158b |website=Hugging Face}}</ref>
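
The gradual ramp-up can be illustrated with a minimal sketch (not the exact Hugging Face recipe; the absmean quantizer and the linear schedule below are illustrative assumptions): the effective weight is a blend of the full-precision weight and its ternary quantization, with the blend factor increased from 0 to 1 over fine-tuning and gradients passed to the full-precision weights through a straight-through estimator.

<syntaxhighlight lang="python">
import torch

def ternary_quantize(w: torch.Tensor) -> torch.Tensor:
    # Absmean quantization in the style of BitNet b1.58:
    # scale by the mean absolute value, round, and clip to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=1e-5)
    return (w / scale).round().clamp(-1, 1) * scale

def blended_weight(w: torch.Tensor, lam: float) -> torch.Tensor:
    # Blend the full-precision weight with its ternary version.
    # lam = 0.0 keeps full precision; lam = 1.0 is fully 1.58-bit.
    # The detach trick acts as a straight-through estimator, so gradients
    # still reach the underlying full-precision weight during fine-tuning.
    blended = (1.0 - lam) * w + lam * ternary_quantize(w)
    return w + (blended - w).detach()

def lam_schedule(step: int, warmup_steps: int = 1000) -> float:
    # Hypothetical linear ramp; the schedule used in practice may differ.
    return min(1.0, step / warmup_steps)
</syntaxhighlight>

In such a setup, each linear layer's forward pass would use <code>blended_weight(w, lam_schedule(step))</code> in place of the stored weight, so early fine-tuning steps behave like the original model while later steps see fully ternary weights.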
 
== Critique ==