== BitNet ==
BitNet's creators did not use post-training quantization of weights; instead, they relied on a new ''BitLinear'' transform that replaces the ''nn.Linear'' layer of the traditional transformer design.{{sfn|Wang|Ma|Dong|Huang|2023|p=1}}
In 2025, Microsoft researchers released ''BitNet b1.58 2B4T'', a model with [[open-weights|open weights]] and open inference code, demonstrating performance competitive with full-precision models at 2B parameters and 4T training tokens.{{sfn|Ma|Wang|Huang|Zhang|2025|p=}}
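The "1.58-bit" naming refers to ternary weights in {−1, 0, +1} (log₂ 3 ≈ 1.58 bits per weight), produced by absmean quantization: weights are divided by the mean absolute weight value and rounded to the nearest ternary value. The following is a minimal illustrative sketch of that quantization step in pure Python, assuming dense list-of-lists matrices; the function names are hypothetical and not taken from any official implementation.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a weight matrix (list of rows) to {-1, 0, +1} plus a scale.

    The scale gamma is the mean absolute value of all weights; each weight
    is divided by gamma and rounded, then clamped to the ternary set.
    (Illustrative sketch only, not the official BitNet code.)
    """
    flat = [w for row in weights for w in row]
    gamma = sum(abs(w) for w in flat) / max(len(flat), 1) + eps
    quantized = [[max(-1, min(1, round(w / gamma))) for w in row]
                 for row in weights]
    return quantized, gamma

def bitlinear_matvec(weights, x):
    """y = (Q x) * gamma: matrix-vector product with the ternary matrix Q,
    rescaled by gamma to approximate the full-precision product."""
    q, gamma = absmean_ternary(weights)
    return [gamma * sum(qi * xi for qi, xi in zip(row, x)) for row in q]
```

Because the quantized matrix contains only −1, 0, and +1, the inner products reduce to additions and subtractions, which is the source of BitNet's claimed inference efficiency.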