https://t.co/PLL0hNPbdv
@teortaxesTex: Good guy DeepSeek gives us accelerated models. The most interesting one here is Gemma4-12B, I presume vision included. Might be the best local model in its weight class now, by some margin. Qwen 3.5 not included because DS[park] doesn't do linear attention, I guess. https://t.co/lKtcGBQH3w
