• brucethemoose@lemmy.world
    6 days ago

    Alibaba’s QwQ 32B is already incredible, and it runs on 16GB GPUs! Honestly it’s a bigger deal than DeepSeek R1, and so were many open models before it; they just didn’t get the finance-media attention DS got. And Alibaba is releasing a new series this month.
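
    For anyone wondering how a 32B model squeezes onto a 16GB card: it’s quantization. Rough weights-only math (my numbers, not from the comment; real GGUF/EXL quants carry some overhead, and KV cache needs room too):

    ```python
    # Back-of-envelope VRAM math for a 32B-parameter model at various
    # bits-per-weight. Weights only; ignores KV cache and activations.
    def weight_gib(params_billion: float, bits_per_weight: float) -> float:
        """Approximate weight memory in GiB."""
        return params_billion * 1e9 * bits_per_weight / 8 / 2**30

    fp16 = weight_gib(32, 16)   # ~59.6 GiB: nowhere near fitting on 16 GB
    q4   = weight_gib(32, 4.5)  # ~16.8 GiB: a typical ~4-bit quant, just over
    q3   = weight_gib(32, 3.5)  # ~13.0 GiB: ~3-bit quants leave room for KV cache
    print(f"fp16: {fp16:.1f} GiB, ~4-bit: {q4:.1f} GiB, ~3-bit: {q3:.1f} GiB")
    ```

    So a ~3-bit quant (or ~4-bit with some layers offloaded to CPU) is how the 16GB figure works out in practice.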

    Microsoft just released a 2B bitnet model today! And that’s from their paltry, underfunded research division, not the one training “usable” models: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
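
    The “1.58” in the name is the information content of a ternary weight: BitNet restricts weights to {-1, 0, +1}, and log2(3) ≈ 1.58 bits. Quick sketch of what that means for a 2B model (weights only, my arithmetic, not from the model card):

    ```python
    import math

    # BitNet b1.58 uses ternary weights {-1, 0, +1}; each weight carries
    # log2(3) bits of information, hence "1.58-bit".
    bits_per_weight = math.log2(3)
    print(f"{bits_per_weight:.2f} bits per weight")  # 1.58

    # Weights-only footprint of a 2B-parameter ternary model:
    gib = 2e9 * bits_per_weight / 8 / 2**30
    print(f"~{gib:.2f} GiB")  # under half a gigabyte of weights
    ```

    Well under a gigabyte of weights, which is why bitnet-style models are interesting for local and edge inference.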

    Local, efficient ML is coming. That’s why Altman and everyone else are lying through their teeth: scaling up infinitely is not the way forward. It never was.