• TechLich@lemmy.world

    I dunno about that… Very small models (2–8B parameters), sure, but if you want more than a handful of tokens per second on a large model (R1 is 671B), you’re looking at some very expensive hardware that also comes with a power bill.

    Even a 20–70B model needs a big chunky new graphics card, or something fancy like those new AMD Ryzen AI Max chips and a crapload of RAM.
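    For a rough sense of scale, here’s a back-of-envelope sketch of memory needs by parameter count. The numbers are assumptions, not benchmarks: ~0.5 bytes per weight for 4-bit quantization, plus a ~20% ballpark overhead for KV cache and activations.

    ```python
    # Back-of-envelope memory estimate: parameters x bytes per weight,
    # plus an assumed ~20% overhead for KV cache and activations.
    def mem_gb(params_billions: float, bytes_per_weight: float = 0.5,
               overhead: float = 0.2) -> float:
        return params_billions * bytes_per_weight * (1 + overhead)

    # 4-bit quantization ~= 0.5 bytes per weight (assumption)
    for name, params in [("7B", 7), ("70B", 70), ("R1 671B", 671)]:
        print(f"{name}: ~{mem_gb(params):.0f} GB at 4-bit")
    ```

    Even quantized, the 70B class lands well past consumer GPU VRAM, and R1 needs hundreds of GB, which is why it gets expensive fast.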

    Granted you don’t need a whole datacenter, but the price is far from zero.