• panda_abyss@lemmy.ca · 23 hours ago

It is: gradient descent is what you use to find optimal model parameters.

The algorithm computes the gradient of the loss (how the loss changes for nearby parameter values), then takes a step in the direction that decreases it, updating the parameters in a loop.
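The loop described above can be sketched in a few lines. This is a minimal illustration, not anything from the thread: the function, starting point, and learning rate are all made up for the example.

```python
# Minimize f(x) = (x - 3)^2 by gradient descent.
# Its gradient is f'(x) = 2 * (x - 3).

def grad(x):
    return 2 * (x - 3)

x = 0.0    # initial parameter guess (arbitrary)
lr = 0.1   # learning rate: how big a step to take

for _ in range(100):
    x -= lr * grad(x)  # step against the gradient, in a loop

print(round(x, 4))  # converges near the optimum x = 3
```

Real training does the same thing, just with millions of parameters and a gradient computed over training data instead of a closed-form derivative.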

  • Septimaeus@infosec.pub · 2 hours ago

  Adding to the above, one of the challenges of GD is knowing whether the optimum it converges to is actually the global one and not just one of its many imposters (local optima). That’s the “big picture” he’s talking about. Working with Elon was a dead end.
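The local-optimum trap is easy to show on a toy function. This is an assumed example (the function and starting points are invented for illustration): the exact same descent loop, started from two different points, lands in two different minima.

```python
# f(x) = (x^2 - 1)^2 + 0.3 * x has two minima: a global one near
# x ≈ -1.03 and a shallower local one near x ≈ 0.96.

def f(x):
    return (x**2 - 1) ** 2 + 0.3 * x

def grad(x):
    return 4 * x * (x**2 - 1) + 0.3

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

left = descend(-2.0)   # ends up at the global minimum near x ≈ -1.03
right = descend(2.0)   # gets stuck in the local minimum near x ≈ 0.96
print(f(left) < f(right))  # True: same algorithm, different answers
```

Plain gradient descent has no way to tell these apart from the inside, which is why practice leans on things like random restarts, momentum, or stochastic noise to escape shallow basins.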

    • panda_abyss@lemmy.ca · 51 minutes ago

    It seems obviously so. I don’t think I could ever hire someone who worked on Grok’s deepfake porn engine.

    Between working for a nazi, the child porn issue, and all the rest, it’s a bad fucking look.