Yes, gradient descent is what you use to find optimal model parameters.
The algorithm computes the gradient at the current parameters (the direction in which the loss increases fastest), takes a small step in the opposite direction to improve the parameters, and repeats that in a loop.
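For anyone who wants to see that loop spelled out, here is a minimal sketch in plain Python. The toy loss `(x - 3)^2`, the function name `gradient_descent`, and the learning rate and step count are all assumptions picked for illustration, not anything from a real training setup.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move opposite to the gradient, scaled by the learning rate.
        x = x - lr * grad(x)
    return x

# Toy loss f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

Real models do the same thing over millions of parameters, with the gradient coming from backpropagation instead of a hand-written formula.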
Adding to the above, one of the challenges of GD is knowing whether the optimum it converges to is really the global one and not just one of its many imposters (local optima). That’s the “big picture” he’s talking about. Working with Elon was a dead end.
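To make the local-optimum point concrete, here is a rough sketch. The function `f(x) = x^4 - 3x^2 + x` is just an assumed toy example with two valleys, and the starting points and learning rate are arbitrary choices. The same descent loop ends up in a different valley depending on where it starts, and nothing in the loop itself tells you which valley is the deeper one.

```python
def f(x):
    # Toy loss with two valleys: a deeper one near x = -1.30
    # and a shallower one near x = 1.13.
    return x**4 - 3 * x**2 + x

def grad_f(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(grad, x0, lr=0.01, steps=1000):
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

from_right = gradient_descent(grad_f, x0=2.0)   # settles in the shallower valley
from_left = gradient_descent(grad_f, x0=-2.0)   # settles in the deeper valley

print(from_right, f(from_right))
print(from_left, f(from_left))
```

Both runs stop because the gradient is roughly zero there, so from the inside a local optimum looks exactly like the global one. Comparing several random restarts is one common, if imperfect, way to check.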
It seems obviously so. I don’t think I could hire someone who worked on Grok’s deep fake porn engine, ever.
Between the working for a nazi, child porn issue, and all, it’s a bad fucking look.