LLMs are not capable of creating anything, including code. They are enormous word-matching search engines that try to find and piece together the closest existing examples of what is being requested. If what you’re looking for is reasonably common, that may be useful. If what you’re looking for is obscure, you may get things that don’t apply. And the LLM cannot tell the difference. They can be useful but, unlike an LLM, you need to understand the context to use them safely.
I think the most interesting thing about LLMs is actually what they tell us about the repetitive nature of most of what we do.
LLMs are not capable of creating anything, including code. They are enormous word-matching search engines that try to find and piece together the closest existing examples of what is being requested. If what you’re looking for is reasonably common, that may be useful.
Just for common understanding, you’re making blanket statements about LLMs as though those statements apply to all LLMs. You’re not wrong if you’re generally speaking of the LLMs deployed for retail consumption like, as an example, ChatGPT. None of what I’m saying here is a defense of how these giant companies are using LLMs today. I’m just posting from a Data Science point of view on the technology itself.
However, if you’re talking about the LLM technology, as in a Data Science view, your statements may not apply. The common hyperparameters for LLMs are set to choose the most likely matches for the next token (like the ChatGPT example), but there’s nothing about the technology that requires that. In fact, you can set a model to specifically exclude the top result, or even choose the least likely result. What comes out when you set these hyperparameters is truly strange and looks like absolute garbage, but it is unique. The result is something that likely hasn’t existed before. I’m not saying this is a useful exercise. It’s the most extreme version to illustrate the point. There’s also the “temperature” hyperparameter, which controls how much straight-up randomness goes into each selection. If you crank this up, the model will start sampling from a much flatter set of weights, resulting in pretty wild (and potentially useless) results.
What many Data Scientists who are trying to make LLMs generate something truly new and unique do is balance these settings so that new, useful combinations come out without the output being absolute useless garbage.
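To make those settings concrete, here is a minimal Python sketch of next-token selection. It is a toy, not the code of any real model: the four-word vocabulary, the scores in `LOGITS`, and the function and mode names (`pick_token`, `"exclude_top"`, `"least_likely"`) are all made up for illustration. It only shows that the selection rule sits on top of the model’s scores and can be swapped out.

```python
import math
import random

# Hypothetical vocabulary and raw scores (logits) for the next token.
VOCAB = ["the", "a", "cat", "zebra"]
LOGITS = [4.0, 3.0, 1.0, -2.0]


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, mode="greedy", temperature=1.0, rng=random):
    """Return the index of the chosen token under one of four toy strategies.

    Assumes at least two candidate tokens.
    """
    # Token indices ordered from highest score to lowest.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if mode == "greedy":
        return ranked[0]            # always the most likely token
    if mode == "least_likely":
        return ranked[-1]           # always the least likely token
    if mode == "exclude_top":
        candidates = ranked[1:]     # sample from everything except the top token
    elif mode == "sample":
        candidates = ranked         # ordinary temperature sampling
    else:
        raise ValueError(f"unknown mode: {mode}")
    probs = softmax([logits[i] for i in candidates], temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    for mode in ("greedy", "sample", "exclude_top", "least_likely"):
        index = pick_token(LOGITS, mode=mode, temperature=1.5)
        print(mode, "->", VOCAB[index])
```

With these scores, `"greedy"` always returns “the” and `"least_likely"` always returns “zebra”, while the two sampling modes vary from run to run. Raising `temperature` moves probability away from the top token, which is the “wild” behavior described above.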
I write software for a living and I have worked directly with LLM backend code. You aren’t wrong about the exceptions, but I think they actually reinforce my main point. If you play with the parameters you can make all kinds of things happen, but all of those things are still driven by the existing information it already has or can find. It can mash things together in random new ways, but it will always work with components that already exist. There is no awareness of context or meaning that would allow it to make intelligent choices about what it mashes together. That will always be driven by the patterns it already knows, positively or negatively.
It’s like doing chemistry by picking random bottles from the shelf and dumping them into a beaker to see what happens. You could make an amazing discovery that way, but the chances of it happening are very, very low. And even if it does happen, there’s an excellent chance that you won’t recognize it.
I’m in favor of using LLMs for tasks that involve large-scale data analysis. They can be quite helpful, as long as the user understands their limitations and performs due diligence to validate the results.
Unfortunately what we are mostly seeing are cases where LLMs are used to generate boilerplate text or code that is assembled from a vast collection of material that someone who actually knew what they were doing had previously created. That kind of reuse is not inherently bad, but it should not be confused with what competent writers or coders do. And if LLMs really do take over a lot of routine daily tasks from people, the pool of approaches to those tasks will stagnate, and eventually degenerate, as LLMs become the primary sources of each others’ solutions.
LLMs may very well change the world, but not in the ways most people expect. Companies that have invested heavily in them are pushing them as the solutions to the wrong problems.
If you play with the parameters you can make all kinds of things happen, but all of those things are still driven by the existing information it already has or can find. It can mash things together in random new ways, but it will always work with components that already exist.
Or purely randomness, but the spirit of your point is sound. And if it is randomness it may be unique output, but the utility of that result may be zero.
There is no awareness of context or meaning that would allow it to make intelligent choices about what it mashes together. That will always be driven by the patterns it already knows, positively or negatively.
100% AGREE. LLMs are not “thinking”. LLMs are NOT the HAL 9000 from the movie 2001: A Space Odyssey.
It’s like doing chemistry by picking random bottles from the shelf and dumping them into a beaker to see what happens. You could make an amazing discovery that way, but the chances of it happening are very, very low. And even if it does happen, there’s an excellent chance that you won’t recognize it.
100% AGREE.
I’m in favor of using LLMs for tasks that involve large-scale data analysis. They can be quite helpful, as long as the user understands their limitations and performs due diligence to validate the results.
Unfortunately what we are mostly seeing are cases where LLMs are used to generate boilerplate text or code that is assembled from a vast collection of material that someone who actually knew what they were doing had previously created. That kind of reuse is not inherently bad, but it should not be confused with what competent writers or coders do. And if LLMs really do take over a lot of routine daily tasks from people, the pool of approaches to those tasks will stagnate, and eventually degenerate, as LLMs become the primary sources of each others’ solutions.
100% agree. The degeneration is already occurring because bad LLM output is being fed back in as authoritative training data, resulting in confidently wrong answers being presented as truth. Critical thinking seems to have become an endangered species in the last 20 years, and I’m really worried that people are trusting LLM chatbots completely, never challenging the things they output but instead accepting them as fact (and acting on those wrong things!).
LLMs may very well change the world, but not in the ways most people expect. Companies that have invested heavily in them are pushing them as the solutions to the wrong problems.
I think we have some of the pieces today that will make AI in general more trustworthy in the future. Grounding can go part way to making today’s LLMs more trustworthy. If an LLM claims something as fact, it should be able to produce the citation that supports it (outside of LLM output). That source can then be evaluated critically. Today’s grounding doesn’t go far enough though. An LLM today will say “I got that from HERE” and simply give a document. It won’t show the page or line of text and the supporting arguments that would justify how it arrived at its stated output. It can’t do these things today because what I just described is reasoning, which is something an LLM is NOT capable of. So we wait for true AGI instead.
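As a rough illustration of what a page-and-line citation could look like, and of how a reader might check one mechanically, here is a small Python sketch. Everything in it is hypothetical: the `Citation` fields, the `citation_checks_out` helper, and the `SOURCES` layout are assumptions made for this example, not how any existing grounding system works. It only confirms that the quoted text really appears at the cited location; judging whether that text supports the claim is still the critical-thinking part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    document: str  # hypothetical document identifier
    page: int      # 1-based page number
    line: int      # 1-based line number on that page
    quote: str     # exact text the claim relies on


def citation_checks_out(citation, sources):
    """Return True only if the quote appears on the cited page and line.

    `sources` maps a document identifier to a list of pages,
    where each page is a list of text lines.
    """
    pages = sources.get(citation.document)
    if pages is None:
        return False  # the cited document is not available
    if not 1 <= citation.page <= len(pages):
        return False  # page number out of range
    lines = pages[citation.page - 1]
    if not 1 <= citation.line <= len(lines):
        return False  # line number out of range
    return citation.quote in lines[citation.line - 1]


# Made-up source text, used only to exercise the check.
SOURCES = {
    "manual.txt": [
        ["Install the package.", "Restart the service after install."],
    ],
}

good = Citation("manual.txt", page=1, line=2, quote="Restart the service")
wrong_line = Citation("manual.txt", page=1, line=1, quote="Restart the service")
no_such_doc = Citation("missing.txt", page=1, line=1, quote="anything")
```

Here `citation_checks_out(good, SOURCES)` returns `True`, while the other two citations fail the check.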