How do LLMs work?
As a layperson with zero technical background, the biggest challenge I have is properly contextualizing what just happened. There are a few mental biases that I’m fighting here:
ChatGPT and other AI tools seem so damn magical, like something that shouldn’t be possible.
ChatGPT in particular seems human at many points. Human brains are biased toward anthropomorphization, and the iterative chat interface is really throwing mine for a loop.
I am not used to seeing “probabilistic tools.” Traditional software is great at carrying out the same, specific instructions, over and over again. Generative AI tools are much more versatile but less reliable: the same prompt can produce a different answer each time.
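To make that third point a bit more concrete, here is a toy sketch in Python. It is not how ChatGPT is actually built. The function names, the 8% tax rate, and the word probabilities are all made up for illustration. The first function behaves like traditional software and always gives the same answer. The second picks a "next word" by weighted chance, which is roughly the flavor of what a generative tool does at each step.

```python
import random


def add_tax(price_cents: int) -> int:
    """Traditional software: the same input always produces the same output."""
    # Hypothetical flat 8% tax, computed with integer arithmetic.
    return price_cents + price_cents * 8 // 100


# Hypothetical next-word probabilities, invented for illustration only.
NEXT_WORD_PROBS = {"mat": 0.6, "sofa": 0.3, "moon": 0.1}


def pick_next_word(rng: random.Random) -> str:
    """Generative-style step: sample one word according to its probability."""
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    # The deterministic function gives the identical result on every call.
    print([add_tax(1000) for _ in range(3)])
    # The sampled words can differ from run to run.
    rng = random.Random()
    print([pick_next_word(rng) for _ in range(5)])
```

Run it a few times: the first line of output never changes, while the second usually does. Likely words show up most often, but an unlikely one can still appear.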
Given what I’ve seen on social media, I’m pretty sure I am not alone. When we see something that shouldn’t be possible, seems human, and is developing at a fast pace, it’s easy to go around in circles. AGI is going to be here! We are all going to lose our jobs! But wait - this thing can’t even do simple math! And it lies! Never mind, it’s useless!
So much confusion.
I will cover each of the above topics individually. But to start, here are two videos that I found immensely helpful in understanding how LLMs work. Even a basic grasp of how they work has cleared up a lot of the mysticism and magical thinking that surround tools such as ChatGPT. I hope it does the same for you.
P.S. For those who are in a more technical mood: What Is ChatGPT Doing… and Why Does It Work?