Date: 2024-12-11 08:32:21

Imagine you're a detective trying to crack a tough case, and your memory is like your trusty notebook. The bigger your notebook, the more clues you can keep track of at once.

For GPT-3, with its 2048-token limit, think of it as a detective with a small notebook. This detective can only juggle about 1500 words at a time. It's like having a memory that quickly forgets older details as new ones come in!
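The "about 1500 words" figure comes from a common rule of thumb: one token averages roughly 0.75 English words, or about 4 characters. A back-of-the-envelope sketch using that heuristic (the helper names are illustrative; a real tokenizer like `tiktoken` gives exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token heuristic."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_limit: int = 2048) -> bool:
    """Check whether a piece of text likely fits in the model's notebook."""
    return estimate_tokens(text) <= context_limit

# 2048 tokens * ~0.75 words per token ≈ 1536 words
print(2048 * 0.75)  # -> 1536.0
```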

Now, GPT-4 is like a super-detective with a giant whiteboard, handling anywhere from 8192 to 32768 tokens depending on the variant. This detective can manage entire case files, suspect lists, and witness statements all at once. It's like having a far roomier memory, though still a finite one.

Here's where it gets interesting with RAG. RAG isn't about looking forward and backward within the model's context window at all. Instead, it's like giving our detective a magical library card.

When using RAG, imagine our detective (the LLM) gets a new case (your question). Instead of relying only on their memory (context window), they rush to the library (an external database) and quickly grab relevant books (documents) that might help solve the case.

The detective then reads these books and combines this new information with what they already know to crack the mystery. The size of the context window determines how much of this combined information (from memory and the library) the detective can handle at once.
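The retrieve-then-combine flow above can be sketched in a few lines. This is a minimal illustration, not a specific library's API: a toy keyword-overlap retriever stands in for a real vector database, and all names (`library`, `retrieve`, `build_prompt`) are hypothetical.

```python
import string

# Toy "library" of documents; in practice this would be a vector database.
library = {
    "doc1": "GPT-3 has a 2048-token context window.",
    "doc2": "RAG retrieves external documents at query time.",
    "doc3": "Paris is the capital of France.",
}

def _words(text: str) -> set:
    """Lowercase and strip punctuation so overlap matching is fair."""
    return set(text.lower().translate(str.maketrans("", "", string.punctuation)).split())

def retrieve(question: str, docs: dict, k: int = 2) -> list:
    """Rank documents by word overlap with the question; return the top k."""
    return sorted(
        docs.values(),
        key=lambda d: len(_words(question) & _words(d)),
        reverse=True,
    )[:k]

def build_prompt(question: str, docs: dict) -> str:
    """Combine retrieved context with the question into a single LLM prompt."""
    context = "\n".join(retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is the GPT-3 context window?", library))
```

The assembled prompt (retrieved context plus the question) is what must fit inside the context window.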

So, in your GPT-3 example, it's not about looking 2048 tokens forward and backward. It's about how much total information (from the retrieved documents and the question itself) can fit into that 2048-token window. If the information from the documents and the question exceeds 2048 tokens, some of it will have to be left out—like our detective having to choose which clues to focus on because their notebook is full.
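That "choosing which clues fit" step can be sketched as a simple token-budget loop. The helper names are hypothetical, and the ~4-characters-per-token heuristic stands in for a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token count via the ~4-characters-per-token heuristic."""
    return max(1, len(text) // 4)

def pack_context(question: str, docs: list, limit: int = 2048) -> list:
    """Add retrieved docs (assumed sorted most-relevant first) until the
    question plus docs would exceed the context window; the rest is dropped."""
    budget = limit - estimate_tokens(question)
    kept = []
    for doc in docs:
        cost = estimate_tokens(doc)
        if cost > budget:
            break  # the notebook is full: this clue gets left out
        kept.append(doc)
        budget -= cost
    return kept
```

With a 2048-token limit, three 1000-token documents can never all fit: only the first two make it into the prompt.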

That's why larger context windows, like in GPT-4, are so exciting. They're like giving our detective a bigger whiteboard to work with more clues at once. Furthermore, RAG isn't limited to what's in the model's original training data. It can pull fresh info from its "library," making it great for up-to-date facts. It's like our detective having access to a constantly updated criminal database.

So next time you're using an AI with RAG, picture a super-smart detective racing through a magical library, piecing together clues to answer your questions. The bigger their memory, the more complex the mystery they can solve.

Posted by: HealeyWong