The token limit in ChatGPT refers to the maximum number of tokens that can be processed in a single interaction or conversation with the model. Tokens are chunks of text that can be as short as a single character or as long as a word, depending on the language and encoding used.
In language models such as GPT-3 and the models behind ChatGPT, input text is converted into tokens, and these tokens are what the model processes during inference. A word, a punctuation mark, or even a space can be represented by one or more tokens.
Token limits can affect the output response
The token limit is important because GPT-3 has a maximum capacity for how many tokens it can handle in a single computation. If an interaction’s total tokens, including both input and output, exceed this limit, you will need to truncate, omit, or otherwise reduce the text until it fits within the constraint.
For example, if the token limit is 4096 and your input message contains 50 tokens, the model has at most 4046 tokens left to generate the response. Longer conversations require more careful management of token usage, because the earlier messages sent along with each request count toward the limit as well.
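The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names remaining_tokens and trim_history are hypothetical, the token counts are supplied by hand rather than measured with a real tokenizer, and dropping the oldest messages first is just one possible trimming policy.

```python
def remaining_tokens(token_limit: int, prompt_tokens: int) -> int:
    """Tokens left for the response once the prompt has been counted."""
    return max(token_limit - prompt_tokens, 0)


def trim_history(message_token_counts, token_limit, reserved_for_reply):
    """Drop the oldest messages until the rest fit next to the reply budget.

    message_token_counts: token count of each message, oldest first
                          (assumed to be known already).
    Returns the index of the first message to keep.
    """
    budget = token_limit - reserved_for_reply
    total = sum(message_token_counts)
    start = 0
    while start < len(message_token_counts) and total > budget:
        total -= message_token_counts[start]
        start += 1
    return start


print(remaining_tokens(4096, 50))  # 4046, matching the example in the text

# Four messages totalling 4400 tokens, with 1000 tokens reserved for the reply:
# the two oldest messages have to go, so the first message kept is index 2.
print(trim_history([1200, 1500, 900, 800], 4096, 1000))  # 2
```

Reserving part of the limit for the reply is a design choice: without it, a prompt that just barely fits leaves the model almost no room to answer.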
We’ve run into this very issue while integrating AI-powered chat functionality into Essentia AI. When you query or generate content from thousands of pages across multiple documents, you quickly hit the maximum token limit. Unless you can narrow the focus of the content from those pages, you often get an incorrect or incomplete answer to a request, or no answer at all.
Smaller chunks combined with Semantic Search
To overcome this, we’ve developed a methodology that optimizes results by combining Essentia AI’s already powerful OCR and multi-document full-text search capabilities with a semantic database. This allows us to split documents into smaller chunks and store them in the database. Whenever you want to generate an answer based on the documents you own, Essentia AI finds the relevant chunks from those documents and feeds them to the language model instead of whole pages. You can also apply Essentia’s filtering capabilities before a query to make it even more precise.
We’ve found this method to be highly effective at getting relevant and accurate answers to questions posed to GPT from our own business documents. Stay tuned for more updates as we continue to experiment with the capabilities of combining Essentia AI and ChatGPT. If you are interested in finding out more, you can contact us at info@auriq.com and request a demo.