Hello there! Thank you for taking the time to learn more about how Answerly uses tokens.
My goal is to give you a clear understanding of how we consume tokens, so you can better manage your usage.
What uses the most tokens?
First, it’s important to point out that the biggest consumer of tokens is maintaining the conversation history with the Agent.
As your conversation grows, every message you send to an Agent uses more and more tokens, until the conversation reaches the maximum number of tokens allowed by your selected model.
Token categories
There are three categories of token usage:
- Conversational
- Functional
- Embeddings
What are Conversational tokens?
This category encompasses the tokens that are consumed when you chat with an Agent.
For each message you send to an Agent, approximately 6,000 tokens are allocated for datasets, personality, your business info, and other features such as language and behavior.
However, don’t worry: ‘allocated’ does not mean ‘spent’. The actual amount you spend depends entirely on the attributes of your specific conversation.
These include the size of your business info (up to 1,024 tokens), your personality (up to 1,024 tokens), and the diversity of your dataset.
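If you like to see the math, here is a rough sketch (in Python) of how a single message’s spend could add up. The 6,000-token allocation and the 1,024-token caps come from the figures above; the dataset, history, and example numbers are placeholders, not Answerly internals.

```python
# Illustrative estimate of conversational token spend per message.
# The ~6,000-token allocation and the 1,024-token caps come from this article;
# the dataset, "other features", and history figures below are placeholders.

FEATURE_ALLOCATION = 6000    # tokens allocated per message for features (from article)
BUSINESS_INFO_CAP = 1024     # maximum tokens for business info (from article)
PERSONALITY_CAP = 1024       # maximum tokens for personality (from article)

def estimate_message_tokens(history_tokens: int,
                            business_info_tokens: int,
                            personality_tokens: int,
                            dataset_tokens: int,
                            other_feature_tokens: int = 0) -> int:
    """Estimate the token spend for a single message.

    Feature tokens (datasets, personality, business info, etc.) fit inside
    the per-message allocation; conversation history sits on top of that
    and keeps growing as the chat gets longer.
    """
    feature_tokens = min(
        min(business_info_tokens, BUSINESS_INFO_CAP)
        + min(personality_tokens, PERSONALITY_CAP)
        + dataset_tokens
        + other_feature_tokens,
        FEATURE_ALLOCATION,
    )
    return history_tokens + feature_tokens

# Example: a short conversation with modest business info and personality.
print(estimate_message_tokens(history_tokens=800,
                              business_info_tokens=400,
                              personality_tokens=300,
                              dataset_tokens=1500))  # -> 3000
```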
What are Functional tokens?
Functional tokens power Answerly features like Human Takeover and Quality Control. They always use a low-cost model (such as GPT-3.5), regardless of the model you choose in the Large Language Model (LLM) options.
- Human Takeover spends around 256 tokens per visitor query.
- Quality Control options, like hallucination and unrelated-conversation detection, each spend 1,024 tokens per request when activated.
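Here is a quick back-of-the-envelope sketch of how that spend could add up over a batch of visitor queries. The per-request figures come from the list above; the query count and the number of activated options in the example are purely illustrative.

```python
# Back-of-the-envelope arithmetic for Functional token spend, using the
# per-request figures listed above. The query count and the number of
# activated Quality Control options in the example are assumptions.

HUMAN_TAKEOVER_PER_QUERY = 256       # tokens per visitor query (from article)
QUALITY_CONTROL_PER_REQUEST = 1024   # tokens per request, per activated option (from article)

def estimate_functional_tokens(visitor_queries: int,
                               human_takeover: bool = True,
                               quality_control_options: int = 0) -> int:
    """Estimate Functional token spend across a number of visitor queries."""
    total = 0
    if human_takeover:
        total += HUMAN_TAKEOVER_PER_QUERY * visitor_queries
    # Each activated Quality Control option (e.g. hallucination detection,
    # unrelated-conversation detection) spends 1,024 tokens per request.
    total += QUALITY_CONTROL_PER_REQUEST * quality_control_options * visitor_queries
    return total

# Example: 1,000 visitor queries with Human Takeover on and both
# Quality Control options activated.
print(estimate_functional_tokens(1000, human_takeover=True, quality_control_options=2))
# -> 256 * 1,000 + 1,024 * 2 * 1,000 = 2,304,000 tokens
```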
What are Embedding tokens?
Embedding tokens are used for training your Agent, and they are very cost-effective. For example, if you were to train an Agent on a standard knowledge base (around 30 pages, or 100,000 words), the cost would be roughly 0.71 cents.
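If you want a rough idea of what a larger knowledge base might cost, here is a simple sketch that scales the figure above. The linear scaling and the 200,000-word example are assumptions for illustration, not Answerly pricing.

```python
# A simple scaling of the Embedding cost figure quoted above: roughly
# 0.71 cents for about 100,000 words. The linear scaling and the
# 200,000-word example are assumptions, not Answerly pricing.

REFERENCE_WORDS = 100_000       # knowledge base size from the example above
REFERENCE_COST_CENTS = 0.71     # approximate cost from the example above

def estimate_training_cost_cents(words: int) -> float:
    """Scale the reference figure linearly to a different knowledge base size."""
    return REFERENCE_COST_CENTS * words / REFERENCE_WORDS

print(round(estimate_training_cost_cents(200_000), 2))  # -> 1.42 cents
```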
I hope this gives you a better understanding of how Answerly uses tokens.