2023-08-28

Context Size of LLM and Vector Database

Introduction

Large Language Models (LLMs) have been gaining a lot of attention in recent years. One persistent challenge of LLMs is their tendency to generate plausible-sounding but inaccurate information, known as "hallucination."

One approach to mitigating hallucination is to expand the context size, that is, the amount of text, measured in tokens, that the model can process in a single request. Companies like Anthropic and OpenAI are conducting research to increase the amount of context that can be provided to LLMs.
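As a rough illustration of what "context size" means in practice, the snippet below counts tokens with OpenAI's tiktoken tokenizer. This is only a sketch; the exact tokenizer and the context limit depend on the model in question.

    import tiktoken

    # cl100k_base is the encoding used by GPT-3.5/GPT-4 era models.
    enc = tiktoken.get_encoding("cl100k_base")

    text = "Large Language Models process text as tokens, not characters."
    tokens = enc.encode(text)

    # Whether a prompt fits depends on the model's context window
    # (anywhere from a few thousand to hundreds of thousands of tokens).
    print(f"{len(tokens)} tokens")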

Drawbacks of Excessive Context

Expanding the context size allows the model to process more information at once. Theoretically, this could enable the model to tackle more complex problems.

However, increasing the context size comes with challenges such as:

  • Decreased Quality of Responses
    A larger context means the model must sift through more information, and this surplus can impair its ability to extract what is actually relevant. In particular, the model is more easily distracted by irrelevant passages, which raises the risk of inaccurate answers and thus of hallucination.

  • Increased Computation and Costs
    A larger context also means more computation. Since LLM providers typically charge per token, the cost per query grows with the number of tokens processed: more tokens require more resources and incur higher costs. A rough cost comparison is sketched after this list.
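To make the cost point concrete, here is a back-of-the-envelope comparison. The per-token price and the token counts below are hypothetical placeholders, not any provider's actual rates:

    # Hypothetical USD price per 1,000 prompt tokens
    # (real rates vary by provider and model).
    PRICE_PER_1K_TOKENS = 0.01

    def query_cost(prompt_tokens: int) -> float:
        """Cost of a single query given its prompt token count."""
        return prompt_tokens / 1000 * PRICE_PER_1K_TOKENS

    full_context = 32_000      # stuffing entire documents into the prompt
    retrieved_context = 2_000  # only the relevant, retrieved chunks

    print(f"full context:      ${query_cost(full_context):.3f} per query")      # $0.320
    print(f"retrieved context: ${query_cost(retrieved_context):.3f} per query") # $0.020

Under these assumed numbers, trimming the prompt to the relevant chunks cuts the per-query cost by a factor of sixteen; the exact savings scale with how much irrelevant text the retrieval step removes.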

Introducing the Vector Database

As an alternative way of providing context to LLMs, the Vector Database is gaining attention. Documents are stored as embedding vectors, and at query time only the chunks most semantically similar to the question are retrieved and placed in the prompt. Filtering out unnecessary information this way improves the value of each token and the overall accuracy and efficiency of the model; a minimal retrieval sketch follows.
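The sketch below shows the retrieval step in miniature. The bag-of-words embed function is a toy stand-in for a learned embedding model, and the in-memory list stands in for a real vector database such as Pinecone:

    import numpy as np

    VOCAB = ["context", "token", "cost", "vector", "database",
             "hallucination", "retrieval", "similarity"]

    def embed(text: str) -> np.ndarray:
        """Toy embedding: term counts over a fixed vocabulary (illustration only)."""
        words = text.lower().split()
        return np.array([words.count(w) for w in VOCAB], dtype=float)

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom else 0.0

    # "Indexing": store each chunk alongside its embedding
    # (this is the vector database's job in a real system).
    chunks = [
        "larger context windows raise the cost per token",
        "a vector database stores embeddings for similarity search",
        "hallucination means the model invents facts",
    ]
    index = [(chunk, embed(chunk)) for chunk in chunks]

    # At query time, retrieve only the top-k chunks most similar to the query...
    query = "why use a vector database for context"
    q = embed(query)
    top_k = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)[:1]

    # ...and build the prompt from those chunks alone, not from every document.
    prompt = "Context:\n" + "\n".join(c for c, _ in top_k) + f"\n\nQuestion: {query}"
    print(prompt)

In a production system, the embeddings would come from a trained embedding model and the similarity search would run inside the vector database itself; the structure of the pipeline, embed, retrieve top-k, then prompt, stays the same.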

According to a Pinecone article, by including only relevant information in the context, they were able to maintain 95% accuracy while cutting token usage to roughly a quarter of what processing the entire documents would require, which translates to about a 75% reduction in operational costs.

References

https://www.pinecone.io/blog/why-use-retrieval-instead-of-larger-context/
