
This New Embedding Model Cuts Vector DB Costs by ~200x!

It also outperforms OpenAI and Cohere models.

RAG is 80% retrieval and 20% generation.

So if RAG isn’t working, it’s most likely a retrieval issue, which in turn usually traces back to chunking and embedding.

Contextualized chunk embedding models solve this.


Contextualized chunk embedding (Image by Author)

In this article, let’s dive into what they are and how they address common issues in RAG setups.

The problem

In RAG:


Chunking in RAG (Image by Author)
  • No chunking drives up token costs
  • Large chunks lose fine-grained context
  • Small chunks lose global/neighbourhood context

In practice, chunking also involves choosing the chunk overlap, generating summaries, and so on, all of which is tedious.
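
To make the tradeoffs concrete, here’s a minimal sketch of the usual fixed-size chunking with overlap. The `chunk_size` and `overlap` values are illustrative, not from any specific setup:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into fixed-size character windows that overlap,
    so context at chunk boundaries isn't lost entirely."""
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]
```

Even this simple version forces a choice: a larger `chunk_size` blurs fine-grained details, a smaller one discards surrounding context, and `overlap` only softens the boundary problem rather than solving it.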

There’s another problem!

Despite all the tuning and tradeoff balancing, the final chunk embeddings are generated independently, with no interaction between them.

This doesn’t reflect real-world docs, which have long-range dependencies across chunks.
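
To see the issue, here’s a sketch using sentence-transformers as a stand-in for any off-the-shelf embedding model (the model name and example chunks are illustrative):

```python
from sentence_transformers import SentenceTransformer

# Stand-in for any standard embedding model (hypothetical choice).
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "The company reported revenue of $10M in Q3.",
    "This was a 20% increase over the previous quarter.",
]

# Each chunk is encoded in isolation: the vector for chunks[1] is identical
# whether or not chunks[0] is present, so the pronoun "This" loses its referent.
embeddings = model.encode(chunks)
print(embeddings.shape)  # (2, 384) for this model
```

A query like “Q3 revenue growth” should match the second chunk, but its embedding carries no trace of what “This” refers to, because the model never saw the neighbouring chunk.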