PageIndex: vector-free RAG that reasons instead of searching

Deep Tech 🟢 Beginner ⏱️ 12 min read 📅 2026-05-07

🔎 Traditional RAG just started showing its age

Retrieval-Augmented Generation has dominated the AI landscape for three years. The recipe seemed set in stone: slice a document into chunks, transform them into embeddings, store everything in a vector database, then query using cosine similarity.

The problem? This approach is hitting its limits. Chunks break the logic of documents. Vector similarity confuses semantic proximity with actual relevance. And when the context requires a global understanding of a 200-page report, the system drowns.

PageIndex, released by VectifyAI on GitHub in May 2026, proposes a radical paradigm shift. Instead of searching for nearby vectors, the system builds a hierarchical index of your documents and lets the LLM reason over it. Result: 98.7% accuracy on FinanceBench, a financial benchmark known for being ruthless.

953 stars in a single day on GitHub. The message from the community is clear: vector RAG was never an end in itself, it was a stopgap.


The essentials

  • PageIndex completely eliminates vector databases and artificial chunking. It builds a tree-based index (like a table of contents) of your documents.
  • Retrieval is done via LLM reasoning through a tree search inspired by AlphaGo, not by cosine similarity.
  • The system achieves 98.7% on FinanceBench, establishing a new state-of-the-art on this benchmark.
  • A chat version (chat.pageindex.ai), an API, an MCP server, and a file system scalable to millions of documents are already available.
  • The "human-like" approach guarantees complete traceability: every answer references the exact page and section of the source document.

| Tool | Main usage | Price (May 2026, check on github.com) | Ideal for |
|---|---|---|---|
| PageIndex (open-source) | Vectorless RAG on your own documents | Free (MIT) | Developers who want to control their RAG pipeline |
| PageIndex Chat | Ready-to-use conversational interface | Free (with limits) | Quick tests, demos, non-technical users |
| PageIndex API | Integration into third-party applications | Free (self-hosted) / SaaS via pageindex.ai | Production deployments requiring a REST API |
| PageIndex MCP | Connection to AI agents (Claude, etc.) | Free (self-hosted) | Agentic workflows with document access |
| PageIndex File System | Scaling to millions of documents | Free (self-hosted) | Enterprises with large document volumes |

The fundamental problem of vector RAG

Classic RAG relies on a fragile assumption: two texts that are semantically close in the embedding space are relevant to each other. This assumption holds true for simple factual questions. It falls apart as soon as analytical questions are asked.

Take an annual financial report. Question: "What is the foreign exchange hedging strategy described by management, and how does it compare to the risks identified on page 47?"

A vector RAG will search for individual chunks containing words like "exchange", "hedging", "risk". It will likely return disconnected pieces. The answer will be approximate, or even wrong, because the system never understood the argumentative structure of the document.
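To make the scoring mechanism concrete, here is a minimal sketch of cosine-similarity retrieval. It uses a toy bag-of-words vector as a stand-in for a learned embedding, and the two chunks are invented for illustration. A real embedding model would handle synonyms better than this, so read it as a picture of the mechanism rather than a measurement.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Lowercased bag-of-words vector (toy stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


question = "foreign exchange hedging strategy"

# Hypothetical chunks: the glossary repeats the question's words,
# the management passage actually describes what the company does.
chunks = {
    "glossary": "hedging strategy glossary foreign exchange hedging strategy defined",
    "management": "management uses forward contracts to limit currency exposure",
}

scores = {name: cosine(bow(question), bow(text)) for name, text in chunks.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("retrieved:", best)
```

The glossary chunk wins because it shares vocabulary with the question, while the passage that actually describes the hedging practice scores zero. Proximity and relevance are two different things.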

Chunking makes this problem worse. Cutting text every 512 tokens is a blind operation. It can separate a premise from its conclusion, split a table in two, or isolate a reference from its context. It's like reading a book by picking paragraphs at random.
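A small sketch shows how blind fixed-size splitting is. Words stand in for tokens here, and the sentence and the chunk size of 9 are made up so that the cut lands in a telling place.

```python
def chunk_by_words(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of `size` words (a stand-in for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Hypothetical sentence: a premise followed by its conclusion.
report = (
    "Because 60% of revenue is billed in US dollars, "
    "management hedges currency exposure with forward contracts."
)

chunks = chunk_by_words(report, size=9)
for i, chunk in enumerate(chunks):
    print(i, repr(chunk))
```

The first chunk holds the premise and the second holds the conclusion. Retrieved separately, neither one carries the reasoning that links them.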

To understand these limitations in depth, our article on RAG for dummies details exactly how contextual memory works — and how it breaks when chunks are poorly calibrated.


How PageIndex reasons instead of searching

PageIndex completely inverts the logic. Rather than projecting text fragments into a vector space and then searching for the closest ones, it builds a structured representation of the entire document.

The hierarchical tree index

The system analyzes each document and generates a tree index, comparable to a detailed table of contents. Each node of the tree represents a section, with its metadata: page number, title, subtitles, content summary.

Concretely, for a 150-page PDF, PageIndex does not create 300 chunks. It creates a mind map of the document, with thematic branches and leaves corresponding to the actual sections. No artificial slicing.
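One way to picture such an index is a small recursive data structure. The sketch below is an assumption for illustration: the `Node` class, its field names, and the sample report are hypothetical and do not reflect PageIndex's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of the document (hypothetical schema, not PageIndex's own)."""
    title: str
    start_page: int
    end_page: int
    summary: str
    children: list[Node] = field(default_factory=list)


def outline(node: Node, depth: int = 0) -> list[str]:
    """Render the tree as an indented table of contents."""
    lines = [f"{'  ' * depth}{node.title} (p. {node.start_page}-{node.end_page})"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Invented 150-page report with two chapters and one subsection each.
root = Node("Annual report", 1, 150, "Full-year report", [
    Node("Risk factors", 40, 60, "Market, credit and currency risks", [
        Node("Foreign exchange risk", 47, 49, "Exposure to currency fluctuations"),
    ]),
    Node("Management discussion", 61, 100, "Strategy and results", [
        Node("Hedging strategy", 72, 75, "Use of forward contracts"),
    ]),
])

lines = outline(root)
print("\n".join(lines))
```

Every node keeps its page range and a summary, so the sections stay whole and their position in the document is never lost.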

The tree search inspired by AlphaGo

This is where the connection to AlphaGo becomes concrete. When a question arrives, PageIndex does not do a simple lookup. It launches a tree search reasoning process where the LLM explores the branches of the index, evaluates the relevance of each section, and progressively descends toward the most relevant passages.

The LLM reads the high-level summaries, decides which branch to explore, reads the subsections, refines, and finally lands on the exact content. This is how a human flips through a book: you read the table of contents, identify the promising chapter, and go to the right page.

The result is context-aware retrieval. The system understands not only where to search, but why that section is relevant to the question asked.
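The descent can be sketched as a loop with one decision per level of the tree. Two loud assumptions apply. First, `pick_child` is a stub that counts shared words, where a real system would prompt an LLM with the question and the child summaries. Second, this is a simple greedy walk, not the AlphaGo-style search the project describes. The `Node` class and the sample tree are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical index node: a section title, its page, and a summary."""
    title: str
    page: int
    summary: str
    children: list[Node] = field(default_factory=list)


def pick_child(question: str, children: list[Node]) -> Node:
    """Stub for the LLM's decision: choose the child whose summary shares
    the most words with the question. A real system would call a model here."""
    q_words = set(question.lower().split())
    return max(children, key=lambda c: len(q_words & set(c.summary.lower().split())))


def descend(root: Node, question: str) -> list[Node]:
    """Greedy top-down walk: one decision per level until a leaf is reached."""
    path = [root]
    while path[-1].children:
        path.append(pick_child(question, path[-1].children))
    return path


root = Node("Annual report", 1, "full report", [
    Node("Risk factors", 40, "market credit and currency risk exposure", [
        Node("Foreign exchange risk", 47, "exposure to currency fluctuations"),
        Node("Credit risk", 50, "counterparty default exposure"),
    ]),
    Node("Management discussion", 61, "strategy results and hedging policy", [
        Node("Hedging strategy", 72, "hedging strategy with forward contracts"),
        Node("Outlook", 90, "guidance for next year"),
    ]),
])

question = "what is the hedging strategy"
path = descend(root, question)
print(" > ".join(node.title for node in path))
```

Only the summaries along the chosen branch are ever read, and the returned path records why the final section was selected.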


Benchmarks: 98.7% on FinanceBench

FinanceBench is considered one of the most demanding benchmarks for RAG systems. It contains complex financial questions requiring the cross-referencing of information from different annual reports.

PageIndex's score of 98.7% is not a marginal improvement. It is a leap that changes the very nature of the problem. The best vector RAG systems plateau around 80-85% on this benchmark, and that is by aggressively optimizing chunk size, the embedding model, and the reranking strategy.

PageIndex reaches this score without any of these optimization levers. No chunk size to tune, no embedding model to select, no reranker to add. The architectural simplicity is striking.

The key factor is reasoning. On FinanceBench, questions often require following a logical thread through several sections of the same document. A chunk retriever cannot do this. A tree reasoner does it naturally.


Use cases where PageIndex outperforms traditional RAG

Long and structured documents

Annual reports, legal contracts, technical documentation, scientific papers. Anything with a strong internal structure (chapters, sections, appendices) is a natural candidate for PageIndex. Vector RAG destroys this structure; PageIndex exploits it.

Analytical and multi-step questions

"Compare the Asian expansion strategy described in 2024 with the actual results presented in 2025." This type of question requires locating two different sections, understanding them in their context, and then synthesizing. Vector RAG fails structurally. PageIndex excels.

Regulatory traceability

In regulated sectors (finance, healthcare), every answer must be traceable to a precise source. PageIndex systematically references the original page and section. This is not an add-on; it is inherent to the method: reasoning goes through the index, so traceability comes for free.
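A minimal sketch of why the reference comes for free: if retrieval is a walk through the index, the visited sections already form the citation. The `cite` helper and the sample path are hypothetical, not part of PageIndex's API.

```python
def cite(path: list[tuple[str, int]]) -> str:
    """Turn the visited (section title, page) pairs into a source reference."""
    trail = " > ".join(title for title, _ in path)
    _, page = path[-1]
    return f"{trail}, p. {page}"


# Hypothetical path produced by a walk through the index.
path = [
    ("Annual report", 1),
    ("Management discussion", 61),
    ("Hedging strategy", 72),
]

reference = cite(path)
print(reference)
```

No separate attribution step is needed, since the citation is a by-product of how the passage was found.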

Documents with tables and figures

Vector embeddings handle tables and figures poorly, which lose their meaning when isolated from their captions or narrative context. PageIndex preserves the integrity of these elements because it does not chunk.

If you are hesitating between this approach and other model enrichment methods, our comparison of fine-tuning vs RAG vs prompting helps to position each technique according to the context.


The PageIndex ecosystem: much more than a GitHub repo

PageIndex Chat for testing immediately

The team launched chat.pageindex.ai, a conversational interface where anyone can upload documents and interact with them via vectorless RAG. It's the fastest way to realize the qualitative difference compared to a classic RAG chat.

The experience is surprising. The answers are more precise, more grounded in the document, and above all, the references are exact. No more "page 12" pointing to irrelevant content.

The API and the MCP server

For integrations, PageIndex exposes an API via pageindex.ai/developer. But even more interesting: the MCP server allows you to connect PageIndex directly to AI agents like Claude Desktop or any agent based on the Model Context Protocol.

A Claude agent can thus query your corporate documentation via PageIndex, with the guarantee that the retrieval is based on reasoning and not on vector similarity. It's a considerable reliability gain for agentic workflows.

PageIndex File System for scaling

For organizations that have millions of documents, PageIndex File System offers a scalable architecture. The hierarchical index lends itself naturally to distribution: each document has its tree, and the system can manage a forest of trees without increasing complexity.

The agentic vectorless RAG example

VectifyAI published an example using the OpenAI Agents SDK that combines PageIndex with an autonomous agent. The agent can iterate on its search, refine its understanding, and ask follow-up questions — all relying on PageIndex's tree reasoning for each retrieval step.

It's an architecture that is starting to look like a genuinely intelligent document research assistant, not a search engine wrapped in a chatbot.


Vectorless vs Vector-based: understanding the conceptual difference

The distinction is not merely technical. It is epistemological.

Vector RAG assumes that meaning can be captured in a vector. That geometric proximity in a high-dimensional space reflects semantic relevance. This postulate works for classification and recommendation. It shows its limits for document reasoning.

PageIndex assumes that the meaning of a document lies in its structure and in an LLM's ability to navigate this structure. The document is not a collection of fragments to be projected. It is a logical tree to be explored.

The library metaphor is illuminating. Vector RAG is photocopying all the pages of all the books, throwing them in the air, and picking up the sheets that look like your question. PageIndex is using the library catalog, finding the right aisle, the right book, the right page — exactly like a librarian.

Of course, this approach has a cost: each query requires several reasoning steps (the tree search), which consumes more tokens than a single embedding lookup. The lower latency of recent models offsets part of that overhead, and for critical use cases the quality of the answers usually matters more than the cost per query.

For projects where budget is a blocking factor, our guide on free models without sacrificing quality shows how to optimize inference costs without losing performance.


The current limitations of PageIndex

Honesty requires pointing out the weaknesses. First, the dependency on the reasoning model. PageIndex delegates the tree search to the LLM. If the LLM is weak (small model, bad prompt), the index is useless. A vector RAG system with a good embedding can work with a lightweight reranking model. PageIndex requires a robust reasoning model.

Next, the index build time. For a complex document, generating the hierarchical tree takes more time than splitting into chunks and generating embeddings. It is a heavier initial investment, partially offset by the fact that there is no chunk size to re-optimize later.

Finally, unstructured documents. If you have a bunch of informal notes, email threads, or structureless transcripts, the hierarchical index of PageIndex provides less value. The system is designed for documents with an internal organization. On flat content, the advantage over vector RAG is reduced.


❌ Common mistakes

Mistake 1: Confusing PageIndex with a simple reranker

A reranker takes the results of a vector retriever and reorders them. PageIndex does not rerank anything: it entirely replaces the retrieval step. There are no vectors, no cosine similarity, no candidates to rerank. Applying a reranker on top of PageIndex makes no sense.

Mistake 2: Using PageIndex on unstructured documents

If your data consists of logs, tweets, or text fragments without internal organization, the tree index won't bring much to the table. Vector RAG remains suited for these cases. PageIndex shines on documents that have a table of contents, chapters, sections—in short, a textual architecture.

Mistake 3: Underestimating the cost of reasoning per query

Each query with PageIndex triggers a tree search which can involve multiple LLM calls. At a high query volume, the token bill can be significant. You need to size accordingly and not replace an optimized vector system with PageIndex without calculating the cost per query.
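A back-of-the-envelope calculation is enough to size this. Every number below is a placeholder chosen for illustration (calls per query, tokens per call, price, monthly volume), not a measured PageIndex figure or any provider's real price. Plug in your own values.

```python
def cost_per_query(llm_calls: int, tokens_per_call: int,
                   price_per_million_tokens: float) -> float:
    """Estimated cost of one query: calls x tokens x unit price."""
    return llm_calls * tokens_per_call * price_per_million_tokens / 1_000_000


# Placeholder assumptions, to be replaced with your own measurements.
per_query = cost_per_query(
    llm_calls=4,                   # e.g. one call per level of the tree
    tokens_per_call=2_000,         # summaries read plus the model's answer
    price_per_million_tokens=3.0,  # in your currency
)
monthly = per_query * 100_000      # assumed monthly query volume

print(f"per query: {per_query:.3f}")
print(f"per month: {monthly:,.2f}")
```

The result scales linearly with tree depth and query volume, so a deeper index or a busier service multiplies the bill accordingly.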

Mistake 4: Ignoring the underlying reasoning model

PageIndex is not magic. If the LLM performing the tree search is unable to understand the structure of your documents (highly technical jargon, underrepresented languages), performance will drop. The choice of the reasoning model is a critical parameter, not a detail.


❓ Frequently Asked Questions

Does PageIndex completely replace vector databases?

No. For pure similarity search (finding documents "like this one"), vector databases remain relevant. PageIndex replaces vector similarity-based RAG for analytical questions on structured documents.

Which LLM should be used with PageIndex?

A model with good reasoning capabilities. Benchmarks were run with state-of-the-art models. Small open-source models can work but with a drop in quality proportional to their reasoning capacity.

Does PageIndex handle scanned PDFs?

Not directly. Like any RAG system, it first requires an OCR step. The tree index is built on structured text, not on page images.

Can PageIndex be combined with fine-tuning?

Yes, and it is even relevant. Fine-tuning can improve the model's reasoning for a specific domain, while PageIndex handles the retrieval. The two approaches are orthogonal.

Is the 98.7% score on FinanceBench reproducible?

The score comes from VectifyAI's official GitHub repository (May 2026). Reproducibility depends on the reasoning model used and the exact configuration. It needs to be verified on your own infrastructure.


✅ Conclusion

PageIndex does not offer a better version of vector RAG; it proposes abandoning it. By replacing cosine similarity with tree reasoning on a hierarchical index, it solves the structural problems that have plagued classic RAG from the very beginning. The 98.7% on FinanceBench and the 29,284 stars on GitHub confirm that the shift from "searching" to "reasoning" is not a gimmick but a paradigm shift. Vector RAG had a five-year head start; it just lost three of them in one go.