Hacker News | yingfeng's comments

RAGFlow v0.17.0 now enables Agentic Reasoning for Deep Research, integrating any LLM — no RLM dependency required.


Because ColBERT is not an end-to-end solution. RAGatouille, for example, has integrated ColBERTv2 into its repo, but it's not a database. We implement the tensor data type within Infinity, aiming to provide an end-to-end solution for late-interaction-based ranking models.


ParadeDB can also deliver three-way hybrid search through pg_vector, pg_sparse, and pg_search. Compared with ParadeDB, Infinity has the following advantages:

1. Performance

pg_vector's vector search is far slower than Infinity's due to its vector index design. pg_sparse is likewise slower than Infinity's sparse vector search. And pg_search is much slower than Infinity's full-text search: it is based on Tantivy, which is much slower than Infinity's inverted index.

Detailed benchmarks can be found in this article: https://infiniflow.org/blog/fastest-hybrid-search or in the GitHub repo.

2. Infinity has built-in implementations of all three search approaches. These indices work smoothly with Infinity's executor, so users can combine any of the search approaches with fused ranking algorithms very efficiently.

3. Infinity also has built-in support for tensors, which makes it possible to deliver an in-database ColBERT reranker, as opposed to a cross-encoder-based reranker running outside the database. The ColBERT reranker can bring significant benefits to search quality.

4. Infinity is much easier to use: it can be deployed either as a standalone server or as an embedded Python library installed via pip.

5. Infinity is designed from scratch, so it does not carry PostgreSQL's legacy burden and is evolving fast. It will run in the cloud in the very near future, which could cut costs significantly.
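The ColBERT reranking mentioned in point 3 comes down to late-interaction MaxSim scoring: for each query token vector, take its maximum similarity against the document's token vectors, then sum over query tokens. A toy sketch in plain Python with hand-picked 2-d token vectors (real deployments use model-generated embeddings and Infinity's tensor type, not this code):

```python
# Late-interaction (ColBERT-style) scoring: for each query token vector,
# take its maximum similarity over all document token vectors, then sum.
# Toy sketch with plain lists; real systems use model-generated embeddings.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Sum over query tokens of the max dot-product with any doc token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def rerank(query_vecs, candidates):
    """Reorder candidate docs (id -> token vectors) by MaxSim, descending."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs))
              for doc_id, vecs in candidates.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Tiny demo: 2-d token embeddings for a query and two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[0.9, 0.1], [0.2, 0.8]],   # aligns well with both query tokens
    "doc_b": [[0.5, 0.5]],               # partial match only
}
print(rerank(query, docs))   # doc_a scores 1.7, doc_b scores 1.0
```

Because the document-side token vectors can be precomputed and stored as a tensor column, this scoring can run inside the database at rerank time instead of in a separate service.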


Hey folks, ParadeDB co-founder here. Cool project! Just thought I'd chime in and clarify a few things:

1. pg_sparse is deprecated. pgvector released native sparse vector support with the `sparsevec` datatype, and ParadeDB no longer maintains pg_sparse. It has been this way for several months already.

2. I'd love to see a benchmark re: Tantivy. You claim that pg_search is much slower, but Tantivy is state-of-the-art for full-text search performance, and ParadeDB's performance is robust. You can see our benchmarks in our repository README, where we compare ourselves to Elasticsearch.

4/5. ParadeDB is Postgres by design. If you are adopting Postgres, which many are, ParadeDB can be installed directly as an extension, or run via logical replication on a read replica. This removes the need for ETL to a non-Postgres system, which drastically reduces operational burden.

Of course, if you're not using Postgres, ParadeDB is not designed for you and a tool like Infinity seems like a viable option alongside other standalone search engines.


Notable work demonstrating the effectiveness of hybrid search is Blended RAG from IBM Research (https://arxiv.org/abs/2404.07220), which has shown that 3-way hybrid search can achieve SOTA results across multiple evaluation datasets. We've also reproduced the results of Blended RAG, as shown in this article. Additionally, Blended RAG plus a ColBERT-based reranker can achieve even better results.

The major challenge is how to implement and manage so many indices within a single database. That's why we built this database from scratch. Infinity is actually a kind of "indexing" database built on a columnar store. The executor also requires a refined design to fuse these hybrid search approaches effectively.


Some vector databases, such as Qdrant, already include both dense and sparse vector search. But a hybrid of just these two does not handle many problems well, such as exact queries. Moreover, according to our experiments (see the article), dense + sparse vector search improves results only slightly. In addition to these two recall paths, Infinity offers BM25 as well as a ColBERT reranker, which can make the ranking quality of hybrid search much better.


Hi, I'm one of the creators of Infinity; the article discusses sparse vectors vs. BM25. While sparse vectors perform well in some evaluations, they are produced by a trained model, which means they can't fully represent all of a user's keywords/tokens; those that don't appear in the training set are truncated. This has a big impact in many enterprise vertical scenarios. BM25 doesn't have that limitation.
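The point above is easy to see from the BM25 formula itself: it scores raw corpus tokens directly, so no term can be truncated by a training vocabulary. A minimal sketch with the standard k1/b parameters (tokenization is a naive whitespace split; the corpus strings are made up for illustration):

```python
import math
from collections import Counter

# Minimal BM25: scores raw query tokens against the corpus directly,
# so terms outside any model's training vocabulary are never truncated.
K1, B = 1.2, 0.75

def bm25_scores(query, docs):
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency for each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (K1 + 1) / (
                tf[term] + K1 * (1 - B + B * len(toks) / avgdl))
        scores.append(s)
    return scores

corpus = ["infinity hybrid search engine",
          "postgres full text search",
          "sparse vector embedding model"]
print(bm25_scores("hybrid search", corpus))  # first doc scores highest
```

A production inverted index precomputes the df/tf statistics instead of rescanning the corpus, but the scoring function is the same.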


BM25 is indeed way more important than these vector DBs claim. At ParadeDB, we've observed significant use cases where customers need both.


From the viewpoint of RAG 2.0, during the indexing/pre-processing stage, approaches such as knowledge graphs are a MUST to resolve issues like multi-hop question answering, long-text question answering, and the semantic gap between questions and answers. So you can look at GraphRAG as a component of the future RAGFlow. Given the graph-based orchestration, integrating GraphRAG into RAGFlow is not difficult, and RAGFlow will support GraphRAG in the very near future.


From 0.8, RAGFlow (https://github.com/infiniflow/ragflow) will provide no-code workflow orchestration. This article describes what kind of graph orchestration engine is needed and how it can be used to implement Agentic RAG.


RRF is a simple and effective means of fusing rankings from multiple recall paths. Within our open-source RAG product RAGFlow (https://github.com/infiniflow/ragflow), Elasticsearch is currently used instead of general vector databases because it provides hybrid search today. By default, an embedding-based reranker is not required; RRF alone is enough. Even when a reranker is used, keyword-based retrieval is still a MUST to hybridize with embedding-based retrieval, which is exactly what RAGFlow's latest 0.7 release provides.
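RRF itself is tiny: each retrieval path contributes 1/(k + rank) per document, and the sums are re-sorted. A minimal sketch (k=60 is the commonly used constant; the doc IDs and ranked lists are hypothetical):

```python
# Reciprocal Rank Fusion (RRF): merge several ranked lists by summing
# 1 / (k + rank) per document, then re-sorting. k=60 is the usual constant.
def rrf_fuse(ranked_lists, k=60):
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Keyword-based and embedding-based recall lists (hypothetical doc IDs).
keyword_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d5", "d3"]
print(rrf_fuse([keyword_hits, vector_hits]))  # ['d1', 'd3', 'd5', 'd7']
```

Because RRF only looks at ranks, not raw scores, it needs no score normalization across heterogeneous recall paths, which is why it works as a sensible default before reaching for a trained reranker.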

On the other hand, let me introduce another database we developed, Infinity (https://github.com/infiniflow/infinity), which provides hybrid search. You can see its performance here (https://github.com/infiniflow/infinity/blob/main/docs/refere...); both vector search and full-text search perform much faster than other open-source alternatives.

From the next version (weeks away), Infinity will also provide more comprehensive hybrid search capabilities: the 3-way recall you mentioned (dense vector, sparse vector, keyword search) will be available within a single request.


Elasticsearch is publishing a lot of interesting posts on this topic, although with a bit of marketing, for example: https://www.elastic.co/search-labs/blog/semantic-reranking-w...


I read the paper and there are some similarities between ZenDB and RAGFlow, but also many differences.

The goal of RAGFlow is to use computer vision models to recognize the structure of a document, including diagrams and tables, and then to slice these structures into appropriate formats (e.g., table content combined with table definitions into text), which are then sent to the RAG system for retrieval and question answering.

ZenDB also makes use of computer vision models to understand documents, but mainly to understand their semantic structure, such as headings and phrases, which also involves semantics-based text clustering. ZenDB defines a query language specifically for querying these semantics, and it is pretty useful for querying and summarizing long texts.

I think some combination of RAGFlow and ZenDB for processing unstructured document data could be interesting to work on.

