I read the paper and there are some similarities between ZenDB and RAGFlow, but also many differences.

The goal of RAGFlow is to use computer vision models to recognize the structure of a document, including diagrams and tables, and then slice those structures into appropriate formats. For example, a table's contents are combined with its definition into text. The resulting chunks are then sent to the RAG system for retrieval and question answering.
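
To make the table case concrete, here is a rough sketch of the idea, not RAGFlow's actual code. The function name `table_to_chunks` and the sample table are made up for illustration, and I'm assuming the "table definition" is just a caption plus column names.

```python
def table_to_chunks(caption, header, rows):
    """Turn each table row into a standalone text chunk.

    Every chunk repeats the caption and column names (the assumed
    "table definition"), so it can be retrieved without the rest of the table.
    """
    chunks = []
    for row in rows:
        # Pair each cell with its column name, e.g. "Quarter: Q1".
        cells = "; ".join(f"{col}: {val}" for col, val in zip(header, row))
        chunks.append(f"{caption}. {cells}")
    return chunks


# Hypothetical sample data, only for illustration.
chunks = table_to_chunks(
    "Quarterly revenue",
    ["Quarter", "Revenue"],
    [["Q1", "10"], ["Q2", "12"]],
)
for chunk in chunks:
    print(chunk)
```

Each resulting string could then be embedded and indexed like any other text chunk.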

ZenDB also uses computer vision models to understand documents, but mainly to recover their semantic structure, such as headings, phrases, etc., which also involves semantic-based text clustering. It also defines a query language specifically for querying this structure. ZenDB is pretty useful for querying and summarizing long text.

I think some combination of RAGFlow and ZenDB for processing unstructured document data could be interesting to work on.