This sounds very promising, but let me ask an honest question: to me, databases seem like the hardest part to scale in your average IT infrastructure. How much load does it add to the database if you have it do all the ML-related work as well? And how much is actually saved by reducing the number of necessary queries?
Contrary to some of the sibling responses, my experience with pgvector specifically (with hundreds of millions or billions of vectors) is that the workload is quite different from your typical web-app workload, enough so that you really want them on separate databases. For example, you have to be really careful about how vacuum/autovacuum interacts with pgvector’s HNSW indices if you’re frequently updating data; you have to be aware that the tables and indices are huge and take up a ton of memory, which can have knock-on performance implications for other systems; etc.
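For anyone running into this, here's a rough sketch of the kind of per-table tuning that ends up being necessary, assuming a hypothetical items table with an embedding column (every name and value here is a placeholder to adjust for your own workload, not a recommendation):

```sql
-- Hypothetical table with a pgvector column and an HNSW index.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    embedding vector(768)
);

-- HNSW build parameters are illustrative only; both the index and the
-- table can get very large, which is where the memory pressure comes from.
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- With frequent updates, dead tuples pile up and the index keeps stale
-- entries until vacuum gets to them, so per-table autovacuum settings
-- usually need to be much more aggressive than the defaults.
ALTER TABLE items SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_vacuum_cost_delay   = 2
);
```

The point isn't the specific numbers; it's that this kind of tuning fights with whatever your OLTP tables want, which is why I'd split the databases.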
This is a read workload that can easily be scaled horizontally. The reduction in dev and infrastructure complexity is well worth the slight increase in DB provisioning.
You can use PL/Python to make API calls out of the database; you just don't need a separate service sitting in front of the DB to orchestrate all your ML work, only the model endpoints themselves.
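A minimal sketch of what that looks like, assuming plpython3u and pgvector are installed; the function name, the ml-service.internal URL, and the response shape are all made up stand-ins for your real endpoint:

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE OR REPLACE FUNCTION embed_text(input_text text)
RETURNS vector
LANGUAGE plpython3u
AS $$
import json, urllib.request

# Call the (hypothetical) external embedding service.
req = urllib.request.Request(
    "http://ml-service.internal/embed",
    data=json.dumps({"text": input_text}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=5) as resp:
    embedding = json.loads(resp.read())["embedding"]

# PL/Python passes the str() of the return value to the target type's
# input function, and pgvector accepts the '[x,y,...]' text form.
return "[" + ",".join(str(x) for x in embedding) + "]"
$$;

-- Usage: enrich rows without a separate orchestration service, e.g.
-- UPDATE items SET embedding = embed_text(body) WHERE embedding IS NULL;
```

The tradeoff to keep in mind is that the HTTP call runs synchronously inside the backend process, so a slow endpoint ties up that connection (and its transaction) for the duration.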