> I'm not sure if this is a good idea, just like pretending a network request is...

> I'm not sure if this is a good idea, just like pretending a network request is a function call

This was my first reaction, too.

Perhaps there's something about data locality that makes it good for certain use cases?

> I still prefer to clearly explicit embedding, LLM generation, etc.

The bit that I usually need to control is how the retrieved results are formatted in the prompt. In order to make the context as information dense as possible, I might strip out certain words/l and/or symbols. But it depends on the query, so it can't be done at ingestion time.