
> Its reliance on frequent random access patterns makes it incompatible with the sequential access nature of disk storage.

So use SSD? Are people seriously still trying to use spinning disk for database workloads in 2024?



An SSD does not solve the problem of page fault chasing; it just makes it slightly less bad. This is fundamentally a software architecture problem.

This is solved with latency-hiding I/O schedulers, which don’t rely on cache locality for throughput.
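
To make the latency-hiding idea concrete, here is a rough sketch (my own illustration, not any particular scheduler): keep many independent random reads in flight at once so throughput depends on queue depth rather than on locality. The helper names `read_pages_concurrently` and `make_demo_file`, the 4 KiB page size, and the thread-pool approach are all assumptions for the example; a real engine would more likely use an async I/O interface. `os.pread` is Unix-only.

```python
import os
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor

PAGE = 4096  # assumed page size for the demo


def read_pages_concurrently(path, page_ids, workers=32):
    """Read the given pages with many requests in flight at once."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # os.pread takes an explicit offset, so threads can share one fd.
            # pool.map returns results in the same order as page_ids.
            return list(pool.map(lambda p: os.pread(fd, PAGE, p * PAGE), page_ids))
    finally:
        os.close(fd)


def make_demo_file(num_pages):
    """Write a temp file whose page i is filled with the byte value i % 256."""
    f = tempfile.NamedTemporaryFile(delete=False)
    with f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE)
    return f.name


if __name__ == "__main__":
    path = make_demo_file(256)
    ids = random.sample(range(256), 64)  # random, non-sequential page order
    pages = read_pages_concurrently(path, ids)
    assert all(pg == bytes([i % 256]) * PAGE for pg, i in zip(pages, ids))
    os.unlink(path)
```

The catch is that this only works when the reads are independent of each other, i.e. the offsets are known before the I/O is issued.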


Hi, I'm the author of the article. The sequential access pattern of IVF makes prefetching and large block sequential reads much easier, whereas it's almost impossible for HNSW to achieve efficient prefetching.
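
A toy sketch of the difference, under simplifying assumptions: `ivf_read_plan` and `greedy_graph_walk` are hypothetical names, and the graph walk is a plain single-layer greedy search, not full HNSW (no layers, no candidate beam). It is not the code from the article. With IVF, every (offset, size) pair is known up front, so the reads can be large, sequential, and prefetched. With a graph, each read depends on the result of the previous one.

```python
import math


def ivf_read_plan(list_offsets, list_sizes, probed_lists):
    """IVF: every (offset, size) is known before any I/O is issued."""
    return [(list_offsets[c], list_sizes[c]) for c in probed_lists]


def greedy_graph_walk(vectors, neighbors, query, entry):
    """Graph search: each node must be read before the next one is known.

    Assumes every node has at least one neighbor.
    """
    visited = [entry]
    current = entry
    best = math.dist(vectors[current], query)
    while True:
        # On a disk-resident graph, fetching this neighbor list is a small
        # random read whose offset depends on the previous step's result.
        cand = min(neighbors[current], key=lambda n: math.dist(vectors[n], query))
        d = math.dist(vectors[cand], query)
        if d >= best:
            return visited  # no neighbor is closer: local minimum reached
        current, best = cand, d
        visited.append(current)


if __name__ == "__main__":
    # Two probed lists -> two large reads that can be issued immediately.
    print(ivf_read_plan([0, 4096, 12288], [4096, 8192, 4096], [1, 2]))

    # A 1-D chain graph: reaching node 3 takes a chain of dependent reads.
    vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(greedy_graph_walk(vectors, neighbors, [3.2], entry=0))
```

The length of the visited chain is the number of serialized round trips to storage, which is what makes prefetching hard for graph indexes.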


Even an SSD won't be fast enough for most indexes because of their random access patterns. I've seen more than 1M IOPS on a huge NVMe disk when using a DiskANN index.


Data farms are all about cost per GB. Spinning media is a fraction of the cost per GB.



