The article says that SQLite may not be a good choice for large datasets. Assuming the other constraints (concurrent writes) are not a concern, why not? Assume a 750GB-1TB dataset of time series data, which fits on an SSD — what system would work better than SQLite?
It's not that SQLite is slower; it's a question of machine resources. When you use a database cluster spanning multiple machines you are just throwing more hardware at the problem; there is no 'magic' per se going on in the other tools. There is also likely better indexing, which enables efficient querying.
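To the OP's time-series question: a single composite index often gets SQLite most of the way there. A minimal sketch using Python's built-in sqlite3 module (the schema and names here are made up for illustration, not from the article):

```python
import sqlite3

# Hypothetical time-series schema; sensor_id/ts/value are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE readings (
        sensor_id INTEGER NOT NULL,
        ts        INTEGER NOT NULL,  -- unix epoch seconds
        value     REAL    NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(s, t, float(s * t)) for s in range(3) for t in range(1000)],
)

# A composite (sensor_id, ts) index lets a range query over one sensor's
# time window touch only the relevant pages instead of scanning the table.
conn.execute("CREATE INDEX idx_readings ON readings (sensor_id, ts)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT avg(value) FROM readings "
    "WHERE sensor_id = 1 AND ts BETWEEN 100 AND 200"
).fetchall()
print(plan)  # the plan detail should mention idx_readings, i.e. an index search
```

On a real 750GB-1TB file the same plan applies: with the index, a window query reads only the pages for that sensor's range rather than the whole database.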
The whole database is in one file, though, and while 140TB may be the file size limit, searching through an index over 140TB of data will still be a lot slower. That is the case even with client/server models, and even with Hadoop.
Most people claiming to have a big data problem actually don't have one. It's just a poor understanding of SQL, coupled with NoSQL fashion, that pushes people to opt out of SQL. One more problem with SQL is that it's a career path in itself. There is a whole industry built around it: database design, administration, etc. People who find this a high barrier to entry take the easy way out and choose NoSQL-based tools, hoping they will act as a panacea, only to re-implement SQL badly at some point in their stack. But SQL has other advantages: it teaches you to think about efficient representation of data, which in turn leads to an overall better design of everything that connects to it.
For most of your everyday so-called 'big data' problems, SQLite will work like a charm. This covers most of the shops that claim to be doing big data work.
For the real big data problems, well, SQLite wasn't designed for them anyway.
> An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
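For what it's worth, the quoted 140 TB figure falls out of two hard limits: a maximum page size of 65536 bytes, and a page count that (at the time of that quote) had to fit in a signed 32-bit integer. A quick sanity check of the arithmetic (the constants are assumptions based on that quote; newer SQLite versions raised the page-count limit, which is why current docs cite a larger figure):

```python
# Assumed historic SQLite limits behind the quoted 140 TB figure.
MAX_PAGE_SIZE = 65536        # 2**16 bytes, the largest allowed page size
MAX_PAGE_COUNT = 2**31 - 1   # page numbers fit in a signed 32-bit int

max_bytes = MAX_PAGE_SIZE * MAX_PAGE_COUNT
print(max_bytes)             # just under 2**47 bytes
print(max_bytes // 10**12)   # terabytes: 140
print(max_bytes / 2**40)     # tebibytes: just under 128
```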