
If only Postgres had Virtual Generated Columns. Not being snarky; MySQL has had them for ages, and they are a perfect fit for this: a virtual column takes up essentially zero disk space, but you can still index it (the index itself is, of course, stored).
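A minimal sketch of what that looks like in MySQL (the table and column names here are hypothetical, not from the article):

  -- The VIRTUAL column occupies no row storage; its value is computed on read.
  -- Only the index built on it is materialized on disk.
  ALTER TABLE logs
    ADD COLUMN message_lower VARCHAR(255)
      GENERATED ALWAYS AS (LOWER(message)) VIRTUAL,
    ADD INDEX idx_message_lower (message_lower);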

It is, in my mind, the single biggest remaining advantage MySQL has. I used to say that MySQL’s (really, InnoDB’s) clustering index was its superpower when wielded correctly, but I’ve done some recent benchmarks, and even when designing a schema to exploit a clustered index, Postgres was able to keep up in performance.

EDIT: the other thing MySQL does much better than Postgres is “just working” for people who are neither familiar with nor wish to learn RDBMS care and feeding. Contrary to what the hyperscalers will tell you, DBs are special snowflakes, they have a million knobs to turn, and they require you to know what you’re doing to some extent. Postgres especially has the problem of table bloat and txid buildup from its MVCC implementation, combined with inadequate autovacuum. I feel like the docs should scream at you to tune your autovacuum settings on a per-table basis once you get to a certain scale (not even that big; a few hundred GB on a write-heavy table will do). MySQL does not have this problem, and will happily go years on stock settings without really needing much from you. It won’t run optimally, but it’ll run. I wouldn’t say the same about Postgres.
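For reference, a per-table autovacuum tweak is a one-liner; the table name and thresholds below are hypothetical, and the right values depend on your write pattern:

  -- Vacuum after ~1% of tuples are dead instead of the 20% default,
  -- and analyze more aggressively too.
  ALTER TABLE write_heavy_events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005
  );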



Virtual generated columns are not required to allow an index to be used in this case without incurring the cost of materializing `to_tsvector('english', message)`. Postgres supports indexing expressions, and the query planner is smart enough to identify a candidate index when the query expression exactly matches the indexed expression.

I'm not sure why the author didn't use one, but the technique is clearly pointed out in the documentation (https://www.postgresql.org/docs/current/textsearch-tables.ht...).

In other words, I believe they didn't need a `message_tsvector` column and creating an index of the form

  CREATE INDEX idx_gin_logs_message_tsvector
  ON benchmark_logs USING GIN (to_tsvector('english', message))
  WITH (fastupdate = off);
would have allowed queries of the form

  WHERE to_tsvector('english', message) @@ to_tsquery('english', 'research')
to use the `idx_gin_logs_message_tsvector` index without materializing `to_tsvector('english', message)` on disk outside of the index.

Here's a fiddle demonstrating it: https://dbfiddle.uk/aSFjXJWz


You are correct, I missed that. In MySQL, functional indices are implemented as hidden virtual generated columns (and there is no tsvector-style index type supported yet that I'm aware of), but Postgres has a more capable approach.


TIL: MySQL functional indices are implemented using virtual columns [0].

[0] https://dev.mysql.com/doc/refman/8.4/en/create-index.html#cr...


I had the same question when reading the article, why not just index the expression?



Yes (very exciting!), but you won’t be able to index them, and that’s really where they shine, IMO.

Still, I’m sure they’ll get there. Maybe they’ll also eventually get invisible columns, though tbf that’s less of a problem for Postgres than it is for MySQL, given the latter’s limited data types.


You can index arbitrary expressions, though, including indexing the same expression used to define the invisible column, right?


I hope OrioleDB will succeed in replacing Postgres' high-maintenance storage engine with something that just works.


MySQL logical replication isn’t quite foolproof but it’s vastly easier than anything PostgreSQL offers out of the box. (I hope I’m wrong!)


I think they’re about the same in complexity, other than that Postgres offers more options. MySQL did have logical replication long before Postgres, so I’ll give it that.

Postgres has one option for replication that is a godsend, though: copy_data. This lets you stand up a new replica without having to first do a dump / restore (assuming your tables are small enough / your disk is large enough, since the primary will be holding WAL during the initial sync). Tbf, MySQL doesn’t need that as much, because it offers parallel dump and restore, even for a single table.
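A minimal sketch (publication/subscription names and connection string are hypothetical): create a publication on the primary, then a subscription on the new replica with copy_data, which performs the initial table copy before streaming changes:

  -- on the primary
  CREATE PUBLICATION app_pub FOR ALL TABLES;

  -- on the new replica
  CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=primary.example.com dbname=app'
    PUBLICATION app_pub
    WITH (copy_data = true);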


I mean, technically any database with triggers can have generated columns, but PostgreSQL has had stored generated columns since version 12. Current version is 17.

https://www.postgresql.org/docs/current/ddl-generated-column...

I can’t think of any advantage of a virtual generated column over a generated column for something like a search index where calculating on read would be very slow.

Postgres has been able to create indexes based on the output of functions forever though, which does the job here too.
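For comparison, a stored generated column version of the setup discussed upthread might look like this (a sketch reusing the `benchmark_logs` / `message` names from above, not the author's exact schema):

  -- STORED materializes the tsvector in the row, so you pay for it twice:
  -- once in the heap, once in the GIN index built on it.
  ALTER TABLE benchmark_logs
    ADD COLUMN message_tsvector tsvector
      GENERATED ALWAYS AS (to_tsvector('english', message)) STORED;

  CREATE INDEX idx_gin_stored_tsvector
    ON benchmark_logs USING GIN (message_tsvector);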


The advantage is when you want to store something for ease of use, but don’t want the disk (and memory, since pages read are loaded into the buffer pool) hit. So here, you could precompute the vector and index it, while not taking the double hit on size.


That’s the same benefit you get in Postgres by creating an index on the result of a function.


That’s just syntax sugar for a trigger. Not really a big advantage.



