Ask HN: It is 2023, how do you select between MySQL and Postgres?

2023-05-11 13:52:47

Postgres. Fast, full-featured, rock-solid, and a great community.

I think many of us can’t be bothered to go over (again) the issues we’ve had with MySQL in the past. The last straw for me was about ten years ago, when I caught MySQL merrily making up nonsense results for a query I’d issued that accidentally didn’t make any sense.

Very likely this particular issue, and others like it, have been fixed in the meantime. But I just got the sense that MySQL was developed by people who didn’t quite know what they were doing, and that people who really did know what they were doing weren’t ever likely to be attracted to that to fix it.

Having used postgres for the past decade, I tried MySQL for a side project to see what’s changed with it. The sad answer is that it feels like nothing has changed – Oracle seems to have let what used to be a core technology of the industry languish.

I’m sure there are use cases where MySQL will be the better choice over postgres, but the future for the stack looks bleak.

There is almost no good operational reason to choose MySQL over PostgreSQL. I did a deep dive many moons ago (before major improvements in Postgres performance), when people were saying that MySQL was faster. I found that not to be true, and the differences have only shifted further in Postgres’s favour since.

Also, I assume you mean MariaDB, as MySQL is owned by Oracle, and I would greatly implore anyone and everyone to avoid Oracle as if it had herpes.

There are a lot of historic problems with MySQL accepting invalid data, committing data even when there are constraint issues, and having very poor transactional isolation, I am not sure if these have improved.

Truthfully, the only benefits you gain from using MariaDB or MySQL are:

* Memory tables

* Having inconsistent replicas (which can be useful when you want your downstream to have less data than your upstream and you know it won’t get updated.)
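For reference, a minimal sketch of the first point, using standard MySQL/MariaDB syntax (the table and columns here are made up):

  -- A MEMORY (heap) table: rows live entirely in RAM and are lost on server restart.
  CREATE TABLE session_cache (
    session_id CHAR(32) PRIMARY KEY,
    payload    VARBINARY(255)
  ) ENGINE=MEMORY;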

Is there a good way to do case-insensitive, accent-insensitive collations in PostgreSQL yet? It’s been a holdup for some use cases, like searching for a person’s name in pgsql when the casing or accents don’t match perfectly.

MSSQL has had this forever, and I’m pretty sure MySQL has it as well.
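For what it’s worth, recent Postgres releases do cover this: PostgreSQL 12+ supports nondeterministic ICU collations, which compare text case- and accent-insensitively. A minimal sketch, assuming a build with ICU support (the collation and table names are made up), with the caveat that such collations don’t work with LIKE/pattern matching:

  -- Primary-strength ICU comparison: ignores case and accents.
  CREATE COLLATION ci_ai (
    provider = icu,
    locale = 'und-u-ks-level1',
    deterministic = false
  );

  CREATE TABLE people (name text COLLATE ci_ai);
  INSERT INTO people VALUES ('José'), ('jose'), ('JOSE');

  -- All three rows match.
  SELECT count(*) FROM people WHERE name = 'jose';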

I think that the only reasons to choose MySQL (or Maria) over Postgres for a new project are operational. Postgres is probably the better database in almost all respects, but major version upgrades are much much more of a pain on Postgres than on almost any other system I have ever used. That being said, I would choose Postgres pretty much every time for a new project. The only reason I would use Maria or MySQL would be if I thought I later would want to have something like Vitess, for which I think there isn’t really an equivalent for Postgres.

> but major version upgrades are much much more of a pain on Postgres than on almost any other system I have ever used.

This is a thread comparing MySQL and Postgres, and your claim is that Postgres major version upgrades are harder than on anything else you have ever used??

Context is important here: have you honestly actually upgraded a MySQL node? It’s a lesson in pain, and “major” changes happen on what look like minor versions, like the entire query planner completely trashing performance in 5.6 -> 5.7.

Postgres has two forms of updates:

1) in place binary upgrade.

Fast, clean, simple, requires that you have the binaries for the old and the new database.

2) dump/restore.

Serialise the database into text files, load a new database and deserialise those files into it.

Slow, but works flawlessly & consistently with relatively low danger.

MySQL can only do option 2.

You can sort of fake an “update” by abusing the fact that MySQL’s replication offers no guarantees, so you can make a new server a replica and then roll over. But it is impossible to know what data was lost in that transition, and MySQL will happily continue without ever telling you.

I have experienced this behaviour in large e-commerce retailers. MySQL was very popular for a very long time and I am intimately aware of operational best practices and how they are merely patching over an insane system.

MySQL doesn’t use SemVer. MySQL 5.6 vs 5.7 are different “release series”, and switching between them is considered a “major” version change.

MySQL absolutely fully supports in-place binary upgrades, saying otherwise is pure FUD. And the upgrade process in MySQL doesn’t even iterate over your table data in any way, so claiming it will cause “data loss” is also pure FUD.

At Facebook we automated rolling in-place updates of our entire fleet, with new point builds of fb-mysql going out several times a month, to the largest MySQL deployment in the world. Worked flawlessly and this was a full decade ago.

MySQL is widely considered easier to upgrade (relative to Postgres) because MySQL’s built-in replication has always been logical replication. Replicating from an older-version primary to a newer-version replica is fully supported. When upgrading a replica set, the usual dance is “upgrade the replicas in-place one at a time, promote one of the replicas to be the new primary while temporarily booting out the old primary; upgrade the old primary and then rejoin the replica set”.

Facebook has, at minimum, 3 teams maintaining MySQL, including a team that genuinely modifies it into submission, so much so that they needed 1,700 patches to port their modified version to 8.0.

It is not relevant to this discussion how Facebook has managed to munge it into working reasonably well by pouring thousands of hours of engineer time into the effort; and MySQL’s in-place upgrades absolutely do not work the way you describe consistently.

I know this because I have been in the code, and only after having experienced it. Maybe some of your lovely colleagues have helped your particular version to be marginally more sane.

It genuinely must be nice having a dozen people who can work around these issues, though; I certainly wouldn’t consider it an operational win. Most companies have no DB automation engineers, DB performance engineers, or MySQL infrastructure engineers.

> Replicating from an older-version primary to a newer-version replica is fully supported.

Here also be dragons, as alluded to. I know it works quite often; I have used it.

FWIW: I ran global AAA online-only game profile systems on a handful of Postgres machines at about 120k transactions/s in 2016. I would have needed 5x as many instances to do it in MySQL, and this was only a tiny part of our hosted infra, which included a global edge deployment of game servers, auth servers, matchmaking, voice bridges and so on.

And we only had two people responsible for the entire operation.

Please educate me on how my statement about MySQL upgrades is incorrect, I’d love to hear this. I’ve been using MySQL for 20 years, and while 2 of those years were at Facebook, 18 were not. I’ve performed MySQL upgrades in quite a wide range of environments, and what you’re saying here about lack of in-place upgrades or eating data is simply not aligned with reality.

I haven’t made any comments regarding performance comparisons, and have also run extremely large DB footprints with tiny teams, but I don’t see how any of that is relevant to the specific topic of new-version upgrade procedure!

Because it depends so much on your storage engine and schema, I have never seen it recommended: there are circumstances where you end up with unrepresentative data unless you are very careful or you don’t actually use the expressiveness of the DB.

I mean, I’ve also seen my share of “ERROR 1071 (42000) at line xxx: Specified key was too long; max key length is xxx bytes” randomly, which basically means the machine needs manual recovery.

God help you if you don’t have innodb_file_per_table enabled to begin with too.

I know you want me to cite exactly. That will take me time to find because I stopped caring about MySQL 7 years ago, but I will dig for you.

FWIW, while I use Postgres for my own development, I’ve had to administer a number of MySQL servers for other devs. Upgrades have always been updating the MySQL package, restarting MySQL, then running `mysql_upgrade`, and restarting the server again. I’m pretty sure mysql_upgrade has even been missed a number of times and it’s worked fine.

I won’t say it’s impossible you ran into issues doing this, but it is the documented and supported upgrade path.

I love Postgres, but as someone who’s maintained both for years, upgrades (at small scale) are the one area where I’d say MySQL has Postgres beat.

> I’ve also seen my share of “ERROR 1071 (42000) at line xxx: Specified key was too long; max key length is xxx bytes” randomly, which basically means the machine needs manual recovery.

What? This error has nothing to do with upgrades, nothing to do with manual recovery, and hasn’t been a common problem for many many years.

In old versions of MySQL, it just meant you needed to configure a few things to increase the InnoDB index limit to 3072 bytes, instead of the older limit of 767 bytes:

innodb_file_per_table=ON
innodb_large_prefix=ON
innodb_file_format=barracuda

and then ensure the table’s row_format is DYNAMIC or COMPRESSED.
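For that last step, a one-line sketch (the table name is hypothetical):

  ALTER TABLE my_table ROW_FORMAT=DYNAMIC;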

But again, all of this happens by default in all modern versions of MySQL and MariaDB.

Should these have been the defaults much earlier? Absolutely yes: MySQL used to have bad defaults. It doesn’t anymore.

Look, I get it, you’ve got your sunk cost in knowledge of MySQL and you’ve been on large support teams for it. Maybe you’re afraid I’m suggesting that this knowledge goes out the window. And yes, it has gotten better, but I wouldn’t give my kids watered-down, lead-infused soft drinks just because I had suffered through lead poisoning. I remember coming to blows with you in other threads over the years because you think MySQL can be saved or is totally fine, but honestly, just, no.

The error I gave is a similar one to the one I used to get with “major” upgrades that happened when Ubuntu decided it was time to upgrade.

It happens, and I seriously never claimed that it was an ultra common problem, merely that upgrades in Postgres are more intentional and not painful except for a little extra work between major versions. The standard upgrade path within major versions (9.x, 10.x, 11.x, or 12.x) works just the same as MySQL’s, except I have much more experience of MySQL completely fumbling its “automatic unattended” upgrade or even the mysql_upgrade command.

Mostly because in the real world, outside of engineering cultures, databases are massively abused: MyISAM tables that are constantly updated, InnoDB ibdata1 in the terabytes, poor configs, replicas that have skipped a few queries, column changes inside a transaction that failed but actually modified data. It happens. Usually I am called in to clean up the mess.

The major difference here is that Postgres doesn’t leave a mess, so I never have the kind of issues that I am describing in this thread with it. And you don’t either, I am guessing, because you’re there when they’re installed, someone with knowledge was actively maintaining them, or you have a lot of people to help with shortcomings.

If you say what you’re trying to actually achieve, I can help with a solution. But asking if it supports an arbitrary feature is not going to get the answer you want, because depending on what you’re actually using an archive table for, Postgres might have something already built in; it will almost assuredly not be exactly like an archive-table storage type.

Yes, for over half a decade at least, but “binlog” is a MySQL term; for PostgreSQL it has the much more apt name: write-ahead log.

It is the only official, in-binary replication mechanism.
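To sketch what sits on top of that WAL at the SQL level, PostgreSQL 10+ also ships logical replication with a publish/subscribe interface (the connection string and names here are made up):

  -- On the primary (publisher):
  CREATE PUBLICATION app_pub FOR ALL TABLES;

  -- On the replica (subscriber):
  CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=primary.example.com dbname=app user=replicator'
    PUBLICATION app_pub;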

PostgreSQL every time, unless you have a specific reason, or as already pointed out, you’re sure you don’t just need SQLite.

PSQL in my experience has vastly better tooling and community, and is ahead of the curve tech-wise. Same with extension availability. And if you ever need to move away from it to, say, CockroachDB or similar, that is much easier.

Choose whichever one you/your team is more familiar with. Both are battle-tested and proven and will likely scale to whatever needs you have.

Having used both in production, I agree with the above. It’s not going to make or break your business or project.

I will also add that there are giant companies relying on both databases with great success. Facebook still runs on MySQL, and contributes back to it. YouTube I’m not sure about, but it did run on MySQL for a long time, well after it got massive. I’m sure examples exist for Postgres (Amazon, since they moved off Oracle?).

This is the correct answer.

Whichever one you start out with, you will be annoyed if you switch to the other one 5 years later. I started out with mysql, and when I started working on a postgres project, I was shocked at some of the ways it was lacking (how am I supposed to store email addresses in a database without collations?).

But when postgres folks grouse about stuff in mysql, I’m usually nodding along and saying “yeah, that would be nice”.

They’re both great options. If anybody on your team is already an expert at one of them, use that one.

Use MySQL if you’re expecting to have to do large operational database tasks (migrations, maintenance) with no ability to go offline. gh-ost, percona-osc, the new INSTANT DDL stuff: it’s all quite far ahead in MySQL-land. Additionally, Vitess and PlanetScale are making huge strides in MySQL performance. There are more people and guides in the world to help recover even the most mutilated of MySQL databases. MySQL gets more love in extremely large enterprise-level organizations, and it shows.

Use Postgres if you need some of the quite-good extensions, most notably PostGIS, or if you just want things to work; most documentation will be postgres flavored. Postgres gets more love from web-developers, and it shows.

This is wrong. MySQL does not support transactional DDL, so you cannot run migrations and abort them in the middle.
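To make the difference concrete, a minimal sketch of what Postgres allows here (hypothetical table and column; MySQL implicitly commits each DDL statement instead):

  BEGIN;
  ALTER TABLE accounts ADD COLUMN balance_cents bigint NOT NULL DEFAULT 0;
  -- ...a later step of the migration fails...
  ROLLBACK;  -- the ALTER is undone along with everything else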

Always use postgresql. It’s more logical, more extensible, saner, supports many extensions and is more predictable.

MySQL is inconsistent crap that trades away consistency, correctness and stability for a little bit of performance in standard use cases.

Do yourself a favor and always use PostgreSQL. I switched 15 years ago and never looked back. Have done 15-50 projects since in psql.

Friends don’t let friends use #Horracle software.

That includes VirtualBox, MySQL, Horracle Cloud. Just step back. Walk away. Do not pass go, do not collect $20,000 in lawyers’ fees for unintended actions.

This is like asking how you’d choose between Emacs and Vim, Mac and PC, Monoliths and Microservices, Functional and Object Oriented .. you’re likely going to elicit a lot of passion and not a ton of objective information.

For most applications, either choice is going to be just fine. Use what your team has the most experience with. If you have no experience, try them both out and go with whatever you’re most comfortable with.

The difference is not significant enough to matter for most projects, especially just starting out. Hence, I mostly choose Postgres, since I don’t like Oracle as a company very much.

>MySQL and MariaDB

Again, what is this thing you are trying to compare? I just see MariaDB and MariaDB =)

However, because of my blindness to actual MySQL, I have totally not paid attention to any differences between the two. I guess “drop-in replacement” isn’t actually true any more. Thanks for the info.

I’ve been using MariaDB (MySQL) as a hobbyist for years. I just set up a couple of MySQL servers with phpMyAdmin on Raspberry Pis and use them for local development. Basic CRUD apps, etc.

I’ve always assumed that PostgreSQL is a step up, but never really bothered to look into what I get for the effort. Do I really get anything if I’m not trying to make apps at scale?

One big factor that keeps us on MySQL is the MyRocks engine. We have huge databases with billions of rows, and MyRocks lets us use them with heavy compression. PostgreSQL can’t handle this: it is much slower and uses 30x more disk space, even with heavy TOAST tuning and/or ZFS compression.
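For context, a minimal sketch of what that looks like in MyRocks-enabled builds (MariaDB or Percona Server; the table here is made up):

  -- Store a large, write-heavy table on the RocksDB engine to get its compression.
  CREATE TABLE events_archive (
    id      BIGINT PRIMARY KEY,
    payload TEXT
  ) ENGINE=ROCKSDB;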

The only instance where I’d choose mysql over postgres is if your database needs are very simple, but you need to be able to scale hard, and your devops aren’t skilled enough to manage an advanced postgres setup.

For a typical db.t3.xlarge instance, you’re talking about 29¢/hour vs 27.2¢/hour. That’s $157.68 as the total difference for one year’s runtime, when the whole instance cost for Postgres would be $2,540.40 for the year, or about 6%. The larger the machine, the closer to parity. Given the absolutely small difference, I hope this isn’t the dividing line in any commercial project.

Again, I still don’t know if this is the case, but you could use smaller instances with MySQL Aurora than with Postgres; given the way our application worked, it would have made a big difference for us if we had used MySQL.

RDS/Aurora was our most expensive resource so we were looking at ways to cut that cost down and mysql was one option (though the way the app worked and the extensions that it relied on made it not a possibility.)

Also interested in the responses, not because it seems like a close decision but because I would pick postgres by default for anything (anything that isn’t simple enough to be done in sqlite).

If Postgres were that much better than MySQL, then you would expect to see exact reasons why to pick it. So far, no comment has listed any.

Ok, here’s one: When you give MySQL invalid data, its standard behavior is to just silently store garbage in your database in many cases where PostgreSQL would’ve told you that your data is invalid.

MySQL’s handling of unicode has also been terrible historically, with way too many foot guns, but I don’t know if that may be better with recent versions.

People aren’t providing strong reasons because the question wasn’t “what are some objective reasons for picking one over another”, but “how do you pick between them”. People are simply answering the question OP asked, and a lot of people’s process is simply to pick PostgreSQL.

A lot of the MySQL issues historically have been fixed. UTF-8 is better now, invalid data handling is better (by default even! though your distro’s default config probably puts you back in permissive mode!) but regardless I’m still using Postgres every single time.

The fact is that MySQL historically was terrible for complex schemas with complex types, while Postgres was a pleasure to work with. MySQL had a huge performance edge for many years, but that came at a cost of resiliency and reliability. MySQL has greatly improved in these key areas, and Postgres has also made significant performance improvements. Systems these days are also so powerful that the database probably isn’t your bottleneck.

Regardless, I always use Postgres every single time because I am just scarred from years of dealing with MySQL. What even is MySQL is also an open question at this point, there’s MySQL and MariaDB and Percona flavors and the founder of Percona was just ousted and I can’t be bothered to put in mental energy to untangle all this to even figure out what MySQL I should be developing against.

Compare this to Postgres where the community seems to have an extremely steady hand and constant progress. There’s no forks, there’s no infighting, there’s no drama, there’s a great regular release schedule with incremental improvements.

> unless you want to anthropomorphise your db and hold it accountable for past behaviour

No one is holding the literal bits that make up the database executable accountable here, they are indicating they don’t trust the devs of MySQL/MariaDB to do a good job. Whether or not that is an accurate view on their part is arguable, but it’s pretty clear from context that they don’t think that several if/else statements had it out for them.

> When you give MySQL invalid data, its standard behavior is to just silently store garbage

This is a common misconception, but it hasn’t been the case for over 7 years. MySQL 5.7, released in Oct 2015, changed its defaults to enable strict sql_mode. All prior versions hit end-of-life years ago, so there is no modern version of MySQL with this silent truncation behavior.

The only reason this problem persists is because Amazon RDS (all versions and forms, including Aurora) uses nonstandard default settings which disable strict mode!
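If you want to check and fix this on such a setup, a minimal sketch using standard MySQL statements (the table here is made up):

  -- See what mode the server/session is actually running with.
  SELECT @@GLOBAL.sql_mode, @@SESSION.sql_mode;

  -- Re-enable strict behaviour for the current session.
  SET SESSION sql_mode = 'STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';

  -- With strict mode on, oversized data is rejected instead of silently truncated:
  CREATE TABLE t (c VARCHAR(3));
  INSERT INTO t VALUES ('abcdef');  -- ERROR 1406: Data too long for column 'c'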

That all said, I do believe Postgres is an excellent database, and a great choice for quite a large range of use-cases. But please, let’s compare 2023 Postgres with 2023 MySQL, not 2023 Postgres with much older MySQL. It’s only fair.

> MySQL’s handling of unicode has also been terrible historically, with way too many foot guns, but I don’t know if that may be better with recent versions.

Unicode generally “just works” if the charset in use is utf8mb4. As of MySQL 8.0, this is the default.
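A minimal sketch of being explicit about it anyway (this uses MySQL 8.0’s default charset and collation; the table is made up):

  CREATE TABLE messages (
    id   BIGINT PRIMARY KEY,
    body TEXT
  ) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;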

Ah, that’s good to hear. I haven’t looked seriously at databases other than SQLite for a long time, it would be interesting to see a more up to date evaluation.

A few reasons:

– Transactional DDL statements (schema modifications)

– Better support for UPSERT operations

– Better JSON support (including ability to index into JSON columns)

– the RETURNING clause to return data that was inserted/updated

In general Postgres is a lot more featureful than MySQL.
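A minimal sketch of a few of those together (PostgreSQL syntax; the table and values are made up):

  CREATE TABLE events (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    source  text UNIQUE,
    payload jsonb
  );

  -- Index into the JSON documents.
  CREATE INDEX ON events USING gin (payload);

  -- UPSERT with RETURNING: insert or update in one statement and get the row back.
  INSERT INTO events (source, payload)
  VALUES ('sensor-1', '{"temp": 21}')
  ON CONFLICT (source) DO UPDATE SET payload = EXCLUDED.payload
  RETURNING id, payload;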

Postgres has a worse implementation of MVCC. It results in more bloat being produced, and slightly slower updates in a highly concurrent environment. Most businesses don’t operate at the scale where this matters. But on the flip side, the tooling and developer experience is much better.

I like SQLite. But I really wish its default behavior wasn’t to simply allow garbage data into the database. If I have an int column, don’t let me accidentally store a string.

https://www.sqlite.org/stricttables.html

“In a CREATE TABLE statement, if the “STRICT” table-option keyword is added to the end, after the closing “)”, then strict typing rules apply to that table. … The STRICT keyword at the end of a CREATE TABLE statement is only recognized by SQLite version 3.37.0 (2021-11-27) and later.”

  sqlite> create table x (a int);
  sqlite> insert into x values ('hello');
  sqlite> select * from x;
  hello
  sqlite> drop table x;
  sqlite> create table x (a int) strict;
  sqlite> insert into x values ('hello');
  Runtime error: cannot store TEXT value in INT column x.a (19)

Yeah, I know about that, and I’m doing that on all my tables these days. It’s just sad that the default behaviour is to allow garbage data, and that if you ever forget to put ‘strict’ on your tables you’ll have a perfectly functional application with no sign that anything is wrong until you suddenly find corrupt data in your database.

At a certain scale, you’ll want replication, which SQLite doesn’t really do AFAIK. At a scale below that, you’ll probably want to be able to have multiple web servers talking to one database server, which SQLite doesn’t really do either. I also think SQLite’s performance during heavy write workloads is worse than PostgreSQL’s?

Basically, AFAIK, SQLite becomes problematic once you need more than one computer to handle requests.

Here’s a good place to start:

https://www.sqlite.org/whentouse.html

https://www.sqlite.org/quirks.html

Full-scale RDBMSs, particularly Postgres, have heaps of goodies that SQLite either doesn’t have, or does have but not in such a richly featured form. Once you’ve gotten hooked on a few of those, the distinction will feel much more clear.

Meanwhile, the tipping points in favor of SQLite seem to be embedded systems, plus its whole “server-less” architecture and plain ease of use. Which is why it still gets a lot of love in those contexts.

> By choosing SQLite.

It’s a good reminder to give some thought to whether one actually needs MySQL|Postgres. If not, SQLite is the way to go. Most of my code that uses a DB is using SQLite.

But obviously, if you actually need MySQL|Postgres then SQLite is not an option.

Postgres doesn’t have planner hints, and it might create a false impression of robustness until it messes up table statistics and does a FULL SCAN against a table with millions of rows, ignoring all indices. It happens super rarely though, so if you run anything critical, you’ll probably be down only for a few hours once every year or two. Be ready for this to happen though.

Apart from that (and noticeably higher memory consumption), Postgres is most likely preferable.

MySQL is still ahead operationally (no vacuum, group replication, gh-ost, optimizer hints, etc.). I would choose it unless one of the Postgres extensions is a killer fit for your project.
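As a small sketch of the kind of hinting MySQL allows when the planner gets statistics wrong (the table and index names are made up):

  -- Classic index hint syntax, available in effectively every MySQL version:
  SELECT *
  FROM orders FORCE INDEX (idx_orders_created_at)
  WHERE created_at > '2023-01-01';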

Most answers seem written by fanboys rather than legit answers.

I would say go with what you know and are most comfortable with. You are more likely to get the better outcome.

This.

I’ve heard countless times that Postgres is better and I’ve watched talks where they show how loosey-goosey MySQL is with some things, but I know how to backup, restore, tune, secure and replicate MySQL. I grok its permissions in depth more than I ever have with Postgres, and I’ve even written a MySQL plugin in C, so I have that in my toolbox if I need it. So I’d, by default, usually go with MySQL (or in some cases SQLite). But if I didn’t have to administer or maintain it, and someone else was handling that, I think I’d be fine with Postgres too.

Friends don’t let their friends choose Mysql 🙂

A super long time ago (decades), when I was using Oracle regularly, I had to make a decision on which way to go. Although MySQL then had the mindshare, I thought that Postgres was more similar to Oracle, more standards compliant, and more of a real enterprise type of DB. The rumor was also that Postgres was heavier than MySQL. Too many horror stories of lost data (MyISAM), bad transactions (MyISAM lacks transaction integrity), and the number of MySQL gotchas being a really long list influenced me.

In time I actually found out that I had underestimated one of the most important attributes of Postgres that was a huge strength over MySQL: the power of community. Because Postgres has a really superb community that can be found on Libera Chat and elsewhere, and they are very willing to help out, I think Postgres has a huge advantage over MySQL. RhodiumToad [Andrew Gierth] https://github.com/RhodiumToad & davidfetter [David Fetter] https://www.linkedin.com/in/davidfetter are extremely useful people.

I don’t know whether Postgres’s licensing made an enormous difference or not, but my perception is that there are a ton of third-party products based on Postgres but customized to specific DB needs, thanks to the more liberal PG license, which is MIT/BSD derived https://www.postgresql.org/about/licence/

A few of the PG-based third-party DBs:

Enterprise DB https://www.enterprisedb.com/ – general-purpose PG with some variants

Greenplum https://greenplum.org/ – Data warehousing

Crunchydata https://www.crunchydata.com/products/hardened-postgres – high-security Postgres for regulated environments

Citus https://www.citusdata.com – Distributed DB & Columnar

Timescale https://www.timescale.com/

Why choose PG today?

If you want better ACID: Postgres

If you want more standards-compliant SQL: Postgres

If you want more customizability for a variety of use-cases: Postgres, using a variant

If you want the flexibility of using NoSQL at times: Postgres

If you want more product knowledge reusability for other backend products: Postgres

1) The choice is Postgres if you care about your data at all.

2) Yes, if you are already HUGE and have requirements on Vitess then by all means use it. If so, you are not asking this question—see #1.

3) It’s a blog or something where it doesn’t matter, use a static site generator.

If you are using .NET then Postgres might be the better choice (much better support for drivers and the default ORM).

If you need replication go with MySQL.

> What sort of functional requirements would cause you to choose one over the other?

Simple: I don’t like having headaches. Therefore, I chose postgres.

Why is mysql better for quick and dirty? I feel like pg extensions offer a lot more “dirtiness” running inside pg than mysql has.

yeah that’s a weird take. if you want quick and dirty you use sqlite and if you need something more you go with postgres. some replication things are nicer in mysql apparently but postgres is the better option for most workflows
