MySQL vs. PostgreSQL comparison
Support is often asked for web hosting which of the two (MySQL and PostgreSQL) is better. As always, the answer is that it'll depend on your requirements.
That said, Anchor uses and recommends PostgreSQL.
Rough comparison of features
- Somewhat lighter-weight
- Has commercial support if you want it, like Redhat Enterprise Linux
- Widespread, plenty of apps that target it
- Blazing fast performance, particularly when predominantly SELECT queries, especially on MyISAM tables (this is somewhat anecdotal though, and will be highly dependent on the environment and workload)
- "Enterprise-grade" clustering and some degree of replication is available - this is a very different product from the MySQL Server that most people will be familiar with, so it's not just a drop-in replacement
- MyISAM tables don't support transactions and have little idea about ACID properties. If a MyISAM table breaks (frequent), you need to run myisamchk. This will take multiple aeons on a large table
You do have the option of using the InnoDB engine however, which is is transactional and ACIDic
- MyISAM tables only do table-level locking, so concurrency is bollocks
- The InnoDB engine uses multi-version concurrency, which is much better
- This noted, MyISAM is the default if you don't specify an engine at table-creation time, so it's easy for an inexperienced developer to go down the MyISAM path
- Last I checked, subquery performance in MySQL was pretty awful
Triggers are perverse, see Triggers in MySQL for the gory details
- Very feature rich
- GIS add-on functionality
- Flexible full-text search (brought into mainline in 8.2)
- Multiple replication options to suit your environment and requirements
- Powerful server-side procedural languages are available, and can be extended (PL/pgSQL is installed by default, but others like Perl, Python, Ruby, TCL, etc are available)
- Writing your own extensions is pretty easy
- Uses multi-version concurrency control, so concurrent performance rocks
Fully ACID compliant
- Commercial support through multiple third-parties is available
- Strong access-control framework
- Less-mature replication software
- This may not be quite accurate, it's more that there's no single de-facto method that's recommended, widely-appropriate and widely-known by admins/users
Relational Database Management System
Subjective. A lightweight application is one that doesn't use much resources for the features/speed it provides. On a sizeable database, performance may well be blazingly fast, but it's not lightweight if it eats several gig of memory to do it.
As well as the free MySQL that everyone knows and loves, there's Enterprise versions available that include more features, scalability, clustering, support and management tools. There's a feature comparison here, prices start at $600 per server, per year (probably USD).
MyISAM is a lightweight database storage engine used by MySQL. It's the default for new MySQL databases, meaning most hapless users are using it without even knowing it. Well-supported by applications that use it, but definitely not the best thing since sliced bread. MyISAM data is machine and OS-independent, which is nice. However, it doesn't have support for foreign keys (used for inter-table constraints and checks), and no transaction support (these are Very Good things to have). Performs well with mostly SELECT-based loads. Concurrency is poor if UPDATES are performed, due to table-level locking.
Included with MySQL, InnoDB claims to be ACID-compliant and offers row-level locking, transactions, foreign-key constraints and crash-recovery abilities. Concurrency is greatly improved with row-level locking, allowing non-overlapping SELECTs to take place at the same time as UPDATEs. InnoDB is said to be a bit slower and heavier than MyISAM, but this needs to be considered in light of usage requirements. MyISAM is probably unmatched for pure reads, but InnoDB offers much in the way of flexibility and capability. It's worth noting that the InnoDB engine is sometimes disabled in my.cnf by default sometimes. Anchor *strongly* recommends the usage of InnoDB over MyISAM, as when stuff breaks, it's much easier to work with and recovery is much easier.
To provide greater performance or reliability, MySQL can be replicated from master servers to slave servers. There is much flexibility in how this is done, and MySQL offers funky stuff that can make your brain hurt. As things go, it's also "pretty easy" to setup, probably no more than an hour of remote-hands charges for a customer.
Stands for Atomicity, Consistency, Isolation, Durability. Atomicity means that all tasks in a transaction (a single statement, or an explicit batch of statements) are either fully completed, or not completed at all. An example would be a money transfer between two bank accounts. If one account is credited, it's vital that the other is debited. Consistency refers to all "rules" being abided by, such as data-type constraints on a column, or assertions on the allowed range of values on a column. Isolation means that events within a transaction are isolated from other transactions until it is completed. It should not be possible to read the database and see an intermediate state of a transaction. Durability ensures that once a transaction has been completed and the application notified, the results will not be lost. This usually means writing to disk.
Multi-version Concurrency Control (MVCC)
PostgreSQL uses a different form of concurrency management, along with its own secret-recipe storage engine. MVCC works by taking a "snapshot" in time at the start of each transaction. This basically means that each transaction is fully isolated from the others. In conjunction with finely-grained row-level locking, this allows for a high level of concurrency and performance. While this in theory can lead to contention and deadlocks, PostgreSQL is able to detect and gracefully handle such situations. See "Isolation" for more details.
While the ANSI SQL standard specified four levels of transaction isolation, PostgreSQL offers two, namely READ COMMITTED and SERIALIZABLE. READ COMMITTED ensures that one transaction will only see data that has been committed by other transation. This in theory means the data being read could change while you're in the middle of reading. This is usually not a big problem, however, and is suitable for most applications while allowing good concurrency and speed. SERIALIZABLE ensures that all transactions are executed in serial, guaranteed to be unaffected by others. In practise most transactions will be parallelised where possible, but if any overlap occurs, PostgreSQL will abort one transaction and issue an error.
Locks are used to prevent concurrent access when one transaction needs to ensure that data will not be changed by other transactions. The MyISAM storage engine can only perform table-level locking, meaning that even a small UPDATE on a large table makes the rest of the table unusable by other transactions. This can have profound effects on usability if operations such as a repair/check are taking place on a live system. InnoDB and PostgreSQL both support row-level locking, meaning only transactions with overlapping areas will affect each other. This is a very good reason to not use MyISAM.
An SQL query made as part of another "parent" query to fetch data that will be processed/used in the parent. Most queries can be rewritten to avoid a subquery, which is preferable as subqueries tend to be a bit slow in MySQL.
MySQL and PostgreSQL both offer forms of access control to connecting clients, and both work a little bit differently. It roughly boils down to "user X can connect from IP address Y and has privileges A,B,C on object Z". Some will find MySQL's manner of doing this a little bit perverse and not entirely straightforward, due to the way it splits this metadata over a number of system tables. PostgresSQL does it a little more simply, and access control is done via a conf-file and standard GRANT/REVOKE statements. The conf-file specifies restrictions on how a given user (now knows as ROLEs) may connect to the database, after which GRANT statements are used to control exactly what they have access to.