The relational data model organizes data in tables of rows and columns and dominates in database management tools. Today, there are other data models including NoSQL and NewSQL but relational database management systems (RDBMSs) still dominate for storing and managing data.
This article compares and contrasts two widely implemented open-source RDBMSs: PostgreSQL and MySQL. We'll focus on exploring the data types that each use, their advantages and disadvantages, as well as situations where they are best used.
PostgreSQL is an object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance. MySQL is an open-source relational database management system (RDBMS). Both are supported by all major cloud service providers, including Amazon, Google, & Microsoft.
MySQL is perceived to be much quicker but offers fewer features. PostgreSQL is believed to have a deeper feature set. Some programmers think of PostgreSQL as being similar to Oracle, and it is often favored by those who know Oracle products.
MySQL has a strong following that likes its speed and ability to create new projects quickly. The truth is that the current releases of both products, MySQL 8 and PostgreSQL 12 (depending on the distribution version), have plenty of speed, power and dynamic new features. Let's take a closer look.
What is a Database Management System (DBMS)?
Databases are models of clusters of information or data. A database management system is a program that interacts with a database. A DBMS controls access to a database, writes data, runs queries and performs other tasks related to database management. One common misconception is that databases and database management systems are the same thing.
The difference between a database management system and a database are as follows:
- A database management system is software that interacts with a database.
- Whereas a database is a collection of data and doesn't necessarily have to be stored on a computer.
Database systems have underlying models that structure how data is stored and accessed. Relational database management systems employ a relational data model. This means the data is organized into tables, formally known as relations. A relation is a set of tuples (rows) in a table and each tuple shares a set of attributes (columns).
Each attribute (column) is assigned a data type that dictates the kind of entries allowed in the attribute. Different RDBMSs have different data types that aren't always interchangeable.
Structured query language (SQL) is a popular way for relational databases to manage and query their data. That said, many RDBMSs have their own dialect of SQL which can have certain limitations and/or extensions. Extensions are generally built to allow users to perform complex queries that would be difficult with standard SQL. Common data types that you may be familiar with from programming languages include dates, strings, integers and booleans.
In some DBMSs, numeric data like integers are signed meaning they can represent positive and negative numbers and others are unsigned meaning they only represent positive numbers.
What is a Database Constraint?
A database administrator may constrain what values can be entered into an attribute (column) or the entire tuple (row). Common SQL constraints are:
- UNIQUE: No two entries in a column can be the same
- NOT NULL: Column cannot have null values
- PRIMARY KEY: Every entry in a column is not null and is unique
- FOREIGN KEY: Refers to the primary key in another table
- CHECK: Constrains the range of values that can be entered in a column
- DEFAULT: Provides a default value for a given column
- INDEX: Enables faster retrieval of data
PostgreSQL (or Postgres) is said to be the most advanced open-source relational database, maintained by the PostgreSQL Global Development Group and its prolific community support. Its goal is to be highly extensible and standards-compliant. Being standards-compliant makes it very easy to switch your database to another type of SQL server. It has features like table inheritance and function overloading, often associated with object databases. Companies that use PostgreSQL include Apple, BioPharm, Etsy, IMDB, Macworld, Debian, Fujitsu, Red Hat, Sun Microsystem, Cisco and Skype.
PostgreSQL has developed strong features to boost performance. They include:
- Several indexing functions including functional, index-only scans, partial, functional and multiple-index-counting.
- Asynchronous commit.
- Cost-based optimizer.
- Asynchronous as well as synchronous replication.
Recent releases have added more than 70 new performance enhancements. They are the direct result of a commitment to improving performance by taking out identifiable problem areas. The program can compress and decompress the information as it runs to optimize disk space. Another advanced feature developed recently is the ability to set asynchronous commits for the whole system, or per user or transaction. This lets different kinds of transactions run together without the need to choose in advance which storage engine type to use for each table.
Postgres is very efficient at concurrency and does not have read locks thanks to Multiversion Concurrency Control (MVCC) that enables atomicity, consistency, isolation, and durability of transactions (ACID compliance). As an example of PostgreSQL’s flexibility, it supports key-value data pairs to be stored inside single records, a feature made possible through its hstore data type.
PostgreSQL is very strict in coding--MySQL not so much. While PostgreSQL was created as an academic project made to mimic a huge database, MySQL started out small and has evolved to be big enough to handle large websites. Leading cloud vendors have PostgreSQL support on their platforms. There is also a REST API for any Postgres database.
PostgreSQL also executes security protocols based on the Access Control Lists (ACL) for user operations such as connections and queries.
MySQL databases are the world's most widely used databases. It is easy to set up and requires minimal fine-tuning to achieve excellent performance levels. The name is derived from a girl's name My, a daughter of one of the co-founders, Michael Widenius. The source code is available under the GNU GPL (General Public License). The project is now owned and administered by Oracle Corporation. Previous to Oracle's involvement, it was run by a Swedish company named MySQL AB. Companies that use MySQL include GitHub, US Navy, NASA, Tesla, Netflix, WeChat, Facebook, Zendesk, Twitter, Zappos, YouTube, and Spotify.
MySQL is widely used as part of the LAMP (Linux, Apache, MySQL, PHP) stack of open-source programs that form the backbone of many websites on the Internet including:
- Google (excluding Search)
The core MySQL program does not have any Graphic User Interface tools to operate the software or create and maintain databases. There are several front-end programs available to make MySQL easier to use. The official program for this purpose is MySQL Workbench, owned and developed by Oracle but free to use for everyone.
MySQL works on many Windows, Linux and Mac platforms including:
- MAC OS X
- Novell Netware
- SCO Unixware
- Microsoft Windows
MySQL features include cross-platform support, updatable views, SSL support, stored procedures, information schema, query caching, partitioned tables and full-text indexing through the MyISAM program. MySQL was written in C and C++. MySQL supports programming languages such as C, C++, Delphi, Perl, Java, Lua, .NET, Node.js, Python, PHP, Lisp, Go, R, D, and Erlang. MySQL also supports hash indexes and the InnoDB engine uses inverted lists for FULLTEXT indexes. It is highly flexible and reasonably flexible, making it a popular choice for web applications.
Speed and Data Security
Data security considerations are crucial when selecting a database management system, and both of these two SQL database management systems have ample security features to offer. Often, developers think that data security breaches like SQL injections reflect a weakness in the database software they are using when, instead, these problems are really a result of improper technical implementation in their code. Regardless of whether your team settles on MySQL or PostgreSQL, you have to implement proper data management such as sanitization of inputs to prevent SQL injections and other security problems.
MySQL gives you features like user management to control the queries that users can run on your databases, which allows you to limit privileges to only those users who should have them. This is implemented via MySQL ROLES and PRIVILEGES. You can set password authentication for users, as well as to authenticate users using other authentication schemes in MySQL Enterprise. These schemes include Linux PAM (Pluggable Authentication Modules), native Windows services, and LDAP (Lightweight Directory Access Protocol).
PostgreSQL has a robust system for user and group management, and you can use GRANT commands to assign SQL PRIVILEGES to ROLES. Client authentication is controlled through a pg_hba.conf configuration file, which contains a set of IP addresses and usernames that are allowed access. SSL-based connections can also be used, along with external authentication systems like PAM and Kerberos.
Performance optimization is somewhat more straightforward with PostgreSQL than it is with MySQL, thanks to the former’s compliance with SQL standards. This makes it easy to compose efficient queries, as well as use the range of indexes available in the database system, such as Bitmap, B-Trees, and full-text. With MySQL, indexes are stored in B-Trees, which allow efficient logarithmic time operations for data insertions, deletions, as well as other operations. PostgreSQL has support for materialized views, which help speed up your database views, but this is not supported natively in MySQL. You can use techniques like SQL query caching and full-text search (FTS) to get better MySQL performance.
Replication, Clustering, and Scalability
The advantages of distributed computing mean that your database must be able to support a distributed architecture as well if that’s what you have in place. PostgreSQL and MySQL fit in the mold of distributed database management systems (DDBMS). When it comes to replication, MySQL takes a master-slave approach where SQL queries are used to replicate data from a master to one or multiple slaves in a database cluster. MySQL’s architecture has the distinction that slaves can perform database writes, which comes in handy when the master is unavailable for any reason.
PostgreSQL uses streaming WAL (Write Ahead Log) files for replication, which tend to be faster and more reliable than MySQL’s approach. PostgreSQL replication is asynchronous by default, but synchronous replication is also available. Since version 10, PostgreSQL has supported logical replication, an approach that uses WAL records to conduct replication at the table level.
Both PostgreSQL and MySQL support partitioning to break large tables into smaller parts where your queries can be processed faster. With PostgreSQL, two kinds of partitioning are available, declarative partitioning and partitioning by inheritance. The latter uses rules or triggers implemented as functions, whereas the former relies on partitioning declarations that specify the partitioning method and the columns or expressions to use as a partitioning key. Partitioning by inheritance allows more flexible approaches. This scheme is not available in MySQL. MySQL uses solely declarative partitioning, but even this is available only with the InnoDB and NDB storage engines. With its multiple indexing options, PostgreSQL offers more options for tuning your database performance as table segments grow large in size.
Strengths and Weaknesses
While, for a lot of applications, these two relational database management systems can be used interchangeably, their unique characteristics and strengths play out differently. PostgreSQL’s first big distinction from MySQL is its better support for SQL standards, with PostgreSQL 12 supporting at least 160 out of 179 features required for full SQL:2016 Core conformance. PostgreSQL is fiercely committed to open source, and this is an area where the community is quick to point out Oracle’s corporate ownership of MySQL, which, while not at all problematic for most, could be a difference-maker for some in the open-source community. Some developers in the MySQL community have embraced MariaDB, the open-source fork of MySQL, which emerged over concerns about Oracle’s ownership of the MySQL project.
That said, PostgreSQL has had a historically smaller community, bearing testimony to just how popular MySQL has been as a database management system. Some tools and integrations that work for MySQL may not be available if you attempt to use PostgreSQL. Beyond its widespread usage and compatibility with many software tools, MySQL’s biggest strength may just happen to be its ease of use, good performance, and relative simplicity, especially on smaller projects. This simplicity explains, in part, its widespread adoption across the software world, both on open source projects and closed source projects. MySQL will be less ideal for situations where you might need full SQL support as well as situations where you have heavy use of concurrent read-write operations, where MySQL performance can be more problematic.
How They Compare to Other SQL and NoSQL DBMSs
Both PostgreSQL and MySQL continue to rank very highly among the best, and most widely used, database management systems, inclusive of SQL as well as NoSQL systems. NoSQL systems have risen in prominence in recent years, thanks to their better performance for modeling hierarchical data, typically implemented in JSON or XML format, as well as when storing large documents such as in big data. That said, SQL databases like PostgreSQL and MySQL are better at managing relational data and use cases involving transactional, atomic, data.
On this front, MySQL has perhaps the most widespread usage of any database management system, and a developer survey from ScaleGrid shows that MySQL leads in usage by developers, followed closely by MongoDB, the leading NoSQL system by developer adoption, and PostgreSQL:
- MySQL - 38.9% share of users
- MongoDB - 24.6%
- PostgreSQL - 17.4%
- Redis - 8.4%
- Cassandra - 3.0%
- Oracle - 1.8%
That said, rankings from DB Engines rank SQL systems like Oracle and Microsoft SQL Server much more highly. A key note, indicating the rapid changes in database management systems, is to be found in the ScaleGrid survey, which showed that over 60% of developers used a SQL database of one kind or another. This shows the enduring popularity of relational database management systems, but also indicates that NoSQL databases are not that far behind. One just has to look at the history of MongoDB and its other NoSQL counterparts over the last several years to get a sense of the momentum of NoSQL solutions. The reality is that PostgreSQL and MySQL no longer operate in a database world dominated solely by SQL and relational database systems. Rather, teams are now almost as quick to look at a NoSQL solution like MongoDB as they are to reach for either of these two leading SQL database systems.
Should I use MySQL or PostgreSQL?
For new projects, determine if you are going to port to closed software later on. In that case, PostgreSQL is closer to Oracle, and any code made for PostgreSQL will port to Oracle easily. MySQL has moved away from the SQL standard somewhat. PostgreSQL is more reliable because it is ACID (Atomicity, Consistency, Isolation, and Durability) compliant which means queries will maintain data integrity, and return the same output without error. Both support multiple operating systems including Unix and non-unix operating systems.
MySQL is less reliable and not ACID compliant The way it handles foreign key references, auditing and transactions makes it less reliable. MySQL is good if you are thinking you may use code from other open-source projects. Since it is widely used in smaller websites, there are plenty of add-ons, plugins, and modules to optimize MySQL from popular software like Wordpress, Drupal, and Joomla.