UpGuard technical articles

MySQL vs MongoDB

Written by UpGuard | Jul 1, 2014 5:04:00 PM

MySQL and MongoDB represent two sides of an argument that has been raging recently concerning data storage – the tried and tested relational database vs. non-relational or NoSQL database. They are both open-source products distributed under a version of the GNU GPL, and both are also available as commercial versions offering many more features and corporate support.

Relational vs. Non-relational Databases

Let’s first digress from our two combatants and look at the relational-vs-nonrelational argument for a bit.

Relational SQL databases have been around since the 1970’s, invented by the brilliant E.F. Codd. SQL and relational databases are excellent for representing and working with sets of data, similar to finding the region covered by intersection points in a Venn diagram. For instance, in a commercial bank’s application, it is simplicity itself to create a SQL query to extract, say, the names and contacts of all female customers, with a current-account balance over $100,000, who have taken out a loan in your bank within the last 2 years. SQL can easily allow you to get that accurate using the famous SELECT statement. The tight rules governing relational database structure mean that it is easy to ensure data integrity and security.

However, what SQL and relational databases are not good at is scaling. Because of the necessary table and database structure in relational databases, they really only scale well vertically within a single server - by increasing memory and CPU, using faster disks, etc. But they don’t scale well horizontally by adding more servers to share the load, i.e. distributed computing. This is where the relational models own strengths turn into weaknesses.

So is there a business need that has come up recently in which we now need several servers, rather than one large, massively powerful server? Why yes, that business need is the Internet. What most websites require is performance more than data integrity. Especially for very large websites – think Amazon,  Ebay or Google, quick website response and availability is much more important than the security and data integrity offered by relational databases.

Hence the emergence of non-relationa or NoSQL databases. One of their defining characteristics is that they are able to take scale very well across several servers and reap the advantages of distributed computing. With the advent of fast Internet connections, these servers may be in sync even over widely dispersed geographical locations (Google!). One way of achieving this is by storing data in key-value pairs, rather than the traditional table. A key-value pair is a combination of a data item and its related value. Examples of key-value pairs are:

  • Customer_name Andrew Jones
  • Customer_gender M
  • Customer_Current_account_bal $28,400

You see that in NoSQL databases, the data field and the value for that field are stored together as one record. This makes data retrieval much faster and enables , but also introduces problems with data integrity. A relational table, on the other hand, would store the same customer data as a set of distinct tables, one containing the customer bio-data (name, date of birth, gender, social security number and so on), another containing customer balances (account type, balance) and so on.

However, this does not mean that SQL-based relational databases are about to go the way of the dinosaur. They have been around for over 40 years and seen off many pretenders to the throne, mainly because they offer 3 huge advantages:

  1. A simple way of representing data/ business models
  2. An easy-to-use language to retrieve and query that data (SQL)
  3. Bulletproof data integrity and security built right into the database without having to rely on application rules and logic.

What the new NoSQL databases do is fulfill a need unmet by relational databases – the need for easy scalability. And also not to have to rely on object-relational mapping: read more about it here. This comparison is rougly akin to how the invention of the quicker, nimbler motor vehicle in the early 20th century did not necessarily usurp the train invented a few centuries earlier. Though rail transport was slower and not as flexible as the car, it was (and even today still is) very valuable for a different need – efficiently moving very large quantities of goods overland. In case it’s not clear, in this analogy the train is the relational database and the car is the NoSQL database. One isn’t necessarily a replacement for the other. They both serve different niches, only that the older solution wasn’t as suitable for a specific need, thus the invention of the new solution to fill that gap. Also read more about NoSQL vs. SQL databases here.


Now back to the original discussion. MySQL is a relational database originally developed by MySQL AB, now owned by Oracle. You may wonder why Oracle would buy a competitor and not kill it off, seeing as how it competes with Oracle’s own database offering. Actually, they mostly operate in two different spaces and don’t cannibalize each other. MySQL is mostly used in to store data for web applications, typically as part of the popular LAMP stack (Linux, Apache, MySQL, PHP/ Python/ Perl), while Oracle DB is mostly used in large enterprise scenarios. That’s not to say MySQL cannot support large enterprise databases or that Oracle DB cannot support web apps, a few customers use them this way, but they are the exception rather than the rule.

The basic MySQL system ships with no GUI tools, only a set of CLI’s. There is an official set of front-end tools called MySQL Workbench, freely available from Oracle. MySQL runs on all major operating systems – Linux, Windows, OS/2, OS X, and various flavors of Unix. It even runs on Symbian for mobile computing applications.

MySQL has over the years added various features so that it is today a full-featured RDBMS. In addition to the standard features such as tables, indexes, stored procedures it also offers updatable views, triggers, cursors, query caching, partitioned tables and clustering. An unusual feature of MySQL is that you can swap out the storage engine and use another more suited to a particular scenario or application. The natively supported engines are InnoDB, MyISAM, Archive, CSV, Blackhole, MySQL Federated and NBD Cluster. For example, the MySQL Federated engine gives you the flexibility to access remotely-stored tables as if they were local. Read more here

MySQL’s limitations of MySQL are the same that generally affect relational databases. Horizontal scaling and is not easy and high transaction loads (millions of read-writes) seriously affect performance. Although clustering and replication are available, as well as the innovative MySQL Fabric (an integrated system for managing a collection of MySQL servers), each of these is a pain to implement and cannot fully compensate for the fundamental design problem of relational databases.


MongoDB is a well-known open source NoSQL database. It employs the concept of key-value pairs, here called a document store. In MongoDB document stores are created and stored as BSON files, which are really a modified version of JSON files. See more about document stores here

MongoD offers very good performance for situations containing very high write loads, but where data integrity isn’t a pressing concern; a good example are the comments sections of large, busy websites like Craigslist or The New York Times – by the way, these aren’t theoretical by the way: both of these use MongoDB.

One major limitation of MongoDB is that unlike the relational MySQL, it does not offer an easy way to join tables. It has an inelegant solution to this: multi-dimensional data types in which you can embed one document store inside another. So for instance you can embed the customer account document consisting of the {“Customer_account_type: Current”, “Customer_balance: $28,400”} document into the customer data document {“Customer name: Andrew Jones, “Customer_gender: M”} and in this way retrieve the data about both the customer and his bank balance. As mentioned, it’s inelegant and awkward but it works. Also read a captivating post on a real-world limitation of using MongoDB here.


The MySQL vs. MongoDB argument is in many ways a relational vs. NoSQL argument. Each one excels in certain scenarios and is ill-suited for others. And because of their very different structures each one contains features not found in the other – see more here