MongoDB vs MySQL

Last updated by UpGuard on June 29, 2020

scroll down

MongoDB and MySQL represent two sides of an argument that has been raging recently concerning data storage – the relatively new non-relational or NoSQL database versus the tried and tested relational database. They are both open-source products distributed under a version of the GNU GPL, and both are also available as commercial versions offering many more features and corporate support.

Table of contents


  1. Non-relational vs. Relational Databases
  2. MongoDB
  3. MySQL
  4. Who Uses Them?
  5. MongoDB Query Language vs SQL
  6. Database Performance and Speed
  7. Scaling, Sharding, and Availability
  8. Commercial Support and Community Support Options
  9. Which Is Right For Your Business?
  10. Conclusion
  11. References

Relational vs. Non-relational Databases

Let’s first digress from our two combatants and look at the non-relational-vs-relational argument for a bit.

First, the database world over the last decade has witnessed the emergence of non-relational or NoSQL databases. One of their defining characteristics is that they are able to take scale very well across several servers and reap the advantages of distributed computing. 

With the advent of fast Internet connections, these servers may be in sync even over widely dispersed geographical locations (Google!). One way of achieving this is by storing data in key-value pairs, rather than the traditional table. 

A key-value pair is a combination of a data item and its related value. Examples of key-value pairs are:

  • Customer_name Andrew Jones
  • Customer_gender M
  • Customer_Current_account_bal $28,400

You see that in NoSQL databases, the data field and the value for that field are stored together as one record. This makes data retrieval much faster and enables, but also introduces problems with data integrity. 

A relational table, on the other hand, would store the same customer data as a set of distinct tables, one containing the customer bio-data (name, date of birth, gender, social security number and so on), another containing customer balances (account type, balance) and so on.

In contrast to the relative newness of the NoSQL databases, relational SQL databases have been around since the 1970s, invented by the brilliant E.F. Codd. 

SQL and relational databases like MySQL, MS SQL Server, and Oracle Database are excellent for representing and working with sets of data, similar to finding the region covered by intersection points in a Venn diagram. For instance, in a commercial bank’s application, it is simplicity itself to create a SQL query to extract, say, the names and contacts of all female customers, with a current-account balance over $100,000, who have taken out a loan in your bank within the last 2 years. 

SQL (structured query language) can easily allow you to get that accurate using the famous SELECT statement. The tight rules governing relational database structure mean that it is easy to ensure data integrity and security.

However, what SQL and relational databases are not good at is scaling. Because of the necessary table and database structure in relational databases, they really only scale well vertically within a single server - by increasing memory and CPU, using faster disks, etc. But they don’t scale well horizontally by adding more servers to share the load, i.e. distributed computing. This is where the relational models' own strengths turn into weaknesses.

So is there a business need that has come up recently in which we now need several servers, rather than one large, massively powerful server? Why yes, that business need is the Internet. 

What most websites require is performance more than data integrity. Especially for very large websites – think Amazon, eBay or Google, quick website response and availability is much more important than the security and data integrity offered by a relational data model.

However, this does not mean that SQL-based relational databases are about to go the way of the dinosaur. They have been around for over 40 years and seen off many pretenders to the throne, mainly because they offer 3 huge advantages:

  1. A simple way of representing data/ business models
  2. An easy-to-use language to retrieve and query that data (SQL)
  3. Bulletproof data integrity and security built right into the database without having to rely on application rules and logic.

What the new NoSQL databases do is fulfill a need unmet by modeling with a relational data structure – the need for easy scalability, particularly with cloud computing deployments. And also not to have to rely on object-relational mapping (ORM). 

This comparison is roughly akin to how the invention of the quicker, nimbler motor vehicle in the early 20th century did not necessarily usurp the train invented a few centuries earlier. Though rail transport was slower and not as flexible as the car, it was (and even today still is) very valuable for a different need – efficiently moving very large quantities of goods overland. In case it’s not clear, in this analogy, the train is the relational database and the car is the NoSQL database. 

One isn’t necessarily a replacement for the other. They both serve different niches, only that the older solution wasn’t as suitable for a specific need, thus the invention of the new solution to fill that gap. Also, read more about NoSQL vs. SQL databases here.

MongoDB

Now back to the original discussion. MongoDB is a well-known open-source NoSQL database. It employs the concept of key-value pairs, here called a document store. In MongoDB, document stores are created and stored as BSON files, which are really a modified version of JSON documents. See more about document stores here

MongoDB offers very high performance for situations containing very high write loads, but where data integrity isn’t a pressing concern; a good example is the comments sections of large, busy websites like Craigslist or The New York Times – by the way, these aren’t theoretical by the way: both of these use MongoDB.

One major limitation of MongoDB is that unlike the relational MySQL, it does not offer an easy way to join tables. It has an inelegant solution to this: multi-dimensional data types in which you can embed one document store inside another. 

So, for instance, you can embed the customer account document consisting of the {“Customer_account_type: Current”, “Customer_balance: $28,400”} document into the customer data document {“Customer name: Andrew Jones, “Customer_gender: M”} and in this way retrieve the data about both the customer and his bank balance. As mentioned, it’s inelegant and awkward but it works. 

Also, read a captivating post on a real-world limitation of using MongoDB here.

MySQL

MySQL is a relational database originally developed by MySQL AB, now owned by Oracle. You may wonder why Oracle would buy a competitor and not kill it off, seeing as how it competes with Oracle’s own database offering. 

Actually, they mostly operate in two different spaces and don’t cannibalize each other. 

MySQL is mostly used to store data for web applications, typically as part of the popular LAMP stack (Linux, Apache, MySQL, PHP/ Python/ Perl), while Oracle DB stores data mostly in large enterprise scenarios. That’s not to say MySQL cannot support large enterprise databases or that Oracle DB cannot support web apps, a few customers use them this way, but they are the exception rather than the rule. MySQL’s direct competitors would be open source SQL database systems like PostgreSQL, MariaDB, and SQLite.

The basic MySQL system ships with no GUI tools, only a set of CLI’s. There is an official set of front-end tools called MySQL Workbench, freely available from Oracle Corporation. MySQL runs on all major operating systems – Linux, Windows, OS/2, OS X, and various flavors of Unix. It even runs on Symbian for mobile computing applications.

MySQL has over the years added various features so that it is today a full-featured RDBMS. In addition to the standard features such as tables, indexes, stored procedures it also offers updatable views, triggers, cursors, query caching, partitioned tables and clustering. 

An unusual feature of MySQL is that you can swap out the storage engine and use another more suited to a particular scenario or application. The natively supported engines are InnoDB, MyISAM, Archive, CSV, Blackhole, MySQL Federated and NDB Cluster. For example, the MySQL Federated engine gives you the flexibility to access remotely-stored tables as if they were local. Read more here

MySQL uses a relational database schema, so the limitations of MySQL are the same as those that generally affect relational databases. Horizontal scaling is not easy and high transaction loads (millions of read-writes) seriously affect performance. 

Although clustering and replication are available, as well as the innovative MySQL Fabric (an integrated system for managing a collection of MySQL servers), each of these is a pain to implement and cannot fully compensate for the fundamental design problem of relational databases.

Who Uses Them? 

While relatively new compared to MySQL, MongoDB has gained a lot of momentum over the years and now boasts usage right across the tech world. Whereas MySQL boasts greater maturity and familiarity to a larger cross-section of developers, MongoDB is moving arguably faster and new organizations make the switch to NoSQL all the time.

Here are some notable MongoDB users:

  • Uber Technologies
  • Lyft
  • Bepro Company
  • Accenture
  • CircleCI
  • MIT
  • Trello
  • Segment
  • Codecademy
  • Picnic Technologies
  • Zomato
  • Doubtnut

Organizations that use MongoDB typically gravitate to it in pursuit of blisteringly fast performance speeds as well as the flexibility that comes with NoSQL data storage. Its schema-free document structure pairs well with real-time Node.js apps where storing unstructured data quickly is a concern.

On the other side of the equation, MySQL is not to be outdone in the range and pedigree of tech businesses that use its relational database management system. These include:

  • Airbnb
  • Netflix
  • Pinterest
  • Slack
  • Udemy
  • Twitter
  • Amazon
  • Shopify
  • Dropbox
  • LinkedIn
  • Coursera
  • GitHub
  • DigitalOcean
  • Accenture
  • Asana
  • 9GAG
  • trivago
  • Intuit
  • Square
  • MIT
  • Quora
  • Yelp
  • TransferWise

MySQL usage is driven, in part, by a need for familiarity, and a product with support and knowledge widely distributed across the tech world. This makes MySQL a no-brainer for building a v1 or minimally viable product, and, often, this DB engine remains powering apps with very high traffic. 

This has been the case with organizations like Pinterest, which has come up with innovative approaches to improve MySQL performance as traffic to its network grew over the years. MySQL also excels when you have a schema design that will generally stay consistent as your app evolves.

MongoDB Query Language vs SQL

Given that the NoSQL paradigm is built on storage of documents in collections, MongoDB does not use traditional tables and associated concepts. Instead, the MongoDB database management system comes with its own nomenclature and ways of doing things that are different, but easy enough for teams used to MySQL to understand.

Here is how these two approach data storage in terms of the concepts they use.

MongoDB Entity MySQL Entity
database database
document / BSON document row
collection table 
field column
index index

 

Querying databases using these two systems involves two completely separate languages. Both, however, are declarative and succinct. Unlike MySQL, which uses SQL, MongoDB uses JavaScript as its query language, making it easier for developers who already know JavaScript to start writing compliant queries with a minimal learning curve. Other programming languages are available for writing MongoDB queries using language-specific drivers such as those for Python, Go, or Scala.

Here is an example of MongoDB and MySQL query languages in use to accomplish an identical task.

Use Case: Creating a Collection/Table and Inserting a Document/Row

Here is how you create a new Collection in MongoDB:

db.createCollection("people")

In practice, this explicit creation of a new Collection is unnecessary since MongoDB does this for you automatically when you save the first document to a database. So you can just insert your first document and the collection is created automatically. Here’s the code for inserting a document:

db.people.insertOne( {

    user_id: "abc123",

    age: 55,

    status: "A"

 } )

Here’s how we would implement this in MySQL, using Structured Query Language.

CREATE TABLE people (

    id MEDIUMINT NOT NULL

        AUTO_INCREMENT,

    user_id Varchar(30),

    age Number,

    status char(1),

    PRIMARY KEY (id)

)

This query creates our table, then, to save a new record, we’d run the following query in MySQL:

INSERT INTO people(user_id,

                  age,

                  status)

VALUES ("abc123",

        55,

        "A")

Database Performance and Speed

While both of these open-source database management systems provide robust performance, MongoDB is easier to get good performance with, even in the absence of advanced database tuning. MySQL can perform just as fast or faster depending on your use case, but to get the higher levels of MySQL performance, you will likely need advanced tuning knowledge.

MongoDB is so fast out of the box due to its storing entire entities in a single document. This means that data for a record is not spread across numerous relational database tables. 

To fetch or update information related to an entity, the database reads and writes to only a single place in the database. There is no need for SQL joins to try and retrieve the related data, and optimizing query performance becomes easier for developers as well. 

Developer productivity is another factor where MongoDB enjoys an advantage, partly because of MongoDB’s ease of use, and the use of JavaScript in database queries.

MongoDB tends to perform better than MySQL as your databases grow. When your applications have to handle large sets of unstructured data, the performance gains you can gain from MongoDB can be significant. This is partly due to the engineering behind MongoDB, which emphasizes a high insertion rate for a large volume of data rather than maximum transaction safety. 

MySQL Database, while not as performant in terms of raw insertion speed, excels in terms of ACID-compliance. ACID (atomicity, consistency, isolation, durability) denotes that transactions are processed reliably, allowing, for example, an entire transaction to fail rather than execute a write that only partially succeeded. MySQL and SQL databases have long had the edge in this space, however, since Version 4, MongoDB has supported ACID transactions similar to SQL databases.

Scaling, Sharding, and Availability

When it comes to scalability and availability, MongoDB enjoys deep advantages due to its use of replica sets to enable high availability of data, as well as other core design decisions. Replica sets in MongoDB enable nodes to hold the same data, allowing fast failover with the automatic election of a new primary node. All this typically happens within a span of just five seconds at the most. 

MySQL failover is not as smooth and requires far more input from your operations team to bring up a replacement if one of your database nodes goes down. In contrast, MongoDB comes with native support for sharding your data across multiple MongoDB nodes. 

This means you can scale cheaply and infinitely on commodity hardware, lowering the costs for scaling your app. Shards in MongoDB require no change to the application, whereas the relational nature of databases in MySQL, requiring JOINs for many operations, makes locating data from a table on separate shards complex. 

Scaling MySQL effectively involves various pitfalls your team has to be able to navigate in order to solve performance bottlenecks and implement sharding properly. Schema migration is a challenge due to the set-in-stone nature of MySQL schemas, with join operations and triggers adding to the complexity of migration. 

While auto-sharding is not available, sharding is possible and can greatly enhance your MySQL performance. MySQL has support for master-slave replication, as well as master-master replication. You can tune your MySQL performance and availability with some of these steps:

  • Use master-master replication with a backup host to provide failover
  • As database server load increases, add capacity by upgrading the database server as required, adding RAM, faster hard drives, storage space.
  • Add new ranges for your database shards to allow a higher number of shards.
  • To scale even further, consider moving shards to new machines.    

If you want even better scalability and high availability with MySQL, you can consider using MySQL Cluster, a release of MySQL built for linear scalability, with built-in replication between clusters across multiple geographical sites.

Commercial Support and Community Support Options 

The quality of support is an important consideration for selecting a database management system. In this respect, MongoDB and MySQL are both excellent picks, with ample support from their respective communities and commercial support available.

MongoDB users can get free support through a variety of avenues, including the official MongoDB docs, as well as a community forum where users hang out. Another excellent resource is the free MongoDB University, which offers free courses for both database administrators and developers. 

If you need paid support, it is provided by MongoDB, Inc, the public company that develops and maintains MongoDB. Three subscription tiers are available: MongoDB Enterprise Advanced, MongoDB Professional, and Development Support.   

For MySQL, you can access the knowledge of the community through the MySQL Forums. Detailed documentation and guides are also available. For even more in-depth training, you can purchase MySQL courses from Oracle University. Oracle also offers paid MySQL support. With the help of MySQL Support Engineers, you can cut down on the time it takes to resolve issues and clear blockers in your MySQL deployments.

Which Is Right For Your Business? 

While the prevalence of these database systems at various tech companies handling very large tech deployments indicate that they are about equally capable, each one has areas where it performs best. MongoDB tends to do a stellar job when you are concerned most about the ability to scale easily with a native horizontal scaling architecture with easy sharding. 

If your application calls for a flexible schema and you want the open-ended development approach where you can store unstructured data sets at will, MongoDB will be a much stronger solution for your needs. MongoDB also stands out for high availability and quick, instant failover and recovery.

MySQL, on the other hand, stands out for its ability to handle a high transaction rate and ensure data consistency. It has a reliable, privilege-based security model. It is an excellent fit when you have a stable, fixed schema and don’t plan to store much unstructured data. 

As the world’s most popular database management system, MySQL has a plethora of resources readily available for digging deeper into any aspect of the DBMS. It’s a system that has been tested in a wide variety of industries and should be able to hold up well no matter what you throw at it. 

The widespread adoption of MongoDB, however, means that the NoSQL approach is often as compelling, if not more appealing, an option for new projects going forward.

Conclusion

The MySQL vs. MongoDB argument is in many ways a relational vs. NoSQL argument. Each one excels in certain scenarios and is ill-suited for others. And because of their very different structures each one contains features not found in the other – see more here

References


Related posts

Learn more about the latest issues in cybersecurity