
What is MongoDB?

By Michael Calabrese, Senior Programmer

You have heard "NoSQL" in the wind. Your developers are whispering that document databases are cool. Well, we at Lunar Logic have begun building projects using MongoDB, a NoSQL, document-based database. Over a series of articles, I plan to cover this new technology to give you an idea of how it works, as well as its pros and cons. First, a little background on where we've come from.

Almost all databases now in use are relational (SQL) databases. These systems organize data into tables and rows. If you use spreadsheets, a database is like a workbook and each table is like a single sheet. Data are stored in rows, and every row in a table has the same format. The fields for your data are like the columns of the spreadsheet. For example, a beginning address database might look like:

Name Occupation Address1 Address2 Spouse Child1 Child2

This is an intuitive way to think about how you might keep an address book. All the data is organized in an easy way. In a spreadsheet it's very easy to add more columns if you need more addresses and children. But what happens when you need to add another address or another child in an SQL database?

SQL databases handle this in one of two ways. The first is to add a new column whenever a person has more addresses or children than the table has room for. If you have a program looking at the data, you also have to add a new control for each added address or child column. This is not very flexible, so SQL databases generally handle the problem a second way.

The “proper” way is to split repeating data into separate tables that are related. So you would have a main table of People and related tables of Addresses and Children. This would allow any person to have any number of addresses or children, and would look something like this:

People Table: PersonID Name Occupation Spouse

Addresses Table: PersonID Address

Children Table: PersonID Child

On the program side of things, you'd have a widget that could simply add another address or child to the displayed person. This solves the problem in SQL databases, but we incur a cost of joining those related tables together (by the Person's ID) to display all of the data for a person together. With MongoDB we can avoid that.
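The related-tables layout and its join can be sketched with Python's built-in sqlite3 module. This is only an illustration: the snake_case table and column names, the sample rows, and the helper functions are assumptions made for this example, not part of any real project.

```python
import sqlite3


def build_address_book():
    """Create the three related tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE people (
            person_id INTEGER PRIMARY KEY,
            name TEXT,
            occupation TEXT,
            spouse TEXT
        );
        CREATE TABLE addresses (person_id INTEGER, address TEXT);
        CREATE TABLE children (person_id INTEGER, child TEXT);
        """
    )
    # Hypothetical sample data: one person with two addresses and two children.
    conn.execute(
        "INSERT INTO people VALUES (1, 'Pat Example', 'Baker', 'Sam Example')"
    )
    conn.executemany(
        "INSERT INTO addresses VALUES (?, ?)",
        [(1, "1 Main St"), (1, "2 Lake Rd")],
    )
    conn.executemany(
        "INSERT INTO children VALUES (?, ?)",
        [(1, "Alex"), (1, "Robin")],
    )
    return conn


def addresses_for(conn, person_id):
    """Join people to addresses by the person's ID."""
    rows = conn.execute(
        """
        SELECT p.name, a.address
        FROM people AS p
        JOIN addresses AS a ON a.person_id = p.person_id
        WHERE p.person_id = ?
        ORDER BY a.rowid
        """,
        (person_id,),
    )
    return rows.fetchall()


def children_for(conn, person_id):
    """Join people to children by the person's ID."""
    rows = conn.execute(
        """
        SELECT p.name, c.child
        FROM people AS p
        JOIN children AS c ON c.person_id = p.person_id
        WHERE p.person_id = ?
        ORDER BY c.rowid
        """,
        (person_id,),
    )
    return rows.fetchall()
```

Showing everything about one person takes a lookup in People plus a join for each related table, and that extra work is the cost described above.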

Document databases like MongoDB handle this problem in a way that is much closer to the first intuitive organization of the data. Data are thought of as a collection of documents instead of a table of identical looking records. The documents in a collection don't have to look anything like one another. For example:

Person Collection:

Document 1: Name Spouse

Document 2: Name Child Spouse Occupation
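As a rough sketch of the same idea, here are two documents of different shapes written as plain Python dictionaries sitting in one list that stands in for the collection. The names and values are made up for illustration, and no database is involved.

```python
# A list standing in for the Person collection. Each document is a dict,
# and the two dicts do not share the same set of fields.
person_collection = [
    {"name": "Pat Example", "spouse": "Sam Example"},
    {
        "name": "Lee Sample",
        "child": "Robin",
        "spouse": "Kim Sample",
        "occupation": "Teacher",
    },
]


def field_names(collection):
    """Return the sorted field names of each document in the collection."""
    return [sorted(document.keys()) for document in collection]
```

Calling field_names(person_collection) returns a different list of fields for each document, which a single SQL table would not allow without leaving columns empty.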

Not only is the document not locked into a set format, it may also contain a list. With a list, data can be easily added and removed. With this in mind, we can build, from the first sketch of data, a Person document that generally looks like this:

Name Spouse Occupation

Addresses List: Address1 Address2 ...

List Of Children: Child1 Child2 ...

Here we get all of the advantages of SQL in keeping the data organized, while still allowing for multiple addresses and children. We gain another advantage in that it is all in one packet, so there is no “join penalty” for getting the data back from separate tables. This can make MongoDB faster than SQL when reading this type of information.
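The Person document with its two lists can be sketched as a Python dictionary. The helper names new_person, add_address and add_child are hypothetical and exist only for this example. With a driver such as PyMongo, a dictionary of this shape is the kind of value you would hand to a collection's insert_one method, but the sketch itself stays in plain Python.

```python
def new_person(name, spouse=None, occupation=None):
    """Build a Person document with empty address and child lists."""
    return {
        "name": name,
        "spouse": spouse,
        "occupation": occupation,
        "addresses": [],
        "children": [],
    }


def add_address(person, address):
    """Add an address to the document; no new column or table is needed."""
    person["addresses"].append(address)


def add_child(person, child):
    """Add a child to the document's list of children."""
    person["children"].append(child)


# Hypothetical sample data.
pat = new_person("Pat Example", spouse="Sam Example", occupation="Baker")
add_address(pat, "1 Main St")
add_address(pat, "2 Lake Rd")
add_child(pat, "Alex")
```

Everything about Pat now lives in the single pat document, so reading it back needs no join.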

MongoDB and SQL organize data in different ways, and from these basic differences come the pros and cons of each platform. Next time I will delve into the different thought process behind building on MongoDB and some of the troubles that you need to watch out for.
