Choose how to store data wisely - Every dev ever.

We have come so far in the field of managing our data physically. From storing them as files, tables and as documents, and now it's even possible to store it as graphs. Not only the way data is stored has evolved, but the techniques have also become pocket friendly.

So, if the cost is not the issue then what

It's how your applications use that data.

Some applications need faster retrieval. Some need ACID assurance. Some need the latest write in the latest read. All these things are primary and should be considered when you're choosing your data storage medium.

If you're looking for help, follow this guide, mark your points and you will eventually get the right DB for you.

People are generally aware of two types of databases:

SQL
NoSQL

...but in this guide, we will draw a comparison between three types:

SQL
NoSQL
Graph( Also a type of NoSQL but needs some special attention, keeping its unique characteristics in mind.)

Let's compare.

Storage

Let's see how each DB stores an entity(real-world object) and its attributes(properties of that object):

SQL

SQL stores Entities as Tables and attributes as columns in that table.

NoSQL

No SQL stores Entities as either Document, Key-Value store, Table, etc. and attributes as Keys in the document, Keys in a key-value store and Columns in the table.

Graph

In Graphs, Entities are stored as Nodes and attributes are assigned as attributes to these nodes.

Retrieval

Fetch queries are the ones that decide the fate of your application. If the retrieval of data is fast, your application is fast and if retrieval is slow, you know you have to optimize the query or regret choosing that database.

SQL

SQL itself is quite fast but things get complicated when the data you have to retrieve is split across multiple tables and, in worst-case scenarios, across multiple databases. You are forced to use joins which, if happens on more than 2 tables, reduces your response time to a great extent. That's why, gone are the days of normalizing your data model. We are fine with redundancy if that improves the performance of queries.

NoSQL

NoSQL didn't change the way of storing data just for the sake of its name. It did that in order to avoid heavy use of joins. NoSQL emphasizes storing the related data in the same object, instead of referencing it using a foreign key, though we will have data redundancy, which in turn doesn't give a dime about the cost of storage (not that it matters nowadays ~ cheap storage).

Graph

"I am speed" is what a Graph based database says, but we have to keep in mind the limitation of its use cases. Graph database fails to deliver in applications where we fetch range based data. But for any social platform, analytics website, security-management applications and the sorts, Graph database is the ideal for faster retrieval of results.

ACID Compliance

When dealing with transactions (operations on multiple entities at the same time), ACID compliance is a must for databases.

ACID stands for

Atomicity - Either all operations in the transaction take place or none at all.

Consistency - Database to remain consistent before and after the transaction.

Isolation - Operations between two or more transactions are independent of each other.

Durability - Doesn't lose the effect of successful transaction even in case of system failure

SQL

The reason why SQL is most widely used is not only because of the initial investment in the development of SQL, but also because it is ACID compliant.

NoSQL

NoSQL made an entry in the industry with the tradeoff between faster retrieval and ACID compliance. NoSQL, originally, is non-ACID-compliant but there are some databases like MongoDB, which have tried to make it ACID-compliant in their newer releases.

-Graphs

Graphs are already a kind of NoSQL, so you can't expect them to be ACID compliant either. But there are some Graph databases too which are making effort towards it. One example would be the Neo4j. They already claim their Graph database is ACID compliant.

Scaling

Does this database scale well? It is already the first thing that pops up in your mind when choosing the stack for your application. It's obvious that we have to be foresighted about future databases. Otherwise, migrating data to another database at a later stage will give you a tough time.

There are 2 types of scaling

Vertical Scaling - Increase the capacity like memory, storage, and bandwidth of the network on that machine. Horizontal Scaling - Use another instance of a database on another server. (Distributed database)

SQL

SQL is vertically scalable. As you can see clearly, we can't go on increasing the limit of the existing storage infinitely. Each machine has some limitations and that is the scaling limit of our SQL databases.

NoSQL

NoSQL is a database that scales well using the latter approach. This scaling comes at a compromise of consistency according to CAP (only two things can be achieved amongst Consistency, Availability and Partial Tolerance in a distributed data environment) theorem.

Graph

Since it is a kind of NoSQL database, we expect the graph to be horizontally scalable as well. Neo4j and ArangoDB are examples of such a graph database.

Use Cases

SQL

Go for SQL if:

Your queries don't get complicated with multiple joins.
You need robust ACID support.
Horizontal scaling is fine with your application.
NoSQL

NoSQL can be considered at times where:

You know the queries will get complicated with multiple joins in SQL and,
Your application doesn't have a need for ACID transactions (though some databases are starting to support multiple ACID transactions nowadays).
Graph

Graphs are ideal:

In an application where an object has a hierarchy of nested related objects. The graph database would be faster than any other database and the best option.
To analyze data. So, you can expect the graph database to be extremely helpful in the case of data analysis application and predicting future trends.

Conclusion

With the advancement in each type of database. It is possible that the shortcomings they have now won't exist in the future or they might have an alternate solution. If you can avail all the functionalities that your application needs in one database, you must just check the stability of the version and if it's fine, you are good to go.

If you are able to choose your database using this article, go ahead and leave some likes and comments.

Also, focus on data modeling now. 😄

Thank you for reading.

Choosing Databases: The Wise Way

Storage

Retrieval

ACID Compliance

Scaling

Use Cases

Conclusion

Comments

Misc.

Algolia - Make Every Search Strategic

More from this blog

Context Window to Knowledge Graph: The Evolution of Memory in Language Models

Why Systems Slow Down and What Smart Caching Teaches Us About Scalability

Designing A Real-Time AI Pipeline For Human-like Video Conversations

The Three Pillars of Digital Trust: Encoding, Hashing, and Encryption

Strategies for Data Privacy and Regulatory Compliance in React Native Development

Command Palette

Storage

Retrieval

ACID Compliance

Scaling

Use Cases

Conclusion

Comments

Misc.

Algolia - Make Every Search Strategic

More from this blog