NoSQL refers to non-relational databases designed to meet the technological needs of modern organizations. Widely recognized for their ability to scale, these databases are gaining more and more followers in 2022. Find out everything you need to know about NoSQL.
According to the Forbes Global 2000 ranking, more and more companies are turning to NoSQL technology. This interest stems from the fact that traditional relational databases (RDBs) can no longer meet all the needs of modern businesses. But what is NoSQL, and how did it become so successful? This guide has the answers.
What is NoSQL?
NoSQL is a set of database technologies based on a different model than relational databases. Relational databases struggle to scale and to keep pace with the agility that current applications demand. NoSQL addresses these problems, and the technology offers many more advantages besides.
Why use NoSQL?
Relational databases were born long before the Internet, the cloud, Big Data and mobile technologies. They are therefore no longer able to fully meet the requirements of today’s businesses. This is why non-relational (NoSQL) databases have succeeded in making their mark.
In relational databases, data is represented as tables composed of rows and columns. Multi-value data is modeled as multiple rows in the same table, and associated data takes the form of rows in different tables. To read or write information in a relational DB, it is therefore necessary to disassemble objects into rows and reassemble them from the tables.
NoSQL works differently. A NoSQL DB models data as objects and multi-value data as collections. Associated data takes the form of nested objects or collections.
NoSQL therefore handles data in the most natural way possible: as it is, without any transformation.
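The difference between the two models can be sketched in a few lines of Python. This is only an illustration: the customer record, its field names and the three "tables" are hypothetical, and plain lists of tuples stand in for a real relational engine.

```python
# Hypothetical customer record, used only for illustration.
customer_doc = {
    "id": 1,
    "name": "Ada",
    "emails": ["ada@example.com", "ada@work.example"],  # multi-value data as a collection
    "orders": [                                          # associated data as nested objects
        {"order_id": 10, "total": 25.0},
        {"order_id": 11, "total": 40.0},
    ],
}

def to_relational_rows(doc):
    """Disassemble one document into rows for three hypothetical tables."""
    customers = [(doc["id"], doc["name"])]
    emails = [(doc["id"], email) for email in doc["emails"]]
    orders = [(o["order_id"], doc["id"], o["total"]) for o in doc["orders"]]
    return customers, emails, orders

def from_relational_rows(customers, emails, orders):
    """Reassemble the document by matching rows on the customer id (a manual join)."""
    cust_id, name = customers[0]
    return {
        "id": cust_id,
        "name": name,
        "emails": [email for cid, email in emails if cid == cust_id],
        "orders": [
            {"order_id": oid, "total": total}
            for oid, cid, total in orders
            if cid == cust_id
        ],
    }

rows = to_relational_rows(customer_doc)
rebuilt = from_relational_rows(*rows)
```

In the document form, `customer_doc` is stored and read back as a single unit. The two helper functions represent the extra work that the relational layout implies.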
Relational DBs usually run on a single server. When the size of the database increases, the capacity of this server must also be increased. This is referred to as vertical scalability. Horizontal scalability, by contrast, consists of increasing the number of servers to keep up with the growth of the database.
Although it is possible to shard a SQL database across multiple servers, this creates technical problems that are difficult to overcome. The relational DB must first be deployed on several servers, and those servers must then be made to work together as a single server.
This manual sharding is generally not handled natively by a relational DB. Moreover, it risks compromising the transactional integrity of the DB.
With a NoSQL DB, sharding is automatic. In other words, the DB is deployed across the various servers automatically, as are the distribution of data and the routing of requests. Moreover, if a server in the cluster fails, the system automatically replaces it with another server, without disruption or interruption of service.
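As a rough sketch of what automatic sharding means, the snippet below routes each key to a server by hashing it. The server names and keys are hypothetical, and the hash-modulo rule is an assumption chosen for brevity; production systems typically use more elaborate schemes, such as consistent hashing, so that fewer keys move when the set of servers changes.

```python
import hashlib

def shard_for(key, servers):
    """Pick a server by hashing the key (stable across runs, unlike built-in hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["node-a", "node-b", "node-c"]          # hypothetical server names
keys = ["user:1", "user:2", "user:3", "user:4"]   # hypothetical record keys

# Every key is placed without any manual decision by the application.
placement = {key: shard_for(key, servers) for key in keys}

# If a server fails, requests are routed using the surviving servers only.
survivors = [s for s in servers if s != "node-b"]
rerouted = {key: shard_for(key, survivors) for key in keys}
```

The application only ever calls `shard_for`; it never needs to know how many servers exist or which one holds a given key.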
In relational databases, a separate data cache is often required to improve read performance. However, it has no effect on write performance. Worse, this extra layer gives more work to the technical teams and complicates the system.
In a NoSQL DB, caching is built directly into the technology. It keeps as much data as possible in system memory. In addition, it improves read performance without degrading the overall performance of the database.
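The idea of a cache built into the store itself can be illustrated with a small in-memory model. The `CachedStore` class below is hypothetical and is not how any particular product implements caching: a dictionary stands in for persistent storage, and a least-recently-used policy is assumed for eviction.

```python
from collections import OrderedDict

class CachedStore:
    """Toy store with a built-in read-through cache (LRU eviction assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.disk = {}              # stands in for persistent storage
        self.cache = OrderedDict()  # in-memory cache, least recently used first
        self.disk_reads = 0

    def put(self, key, value):
        self.disk[key] = value
        self._remember(key, value)

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)   # mark as recently used
            return self.cache[key]
        self.disk_reads += 1              # cache miss: fall back to storage
        value = self.disk[key]
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

store = CachedStore(capacity=2)
store.put("a", 1)
store.put("b", 2)
store.put("c", 3)        # "a" is evicted from the cache, but stays on disk
hit = store.get("c")     # served from memory
miss = store.get("a")    # requires one read from storage
```

Because the cache lives inside the store, the calling code uses `put` and `get` only and never manages a separate caching layer.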
Automatic replication and availability
NoSQL databases perform automatic replication to ensure uninterrupted service during unplanned failures or maintenance. In addition, some of them can perform an automatic failover when a node is unavailable. The DB can therefore continue to function by sending requests to another node.
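Replication and failover can be sketched with a toy cluster. The `ReplicatedStore` class and the node names are hypothetical, and the model assumes that every write is copied to all available nodes, which is a simplification of what real systems do.

```python
class ReplicatedStore:
    """Toy cluster: each write goes to all live nodes, reads skip failed nodes."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.up = set(node_names)

    def write(self, key, value):
        # Copy the value to every node that is currently available.
        for name in self.up:
            self.nodes[name][key] = value

    def read(self, key):
        # Try nodes in their original order and use the first one that is up.
        for name in self.nodes:
            if name in self.up:
                return name, self.nodes[name][key]
        raise RuntimeError("no replica available")

    def fail(self, name):
        self.up.discard(name)

cluster = ReplicatedStore(["n1", "n2", "n3"])   # hypothetical node names
cluster.write("session:7", "active")
cluster.fail("n1")                               # simulate a node failure
served_by, value = cluster.read("session:7")     # the read is served by another node
```

The caller issues the same `read` before and after the failure, which is the behavior the paragraph above describes.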
In addition, distributed NoSQL databases are able to automatically distribute data across multiple geographic areas. This capability provides a triple benefit to organizations. First, it allows them to comply with data localization constraints. Second, it guards against regional outages.
Third, it improves the performance of the system by reducing latency: thanks to the distributed nature of NoSQL DBs, read and write operations can be executed on the closest nodes.
Finally, thanks to replication, distributed NoSQL DBs allow critical applications to run 24/7. This is a need that relational DBs struggle to satisfy, since these systems typically run on a single server or on a cluster with a single shared storage.
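Routing operations to the closest node can be pictured as picking the available region with the lowest latency. The region names and latency figures below are invented for illustration only.

```python
# Hypothetical round-trip latencies, in milliseconds, from one client to each region.
latency_ms = {"eu-west": 12, "us-east": 85, "ap-south": 140}
available = {"eu-west", "us-east", "ap-south"}

def closest_region(latency_ms, available):
    """Return the available region with the lowest latency."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in available}
    if not candidates:
        raise RuntimeError("no region available")
    return min(candidates, key=candidates.get)

primary = closest_region(latency_ms, available)

# If the nearest region becomes unavailable, the next closest one takes over.
fallback = closest_region(latency_ms, available - {"eu-west"})
```

The same function covers both benefits mentioned above: low latency in normal operation and continued service during a regional outage.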
What are the types of NoSQL databases?
There are four main types of NoSQL databases, each designed to address a specific problem.
Key-value databases are the simplest. Each element of the DB is stored as an attribute name (the key) together with its corresponding value. Thanks to this simplicity, key-value DBs can efficiently handle a large number of queries.
They can also process a large amount of data and are well suited to storing user profiles. Key-value databases include DynamoDB, Berkeley DB and Redis.
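A key-value store can be modeled in a few lines. The `KeyValueStore` class below is a hypothetical in-memory sketch and does not reflect the API of DynamoDB, Berkeley DB or Redis. The value is treated as an opaque string, here a JSON-encoded user profile.

```python
import json

class KeyValueStore:
    """Minimal in-memory key-value store: the value is opaque to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

profiles = KeyValueStore()

# Store a user profile under a single key (hypothetical key format).
profiles.put("user:42", json.dumps({"name": "Ada", "language": "en"}))
profile = json.loads(profiles.get("user:42"))

profiles.delete("user:42")
missing = profiles.get("user:42")   # None once the key is gone
```

Every operation is a direct lookup by key, which is why this model handles large numbers of simple queries efficiently.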
Document-oriented databases associate a key with a complex data structure called a document. Each document can contain several key-value pairs, several key-array pairs or even other nested documents.
Document-oriented DBs use formats such as JSON and XML to represent structured documents. These databases have the advantage of avoiding joins to reconstruct information, since everything is included in the document structure. Document-oriented DBs include MongoDB and CouchDB.
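Here is what such a document might look like in JSON, with a small helper to read a nested value. The document content and the `get_path` helper are hypothetical and only illustrate the structure; real document databases provide their own query languages.

```python
import json

# A hypothetical blog post stored as one document: key-value pairs,
# a key-array pair ("tags") and a nested document ("author").
raw = """
{
  "id": "post-1",
  "title": "Hello NoSQL",
  "tags": ["intro", "nosql"],
  "author": {"name": "Ada", "city": "Paris"}
}
"""
doc = json.loads(raw)

def get_path(document, path):
    """Follow a dotted path such as 'author.city' through nested documents."""
    current = document
    for part in path.split("."):
        current = current[part]
    return current

city = get_path(doc, "author.city")
second_tag = doc["tags"][1]
```

The author's details are read directly from the post, with no join against a separate authors table.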
Column-oriented databases store data by column. These databases are designed to efficiently handle queries on large data sets.
Column-oriented DBs use what is called a keyspace. It contains all the column families, each of which holds rows, which in turn contain columns. This representation is similar to that of a data schema in a relational model. Cassandra and HBase are among the column-oriented databases.
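The nesting of keyspace, column families, rows and columns can be pictured with nested dictionaries. This is a conceptual sketch only: the family names, row keys and columns are hypothetical, and it says nothing about how Cassandra or HBase lay data out on disk.

```python
# keyspace -> column family -> row key -> columns
keyspace = {
    "users": {
        "user:1": {"name": "Ada", "city": "Paris"},
        "user:2": {"name": "Linus"},            # rows need not share the same columns
    },
    "page_views": {
        "2022-01-01": {"home": 120, "pricing": 30},
        "2022-01-02": {"home": 95},
    },
}

def read_column(keyspace, family, row_key, column):
    """Read a single column of a single row, or None if the column is absent."""
    return keyspace[family][row_key].get(column)

def scan_column(keyspace, family, column):
    """Collect one column across all rows of a family that contain it."""
    return [cols[column] for cols in keyspace[family].values() if column in cols]

city = read_column(keyspace, "users", "user:1", "city")
home_views = scan_column(keyspace, "page_views", "home")
```

`scan_column` reads only the requested column from each row, which hints at why this layout suits queries over large data sets.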
Graph-oriented databases are designed to store networks of data such as social networks. These databases are based on graph theory: they use a structure of nodes and links to model and store information.
Graph-oriented DBs offer strong performance for traversing highly connected data, because following links in a graph avoids multiple joins. They also allow simple development and easy modeling. Neo4j is a graph database, and Apache Giraph, a graph processing framework, is often cited in the same family.
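To see why links avoid joins, consider a "friends of friends" query on a small social network. The sketch below uses a plain adjacency dictionary and a breadth-first traversal; the names are invented, and real graph databases expose dedicated query languages rather than this hypothetical `within_hops` helper.

```python
from collections import deque

# Hypothetical social network: each node lists the nodes it is linked to.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each reachable node to its distance from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue                      # do not explore beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return {node: hops for node, hops in distance.items() if node != start}

reachable = within_hops(graph, "alice", 2)
```

Each hop is a direct lookup of a node's links, whereas a relational model would need one self-join on a friendships table per hop.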
What are the advantages of NoSQL databases?
Handling massive data
With the advent of Big Data, applications and services are taking on a staggering amount of information, which can reach terabytes of data or more. These systems must also maintain good performance when reading, writing and storing data. NoSQL DBs are well suited to meeting these needs.
Furthermore, a NoSQL DB is flexible enough to adapt quickly to continuously growing data volumes. Such a distributed database gains resources simply by adding more servers. Read, write and storage capacity is thus distributed among the servers to allow operation at any scale.
Finally, NoSQL technology is able to handle dynamic data (which evolves over time) as well as structured, semi-structured or unstructured data.
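The effect of adding servers on the load carried by each one can be illustrated with a quick count. The record identifiers are hypothetical and the hash-modulo placement is an assumption made for simplicity.

```python
import hashlib
from collections import Counter

def load_per_server(keys, server_count):
    """Count how many keys land on each server under simple hash-modulo placement."""
    load = Counter()
    for key in keys:
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        load[int(digest, 16) % server_count] += 1
    return load

keys = [f"record:{i}" for i in range(1000)]   # hypothetical record identifiers

load_two = load_per_server(keys, 2)    # the same data spread over 2 servers
load_four = load_per_server(keys, 4)   # and over 4 servers
```

Comparing `max(load_two.values())` with `max(load_four.values())` shows the busiest server holding fewer records once the cluster grows, while the total number of records stays the same.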
Agile application development
Relational DBs generally require a data schema to be defined before information can be stored. This approach fits poorly with agile development processes and makes it harder to accommodate evolving requirements.
Each time a new feature is added, the data schema may require modification, and each update of the DB schema can cause interruptions. As the number of iterations increases, the disruption to the development cycle grows.
A NoSQL DB, by contrast, is designed to accommodate changes without interrupting the service. The development cycle is therefore faster and code integration is more reliable. NoSQL models data as objects and leaves the modeling of that data to developers, at the application level.
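What application-level modeling looks like in practice can be sketched as follows. The user records and the `newsletter` field are hypothetical; the point is that older and newer records coexist in the same collection, and the application decides how to interpret a missing field.

```python
# Records written by two successive versions of a hypothetical application.
users = [
    {"id": 1, "name": "Ada"},                         # version 1: no newsletter field
    {"id": 2, "name": "Linus", "newsletter": True},   # version 2 added the field
    {"id": 3, "name": "Grace", "newsletter": False},
]

def wants_newsletter(user):
    """The application, not the database, supplies the default for older records."""
    return user.get("newsletter", False)

subscribers = [user["name"] for user in users if wants_newsletter(user)]
```

No migration of existing records was needed to introduce the new field, which is what keeps iterations short.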
What are the drawbacks of NoSQL databases?
Many NoSQL databases do not offer the full ACID properties (Atomicity, Consistency, Isolation, Durability) that relational DBs provide, or offer them only in a limited scope. In those cases, NoSQL does not guarantee reliable execution of transactions, and developers have to write their own code to obtain ACID-like behavior.
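To give an idea of the kind of code this implies, the sketch below makes a two-step update all-or-nothing at the application level. It is deliberately limited: it covers atomicity only, within a single process and on an in-memory dictionary, and it provides neither isolation between concurrent clients nor durability. The account names and the `transfer` function are hypothetical.

```python
import copy

class InsufficientFunds(Exception):
    pass

def transfer(accounts, source, target, amount):
    """Apply both balance updates or neither: work on a copy, then swap it in."""
    draft = copy.deepcopy(accounts)
    draft[source]["balance"] -= amount
    if draft[source]["balance"] < 0:
        raise InsufficientFunds(source)   # the original data is left untouched
    draft[target]["balance"] += amount
    accounts.clear()
    accounts.update(draft)                # commit all changes together

accounts = {"a": {"balance": 100}, "b": {"balance": 20}}

transfer(accounts, "a", "b", 30)          # succeeds: a=70, b=50

try:
    transfer(accounts, "a", "b", 500)     # fails: balances must stay unchanged
    failed = False
except InsufficientFunds:
    failed = True
```

Even this small example shows how much care a single guarantee requires, which is why the lack of built-in transactions counts as a drawback.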
In addition, NoSQL is not compatible with SQL, even though some NoSQL DBs use a structured query language. Finally, NoSQL is relatively young compared to relational DBs. It is therefore generally less mature and offers fewer features than its older sibling.