I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions
- What is Elasticity?
- What is Scalability?
- How ACID properties vary from NOSQL Concepts?
- What are the prevailing problems in the current database system architectures?
- Why is NuoDB an innovative and welcome change in database paradigm?
This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand).
What are resources in this context?
Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node.
When it is about adding a new node there are two questions which comes to mind.
1) Can we add another node to our software system?
2) If yes, does adding new node cause downtime for the system?
Let us assume we have added new node, let us see what the new needs of the system are when a new node is added.
- Balancing incoming requests to multiple nodes
- Synchronization of a shared state across multiple nodes
- Identification of “downstate” and resolution action to bring it to “upstate”
Well, adding a new node has its advantages as well. Here are few of the positive points
- Throughput can increase nearly horizontally across the node throughout the system
- Response times of application will increase as in-between layer interactions will be improved
Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource.
When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling) workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload.
In the context of database/applications, scalability means three main things
- Ability to handle normal loads without pressure
E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput
- Ability to scale up to expected peak load which is greater than normal load with acceptable response times
- Ability to provide acceptable response times across the system
E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time
The Issue – Need of Scale
In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely.
Today we have different options to achieve scalability:
- Read & Write Separation
The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes.
For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code.
- Using NOSQL solutions
The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store.
The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues?
Hybrid Deployment – Mac, Linux, Cloud, and Windows
One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues.
An Ideal Database Ability List
- A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources
- A system which does not compromise on the ACID guarantees and require developers to learn new paradigms
- A system which does not force fit a new way interacting with database by learning Non-SQL dialect
- A system which does not force fit its mechanisms for providing availability across its various modules.
Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it.
Reference: Pinal Dave (http://blog.SQLAuthority.com)