When I wrote the blog around Understanding Database Scalability – Vertical and Horizontal, I saw few have conversations with me on if it is this or that. And these discussions could take a course all by itself. There is no one size fits all when it comes to scalability. We need to understand the fine prints and details of how the application needs are, what are the restrictions, user requirements and much more.
When designing an application, you can use a combination of partitioning options—scale-up and scale-out are not mutually exclusive. When you combine vertical and horizontal elastic scale capabilities in an application’s architecture, you can expect more complexity during the design and development stages, but you can offset that upfront investment with savings in operational costs. In addition, your application benefits from the flexibility to adapt quickly to changing demands.
Here are some possibilities and considerations, I see when it comes to working on such architectures:
- You can designate one shard as the so-called “hot shard,” which is hosted on a server with fixed resources, then scaled up or down to accommodate changes in workload. In a content management system, for example, a hot shard can be used to maximize throughput for updates and searches during the day, when users access millions of documents. At night, the day’s data is moved to cold shards.
- When the business need is to simplify application development and data management, you should use a scale-up approach. Even though acquiring and provisioning big, scale-up systems can add complexity, this approach works well for reactive situations when the benefit of boosting resources outweighs the cost of scaling out.
- When operational cost is a factor, you should use a scale-out approach, because application development costs and effort tend to increase, whereas the costs and effort of procuring computing and storage are typically acceptable.
- For more complex scale-out scenarios, multi-layer sharding is possible in theory but difficult to program and maintain. You can use multiple layers of horizontal sharding to partition a tenant’s data into multiple shardlets and shards, which are hosted by multiple physical database servers. For instance, in a multitenant solution, a tenant can reside in its own data store or share a data store with other tenants. A tenant can even be spread across multiple data stores if another attribute is used to partition the data.
As you can see, there are no set rules when it comes to defining Shards and scaling in databases. I have personally seen people choose different solutions and in the current age, people are exploratory when it comes to designing solutions that are out-of-box. Most importantly, they want the systems to be automated and self-healing with minimal or no human interventions. I personally feel, once we figure out the requirements, the automation piece is easier to handle.
Reference: Pinal Dave (https://blog.sqlauthority.com)
1 Comment. Leave new
Hi Pinal,
I have a table with about 2 million distinct IDs and another table with about 7000 distinct codes plus a varchar(02) flag. I did a simple cross join to create a table with just over 14 billion records. I will update the varchar field accordingly but after that I have to pivot each ID with the 7000 codes into columns creating a giant matrix. I can’t create indexes as the tempdb runs out of space but my bigger concern is the architecture of the columns I need to pivot. What is the best way to build this? I am thinking of creating 8 tables to cover the 7300 columns and joining by the distinct ids due to sql server’s 1024 column limit per table. I hope I have provided enough detail. I have been developing a long time and never encountered this situation. Thanks.