MySQL – Scalability on Amazon RDS: Scale out to multiple RDS instances

Today, I’d like to discuss getting better MySQL scalability on Amazon RDS.

The question of the day: “What can you do when a MySQL database needs to scale write-intensive workloads beyond the capabilities of the largest available machine on Amazon RDS?”

Let’s take a look.

In a typical EC2/RDS set-up, users connect to app servers from their mobile devices and tablets, computers, browsers, etc.  Then app servers connect to an RDS instance (web/cloud services) and in some cases they might leverage some read-only replicas.

 

Figure 1. A typical RDS instance is a single-instance database, with read replicas.  This is not very good at handling high write-based throughput.

As your application becomes more popular you can expect an increasing number of users, more transactions, and more accumulated data.  User interactions can become more challenging as the application adds more sophisticated capabilities. The result of all this positive activity: your MySQL database will inevitably begin to experience scalability pressures.

What can you do?

Broadly speaking, there are four options available to improve MySQL scalability on RDS.

1. Larger RDS Instances – If you’re not already using the maximum available RDS instance, you can always scale up – to larger hardware.  Bigger CPUs, more compute power, more memory et cetera. But the largest available RDS instance is still limited.  And they get expensive.

“High-Memory Quadruple Extra Large DB Instance”:

  • 68 GB of memory
  • 26 ECUs (8 virtual cores with 3.25 ECUs each)
  • 64-bit platform
  • High I/O Capacity
  • Provisioned IOPS Optimized: 1000Mbps

2. Provisioned IOPs – You can get provisioned IOPs and higher throughput on the I/O level.

However, there is a hard limit with a maximum instance size and maximum number of provisioned IOPs you can buy from Amazon and you simply cannot scale beyond these hardware specifications.

3. Leverage Read Replicas – If your application permits, you can leverage read replicas to offload some reads from the master databases. But there are a limited number of replicas you can utilize and Amazon generally requires some modifications to your existing application.

And read-replicas don’t help with write-intensive applications.

4. Multiple Database Instances – Amazon offers a fourth option:

You can implement partitioning,thereby spreading your data across multiple database Instances” (Link)

However, Amazon does not offer any guidance or facilities to help you with this. “Multiple database instances” is not an RDS feature.  And Amazon doesn’t explain how to implement this idea.

In fact, when asked, this is the response on an Amazon forum:

Q: Is there any documents that describe the partition DB across multiple RDS?
I need to use DB with more 1TB but exist a limitation during the create process, but I read in the any FAQ that you need to partition database, but I don’t find any documents that describe it.

A: “DB partitioning/sharding is not an official feature of Amazon RDS or MySQL, but a technique to scale out database by using multiple database instances. The appropriate way to split data depends on the characteristics of the application or data set. Therefore, there is no concrete and specific guidance.”

So now what?

The answer is to scale out with ScaleBase.

Amazon RDS with ScaleBase: What you get – MySQL Scalability!

ScaleBase is specifically designed to scale out a single MySQL RDS instance into multiple MySQL instances.

Critically, this is accomplished with no changes to your application code.  Your application continues to “see” one database.   ScaleBase does all the work of managing and enforcing an optimized data distribution policy to create multiple MySQL instances.

With ScaleBase, data distribution, transactions, concurrency control, and two-phase commit are all 100% transparent and 100% ACID-compliant, so applications, services and tooling continue to interact with your distributed RDS as if it were a single MySQL instance.

The result: now you can cost-effectively leverage multiple MySQL RDS instance to scale out write-intensive workloads to an unlimited number of users, transactions, and data.

Amazon RDS with ScaleBase: What you keep – Everything!

And how does this change your Amazon environment?

1. Keep your application, unchanged – There is no change your application development life-cycle at all.  You still use your existing development tools, frameworks and libraries.  Application quality assurance and testing cycles stay the same. And, critically, you stay with an ACID-compliant MySQL environment.

2. Keep your RDS value-added services – The value-added services that you rely on are all still available. Amazon will continue to handle database maintenance and updates for you. You can still leverage High Availability via Multi A-Z.  And, if it benefits youra application throughput, you can still use read replicas.

3. Keep your RDS administration – Finally the RDS monitoring and provisioning tools you rely on still work as they did before.

With your one large MySQL instance, now split into multiple instances, you can actually use less expensive, smallersmaller available RDS hardware and continue to see better database performance.

Conclusion

Amazon RDS is a tremendous service, but it doesn’t offer solutions to scale beyond a single MySQL instance. Larger RDS instances get more expensive.  And when you max-out on the available hardware, you’re stuck.  Amazon recommends scaling out your single instance into multiple instances for transaction-intensive apps, but offers no services or guidance to help you. This is where ScaleBase comes in to save the day.

It gives you a simple and effective way to create multiple MySQL RDS instances, while removing all the complexities typically caused by “DIY” sharding andwith no changes to your applications .

With ScaleBase you continue to leverage the AWS/RDS ecosystem: commodity hardware and value added services like read replicas, multi A-Z, maintenance/updates and administration with monitoring tools and provisioning.

SCALEBASE ON AMAZON

If you’re curious to try ScaleBase on Amazon, it can be found here – Download NOW.

Reference: Pinal Dave (http://blog.sqlauthority.com)

About these ads

SQLAuthority News – Microsoft Whitepaper – Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

I have been reading this Whitepaper from a couple of days and I am yet not done reading it completely, but I think it is one of the best white papers I have read in the recent time. First of all, it is written by my friend and SQL Expert Joe Sack. If you know Joe, you know that he is blessed with the skill to make a complex subject very easy. There are very few experts can do this. Kudos to Joe for amazing white paper. This white paper is 43 pages long and over 10,000 words, but trust me, every single word is worth it. I am planning to complete this whitepaper this weekend.

Every new version of SQL Server gets new performance enhancement features. In SQL Server 2014 SQL Server Query Optimizer cardinality estimation process got a major overhaul. A new algorithm for the cardinality estimation now has improved accuracy, consistency and suitability of key areas within the cardinality estimation process. In simple words, it is going to improve performance of queries executed on SQL Server.

Here is the paragraph from the Whitepaper which describes what actually this white paper is going to include:

The SQL Server query optimization process seeks the most efficient processing strategy for executing queries across a wide variety of workloads. Achieving predictable query performance across online transaction processing (OLTP), relational data warehousing, and hybrid database schemas is inherently difficult. While many workloads will benefit from the new cardinality estimator changes, in some cases, workload performance may degrade without a specific tuning effort. In this paper, we will discuss the fundamentals of the SQL Server 2014 cardinality estimator changes. We will provide details on activating and deactivating the new cardinality estimator. We will also provide troubleshooting guidance for scenarios where query performance degrades as a direct result of cardinality estimate issues.

I personally believe there are so many new enhancements in SQL Server that sometimes, I feel foreign to this entire subject. I feel that SQL Server has moved ahead and I am still working with version n-1. However, when I come across white paper like this, I regain my own confidence as now I know I can depend on this white paper to learn what are the new features available. Later on I can go and implement this feature on production server and master the basics as well as advanced concepts.

Download the white paper from here. 

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – 28 Links for Learning SQL Wait Stats from Beginning

It has been exact two years since I have written series on SQL Wait Statistics and Queues. I often received quite a few questions related to this subject. Here are my answers to the questions.

Q: The series which you have written two years ago, is it still relevant to latest SQL Server?
A: Yes, absolutely. Everything which I have written earlier is still relevant to the latest version of SQL Server. The matter of the fact, most of it will remain relevant forever.

Q: Is there anyway, I can read everything together in an eBook format?
A: Yes, you can get SQL Wait Stats on Kindle over here.

Q: Is SQL Wait Stats a good logical starting point for SQL Performance Tuning?
A: I believe so. It gives you a good idea where exactly your bottleneck is in your server.

Q: I have previously not learned about SQL Wait Stats, can I start now?
A: Absolutely, Yes, here are the links:

SQL SERVER – Introduction to Wait Stats and Wait Types – Wait Type – Day 1 of 28

SQL SERVER – Signal Wait Time Introduction with Simple Example – Wait Type – Day 2 of 28

SQL SERVER – DMV – sys.dm_os_wait_stats Explanation – Wait Type – Day 3 of 28

SQL SERVER – DMV – sys.dm_os_waiting_tasks and sys.dm_exec_requests – Wait Type – Day 4 of 28

SQL SERVER – Capturing Wait Types and Wait Stats Information at Interval – Wait Type – Day 5 of 28

SQL SERVER – CXPACKET – Parallelism – Usual Solution – Wait Type – Day 6 of 28

SQL SERVER – CXPACKET – Parallelism – Advanced Solution – Wait Type – Day 7 of 28

SQL SERVER – SOS_SCHEDULER_YIELD – Wait Type – Day 8 of 28

SQL SERVER – PAGEIOLATCH_DT, PAGEIOLATCH_EX, PAGEIOLATCH_KP, PAGEIOLATCH_SH, PAGEIOLATCH_UP – Wait Type – Day 9 of 28

SQL SERVER – IO_COMPLETION – Wait Type – Day 10 of 28

SQL SERVER – ASYNC_IO_COMPLETION – Wait Type – Day 11 of 28

SQL SERVER – PAGELATCH_DT, PAGELATCH_EX, PAGELATCH_KP, PAGELATCH_SH, PAGELATCH_UP – Wait Type – Day 12 of 28

SQL SERVER – FT_IFTS_SCHEDULER_IDLE_WAIT – Full Text – Wait Type – Day 13 of 28

SQL SERVER – BACKUPIO, BACKUPBUFFER – Wait Type – Day 14 of 28

SQL SERVER – LCK_M_XXX – Wait Type – Day 15 of 28

SQL SERVER – Guest Post – Jonathan Kehayias – Wait Type – Day 16 of 28

SQL SERVER – WRITELOG – Wait Type – Day 17 of 28

SQL SERVER – LOGBUFFER – Wait Type – Day 18 of 28

SQL SERVER – PREEMPTIVE and Non-PREEMPTIVE – Wait Type – Day 19 of 28

SQL SERVER – MSQL_XP – Wait Type – Day 20 of 28

SQL SERVER – Guest Posts – Feodor Georgiev – The Context of Our Database Environment – Going Beyond the Internal SQL Server Waits – Wait Type – Day 21 of 28

SQL SERVER – Guest Post – Jacob Sebastian – Filestream – Wait Types – Wait Queues – Day 22 of 28

SQL SERVER – OLEDB – Link Server – Wait Type – Day 23 of 28

SQL SERVER – 2000 – DBCC SQLPERF(waitstats) – Wait Type – Day 24 of 28

SQL SERVER – 2011 – Wait Type – Day 25 of 28

SQL SERVER – Guest Post – Glenn Berry – Wait Type – Day 26 of 28

SQL SERVER – Best Reference – Wait Type – Day 27 of 28

SQL SERVER – Summary of Month – Wait Type – Day 28 of 28

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Monitor Database via a Heatmap, Alarms List and Realtime Diagnostics for FREE

Note: This review is based on the performance monitoring and tuning free product Spotlight on SQL Server Freemium.

If you think of a DBA’s life, it has one very simple goal – their server should never go down and all the queries should always perform excellently. However, just like any other life goals o it is not possible to achieve that easily. I have often seen many DBA’s continuously watching their monitor to make sure that their servers are running fine. Another habit of most  DBAs is to continuously check their mobile phone for alerts. Nowadays we get so many alerts it is  getting harder to keep watch on the most important alerts for the health of our server. The habit of looking at the phone and computer monitor is so rooted into a DBA’s mind that they keep on looking at their phone at home to catch a suspicious alert.

Earlier this year, when I attended SQL PASS 2013, I  stopped by the Dell Software booth to see what they have  new in the SQL Server world. I  noticed Spotlight on SQL Server Freemium running on their monitor. When I inquired about the price, I was happy, in fact I was very happy as it was totally FREE! After returning home, I got much too busy with my day job, but  I recently I got some time and I downloaded the Spotlight on SQL Server Freemium FREE tool. The installation was pretty straight forward and easy. It took me less than 10 seconds for me to install the tool, just make sure that your SSMS is closed when you install  Spotlight on SQL Server Freemium, otherwise it will show you a warning to turn off SSMS.

Once I installed the plug-in, it was very easy to use it as it becomes an integral part of the SQL Server Management Studio, the interface is a very user friendly.

There are three distinct options in the Spotlight on SQL Server Freemium tool bar menu. Once you click on Monitoring it will give  three options. 1) Heatmap 2) Alarms 3) Connections. Let’s look at  them very quickly over here.

1) Heatmap

If our server is down, we want to know right away, but if everything is smooth we do not want to keep on getting reminders about that. For that reason Heatmap is a very essential part of  Spotlight on SQL Server Freemium. It gives an ‘at-a-glance’ picture of the state of all the servers DBAs have in their environment. Colors communicate all the information about what is going on with your server. The heatmap takes this a step further by displaying each server as a tile and then aggregating all of the statuses of a server and assigning a size to that tile. It also displays alarms for the connection when touched.

2) Alarms

Alarms  is just an alternate way to view Heatmaps. They display alarms on each server ordered by severity. You can configure and sort alarms the way you prefer. Once an alarm rings an experienced user can do either of  two actions: a) Acknowledge the alarm and solve the issue  b) Snoozing it to be reminded in the future.

3) Connections

This particular area displays various connections to diagnose a server as well as the server which you are monitoring. You can make various adjustments in your server connection in this section.

System Health Check

One of the biggest features of Spotlight on SQL Server Freemium is health check and providing a prioritized list of the key health system issues. Users can pinpoint various issues with the help of this list and resolve SQL Server issues. There are major five categories this tool checks:  Security, Disaster Recovery, Index Optimization, Memory and SQL Best Practice.

In future blog posts we will cover each of these topics in depth. Meanwhile, I strongly suggest you download Spotlight on SQL Server Freemium and makes sure your servers are healthy. Additionally, visit www.SpotlightEssentials.com, the one-stop shop for all things Spotlight.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQLAuthority News – Learn MySQL Indexing in 99 Minutes – MySQL Indexing for Performance – Video Course

Every year around September 1st, I have decided to do something cool. This is because September 1st is my daughter Shaivi’s birthday. In previous years, I have released my new books as well new courses on this day.  This year I have done something similar for her. I like to work hard and dedicate my efforts to my family as they are the one who provides me unconditional support to do various activities for the community.

Journey to Learn MySQL Indexing in 99 Minutes

Indexes are considered to be sure solution for Performance tuning but it has two sides of the story. A good index can improve performance of your query and bad index can degrade the performance of the query. I have build a course around MySQL indexing where I explain how to create indexes which improves the performances of the MySQL Query by many folds. Indexing is such a subject that I can talk about this subject for many days and still I will have more material to share. However, I had decided to build a course which is very sharp and to the point. I wanted to make sure that with the help of this course one can learn all the major points of MySQL indexes in very short period of the time.

While I was building this course – I had put a threshold that I will not build a very long boring course. I wanted to make sure that I build a crisp course which addresses the need of the hour and teach the most efficient methods to improve performance of the query. It took me more than 3 weeks to build all the material I wanted to cover in this course. Once I have build the material the real challenge was on, I wanted to make sure that I build a course which is basic enough for a novice to understand but advanced enough that an expert can learn various tricks as well. I wanted to balance the course for every level as well as I wanted to make sure that the pace of the course is good enough for everyone to keep up with it.

Finally, after long hours and many days of work, I finalized the content of the course which just is the right fit for everybody – a novice who does not know anything about index as well an expert who already have a clear idea about the index.  I had pretty much mapped out every single minute of the course with the demo and slide. As per my estimate the course should have not gone over 100 minutes. When I was done with the course, I was extremely delighted as the entire course was recorded in 99 minutes – YES, only 99 minutes. I have previously recorded many courses but this course was built with perfection in the seconds of the time.

You can learn MySQL Performance Tuning with the help of Indexing in 99 minutes.

Course Content

Here is the course outline which I have built to learn MySQL Indexing. You can watch this course by creating free trial at Pluralsight. If you do not see the play button to the watch the course, please login to the Pluralsight with your valid credentials.

  • MySQL Indexing for Performance
    • Introduction
  • Types of Indexes
    • Introduction
    • InnoDB vs MyISAM
    • B-Tree Index
    • Advantages of B-Tree Index
    • Clustered Index
    • Advantages of Clustered Index
    • Disadvantages of Clustered Index
    • Secondary Index
    • Building Clustered B-Tree
    • Building Secondary Index B-Tree
    • InnoDB Clustered Table
    • Hash Index
    • Limitation of Hash Index
    • Adaptive Hash Index
    • Building Hash Index
    • Other Indexes
    • Summary in Sixty Seconds
  • Indexing Strategies for High Performance
    • Introduction
    • Effectiveness of Index
    • Demo: List Indexes
    • Demo: Basics of Indexes
    • Demo: Order of Columns in Index
    • Demo: Optimize OR Conditions
    • Demo: Optimize OR Conditions – Advanced
    • Demo: Optimize AND Conditions
    • Demo: Optimize AND Conditions – Advanced
    • Demo: Cover Index
    • Demo: Clustered Index
    • Demo: Index Hints
    • Summary in Sixty Seconds
  • Index Maintenance
    • Introduction
    • MySQL Query Optimizer
    • Statistics
    • InnoDB Statistics
    • Data and Index Fragmentation
    • Index Maintenance
    • Other Index Maintenance Tips
    • Summary in Sixty Seconds
  • Checklists
    • Checklists

Watch Video Course

Click here to watch the MySQL Indexing for Performance video course.

As an author of this course, I have few free codes to watch this course. Please leave a comment to watch this course and I will send a few of the readers a free card to watch this course. You can also request the free code on my Facebook page as I am very active there.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Tips for SQL Query Optimization by Analyzing Query Plan

Update: You can download dbForge Studio for SQL Server for free trial.

One of the most exciting periods of my life relates to the maintenance and optimization of ERP system for one large manufacturing company. The problem was that the system was initially created for a limited range of tasks, which over time grew much bigger than expected.

When multiple users simultaneously used the system, working in it was close to impossible. Increasing the operating capacity of a server could not completely solve the problem. So it was settled on revising current business-functionality and optimizing the most resource-consuming SQL queries.

Before I go on with the article, it’s worth mentioning that SQL Server generates an execution plan for each query. The plan describes an algorithm that will lead to a desired result.

We should also keep in mind how execution plans are generated. To reach maximal query execution speed, query optimizer (separate component of the SQL Server core) always attempts to generate a plan with the sequence of actions that consumes minimum server resources.

When evaluating any execution plan, the query optimizer takes into account multiple factors: involved database objects, conditions of joining them, returned columns list, indexes presence, availability of indexes and actual statistics, etc.

However, sometimes the optimizer cannot operate with the actual data and that is why it can over- or underestimate the SQL query cost when executing different plans. That’s why there is always a chance that the non – optimal plan will be chosen.

SSMS environment allows viewing the execution plan with the help of Show Execution Plan. However, analyzing complicated queries in it is far from easy. That is why I use a more functional query profile available in dbForge Studio for SQL Server.

We can get an execution plan each time SQL query execution is started. To do so, we need to switch to the profiling mode.

Or we can use the Generate Execution Plan command without running the query.

Several examples will let me demonstrate some tips for optimizing SQL queries using the profiler:

1. Minimizing the number of implicit connections

For the most complicated SQL queries I seek to use views depending on the situation, since their declaration might be abundant, and the server will be consuming more time for reading and analyzing this data. I do not intend to say that they should not be used at all, but they should be used wisely.

For example, we should get a list of customer’s tables. All the necessary object data can be obtained from the system views.

SELECT *
FROM sys.tables t

We will get the following plan for this query:

In the profiles the most resource-consuming operation are highlighted with red and yellow accordingly, representing the percentage of resources spent on each operation.

Let’s modify our SQL query – now we shall select only table names:

SELECT t.name
FROM sys.tables t

Now our execution plan got simplified and it is obvious that the query execution will require less resources:

How did we achieve that? Let’s take a look at the content of sys.tables. Unfortunately, SSMS does not allow viewing system views DLL, so we will use a small trick:

DECLARE @SQL VARCHAR(MAX)
SELECT @SQL = OBJECT_DEFINITION(OBJECT_ID('sys.tables'))
PRINT @SQL

We receive the following results:

CREATE VIEW sys.tables
AS
SELECT
o.name, o.OBJECT_ID, o.principal_id, o.schema_id, o.parent_object_id,
o.TYPE, o.type_desc, o.create_date, o.modify_date,
o.is_ms_shipped, o.is_published, o.is_schema_published,
ISNULL(ds.indepid, 0) AS lob_data_space_id,
rfs.indepid AS filestream_data_space_id,
...
ts.name
AS lock_escalation_desc,
o.is_filetable
FROM sys.objects$ o
LEFT JOIN sys.sysidxstats lob ON lob.id = o.OBJECT_ID AND lob.indid <= 1
LEFT JOIN sys.syssingleobjrefs ds ON ds.depid = o.OBJECT_ID AND ...
LEFT JOIN sys.syssingleobjrefs rfs ON rfs.depid = o.OBJECT_ID AND ...
LEFT JOIN sys.syspalvalues ts ON ts.class = 'LEOP' AND ...
WHERE o.TYPE = 'U'

Query optimizer looks at the columns which are actually used in our SQL query and removes excess joins. Now you can see why the application of an asterisk in query can result in reducing query efficiency.

However the same data can be received in a more efficient way. All tables are objects and are contained in sys.objects:

DECLARE @SQL VARCHAR(MAX)
SELECT @SQL = OBJECT_DEFINITION(OBJECT_ID('sys.objects))
PRINT @SQL
'

We receive the following in Output:

CREATE VIEW sys.objects AS
SELECT
name,
OBJECT_ID,
principal_id,
schema_id,
...
FROM sys.objects$

It’s obvious that this view is very simple and thus faster than sys.tables. All we have to do is select the tables among all objects by adding filtering condition (U – USER_TABLE, V – VIEW):

SELECT t.name
FROM sys.objects t
WHERE t.[type] = 'U'

To compare the queries we need to select execution plans and click Compare Selected Results on the context menu.

2. Elimination of data re-reading

Usually data re-reading is one of the most resource-demanding operations. That is why queries should be written in a way that would allow reading data in a query only once.
For example:

SELECT
(
SELECT COUNT(*)
FROM sys.objects o
WHERE o.[type] = 'U')
,(
SELECT COUNT(*)
FROM sys.objects o
WHERE o.[type] = 'V')

When you look through the plan, you can notice that the data from the table is being read twice:

The example below shows how this problem can be resolved:

;WITH cte AS
(
SELECT o.[type], [count] = COUNT(*)
FROM sys.objects o
WHERE o.[type] IN ('U', 'V')
GROUP BY o.[type]
)
SELECT (
SELECT [count]
FROM cte
WHERE [type] = 'U'
)
, (
SELECT [count]
FROM cte
WHERE [type] = 'V'
)

Keep in mind that CTE is just a code generalization – it will not allow you to eliminate the data re-read:

Let’s try to rewrite the SQL query using aggregation:

SELECT
COUNT(CASE WHEN o.[type] = 'U' THEN 1 END)
,
COUNT(CASE WHEN o.[type] = 'V' THEN 1 END)
FROM sys.objects o
WHERE o.[type] IN ('U', 'V')

The progress is evident:

But let’s imagine that we need to make more than two aggregations. In this case query execution will consume more time. But in fact, we can simplify it even more:

SELECT *
FROM (
SELECT o.OBJECT_ID, o.[type]
FROM sys.objects o
WHERE o.[type] IN ('U', 'V')
)
t
PIVOT
(
COUNT(t.OBJECT_ID) FOR [type] IN (U, V)
)
p

Let’s compare the queries:

I suppose, no comments are needed.

To sum it up I would like to point out that simplifying SQL query logic is the key to query optimization.
When analyzing execution plans with the help of the query profiler, you will be able to detect bottlenecks in the performance of databases and improve the efficiency of your SQL queries.

I suggest you download dbForge Studio for SQL Server and try this out. Let me know what you think about the same.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it.

Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time.

A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database.

SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID;

User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts

  • Use only column in the select statements as much as you want in the application
  • Let us look at the query pattern and data workload and find out the optimal index for it

Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query.

Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions.

Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning.

I have previously explained how it works over here.

First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job.

Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram.

Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan.

The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read.

If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it.

Here is the query:

SELECT *
FROM HumanResources.Employee e
INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID
INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID
OPTION (MERGE JOIN)

If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before.

The entire process takes less than 60 seconds.

Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same.

However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided.

Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment.

Click to Download Scripts

Reference: Pinal Dave (http://blog.sqlauthority.com)