SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database

While engaging in a performance tuning consultation for a client, a situation occurred where they were facing a lot of CXPACKET Waits Stats. The client asked me if I could help them reduce this huge number of wait stats. I usually receive this kind of request from other client as well, but the important thing to understand is whether this question has any merits or benefits, or not.

Before we continue the resolution, let us understand what CXPACKET Wait Stats are.

The official definition suggests that CXPACKET Wait Stats occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if a conflict concerning this wait type develops into a problem. (from BOL)

In simpler words, when a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Threads which came first have to wait for the slower thread to finish. The Wait by a specific completed thread is called CXPACKET Wait Stat. Note that CXPACKET Wait is done by completed thread and not the one which are unfinished.

“Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is also unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.”

Now let us see what the best practices to reduce the CXPACKET Wait Stats are.

The suggestions, with which you will find that if you search online through the browser, would play a major role as and might be asked about their jobs In addition, might tell you that you should set ‘maximum degree of parallelism’ to 1. I do agree with these suggestions, too; however, I think this is not the final resolutions. As soon as you set your entire query to run on single CPU, you will get a very bad performance from the queries which are actually performing okay when using parallelism.

The best suggestion to this is that you set ‘the maximum degree of parallelism’ to a lower number or 1 (be very careful with this – it can create more problems) but tune the queries which can be benefited from multiple CPU’s.

You can use query hint OPTION (MAXDOP 0) to run the server to use parallelism.

Here is the two-quick script which helps to resolve these issues:

Change MAXDOP at Server Level

EXEC sys.sp_configure N'max degree of parallelism', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO

Run Query with all the CPU (using parallelism)

USE AdventureWorks
GO
SELECT *
FROM Sales.SalesOrderDetail
ORDER BY ProductID
OPTION (MAXDOP 0)
GO

Below is the blog post which will help you to find all the parallel query in your server.

SQL SERVER – Find Queries using Parallelism from Cached Plan

Please note running Queries in single CPU may worsen your performance and it is not recommended at all. Infect this can be very bad advise.

I strongly suggest that you identify the queries which are offending and tune them instead of following any other suggestions.

Reference: Pinal Dave (http://blog.sqlauthority.com)

About these ads

23 thoughts on “SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database

  1. You r exactly right Sir, but by setting maxdop to 1 in a hybrid workload ( reporting + oltp ) as in almost cases it will hurt performance. So would it be better to set maxdop to half the no. of cpu to control the wait stat?

  2. Hi Pinal,

    It was nice to meet you this week in Seattle. I have one comment on this post:

    “Note that CXPACKET Wait is done by completed thread and not the one which are unfinished.”

    Specifically, it is the coordinator / boss thread that registers CXPACKET waits. The child / worker threads do not experience this wait state. The coordinator will always experience this wait as long as at least one child has not yet completed.

    Cheers,
    Chris

  3. Keep in mind that you can set up the parallelism on a global (server instance level), on query level and also on a query level. Also, keep in mind that you can override the server parallelism setting at any time, i.e. if you set the global parallelism setting to 1, you can still execute some queries by using multiple CPUs.
    Here is an article I wrote about this. http://www.sqlservice.se/allmant/and-some-more-about-parallelism/

    Also, keep in mind that if you have a mixed workload on the same instance it is best to find the correct setting for the threshold of parallelism and not the Degree of parallelism, but this is a whole different subject.

    Also, as Snehal is asking, in a case where you have a system with a mixed workload, it would be a good idea to consider segregation of the workload. I usually put systems with different workload types on different instances, or even on virtual machines. I design them like that to begin with because it is easier to manage.

  4. I would be wary of setting MAXDOP=1, effectively turning it off at the server level. Instead, as Pinal suggested, tune your queries – Parallelism may be caused by missing indexes, and changing the execution path, in the instance where huge table scans occur.

    In a hybrid environment (reads and writes) you may want to adjust the maxdop settings accordingly, IF it becomes an issue – exactly monitoring the waitstats for high occurrence of CXPACKETS. HTH

    - RP

  5. Hello, Pinal!

    As always, great article! I would like to highlight a particular rule of thumb that we use with our Enterprise product. (I agree with you that the correct answer is “it depends”, but at least for our systems, the below seems to be working, making it a golden rule for us).

    Snehal: This is what might help you also.

    Our system has it’s roots as a legacy system from the days of flat-file database systems (about 20 years ago). It has grown over the years to now use SQL 2008 R2. The point behind mentioning this bit of history is that most of our queries are tuned on running against a single processor. Our warehouse product is, of course a “modern” product (developed in the days of SQL 2000) and quite a few of the OLAP queries are tuned for sound multi-processor performance.
    As a general rule, because our OLTP and OLAP systems are deployed on different physical machines (not only different SQL instances), we tend to recommend that the MAXDOP on the OLTP system be 1 and that on the OLAP be 0.
    Doing vice-versa would toast the performance of both systems, as is the case with Snehal.

    Snehal: If your reporting and OLTP systems are fused and/or cannot be separated, I would say that your reporting stored procedures should be modified to use MAXDOP 0. The server level setting should be 1.

    Thanks & Regards,
    Nakul Vachhrajani.

  6. As long as I’m on this thread, I’ll speak up in favor of MAXDOP 1 on a *real* OLTP system. I mentioned that our group manages servers that turn about 1.5B transactions per day. About 1B of that is on servers that have MAXDOP = 1, and I’m going to switch it on the others as soon as I can. I don’t think you should ever go parallel without intending to, at least not until you have a server with a *lot* (as in 64 or more) processors. The protection offered by MAXDOP 1 is, in my opinion, much more valuable than what you give up in exchange (like the ability to run DBCC in parallel).

    In the interest of full disclosure, I’ll mention that Adam Machanic disagrees with this, and suggests that OLTP should “lean toward” MAXDOP 2, but to me this is just opening up the door to parallel plans that you don’t want to happen, and which will take cycles away from well-behaved queries. It also would encourage people to write code that needs to go parallel. With MAXDOP 1 as our server setting, we need to request other MAXDOP settings explicitly in our code, and that’s good because that means it will be obvious to the code reviewer that the query is intended to go parallel.

    For OLAP / Reporting / DSS (there’s an old TLA) and similar systems, I agree that MAXDOP 0 could be appropriate, but again, that allows one rogue process to take over your server. Setting MAXDOP equal to the number of cores (or half the number of cores) and overriding that only when you need more is another strategy that could work. For those servers, I think it really depends on whether it’s OK with you to have individual requests chewing up all your schedulers at once.

    Cheers,
    Chris

  7. Pingback: SQLAuthority News – Download Whitepaper – Understanding and Controlling Parallel Query Processing in SQL Server Journey to SQL Authority with Pinal Dave

  8. Hi Pinal,
    Can you tell, what is the difference between setting MAXDOP to 0 and Setting it to no. of CPUs in the server?

    -Manoj

  9. Designing a new SQL Server instace that will support a web applicaion, the application is running on 32 nodes.
    The server has 128 GB memory and 16 processor cores. The application contains 2 databases and supports both OLAP and OLTP workloads.
    On testing some queries run extremely slow and some very fast.
    To ensure that the server process database queires as fast as possible.
    What should I do for BEST answer?

    A. Execute exec sp_configure ‘maximum degree of parallesism’, 1

    B. Execute exec sp_configure ‘maximum degree of parallesism’, 8

    C. Use SQL Profiler to identify queries tha experience CXPACKET wait. Add (OPTION MAXDOP 1) to each query.

    D. Use SQL Profiler to identify queries tha experience CXPACKET wait. Add (OPTION MAXDOP 8) to each query.

    Please answer the question.
    Thank you,
    Bidhan

  10. Pingback: SQL SERVER – Weekly Series – Memory Lane – #003 « SQL Server Journey with SQL Authority

  11. Pingback: SQL SERVER – Weekly Series – Memory Lane – #007 « SQL Server Journey with SQL Authority

  12. Pingback: SQL SERVER – Root Cause to Performance Problems – Notes from the Field #002 | Journey to SQL Authority with Pinal Dave

  13. I have table with columns like AutoID, Number, Name, City, State, Country.

    I wanted is the maximum number entered in the “Number” column with the combination of Name, City, State and Country.

    Name City State Country
    Smith Chicago Chicago USA
    John Chicago Chicago USA
    John Los Angelos Chicago USA

    Now smith should get “Number” 1, John 2, and again John(in Los Angelos) 1 as he is the first from Los Angelos.

    I can simply put a where clause in query and get the max number + 1. But the problem is that when I have huge amount of data and the number of users increases, my query will be really slow. I am also inserting data in the same table so it will keep on piling.

    I hope I have made my self clear.

  14. I have table with columns like AutoID, Number, Name, City, State, Country.

    I wanted is the maximum number entered in the “Number” column with the combination of Name, City, State and Country.

    Name City State Country
    Smith NY NY USA
    John NY NY USA
    John NJ NY USA

    Now smith should get “Number” 1, John 2, and again John(in NJ) 1 as he is the first from NJ.

    I can simply put a where clause in query and get the max number + 1. But the problem is that when I have huge amount of data and the number of users increases, my query will be really slow. I am also inserting data in the same table so it will keep on piling.

    I hope I have made my self clear.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s