
SQL Server Avengers: CPU Affinity War

Maybe not the best entry in the franchise, but this is another one of those configuration options that has been around forever, yet you very, very rarely (i.e. pretty much never) have any need to touch. I've been in the situation where a colleague (yes, really!) has messed with this, and was then surprised by the outcome.


The key message to take away from this is that generally, you only want to mess with this under extreme load and/or edge case scenarios, and when directed to by experienced database professionals.


Oh, and test... you test hard.


So what is this CPU Affinity lark all about then?

Well, head to Server Properties > Processors.

Here we have the options to set the processor affinity mask for all processors, and also the I/O affinity mask. The default is to let SQL Server handle this automatically, as it will provide better performance for most workloads.


On a basic level, these settings restrict which CPUs SQL Server can use for query processing and/or I/O operations.
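The same change can be made from T-SQL rather than the SSMS dialog. A minimal sketch, for a test instance only (the CPU range here is just an example):

```sql
-- Pin the SQL Server process to CPUs 0 to 3 only (example range)
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = 0 TO 3;

-- Revert to the default automatic behaviour
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = AUTO;
```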


So they just limit the CPUs SQL can use right?

Well yes, but there is more to it than that. With the default automatic CPU affinity setting, any visible online scheduler can use any CPU; there is no restriction. Like so, for a 2 CPU/scheduler system.

You cannot restrict CPUs with Affinity in order to reduce SQL license costs.

...so don't even try.


The part that many people miss is that when you enable CPU affinity, even for all CPUs, you are effectively linking each scheduler to a particular CPU that has been marked as enabled (which is pretty much what the word "affinity" means). Like so...



When you enable CPU affinity, you are effectively linking each scheduler with a particular CPU.
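You can see the scheduler-to-CPU relationship for yourself via the schedulers DMV; a quick look at which schedulers are visible online and which CPU each one reports:

```sql
-- One row per scheduler; cpu_id shows the CPU it is associated with
SELECT scheduler_id, cpu_id, status, is_online, current_tasks_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```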

Trace Flag 8002 - Schedulers be free!

To override this behaviour and allow all enabled CPUs to be used by all visible online schedulers, Trace Flag 8002 exists. It is a server-wide flag that must be set as a startup parameter.
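As a sketch of how that looks in practice: add -T8002 to the startup parameters in SQL Server Configuration Manager, restart the service, and then confirm the flag is active globally:

```sql
-- Check that trace flag 8002 is enabled server-wide (-1 = global)
DBCC TRACESTATUS (8002, -1);
```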



OK, what about performance?

Say you have 8 CPUs, and you enable only 4 of them via affinity. That means I'll have only 50% of my CPUs, so the remaining ones will be twice as busy, right?


No... there is absolutely zero guarantee of that, and this is where I've seen people make this assumption, and then wonder why their CPUs are on fire. The change in behaviour will depend entirely on your hardware, configuration and workload.


It's Demo time.


These are highly contrived demos, running code on my laptop, that would probably never match a real-world scenario. It's purely to show differences in behaviour between different types of load.


I have

  • A workload, primarily scans, going parallel.

  • A workload, primarily seeks, single threaded.

  • A mix of the two.


Multiple threads in SQLStress to get a heavier load.


MAXDOP 0 (Bad wicked naughty zoot!)

CTFP (cost threshold for parallelism) - 5

For info, all other settings were left as-is, apart from those specified as amended.
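For anyone recreating the rig, those two settings map to sp_configure like so. These are demo values, not recommendations:

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- MAXDOP 0 and cost threshold for parallelism of 5, as per the demo rig
EXEC sp_configure 'max degree of parallelism', 0;
EXEC sp_configure 'cost threshold for parallelism', 5;
RECONFIGURE;
```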


Default CPU settings, Scan based load.


Settings, duration and other metrics.


So our workload caps CPU at a steady 65-70%.

And top 10 waits, just for transparency.


Manual CPU Affinity - SCAN Workload

OK, so now let's set our affinity to manual CPU, and select all CPUs anyway. So all are still available; I/O affinity is left at the default (Auto). It looks something like this.



Wow, that was faster!

But our CPUs are now on fire... flatlined at 100%. This is the CPU behaviour I've seen out in the wild when this has been done before, too. Needless to say, it caused lots of other issues.

And waits... less SOS_SCHEDULER_YIELD, even though our CPUs are melting. CXConsumer / CXSync_Port show a lot less as well, indicating less parallelism, but plans and settings were identical between runs.



I'm not going to investigate the depths of why it's different here, just highlighting the huge change in performance profile from a single change to an identical workload. Interesting.


Default CPU settings - Seek Workload


Let's bring in the performance tuners, and turn our workload into predominantly seeks.


Fat lump of CPU usage. Same 65-70% top out as previously.



CPU Manual Affinity - ALL CPUS - Seek Workload.

Same again, let's fiddle with our CPU affinity: manual CPU, all CPUs selected, I/O affinity default.


Faster again.

Brief spike in CPU, trails off quickly though as the workload was fast.

Walter Waits.


So here, we see a fast workload, with potentially less CPU used?



Default Settings - Mixed Workload


So now, let's have a mix of seeks and scans.

Our settings and timings.


CPU, between 35-60%.


Manual Affinity - Mixed Workload

Same again, set the CPU affinity to manual (all CPUs). IO affinity automatic.


Speed city again.


And a lot more CPU use, on average 50-90%.

Again, fewer waits recorded.




Why some of the differences in patterns?


Well, my laptop isn't the best at multithreading. Auto affinity was using 8 cores hard, and not the other 4. Manual affinity suddenly unleashed the beast, and allowed SQL to make use of all of the CPUs.


Auto Affinity (Left) not using all cores, vs Manual Affinity (right), where it is. Black line is CPU total.













And to show TF 8002 behaviour along with manual affinity: a similar pattern to Auto again.



Summary


All of our tests ran quicker, with CPU higher for all but the Seek Load tests.


Test                        Duration (m:s)    CPU %
Default CPU - Scan Load     3:52              65-70
Manual CPU - Scan Load      2:40              100
Default CPU - Seek Load     0:25              65-70
Manual CPU - Seek Load      0:17              55 (max)
Default CPU - Mixed Load    1:58              35-65
Manual CPU - Mixed Load     1:24              50-90


First off, do not just read this post and go and set all of your servers to manual CPU affinity. Really, just don't. This post was purely to show you the change in performance that may occur if you start playing with this setting. Who was expecting the server CPU usage to suddenly increase significantly when affinity was set with the same number of cores as the default auto setting? My colleagues at the time certainly weren't.


The workload here was faster, but it was a contrived, limited-time load on a laptop, purely to show the potential change in your performance profile when the setting is changed, even if you are changing it to what you may think is the same config.


For anyone focusing purely on the numbers: yes, it was quicker... by making use of more CPU on this laptop... but they were exactly that, limited-run tests... on my laptop. You may have a balanced workload already across CPUs, you may not... again, your situation is different.


Your workload isn't just a set time; it's potentially all day, and changing. If you flatline CPUs, chances are that the performance of some or all parts of your workload will be wrecked.


Takeaway

Your workloads are different...

Your workloads are real...

Your hardware is different...

Your users are real...

Your users get angry...

Your users may hurt you...


Microsoft say to only change this setting in edge cases, as the standard setting is best in most situations. If I were going to mess with CPU affinity on real servers to try and find the best possible config, I'd only do it if I could test as close to the production server/VM spec as possible, with identical workloads, just to see how it behaved.


If you want to restrict CPU numbers for whatever reason, and want a similar behaviour pattern to standard, then consider Trace Flag 8002.


To end, and purely for balance: you may discover performance improvements by making use of CPU affinity for your setup, and it's the reason we are given all of these lovely buttons and dials to fiddle with in the first place. But please go in with eyes wide open and a cautious step.


Thanks for reading

Rod, no affinity to fishing, or lightning.


