top of page

Partitioned Views, a blast from the Past?

This post came around when I was at a loose end one evening, and just started poking at a local sandpit database, and it got me reminiscing and revisiting / testing a few things. The devil makes work for idle thumbs and all that...


Partitioned Views...do they have a place in society anymore?


The concept has been around for a very long time, but often argued as defunct since SQL 2016 SP1 where we were gifted them for use in lowly Seiko Standard Edition. Before that, we didn't have much of a choice if we didn't have the Rolex Enterpise Bling Edition on our wrists where we could use Table Partitioning.


What are Partitioned Views?

I'm not going to go into any depth as there are many articles out there, but basically, you:


  • Split your main table up into multiple tables on a column/columns of your choosing

  • Enforce those columns with a CHECK Constraint so SQL knows what is allowed in the table.

  • Load the relevent data into each table.

  • Throw a view over all of your tables, simply UNIONing ALL each table to give one fat resultset.

The classic example for this is choosing a Datetime column, then splitting the tables by Month/Year etc. MyTable2019 for 2019 data, Mytable2020 for 2020 etc.


Stop...DemoTime

Lets spin up a Demo setting this up, i'm using the StackOverflow2013 database to mess with.


Dbo.Posts has 17142169 rows in this DB.


Lets split that up into year 'sub tables', based on the CreationDate column,

ie.. 'partition' *cough * *cough * by when posts were created.



So we can see, there were 274536 posts for 2008, 1288974 posts for 2009, and so on. (The data in this DB stops at 2013)


Now we'll add the CHECK constraints on to the tables, so that SQL knows what is in the table in relation to the CreationDate column.


And i'll add the Clustered index on each table as it is with the original dbo.Posts table.


And finally our View to bring all of the tables back together. Note, no SCHEMABINDING as there are nasty *'s in there....



Right, so we've now got our Partitoned View, with Sub tables holding our data as we want it, so lets throw a query at it, and our original Posts table.



For the dbo.Posts, this looks like:

With the vw_posts query looking like:

It's Great!

Hang on, thats loads faster! Look closer at the execution plan object, the vw_posts query accessed the Posts2011_ID Clustered index only, it didn't bother with any of the other tables.


Lets back that up with Statistics IO / Time data.


Dbo.Posts query

Table 'Posts'. Scan count 7, logical reads 4192157

SQL Server Execution Times:

CPU time = 11561 ms, elapsed time = 11147 ms.


Dbo.vw_posts query

Table 'Posts2011'. Scan count 7, logical reads 805586

SQL Server Execution Times:

CPU time = 4187 ms, elapsed time = 2856 ms.


Yup, that avoided extra work without us having to do anything.


The table to the partitioned view only read the table for which we wanted data, and we don't have any indexes on the CreatedDate column. Due to the CHECK constraints we added, rather than scan the entire table, SQL knew that only the Posts2011 table had data that could satisfy this query. End result is that they were eliminated from the plan, result!


Wait, that's awesome! So simple, and SQL is super clever to only pull what I want, and I love it, and I....I.....

Whoa there tiger... the above example is very contrived. Where are the indexes, what happens when there are indexes, and wouldn't I just have indexes on there in the first place?


The chances are that you would, so lets do that, again in a contrived sort of way as you might not have single column indexes.



It's Still Great

Same query again barman.


Table 'Posts'. Scan count 7, logical reads 4190852,

SQL Server Execution Times:

CPU time = 13436 ms, elapsed time = 11615 ms.


Table 'Posts2011'. Scan count 7, logical reads 805364

SQL Server Execution Times:

CPU time = 4891 ms, elapsed time = 2737 ms.


That's all the same, plans identical as before (Scanning remember).... what's the problem?


Time to Pull the Rug from under us.

Lets now narrow the search scope down, from 6 months to 1 month... as you've been shouting at your Users/Devs already for pulling too much unneeded data around.



Dbo.Posts

Dbo.vw_Posts

Oh b*gger, the original used a Seek and lookup, whereas our partitioned View decided it just wanted to scan.


Table 'Posts'. Scan count 7, logical reads 1421768

SQL Server Execution Times:

CPU time = 1251 ms, elapsed time = 716 ms.


vs.


Table 'Posts2011'. Scan count 7, logical reads 805482

SQL Server Execution Times:

CPU time = 4155 ms, elapsed time = 2765 ms.


So it still read a lot less, at the expense of a lot more CPU and Duration. Ouch, bigtime, but how/why?


Looking at the tooltip for the Posts2011 Index scan.... we have a Residual Predicate which contains the query predicate... so each row in the Posts2011 table is being checked to see if it is valid.

This is also what is happening in the first 'no-index' example above, just that with both queries scanning, the benefit of reading fewer pages that using the Partitioned View provides is better than nothing, so we see the benefit.


As soon as we can Seek within the index, then the benefits can be lost.


Ok, lets Cover it.



Both now super seek, so both really fast, but we've sort of lost the performance benefit of the Partitioned view.


Add another tiny column in so we're no longer covered... Answercount.


And.... back to square one. Dbo.Posts seeks, but vw_Posts scans. Plus Clippy Clipster (copyright Brent Ozar inc) is now screaming at us to cover it.


So, it seems that the Partitioned view is much happier to default to a scan, than a seek and Key Lookup. As soon as the query is covered by an index, it's happy to seek away to its heart's content. Note the Query Cost above however, SQL is estimating the vw_Posts view query to be cheaper than the Posts table query. Nice attempt, but must try harder.


Dammit Jim, i'm a VIEW not a Table.


At the end of the day, it's a VIEW, with its own definition, and another layer for the Optimizer to navigate when building your query plan. Our view here is just to UNION ALL the tables togther to allow us to elimiate some table access. SQL is expanding that on every execution and incorporating it into our plans. Our queries need to take that into account also.


So it offers the chance of performance gains...

...plus flexible maintenance options...

...all at the risk of wrecking performance.


Sounds a bit similar to Table Partitioning pros/cons if you ask me, see below.


So, that all said what's the point?


Its another case of usefulness depends purely on your situation, so I like to consider it just another tool in the toolbox.


Things like:


  • Maintenance! You can maintain (or not maintain) smaller objects.

  • You need to traverse multiple databases.

  • If you can't index the 'partitioning' key column, and your queries all Scan already...

  • You can index the 'partitioning' key column, but you are pulling LOADS of data , ie... Reporting or DW type queries (and don't have

  • Your queries already filter in the partition column....

  • You want to have differences across your different sub tables...ie, indexes, compression, permissions, or even columns (as long as the view columns match up between tables).

  • The queries that you want to be performant are covered by indexes already so have less chance of regressing.

  • You want to reduce the blast radius of queries tipping from Seeks to Scans.

  • You just don't want to use Table Partitioning for whatever reasoning, maybe it bullied you at school.

  • You have the IO performance of a Commodore 64 and need to reduce pages reads at the expense of all else.

  • You're still on SQL 2016 RTM or below. (No, this isn't an excuse to stay there...#UpgradeJim!)


Proper table Partitioning was introduced as an upgrade to this, and it is in many ways. As alluded to, it also has its foibles however, as in:


  • It can/will harm perfomance,

  • Indexes need to be aligned with your partitioning key,

  • Queries need to be written in a way to take the partitioning into account

  • Partition elimination can be a fickle beast too .


Maybe you like to use Partitioned Views lots, and have good reason to. I'd love to hear any use cases you have and benefits that you see over table partitioning.


Thanks for reading.

SQL - "I really should be doing more with my life on my evenings than reminiscing about older functionality" - Rod

Komentarai

Įvertinta 0 iš 5 žvaigždučių.
Kol kas nėra įvertinimų

Pridėti vertinimą
bottom of page