Episode 41: Index Maintenance


Welcome back on the SQL Trail, compañeros. Does the thought of setting up your index maintenance plan keep you up at night? Are you looking for ways to make the job easier? In this episode of the SQL Data Partners Podcast, I talk to Sean McCown about Minion Software and the care and feeding of indexes. We also chat about his fragmentation and fill factor preferences, how and when to rebuild, and the ways that T-SQL could be improved.

Episode 41 Quote:

“…I figured I had to go by something that I knew I would be comfortable knowing that someone who was a sys admin was going to drop this tool in their shop, not know how to configure it, and not care to configure it, and not have any maintenance at all.”  – Sean McCown

This episode is sponsored by

COZYROC

Listen to learn…

  • The difference between index rebuilds and index reorganization
  • The aggressive fill factor percentages Sean suggests
  • Why he dislikes most maintenance plans
  • Why a firm maintenance window is unnecessary with Minion Software
  • The default settings Sean recommends for maintenance plans
  • The link between CD cases and SQL Server
  • Why Books Online is wrong
  • The T-SQL features that Sean could live without
  • The most useless superpower ever

As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes so others can find it as well. Subscribe to the podcast via iTunes, Stitcher, and Podbean.

About Sean
Sean McCown is a SQL Server DBA and Microsoft Certified Master in SQL Server. He and his wife Jen founded MinionWare, where they build the products they always wanted to use. As of 2016, this includes Minion Enterprise, Minion Reindex, Minion Backup, and Minion CheckDB. He also blogs about various SQL Server topics alongside his wife, Jen McCown.

Resources
@KenpoDBA  
MinionWare software
DBARant
Security Theater
Rebuild Index Task Maintenance Plan
Midnight DBA (Jen’s blog)
13th Bullet Bulletproof

Transcription: Index Maintenance


Carlos: Well Sean, welcome to the show!

Sean: Hey, thanks!

Carlos: It’s great to have you. Now interestingly enough, I met you on SQL Cruise a couple of years ago, and you were talking about this Minion environment that you wanted to create. You had this program, or this theory, that you could create these base tables and they would go out and execute or make changes to your jobs, all in an automated fashion, so that if you needed to change, for example, the location of a backup, you could do that and hundreds of servers could be updated instantaneously.

Sean: So it wasn’t really a theory. I had already had it in several shops. The theory was turning it into an actual marketable product.

Carlos: Well there you go, that’s right. Letting other people use it besides just you, the one who knows the ins and outs of it and all the documentation.

Sean: Exactly.

Carlos: And so, lo and behold, now there’s a product with the Minion. And one of the things I wanted to chat with you about, which is part of this Minion, is the index care and feeding, or the reindexing part of it if you will. And so, lots of misinformation and different opinions, potentially, about how to care for and feed those indexes. So I thought I’d have you on and talk about it. Obviously you’ve gone through lots of pains to make that an automated process that has now touched several environments and gotten some feedback. So let’s start there: what are some of the pains that we have with index maintenance?

Sean: Well, it really depends on what you’re using to do your maintenance. I hate to already set that up to be self-serving, but I really did create the product because of the pains that are out there. Not only in maintenance plans which are practically [expletive] useless. Oh wait, I didn’t mean to say that. I’ve been with Jen all day, anyway, you get the idea.

Carlos: [laughing] Thank you for toning it down a notch here on the podcast.

Sean: You know I did too! The maintenance plans, from what I’ve seen, they’ve made some attempt to make them better, right? But there are still a lot of holes in them. I would say the biggest problem is the lack of varied configuration. Everything out there treats index maintenance like every index on every table in every database has the exact same requirements.

Carlos: Right. It doesn’t care the size or anything. It says, “Hey! You’re an index! I’m going to rebuild you.”

Sean: Yeah, exactly. Maintenance plans as they stand now are exactly like that, right? It rebuilds them, period. And other solutions out there that are very popular kind of do the same thing. They allow you to make changes and have different settings, but it’s a separate job for every one of them. So if I’ve got 3,000 tables, 10,000 tables… I have one user with 350,000 tables in a single database. And if you’ve got that many tables and a lot of them have different fill factors or different thresholds, or need to be done on different days, or they need to be done in different orders because this table is more important than that table, and so on and so on… then that’s that many jobs. It’s all of a sudden a full-time job just to find out which job that table is being processed in.

Carlos: Sure. Let’s back up just for a second then to the maintenance plans, for example. We talked about, for example, that every index will get rebuilt. And I think, to put a finer point on that, we all create our maintenance plans with a maintenance window, generally in the evenings, where we have downtime and we expect things to run a little slower. And we all do our backups and our index jobs at that time. And occasionally, we get the call from the SAN admins like, “Uh… I see my SAN is going haywire at one o’clock in the morning. What’s going on?”

Sean: Yeah. Oh, is there a question in there?

Carlos: Well again, my thought is that with the maintenance plans just saying, “Hey, seek and destroy.” Creating some logic to say, “Hey, let me interrogate you a little bit and do I think you’re an index that needs to be rebuilt? Maybe tonight you’re not, but tomorrow you will be.”

Sean: Exactly. So there are a couple of things that I built in that really help alleviate some of that. And one of them is the Prep Only option where you can gather the index stats earlier that day and use them during the job. Because one thing that I saw is that if you have a maintenance plan that’s two hours or four hours, that can take an hour and a half to gather the stats.

Carlos: Interesting.

Sean: So you’re using a lot of your maintenance window finding out what kind of maintenance you have to do.

Carlos: Right. I guess I hadn’t interrogated that. I need to take a peek to see exactly how long those processes are taking. I assume that because you do have that step, you’re seeing that yeah, a lot of that time is spent saying, “Okay, should I rebuild you or not?”

Sean: Exactly. And the other thing that I did was I allow you to order the tables. So you can put the tables in any order you want to, and to me, that extends your maintenance window. Because there have been very few times, and I have been doing this for over twenty years, right? There have been very few times when there was a hard, “the index maintenance has to stop at this time. Period, end of discussion, no matter what.” When people tell me that I go, “Dude, come on. You act like I’m your mother and I’ve never been in databases before.” What they really mean, now they’ve had this drilled into them by their business units, but what they really mean is that their core set of tables that everybody hits cannot be reindexed outside of this window. And you could have one hundred or two hundred much smaller, less-used tables that can have that rolling outage from the locks. And they’re really not going to take up that many resources. They’re not going to take up a lot of disk or CPU, and they’re not going to lock for very long. So seriously, who cares if they’re done outside the window? So is it really that important to have that two-hour window if I know that all my really important stuff is hit up front and everything else just kind of trickles down? Because I guarantee that if you take a table with three thousand rows, twenty thousand rows, three million rows, and re-index it in the middle of the day, and it’s not something that gets hit pretty often, no one is even going to notice.
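
For listeners who want to see the shape of that “gather the stats early” idea, here is a minimal sketch using the fragmentation DMV. It is illustrative only, not Minion Reindex’s actual code, and the work table name is made up.

    -- Collect fragmentation numbers into a work table during the day so the
    -- maintenance window is spent rebuilding, not measuring.
    IF OBJECT_ID(N'dbo.IndexFragWork') IS NULL
        CREATE TABLE dbo.IndexFragWork
        (
            SchemaName  sysname,
            TableName   sysname,
            IndexName   sysname NULL,
            FragPercent float,
            PageCount   bigint,
            CollectedAt datetime2 NOT NULL DEFAULT SYSDATETIME()
        );

    INSERT INTO dbo.IndexFragWork (SchemaName, TableName, IndexName, FragPercent, PageCount)
    SELECT s.name, t.name, i.name,
           ps.avg_fragmentation_in_percent,
           ps.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
    JOIN sys.tables  AS t ON t.object_id = ps.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE ps.page_count > 1000;  -- skip tiny indexes that aren't worth the trouble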

Carlos: It all depends on the usage, right? And your mileage may vary there.

Sean: Exactly. So don’t tell me that, “no, all 400,000 of our tables have to be done in this two hour window or the business unit is going to lose their minds and the place is going to crumble.” Come on, man.

Carlos: So let’s talk about that. So what do you define as, “here’s my criteria for the index that needs to be rebuilt”? Carte blanche, general scenarios, default settings?

Sean: Well, so my default settings are, I’m a little more aggressive in my default settings than a lot of people are, right? So I do re-orgs at 10% and rebuilds at 20%. That’s a little bit more aggressive.

Carlos: It seems like 30 percent is that magic number.

Sean: Oh come on.

Carlos: [laughing] That’s one in three pages that’s affected.

Sean: We all know, and Paul has said this publicly many times, that when he put out that number in BOL it was because they forced him to come up with a number and he pulled that number out of his ass. He has said that publicly many times. And the truth of the matter is that, a) I’m with him. He hated giving a number but they made him. And b) the truth of the matter is, and this is the problem with many maintenance plans out there, even the more popular ones no matter how much acclaim they’ve gotten, that every table is different. Every table in your database could potentially be anywhere. I’ve seen tables with just a few thousand rows not be able to withstand any more than seven or nine percent fragmentation, and I’ve seen other tables be able to stand thirty-five and forty-five percent and be just fine. So there’s a lot of variance in there. How much the stats are being changed, how much variance in the WHERE clauses there is, and how much variance in the data. So there are a lot of factors involved, so every table has the potential to have its own thresholds.

Carlos: So this comes back to knowing your environments.

Sean: You have to keep track of this stuff and see when the tables… now, you have to have a default. The reason I built the system the way I did was to set all of your tables at a guess. Okay, I put 20%, you’re welcome to change it to whatever you like. BOL says 30, and no one ever got fired for following BOL, so they put 30. Whatever. Fine, you put 30 in there. And the point is to go in there and be able to put in table overrides and say, “No, no, no, this table needs 15 and this table needs 22, and this table needs this, and this table needs that, and this table doesn’t really need it until 60 or 65 ’cause that’s a small table and it just doesn’t have much.” But the point is to have everything covered, and then go in and put in overrides, and not have to put in tons of extra jobs just to do it.

Carlos: So I’m now curious. How many overrides, and I’m not sure if you’re getting data back on this, but that sounds like a lot of work to go in and say, “I’m going to set this table at 20% and this table at 25.” I can see it maybe at the schema level, to say, “Okay, my heavily used tables are in this schema. I’ll create one for this guy.” How realistic is it to go in and say, “I’ve got a hundred tables. I’m going to specify a different one for even 10% of them”?

Sean: Right, so you know, right now you’re talking about ten tables, right? Big deal. The good news is that even if you had 20,000 tables, you’re only going to have a handful of tables where this thing is really important. The entire database isn’t the most-used portion of the database, right? So there are going to be some tables that cause you problems, for one reason or another, and those are the things you put in exceptions for. And for everything else it’s going to be good enough. So if you had a schema with 1,500 tables and they were really important and they all needed the same override, let’s say they all needed to be 10% lower than the rest of the database, you don’t have to put them in there individually. You can write a query to do that INSERT into the settings table and create the overrides that way. You don’t have your hand in there doing 1,500 tables; that just defeats the purpose of it, right? But you do have your hand in there doing all that stuff if you need to create a separate job for every one of them and have that table in an INCLUDE with its own little parameters and stuff. You are going to have your hands in there 1,500 times, right? But for this one, just run an INSERT. And the cool thing about that is that it can be dynamic. The way I’ve done it is, not only can you do it based on that, you can order the tables. Right now you base it off of the order you give the table itself, but you can, and it doesn’t take very much coding at all, do it based on fragmentation level and usage. Or just usage. So you can look in the DMV and say, “these are the most-used indexes, let’s do these first,” and reorder them that way. I’m going to build that into the system in version 2, and right now you have to kind of create your own intermediate step, but even then, all you’re doing is an update based off of the results of another table, so it’s really not that hard. Once you get it written you apply it to everybody, right?
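
To make the override pattern concrete, here is a rough sketch of a settings table plus a single INSERT driven by the usage DMV. The table and column names are hypothetical; Minion Reindex’s real settings tables have their own names and structure.

    -- Hypothetical override table, just to illustrate the pattern described above.
    CREATE TABLE dbo.IndexMaintOverride
    (
        SchemaName       sysname,
        TableName        sysname,
        ReorgThreshold   tinyint,   -- % fragmentation that triggers a reorg
        RebuildThreshold tinyint,   -- % fragmentation that triggers a rebuild
        ProcessingOrder  int
    );

    -- Give every table in one important schema its own thresholds in a single
    -- statement, ordered by how heavily its indexes are read (busiest first).
    INSERT INTO dbo.IndexMaintOverride (SchemaName, TableName, ReorgThreshold, RebuildThreshold, ProcessingOrder)
    SELECT s.name,
           t.name,
           10,
           20,
           ROW_NUMBER() OVER (ORDER BY SUM(ISNULL(us.user_seeks, 0) + ISNULL(us.user_scans, 0) + ISNULL(us.user_lookups, 0)) DESC)
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    LEFT JOIN sys.dm_db_index_usage_stats AS us
           ON us.object_id = t.object_id AND us.database_id = DB_ID()
    WHERE s.name = 'Sales'          -- placeholder schema name
    GROUP BY s.name, t.name;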

Carlos: So obviously looking at the most-used indexes, we want to make sure that they get the best care and feeding. Because inevitably, in my mind, with good care and feeding, again we’re looking at the fragmentation and trying to keep it as small as possible based on these thresholds, I can potentially add a couple more indexes. Or I’m not going to be shy about adding those indexes because I know I have good care and feeding. I think we have to be careful with indexes and we can’t create every single index that every query wants.

Sean: …that crosses your mind.

Carlos: Exactly. But if we see that we have heavy usage there, we can do that. Now one of those defaults that we talked about, and another potential sticking point with index rebuilds, is the fill factor. And fill factor, just to rephrase it, take us through it. When we say fill factor what do we mean by it?

Sean: I like to use the analogy of my DVD case, right? So I’ve got this lovely DVD case, and if it weren’t so far across the room I’d bring it in and hold it up. So you’ve got a lovely book of DVDs and you’re putting them all in there in alphabetical order, and then you get all the way to Z because you like Zoolander and Zootopia. So you’ve got all these guys. So you’ve got the thing not filled up, you have a few pages left at the end, but you’ve put them all in there. And then you go and you buy another one that starts with a B. What do you do? All the spaces are taken. So do you find where it goes in there and move 30 pages of DVDs? No, that’s ridiculous. So fill factor is basically leaving space in that index, and on the page as well, for extra entries so you don’t have to move everything down. You’re saying, “Okay, I’m going to fill up each page by half.” And this is why I like this, because it’s a perfect analogy for reindexing, right? I’m going to spread these guys out so I have four slots on each page and I’m going to put two DVDs in each one. If I buy three B’s, now I still have to move something, because I have three B’s but I only have two slots left. So I’m going to take that and put it at the end, and I’m going to leave a little sticky note that says, “It’s on page something or other.”

Carlos: Right. Hey, I’m over here now.

Sean: Right. That’s fragmentation. And eventually, when I get five or six or seven of these things, it’s going to get laborious and I’m going to have to take that page out, instead of moving everything, I’ll take that B page out of the back and just clip it back in there where the other Bs are instead of moving everything down. So I’m going to take that page and move it back where everything was.

Carlos:  And SQL Server is going to get some efficiency with being able to read in order. It doesn’t want to jump around.

Sean: That’s actually reorg, you get the idea, right? That’s why I love that analogy. It really shows people who don’t really get the whole reindexing thing, it puts it in tangible terms for them.

Carlos: I noticed that your default fill factor is 90. Now my recollection is that, except for SharePoint, the advice is to keep it at 100, fill it up, and we’ll take care of the fragmentation with the re-index. Talk to me about your thought process there.

Sean: Okay, so, my personal default is 80. And I compromised on 90, because I know everyone doesn’t like the 80 default. But there are enough OLTP systems out there and one of the things we deal with the most is people not reindexing enough.

Carlos: That is true, right. Throw them out there and they don’t even create the maintenance plans, right?

Sean: And leaving a fill factor on there isn’t really going to hurt performance that much, which is why I put 90. There are a few extra reads, but it’s not really that bad, right? It’s not going to hurt your performance and it’s really going to help your INSERTs. If you’ve got an OLTP system it’s going to do nothing but help your INSERTs, and it’ll keep you from having to do that maintenance more often, right? So, in other words, it lets you do your maintenance less often, and that’s a way to state it that’s not a negative. But you know, if you have a DSS system and you do full loads every night and you drop your loads, well then change it to 0 or 100 and move on. I guess change it to 100, because if it was at 0 it’d be re-indexed all the time. But you’d probably just turn it off for that database anyway.
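
For reference, fill factor is just an option on a rebuild, or an instance-wide default. A minimal example, with made-up object names:

    -- Rebuild a single index with a 90 percent fill factor.
    ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
    REBUILD WITH (FILLFACTOR = 90);

    -- Or set the instance-wide default that indexes pick up when they don't
    -- specify their own fill factor (takes effect after a service restart).
    EXEC sys.sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure 'fill factor (%)', 90;
    RECONFIGURE;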

Carlos: So you’re not necessarily looking at size in there? I guess, you mentioned inserts. So if I know I have a table with lots of inserts, maybe I can reduce my fill factor? I guess would that be the only factor to consider? Inserts with fill factor? Or changes, potentially?

Sean: Inserts and updates, really. But an update is really a delete and an insert, so it still boils down to inserts because it’s still gonna have to move that. So yeah, it’s all on inserts, and it will keep the index from being fragmented as often by default, and I think that’s a good default setting. That’s a responsible setting. Because most databases are OLTP in some nature.

Carlos: The other thing that I had is that, yeah, you might increase the index size a little bit, but at least you’re going to get that buffer, because the other thing that people don’t do well is their growth settings. So when you do re-index, you’re going to run into log growth and database growth, and then all of a sudden you’re going to get the dreaded, “You have 15 occurrences of I/O taking longer than 15 seconds.” And you’re looking at your log the next day and going, “what the heck?”

Sean: Yeah, exactly.

Carlos:  Very good. So, other thoughts about index maintenance that maybe people aren’t doing very well? Other than setting it up, number one, that people should be thinking about?

Sean: Wow, let me think about that.

Carlos:  I apologize, we didn’t talk about the rebuild versus reorg debate. When to use reorg and when to use rebuild.

Sean: Right, so reorg is really just there to get you by until the next rebuild. So what it does is it puts those pages back in order; it just moves the pages back in order. It doesn’t actually physically change the pages themselves or reestablish the fill factor that was there. It just takes these pages, which get all shuffled up with page splits so you’ve got all these pointers going, “Oh, don’t forget I have this one over there, and that one over there, and this one there,” and it takes them all and puts them back in order, and that IO can really matter. So that’s why by default in my routine I put reorgs every night and just rebuilds on the weekend. I kind of like to do it more aggressively than that, but I kind of like to have a subdued default. I mean, coming up with defaults for these things is actually really kind of hard because you’re like, “god, what do I put for a default?”
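
As a rough illustration of those thresholds, reorg at 10 percent and rebuild at 20, a stripped-down decision script might look like the following. Real tools layer ordering, logging, ONLINE options, and per-table overrides on top of this.

    -- Build one ALTER INDEX statement per fragmented index, then run them.
    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql = @sql +
        N'ALTER INDEX ' + QUOTENAME(i.name) + N' ON ' +
        QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) +
        CASE WHEN ps.avg_fragmentation_in_percent >= 20
             THEN N' REBUILD WITH (FILLFACTOR = 90);'   -- heavier fragmentation: rebuild
             ELSE N' REORGANIZE;'                        -- lighter fragmentation: reorg
        END + CHAR(13)
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
    JOIN sys.tables  AS t ON t.object_id = ps.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE ps.avg_fragmentation_in_percent >= 10
      AND ps.page_count > 1000
      AND i.name IS NOT NULL;   -- skip heaps

    EXEC sys.sp_executesql @sql;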

Carlos: Well, so many environments can be different. Maybe you have a little more empathy for Microsoft now?

Sean: Well yeah, but not for the whole one percent growth on the log. Come on, man. That’s a horrible default. Okay, maybe one and one. But one and ten percent? Really? So you know, I figured I had to go by something that I knew I would be comfortable knowing that someone who was a sys admin was going to drop this tool in their shop, not know how to configure it, and not care to configure it, and not have any maintenance at all. So I wanted to give them a fill factor, I wanted to set them up to not have to do the full rebuild every week, or every night, right? So I gave them a fill factor, and I’m running the reorg job during the week, and that will get them through to their weekend maintenance, and then they can change anything on top of that. I’ve actually seen reorgs outperform rebuilds, as in the table performs better just on a reorg than it did with a rebuild.

Carlos: Oh interesting.

Sean: Yeah, I know. No one’s been able to explain it to me, but I’ve quantified it a few times.

Carlos: Very good, so I guess your mileage may vary. Sean, let’s do SQL Family.  So one of the things I always like to talk with people is how they work. You created a tool and we can talk more about that tool, but what’s your favorite SQL tool?

Sean: What’s my favorite SQL tool outside of mine?

Carlos: It can be yours. Let’s talk about it.

Sean: Oh, it’s a toss-up between Minion Enterprise and Minion Backup itself.

Carlos: Okay.

Sean: And very quickly becoming CheckDB because I’m writing it now. Oh, it’s a sweet tool.

Carlos: So, Minion Enterprise consists of the reindex, the backup, DBCC checkdb…

Sean: And so many other modules it’s not even funny. Dude, it does so much.

Carlos: Very good. And so this is ultimately a way for administrators to administrate their environments with all these different components and be able to update what they have and set them much easier?

Sean: It takes all of those things that you would normally repeat across all your servers and it lets you do it from a single place in literally seconds. So a really good example that I like to give with the security module is, let’s say that you have Anne, she just got hired into marketing, and she needs to look like Steve. But Steve has access to 40,000 databases across 700 servers.

Carlos: Yikes.

Sean: And his permissions are completely different in all of these things, it’s all over the board ’cause Steve’s been here 7 years and you’ve fine-tuned his permissions to everything, right?

Carlos: And you’re not using Active Directory groups?

Sean: Right.

Carlos: Shame on you! [laughing]

Sean: Well, maybe it’s a mix, right? Maybe you’ve got it on some and not on others. So with Minion Enterprise, I can clone, what did I say, Mary? I can clone Mary as Steve and it takes me 30 seconds. And I’ll clone her across all 700 instances with the same SID. So, just stuff like that, where if someone gets put into sa, even if they’re buried 30 levels deep in an AD group, you’ll get an alert saying, “this guy’s been given access to these groups on these boxes.” It’s just one of those things that really, really enhances your security, because now you know. Because you can’t physically watch that many boxes at once and you don’t want that much stuff out on one of these boxes. So you do it all from one place.

Carlos: There you go, very cool. If you could change one thing about SQL Server, what would it be?

Sean: Um, I think I would own it. [laughing]

Carlos: [laughing] It would be Sean Server!

Sean: That’s right, I could use that kind of cash. Other than that, oh, there are aspects of every piece of it that really piss me off.

Carlos: So obviously we talked about the default database growth settings.

Sean: Yeah. That’s an annoyance, but I wouldn’t spend my dev dollars there. I think I would make things in T-SQL a lot easier. Because there are a lot of things that they could do for us that make things really hard right now, and actually decrease your security.

Carlos: Kind of getting outside SQL Server itself, getting out to the box?

Sean: If they wanted to improve that, sure. They could make it easier. I don’t think they could make it safer. But they could make it easier, right? I mean, things like right now we have to use dynamic SQL for just ridiculous stuff. If we could do something like SELECT * FROM @table, instead of having to write dynamic SQL for that. Or if I could get at databases like USE @database and then do a query in the context of an sp, right? And then switch to that database instead of having to roll through each of them in dynamic SQL, that would help as well. Just those two cases alone would knock out like 90% of the dynamic SQL out there, because most of it is just getting around this kind of BS that they don’t let us do. Now, you brought up command shell.
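
To show what Sean means, this is the kind of dynamic SQL those two missing features force on us today. The table and database names are placeholders.

    DECLARE @table    sysname = N'dbo.Orders',   -- illustrative object name
            @database sysname = N'SalesDB',      -- illustrative database name
            @sql      nvarchar(max);

    -- "SELECT * FROM @table" has to become dynamic SQL today:
    SET @sql = N'SELECT TOP (10) * FROM '
             + QUOTENAME(PARSENAME(@table, 2)) + N'.' + QUOTENAME(PARSENAME(@table, 1)) + N';';
    EXEC sys.sp_executesql @sql;

    -- "USE @database" inside a procedure has to become a call to that database's
    -- own sp_executesql, which runs the batch in that database's context:
    SET @sql = QUOTENAME(@database) + N'.sys.sp_executesql';
    EXEC @sql N'SELECT COUNT(*) FROM sys.tables;';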

Carlos: [laughing] No, no, no, I’m just saying getting out of SQL Server. I assumed that’s what you were talking about.

Sean: No, no, no, it’s fine. But, I will go off on one of my diatribes for 20 seconds.

Carlos: Here we go.

Sean: Everybody is so against the command shell.

Carlos: It’s been put on the naughty list.

Sean: Everyone keeps calling it evil and they keep saying it’s a security nightmare and it’s just not. It isn’t.

Carlos: We’ll make sure to put in your rant. I did read that today in prep for this interview.

Sean: For the security theatre? Right?

Carlos: We’ll make sure it gets into the Show Notes.

Sean: Oh, excellent.

Carlos: So Sean, you’re a very interesting fellow. And again, I was very interested to meet you on SQL Cruise. I admit, I was going to go on SQL Cruise anyway, but I was interested to chat with you and one of the things that struck me when we talked is that, at the time, and again MCM has kind of gone the way of the dodo. You were the only one that I was aware of that was a non-Microsoft, non-consultant person to get the MCM.

Sean: Yeah, actually at that time I was.

Carlos: I was like, “Wow, that’s kind of interesting.” And I asked you, “why did you want to get that?” And do you remember what you said?

Sean: No, I don’t, actually.

Carlos: Let’s see if your answer matches. Why did you want to get your MCM?

Sean: I wanted to see how I measured up, I think?

Carlos: What I remember you told me was, “I wanted to see if I could do it.”

Sean: Yeah.

Carlos: I was like, wow!

Sean: Because you have to be a consultant most of the time to get the level of experience it takes to do something like that. And I wanted to see, since I had been a DBA in major shops but never really a consultant, I wanted to see if I had the chops for that kind of thing. If I was as good as I was purporting to be.

Carlos: [laughing] very cool, so with all of that, my ultimate question is: What is the best piece of career advice you’ve received along the way?

Sean: You know what? I remember seeing that in your show notes and I forgot to even think about it. I’ll tell you. I don’t know, a lot of people don’t know that I was friends with the late Ken Henderson. You know, the Guru’s Guide books. He was big at Microsoft, based here in Irving, and we used to sit up late at night and chat. We were both late-night coders and we would sit up late and chat. And I sat in his office late a number of times and just shot the breeze with him and what-not. We sat on his couch and what-not, and I liked it because he always kept the lights off. And so one night he gave me an excellent piece of advice, I don’t think he was even trying to at the time, but he was sitting there really going off about something on his blog and he said, “Don’t ever, ever write or talk about anything in your professional community about religion.”

Carlos: Our last question for you today. If you could have one superhero power today, what would it be and why would you want it?

Sean: I would like to read minds.

Carlos: Wow.

Sean: Because knowing what people are thinking can really give you an advantage, even if it’s just knowing when to shut up. Something as simple as, “wow, I’m pissing her off. I’m going to shut up and walk away.” That’s a brilliant skill to have: not only to know what they’re thinking and guide the conversation, but to know when to stop it, too.

Carlos: [laughing] I’m done, I raised my white flag. I’m out.

Sean: Okay, well it’s been good talking to you! See ya! Sorry I’m such a putz, right? [laughing] It’s like, “Oh wow, he knows he’s such a putz and he stopped! Good for him!”

Carlos: [laughing] There you go. Now you just have to deal with all the dirty laundry. That’d be the one downside that I see to that power.

Sean: Yeah, but you could also know when someone’s mind is wandering and they’re really not listening to you.

Carlos: There is that. It’d be nice to know that about the people listening to my podcast. Am I just talking to empty space right now?

Sean: Exactly. We saw this, there was this wonderful meme going around for a while about useless superpowers. The one that I really loved the most was 13th bulletproof.

Carlos: 13th bulletproof, what’s that?

Sean: You’re bulletproof to the 13th bullet.

Carlos: Okay, so number 14, you’re done.

Sean: So this guy was lying there with 12 holes in him and the 13th one bounced off.

Carlos: [laughing] Oh, so it’s only on the 13th bullet. I thought at least you got to 13 bullets.

Sean: 13th bullet proof.

Carlos: Yeah, that is useless.

Sean: I thought it was a great image too. He’s in an alley with 12 holes in him and the last one just bounced off. I was like, “aw, dude!”

Carlos: Very good. Sean, thanks so much for being here.

Sean: My pleasure. I love doing stuff like this, I don’t do it nearly enough.

Carlos: We’ll make sure that there are links to your product and your brand, some of the things we talked about today will be up on the Show Notes for people to listen to later. So we do appreciate you coming and chatting with us.

Sean: Excellent. And have me back anytime. I’ll darken your doorstep any time you like. Now the truth is, and this is for your listeners, how much of what we talked about is going to be edited out? I counted two.

Carlos: I think, the index part you’re safe. Once we get to the SQL family questions, then no promises there.

Sean: I can’t wait to see it.

Carlos: Thanks again Sean.

Episode 40: Squeezing Data With Compression

Compañeros! Did you know that according to an unofficial Twitter poll, only 60 percent of SQL Server Enterprise users said that they took advantage of the compression features? In this episode of the SQL Data Partners Podcast I chat with Jason Brimhall about why he thinks that is. He also answers your SQL Server compression questions: Why is it a bad idea to compress everything in your database? How does compression work? Why doesn’t compression always work? We also talk about why compression sometimes doesn’t help, and he reveals a case study where compression moved the performance needle in the wrong direction.

This episode is sponsored by DBMaestro

Episode 40 Quote:

“…Most of the time, because of the huge decrease in IO that is occurring, you see an improvement in performance. Queries improve dramatically in performance. I have seen, consistently, five to six times performance improvement when you compress tables.”  – Jason

Listen to learn…

  • The changes to compression in 2016
  • How compression affects CPU cycles and performance
  • The link between compression and indexes
  • When to use compression
  • How to tell if compression will work for your database
  • The deprecated SQL Server feature that compression relies on
  • The importance of test environments when using compression
  • The company motto that Jason says is the best advice he’d ever received
  • The mid-nineties movie that Jason would love to live out

As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes so others can find it as well. Subscribe to the podcast via iTunes, Stitcher, and Podbean.

About Jason Brimhall
Jason is a SQL Server Microsoft Certified Master/Microsoft Certified Solutions Master and Microsoft SQL Server MVP. He’s currently employed by SQL Solutions Group where he consistently finds ways to maximize query performance. Jason is the author of two books:

Resources

Jason on Twitter
SQL Server Society of Las Vegas
SQL Server 2016 and Compression
Page Compression
Row Compression
Unicode Compression
SQL Server Data Types
SQL Server Extended Events

Transcription: compression

Carlos: So Jason, welcome to the show!

Jason: Hi! Glad to be here.

Carlos: So you’re quite the travelling man. You’re joining us all the way from Madison, Wisconsin!

Jason: Yeah, it’s clear down there at the bottom tip of South America. It’s the furthest south I’ve ever been.

Carlos: [laughing] Man, I’m going to have to go back on my geography there. Seeing as I spent two years in Argentina, I don’t remember that being at the southern tip of South America.

Jason: Well, it depends on which South America you’re talking about. [laughing]

Carlos: Well there you go. So ultimately I’m hoping that your data compression knowledge is a little better than your geography, and that’s really what we want to talk about today: database compression.

Jason: Well actually, as you segue into it, it works very well, because we just compressed the entire globe into the northern hemisphere, into the United States.

Carlos: [laughing] There you go. So I guess there are some similarities then, right? We’re going to take all that and squeeze it down into only what’s needed. I guess it’ll be a lot of mountain ranges at that point.

Jason: Yeah, lots of peaks and valleys.

 Carlos: So, database compression, right, is ultimately an Enterprise-only feature. Let’s get into that a little bit. Ultimately with compression we’re trying to shrink our database. Why wouldn’t we always want to compress everything? Or, what decisions or factors go into deciding what to compress?

Jason: So that’s actually a very good question, and it’s kind of a hard question to answer to be honest, because ideally we’d want to compress everything, right?

Carlos: It makes sense. You’re talking about these huge systems and sure, disk is getting cheaper but with all this data kind of being everywhere being able to compact that seems like it would make sense.

Jason: Yeah, you know, everybody’s always trying to reduce their storage footprint no matter which way it is. I mean if you look at the SAN level, you have all these data de-duplication systems that de-duplicate at the block level trying to save on storage, right? And the same is true inside the database. So you sit there and go, “Well, why wouldn’t we want to de-duplicate or compress our data and remove all that extra space?” Well, a reason why we may not comes from the hard knocks of experience, where I actually had a client in a situation where we were able to take a 700 GB table and reduce it down to 135 GB. And we’ll get into some of the internals of compression here in a little bit and explain how that’s possible and how it really works. But this table had a query that was executing against it, and they had a lot of data in there and they were running low on disk space, and it was just this one table that was chewing up most of the disk space. So the client had no actual test system. Their dev, test, stage, and production were all the same server: their production box. But that doesn’t ever happen in real-world scenarios, right?

Carlos: Yeah, I’m sure. None of my companeros have instances like that.

Jason: In real-life situations, you do with what you can afford. And that’s one server to be able to serve up the production data, and so in an effort to try and save them some money we went and compressed that table. As a baseline prior to compressing, I tested the query that was hitting that table. And the query would take 21 seconds or so to execute prior to the compression. 700 GB, mostly performing a clustered index scan the way the query was written, and we couldn’t rewrite the query for them because of application constraints. So we compress the table, get it down to 135 GB, and after that we run that same baseline query, and that same baseline query is now suddenly taking 10, 15, 20 minutes to execute.

Carlos: Oh wow.

Jason: And the only change having been compression. We only implemented page compression. So the rollback on that is, “Hey, this is the only change that we made, so let’s confirm that this is actually what we’re seeing.” So as a test, we tried to rewrite the query and optimize it. Still running slow. And after that, I said, “OK, well, the change was to compress the table. We compressed the clustered index. I’m going to uncompress the clustered index, I’m going to remove the page compression, and then rebuild the index to make sure that every page is rewritten to remove that compression.” And we reran the query and it’s back down to 20 seconds. Immediately. So the moral of the story, besides test, test, test, test, is that not every system is conducive to compression. And the only way that you can really see that it’s going to get you the benefits you need… I mean, this wasn’t an update. This was just a pure read query, and when you do reads, you know, less data and less IO should equate to better performance as well because you’re reading fewer pages.
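
For anyone who wants to run the same experiment, the moving parts are small; the table name below is made up, and the point, as Jason says, is to estimate and test before you commit.

    -- Estimate what page compression would save before touching anything:
    EXEC sys.sp_estimate_data_compression_savings
         @schema_name      = N'dbo',
         @object_name      = N'BigTable',
         @index_id         = NULL,
         @partition_number = NULL,
         @data_compression = N'PAGE';

    -- Apply page compression to the table (clustered index or heap):
    ALTER TABLE dbo.BigTable REBUILD WITH (DATA_COMPRESSION = PAGE);

    -- And the rollback described above: rebuild again with no compression.
    ALTER TABLE dbo.BigTable REBUILD WITH (DATA_COMPRESSION = NONE);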

Carlos: Right, that was my other thought there. With the compression we’ve talked a little bit about the disk. But of course, SQL Server reads everything from memory, so if it pulls that smaller footprint into memory you in theory could fit more pages in there.

Jason: Yep, you can definitely fit more pages into memory, and I tested this particular case with both a cold test and a warm test. So everybody understands what that is: you run the same query multiple times to make sure that you have the pages in cache and you’re pulling from memory and not from disk. And there was no real difference in performance there. Now, you might say, “Well, obviously something else must have been happening that caused it to slow down.” Well, I monitored the CPU as well during this time, and there was no noticeable increase in CPU utilization during this particular query. I forget the exact metrics, but when this query was running, this server was running at 50 percent CPU utilization. After the compression, the CPU utilization actually dropped to about 4 percent. Running the query, it stayed at 45 percent. So no noticeable uptick in CPU utilization. That kind of digresses into another talking point I have about compression, and that is CPU utilization.

Carlos: That seems to be the big hit, right?

Jason: So, a lot of people will sit there and say, “Well don’t compress, because you’re going to see a big hit on CPU.” And there’s a caveat to that, right? The caveat being that, yes, it does take a little bit of CPU to uncompress those pages and expand them out as it reads them. Understood. However, consider that in most cases those 20-second queries should have improved to 3 or 4 seconds or even faster. So if I have decreased my CPU stress for 16 seconds, I have freed up CPU. Now, maybe I see an uptick for the 3 or 4 seconds of that duration, maybe I don’t. If a query executed for 21 seconds it has to have utilized X quantum of CPU, right? Well, if I can make it run in 3 to 4 seconds, now it’s using a different quantum of CPU. Technically fewer cycles of CPU, but if it upticks, say, from 50 percent to 55 percent for those 3 to 4 seconds, then it’s only 3 to 4 seconds of an uptick in CPU usage as opposed to 20 seconds.

Carlos: You’re kind of getting back the rest of those cycles, in a sense.

Jason: Right. So it’s kind of a give and take. I have improved the performance of the query and removed those CPU cycles. Occasionally I may see an uptick in CPU cycles for the duration of that query and occasionally I will see lower CPU cycles. So, the actual CPU utilization is going to largely depend on the system you’re trying to compress, and every case is going to be different. Your mileage may vary. Let’s throw another one out there, another SQLism: it depends. Okay? You have to test. So what do you do? You set a baseline, you run a known set of queries, you compress it, you run that known set of queries again, and you compare the results before and after.
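
One simple way to capture that baseline is the STATISTICS output; the query here is only a stand-in for your own known workload.

    -- Turn on per-query IO and timing numbers before running the baseline set.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT c.CustomerName, SUM(o.TotalDue) AS TotalSales   -- placeholder query
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerName;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    -- Compare logical reads and CPU/elapsed time from the Messages output
    -- before and after compression, running both warm and cold tests.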

Carlos: Measure those. Interesting. So that kind of sounds like it’s a little bit of a crapshoot. Kind of, “You try it in your environment and see what happens.” Are there any, maybe not hard and fast rules, but general guidelines where you can say, “I’ve seen favorable compression when X.”

Jason: So, I will say that it’s less of a crapshoot, and with the exception of that one situation in the story I started with at the beginning, that’s kind of the edge case scenario. And I like to tell that story because it underscores the necessity to test in your environment. Now, if that’s an edge case, we’re talking about the 99.99 percent of the time where it’s not an edge case. It’s not that particular type of case where you see that huge difference in performance, where all of a sudden it all hits the fan and nothing is performing like you expect it to. Most of the time, because of the huge decrease in IO that is occurring, you see an improvement in performance. Queries improve dramatically in performance. I have seen, consistently, five to six times performance improvement when you compress tables. And the five to six would typically be with page compression. You’re going to see different performance gains if you use a different compression algorithm. Say, row compression or columnstore compression, which is kind of like page compression on steroids. And then to throw the next generation into that is the in-memory Hekaton, which is also built on kind of the same principles as columnstore, using that xVelocity and taking it to the next level and compressing that data super tiny so it can fit in memory better.

Carlos: Yeah, so you can put more in there.

Jason: So, it’s, as Microsoft has touted, you can see thirty to one hundred times improvement in query performance.

Carlos: But I think it’s important to point out that when we talk about compression as the SQL Server feature, we’re not talking about in-memory tables, correct? That feature, or the columnstore index, the way you would set those up and implement them, those are different features.

Jason: Right, that’s not using the compression feature; it’s using a different syntax. They kind of use compression as a building block for columnstore, and it’s just a building block for Hekaton, right? So they’re all kind of intertwined, and really, when you look right down at it, they’re all kind of intertwined back to a deprecated feature that you can’t turn off. It’s still there. That’s the vardecimal storage format. That’s the original SQL Server compression that’s deprecated, but even in 2016 it is still enabled and you cannot turn it off. That’s because they all kind of require that compression algorithm to be there for each of them to work. While the syntax is technically different for each of those, they are building blocks.

Carlos: So talking about data types, are there certain kinds of data or datatypes that wouldn’t be available for compression?

Jason: So, the types of stuff that you want to avoid for compression are going to be any of the data types that are going to be stored off row. And when we say off row, that means…

Carlos: Varchar max…

Jason: Right. Varchar max, nvarchar max, your varbinary max, right? The MAX types. And it’s not always necessarily max, but anything that we have to store, once we get up to 2,000 or 4,000 characters depending on the kind of data being stored on that page, to the point where it has to overflow the data from that page to another page, where you just have a pointer and it’s filling up another page with all that data… That’s what we’re talking about.

Carlos: So maybe just to circle that base, it might be helpful to explain that a little bit more. As SQL Server writes, we have these 8KB chunks called pages. When you have a column like an integer or a char(10), it’s going to write that into the page. And as you mentioned, for any of these MAX columns what it’s going to do is say, “I don’t really know how big that thing’s gonna be, so I’m going to give it its own space.” And for that single record you start writing, it says, “Oh, varchar max, I’m going to give it another page.” And now you have two pages for that single record.

Jason: Yeah, and there are times, even with a MAX, where we can spill that data out into more than just a couple of pages, right? We can have many pages where it’s only the data for that one row, one column. That stuff can’t be compressed, and the reason goes back to how SAN de-duplication works. Without getting into the weeds too much on it, the way compression works in SQL Server is: first you have this thing called row compression. It’s one of the options for compression. And what row compression does is it removes the spaces, all the extra characters, from within the row. So it reduces the amount of space within the row. And then you have page compression, which first does row compression and removes all that extra space from the row and trims it up that way, and then it looks for the same data appearing in multiple places on the page. And then it does a data de-duplication. So it compresses the data by removing all that space, but then it de-duplicates it, leaving kind of a token in each of the slots where it has replaced that data. So now it becomes a whole lot smaller and you have a reference pointer, much the same as with SAN technology where they de-duplicate the data. You have one known good value and the rest of them have a pointer back to this known good value. It removes a lot of space.

So with that explanation in there, now when you have one row with one column that has spilled over into two or three pages, you have this data that’s all on these other pages. It can’t be compared to any other columns or any other rows on those other pages in order to de-duplicate the data, because as far as SQL Server is concerned, it’s unique data.
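
In syntax terms, row and page compression are just rebuild options on a table or an index; for example, with illustrative index names:

    -- Row compression only: shrink the storage used within each row.
    ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = ROW);

    -- Page compression: row compression first, then de-duplication of
    -- repeated values on each page.
    ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = PAGE);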

Carlos: So we’ve talked a little bit within that about the CPU bump that you may hit. But now that it’s doing all of that, I’m thinking about other side effects, with index rebuilding for example. Any effects there? Is it uncompressing, looking at that, re-sorting it, and recompressing? Or how does that work?

Jason: That’s a good question. The way that the indexing is going to work is, when you’re actually running a rebuild or a reorganize, that’s where you’re going to see some sort of uptick. It is going to require more resources. It does have to take and look at that page and say okay. And understand what happens when you rebuild these pages, right? What’s happening is that it’s sorting the pages, not the data on the pages, unless it finds that the data has changed and we need to move data between pages. And that’s where you start to have the uptick in resource utilization: if, because of the splits and stuff, you now have too much free space on a page and data gets moved between pages. There, just like when you’re using it normally, I’ve seen an uptick of maybe five percent in CPU utilization. Time-wise with index rebuilds, I can’t honestly say I know how much longer they take, because I set up intelligent rebuilds where I just set a time limit for my index rebuilds on a nightly basis. And I just check to make sure that they got rebuilt within the time limit. I haven’t dived terribly deep into seeing if, “Oh, this index took, say, two minutes and now that it’s page compressed it’s taking two minutes and thirty seconds.” You know? When I’m looking at just a few seconds here or there, it would have to be something dramatic for me to notice that it’s taking considerably longer. Otherwise it’s not statistically different.

Carlos: Right, so I think if you have good care and feeding of your indexes and you’re putting them, maybe not off hours, but generally you tend to see them in the non-business hours when that works for your environment. But good care and feeding and you’re watching that.

Jason: And along the same lines, and this kind of ties back in with the principle discussed earlier about the MAX datatypes where you can’t compress those pages: just because you have that MAX datatype there doesn’t mean that some pages don’t get compressed. Some pages are still going to get compressed within an index or a table, so the same token still applies. You have some pages that are going to be at different stages of compression, none or row or page compression, within an index. So not every single page is going to be completely compressed. Say you have 100 pages but only 5 of them meet your compression requirements. You’re not going to see a huge bump in CPU utilization because it’s only 5 pages that were affected. Hopefully it’s not that few if you’re going with page compression.
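
If you want to see how much of an index actually ended up page compressed, the catalog views and the physical stats DMV will tell you; the table name is illustrative:

    -- data_compression_desc shows the setting per partition; compressed_page_count
    -- (DETAILED mode) shows how many leaf pages actually compressed.
    SELECT p.index_id,
           p.partition_number,
           p.data_compression_desc,
           ps.page_count,
           ps.compressed_page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.BigTable'), NULL, NULL, 'DETAILED') AS ps
    JOIN sys.partitions AS p
      ON p.object_id = ps.object_id
     AND p.index_id = ps.index_id
     AND p.partition_number = ps.partition_number
    WHERE ps.index_level = 0;   -- leaf level only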

Carlos: Yeah, we like to see that number a little higher. So any differences in compression in 2016? We’re talking about all kinds of new features and things, and compression’s not one that I’ve heard about. Any changes between the versions?

Jason: There haven’t been any real significant changes in the base compression feature since 2008 R2, when they added nchar and nvarchar into being able to be compressed.

Carlos: So 2008 R2 was the last big changes to compression.

Jason: Right. That was the last big change there. And you know, with justification there for Microsoft, this is a building block for the other big moneymakers. You know, your columnstore indexes, where you can now have writable columnstore indexes in 2016. It was more important to make changes along those lines under the covers and really keep that vardecimal and compression as the base building blocks of that kind of thing. I don’t see it changing too terribly much. That said, there’s still some hidden stuff in 2016, so maybe there have been some changes.

Carlos: Yeah, that’s true. It’s still kind of a mixed bag as to what’s going to be available, and they continue to work hard to get that out the door. So, an unofficial Twitter poll by Kevin Kline: he asked who of those with Enterprise edition, and ultimately compression is an Enterprise-only feature, is using compression. And again, it’s an unscientific poll. But he found that 60 percent of those that had it were using it. Now he was thinking it was going to be closer to 20 or 30 percent, so it actually doubled his expectation. But that still seems a little bit low to me. What are you seeing as an adoption rate, and why do you think those numbers aren’t higher?

Jason: I think a large part of it is that, a) there are still a lot of people that don’t actually understand what compression is. That’s one part of it. Two, we have a lot of shops that are mom-and-pop shops and they don’t have a lot of time to investigate. So, “if we haven’t heard about it, it may be this grand old smoking gun that we can possibly use, but how do we know that it’s going to do any good?” And then the other part of it is, being an Enterprise feature, it costs a little bit more money. A lot of shops try to get away with Standard Edition type features.

Carlos: Sure, but I think he was asking about those who had Enterprise edition. So I think 100 percent of his pool had Enterprise and 60 percent were using it. So I think he was thinking it was only going to be 20 or thirty. Of those Enterprise people, why do you think it’s so low?

Jason: It primarily comes down to education, as the first one. And secondary is going to be that it comes down to time and money. A) Time to investigate and test, and B) money to be able to have a test environment where they can spare the cycles of the employees to do that kind of stuff.

Carlos: Fair enough.

Jason: I was trying to look real quick because I had thought I had heard somewhere that in 2016 they had dropped it from Enterprise to Standard. So I can’t find anything in the documentation that says that it is an Enterprise-only feature in 2016, but don’t quote me on that yet. I’ll have to research that some more so we can be absolutely certain about that.

Carlos: And if that information becomes available, we’ll try to put that in the show notes for today’s episode.

Jason: Here is something that did change in 2016. Unicode compression was added previously, but they have extended Unicode support for compression. So that’s a hidden feature that I hadn’t known about. It’s a change, but you have to be using Unicode for it to make that big of a difference. But let me research, because I see nothing in the documentation that says it’s for Enterprise only in 2016.

Carlos: Okay. Well I think even that, some of the features like stretch, I don’t think they’ve come out and said one way or another if it’s going to be Standard or Enterprise or if certain features are only available.

Jason: There are certain features that you can almost absolutely say with 98% certainty that it’s going to be Enterprise level. Compression is one that they’ve been talking about for a while, they’re going to drop it down kind of like how they’re doing for Availability Groups. They’re making it so that Availability Groups, a limited version of it, is available in Standard Edition.

Carlos: Kind of replace the mirroring idea?

Jason: Yeah. It seriously makes sense.

Carlos: Oh it does, it’s great. If you’re going to deprecate that feature you gotta give them something else to be able to use, because you want to have a DR strategy even for the smaller shops. Very good.

Jason: Other questions?

Carlos: No, thank you for this compression discussion. Let’s do SQL Family.

Jason: Okay. You’re not pumpkinized yet?

Carlos: [laughing] No, I haven’t turned into a pumpkin just yet.

Jason: [laughing] That better make it onto the show.

Carlos: We’ll see.  So one of the things I always like to talk to other folks about is the way they work. So what is your favorite SQL tool? It can be a free tool, a paid tool, why do you like and how do you use it?

Jason: Oh that’s easy. Extended Events.

Carlos: Ah! You’ve been talking quite a bit about that. So why is Extended Events your favorite tool?

Jason: Well, Extended Events basically is the bomb. [laughing] If you pretty much want to be able to do anything from monitoring your server to blowing up your server, Extended Events can do it for you. And don’t quote me on that, I’m not going to show you how to blow up your server with Extended Events, but we all know that we can do it with Profiler. But there are some special ways to do it with Extended Events. Any time you do anything really stupidly, right, you can blow up a server. In all seriousness, it gives you some fantastic insight into the database engine, into the optimizer, into SQL Server, and allows you to make much more educated decisions about what you’re doing with your job.
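
As a small taste of what Jason means, here is a minimal session that captures statements running longer than a second; the session and file names are made up.

    -- Create a lightweight session that logs slow statements to a file target.
    CREATE EVENT SESSION [LongRunningQueries] ON SERVER
    ADD EVENT sqlserver.sql_statement_completed
    (
        ACTION (sqlserver.sql_text, sqlserver.database_name)
        WHERE (duration > 1000000)   -- duration is in microseconds, so > 1 second
    )
    ADD TARGET package0.event_file
    (
        SET filename = N'LongRunningQueries.xel'
    );

    ALTER EVENT SESSION [LongRunningQueries] ON SERVER STATE = START;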

Carlos: So, you’ve just inherited one million dollars. What are you going to do with it?

Jason: Don’t ask me what I’m going to do with it. It doesn’t matter if I inherit it or not, it’s not mine to spend! [laughing]

Carlos: [laughing] Oh, should I be asking your wife?

Jason: [laughing] Ask my kids.

Carlos: Yes, that is one thing that you and I share, is lots of kids.

Jason: Yeah, we all know that it’s not really for yourself that you make money. You make the money for the kids so that they can go spend it. Might as well hand over the paycheck to the kids. Um, one million dollars. It’s actually a rather easy decision, and boring. Basically, it goes first into Uncle Sam’s hands.

Carlos: Yeah, there is that.

Jason: 50 percent of it is inheritance tax. So that leaves me with 500 thousand dollars, which basically pays off all my debt and gives me a good chunk of money to put into some sort of savings that would then go to the children, either for college or for missions, weddings, houses. Basically, it goes to them.

Carlos: Very good. So you’ve been around a few times. You’re a consultant now. What’s the best piece of career advice you’ve ever received?

Jason: Best career advice I have ever received? It's actually kind of a difficult one there. I think it really just kind of boils down to "Just do it". Not specifically career advice, but kind of lifetime advice. Don't wait, just do it. Just get up and do it. If there's something that you would be, I don't want to say proud of, because it's different than pride, but something that would be honorable, do that.

Carlos: I saw a quote the other day that said, "Failure is not trying something and getting it wrong. Failure is not starting."

Jason: Exactly. You know, it's scary, to be honest, as a consultant. Because you know, I've got a family. That's really the hardest part. And I think it was Stacia Varga, and she's not the first person who said just do it, because I've heard that in church talks a long, long time ago.

Carlos: There's also that multi-billion dollar company in Oregon that's been saying that for a while.

Jason: I'm not even thinking of that one, no. Besides, I'm an Asics man. [laughing] So, what Stacia said is that besides just doing it, you have to believe in trusting yourself. You know you can do it, and you know you have the skills, now just trust yourself. So it was kind of good, because she's been a consultant for a very long time, and we've lived and worked together in the Las Vegas User Group area. And we had some chats. And it actually made a lot of sense. It resonated, right? You just sit down and you do it and trust in yourself. When you trust in yourself and you do a good job, the work just keeps on coming. Sure, it might be a little bit slower sometimes than others, but a lot of times it just keeps coming. And seriously, the same is true of being a full-time employee. As a full-time employee, the work never ends. The more you do a job, and the more you do it well, the more that gets entrusted to you by your boss or your employers. So do something, and do something that will be honorable in your eyes and honorable to the company. It doesn't matter what the company is.

Carlos: Good stuff. Jason, our last question for the day. If you could have one superhero power for the day, what would it be and why would you want it?

Jason: [laughing] Would I one day be smart?

Carlos: I'm not even sure a superhero power could do that for you, buddy. [laughing]

Jason: [laughing] You know, a really cool superhero power some days would be replication.

Carlos: Replication? So kind of like duplicate yourself to be in two places at the same time?

Jason: You’ve seen that show Multiplicity, right?

Carlos: I don’t know if I remember that one.

Jason: Go look it up. I think it's Multiplicity, where he basically duplicates himself time and time again. So I can become much more efficient. And should I forget something, maybe one of my replicants could remind me. The problem with that, and now we're getting geeky, is you could get split-brain syndrome, much like mirroring can have in SQL Server. And we don't want to have a split brain.

Carlos: There’s a potential downside there.

Jason: So, Multiplicity is with Michael Keaton. Go look it up. It's from the mid-nineties.

Carlos: Well Jason, thanks so much for taking the time to chat with us today.

Jason: Yes, thank you.

Episode 39: Statistics Parser

Today on the SQL Trail I talk with Richie Rump. Richie is a podcaster, blogger, and now “Dataveloper” at Brent Ozar. He and I discuss the story behind Statistics Parser, including how an Excel spreadsheet and a training in Atlanta prompted him to create the tool. Richie walks us through how it works and why it’s beneficial for both developers and DBAs. He also divulges his preferred settings for SET STATISTICS IO and why he doesn’t worry about physical reads. We then chat about the changes in his career and navigating the space between developer and DBA. Finally, we go into #SQLFamily, where he shares his favorite SQL Server tools and the career advice that brought him success.
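
If you haven't used the settings Richie describes, the pattern is just two SET statements ahead of whatever query you're tuning. Here's a minimal sketch, where dbo.Orders is a made-up table name:

    -- Turn on the per-query I/O and timing output that Statistics Parser reads.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- Any query will do; dbo.Orders is a placeholder.
    SELECT CustomerID, COUNT(*) AS OrderCount
    FROM dbo.Orders
    GROUP BY CustomerID;

    -- The Messages tab now lists scan counts, logical reads, and CPU/elapsed time,
    -- which is the text you paste into Statistics Parser.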

Episode 39 Quote:
“…So at the end of the day, I started writing this IO parser engine in JavaScript. And by later that night when I landed in Miami, essentially the core of the engine had been written.” – Richie

Listen to learn…

  • How one Brent Ozar training in Atlanta inspired Statistics Parser
  • The Excel spreadsheet that started it all
  • How to use Statistics Parser
  • The security concerns you shouldn’t worry about
  • Why Richie uses SET STATISTICS IO ON and SET STATISTICS TIME ON for every query
  • Upcoming features for Statistics Parser
  • The meaning of his new “Dataveloper” job title
  • The book that inspired Richie’s dream super power
  • How “you’re fired” helped him succeed

As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes. Subscribe to the podcast via: iTunes, Stitcher, or Podbean.

Show Notes

About Richie Rump
Richie Rump is a Project Manager turned Developer-DBA middleman, a role he calls being a "Dataveloper". He's a consultant with Brent Ozar Unlimited and the developer behind Statistics Parser. Richie has spoken at SQL Saturdays on topics like "The T-SQL Features in SQL Server 2012 You Can't Ignore" and "Database Design Disasters: Common Mistakes Found in the Wild", and he founded dotNet Miami. He is also a Project Management Professional (PMP) and Certified Scrum Master (CSM). When he's not working and presenting, he co-hosts the Away From the Keyboard Podcast.

Resources
Statistics Parser
Brent Ozar Unlimited
Away From the Keyboard Podcast
Richie’s site
Follow Richie on Twitter
SP_DataProfile
SP_Blitz
SP_WhoIsActive

Transcription: Statistics Parser

Coming soon

Episode 38: SQL 2016

Compañeros! You heard rumblings about how awesome SQL Server 2016 is, right? Well, the story in this episode will blow you away. Episode 38 of the SQL Data Partners Podcast covers a SQL 2016 case study that really shows the value of a knowledgeable SQL Server DBA using the latest SQL Server toolset.
Quote from Episode 38:

“Exactly the same queries, same amount of data, so we got the exact same number of records back from the queries, et cetera. And on two of those queries, on the Oracle racks it took over nine minutes and on our little VM it took us about a second.” – Rick Heiges on one client’s results with SQL 2016

I talk with Rick Heiges about the wins he’s seen for his clients using SQL Server 2016, including how one client’s move from a large Oracle rack to SQL Server garnered them huge improvements in query performance. As well, I ask Rick some #SQLFamily questions and learn more about the productivity tools and business advice he can’t live without. Episode 38 also includes:

  • Why one Oracle shop moved entirely to SQL Server 2016
  • The results of a showdown between a SQL Server VM and an Oracle Rack
  • SQL 2016 Features
  • Reasons why the Upgrade Advisor makes Rick nervous
  • Why Rick says that waiting for SP1 is a bad move

As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes. Subscribe to the podcast via: iTunes, Stitcher, or Podbean.

Show Notes

About Rick Heiges
Rick Heiges is a Principal Solutions Architect and Data Platform MVP at DB Best. He's served on the PASS board as Director and as VP of Marketing, and has spent the last four years as Director Emeritus of PASS. Rick spearheaded the 24 Hours of PASS event and also started a local PASS User Group in 2002.

Resources
Connect with Rick Heiges on Twitter
Follow Rick on LinkedIn
Migrating Oracle to SQL Server (MSDN)
Rick's article about SQL Server Availability Groups

Transcription: SQL Server 2016

Carlos: So Rick, welcome to the program!

Rick: Thank you, great to be here.

Carlos: Yeah, it was nice catching up with you at the Richmond SQL Saturday. You were telling me about this interaction you had. I guess I should back up a moment. So, at the data-driven event, or the soft launch of SQL Server 2016, there was a big announcement that they were going to start paying customers to actually switch from Oracle to SQL Server. We then started talking a little bit about the features of 2016, and you had told me about a customer engagement where they were trying to do just that: either making the migration or considering it. So I guess, tell us a little bit about that and maybe the reasons why they decided to give SQL Server a look.

Rick: Sure. Yeah, so at the data-driven event they had this offer where you can trade in Oracle licenses for SQL Server licenses, and there are lots of caveats around it, of course. So talk to your Microsoft rep for more details on that. But basically they really want people to switch to SQL Server from Oracle, and we were in a situation where we had a customer running Oracle with a data warehousing system that's constantly being updated with all this new data all the time, just flowing in. And they really were having some performance issues. They were considering switching to SQL Server, and DB Best of course does a lot of migration work, so I was put on this engagement to help them test out some of the new features in SQL Server 2016, namely the clustered columnstore index. So to kind of paint the picture, they had a big old Oracle rack system. They had like four nodes and I think half a terabyte of RAM across the nodes, et cetera. And so that was the system they had in their current production environment and also in their test environments. And we were given a teeny, tiny VM initially with only four CPUs and 8 gigs of RAM, and we started testing. And we got some good results, but I was like, "This is not a fair comparison." So we got them to bump up the specs of the VM, nowhere near what the rack is, but still good enough to call it a bit more even, if you will. That was one of the big challenges we had starting off.

Carlos: Okay, so then what was the experience once you had gone through the columnstore process or introduced all these new features, what was the experience that they had?

Rick: They had a lot of queries that really were not performing well on their Oracle system, so we came in and helped them configure their VM, their Windows, and their SQL Server, all the different configurations, to make it optimal. And we also helped them choose how to partition the tables accordingly and what type of indexes to put on each of the tables, because columnstore indexes are great, but they're not the best for every situation. So we kind of went through that analysis as well. We had some amazing results.

Carlos: Yeah, so let’s talk a little bit about them. From the way that it sounded, it totally blew them away.

Rick: Yeah, we ran five queries. Exactly the same queries, same amount of data, so we got the exact same number of records back from the queries, et cetera. And on two of those queries, on the Oracle racks it took over nine minutes and on our little VM it took us about a second to return the same results.

Carlos: [laughing]

Rick: Yeah, it’s amazing. And I was floored.

Carlos: That’s impressive.

Rick: And we would clear out the cache, et cetera every time before we ran the queries. And we still blew them away. And it’s just amazing to me what that really means to the business. I mean, they can make faster decisions, they can do more analysis, you know, get a better product. Which is really what they’re after.

Carlos: How much data was in that environment? Just for comparison, for people listening who want to think if they’re in the same boat.

Rick: Right. On the one main table that we were looking at, which was uncompressed on the Oracle system, it was about 100 gigabytes in size. There were other tables but that’s the one table we were focusing on. And after we put the columnstore index on it, we got down to 8 gigabytes of space. So, super compression and that’s really where all the performance comes from.

Carlos: I see. Now was that running on Linux by chance? Did you do a 2016 implementation on Linux?

Rick: No, this was before that announcement.

Carlos: So just a quick tangent here, when they announced that SQL 2016 was going to be on Linux, the Twitter universe kind of exploded with, “Ahhh, this is so great! This is so exciting!” Am I the only one that’s not excited about running SQL Server on Linux?

Rick: I think there are lots of questions to be answered around that, right? But if you look at it from a high-level point of view, it gives customers who hate Windows another option for their database backend. So they don't have to set up Active Directory and other stuff just to run SQL Server. They can have it on Linux. So from that point of view, you know, it's big in that way. Also for ISVs, they now have another choice they can offer their customers, especially if everything else they do is on Linux and, you know, they don't want to set up Windows just for SQL Server.

Carlos: Okay, so kind of some specific use cases.

Rick: That’s the way I see it, yeah. I’m sure there are lots of other use cases out there, as you mentioned, that are probably more exciting.  But those will come to light of course as the launch of that becomes nearer next year.

Carlos: Right, yeah. They're in preview now, so they're letting people have access to it. And you have to sign up and get approved, and all that stuff. And I'm not sure what the requirements are there, but I'm sure there's some process. They're trying to monitor it and get feedback. All that other good stuff. So it almost makes me wonder, because with 2016 there are lots of new features available. Of course we're going to dive into columnstore indexes here in just a minute. But Microsoft wanted to point out that they're in the top right quadrant of the Gartner chart, and that they're better than Oracle now in many of these facets and what-not. Do you almost think that SQL Server is… I don't want to say too big… but there's a lot of stuff in SQL Server now. Do you think there's an opening for Postgres and these other players to sneak up and grab, I don't want to say the Access market, but something in between?

Rick: Right, and I think to a certain degree that might be true. That might be another way to look at it, but you know Microsoft has been chasing Oracle for quite a long time. And finally, with all this new innovation they have, with stuff born in the cloud and the vision that they have, they really want to tout that, "Hey, we beat Oracle. We're the leaders now. They just can't catch us." And I applaud that.

Carlos: Yeah, I think it's definitely great for us and for database administrators in the SQL Server space. There are lots of really nifty things we're going to be able to leverage, and columnstore indexes are one of those. So you had mentioned the compression of the indexes as being a factor. Maybe we should take a couple of minutes and go over a basic overview of what columnstore indexes are.

Rick: Sure. So we're always used to seeing data in the form of a table. Horizontal, row over row, with various columns, and that's the only way we see it because that's the way we've always been taught to see it. And that's the way we've been storing data forever. And so what they're doing is they're turning it on its side for storage, so that all the first names are going to be in a single column, all the last names are going to be in a single column, all the states, et cetera. Just think about that. Since they're all in the same domain, if you will, you're going to get ultra compression. Let's just say on the state column, right? There are fifty states plus D.C. and Puerto Rico and some other two-letter abbreviations, but you're only going to have, you know, fifty-some different values. So your compression for state is going to be very high. And that really lends itself well to data warehousing, because whenever you are doing queries for a data warehouse you are usually aggregating many rows together, and if they're all the same type of data, a similar type of data that you're aggregating, then it just makes it go faster. Plus, it doesn't decompress the rows when it reads them into memory. It stays compressed in memory until it accesses those rows. So it really saves space there.

Carlos: So the whole columnstore index gets saved into memory?

Rick: Well, segments of it, right? Like a page.

Carlos: So as it’s reading, it reads it to the buffer. And those pages are in that sense compressed.

Rick: Correct.
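
To make the storage-and-compression idea concrete, here's a minimal sketch of a clustered columnstore index on a warehouse fact table, plus the scan-and-aggregate query it's built for. dbo.FactSales and its columns are placeholder names, not anything from Rick's engagement.

    -- Convert a placeholder fact table to clustered columnstore storage.
    CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
        ON dbo.FactSales;

    -- The workload columnstore shines at: aggregating millions of rows.
    SELECT  StateCode,
            SUM(SalesAmount) AS TotalSales,
            COUNT_BIG(*)     AS RowCnt
    FROM    dbo.FactSales
    GROUP BY StateCode;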

Carlos: So, I know there have been some neat updates in 2016, and I know columnstore indexes were presented in 2014 but there were a number of restrictions. One of which was that you couldn’t update them.

Rick: Actually, that was 2012.

Carlos: Oh was that 2012? Okay, I’m going to have to go back in time then. So 2012… I have some 2012 instances. I should probably look at columnstore indexes. So let’s talk about some of the new features in 2016.

Rick: Yeah, so one of the ones I talk about when I do my session on 2016, like I did at Richmond SQL Saturday this past weekend, is row-level security; I did a demo of it. So that is now part of SQL Server. And what that is is a way you can control what data a particular login or group of users can see. So on the whole security side, you also have dynamic…

Carlos: Forgive me, before we jump into that, so I guess my question was more: can you give me some of the updates for columnstore indexes in 2016?

Rick: So in 2012, they introduced what they called the "non-clustered columnstore index". And that was non-updatable, so that meant that when you put that non-clustered columnstore index on a table, you effectively made that table read-only. In order to update that table, you could do a sliding window scenario with partitioning, or you could drop the index and rebuild it. Drop your index, add your data, and rebuild it, which doesn't seem logical. In 2014, they added the clustered columnstore index, and you still had the non-clustered columnstore index, but you could only have one of those. Actually, you could only have one in 2012 as well. The clustered columnstore index had lots of great compression, but they did even better work on it in 2016. So the compression is better, and performance is better all around.
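
For reference, the drop-and-rebuild dance Rick describes on the 2012-era non-clustered columnstore index looked roughly like this; disabling the index works as well and keeps its definition around. All object names here are placeholders.

    -- The 2012 non-clustered columnstore index made the table read-only,
    -- so loads meant disabling (or dropping) the index and rebuilding it.
    ALTER INDEX NCCI_FactOrders ON dbo.FactOrders DISABLE;

    INSERT INTO dbo.FactOrders (OrderDate, StateCode, SalesAmount)
    SELECT OrderDate, StateCode, SalesAmount
    FROM   staging.FactOrders_Load;    -- placeholder staging table

    ALTER INDEX NCCI_FactOrders ON dbo.FactOrders REBUILD;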

Carlos: And those indexes are now updateable as well? So I don’t have to do the sliding window for them?

Rick: Correct. And the other really cool thing about indexing in general is that you can have both your regular row store indexes and columnstore indexes on the same table. To which you say, "Well, what does that mean?" Well, this really lets us get into something called operational analytics, where essentially, let's say on a single node you have your production database where you're doing all these transactions, right? And maybe you have an availability group on that, so you have a replica of that exact same database. So you have columnstore indexes defined as well as your row store indexes for your production activity, and then you have a replica that's readable. And so now I can have my production workload go to my primary replica, doing all my transactions, while the replica has the exact same index structure and everything; the exact same databases, right? So I can redirect all my reads to my read-only replica to do my analytical type of processing. So really what this enables is things like real-time fraud alerts. I don't have to wait until I load it into a data warehouse, hours or even days later, to go through all the transactions to see if there's fraud. I can do real-time fraud detection in that manner.

Carlos: So just to clarify in that scenario, in that high availability scenario, the columnstore index lives in both instances, but you’re really only utilizing it for the analytics in the read-only instance.

Rick: Correct, because a replica of a database is exactly the same on both nodes, yes.
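
A minimal sketch of the 2016-style setup Rick is describing: a normal rowstore table for the transactional work with an updatable non-clustered columnstore index layered on top, and the analytic query you would point at the readable secondary. All of the names are placeholders.

    -- Placeholder OLTP table with a regular rowstore clustered index.
    CREATE TABLE dbo.CardTransactions
    (
        TransactionID  BIGINT IDENTITY PRIMARY KEY,   -- rowstore clustered index
        CardID         INT           NOT NULL,
        MerchantID     INT           NOT NULL,
        Amount         DECIMAL(19,4) NOT NULL,
        TransactionUtc DATETIME2     NOT NULL
    );

    -- Updatable non-clustered columnstore index (new in 2016) for analytic scans.
    CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_CardTransactions
        ON dbo.CardTransactions (CardID, MerchantID, Amount, TransactionUtc);

    -- Run this against the readable secondary so the analytics never touch
    -- the primary's transactional workload.
    SELECT CardID, SUM(Amount) AS AmountLastHour
    FROM   dbo.CardTransactions
    WHERE  TransactionUtc >= DATEADD(HOUR, -1, SYSUTCDATETIME())
    GROUP BY CardID;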

Carlos: Very cool. So now we've talked about it from the analytics perspective, and all of the examples that I've seen or been exposed to have been in the data warehouse space. Is this really a data-warehouse-only feature? I guess you just gave a potential example of a fraud detection environment, but are you seeing other scenarios where people are using it in the OLTP environment, if you will?

Rick: Yes, yes. So I’m always like, “Use the right tool for the right problem.” So we still have row stores. They’re going to be very much in play for our transactional type of systems, because usually in transactions we’re going to do single lookups off of certain tables to make sure something’s there to bring it back, et cetera. And columnstore indexes are not good at that. They’re good at reading a whole page… many many rows of data… and doing an aggregation on that. So yes it is more of a data warehousing type of index, although like I said in the example I gave, you can kind of get the best of both worlds as well.

Carlos: Very neat. So, ultimately we've talked about using columnstore indexes for aggregates and choosing the best tool. So I guess: where are folks trying to implement columnstore indexes where it's not helping them, or is potentially hurting them?

Rick: So if you're just using a columnstore index without a row store index or traditional indexing, and you're trying to do what I call singleton-type lookups, that's where you're not going to get very good performance.

Carlos: Right, because you’re not optimizing what the index is for.

Carlos: Yep. Yeah, kind of like what we would say about other indexes: where you have created an index but didn't include all of the columns you need, for example, SQL Server will say, "Well, it's faster for me to do it this other way, and that's the way I'll go about doing it."

Rick: Yeah, with the covering query or the covering index. Yes.

Carlos: Okay, well very good. Now we had kind of gotten into some of the other features. Do we want to circle back to row level security now?

Rick: Sure, I just mentioned that it's a big thing for us, for DB Best, because row-level security has been a part of Oracle and IBM DB2 for quite a long time. And so now we have a feature that we can easily convert to or migrate the customer to, where it was a big pain in the butt before. So that's it. Other security features like Dynamic Data Masking and Always Encrypted are going to be big in 2016. Stretch Database, even, where you can have part of your database on-prem and the other part in the cloud. Those are some of the cool features that I look at whenever I think about 2016.
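
Row-level security in 2016 comes down to an inline predicate function plus a security policy. Here's a minimal sketch with placeholder object and column names; the real design work is deciding what the predicate should check.

    -- Predicate: a user only sees rows tagged with their own login.
    CREATE FUNCTION dbo.fn_RowFilter (@OwnerLogin SYSNAME)
    RETURNS TABLE
    WITH SCHEMABINDING
    AS
    RETURN
        SELECT 1 AS AllowRow
        WHERE  @OwnerLogin = SUSER_SNAME()
           OR  IS_MEMBER('db_owner') = 1;
    GO

    -- Bind the predicate to a placeholder table that has an OwnerLogin column.
    CREATE SECURITY POLICY dbo.OrdersFilter
        ADD FILTER PREDICATE dbo.fn_RowFilter(OwnerLogin) ON dbo.Orders
        WITH (STATE = ON);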

Carlos: Yeah, so I'm a little nervous about one of the features, and that is JSON coming to 2016. I was listening to Jovan Popovic, I think he's the program manager, talking about that, and he was saying that it can basically fit in any data type column because it's just JSON. So you can stick it in a character field, a varchar field, whatever, and then just start using JSON. I think it will be interesting to see how that plays out in different environments.

Rick: Yeah, and when I see JSON, I just think of XML.

Carlos: Exactly, very similar.

Rick: Very similar to that. And as you said earlier, SQL server has so many different things in it, especially 2016. It’s really hard to get your head all the way around everything in this release.
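
Since there's no JSON data type, it really is just text in an ordinary NVARCHAR value or column that the new 2016 functions know how to pick apart. A minimal sketch with made-up sample data:

    -- JSON lives in an ordinary NVARCHAR value or column; there is no JSON type.
    DECLARE @doc NVARCHAR(MAX) = N'{
        "customer": { "id": 42, "name": "Contoso" },
        "lines": [ { "sku": "A-100", "qty": 3 }, { "sku": "B-200", "qty": 1 } ]
    }';

    -- Pull out a scalar value...
    SELECT JSON_VALUE(@doc, '$.customer.name') AS CustomerName;

    -- ...or shred an array into rows with OPENJSON.
    SELECT  l.sku, l.qty
    FROM    OPENJSON(@doc, '$.lines')
            WITH (sku VARCHAR(20) '$.sku',
                  qty INT         '$.qty') AS l;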

Carlos: And one of the things that's also really cool, and we've mentioned this, is that SQL Server, or Microsoft ultimately, wants to get you in the cloud and wants you to start using Azure. The migration wizard, or the prep tool to evaluate your upgrade to 2016, has a lot of neat bells and whistles in that it will help identify not only whether you're ready for 2016 but whether you're ready to go into Azure and to use some of these other services, like Azure SQL Database or Stretch like you mentioned. So I think there is a lot more hand-holding, or direction, on whether you're ready to use these features and, if not, here's what you need to do to correct it before you go forward and knock your head on the wall for a couple of hours trying to get something to work that isn't ready to do so.

Rick: Yeah, and you're talking about the updated Upgrade Advisor? Yes, and I have good and bad feelings about that. It's great that they want to help people upgrade to the latest version, maybe to Azure. But the Stretch Database Analyzer that's part of it, to me, it seems like they're forcing that idea on you. Even if you don't need it, if you run it, it would say, "Yeah, sure! You can use Stretch DB. We don't see anything wrong with it." And just because you can doesn't mean you should. And that's part of the good/bad thing about it.

Carlos: Right, it's going to show you the world: "Here are all of the things you can do!" Yes. So with that data comes some responsibility to pull in the reins a bit and, like you said, use the right tools for the right problem.

Rick: Exactly.

Carlos: Well, I think everybody is excited about 2016. We're at release candidate 1 right now, and I think, though no date was specified at the data-driven event, everyone is expecting it to come out here in the next little bit, at least by the summertime. We'll see what the adoption is and how people go about looking at that. Another item there is that I think there's still some sentiment out there in companies where they're waiting for, like, a Service Pack 1. But now with the rapid release cycles that we're seeing, particularly in Azure, the mindset is going to have to change a bit there.

Rick: Definitely. And I've already been talking to my customers over the past couple of years about, "Hey, all these features that you see in the new version? They've already been tested in Azure." And so the old adage of waiting for SP1 before you upgrade doesn't really apply as much anymore these days. And actually they probably have SP1 already on the board at Microsoft just for that reason, you know?

Carlos: Yeah, it’ll be interesting to see how long the delay is between the release and Service Pack 1. That will be interesting. I think in the past, now don’t quote me on these numbers, but I have a feeling that it was like a year. Is that not right?

Rick: I know in 2005 SP1 came out five months later. But that was mainly because of database mirroring.

Carlos: So there were some things they needed to resolve there very quickly. Very good. So Rick let’s do SQL Family. So one of the things I always like to talk about with folks is tooling to get work done. So tell us about your favorite SQL tool or tool that you use to get work done, why do you use it, and what does it do for you?

Rick: Well I’m a consultant, so I don’t actually do any work, so I really don’t use tools.

Carlos: [laughing] Yikes.

Rick: Seriously, actually, I’m just fine with the Management Studio. I’m used to it. And because I go into so many different environments, it’s ubiquitous wherever I go. So that’s mainly what I use.

Carlos: Interesting thought there. What do you think about the tighter integration with Visual Studio? Any chance that you're going to start using the Visual Studio user interface?

Rick: Probably, as more customers upgrade, et cetera, and you know I’ll be going in that direction as well.

Carlos: Awesome. So you just inherited one million dollars. What are you going to do with it?

Rick: [Pausing] Probably put it away for retirement, because, um, I like working and I don’t think I’m going to retire anytime soon. That’s a ways off anyway, but still. I’ll just put it out there for retirement and go that way.

Carlos: You know you’ll be set. Very good. So what’s the best piece of career advice you’ve ever received?

Rick: Probably to sound confident and don’t be afraid to say no.

Carlos: So saying no when people approach you and say, "Hey, can you do this?" Or saying no when I don't know something?

Rick: Well, I say that anyway, because I don't know everything and I'll readily admit that. No, it's saying no whenever I feel like I'm getting too much stuff on my plate. I'm not afraid to give that advice to the people who come on board with us. I say, "Don't be afraid to say no if you feel like you are getting overworked, because if you don't say no, it's going to be your fault."

Carlos: I think that is a good piece of advice. Or at least, to frame that a little differently: pointing someone in the right direction. There might be some nice ways to say, "I can't take this right now, but so-and-so can help you." But yes, that's a good piece of advice. Rick, our last question for you today: if you could have one superhero power, what would it be and why would you want it?

Rick: It would be to impose my point of view on everyone around me.

Carlos: [laughing] Wow, okay. I don't know, I'm assuming there's a superhero out there that's like that. None comes to mind, I guess. Maybe the mind meld folks.

Rick: Just think about it, right? It’s the ultimate weapon. You have other people think the same way that you do, so it works great.

Carlos: Oh boy, we’re going to have a whole bunch of Rick Heiges running around.

Rick: That’s right. Domination.

Carlos: Rick, thanks so much for being on the show today.

Rick: Great, it was a lot of fun and I really enjoyed this. It was great seeing you this past week and I look forward to seeing you again soon.

Carlos: Thank you.