Welcome back on the SQL Trail, compañeros. Does the thought of setting up your index maintenance plan keep you up at night? Are you looking for ways to make the job easier? In this episode of the SQL Data Partners Podcast, I talk to Sean McCown about Minion Software and the care and feeding of indexes. We also chat about his fragmentation and fill factor preferences, how and when to rebuild, and the ways that T-SQL could be improved.
Episode 40 Quote:
“…I figured I had to go by something that I knew I would be comfortable knowing that someone who was a sys admin was going to drop this tool in their shop, not know how to configure it, and not care to configure it, and not have any maintenance at all.” – Sean McCown
Listen to learn…
- The difference between index rebuilds and index reorganization
- The aggressive fill factor percentages Sean suggests
- Why he dislikes most maintenance plans
- Why a firm maintenance window is unnecessary with Minion Software
- The default settings Sean recommends for maintenance plans
- The link between CD cases and SQL Server
- Why Books Online is wrong
- The T-SQL features that Sean could live without
- The most useless superpower ever
As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes so others can find it as well. Subscribe to the podcast via iTunes, Stitcher, and Podbean.
Sean McCown is a SQL Server DBA and Microsoft Certified Master in SQL Server. He and his wife Jen founded MinionWare, where they build the products they always wanted to use. As of 2016, this includes Minion Enterprise, Minion Reindex, Minion Backup, and Minion CheckDB. He also blogs about various SQL Server topics alongside his wife, Jen McCowan.
Transcription: Index Maintenance
Carlos: Well Sean, welcome to the show!
Sean: Hey, thanks!
Carlos: It’s great to have you. Now interestingly enough, I met you at SQL a couple years ago. And you were talking about something about this Minion, that you had. That you wanted to create this “Minion Environment” and that you had this program or this theory that you could kind of create these base tables and they would go out and execute or make changes to your jobs and all that would be in an automated fashion so that if you needed to change, for example, the location of a backup, you could do that and hundreds of servers could be updated instantaneously.
Sean: So it wasn’t really a theory. I had already had it in several shops. The theory was turning it into an actual marketable product.
Carlos: Well there you go, that’s right. Letting other people use it besides you knowing the ins and outs of it and all the documentation.
Carlos: And so, low and behold now there’s a product with the Minion. And one of the things I wanted to chat with you about, which is part of this minion, is the index care and feeding, or the reindexing part of it if you will. And so, lots of misinformation and different opinions, potentially, about how to care and feed those indexes. So I thought I’d have you on and talk about it. Obviously you’ve gone through lots of pains to make that an automated process that now touched several environments and getting some feedback. So let’s start there: what are some of the pains that we have with index maintenance?
Sean: Well, it really depends on what you’re using to do your maintenance. I hate to already set that up to be self-serving, but I really did create the product because of the pains that are out there. Not only in maintenance plans which are practically [expletive] useless. Oh wait, I didn’t mean to say that. I’ve been with Jen all day, anyway, you get the idea.
Carlos: [laughing] Thank you for toning it down a notch here on the podcast.
Sean: You know I did too! The maintenance plans from what I’ve seen, they’ve made some attempt to make them better, right? But there’s still a lot of holes in them. I would say the biggest problem is in varied configuration. Everything out there treats index maintenance like very table in every index in every database has the exact same requirements.
Carlos: Right. It doesn’t care the size or anything. It says, “Hey! You’re an index! I’m going to rebuild you.”
Sean: Yeah, exactly. Maintenance plans as they stand now are exactly like that, right? It rebuilds them, period. And other solutions out there that are very popular kind of do the same thing. They allow you to make changes and have different settings, but it’s a separate job for every one of them. So if I’ve got 3000 tables, 10,000 tables… I have one user with 350,000 tables in a sing database. And if you’ve got that many tables and a lot of them have different fill factors or different thresholds or need to be done on different days or they need to be done in different orders because this table is more important than this table, and so on and so on… then that’s this many jobs. It’s all of a sudden a full time job just to find out which job that table is being processed in.
Carlos: Sure. Let’s back up just for a second then to the maintenance plans, for example. We talked about, for example, that every index will get rebuilt. And I think, to find a factor on that, we all create our maintenance plans with a maintenance window generally like in the evenings, where we have downtime and we expect things to run a little slower during these times. And we all do our backups and our index jobs all at that time. And occasionally, we get the call form the SAN admins like, “Uh… I see my SAN is going haywire at one o’clock in the morning. What’s going on?”
Sean: Yeah. Oh, is there a question in there?
Carlos: Well again, my thought is that with the maintenance plans just saying, “Hey, seek and destroy.” Creating some logic to say, “Hey, let me interrogate you a little bit and do I think you’re an index that needs to be rebuilt? Maybe tonight you’re not, but tomorrow you will be.”
Sean: Exactly. So there are a couple of things that I built in that really help alleviate some of that. And one of them is the Prep Only option where you can gather the index stats earlier that day and use them during the job. Because one thing that I saw is that if you have a maintenance plan that’s two hours or four hours, that can take an hour and a half to gather the stats.
Sean: So you’re using a lot of your maintenance window finding out what kind of maintenance you have to do.
Carlos: Right. I guess I hadn’t interrogated that. I need to take a peak to see exactly how long those processes are taking. I assume that because you do have that step, you’re seeing that yeah, a lot of that time is spent saying, “Okay, should I rebuild you or not?”
Sean: Exactly. And the other thing that I did was I allow you to order the tables. So you can put the tables in any order you want to and to me, that extends your maintenance window. Because there have been very few times, and I have been doing this over twenty years, right? There have been very few times when there was a hard, “the index maintenance had to stop at this time. Period, end of discussion, no matter what.” When people tell me that I go “Dude, come on. You act like I’m your mother and I’ve never been in databases before.” They really mean, now they’ve had this drilled into them by their business units, but what they really mean is that their core set of tables that everybody hits cannot be reindexed outside of this window. And you could have one hundred or two hundred much smaller used tables that can have that rolling outage from the locks. And they’re really not going to take up that many resources. They’re not going to take up a lot of disk or CPU, and they’re not going to lock for very long. So seriously, who cares if they’re done out of the window? So it’s very important to have that two hour window if I know that all my really important stuff is hit up front and everything else just kind of trickles down, because I guarantee that if you take a table with three thousand rows, twenty thousand rows, three million rows, and re-index it in the middle of the day and it’s not something that gets hit pretty often, no one is even going to notice.
Carlos: It all depends on the usage, right? And your mileage may vary there.
Sean: Exactly. So don’t tell me that, “no, all 400,000 of our tables have to be done in this two hour window or the business unit is going to lose their minds and the place is going to crumble.” Come on, man.
Carlos: So let’s talk about that. So what do you define as, “here’s my criteria for the index that needs to be rebuilt?” So carte blanch, general scenarios, default settings?
Sean: Well, so my default settings are, I’m a little more aggressive in my default settings than a lot of people are, right? So I do re-orgs at 10% and rebuilds at 20. That’s a little bit more aggressive.
Carlos: It seems like 30 percent is that magic number.
Sean: Oh come on.
Carlos: [laughing] That’s one in three pages that’s affected.
Sean: We all know, and Paul has said this publically many times, that when he put out that number in BOL it was because they forced him to come up with a number and he pulled that number out of his ass. He said that publically many times. And the truth of the matter is that, a) I’m with him. He hated giving a number but they made him. And the truth of the matter is, that this is the problem with many maintenance plans out there, even the more popular ones no matter how much acclaim they’ve gotten, is that every table is different. Every table in your database could potentially be anywhere, and I’ve seen tables with just a few thousand rows, not be able to withstand any more than seven or nine percent fragmentation and I’ve seen other tables be able to stand thirty-five and forty-five and be just fine. So there’s a lot of variance in there. How much the stats are being changed, and how much variance in the WHERE clauses there is, and how much variance in the data. So there are a lot of factors involved in there, so every table has its own potential to have its own thresholds.
Carlos: So this comes back to knowing your environments.
Sean: You have to keep track of this stuff and see when the tables… now, you have to have a default. The reason I build the system the way I did was to set all of your tables at a guess. Okay, I put 20%, you’re welcome to change it to whatever you like. BOL says 30, and no one ever got fired for following BOL so they put 30. Whatever. Find, you put 30 in there. And the point is, to go in, there and be able to put in table overrides and say, “No, no, no, this table needs 15 and this table needs 22, and this table needs this, and this table needs that, and this table doesn’t really need it until 60 or 65 ‘cause that’s a small table and it just doesn’t have much.” But the point is to have everything covered and then go in and out in overrides and not have to put in tons of extra jobs just to do it.
Carlos: So I’m now curious. How many jobs, and I’m not sure if you’re getting data back, that sounds like a lot of work to go in and say, “I’m going to set this table at 20% and this table at 25.” I can see even maybe at the schema level to say, “Okay, my heavily used tables are in this schema. I’ll create one for this guy.” How realistic is it go in and say, “I’ve got a hundred tables. I’m going to specify a different one for even 10% of them.”
Sean: Right, so you know, right now you’re talking about ten tables right? Big deal. The good news is that you really, if you had 20,000 tables, you’re only going to have a couple handful of tables where this thing is really important. The entire database isn’t the most used portion of the database, right? So there are going to be some tables that cause you problems, for one reason or another, and those are the things you put in exceptions for. And for everything else it’s going to be good enough. So you can, if you had a schema with 1500 tables and they were really important and they all needed the same override, let’s say they all needed to be 10% lower than the rest of the database. You don’t have to put them in there individually. You can write a query to put that INSERT into a table and create the overrides that way. You don’t have your hand in there doing 1500 tables, that just defeats the purpose of it, right? But you do have your hand in there doing all that stuff if you need to create a separate job for every one of them and have that table in an INCLUDE with its own little parameters and stuff. You are going to have your hands in there 1500 times, right? But yeah? For this one, just run an INSERT. And the cool thing about that is that it can be dynamic. The way I’ve done it is, not only can you do it based on that. You can order the tables based off of, right now you base it off of the order you give the table itself. But you can, and it doesn’t take very much coding at all, you can do it based on fragmentation level and usage. Or just usage. So you can look in the DMV and say, “these are the most used indexes, lets’ do these first and reorder them based on that way.” And I’m going to build that into the system in version 2 and right now you have to kind of create your own intermediate step, but even then, all you’re doing is an update based off of the results of another table so it’s really not that hard. Once you get it written you apply it to everybody, right?
Carlos: So obviously looking at the most used indexes, we want to make sure that they get the best care and feeding. Because inevitably, in my mind, with good care and feeding, again we’re looking at the disk fragmentation and trying to keep it as small as possible based on these thresholds, I can potentially add a couple more indexes. Or I’m not going to be shy about adding those indexes because I know I have good care and feeding. I think we have to be careful with indexes and we can’t create every single index that every query wants.
Sean: …that crosses your mind.
Carlos: Exactly. But if we see that we have heavy usage there, we can do that. Now one of those defaults that we talked about, and another potential sticking point with index rebuilds, is the fill factor. And fill factor, just to rephrase it, take us through it. When we say fill factor what do we mean by it?
Sean: I like to use the analogy of my dvd case, right? So I’ve got this lovely dvd case, and if it weren’t so far across the room I’d bring it in and show it up. So you’ve got a lovely book of dvds and you’re putting them all there in alphabetical order, and then you get all the way to Z because you like Zoolander and Zootopia. So you’ve got all these guys. So you’ve got the thing not filled up, you have a few pages left in the end but you’ve put them all in there. And then you go and you buy another one that starts with a B. What do you do? All the spaces are taken. So do you find where it goes in there and move 30 pages of DVDs? No, that’s ridiculous. So fill factor is basically in that index leaving space for extra entries and on the page as well, leaving space for extra entries so you don’t have to move everything down. You’re saying, “Okay, I’m going to fill up each page by half.” And this is why I like this, because it’s a perfect analogy for reindexing, right? I’m going to move these guys by half and so I have four slots on each page and I’m going to put two dvds on each one. If I buy three B’s, now I still have to move it, because I have three B’s but I only have two slots left. I’m going to take that and put it at the end, and I’m going to leave a little sticky note on there that says, “It’s on page something or other.”
Carlos: Right. Hey, I’m over here now.
Sean: Right. That’s fragmentation. And eventually, when I get five or six or seven of these things, it’s going to get laborious and I’m going to have to take that page out, instead of moving everything, I’ll take that B page out of the back and just clip it back in there where the other Bs are instead of moving everything down. So I’m going to take that page and move it back where everything was.
Carlos: And SQL Server is going to get some efficiency with being able to read in order. It doesn’t want to jump around.
Sean: That’s actually reorg, you get the idea, right? That’s why I love that analogy. It really shows people who don’t really get the whole reindexing thing, it puts it in tangible terms for them.
Carlos: I noticed that your default fill factor is 90. Now I feel like, so my recollection, except for SharePoint, that 100 is the keep it at 100, fill it up, and we’ll take care of the fragmentation with the re-index. Talk to me about your thought process there.
Sean: Okay, so, my personal default is 80. And I compromised on 90, because I know everyone doesn’t like the 80 default. But there are enough OLTP systems out there and one of the things we deal with the most is people not reindexing enough.
Carlos: That is true, right. Throw them out there and they don’t even create the maintenance plans, right?
Sean: And leaving a fill factor on there isn’t really going to hurt performance that much, which is why I put 90. There are some extra reads a little bit, but it’s not really that bad, right? It’s not going to hurt your performance and it’s really going to help your INSERTs. If you’ve got an OLTP system it’s going to do nothing but help your INSERTs and it’ll keep you from having to do that maintenance more often, right? So, in other words, it lets you do your maintenance less often, and that’s a way to state it that’s not a negative. But you know, if you have a DSS system and you do full loads every night and you drop your loads, well then change it to 0 or 100 and move on. I guess change it to 100, because if it was at 0 it’d be re-indexed all the time. But you’d probably just turn it off for that database anyway.
Carlos: So you’re not necessarily looking at size in there? I guess, you mentioned inserts. So if I know I have a table with lots of inserts, maybe I can reduce my fill factor? I guess would that be the only factor to consider? Inserts with fill factor? Or changes, potentially?
Sean: Inserts and updates, really. But that’s really deleting an insert. So, it still boils down to inserts because it’s still gonna have to move that. So yeah, it’s all on inserts and you know it will keep it from being fragmented as often by default and I think by default that’s a good setting. That’s a responsible setting. Because most databases are OLTP in some nature.
Carlos: The other thing that I had is that, yeah, you might increase the index size a little bit, but at least you’re going to get htat buffer, because the other thing that people don’t do well is their growth settings. So when you do re-index, you’re going to run into log growth and database growth and then all of a sudden you’re going to get the dreaded, “You have 15 instances of IO taking longer than 15 milliseconds.” And you’re looking at your log in the next day and going, “what the heck?”
Sean: Yeah, exactly.
Carlos: Very good. So, other thoughts about index maintenance that maybe people aren’t doing very well? Other than setting it up, number one, that people should be thinking about?
Sean: Wow, let me think about that.
Carlos: I apologize, we didn’t talk about the rebuild versus reorg debate. When to use reorg and when to use rebuild.
Sean: Right, so reorg is really just, it’s really just there to get you by until another reindex, until another rebuild. So what it does is it puts those pages back in order and just moves the pages back in order. It doesn’t actually physically change the pages themselves or reestablish the fill factor that was there. It just takes these pages, and they get all shuffled up with page splits and so you’ve got all these pointers going, “Oh, don’t forget I have this one over there, and that one over there, and this one there,” so it just takes them all and that IO can really matter. So that’s why by default in my routine I put reorgs every night and just reindexes on the weekend. I kind of like to do it more aggressively than that but I kind of like to have a subdued default. I mean, coming up with defaults for these things is actually really kind of hard because you’re like, “god, what do I put for a default?”
Carlos: With so many environments can be different. Maybe you have a little more empathy for Microsoft now?
Sean: Well yeah, but not for the whole one percent growth on the lock. There’s…come on, man. That’s a horrible default. Okay, maybe one and one. But one and ten percent? Really? So you know, I figured I had to go by something that I knew I would be comfortable knowing that someone who was a sys admin was going to drop this tool in their shop, not know how to configure it, and not care to configure it, and not have any maintenance at all. So I wanted to give them a fill factor, I wanted to set them up to not have to do the full rebuild every week, or every night, right? So I gave them a fill factor and I’ running the rebuild job during the week and that will get them through to their weekend maintenance and then they can change anything on top of that. I’ve actually seen reorgs outperform rebuilds, as in the table performs better just on a reorg that it did with a rebuild.
Carlos: Oh interesting.
Sean: Yeah, I know. No one’s been able to explain it to me, but I’ve quantified it a few times.
Carlos: Very good, so I guess your mileage may vary. Sean, let’s do SQL Family. So one of the things I always like to talk with people is how they work. You created a tool and we can talk more about that tool, but what’s your favorite SQL tool?
Sean: What’s my favorite SQL tool outside of mine?
Carlos: It can be yours. Let’s talk about it.
Sean: Oh, it’s a toss-up between Minion Enterprise and Minion Backup itself.
Sean: And very quickly becoming CheckDB because I’m writing it now. Oh, it’s a sweet tool.
Carlos: So, Minion Enterprise consists of the reindex, the backup, DBCC checkdb…
Sean: And so many other modules it’s not even funny. Dude, it does so much.
Carlos: Very good. And so this is ultimately a way for administrators to administrate their environments with all these different components and be able to update what they have and set them much easier?
Sean: It takes all of those things that you would rebar across all your servers and it lets you do it from a single place in literally seconds. So a really good example that I like to give with the security module is, let’s say that you have Anne, she just gets hired to marketing, and she needs to look like Steve. But Steve has access to 40,000 databases across 700 servers.
Sean: And his permissions are completely different in all of these things, it’s all over the board ‘cause Steve’s been here 7 years and you’ve fined-tuned his permissions to everything, right?
Carlos: And you’re not using Active Directory groups?
Carlos: Shame on you! [laughing]Sean: Well, maybe it’s a mix, right? Maybe you’ve got it on some and not on others. So with Minion Enterprise, I can clone, what did I say, Mary? I can clone Mary as Steve and it takes me 30 seconds. And I’ll clone her across all 700 instances with the same SIP. So, just stuff like that, where if someone gets put into sa, even if they’re buried 30 levels deep in an AD group, you’ll get an alert saying, “this guy’s been given access to these groups on these boxes.” It’s just one of those things that really, really enhances your security because now you know. Because you can’t physically watch that many boxes at once and you don’t want that much stuff out on one of these boxes. So you do it all from one place.
Carlos: There you go, very cool. If you could change one thing about SQL Server, what would it be?
Sean: Um, I think I would own it. [laughing]Carlos: [laughing] It would be Sean Server!
Sean: That’s right, I could use that kind of cash. Other than that, oh, there are aspects of every piece of it that really piss me off.
Carlos: So obviously we talked about the default database growth settings.
Sean: Yeah. That’s an annoyance, but I wouldn’t spend my dev dollars there. I think I would make things in T-SQL a lot easier. Because there are a lot of things that they could do for us that make things really hard right now, and actually decrease your security.
Carlos: Kind of getting outside SQL Server itself, getting out to the box?
Sean: If they wanted to improve that, sure. They could make it easier. I don’t think they could make it safer. But they could make it easier, right? I mean, uh, things like right now we have to use dynamic SQL for just ridiculous stuff. If we could do something like SELECT * FROM @table, instead of having to write dynamic SQL for that. Or if I could get at databases like USE @database and then do a query in the context of an sp, right? And then switch to that database instead of having to roll through each of them in dynamic SQL, then that would help as well. Just those two cases alone would knock out like 90% of the dynamic SQL out there, because most of it is just getting around this kind of BS that they used to let us do. Now you brought up command shell.
Carlos: [laughing] No, no, no, I’m just saying getting out of SQL Server. I assumed that’s what you were talking about.
Sean: No, no, no, it’s fine. But, I will go off on one of my diatribes for 20 seconds.
Carlos: Here we go.
Sean: Everybody is so against the command shell.
Carlos: It’s been put on the naughty list.
Sean: Everyone keeps calling it evil and they keep saying it’s a security nightmare and it’s just not. It isn’t.
Carlos: We’ll make sure to put in your rant. I did read that today in prep for this interview.
Sean: For the security theatre? Right?
Carlos: We’ll make sure it gets into the Show Notes.
Sean: Oh, excellent.
Carlos: So Sean, you’re a very interesting fellow. And again, I was very interested to meet you on SQL Cruise. I admit, I was going to go on SQL Cruise anyway, but I was interested to chat with you and one of the things that struck me when we talked is that, at the time, and again MCM has kind of gone the way of the dodo. You were the only one that I was aware of that was a non-Microsoft, non-consultant person to get the MCM.
Sean: Yeah, actually at that time I was.
Carlos: I was like, “Wow, that’s kind of interesting.” And I asked you, “why did you want to get that?” And do you remember what you said?
Sean: No, I don’t, actually.
Carlos: Let’s see if your answer matches. Why did you want to get your MCM?
Sean: I wanted to see how I measured up, I think?
Carlos: What I remember you told me was, “I wanted to see if I could do it.”
Carlos: I was like, wow!
Sean: Because you have to be a consultant most of the time to get the level of experience it takes to do something like that. And I wanted to see, since I had been a DBA in major shops but never really a consultant, I wanted to see if I had the chops for that kind of thing. If I was as good as I was purporting to be.
Carlos: [laughing] very cool, so with all of that, my ultimate question is: What is the best piece of career advice you’ve received along the way?
Sean: You know what? I remember seeing that in your show notes and I forgot to even think about it. I’ll tell you. I don’t know, a lot of people don’t know that I was friends with the late Ken Henderson. And you know, the gurus guy books. And big Microsoft, he was based here in Irving and we used to sit up late and night and we used to chat. We were both late night coders and we would sit late night and chat. And I sat in his office elate a number of times and just, shot it with him and what-not. We sat on his couch and what-not and I liked it because he always kept the lights off. And so one night he gave me an excellent piece of advice, I don’t think he was even trying to at the time, but he was sitting there really going off about something on his blog and he said, “Don’t ever, ever write or talk about anything in your professional community about religion.”
Carlos: Our last question for you today. If you could have one superhero power today, what would it be and why would you want it?
Sean: I would like to read minds.
Sean: Because, knowing what people are thinking can really give you an advantage, if even knowing when to shut up. Something as simple as, “wow, I’m pissing her off. I’m going to shut up and walk away.” That’s a brilliant skill to have to not only know what they’re thinking and guide the conversation. But just to know when to stop it too.
Carlos: [laughing] I’m done, I raised my white flag. I’m out.
Sean: Okay, well it’s been good talking to you! See ya! Sorry I’m such a putz, right? [laughing] It’s like, “Oh wow, he knows he’s such a putz and he stopped! Good for him!”
Carlos: [laughing] There you go. Now you just have to deal with all the dirty laundry. That’d be the one downside that I see to that power.
Sean: Yeah, but you could also know when someone’s mind is wandering and they’re really not listening to you.
Carlos: There is that. It’d be nice to know that about the people listening to my podcast. Am I just talking to empty space right now?
Sean: Exactly. We saw this, there was this wonderful meme going around for a while about useless superpowers. The one that I really loved the most was 13th bulletproof.
Carlos: 13th bulletproof, what’s that?
Sean: You’re bulletproof to the 13th bullet.
Carlos: Okay, so number 14, you’re done.
Sean: So this guy was lying there with 12 holes in him and the 13th one bounced off.
Carlos: [laughing] Oh, so it’s only on the 13th bullet. I thought at least you got to 13 bullets.
Sean: 13th bullet proof.
Carlos: Yeah, that is useless.
Sean: I thought it was a great image too. He’s in an alley with 12 holes in him and the last one just bounced off. I was like, “aw, dude!”
Carlos: Very good. Sean, thanks so much for being here.
Sean: My pleasure. I love doing stuff like this, I don’t do it nearly enough.
Carlos: We’ll make sure that there are links to your product and your brand, some of the things we talked about today will be up on the Show Notes for people to listen to later. So we do appreciate you coming and chatting with us.
Sean: Excellent. And welcome me back anytime. I’ll darken your doorstep any time you like. Now the truth is, this is for your listeners, howmuch of what we talked about is going to be edited out? I counted two.
Carlos: I think, the index part you’re safe. Once we get to the SQL family questions, then no promises there.
Sean: I can’t wait to see it.
Carlos: Thanks again Sean.