After having attended PASS Summit and seeing all the new features coming out, it can be very easy to overlook the basics because they are boring; these features have been around forever. That may be true, but they still play an important role, and ignoring them won’t help you get those fancy new features any time soon.
In this episode, our topic is index maintenance, with an emphasis on what has changed or what might change. Is the Query Store feature going to impact the way we maintain our indexes? We invite one of our favorite guests, Sean McCown, back to talk with us and give some of his thoughts.
“You can’t really change the math of re-indexing and maintenance.”
“And part of the problem with not having time to do maintenance is allowing business people to run the show.”
“Rebuilding an index is absolutely rebuilding the index pages and putting them in the order that they are supposed to be.”
“It’s not the stuff that they add that I think is ridiculous. It’s the stuff they don’t put in.”
Listen to Learn
00:06 Introduction of the speaker and a brief outline of the episode topic
01:12 Updates on MinionWare and what’s changing in re-indexing
05:00 Problems that arise from having no time to do maintenance
10:18 Fragmentation of indexes
15:04 Does SSD and other premium storage change data access?
19:17 How to determine whether the fill factor should be different per table
22:51 SQL Server: analytics in the database
27:49 The difference between re-indexing (meaning rebuilding), reorganizing an index, and statistics
33:02 What stands out with Minion Reindex?
43:27 Reindex 2.0 features
49:00 SQL Family
About Sean McCown
Sean McCown is a Certified Master in SQL Server 2008 and a SQL Server MVP with 20 years of experience in databases. He is also founder and co-owner of the MidnightDBA.com website, where he records free SQL Server training videos and co-hosts the popular web show DBAs@Midnight.
Transcription: How has index maintenance changed?
Carlos: So Sean, it has been a little while, welcome back to the program.
Sean: Thanks guys! It has been a while but not that long. I’m still getting people talking to me about the last episode I was on.
Carlos: Is that right?
Sean: Oh yeah, I get a lot of people at SQL Sats and whatnot telling me how much they listen to you guys and how they heard about me and the stuff I’m doing through you guys, so yeah, good for you all. I didn’t realize you were so popular.
Steve: Well, it’s always good to have you on the show, and I think that helps with the popularity.
Carlos: That’s right, our MVP guest. Yes, so I guess 20 episodes ago; it was episode 90, the last time we had you on.
Sean: Oh wow!
Carlos: Talking about DBCC CheckDB. Yes, good to have you back on; we’re always trying to keep up with what you’re doing. So today, ultimately our topic is going to be a bit about re-indexing. What’s in store for that? I think everybody has a story about re-indexing or an opinion about re-indexing, and I think even the future of where that’s going might change a little bit. So I guess let’s just jump off. Why don’t you give us a little update on the MinionWare stuff? What’s changing on the re-indexing?
Sean: Wow, so first I want to say we’ve been watching a lot of Voyager recently so I want to say your opinions about re-indexing are irrelevant.
Steve: What? The resistance is futile?
Sean: That’s right. Well, because to me it’s like, I see this a lot. Companies seem to take pride... I’m going off on a tangent a little bit, and it may have even been the one I went off on last time. But I see companies do this a lot, because you’re talking about opinions. You get these companies that brag about how busy they are: “Oh well, our business is so slammed, we don’t have time to do re-indexing. We don’t have time for maintenance. We are so slammed.” While that’s a cute sound bite, you can’t really change the math of re-indexing and maintenance, right? I mean, it’s nice to think that you don’t have time for it, but that’s like saying you don’t have time to take care of yourself but you’re going to have time for the heart attack.
Carlos: So in that event, I guess, do you believe that is one reason, an additional reason perhaps, that people are looking to the NoSQL world? Again, not to say there aren’t indexes there, but they definitely seem to be second-class citizens.
Sean: Yeah, people were looking to the NoSQL world because of the hype. They advertise a gazillion transactions per hour and it’s so much easier; it takes a lot less maintenance because it’s got so much less overhead and all of that. But now they are getting bit in the ass by what? The fact that there are no keys on the table, so your integrity isn’t enforced and whatnot. So they are like, “Hey, but I’ve got corrupted data now.” “Oh, duh, you think so? Maybe you should add some of that overhead back in.” The stuff is here for a reason. I mean, NoSQL has its place. There are times when it just doesn’t matter, when that integrity is a lot less important or when you have a threshold for that sort of thing, right? For having what we call logical corruption. And you need the speed more than you need to have absolutely everything lined up. But most of the world needs some sort of integrity.
Carlos: Right, you want some protection, you want some help in keeping your data, if you want to keep it around, right?
Sean: Yeah, so no SQL is really more hype than anything if you ask me. Yeah, that’s all I have on that.
Steve: Ok, so jumping back then to sort of “our business is so busy that we don’t have time to do that.” I think oftentimes I hear that and I see that, and quite often it comes down to a misunderstanding of some of the fundamental maintenance items you need to do there. I see oftentimes somebody will go through and do a complete rebuild of all their indexes and then right after that they will rebuild all their statistics.
Sean: I see, yes.
Steve: And they will rebuild the statistics with like a 10% sampling or something like that, when of course rebuilding your indexes updates your statistics too. So you’re actually being wasteful rebuilding the statistics right afterwards.
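[Editor’s note: a minimal sketch of the point Steve is making, using a hypothetical table name, dbo.Orders. A rebuild already refreshes that index’s statistics with the equivalent of a full scan, so a sampled stats update right after it is not just redundant, it replaces better statistics with worse ones.]

```sql
-- Rebuilding an index also rebuilds its statistics with a full scan,
-- so this statement alone leaves the indexes on dbo.Orders (a
-- hypothetical table) with fresh, full-scan statistics:
ALTER INDEX ALL ON dbo.Orders REBUILD;

-- Following it with a sampled update like this overwrites those
-- full-scan statistics with weaker 10%-sampled ones:
UPDATE STATISTICS dbo.Orders WITH SAMPLE 10 PERCENT;  -- wasteful here
```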
Sean: And part of the problem with not having time to do maintenance is allowing business people to run the show. I mean, you get PMs and you get Sales Managers, and you get VPs, you’ve got C-level people, and they don’t understand the database and they say, “We just don’t have time for that. We just can’t do it. I don’t have time, I can’t do that. We’ve got customers. We need to keep them online.”
Carlos: Well, in their defense, it’s probably because they’ve tried. They haven’t done it in 10 years, then they try to rebuild and it bites them because the business is impacted in some way when it runs over. It’s still running at noon and nobody can get into the database, and so that’s where they’re like, “We’ll never do that again.”
Sean: Or, and this kind of merges into what you’re saying, they have a DBA that just doesn’t know how to do maintenance well.
Carlos: Right, and we’ve all been in that camp. I think I may have actually told this the last time we were on. I had been hired by a law firm to help with a conversion of the financial application, and that was like the flagship project that I had come on for. There were two DBAs. He was actually in London at that time; this was in Richmond, Virginia, where we’re located, so he was in London helping with some office where they were doing a conversion. And then the document management system had been having problems, so a decision had been made by somebody else. The DBA in London had talked about reorganizing the index; that’s what had been approved. And for some reason we got talking and we decided we were going to rebuild the thing instead of reorganizing it.
Steve: Let me guess on Standard Edition as well.
Carlos: Standard Edition as well. So the issue was not what you may think it was. The issue wasn’t locking everything else out. The issue was disk space. It started to rebuild, and a rebuild creates that second copy, and there wasn’t enough disk space on the drive to hold both copies of the index, and it had already been running for two hours or some such thing. An attorney needed to get into the system to get a document and couldn’t. And then we were waiting for it to roll back, and that caused a big hoopla, because we were not paying attention to what we were doing. We were all tired; it was like midnight when this was happening. We were both involved in our own projects and we made kind of an in-the-moment decision when that probably wasn’t the best choice.
Sean: Right. I had a situation years ago with a big client who hadn’t done maintenance ever, and they were being charged something like $40,000 a month in missed-SLA fees from their customers. And they were about to close that site, and the other sites were about to go to Oracle because SQL just couldn’t keep up with their level of business. Ok, sure, if you say so. And so I got involved, and they were 99% fragmented on almost every table in the database. So I got their maintenance taken care of and even did a little bit of scale-up for them, put some indexes on different LUNs and whatnot. And I told them, and they were a true 24/7 shop, I said, “Now we have to go down 4 hours every Sunday to do maintenance. We have to do this.” And he was like, “I can’t do that. I can’t afford that time.” So okay, fine, right?
Carlos: What about the $40,000 a month?
Sean: So like a week later, Sunday, they decided to skip their maintenance. That following Tuesday, they had to take an emergency downtime to do their maintenance because the database had gotten so slow. Now they had gotten a taste of the good life, and the database had gotten so slow they could barely work, and this is the way they’d worked for years. So they did their maintenance that Tuesday as an emergency. The following Sunday they got ready to do their maintenance and the Plant Manager said, “We can’t bring everything down. We can’t do this. We’ve got to skip it.” And the General Manager said, “Stop. No. Take it down, do the maintenance every Sunday. Don’t ever miss another one.” So we made a believer out of him for sure.
Carlos: Yeah, that’s funny. I think maybe we should pause just for a second because I’m thinking all of a sudden about Eduardo Cervantes and some of our other developer friends who are listening. I guess I feel like maybe we should just revisit or rehash some of these terms that we’re talking about. And so I guess even the idea of index fragmentation, right, you mentioned 99%. Maybe let’s just take a couple of minutes and talk about the very beginning, so we have a table, if there’s a primary key on it, you have the clustered index and obviously you have other non-clustered indexes. How do we go about getting fragmentation on our indexes?
Sean: Right, so my favorite analogy to this is with the CD case and it seems to be the one that everybody can associate with.
Carlos: You’re showing your age there a little bit, Sean.
Sean: I’ve got 4 CD books sitting over there.
Carlos: Ok, for you millennials, right? We used to carry our music around.
Sean: Yeah, exactly.
Steve: Yeah, and it wasn’t on our phone.
Sean: That’s right. So let’s say you’ve got a nice big CD case with all your movies in it, back before Amazon had it all digital. And this is going to dovetail into fill factor as well. So you alphabetize all of your movies in there, so you’ve got them in a certain order. That, by the way, is your clustered index: all of the movies are physically in order in this book. And so let’s say they come up with another Batman movie, Batman Forgot His Comb in Penguin’s Lair, and that page is full. What do you do? You go to one of the pages at the end of the book, you put the Batman in there, and then you put a little post-it note on the current B page that says, “Bs continue on page 75.” And you get enough of those, where you’re having to put them in the back of the book and make different markers to different things, and that’s fragmentation: you don’t have everything in order. You’ve got to go here for part of it, then over here for part of it, then over there for part of it, because everything starts getting out of order because you haven’t left room to put anything else in there. So that’s what we mean by a fragmented index.
Steve: And the effect of that being out of order is if you’re trying to find that Batman movie or something else that starts with B, it’s going to take you a lot longer to find it because you’re starting in your B section and you have to flip to the back and jump around quite a bit more than if they were all in the expected location.
Sean: Absolutely, and if you put a different letter on that same Batman page and then you get another B, well, then that would be way out of order too, and you’re going to leave a post-it note on that one to say where the next B section starts. I mean, it can get way out of hand. However, let’s say your CD book held four CDs a page and you were to only put two CDs on each page, because you know you might get some more CDs. They might come out with another Batman movie out of nowhere. So on the first page you will only have the first two Batman movies, and you’ve got two free, so when they come out with a new Batman movie you can just slide it in there right past the last one you have, and you still have room for another Batman movie. Anything after that and you’re going to have to put it on another page, right? So that’s your fill factor: you’re leaving space for inserts, and once it fills up then you have to either start creating new pages or you have to re-index and recreate it with that free space again. A good example of that would be: they come out with two more Batman movies and your page is full, then they come out with another one. Well, now you have to tear your CD case apart and redo everything, leaving two free per page, so now you’ve got your expansion again, right?
Steve: While you’ve got that all ripped apart, if someone wants to find a movie, it’s going to be very challenging for them to find that.
Sean: Yes. That’s why classically they lock the table, right? Nobody is allowed in, because we don’t know where anything is right now.
Carlos: Under construction, right?
Sean: Yes, we don’t know where it is.
Carlos: We got the yellow tape.
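[Editor’s note: a sketch of how you’d actually measure how “torn apart” the CD case is. This is the standard DMV approach in SQL Server; the 1,000-page filter is just an illustrative threshold for skipping tiny indexes.]

```sql
-- Average fragmentation per index in the current database, worst first.
SELECT  OBJECT_NAME(ips.object_id)         AS table_name,
        i.name                             AS index_name,
        ips.avg_fragmentation_in_percent,
        ips.page_count
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN    sys.indexes AS i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
WHERE   ips.page_count > 1000              -- ignore tiny indexes
ORDER BY ips.avg_fragmentation_in_percent DESC;
```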
Carlos: I think another reason why there’s maybe some confusion or misinterpretation... I think what you’re saying is correct, but on the flip side, people are going to say it really only hurts you when those lookups take a long time. So with the advent of SSDs and all this premium storage, does that change the game at all?
Sean: Well, it helps, because what fragmentation does is take what should have been a sequential lookup and turn it into a random lookup. Say you’re looking up a whole range of Batman movies; you’ve got a range there, and you should get a sequential lookup, which is going to be very fast, especially through the index. But now it’s random, because they are scattered across all these different pages and maybe even all these different extents, and you’ve got these pointers that have to go everywhere, which SSD does really well. That’s a lot less important when you don’t have an actuator arm moving back and forth trying to access spinning data. But you’re still pulling extra I/Os. I mean, that’s still I/O that isn’t as efficient as it would be if the data were arranged. But SSD has made it a lot better.
Carlos: So ultimately the overall speed of getting your data is at the crux of the problem, and fragmentation and fill factor both play into that. I guess there is one more thing we should say about fill factor, and that is, well, we talked about pages, right? You mentioned leaving a little bit of space there. Now, SQL Server reads everything through memory, so you have to be careful. Again, thinking about my CD case that has those four slots on a page: I don’t want to just put in one CD or DVD and leave the other three blank, because those are a lot of pages that I’m not using, right? When those pages get read into memory, that basically means I have a lot of blank space in memory, so there’s a tradeoff.
Sean: So yeah, it’s trading one set of complications for another as I like to say about almost everything. You’re giving up space to gain the speed, and to gain concurrency and to gain efficiency in I/O but you’re giving up other resources to make that happen, so yeah.
Carlos: And I guess on that, what’s right really comes down to understanding the constraints on your system. If you’re already constrained on memory, you’re going to make it even worse if you need more pages to be loaded to get to the data. But if it’s one of those things where you’ve got way more memory on your server than you’ll ever need, the unicorn world I guess, well, maybe it’s not so much of an issue, and I think that more memory can mitigate the constraints of your disk if your disk is already …
Sean: You know, maybe you’ll be lucky and you’re on a system that can use BPE or something like that.
Carlos: I’m not familiar with that acronym. BPE?
Sean: Buffer Pool Extension.
Carlos: Oh, got you. So we’re using SSDs, you can extend that beyond your memory capacity.
Sean: Exactly. But I’m also compelled to say that one mistake a lot of people make is thinking that every table needs to have the exact same fill factor. They tend to set that globally, and that is perpetuated by some re-indexing solutions out there, though certainly not Minion, which is where all of this started to begin with, and which is what we’re doing new in Minion, right? A lot of the re-indexing solutions out there, both free and paid, don’t make it easy at all for different objects to have different settings.
Carlos: Well, give us the short version of how you make that determination as to how the fill factor should be different.
Sean: Right, so that is unfortunately an actual discussion, and the short version also takes some work, right? The short version is you have to monitor the fragmentation on the table, so you have to have some sort of monitor out there pulling frag stats from the DMV, say on an hourly, or twice-daily, or daily basis, whatever you think your threshold needs to be. Then you watch the rate of fragmentation per day or per hour that these tables are going through, and note when they start misbehaving, as in when performance starts to tank. Then you note the fragmentation level, and if you know that this thing gains a fragmentation level of 5% a day, then you can start with a 90% fill factor and see if that gets you through to your next re-indexing. That’s what this is all about: getting you through to the next big re-indexing, or even doing a little bit every day. I’m not a very big fan of the big re-indexing every week. I’m a fan of taking them as they are needed.
Carlos: A more incremental approach.
Sean: Right, but setting the settings to what I will need. If a table needs to be defragged, I mean re-indexed, at 20%, then why make it wait three more days for the Saturday night maintenance? It’s only going to get worse from here.
Steve: I think the key there is knowing when they need to be re-indexed. So often people take the approach of “we have to re-index everything on a weekly basis,” and that’s one of those that, I don’t know, I disagree with.
Sean: Right. And of course the frag monitor I just talked about, if you’ve got a big shop, that can be overkill, or if you don’t have the skill to write something like that, it can be overkill. If you want to take the poor man’s SWAG approach, you can set your big table to 90%, and then if it misbehaves before the next re-indexing, drop it down to 85%. Then if it misbehaves before the next re-indexing, drop it down to 80%. Keep dropping it by 5% until it makes it through your next re-indexing, as long as the extra page loads into memory don’t have adverse effects either. There’s going to be that give and take. I would never recommend going to something like a fill factor of 60% or 50%. I think that’s a little much for me.
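[Editor’s note: the “poor man’s SWAG approach” Sean describes boils down to rebuilding with a progressively lower fill factor. A sketch with hypothetical index and table names:]

```sql
-- Start high: leave 10% free space per page on this index.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (FILLFACTOR = 90);

-- If it fragments badly before the next maintenance run, drop by 5
-- and try again, repeating until it makes it through the window:
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (FILLFACTOR = 85);
```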
Carlos: Yeah, that would be a little too much. So now, it’s interesting, because you mentioned that idea of getting insight into fragmentation and making adjustments, right? One of the things that I asked when we were putting this together was about the future of re-indexing, and obviously you have the new version of the MinionWare tool out. One of the things that’s interesting, and we’ve talked about it on the show before from a SQL Server in the News perspective, is that SQL Server is starting to put analytics into the database, looking at your query plan. We’ve talked about this with, like, your table-valued functions. The first time one runs, SQL Server doesn’t know how many rows are going to be affected; it’s going to guess one. It runs and it says, “Oh, I actually had a thousand.” The second time it runs, it’s going to reevaluate that plan and say, “Oh, I’m not expecting one, I’m expecting a thousand,” and change things accordingly. And I have a feeling that may impact this idea of how it goes back and gets those records. Do you think that feature, that idea of the database basically deciding what to do and how those plans are going to get generated, is going to affect this decision at all?
Sean: The decision of what exactly?
Carlos: Of whether to re-index.
Sean: Oh, I see. You know, I think in a perfect world that’s what’s going to happen ultimately, and they are nowhere near that right now. Because right now they can look at some plan regressions, and they’re simple plan regressions right now, based off of CPU. If it’s going to cause a regression of greater than 10 seconds, I think, then you can have it kick in and force the last known good plan. And that’s really the only thing it can go by automatically, based off of that, based off of auto tuning.
Carlos: That’s the next thing they have added, so there’s the auto tuning. Basically that says, hey my stats have changed and I’m getting some differences then use the old plan.
Sean: Now, you’re putting something into that I don’t think exists. You’re making an assumption. It says if the plan is going to regress, not if the stats have changed. It’s looking solely at the plan. So it says if the plan is going to take longer than 10 seconds, I think it’s going by CPU.
Carlos: The execution of the actual…
Sean: It’s based off the execution of the actual plan. If it’s going to change significantly, I’ll say right now, because I don’t remember exactly what it is, but it’s 10 seconds or 10 milliseconds, something like that. If it’s going to take like 10 seconds longer for the plan to run, then it’s going to look for the last known good plan. That’s completely different. You’re inserting causality.
Carlos: What I meant to say was it’s looking at the stats of the execution, not the statistics. I apologize, that was unclear.
Sean: Yeah, it’s looking at a stat of the execution.
Carlos: A stat of the execution. Yeah, that’s right.
Sean: Well, I take that back. There are two conditions that can kick it in. It can be kicked in by the plan taking longer and by error count. If the error count of the last known good plan is one, then the current error count has to be greater than one, so we’re looking at timeouts and stuff like that. So it can be based off of error count as well, and they call that error prone. And it’s just a bit: is this query error prone? Yes. Then use the last known good plan and go about your business. Yeah, I think that in a perfect world they would have to be able to tell the difference. They would have to know why a query has gone bad, why the plan has regressed. And right now in SQL as an engine, at least not exposed to us, there’s no way to tell the difference between a query that’s bad because the stats are out of whack, or because there is fragmentation, or because there’s memory pressure or whatnot. There’s no place to look to automatically say, “Oh, well, instead of rebuilding the index I need to rebuild stats.” There is nothing in there that tells you that. It’s a jack-of-all-trades sort of thing, a shotgun at this point. That’s why so many people re-index needlessly.
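[Editor’s note: the automatic plan correction feature being discussed here shipped in SQL Server 2017. A minimal sketch of turning it on and inspecting what it decided:]

```sql
-- Let Query Store force the last known good plan when it detects
-- a plan regression for a query.
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- See the recommendations it generated and why they fired:
SELECT name, reason, score, state, details
FROM   sys.dm_db_tuning_recommendations;
```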
Steve: On that point, Sean, one of the things that comes up in conversation a lot, and that people get a little bit confused on when they haven’t experienced it, is the difference between re-indexing, meaning rebuilding, versus reorganizing an index, versus what statistics are. Can we take a minute to talk about that? Because I think that plays into the whole conversation here.
Sean: Absolutely. Rebuilding an index is absolutely rebuilding the index pages and putting them in the order they are supposed to be in. It’s like that whole taking your CD case apart and leaving the two slots free on every page and making sure that they are all in order. So it’s literally completely rewriting those pages and rearranging all the data so it’s completely alphabetized again, if you will, in whatever alphabet you’re using, whether it’s GUIDs or first names or social security numbers, whatever, right? So it’s literally just putting the data back in order and making sure that everything is contiguous again. But your stats are data about that data. So let’s say you have a numeric column, and the data in your page is: the first row is a 1, and the second row is a 5. The stats would say something like, this value is 1 and there are four steps between this value and the next value. And if the next one is a 6, then it would say there is one step between this value and the next value. And if the next value is 9, it would say there are three steps between this value and the next value. So it’s data about that data, to let the engine estimate where data is going to be located, so it can know whether it should do a scan or a seek. Because there are times, I mean, I see this all the time, where people say, “You want to get rid of all scans. You don’t want to do scans, you want to do seeks.” No, not necessarily. There are times when it’s actually cheaper to do a scan than a seek. So that’s what stats are: data about the data. And where that starts going wrong is when the stats haven’t been updated in a while. Go back to that first example: let’s say you’ve got 1, 5 and 6 as a data set.
And the stats say between this value and the next value is four steps. So you’ve got one 1 and one 5, and it says, well, there are four steps in between, but 5 is the very next value. And then you insert twelve more 1s. Well, now your stats are thrown completely off, because from that first 1 to that second 1 it is only one step. Your metadata about that data is thrown completely off. And then let’s say that somebody comes in and deletes that 5 altogether. Well, now you’ve got 5 listed in the stats but it doesn’t even exist. So the engine is going to get there and it’s probably going to have to do a scan in order to find what it wants, because the stats are completely wrong, right? When your stats are that far gone, you are basically cutting SQL’s legs out from under it and saying, “I want you to find this data, and by the way, it’s none of your business what the data looks like. Just go find it and make it as efficient as possible.” And it’s like, “But I need information.” So the stats are kind of like a trimmed-down index that tells the engine what the data looks like so it can access it more efficiently. The more the data changes, the more those stats are going to get out of whack, because you’re going to have updates. I mean, somebody might come in and update that 1 to a 12. And stats are only updated after 500 rows plus 20% of the table change, by default. So imagine having a table with a million rows in it, or 200,000 rows in it, and trying to join that with another table that’s got, say, 40 million rows in it, and your stats are extremely out of date and SQL doesn’t know anything about it. That’s a pretty long lookup time.
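[Editor’s note: the three operations being contrasted above, sketched with hypothetical names (dbo.Orders, IX_Orders_OrderDate), plus how to look at the histogram Sean is describing.]

```sql
-- 1. Rebuild: completely rewrites the index in order (the "tear the CD
--    case apart" operation); also updates its statistics with a full scan.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

-- 2. Reorganize: shuffles existing pages back into order in place;
--    lighter weight and always online, but does NOT update statistics.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;

-- 3. Update statistics only: refreshes the "data about the data"
--    without touching the physical page order at all.
UPDATE STATISTICS dbo.Orders IX_Orders_OrderDate;

-- The histogram "steps" themselves are visible here: RANGE_HI_KEY is a
-- sampled value, EQ_ROWS / RANGE_ROWS say how many rows sit on and
-- between those steps.
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH HISTOGRAM;
```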
Steve: Yup, ok. So then let’s say, I mean, I know enough to be dangerous and I’m going to go use some built-in maintenance plans to do my re-indexing, or I know a little bit more and I’m going to build my own solution to do it, or I’ll go grab someone else’s maintenance tool out there to do it. There’s a lot of different things that can happen there in how those get rebuilt. But as we talk about your Minion Reindex, what is it that’s really different or stands out with what you’re doing?
Sean: So much, even in version 2. Dude, let’s not get out of here without giving you a couple of nuggets from version 2, ok? But right now, currently, the whole thing is table based. All of the settings are table based, and that makes it really easy to have different settings at both the database level and the table level without having any extra jobs or job steps. It’s just an insert into a table. So let’s say I want the entire database to have a fill factor of 90. Well, I can set that in the database table. And then let’s say I want certain tables in there to have different fill factors, one at 85, one at 80, one at 75. Well, I can set those; I can just put rows in the table table. And now they’ve got all their settings, and those settings extend to every single one of the re-index settings, not just a handful of them. Some tables can rebuild stats after a reorg. Some can be sorted in TempDB. Some can use drop-existing. Some can have online re-indexing, some cannot, and so on and so on. One of the biggest things that we do in that sense is give you a high level of configurability without any extra jobs or job steps. I think that’s the main thing that differentiates us. And we’ve done what we call prep only. I think we talked about this a long time ago, but the big problem with re-indexing on large databases is, let’s say you’ve got a 4-hour window. It can take you an hour and a half to two hours, and sometimes longer, just to pull your frag stats. In a big database that can take a long time, right? So in Minion, you can pull those frag stats earlier in the day and use them in your maintenance window. You could start two or three hours before the maintenance window starts and pull the frag stats, and during the maintenance window you can say, “Use those stats that I’ve already collected,” and it won’t try to pull them again.
And now you’re using your entire window for actual maintenance instead of using half of it to find out the maintenance you were supposed to do.
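[Editor’s note: a rough sketch of the prep-only idea, not Minion’s actual schema. The expensive DMV scan runs hours ahead into a hypothetical snapshot table; the window itself then drives off the snapshot.]

```sql
-- Hours before the window: snapshot the expensive frag stats.
SELECT  ips.object_id, ips.index_id,
        ips.avg_fragmentation_in_percent, ips.page_count
INTO    dbo.IndexFragSnapshot      -- hypothetical work table
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips;

-- When the window opens: pick maintenance targets from the snapshot
-- instead of re-scanning, so the whole window goes to actual work.
SELECT  OBJECT_NAME(s.object_id) AS table_name, i.name AS index_name
FROM    dbo.IndexFragSnapshot AS s
JOIN    sys.indexes AS i
        ON  i.object_id = s.object_id
        AND i.index_id  = s.index_id
WHERE   s.avg_fragmentation_in_percent > 30;
```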
Carlos: Right, so it’s almost like a rules-based approach that gives you the ability to finely tune all the individual knobs and settings.
Sean: Right. And you know, something I wasn’t very proud of for a while in Reindex 1: in order to do that, that was one of the things you had to have a separate job for. You had to have a job to run the prep only and then another job to run the actual re-indexing itself. But now we’ve brought that into the scheduler table. We just released 1.3 about six weeks or two months ago, something like that, and we’ve got that scheduler table now. So now you could say something like: at noon I want you to run this as a prep only, and then at midnight I want you to run the re-index using the stats that I collected at noon. And now those are just entries in a scheduler table, so we’re back down to having no extra jobs at all. You could run prep only for 40 different databases at 12 different times throughout the week and it will all just be table entries. It’s all still just one job that does the running.
Carlos: Got you.
Steve: Yeah, that's really interesting, because I know if you have a different job for each of those, or even a different job schedule for those different events, it can get really messy in SQL Agent.
Sean: We had a client with 75 jobs just to manage their re-indexes. On one SQL Server they had a ton of databases, like 3,000 or 5,000 databases on there, and that's 75-something jobs just to manage the re-indexes, because they had special things they had to do in there. That was on somebody else's solution.
Sean: Also in 1.3 we gave you the ability to, well, we could always do this, but we made it a setting to re-index heaps. So now it's a setting how you want to handle heaps, both at the database and the table level. You can handle it like it always has been and just re-index the indexes themselves, or you can re-index the heaps themselves.
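Under the covers, a "handle heaps" setting ultimately has to issue a table rebuild, since a heap has no clustered index for ALTER INDEX to act on. The statement below is real T-SQL (SQL Server 2008 and later); only the table name is a placeholder.

```sql
-- Rebuilding a heap directly: compacts the heap and fixes forwarded records.
-- dbo.MyHeap is a hypothetical table name.
ALTER TABLE dbo.MyHeap REBUILD;
```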
Steve: Yeah, and you know that's interesting, because that's something a lot of people don't think about when they're re-indexing as well. What's going to happen with the heap?
Sean: And like I said, you could always do it, but it followed the, I don't know if I want to call it the hack attack method. We've always had this way of being able to alter the run. Now, you asked me before, not on film but the other day when we were talking on Skype, you asked me if we had something coming where we're going to be able to limit the re-indexes by, like, page count.
Carlos: That's right, so basically, my small indexes I don't care about.
Sean: Right, exactly. Now, you can actually do that today, and the way we do it is actually pretty clever. But it's the way you had to do the heap re-index before, so you'll get the idea. The method to do that now is to do a prep only, which takes all of your information and puts it in a table, and then insert a step that deletes rows out of that table based off of any criteria you want.
Carlos: Oh, I see, do the little prep, purge anything you don't want, then let it run in the maintenance window and you've got it covered.
Sean: Exactly. Now, that gives you so many things, right? Yes, you can do it off of index page count, you can do it off of row count. It also allows you to reorder the indexes, because that's another thing that we really do well: we allow you to do your tables in any order you want, so you can order the tables.
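The "delete rows out of the prep table" step Sean describes is just a filtered DELETE against the work table the prep-only run populated. A minimal sketch, with hypothetical table and column names:

```sql
-- Sketch of the "purgatory" step: after a prep-only run has filled a work table
-- with frag stats, delete anything you don't want maintained tonight.
-- Minion.IndexMaintWork, PageCount, and RowCnt are hypothetical names.
DELETE FROM Minion.IndexMaintWork
WHERE PageCount < 500       -- skip tiny indexes
   OR RowCnt    < 1000;     -- or filter on row count, schema, anything you like
```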
Carlos: But I guess I still don't like the idea that I have to be the one to go back and go through there and do it. I just feel like there should be some minimum thresholds.
Sean: Hold on, I’m getting there.
Carlos: Ok, ok. Here we go.
Sean: Now, there are a couple of things going on here, right? I was talking about ordering because we talked about this a long time ago. I remember telling you guys that the problem I have with re-index jobs that stop after 2 hours is that it's very seldom actually necessary. What you mean is, I'm not able to control the order the tables go in, so in order to keep my important tables from running past that 2-hour window I have to stop my whole run. Whereas if I make sure that my 7 or 15 most important tables get done first, then I don't care if those other little tables run over two hours. Who cares, right? They don't take long enough and they aren't used enough, so once you can order your tables, running over is not a big deal. So within this no man's land, I guess we'll call it purgatory, between the prep only and the actual run, you can not only delete stuff out of there, you can also reorder the tables based off of anything you want. And we've got some SPs that we've written to help you with this, I don't know if they made it to the community zone or the queries folder or what, so that you can reorder the tables based off of usage, so the most used tables get done first, or the least used; you can reorder them by schema, by anything that comes up in the stats DMV, or any other criteria you would like. And this is kind of where I'm getting at, Carlos. Right now with other solutions, I hate to keep saying that but that's what it keeps coming back to, you get a hard-coded choice: you can do this above this page count or below this page count. And that's it, that's the only decision you get to make. But what if I want to do it based off of, like I said, usage?
What if I want to order them based off of usage, or delete them based off of usage, or whatnot? Anything that falls below a certain usage, don't even bother re-indexing it; I just want to kill that altogether, right? And I don't want to put exclusions in the table, because that's an exclusion list I have to manage. I want it to be more dynamic, so we have a big dynamic component to the way we're doing this. So right now you can code your own solution, or you can alter one of the solutions we've already done. We give it to you in SPs, so you can just pass in a couple of parameters and you're done. But in Re-index 2, which I'm working on now, I'm following that exact same method. What I've been telling my users to do all this time, I'm building into the product. If you look at it right now, the way it works currently is it pulls all the frag stats and then, in the exact same SP, it runs the re-index. So frag stats get pulled, then it runs the re-index. Well, that's no different from doing a prep only earlier in the day and then running the re-index during the maintenance window; now we're only arguing about the amount of time between the prep and the re-index, right? So in Re-index 2 I'm going to have two steps in the job. Well, maybe three, but we'll talk about that in a second. The first step is going to do a prep only, every single time, without fail, even if you're just running the re-index. The next step runs that prep. So that's the exact same as pulling the frag stats and running the re-indexes, only now I've split it into two different job steps. But one runs right after the other, so you're talking about a difference of milliseconds. And what that's going to allow me to do now is build into the system these inserts, these purgatory inserts. So now, let's say you want to limit the run to anything that's over 500 pages. Ok, great.
It will be a checkbox for you, and behind the scenes, after the prep only, it will run the SP that deletes anything out of there that's under 500 pages. And then let's say you also want to order it by the most used index; well, after that, it will run the one that orders them by most used index. And let's say you only want to do the ones that are in this schema. Well, hopefully you would have passed that into the table as a parameter, but you see what I'm doing. So I can handle ghosted records, I can do anything I want in that no man's land, because the prep has already run. I've got a list of tables and indexes that I'm going to run, and then you can delete, you can add, you can subtract, you can rearrange. You can do anything you want before that run actually happens.
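A purgatory step that reorders by usage could be sketched like this. The work table and its columns are hypothetical; `sys.dm_db_index_usage_stats` is the real DMV Sean alludes to for read counts.

```sql
-- Sketch: reorder the prepped work list so the most-read indexes run first.
-- Minion.IndexMaintWork, TableName, IndexID, and ReindexOrder are hypothetical names.
UPDATE w
SET    w.ReindexOrder = u.rn
FROM   Minion.IndexMaintWork AS w
JOIN (
    SELECT object_id, index_id,
           ROW_NUMBER() OVER
               (ORDER BY user_seeks + user_scans + user_lookups DESC) AS rn
    FROM   sys.dm_db_index_usage_stats
    WHERE  database_id = DB_ID()
) AS u
  ON  u.object_id = OBJECT_ID(w.TableName)
  AND u.index_id  = w.IndexID;
```

Any other ordering criterion (schema name, last run duration, row count) would just swap out the ORDER BY in the derived table.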
Carlos: These features you mentioned, is this 1.3, or are we talking about 2 now?
Sean: Well, ok, so it's available in 1.3, but you've got to code it yourself or get the code that we've written. In 2.0, which I'm writing right now, is where I'm going to build that stuff in, and I'll build in a handful of those purgatory SPs. I'm going to stick with that term for now. I'm going to build that handful of purgatory SPs, and then I'm going to try to build in a mechanism for you guys to write your own and even upload them into the community zone so that everybody can share. So you can see, the method is going to be much more flexible than anything else you've got now, because you can do more than index pages or row count. You can do anything that crosses your mind, even going down to saying, "This index took this long last time and that's too long, so I want to delete it from the run." And the last thing I want to say about Re-index 2.0, as a little bit of a teaser: look for some good multithreading.
Carlos: Ok, there we go. Do we have an estimated date yet? Or is it still in the works?
Sean: Oh, it's still in the works. I wouldn't even come close to having a date, but I'm working on it diligently. But I will say, remember we were just talking about frag stats and how long they can take to pull? I just got the multithreaded frag stats engine worked out. So say you've got a database that's like 4TB, 6TB. You'll be able to pull frag stats in as many threads as you want. So I can split the database up into 10 pieces and pull frag stats for 10 tables at a time, and just work through that database in tens and cut your time drastically. I'm seeing beautiful times. I've got a database with something like 53,000 tables in it on one of my test boxes, and on my laptop it takes like 2½ hours to pull all the frag stats for it. It gets done in like 12½ minutes when I striped it across 10 threads. That's a beautiful thing. So I'm looking forward to it and more.
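The batching half of that idea, splitting a database's tables into N roughly equal pieces so N workers can each pull frag stats for their own slice, can be sketched with NTILE. This shows only the partitioning; the actual multithreaded engine is Sean's, not shown here.

```sql
-- Sketch: assign every table to one of @Threads batches so each worker
-- can call sys.dm_db_index_physical_stats for its own batch in parallel.
DECLARE @Threads int = 10;

SELECT t.object_id,
       t.name,
       NTILE(@Threads) OVER (ORDER BY t.object_id) AS BatchNo
FROM   sys.tables AS t;
-- Worker #n then pulls physical stats only for rows where BatchNo = n.
```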
Carlos: Shall we go ahead and do SQL Family?
Sean: Let’s do it.
Steve: Yes, since you've been on the show a few times before, we've sort of shaken up the SQL Family questions a little bit this time.
Sean: Alright, and I haven’t heard any of these, so this is going to be great.
Carlos: Yes, that’s true.
Sean: You were supposed to type them in the chat and I haven't seen anything, so this is going in cold. I like it, sounds like a good idea, let's do this.
Carlos: Do you want to go first, Steve?
Steve: Yeah, so I guess you've been around SQL Server quite a while and you've done a lot with it, and the first question I have is not what would be the one feature you would change, but what do you think is the most absurd or ridiculous feature that was ever added to SQL Server?
Sean: You know, it's not the stuff that they add that I think is ridiculous. It's the stuff they don't put in. Things I'm going to call the half-ass features. Things like forcing us into dynamic SQL by not allowing us to use variables as object names. You know, SELECT * FROM @table, being able to do stuff like that. Or forcing us into XML for everything and then not standardizing that XML. Like, if you've ever tried to programmatically parse a deadlock XML graph: you've got 14 nodes in one, or 14 attributes in one, and then in some of them those attributes don't exist at all. Not standardizing that XML. I mean, put them there, have them null, but put them there so at least they're there for me to parse programmatically. Stuff like that where they just don't go all the way and let us do things we need to be able to do. They force us into ridiculous decisions. We don't even have thread safety in stored procedures.
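For readers who haven't hit this: `SELECT * FROM @TableName` is not legal T-SQL, and the workaround Sean is alluding to is dynamic SQL. A minimal example, with QUOTENAME guarding against injection (the table name itself is a placeholder):

```sql
-- You cannot use a variable as an object name directly, so you build and
-- execute the statement as a string instead.
DECLARE @TableName sysname = N'dbo.Orders';  -- hypothetical two-part name
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM ' + QUOTENAME(PARSENAME(@TableName, 2)) + N'.'
                      + QUOTENAME(PARSENAME(@TableName, 1)) + N';';
EXEC sys.sp_executesql @sql;
```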
Carlos: So this is going to take you back a little bit, right? The first time you ever failed to close a transaction.
Sean: I can tell you the last time. Just the other day I was like, "Why the hell is this taking so long?" And I went into the DMV and I was like, "Now, what is blocking this? What is blocked?" Yeah, that was just last week. The first time? I really don't know, man. I'm going to tell you honestly, I have always been in the class of what I call deep-shit coders. I make so many ridiculous mistakes. I'll troubleshoot stuff for half a day before I realize I'm troubleshooting on one server and writing on another server. I do that kind of stuff all the time. When was the first time? Dude, I wouldn't even begin to know; too many times to count.
Carlos: Oh, ok. I guess it was very public for me, so that's why it stuck with me. If it's on the dev server, that's something different, right? You know, that first time that you did it and the email from the manager comes.
Sean: I can definitely tell you the first time I brought a cluster down.
Carlos: Ok, let’s hear it.
Sean: Ok. I was working for Pilgrim's Pride and we had some 80-something locations all feeding into a central location, and we only had 2 DBAs at the corporate office, me and the DBA manager. I was fairly new to clusters, I didn't really understand the concept, and I didn't really know what I was doing with clusters, and I went in there and did something I shouldn't have done and brought the cluster down for everybody, because every site in the entire network fed into ours, right? So transactions across 80-something sites were failing in the busiest part of the workday. And I went to tell the boss what I had just done, because I didn't know how to get the cluster back up. I just walked in there and said, "Hey, dude, I've got to tell you something." He goes, "Ok, what?" And his phone rang and he goes, "Can you wait just a minute?" That call was actually about me. And he looked at me and said, "What did you do?" And I was like, "Oh god, uhmm, I brought the cluster down. It's down right now." And yeah, that was back when it took 25-45 minutes to spin up all the disks in the array. It wasn't a pretty picture. At least I didn't corrupt anything, but it was not great.
Carlos: So we’ve had your wife Jen on the program here before as well, so a family friendly version of how you two met?
Sean: Wow! Ok, you know, that is actually a bitter sweet story.
Carlos: Oh boy, do we not want to include that one, Sean, we don’t have to.
Sean: No, no. There’s nothing wrong with it. We met when I was in my late teens. I was 18 or 19 and she was 13. We were both going to the same youth group. And we became friends because she was fairly introverted and I was, I don’t know if I was introverted or not, but I wasn’t a cool kid, and there were some definitely cool kids in our youth group and I wasn’t one of them. So we just kind of started hanging out together and we became friends and we were friends for, I don’t know, 20 years I think before we actually got together as a couple.
Carlos: Wow, ok, so late bloomers.
Carlos: All is well that ends well.
Steve: And of the two of you, who was the first one to dive into SQL Server?
Carlos: Even with the chef thing.
Sean: Oh yeah, yeah. I quit cooking. Well, I had my own catering company even after I left the chef I apprenticed under for so many years. I had my own catering company for a few years even after I was in IT. So I got her involved in SQL. I don't know how long I was in it before I got her involved, but she was an intern at FedEx at the time, and I started talking to her about SQL because I guess she was dissatisfied with the way her current position was going. We'd have to ask her specifically what the catalyst was, but knowing her, probably just to please her man.
Steve: Well, that wraps up my SQL Family questions for today.
Carlos: That’s right. Thanks Sean for being on the program. We do appreciate it.
Sean: Anytime because I love talking.
Steve: Yeah, and as usual it’s been a pleasure. Thank you!