Episode 149: Managing lots of databases

Episode 149: Managing lots of databases

Episode 149: Managing lots of databases 560 420 Carlos L Chacon

Do you ever feel like you have too many databases to manage?  Managing lot of databases can get tricky and in this episode we explore some of the techniques you can use when you need to manage lots of databases.  Tracy Boggiano joins us to discuss some of her techniques, as she has managed over 300 instances.

Episode Quotes

“Once you do figure out [all of your databases], I recommend having a Central Management Server, where you register them all and keep up with them in groups that you can run queries against them.”

“Collecting data for your monitoring comes in handy, so that you know what you need going forward.”

“I like to be efficient and make things where I don’t have to do things more than once, so if I get asked to do something twice, I try to find a way to do it.”

Links to tools recommended by Tracy

dbatools – Instance migrations and best practice implementations
dbachecks – SQL Server Environmental Validation
Telegraf – Collect performance metrics into one system and display that information
First Responder Kit – Brent Ozar’s blitz package
sp_whoisactive – Activity monitoring stored procedure
Central Management Server – administer multiple servers
MSX/TSX – Create a multi-server environment
Query Store – Query Store usage and adaptive plan tuning usage

Listen to Learn

00:40     Intro
01:33     Compañero Shout-Outs
02:14     SQL Trail
02:40     Intro to the guest and topic
07:27     Figuring out your full server inventory
08:44     Once you’ve got it figured out, good ways to configure for running queries
10:08     What has the dynamic been for Tracy between system admins and DBAs?
11:12     Use Telegraf to monitor all of your servers
13:26     The kinds of things you’re tracking and how you do it
14:22     Managing Query Store at scale
16:25     Using the master server/target server structure
18:43     Tips on upgrading so many servers
20:32     Keeping track of your server’s ownership with a spreadsheet
22:06     At what number of servers should you be automating things?
24:25     The difference between managing a small number versus a large number of servers
27:19     Statistical sampling for administration purposes
29:28     SQL Family Questions
34:44     Closing Thoughts

About Tracy Boggiano

Tracy is the Database Superhero at FM:Systems.  She has spent over 20 years in IT and has used SQL Server since 1999.  Tracy is a also co-organizer of a Special Interest Group (SIG) dedicated to advanced DBA topics in our local user group TriPass.  In addition, she is also the founder of http://WeSpeakLinux.com.  Tracy was an Idera Ace this year and named an Idera Superstar for 2018.  She was awarded the PASS Outstanding Volunteer Award in November 2017.  She has a chapter in the upcoming book Let Them Finish: Stories From the Trenches. Before she worked full-time as a DBA, she was formally a developer and network administrator.  She also tinkered with databases in middle/high school to keep her sports card collection organized.

Tracy has volunteered through the NC Guardian ad Litem program since 2003 advocating for abused and neglected foster children in court.  This is her life’s passion outside of working on Database Platforms and her favorite job.  More information about this program in North Carolina can be found at http://volunteerforgal.org or the national organization CASA at http://www.casaforchildren.org.

*Untranscribed Introduction*

Carlos:             Compañeros! Welcome to another edition of the SQL Data Partners Podcast. My name is Carlos L Chacon, your host, and it is good to be with you again. Our topic for today is a simple one, Managing lots of databases. Some of the challenges and struggles and strategies that you might take on when you have lots of databases to manage. Our guest today is Tracy Boggiano from North Carolina. Tracy’s been on the show before and we’re glad to have her back. we’re joined again, by Kevin Feasel and Eugene Meidinger. We also have a special shout-out to Angela Henry, who is joining us, and a little bit more prominently than last time. We didn’t do her justice as we got her on with Bert’s episode. And unfortunately, just from timing and whatnot, she will eventually help do these openings, but we’re just trying to get everything in order, here.

Before we get into the conversation today, we do want to give a couple of shout-outs. The first to Ayman El-Ghazali, Chris Albert, Jared Poche, Michael Garland, Torsten Strauss, Chahul Jain, Gonzalo Bissio, who is in Argentina. I’ve started, actually, doing a little bit of work with Gonzalo and he’s really sharp. I’ve really enjoyed working with him. And Jorge Albelaez, and I’m sure I’ve said that incorrectly, Jorge, and you’ll forgive me for mispronouncing that. So yes, always interested in these topics, and compañeros, if you have other topics you think we should be talking about, let me know at sqldatapartners.com. Reach out to me on social media.

I would be talking about the conference, except I’m there right now. It’s going on, we’re having a lot of fun, I’m sure, and maybe next year, we will grow a little bit, we’ll get some feedback from what we’re doing. I’m sure we’ll have a review. We’ll be talking about it in a couple of weeks, as far as how that went. There’s been so many announcements made, that I have just been a little behind the times, and so we’re going to have to get back with SQL Server in the News here, shortly.

Our show notes for today’s episode is going to be sqldatapartners.com/lotsofdbs, so (spelled out), or at sqldatapartners.com/149. And so with that, let’s go ahead and get into today’s conversation.

Carlos:             So Tracy, welcome to the program!

Tracy:              Welcome to you!

Carlos:             And, since I did a very poor job last time, we’ve had Kevin and Eugene on the program, before, but we also now have another special guest, Angela. We didn’t pull her in until almost the very, very end last time, so Angela, welcome.

Kevin:              Hi Angela.

Angela:           Hello.

Tracy:              Hi Angela.

Angela:           Hello.

Carlos:             Our newest panelist, here, on the SQL Data Partners Podcast. So it’s exciting to have her.

Kevin:              Interrogator.

Carlos:             Yeah, we’re supposed to take that from Tracy, but I suppose that’s in order.

Kevin:              No, no, she’s the interrogator.

Carlos:             Yes, thank you.

Kevin:              We will interrogate her, which is Tracy.

Carlos:             So, like the number of panelists we have on this program, today we’re talking about managing lots of SQL Server instances. Although Tracy has moved on to greener pastures, I feel confident that one of her previous positions was probably the most number of instances I’ve ever heard of in one company.

Tracy:              I’ve heard of more in other places at a SQLSaturday.

Carlos:             Is that right?

Tracy:              I talked to a guy in Charlotte that had over 1500 instances in his environment.

Angela:           Oh, that makes my head hurt.

Tracy:              He took the cake, yes. I was like, “oh, I need to revamp my presentation.”

Carlos:             Yeah, so I think ultimately the idea here is– and I’m not sure how very many is, but I think once you get over to, let’s call it at least one handful, maybe two handfuls. Once you get over two handfuls of servers, the approaches that you have to take are going to vary a little bit. And so ultimately what we want to talk about today is how to manage some of that.

Kevin:              Yeah, my rule is, if you’ve got to take your shoes off to count how many servers you have, you’ve got a lot of them.

Carlos:             Yes, that’s right.

Tracy:              Yes.

Carlos:             We’ve talked about PowerShell on this program before. So, is the easy answer just download DBA Tools and then you’ll be set?

Tracy:              That is one way to do a lot of things, but there are other ways to do lots of things, as well, because DBA Tools doesn’t do everything you need to do.

Carlos:             There you go. Maybe let’s hit some of that, first. What are some of the challenges? Once you go from, let’s just say 10 to, I don’t know, even 25, 30 instances, what are some of the challenges that we’re going to face trying to manage some of that stuff?

Tracy:              Some of the challenges that I know DBA Tools doesn’t cover is Resource Governor, is the common thing that we used at the last job. Copying extended events is something I haven’t seen it do. Setting up our error log retention and size. Turning on read, commit and snapshot isolation, which is something we commonly suggest turning on, because you have to have access to the database to the point where no one else has access to the database, so DBA Tools probably wouldn’t have something to do that. Changing your auto-growth settings for all of your database sizes are going to vary depending on what you want to do. Those are the type of settings you would have at the database levels that you might want to change that DBA Tools might not be handling for you.

Carlos:             Right, I think that’s an interesting idea. I kind of jumped right into the management side, but I think you bring up an interesting point in that if you’re going to have that many, and I think at this point we’re talking about over 100, then likely, they’re just going to reproduce, and you’re going to get bigger, and so the idea of even, “how do I get more instances up and running?” also comes into play in a conversation like this.

Tracy:              Yes, and commonly what we did do is we did have PowerShell and we used a combination of DBA Tools where we could, and then our own custom PowerShell commands to do what we couldn’t do with DBA Tools. And then we combined a lot of scripts together for deploying things, like we used Ola Hallengren scripts for maintenance, and that was before DBA Tools had a way to deploy those for us. We had sp_whoisactive and we had that set up to deploy before DBA Tools had a way to deploy that for us. So, there’s some of the things that now DBA Tools does that we were doing before it was in DBA Tools that are common scripts that you would set up. Brent Ozar scripts for the Blitz package, for example, we had those running on all our systems, and those are things that you would set up on your system and you would want to have available so that you can troubleshoot your system as far as performance goes, as well.

Kevin:              When it comes to having this many servers, one of the trickier parts that I’ve seen is in terms of finding out where those servers are, finding out what your full inventory is. Assuming that you don’t already have that in place, do you have any hints on how we might be able to do that sort of investigation?

Tracy:              Yes, there used to be a tool that you can use to scan based on 1433. I don’t remember the name of it off the top of my head, but if it’s not on that port, then you’re kind of out of luck on trying to find it, if it’s not on default port. Because I do know that one place I worked at, they insisted it not be on the default port for security reasons, that way if someone happened to get on their network, nobody could accidentally find it easily.

Kevin:              Because Nmap is so difficult to run.

Tracy:              That’s what the security people tell me. I was like, “okay, I don’t think that’s going to help you very much,” especially seeing we named all our servers with SQL in its name.

Carlos:             Yeah, that’s an interesting idea. From an auditing and reporting perspective, I think we try to have some standards, but then figuring where all that stuff is, I can imagine that at that scale, it could be difficult to track some of that stuff down.

Tracy:              Yeah, and once you do figure it out, I recommend having a Central Management Server, where you register them all and keep up with them in groups that you can run queries against them. Or you can use PowerShell to pull them out and run queries against groups that are in PowerShell using the CMS.

Carlos:             Yeah, I’ve used the Central Management System, I think it’s Central Management–

Tracy:              Management Server.

Carlos:             Server, there you go. But admittedly, it’s been more of dev, test, production. Once you get to that level, I would think that those groups would be too small. So then, how do you start segregating your instances?

Tracy:              If you need it beyond that level, you could have groups where you have the version for SQL Server, because versions change for DMVs and things that might be important to you. You can have it by your applications that you need or by departments in your company. It depends on how your environment is set up, really.

Carlos:             Right, and who you’re supporting a lot.

Tracy:              Yeah, who you’re supporting, yeah.

Tracy:              Because you can register a server more than once in the CMS post, so you can put them in multiple groups, so you can really set them up. You can set up your dev, test and production, but then you can also have by your version number, as well, so you can do multiple groups, even.

Carlos:             There you go. Yeah, you can see that I haven’t had problems like this, because that idea seemed foreign to me, putting the same server in multiple groups, but it makes perfect sense. You want multiple ways to be able to track that, potentially. Okay, so we know that we’re going to have some setup issues, some auditing-type components. What about, in that level of complexity, there always seems to be a handoff between the admins, the system admins, and then the DBA group. How has that dynamic been? What was that like for you, from a disc space perspective, when you’ve needed more, things like that?

Tracy:              Well, it’s been pretty simple. At the last company I was at, we pretty much just asked for space and they gave it to us, so that was pretty simple, because we can over-allocate space on the pure storage we had there. The company before that, not as simple at all. Getting space was very difficult and had to be budgeted and stuff like that, so I had to keep up with trends on how fast databases were growing so that I could put in budget numbers for each year and know what we needed for the year and stuff like that. That’s where collecting data for your monitoring comes in handy, so that you know what you need going forward.

Kevin:              Speaking of monitoring would you look at monitoring all of the servers, sub-sets of the servers, or how would you think about figuring out what servers are causing you pain at any point in time?

Tracy:              All the servers, all the time.

Tracy:              I’ve got a free tool to use to do it, so except for the hardware, so why not do all the servers all the time?

Carlos:             There you go, here. Plug your software, here, Tracy.

Tracy:              There’s a free tool called Telegraf. I’ve got a presentation for it online, as well, It’ll actually monitor SQL in Linux, as well in other products, as well as iSQL Server. And you have a backend called InfluxDB that’s a time series database and then you’ve got Grafana that gives you the graphs as the front end. It can monitor all of your SQL Servers for you and it has a very small footprint and it’s fairly cheap, because you’re only paying for your hardware because those are all open source products. So, all the servers, all the time.

Carlos:             Is that a little weird? Should I feel weird about recommending or talking about a product that’s going to monitor my SQL databases with a non-SQL Server?

Tracy:              No.

Kevin:              I’ve used Splunk, I’ve used Elasticsearch to store this information. InfluxDB works well, Telegraf is a good way of displaying. It’s a viable solution.

Carlos:             Yeah, I guess I’m curious. You mentioned that it was a time series database, is there something specific about using a time series database, like some problem that you’re trying to solve there?

Tracy:              Well, all the data is time-stamped that goes into it, so hence the time series part of it. All of the data’s going in as a specific time stamp that you’re collecting it, so if you’re collecting it in 30 second intervals, you’re time stamping it at 30 second intervals. It’s made to capture that data like that and store it more efficiently and query it more efficiently.

Carlos:             Right, so really just talking about monitoring at scale. In fact, I was involved in a LinkedIn conversation here, not too long ago, and they were talking about “hey, give me your recommendations for monitoring tools.” Once you get past, and I’m sure they give you some kind of deal, but again, you’re talking about more than 100 servers, they can get pretty expensive, pretty quick. So this is this idea that I can grab some of this information and display it, keep it, keep track of it in a way that can be cost-competitive, if you will.

Tracy:              Yeah.

Carlos:             So what kinds of things are you tracking, then?

Tracy:              Your CPU utilization, your wait stats, how your memory’s being used, information going across about your AGs, your disc usage, the space on your databases, the activity on your databases as far as transactions per second, batch requests per second, stuff like that.

Carlos:             Okay, so then it sounds like you’re going to get a dashboard with all of that information and then you’re going to have to know about the environment, or at least start digging in through that historical view to be able to say, “oh hey, this is a problem for me” or “maybe I should intervene at this point.”

Tracy:              Yeah, you’re still going to have to know your environment and you’re still going to have set up something like sp_whoisactive to capture to a table every minute to be able to capture things that are running long, to see if you can find something that was causing a particular problem, or have Query Store turned on if you’re using 2016 and above to go in and find things, because it’s not capturing queries that are running on your system.

Carlos:             Now that is something I do want to ask. Query Store, what I’ll say at scale, and it sounds like you’re not keeping Query Store in chunks. It sounds like you have it kind of continuously running?

Tracy:              Yes.

Carlos:             Yeah, so thoughts? Is that for all the databases? Do you decide, is that by default in your case? How did you decide some of those questions, particularly thinking about the number of databases you have to manage?

Tracy:              Well, we started turning it on because we were early adopters of SQL Server, so we turned that on in the first servers that we upgraded to CPT 2.1 in 2016 and we’ve seen how it went. And things went well, so we started turning it on everywhere else and we did run into a few issues with it and had to turn it off at one point. Then they released ACU for us, that fixed the issue that we were having, and we turned it back on and we’ve seen a dramatic performance increase when we were able to turn auto plan correction on when we upgraded 2017, we’ve seen a 20% CPU drop on most of our servers that were above 60% CPU utilization. I did have a blog article where I’ve put a print screen of out there and Bob Ward sited it on his website as well, in a blog. So we’ve really enjoyed Query Store at Channel Advisor. I’m not there, anymore.

Carlos:             Yeah, yeah, that’s right. That’s pretty cool, to take a feature like that and then be able to use it at scale. I guess, specifically, circling back to Query Store, are you just using the CMS and the PowerShell to manage that? Any other considerations for managing Query Store at scale, like that?

Tracy:              No, we turned it on, we limited it to 2G of data, so it’s not getting too large.

Carlos:             Okay, and it’d just cycle through?

Tracy:              That way it can auto clean up when it reaches the 2G limit and that’s pretty much it. That keeps about 30 days’ worth of data for us, because we don’t have a whole lot of adhoc workload and we’re processing 30,000 transactions per second on some of those systems. And we’re able to keep that much data to troubleshoot with.

Carlos:             Wow, that’s pretty cool. So, now what about the agent-front?

Tracy:              Well, we like to use MSX/TSX, the master server/target server structure at Channel Advisor. We have one master server that has all of our jobs on it, and we set up every server to be a target of that server, so that we can send all of the jobs over there. and that way we don’t have a situation where we have different jobs running on different servers and somebody just made a change in one spot and didn’t make it somewhere else. And then on some of our maintenance jobs, we actually have a formula stuck into it where it will pick a certain time to allow it to run based on the server name, that way it doesn’t all run at one time. So, our backups don’t all run at the same time and hit the storage all at the same time.

Carlos:             Sure, that kind of goes back into one of those setup type ideas, as to how am I going to get to get this installed and when’s all of this stuff going to fire, so that while your SAN admins can’t be that happy with you if that many servers–

Tracy:              You don’t want 60 production servers backing up to the same spot all at the same time, so we do have a formula in there. I’ve got a blog about how we do that, as well, but that shows how we take the absolute value of the server and round it or something, and I can’t remember exactly how we do it, but we have a weird way we do it. And then we get the hour and then we see if it matches the hour that the job is kicking off and if it does, we go ahead and let it run. That just spreads our jobs out a little bit for us.

Carlos:             Okay. Only because I had an issue similar to this, this morning, a lot of times we think about, SQL Server, one of the things we get a little bit complacent, maybe, about with, when I have smaller instances that I could get a SQL Server, Reporting Services and Analysis Services all in the same SKU. Did you ever have to deal with Reporting Services sprawl as well or is this just engine-specific?

Tracy:              I’ve just had to deal with engine-specific, so I’m lucky.

Carlos:             Cause I can imagine that that would present its own set of problems. Hopefully you wouldn’t need that many reporting systems, but who knows? Particularly all of these third-party apps, they all want their own environment. It can get pretty big, pretty quickly.

Tracy:              Yeah, I’ve only done with the engine itself, so.

Kevin:              When it comes time to upgrade so many servers, do you have any tips on how we might be able to manage that upgrade process? I’m not going to sit there and click next, next, next, on a few hundred servers, I hope.

Tracy:              Why not, Kevin?

Eugene:           You’re getting paid either way, Kevin. I don’t see what the issue is.

Kevin:              That’s true. If I can bill by the hour, I’m cool with that.

Eugene:           Yeah, exactly.

Tracy:              We actually have a custom PowerShell script we use to do that and we feed in the servers in the text file and we upgrade up to 20 servers at a time, doing that, and patching that way as well. I believe that’s available through the download of this presentation on my website.

Eugene:           A question about that, if you’re doing automated upgrades, do you have some automated way to communicate outage windows or do you guys just like Azure or like, “well, we’re just going to shut it down” or is everything highly available so that you can just do rolling upgrades and no one ever notices?

Tracy:              Well, in production we had AGs so we could just do it, but they insisted that we communicate anyways, which ones we were doing, so we just manually stuck it into our forums, which servers we were doing and when we were done. We didn’t get that far into automating it.

Carlos:             There you go, that’s an interesting side-bar there, is that we can talk about all the automation that we want to, but there’s still the idea of communication to the larger group, the users of those systems, and what’s going on and that there’s only so much automation you can get there.

Eugene:           With the Azure levels, you do get emails when there’s going to be general mains, but the way Azure’s designed is, it’s assumed that your server could go down at any moment.

Carlos:             Yes, that’s true.

Eugene:           So, you have to put stuff in different availability groups, or availability sets, I think is the term, so you can make sure that at least one of them is up while the other one’s down.

Kevin:              Another problem that I can see when working with a lot of servers is this idea of ownership. I have a lot of instances and I don’t necessarily know who owns them, if they’re still in use, what’s going on with them, can I bring them down during certain windows, what are the SLAs, how do we track all of that information effectively?

Tracy:              Well, it’s called a spreadsheet.

Carlos:             Don’t tell me, are you using Excel?

Tracy:              No, we had Google Sheets.

Kevin:              Poor man’s Excel.

Tracy:              That’s what we did at Channel Advisor, is we kept up with the newest spreadsheet, and when we added new servers, your name’s on the R server, Kevin, and we know what team owns the server just in case the manager’s not there anymore. That way we can go over there and communicate the change and we know what they’re supposed to cause in that department or what app it’s affecting, cause they forget sometimes when we go and ask them. And then we have to remind them and they’re like, “oh yeah, okay, we can do that.”

Carlos:             So, it is somewhat reassuring, as cool as some of the things that you guys have put together, in taking over all of these systems, it is nice to know that there’s still some real-world problems that you have. I still have to interface with my customers.

Tracy:              That used to all be in the manager’s head and not in a spreadsheet.

Carlos:             Oh gotcha, so there you go, that’s the first step.

Tracy:              I got it transposed into where other people could see it and it could be visible.

Eugene:           We talked about what “a lot of databases” is and Kevin’s numbers, if you have to use your hands and your toes, but where’s that point, do you think, Tracy, where it’s painful enough that you really should start automating and you’re feeling the pain, but it’s few enough that if you’re maybe not smart about it, you might try and brute force it? Do you know what I mean? Where do you think that inflection point is, where you start to see the most gains from automating things?

Tracy:              I would say around the 5 to 10 mark.

Eugene:           That makes sense.

Tracy:              Because I’m just, I use the phrase “lazy”. If I have to do things more than once, I like to write a script for it.

Angela:           That’s called being efficient, not lazy.

Tracy:              I know, I used the wrong word. I said I used the word lazy, but I like to be efficient and make things where I don’t have to do things more than once, so if I get asked to do something twice, I try to find a way to do it. So, if you’re around the 5 to 10 mark, you’re doing things 5 to 10 times a day, even with the CMS server, you’re still querying the servers and checking things, so you’re starting to get a little painful, you’re having to patch the servers with the next, next, next part that Kevin doesn’t like. And so, you’re going to want to start automating some stuff around there, because if you get to that point, your company’s probably going to grow and the next thing you know you’re going to have 100 servers.

Carlos:             Yeah, let’s hope, right? That’s a tough question, because we work with a number of people that don’t have staff dedicated to the database, when they have less than 3 instances. But then all of a sudden you get to 5, that’s probably the point where they’re maybe an organization’s looking for somebody. But I think, again, because of the things we’ve talked about on this program, you probably want to start automating as quickly as possible, because the business is going to start looking to you to provide other value, beyond just the administration pieces, whether that’s reporting or, you know, what have you, analytics.

Tracy:              Architecting, writing, stored procedures, you’re everything at that point, when you’re the only database person.

Carlos:             Yeah, exactly, until you get up to that, again I don’t know what that number would be, but all of a sudden you get 30, 35, then all of a sudden they’re like, “okay, maybe you do have enough to just focus on that.” I can see that being a gray area for a lot of folks, but I think we’d all be pro-automation.

Tracy:              Yes.

Kevin:              Aside from automation, what would you say is the biggest difference between the relatively small number of servers into the managing large numbers of servers?

Tracy:              Not quite sure I understand that question.

Kevin:              We’ve gone through it several times, “hey, you’re managing a lot of servers, automation is your friend.” If you’re only managing one or two, yeah, you can get away with not automating things. So aside from automation, what’s the big difference?

Eugene:           You’re asking about in attitude or workload or what? Cause it sounds like the whole thing of the podcast is automation is the difference.

Kevin:              Oh, I mean in the most general possible sense. Just saying that there are differences, it’s not just you have more servers.

Eugene:           So maybe what might surprise people about having to manage so many?

Kevin:              Yeah, let’s go with that.

Tracy:              The number of phone calls you’re going to get, probably, with problems that you’re not going to be able to solve unless you have some automation in place and some performance data to look at and things like that. The number of pages you’re going to get with problems.

Eugene:           The number of on-call selfies that you have to take.

Tracy:              Hey, I don’t take those anymore, but I might be in two weeks.

Carlos:             There you go, hint, hint, hint. Yeah, I know that just for my own self, while the number of servers that I manage is now spread out over multiple customers, the ability to retain all of that in my head, I’ve reached a saturation point, so I think this idea of just having to come up with that way of giving yourself the ability to look at history is probably one area that might be a little surprising, because you can’t always, “oh yeah, that’s right”. You have to be like, “ah, I haven’t looked at this server in,” whatever number of weeks. I can’t remember what I did last. Where is that information?

Kevin:              Yeah, that’s a good point, that need for metadata. I think also the need for some sort of knowledge of synchronization. I know that I’ve deployed these things to my 150 instances, or my 1200 instances. So, now that I know I’ve deployed them, how many did I really deploy it to? How many of them had some weird error that oops, I kind of smothered it in a catch box someplace? And how do I figure out which servers those are and roll them out when my change is not necessarily a blatantly obvious one, or if it’s hard to test?

Carlos:             Yeah, a special case?

Kevin:              Yeah, cause if I have a three percent failure rate in 1200 instances, hey, guess what, every time I run this thing, I’m going to have some servers fail.

Tracy:              Yeah, and you can write a check behind you for stuff like that, it’s unfortunate that you have to do that. But that’s just something you’ve got to consider as well, so you’ve got to write a check to make sure what you did actually worked.

Kevin:              Yeah, yeah, agreed. What do you think of the idea of statistical sampling for administration purposes?

Carlos:             Of course the analyst would be asking that question.

Kevin:              Well, like for example, CHECKDB, if you have a large number of very large instances and not necessarily able to run CHECKDB on every database on every instance because you don’t want to blow out your SAN.

Tracy:              Yeah, you can take and restore it to another system and do that, or you can just run– the way we did it at ChannelAdvisor is we only ran one a day and went through them that way, and we ran them during the off-peak hours instead of just running all the script on the box. We ran a wrapper around it, so it would only run one and it picked the oldest one that it hadn’t ran it against and ran it, and then we recorded into the table the last time it finished running. So, there’s different ways to do that, and like I say, you can restore your backup somewhere else and run it. We did have a process for doing that against one of our environments where it would restore and run CHECKDB over there as well, an automated process for that. We did physical onlys in prod and then we did full CHECKDBs on the other environment.

Carlos:             Yeah, so I guess that then lends itself to just another idea of you may have to think of different ways of doing traditional tasks. That’s an example of doing physical only database checks in production and then where am I going to search the data out?

Tracy:              That’s a whole podcast for what you do when you have very large databases.

Carlos:             That’s right, that’s right. Which at one time I tried to reach out to Brian to have that conversation and he blew me off. I think we ended up talking about AWS or something, instead, but you know, that is something we want to come back to. Okay, awesome, so I guess we’re going to have links, again, Tracy’s mentioned a couple of the projects and things that she’s put together to help her manage some of these. We’ll have those on our show notes. And we’ll make sure that all gets connected, but good stuff. Good stuff, today. Shall we do SQL Family?

Tracy:              Yeah.

Carlos:             All-time favorite movie?

Tracy:              Okay, I’m a kid at heart, so don’t laugh.

Kevin:              I’m already laughing.

Carlos:             Yeah, I was going to say, “too late”.

Kevin:              Any time somebody says don’t laugh, I just start laughing.

Tracy:              I’m going to go with Disney’s Aladdin.

Carlos:             There we go.

Tracy:              I’m a big Robin Williams fan, so I could pick anything that he does, but Disney’s Aladdin’s like my favorite movie.

Kevin:              It was definitely better than Mrs. Doubtfire.

Tracy:              I wouldn’t pick that one as my favorite one. But yeah, Disney’s Aladdin. I like all the Disney movies, but I like all Robin Williams movies, as well.

Carlos:             There you go, yeah, he’s definitely a funny guy. City or place that you most want to visit?

Tracy:              I’m going to go with Paris, France. And if my parents ever hear this, yes, I know I was there before I was six, but I don’t remember it.

Carlos:             There you go. Yes.

Tracy:              My dad would insist on saying that.

Carlos:             Yes, the only thing that I dislike about going to Paris is all the weight that I would gain from the pastries that I would eat.

Kevin:              Oh, and the Charles DeGaul airport.

Carlos:             Do they have good pastry shops there? I don’t remember.

Kevin:              No, but it’s an awful airport. It’s a reason that you wouldn’t want to go.

Carlos:             Oh, there you go, okay.

Angela:           Yeah, well, if you do go, do not try and cross the Champs-Élysées by yourself.

Kevin:              As you’re stuck in the third lane of a 12-lane roundabout.

Angela:           Yes.

Carlos:             Yes, yeah, that’s right. Pro tip there, there’s actually a bridge underneath. You go underneath, not on the street.

Angela:           Yeah, don’t cross on the street.

Carlos:             Don’t cross on the street.

Angela:           They will run you over in a heartbeat.

Carlos:             Yes, they already know, you’re in the street, tourist! You know, must be an American. Anyway, no offense, Julien, of course.

Tracy:              Maybe I should change my answer. No.

Carlos:             My French editor, there. Okay, here we go, food that reminds you of your childhood?

Tracy:              Peanut butter and jelly sandwich.

Carlos:             Very nice, classic. Now, I have to ask, what type of jelly are you using on said peanut butter and jelly?

Tracy:              Strawberry.

Carlos:             Very nice.

Kevin:              What kind of peanut butter?

Angela:           With seeds or without?

Tracy:              Peter Pan Crunchy.

Angela:           Well, yeah, is there any other kind?

Kevin:              Yeah, there’s Peter Pan Creamy.

Angela:           No.

Carlos:             Yes.

Angela:           Why even bother?

Kevin:              Reese’s Peanut Butter was great.

Carlos:             Crunchy is like have they finished it? It’s like they’re selling me an unfinished product.

Kevin:              They’re making me do half the work.

Carlos:             That’s right, you know?

Kevin:              Reese’s Peanut Butter was really good, though.

Carlos:             Oh, that was a spread?

Kevin:              Yep, they had it in a jar. It’s in an orange jar, or an orange lid.

Carlos:             I don’t know that I’ve ever tried that. That may be something I have to go look up. Okay, here we go, Tracy, remind us how you got first started with SQL Server?

Tracy:              Well, once upon a time, they wanted some reports out of the system that was on SQL Server and they told me to set it up in Access, so I linked them all into Access and I did some reports. And then I figured out they were running slow, and then I figured out they were running slow because it was on SQL Server and didn’t have indexes to support my reports. Then I started messing around with SQL Server and I did some bad things to SQL Server. Then I learned to do better things with SQL Server.

Carlos:             There you go. Yes, I’m sure you’re very sorry about those things you did. But now you’re making everything all right with all of this new stuff that you’re making available to the community. Now, if you could change one thing about SQL Server, what would it be?

Tracy:              I’m going to go with change the cost threshold for parallelism. I’m tired of seeing it at five. That’s like from 1995.

Carlos:             There you go.

Tracy:              Every job that I went to, that’s one of the first things you have to change, now, because everybody has it still set to five.

Carlos:             Still set to five, yeah.

Tracy:              And then they wonder why things are running like crap.

Carlos:             Yeah, that is interesting. They talk about backwards compatibility, but I kind of wonder why that would affect backwards compatibility, but I suppose if you are running a server from 1995, then I guess I could see how that’d be a problem.

Tracy:              I hope no one still has those servers left.

Angela:           Well, those servers cost a lot of money in 1995. You don’t want to just throw them away.

Tracy:              True.

Kevin:              So how would you configure?

Tracy:              Start off with a value of at least 25 and then go from there.

Carlos:             Yeah, I know one of the guys that works with Guy Glantser put together a script that he came up with. It’s one of those, you have to know the data, basically is the answer. He came up with a couple of things you could look at that might help you indicate which queries were parallelizing and again, I didn’t feel like it was cut and dried, necessarily, but you still kind of had to go with a gut feel. Like, “okay, I feel like this is my point.”

Tracy:              Yeah, it’s still a guessing game as to where to put it, but I mean, five’s way too low.

Carlos:             Best piece of career advice you’ve received?

Tracy:              “No one is ever going to look after your best interests like yourself.”

Kevin:              Looking out for number one.  I like it.

Tracy:              I just got it this weekend, too, so it’s kind of —

Carlos:             There you go, taking that to heart. Tracy, our last question for you today. If you could have one superhero power, what would it be and why do you want it?

Tracy:              Teleportation. It would make it so much easier, not only to get to SQLSaturdays, but more importantly, to go visit my foster kids.

Angela:           Very nice.

Carlos:             There you go. I like it. Well, Tracy, thanks so much for being on the program today.

Tracy:              Alright, thank you.

Angela:           Thanks, Tracy.

Carlos:             Once again, special thanks to Tracy for joining us today. Do want to give a special shout-out to an organization she advocates for, CasaforChildren.org, it’s a national organization advocating for the abused and neglected foster children in court. So, if that’s something that you’re interested in participating in, obviously you can reach out to Tracy for additional information, or if you’re considering yourself donating some of your time for a worthy cause, it’s definitely a worth-while organization.

One of my takeaways here is that there are lots of different ways to skin the cat, here. Everybody’s going to have a slightly different model, and you could be doing some things differently that we didn’t talk about today, and that’s okay, We probably didn’t check all the boxes off the checklist, but regardless, some ideas about how to tackle some of this, challenges that you might be facing, and of course every environment’s a little bit different. I think, compañeros, that’s going to do it for today’s episode. As always, reach out to me on social media. I’d like to hear what you want to talk about, what you think we should be discussing. You can reach me on LinkedIn at Carlos L Chacon, and I’ll see you on the SQL Trail.

Leave a Reply

Back to top