Episode 15: Working at a University

1400Corporate culture is a BIG deal when it comes to job satisfaction, and it is interesting to see how different sectors can be.  Today we talk with Doug Purnell, DBA at Elon College in North Carolina and Jamie Wick, DBA at William and Mary in Virginia about their experiences working in the university system.

Check it out compañeros, and have fun on the SQL trail.

Transcription: Working at a University
Carlos Chacon: This is a “SQL Data Partners” podcast. My name is Carlos L. Chacon, your host. This is Episode 15. Compañeros, today we’re going to take a look at the life of two DBAs that work at a university, and get their experiences.  The release date for this episode is October 28th of 2015. If you’re in the Seattle area, I hope that you will find me at the PASS Summit. We have ribbons this year we’ll be passing out. I would enjoy chatting with you.

As always, if you enjoy the show, I invite you to subscribe to the podcast. We are looking for feedback. You can reach out to me at [email protected], or on Twitter @CarlosLChacon. If you are looking to move your SQL servers to the cloud and need some help, please check out sqldatapartners.com to see how we can help you make that transition.

We do appreciate you tuning in. It’s always a pleasure to have you. Compañeros, welcome to the show.

Children: SQL Data Partners.[music]

Carlos: Compañeros, I’m super-excited to have Doug Purnell and Jamie Wick on the show today. We are talking about being a DBA at a college campus, and going to look at a little bit of the human resources or the professional development perspective of that.I have known both of them personally. They’re great folks. Doug is the DBA for Elon University. He’s been there for four years in Elon, North Carolina. Elon is near and dear to my heart. It’s where I grew up. I actually went to Elon Elementary. Doug, welcome to the show.

Doug Purnell: Great, great to be here.

Carlos: Jamie has worked at the University of William and Mary in Williamsburg for the last three years. He spent some time with us in the Richmond SQL Server Users Group, and is instrumental lot of times in our SQL Saturday that we have in March. He’s also spent some time in the University of Australia, so we may touch on that a little bit as well. Jamie, welcome to the show.

Jamie Wick: Thanks.

Carlos: Question, right? Ultimately, what led you to pursue a path in the university system? Doug, let’s start with you.

Doug: For me, it was something I’d never done before. I was working for a public automotive company. We had the Sarbanes-Oxley, all the K10S, all that regulation. A friend of mine, still works there as a sysadmin. I heard a little bit about the change of pace, a little bit different. It’s all about the students and the professors and helping them succeed, instead of the shareholders.For me it was a great opportunity. I would be the lone DBA, the only DBA there, and look forward to helping them with their data needs. They didn’t have one before.

Carlos: This almost goes back to that culture, the culture of the organization, purposes and motives behind why it is you’re there.What about you, Jamie? What drew you to the university perspective?

Jamie: I guess a series of steps, really. I started out as a desktop support tech at Oregon State University. Started there just because I was wanting to get into IT. It was where I did my undergrad. Then when I went to the University of Queensland, then I actually got into being a DBA.Once I’d made the step from one university to the next university, and I really liked the university environment, the learning, the emphasis on learning. When I came back to the States, I was looking around and thought about going private industry, and decided that I just liked sticking with university work.

Carlos: That’s something we’ve talked a little bit about, branding, on another show. We had Steve Jones on, talked about branding and building a name for yourself, or a brand. Is there a brand of being a DBA in the university system? Do you have opportunities to network with other folks that are in the college system?

Doug: A little for me with the ERP system we use, Colleague. It’s from a company called Ellucian. There’s a yearly conference that I help run in Myrtle Beach. It’s actually coming up in November. A lot of those shops came from a different database platform. They’re moving to SQL server. A lot of schools don’t have full time DBAs. The Colleague environment is usually meant for small community college programs.I’m lucky in the sense that it’s an opportunity for me to help train other university staff on the way SQL server can benefit the organization. From that viewpoint, I guess it’s more of a branding thing. I can help out in that way.

Carlos: Jamie, you had mentioned going from university to university. Is there a benefit with William and Mary, which is your current employer, to already have worked in the university setting?

Jamie: Getting hired on at a university in most IT, I’ve done desktop. I’ve done Windows Server and DBA in pretty much any university that I worked at. There is a…I shouldn’t call it a preference, but an inclination towards hiring people that worked at other universities, because the university environment is very different from higher government type organizations and/or private industry in that you’ve got budgetary restrictions.The universities try and foster the cooperative, interactive learning. You get a lot of anything, any request tries to be met. If a professor comes up with an idea of, “I want to do something that relies on a database,” say, teaching something, researching something, and you need a database, there will be an effort made to try and come up with something to make that work.

Carlos: Support them in their effort there.

Jamie: Right. You don’t get a “No” unless it can’t be done for a financial or policy reasons. Whereas, when I worked in private industry, it was more of a, “You don’t get to do it, unless you can show a financial benefit,” or in government, “You don’t get to do it unless you can show it’s not a cost to the government,” or something like that.It is a different way of looking at things. Because of that, the universities, they do tend to hire people that are more familiar with it. If you come from an area that’s very structured, very rigid, that’s a lot harder to make the change into the university environment.

Carlos: The transition to get into that culture. We’ve talked a little bit about the culture’s a little bit slower-paced. It’s not so hectic, it’s focused on the student. There are some restrictions, however, with the budgets.I know you guys go to conferences. You’re well-connected in the community, so you hear all the people talking about new technologies, different things. Do you feel like you get to play or have, from a professional development or a skill development perspective, you feel like you have the resources that you need to be effective and continue to be competitive?

Doug: I do, I find it very similar to other jobs, though. I think you make your own opportunity in that area. I think having friends with a sysadmin and the SAM guys, you bring them doughnuts, whatever it is. Doughnuts in my case. Shout out to NC Jelly Donuts, by the way.You foster that relationship. Spinning up a VM for, say, the CTP release of SQL 16, or trying out availability groups and spinning up some VMs. That, I think, helps. You have to have that mindset.

Back early in my DBA career, in the SQL 2000 days, I didn’t have that same mindset. I stayed on SQL 2000 even through the 2005 release. That urgency wasn’t there. Now as I’ve gotten older, you have to stay ahead or stay up to date. It’s hard, but I make that a priority.

Jamie: I agree with what Doug says. A lot of it is, the access to training really varies university to university. I’ve worked for universities that had no training budget. You did not get to go anywhere. At best, they would buy you a book from Amazon. You read that, and that was your training.I’ve also worked other places that would fly you halfway around the country for a weekend conference. A lot of it is how the university’s doing for a budget. It can change year to year. You can get a great training one year, and not get any training for the next three.

Carlos: A little bit of feast or famine there.

Jamie: Yeah. I guess as far as what you get to play with, a lot of that has to do with what your university’s licensed for. If you got a pretty good Microsoft agreement where you’ve got access to things, if you’ve got the time and the hardware available, you can spin up test machines and learn what the new features are.That’s a lot of what I do, is trying to be proactive in looking at the new features of the SQL release, then taking that to the management or letting the information be more disseminated around, saying, “Hey, there’s this new feature that could make our situation better by using backup encryption,” for example. “It gives us better security. It gives us better everything.”

But, being able to proactively do that, because otherwise it’s reactive. You don’t do anything for [inaudible 10:59] .

Carlos: Some of that revolves around the licensing that the universities get. They get a little bit of a breakthrough. This is not a show about licensing, but I guess you can explain to us a little bit about how that works.

Doug: For my understanding, we have an enterprise agreement. It includes software assurance. “We got you signed up for premier support in addition to that.” That gives you some other benefits from some hours.

Carlos: Access to things.

Doug: Access, yeah. But from the licensing standpoint for SQL server, it feels like a site license, but we’re licensed for a certain number of people and resources, but it’s enterprise. I can spin up enterprise when I need to.During the true-up period, we go through that and make sure that everything’s in line, but it’s great. The discounts are substantial. The 7,000 per core for enterprise is out there. But, in education world, it’s a lot less than that. A lot less. It is nice.

Carlos: Colleges tend to be localized. Sports teams, everybody follows their favorite college and whatnot. You’re at parties, you’re at your in-laws’ house, maybe that’s a better example. Then you talk, and you’re like, “Oh, I work for the university.” Any prestige working in the university there?

Jamie: Depends on the university, I think.

Carlos: [laughs] The university? OK, interesting. I wasn’t sure because if I could say, you work for Elon, private college. Do you get a bump, or are you like, “Oh, OK. You…”

Doug: In North Carolina, some people, “Oh Elon, they have a really nice campus.” The landscape crew does an amazing job. They’re working on their persona in North Carolina and trying to attract some students. You run into an alumni or something like that, it’s great.I ran into one last week. I wore my Elon hat. He was a grad. It was cool to talk with him about things. There’s a little of that, but it’s not like Michigan, and University of Florida, where it’s football, sports, that kind of stuff. With Elon, it’s more about the education, I think, and the campus. It’s cool, I’m proud. It’s a great place to work, so no complaints.

Carlos: Not that either of you two are looking at moving, but the downsides of working at a college? We talked about budget already.

Jamie: Budget’s the big one, pay freezes. A lot of it also depends private college versus public college.

Carlos: I guess you’re not managed by the state legislature.

Doug: No, it’s all private. Budget-wise, it’s OK. It reminded me a lot of a traditional IT shop. You have to submit for training. For example, the PASS Summit I go every two years. I’m a co-leader for the BI User Group in Winston-Salem, so I get the free registration. The other co-leader, we switch off every year. That helps save some money.It feels a lot like a traditional IT shop. It’s run like a traditional IT shop, which is nice, so I’m used to that.

Jamie: See I’m on the other side of it where we’re public, and so we’re actually controlled by the state legislature.

Carlos: You’re still considered a state employee?

Jamie: Yes. The Board of Visitors for the college has a certain amount of discretion in how they run the university, but technically, we’re still classified as government employees. Because of that, when it comes to hiring practices, they’ve got the whole tiered structure of, you get hired in at this level, and you have to stay within the high-medium-low rankings.When government decides that there’s no pay increases, or a hiring freeze, or whatever, we often get hit with that kind of stuff.

Carlos: Now, I think you’re the only DBAs. It’s a little harder to maybe compare with other people. I know the state level, pretty much all across the board, you either get a three percent increase or whatever a lot of times, or you get nothing. They have other incentives for you too like that?

Doug: Yes. There’s a small percentage bump each year for salary. Some of the other benefits, for example, you can audit some undergrad classes. I can get my MBA for free there, so I can sign up for the classes and do that. They’ll give you time away from work to take some classes within reason. That’s neat.There’s also a tuition exchange program that Elon’s a part of. I believe there’s other hundreds of universities that take advantage of it. I don’t know if William and Mary’s on that list. But if there’s, for example, someone at University of North Carolina is interested in coming to Elon, and someone at Elon wants to go there, there’s some exchange that can happen, so you can save some money for your children.

There’s some benefits there, plus all the facilities around campus. The sporting events are free. The cultural events are free, so it’s nice. There’s some things you can take advantage of.

Carlos: Is it the same way in the public side?

Jamie: Not quite that nice.[laughter]

Jamie: No. Depending on the university, because it’s actually been different at every public university I worked at. Sometimes you get tuition remission, or free tuition, whatever you want to call it. Sometimes you get subsidized tuition. Some places I’ve worked, you don’t get anything. It really varies.At William and Mary, they do let us take one class per semester tuition free. If you want to do an actual degree program, though, you have to qualify for those degree requirements. If you want to get your Master’s, there’s a requirement that you do your first six months on campus in a certain order. You actually have to take a leave of absence from work to be able to do the first couple semesters. It can cut both ways.

Carlos: Jamie, you worked at a couple of different places. Let’s say that someone wants you. They hear this, and they’re like, “Hey, being a DBA at a college sounds like something that would be interesting for me.” What kind of questions should they be asking in an interview to decide that the culture is right for them?

Jamie: Culture questions at university? Getting a feel for the group dynamic that you’re interviewing with is the big one, at least from my perspective. Most of the groups that I’ve interviewed with, whether it’s a department level, a faculty level, or a university level position, you’re going to be working with a certain core group.You need to find that out, because it’s like working with any group where some of them are very, very rigid, and some of them are very relaxed. I know some IT groups that they’ve got work from home options, where you can work from home one or two days a week. Other ones where it’s, “No, you’re in at your desk in a cubicle eight to five.” Then any outage times.

Just learning what the co-workers you’ll have in your group do. How they work and seeing if that works for you, because that’s the fastest way in and out of a group in a university is if you don’t mesh with their dynamic, you’ll get out pretty fast one way or another.

Carlos: They’re going to push you out. Now, both of you have started speaking at different events. We’re here, we’re at the SQL Saturday in Raleigh, North Carolina. Any points for starting to present at some of these, opportunities to take that back into the team? Doug, you’re the lone wolf, if you will, so maybe a little bit harder.

Doug: Yeah. I got my start from attending the DBA user group in Greensboro. Kevin Good was the leader at that point. It was pretty much, put me on the spot and said, “So when you speaking next?” I said, “Well, I’ve never spoken before.” I gave him a four-month “I’ll do it in the summer.”

Carlos: It seems like forever.

Doug: I know. All that time, I was like, “What am I going to speak on?” He was, “Something simple like, what kind of a small project have you done? Did you learn anything new?” I just did a general Tips and Tricks on DBA presentation at the local level. Then with that bug, I took it back to work and did some brown bag lunches, like a Lunch and Learn. That helped a lot.Then I believe my first SQL Saturday was here in Raleigh last year. I’ve done D.C., I’ve done Chattanooga.

Carlos: Richmond.

Doug: Richmond. It’s fun. There’s plenty of opportunity. I enjoy it. It’s giving back. You’d learn a ton developing your session. I go over the slides over and over again. I research, and I research, and I research because you want to get the right content, you want to be able to answer some questions and give the attendees some useful resources. A lot of work goes into it, but it’s great. It’s fun.

Carlos: Jamie, your team is a little bit bigger. Any options to take that back into the house and share with some of them?

Jamie: The team might be big, but when you get down to it, there’s myself and one other guy that do the SQL servers.

Carlos: I see. The audience is pretty small there. [laughs]

Jamie: Yeah. There’s no opportunity to take it back, but caveat on that is when I started getting into the presentation stuff was a couple of years ago at PASS Summit. I did the networking dinner. That was the biggest mistake I’ve ever made, because I ended up at a table with John Morehouse and Mike Hillwig. They’re very involved.They’re sitting there looking at me going, “Why haven’t you gotten involved yet?” They’re both great guys. I like them both. They definitely did a little bit of spiel promo towards getting involved.

Carlos: Twisting your arm a little bit there?

Jamie: Yeah. I got involved with Richmond, and decided to do a presentation and go from there. My bringing back is more presenting at SQL Saturday and at the user group, and getting more from that direction because I don’t have people to interact with so much at work. If I’m going to keep learning, I’ve got to get with the people that do this.

Doug: That’s the biggest part of being the only DBA at Elon is I don’t have…

Carlos: The escalation path. [laughs]

Doug: The other day, I implemented a change data capture for one of the databases in one of the developers. It had a specific need of tracking changes for a table. It was great. I was, “It’d be nice to talk this through with somebody, and whiteboard it out.” I talked to my manager, I talked with the developers, and they just wanted to see the changes in the table. They didn’t really care.

Carlos: They just want the result. That’s all they care about. [laughs]

Doug: I used the local user groups, and SQL Saturday is my networking ability to talk in front of some PEAT folks, and use that opportunity to help with their DBAs. That’s my outlet. That’s why I do it. It’s fun.

Jamie: One thing I’ll say on that is I’ve got a similar problem. The one that helps me the most is the good old SQL help on Twitter.

Doug: Yeah, I’ve seen you use that.

Carlos: You use that quite a bit.

Jamie: Between Doug and some of the other guys around here. When I post something on Twitter, they’ll comment. But if I’m really stuck, that’s really the way I have to go.

Carlos: You mentioned talking to your manager. From a review perspective, your manager’s technical. Are they over the group?

Doug: My manager, she’s also a developer. She does some coding in the group as well. She understands a lot about databases, which is great, so we can talk database design and that kind of stuff. It’s good.In review process, I used, like we talked about, the drive for learning new stuff helps in the review process. It’s constantly showing learning new things, or introducing new concepts, change data capture for example, to the group. It wasn’t there before. It definitely drives me.

Jamie: I guess, from my standpoint, my direct supervisor is a great guy. He is definitely not a database guy, though. He’s more into security and some things like that. As long as the SQL servers are running, and running well, he’ll do whatever he can to support me to make sure that I’ve got the time and whatever I need to do what I do. He doesn’t get too involved with what I do.

Carlos: Thanks for talking with us. It’s always interesting to see the other side. Sometimes you tend to think that the field’s always greener on the other side, so to chat with you guys about your experience.Before we let you go, we have a couple of standard questions we want to ask. We like to try to provide value to some folks, so we want to know what your favorite SQL tool is. It can be paid tool, a free tool, but why do you use it and why do you like it, or how do you use and why you like it?

Doug: For me, it’s SSIS. Automation is key, especially I’m the only one there, so anything I can do to find a repetitive task and automate it. We do a lot of data feeds to third parties or internal movement of data. I grew up on DTS and SQL 2000, and gone up from there. It’s a great automation tool. The logging it has, the ability of sending out emails, whatever it is, it’s a great tool. That’s my go to tool for automation.

Jamie: I don’t know if you’d call it a tool or not, but I have to say that the thing that I get the most value out of is probably PowerShell. Just being able to make it interact with jobs to interact with active directory for doing stuff to SQL, and handing off data to other systems, and harvesting data from other systems. That’s probably my go-to thing.

Carlos: That’s interesting, which I hadn’t recognized. You’re the first group that hasn’t provided a third-party tool. We’ve talked about the Microsoft licensing. Do you have the opportunity to purchase these other third-party tools, or is it pretty slim pickings?

Doug: Pragmatic Works has a great add-on to SSIS, some extra components. We use that for some secure FTP. What else? We have Redgate SQL Compare for schema comparisons, that’s good. There’s some money, but it’s not…

Carlos: You haven’t used the tools sets in what will become the SQL server.

Doug: Yeah.

Carlos: That’s good. We do have one last question. Before we do that, let’s hear about another way folks can learn about SQL server.[background music]

Announcer: Hello there, compañeros. I want to tell you about a unique SQL server training opportunity that is unlike anything you’ve encountered. As a listener of this podcast, you’re entitled to a special offer. SQL Cruise is a premier training experience set in the most unlikely of places, a cruise ship.Tim and Amy Ford have created a wonderful culture on SQL Cruise. With five or six technical leads from various industry sectors, you and roughly 20 other students will talk shop in classes while at sea.

While you are in port, you can enjoy the wonderful world we live in either by yourself or with the group. The beauty of SQL Cruise is the unique relationships you will form with other SQL cruisers.

Worried about leaving your spouse? Bring them along. I did, and she had a great time with the group. In fact, I have been on SQL Cruise twice. I’m looking to go a third time. You may be wondering if this is serious training. I can assure you, it is as good as any conference you will attend on land. It never crossed my mind that I could be talking about SQL server with several Microsoft MVPs on the beaches of St. John’s.

I know you must have other questions, so I suggest you head over to sqlcruise.com and check out the site. Once you are satisfied and you want to sign up, when you get to the question, would you be so kind to share the name of who referred you and what source material led you to us? You enter “SQL Data Partners,” and Tim will ensure you get $100 off the training registration.

This is a great offer. I hope this $100 savings will entice you to check out the site. I did and went on the Caribbean cruise and had a great experience. I know you will, too. Maybe I’ll see you on board, so head on over to sqlcruise.com and use the code “SQL Data Partners” to save $100.

Carlos: Our last question of the day, if you could have one superhero power, what would it be, and why would you want it? Doug.

Doug: What is it, teleportation? I think just to be able to bounce from place to place at speed of light, just boom, there. It would be cool.

Carlos: I agree. That would be very, very fancy. Jamie?

Jamie: You know what I do for a living, I would have to say somewhere along the lines of being, what is it, omniscient or something. Not having the stress of banging my head against a wall, figuring out why something doesn’t work. Just knowing what the problem was.

Carlos: There you go. Knowing all the answers. Don’t you think things are going to get a little boring after a while?

Jamie: I can make a lot of money.[laughter]

Jamie: I can take a lot of time off work.

Carlos: There you go. Very good.Carlos: Well said. Doug, Jamie, thanks again for being on the show. We really appreciate it. Compañeros, we’ll see you on the SQL Trail.

Children: SQL Data Partners.

Episode 14: SQL Server Clustering

1400Clustering is a now tried and true technology to help increase the availability of your SQL Server environment.  I talk with Paul Popovich about his experiences with clustering and how his approach has changed as the technology has evolved.  If you are looking for a good introduction to some concepts of clustering, this episode is for you!

It should be noted this episode does not include the new feature of Availability Groups.

Show Notes

SIOS Data Keeper
Denny’s blog post software assurance
RSAT – remote system admin tools
sp_whoisactive sp_blitz


Transcription: SQL Server Clustering

Carlos L. Chacon: This is the “SQL Data Partners” podcast, my name is Carlos L Chacon, your host and this is Episode 14.

Today, we are talking about clustering, with Paul Popovich. Just to make sure that we’re on the same page, this is the clustering that has been around for a while. I think at least 2005, it was available. This is not the availability groups that have become a little bit more popular in SQL Server 2012. Clustering is our topic today, and we hope that you’ll listen in to the session today.

For those of you who are headed to Seattle, that will just be in another week, probably another couple of days you guys will be heading out. We will have ribbons available this year that will say SQL Compañeros. I hope we can connect at Summit. Be sure to get your ribbon from me, and it would be nice to chat with you.

As always, if you have feedback or if we missed something in today’s session, maybe it wasn’t quite right, we welcome that feedback. You can hit me up at [email protected] or on Twitter, @CarlosLChacon.

Super excited that I was able to be here, in Pittsburgh, and that’s where I ran into Paul. I’m excited to have him on the show today. As always, compañeros, welcome to the show.

Children: SQL Data Partners.[music]

Carlos: Paul, welcome to the program. It’s nice to have you. How are you doing today?

Paul Popovich: Hi, Carlos. I’m doing well. Thanks for having me on.

Carlos: Not a problem. Now, remind me. Where are you working for and what are you doing these days?

Paul: I recently started a new job at a large northeast Ohio hospital system, University Hospitals, with a title, for whatever that’s worth, of Senior SQL Server Database Administrator, or Senior SQL DBA.

Carlos: OK. Very good. I take it you have a handful of clusters over there in your environment.

Paul: Yeah, we have somewhere in the 20 to 30 range of clustered instances, and a grand total of 300 plus SQL instances, in general, with mirroring, replication, clustering. Then, as the university chain grows, we end up buying hospitals and having to absorb 50, 60, 70, who knows how many instances per hospital, into our collection.It’s an ever growing thing of cleanup, standardizing, best practices, getting our scripts into all the instances. We have an offshoot of Ola’s maintenance to do all of the things and all of our backup jobs and standardizations, best practices, stuff like that.

Carlos: Yes. Thank goodness for Ola’s scripts. Right?

Paul: We love Ola.

Carlos: That’s right, Ola, if you’re listening, you’re still welcome to come on to the show.

Paul: [laughs]

Carlos: I’d like to come back to that actually. That idea of replication or mirroring outside of clustering, or I guess on top of.Let’s make sure we come back to that at the end, but ultimately today we are talking about clustering. Why do we want to cluster? Why would we want to build a cluster?

Paul: Well, the main argument that I had to my boss when I built my first cluster was, “I like sleep.”When you have single standing SQL Server instances, if things go bump in the night and it doesn’t come back up and it’s not there and you have all sorts of things go wrong. Having a cluster allows you to take the hardware aspect out and let the SQL instance fail over from one node to another.

That way, when the motherboard fries or whatever may happen, say you lose power in a rack or something really silly happens. SQL just starts back up on the other node, goes through its check recovery on start-up, and your instance is back up in a few minutes.

Carlos: You’re off and running. Yes, we should note that clustering is more about the operating system or the server components, as far as being available?

Paul: Yeah, you’re trying to take away hardware failure. Because the SQL instance is only alive on any single node, in the cluster, at any single point in time, so it provides a higher level of availability, I like to say.

Carlos: Right, so the idea is you have two operating systems, two servers, each of those servers connected…

Paul: Ideally identical.

Carlos: Ideally identical, yes, you know at least they have to have the same operating system. That I do know.

Paul: You don’t have to have the same hardware, but don’t cheap out on your secondaries, because then if you ever have to use it, your users are not going to be happy that their performance is halved.

Carlos: No, that’s right, even I would say from a patching perspective. A lot of times we’ve seen nodes just swing over and stay there to help with that patching perspective. They should be the same.

Paul: Same patch levels. Should be the same hardware, buy in pairs. Or if you’re buying multiple boxes buy whatever you’re buying all at the same time. Build it all up at the same time, same OS patching.

Carlos: But they have a shared repository so there is only one database on these servers.

Paul: Yes. You need shared storage for this. And actually that’s half kind of a lie because if you buy some tool called SIOS DataKeeper you can get away with not needing shared storage because it does block level replication for you.But, this is not an ad or anything for SIOS, I just want to make it aware that if you want to do some crazy things in Azure, you need SIOS DataKeeper to make a cluster in Azure. That is actually supported by Microsoft and you can do all that in the cloud, since you can’t get shared storage in the cloud.

Carlos: Very interesting. I might have to chat with those SIOS folks. I did not realize, that it was an Azure ARK ready component if you will.We’ll make sure that we have that post, and some of the other information in the show notes, which you can find at sqldatapartners.com/podcast. You’ll be able to see today’s episode and other information out there.

Ultimately, about availability. That’s where we want to be able to keep those up, so in a hardware failure we can get back more quickly.

What do we need in order to build a cluster?

Paul: We’re doing this on-prem…

Carlos: On-premise.

Paul: I need minimal two boxes. You want to build them on Server 12 or Server 12 R2. In fact, I would argue Server 12 R2 because it has nice clustering enhancements with, “Last Man Standing”, and some waited voting. Then you need SQL12, SQL14 and you can build clusters in standard edition, which is nice.There is some licensing implications, they change things around in SQL14, to get the passive note. It used to be free. SQL8 R2 12 you could just buy. You could spend eight or ten grand on SQL Standard, put it on a box, make it a cluster, that’s great.

You didn’t need to license the other note, because you weren’t physically using it were for anything, it was just sitting there waiting. In ’14, they changed the licensing around, where you now need software insurance. Thank you, Microsoft.

Carlos: Thank you for that.We need the hardware, but what about some of the other objects that we need in order to get the actual physical cluster up and running?

Paul: If you don’t have a SAND, you’re going to need a SAND. Assuming, we’re not going the SIOS path, so you need shared storage. You can either do it over a fiber channel or you can do it over iSCSI. It all depends on what your infrastructure team is going to support.If you’re a large invested environment, that already has a lot of things up and standing, you probably already have a SAND or many SANDs with all the plumbing needed for fiber or iSCSI in place.

Carlos: Talk to your SAND administrators, right? Make friends with them and they’re going to need to give you some shared storage. You’re also going to need to talk to your network folks.

Paul: When you start to build clusters, first your Windows cluster gets an object. That’s an AD object, that’s going to go somewhere in an OU. That’s going to need an IP.When you install SQL Server, there’s four different parts of ways to think of this. You have a physical Windows Server with a name on your domain, so that has a DNS entry.

Carlos: Server 1?

Paul: Right. You have server 1 and server 2. Then you have a Windows cluster that sits between the two. That’s cluster 1. That’s a OU object and a DNS item.

Carlos: So the outside things can ping that name and depending on which one is active, it would respond.

Paul: The Windows cluster will return with an IP address just like anything else, and that’s a look-up just as you would expect. That’s 1 and 2, node 1 and node 2, on cluster 1. Then you have your SQL Server, so you’re going to have a named instance out there.It will be, “SQL\” instant cluster. The front half of the “\” is a DNS entry, so that’s a pingable object too. That needs an IP. So we have 4 IPs, minimum, to get one instance out there.

In my current job, we have an enterprise cluster with eight nodes and eight instances of SQL. We can drop, shift things around, patch, do upgrades, do whatever we want.

If you’re already running out of IPs on a certain block, or if your CISCO guy is not exactly your best friend. [laughs]People love booze. Nice bottle of Jack, or whatever your friends are into. Make friends with your SAND guy, your AD guy, and your network guy.

Carlos: Grease those wheels a little bit.

Paul: Yeah. Whatever you got to do.Once we have those objects and we talked about the active directory components that we’re actually going to need to create some objects in the active directory. You’re going to need some permissions to be able to write.

The permissions, I’ve only ever gotten it to work with domain administration, and I believe they have changed that. To make the Windows Cluster, you need to be a Domain Admin.

A lot of SQL Server DBAs will have to get everything presented to them afterwards. Their server team will stand up to servers. Their networking team will lay the ground work. The server team will build the Windows cluster itself, IP everything, set it all up and then you’ll go ahead and install SQL on it with the IPs they give you for your cluster instances.

Carlos: Now I have used it. The example I have is that the objects by default got created on a specific OU.

Paul: They do. They end up in the default computer’s OU. So you’re server 1 and server 2, cluster 1 and instance 1 will all end up in default old computers. You can move those around.That’s the way, we got around the security components. I didn’t have a domain administrator. The user I was using to install had rights to create on those default containers and then by policy, they begin to move into where they should be.

You can re-stage them, but I haven’t had a lot of…success. I don’t know if I would bother.

Carlos: I’m on the same opinion. You get someone with rights or give them a place where you can create them when you have no rights.

Paul: I would feel better about just having them just show up as you want them created. Deal with moving them around, making sure you have proper OU policies or proper rights to the OUs afterwards. Clustering will tell you through warnings if it can’t update something, there’s a cluster error log that’ll tell you some of that stuff.

Carlos: There you go. It creates its own little world there in the cluster.

Paul: Cluster manager.

Carlos: There you go, cluster manager.

Paul: It’s actually the failover cluster manager.

Carlos: There we go. I think some of the different server additions have changed the [inaudible 11:40] a little bit.

Paul: Yeah, they’ve changed the name and changed the look a little bit. There’s one 2003 cluster hosting SQL 05 at UHN and the server, they’ve totally changed the look of it from ’03 to ’08, but they’ve done that with everything, so that’s to be expected.

Carlos: The only constant is changed, right? Ultimately, this is about availability. Now we have our cluster setup. We have all the objects that we need. How does that work from an availability perspective?We talked about that only one server is going to see the disks, for example, that shared disk and then be available. How do I make that a little more available so when things go bump in the night, as you mentioned, I’d have to be woken up too?

Paul: There’s actually two ways you can go ahead and present your disks. You present the disks to both servers and clustering handles the on/off button. If you go into disk management, you’ll see them, enable them with one node and disable or offline on the other.That’s the easy traditional way, because you only have to worry about storages on a node. I know that the other node has it disabled, I don’t need to worry about that other node doing anything silly to the disks like querying it or running NTFS check line and nothing. It’s on or it’s off.

Carlos: The SQL services are stopped on that other node.

Paul: The SQL services on the other node are stopped, so everything…even if you wanted to turn them on, they couldn’t because there’s no drive, so life is easy.

Carlos: [laughs] Don’t do that, I have…should I admit this? I have made the mistake of going into a cluster and thinking, “Hmm…”

Paul: Always know which node you’re on.

Carlos: “…The services are set to manual. That’s not right.” [laughs]

Paul: You want to handle everything, all the on/offs from inside cluster manager, because it will do all the work for you. It’s an RSAT tool snapping, you can manage them from your PC remotely, everything is great and wonderful. It is a little annoying with the way they’ve changed RSAT tools, where if you are on Windows 7, you can’t manage RSAT on server…

Carlos: There’s that word, I apologize. RSAT, what is that?

Paul: The Remote System Administration Tools. It’s for AD, DNS, failover clustering, there’s a whole bunch of other…file system resource manager.

Carlos: OK.

Paul: Those are free download from Microsoft, but if you’re on Windows 7, you can only manage Server 2008 or 2, I believe. If you’re on Windows 8, you can do 12. Windows 10, you can do I think 12 or 2, and then Server 16, which isn’t still on public availability, or CPT. Whatever it’s on nowadays.It gets annoying with having the right version of Windows and the right RSAT tools installed for the right server you want to manage.

Carlos: That breaks up this idea of…in clustering, there is a number of items that we need. We know that we’re going to need some disks available for SQL Server, but the other thing that we have is we have quorum.

Paul: Yes. Quorum is long and involved topic. To keep your cluster up, you need votes, and quorum contains votes. Each node in the cluster gets a vote.

Carlos: You have to have a majority.

Paul: To have your cluster be up, you have to have a majority. But if you have a two-node cluster, you have…

Carlos: 50-50.

Paul: …equaling votes, yes 50-50. There’s two different ways to add a third vote. It’s a file share or disk witness. Depending on your setup, it depends on the way you want to make your file share or disk witness.I guess the easiest way I could probably think to describe it is if you are doing a Windows failover cluster with a SQL instance and you’re serving up via fiber channel, I would do it via disk witness because if your fiber channel drops, you’re going to lose SQL, you’re going to lose the quorum drive. It’s all going to go away.

Carlos: SQL Server will not be available.

Paul: SQL Server will not be available, and you know you’ll have an issue. If you do SQL Server on fiber, OK, that’s great, then you do a file share witness. If your disks drop, your cluster is still going to be up because you’re going to have networking. You’re going to have a vote for Node A, and a vote of Node B, and a vote from your file share quorum witness.It all depends on the situation, what is best for you and the best way to configure it. You could talk ad nauseam just on quorum itself. But the easiest way to think of it is, it’s votes. If you don’t have enough votes, you don’t have a cluster. You have a cluster, you have problems.

Paul: That’s right. The other comment or the other idea with that file share is that in certain cases, I think in test and again I would say in other versions, which we’re not talking about today, the file share can be helpful in…the file share quorum can be helpful in test situations where you maybe don’t want to be pinging for disk space from your SAND guys or whatever. You can use that [inaudible 16:42].

Paul: That is true. There is actually another way to present disks to your cluster. Starting in SQL 14, you can use cluster shared volumes. Hypervisor has been using it for years. That’s how they do live migration, so each node in the cluster has all the disks on it mounted, which it would anyways, but they’re all live.Cluster manager tells the VM it’s living on Node A, then you do a live migration to over it in Node B. It doesn’t want to drop the disks because then your VM would go offline and then remount them on Node B. Since the disks are both alive and up on both nodes, it just does magic Microsoft stuff in the background where it just shifts the workload over.

SQL Server now has that option for failover cluster instances in 2014. I avoided it in my last job only because it was Version 1 implementation in SQL. I’m not a big fan of anything that’s Version 1 and new in SQL. I would prefer not to have to worry about those sorts of things, given that if you’re doing the traditional way of just storage on both nodes and it’s on and off. There’s really nothing to go wrong. There is just, “It’s on or it’s off.”

There’s Knowledge Base articles in Windows Updates for cluster shared volumes. I don’t want to have another thing that I have to worry about to patch in and read the release notes, and add another layer of complexity there. It just makes your life so much simpler to go, “I have storage 2 in B, it’s on A, I don’t have to worry about B. It only goes on to B when I do whatever I do, be it maintenance or a fiber failover so.”

Carlos: Yeah. Ultimately we create these clusters for availability. We talked about a scenario in patching. I can patch server 2, reboot that, and then during the maintenance…outside of the maintenance window and the maintenance window, I move it over…

Paul: Outside of the two to three minutes you should take for your SQL Server instance, and that depends on number of drives and number of databases on startup and what’s…

Carlos: The size…

Paul: …the size.

Carlos: …all that stuff.

Paul: Ideally, that makes your patching window somewhat simpler. Your downtown would have so much shorter. You patch Node 2 on whatever your schedule is, then you roll over the instance to it. Patch Node 1 because you want to keep them the same, because you don’t want them to get out of sync with patches. Then you cut your window in half.

Carlos: You’re done. That is an interesting point that we probably need to specify, because I have ran into situations where the managers are like, “We put in clustering. Our downtime should be zero.”

Paul: It’s a higher level of availability, not highly available. That is…

Carlos: Even two minutes, I’ve had my comeback even a little bit faster than that. When you failover, there will be some downtime.

Paul: When you hit the failover button, SQL stops, it moves the disks from one node to the other, and it starts up again.

Carlos: The nice thing is that it will take care of those transactions for us. If there was something that would, that it hit SQL Server that wasn’t yet committed for example, it will roll that back. It would be highly consistent database. If there’s something that is there, we’ll know that we’ll have it, so new connections won’t be available until it actually comes up…[crosstalk]

Paul: …the databases come back up and available.

Carlos: There will be a little bit of a downtime. We’re good. I think that was a nice little introduction, an overview of SQL clustering. We will put that in the show notes, links to some of the references that we’ve made today. You can check those at sqldatapartners.com/podcast and click on today’s episode.Before we let you go, Paul, we have a couple of things we want to ask you about. Here we’re always looking to add value to our listeners, we’d just like to get different opinions, different ideas about how you’re using tools to help you in the SQL Server space. What’s your favorite SQL tool? Why do you use it and what you like about it?

Paul: If you get your hand on any monitoring software, just buy it. Don’t try to roll your own. Be it Spotlight, or Idera or SQL Sentry. It makes life so much simpler. If you have to start rolling your own type of tools, go find, Brent Ozar’s tools. It’s sp_BlitzCache, sp_Blitz, ask Brent. Go find Adam Machanic’s Who is Active.For all the free stuff you can get online, the Ozar suite and Adam Machanic’s Who is Active will give you a pretty good overview or look into most things that are going on in your SQL Server and help you find some pain points pretty quick.

Carlos: Yes, and only because he was a prior guest on the podcast, Steve Stedman also has a monitoring tool that he’s been rolling for the last five years. It’s actually pretty good. I kicked the tires on that. Another free option.Very good. Paul, I know you’re here in Pittsburgh presenting today. You’ve made some friends in the SQL community, and you had some experiences in the hospital space. Ultimately, we’re looking for an experience that will help illustrate why it is you enjoy being a database professional or DBA.

Paul: The ability to take a query that runs for minutes, then to have it run for a second, two seconds, 10 seconds or whatever the reduction in query run time you can get by. Be it rewriting it, or finding the right index. That feeling is just awesome, like, “Oh, that report used to run for half an hour. Now it runs for two minutes.” That’s free. That’s 28 minutes of cycles you just opened up for whatever other work is going on. It’s just fun to me.

Carlos: Very cool. That ability to provide value, and people, they want to see those results.

Paul: Right. That’s completely quantifiable. You can’t just say, “What did you do all day?” “Well, this 34-minute query now runs for four seconds.”

Carlos: [laughs] Yes, there you go.

Paul: It’s the best feeling in the world.

Carlos: Very good. We do have one more question for you, Paul. But before we ask you that, let’s hear about one additional instance that folks have to learn about SQL Server.[music]

Carlos: Hello there, compañeros. I want to tell you about a unique SQL Server training opportunity that is unlike anything you’ve encountered. As a listener of this podcast, you’re entitled to a special offer.SQL Cruise as a premiered training experience set in the most unlikely of places — a cruise ship. Tim and Amy Ford have created a wonderful culture on SQL Cruise. With fiber 6 technical leads from various industry sectors, you and roughly 20 other students will talk shop in classes while at sea. While you are in port, you can enjoy the wonderful world we live in, either by yourself or with the group.

The beauty of SQL Cruise is the unique relationships you will form with other SQL Cruisers. Worried about leaving your spouse? Bring them along! I did and she had a great time with the group. In fact, I have been on SQL Cruise twice, and I’m looking to go a third time. You may be wondering if this is serious training, and I can assure you it is as good any conference you will attend on land.

It never crossed my mind that I can be talking about SQL Server with several Microsoft MVPs on the beaches of St. Johns. I know you must have other questions, so I suggest you head over to sqlcruise.com and check out the site.

Once you are satisfied and you want to sign up, when you get to the question, “Would you be so kind to share the name of who referred you and what source material lead you to us?” You enter SQL Data Partners, and Tim will ensure you get $100 off the training registration.

This is a great offer and I hope this $100 savings will entice you to check out the site. I did and went on the Caribbean Cruise and had a great experience. I know you will too. Maybe I’ll see you on board, so head on over to sqlcruise.com and use the code SQL Data Partners to save $100.

[background music]

Carlos: Paul, we do appreciate the conversation today. Before we let you go, we want to know if you could have one superhero power, what would it be and why do you want it?

Paul: I’d want Batman’s power, and that’s being rich.

Carlos: [laughs]

Paul: Then I could buy all the toys he has and the Batcave.

Carlos: There you go. You can overcome a lot of obstacles there, right?

Paul: Probably, you would think. I mean, you would think the Wayne Foundation would have fixed up Gotham by now with all the money he had. Why are all these super villains all around there? We got to look it from a different perspective, it really doesn’t make a lot sense.

Carlos: Money is not everything there.

Paul: There’s something missing there.

Carlos: [laughs] OK, very good. I do appreciate the conversation, Paul. Thanks for being here.

Paul: Yeah, thanks for having me, man. I appreciate it.

Carlos: Compañeros, again if you’re out in the summit… here at the end August or October, rather, 2015. We hope to see you there. I’ll see you on the SQL Trail.


Children: SQL Data Partners.

Episode 13 The Apply Operator

1400 I saw Kevin Feasel’s SQLSaturday session entitled Apply Yourself and thought it was pretty funny title.  While not close friends, I have know Kevin for a while and knew he had good stuff, so I attended the session. He there shared this quote in his session.

“If you don't understand the APPLY operator, your skills are somewhere around the 50th percentile at best.” -- Adam Machanic.

I knew immediately I wanted to have him on the show.  He introduced some pretty cool concepts. This episode will introduce some concepts, but you will have to do a bit a leg work if you want to start ‘Applying’ this into your code–I’m sorry, I couldn’t resist.

Show Notes

Kevin’s Session Notes
Kevin on Twitter
SQL Cruise

Transcription: Apply Yourself

Carlos L Chacon: This is the “SQL Data Partners” podcast. My name is Carlos L. Chacon, your host and this is episode 13. Today we’re going to talk about the APPLY operator. Something that I probably am not using enough, I happen to be in Pittsburgh listening to Kevin Feasel’s talk about the APPLY operator. He kind of made me a believer.

When I heard it I knew that I wanted to have him on the podcast. We’re going to go over that, talk a little bit about some areas in which you might be able to apply it, how you can use it better and then of course some of the benefits.

If we get something wrong, you want to fact check us, you want to hear something else about it. You can hit us up on Twitter. I’m @carloslchacon or you can email me at [email protected].

All the show notes, there’s going to be lots of examples. We’re going to be talking a little bit in the abstract. I don’t want to say the abstract, but obviously to see some of the specifics, you’re going to want to see some of the code.

We’re not going to be able to do that over the podcast, but all of that will be available. Kevin’s actually made his scripts available online and that will be available in the show notes at sqldatapartners.com/podcast.

Again, if you like what you hear we invite you to give us a rating. We’d love to get some feedback. If there’s topics for future episode, we’d like to hear those as well.

As always, Companeros, welcome to the show.

Children: SQL Data Partners.[music]

Carlos: Today, we have Kevin Feasel with us. Kevin, are you still going by database administrator?

Kevin Feasel: I am officially a database engineer.

Carlos: Engineer. I knew you had introduced yourself as something else.

Kevin: Yes, I tend to blur the lines a little bit. Basically, this is the first company that I worked for where there really is a difference between a database engineer and a database administrator.Historically, I’ve loved the hybrid roles where I’m doing some administration backups, restorations, I’m working with snapshots, I’m working with the administrative side, but also writing code and optimizing tables, optimizing queries, that kind of fun stuff on the development side.

Now, I work for a company in Durham, North Carolina called ChannelAdvisor, where we actually have a split. We have a whole administrative team. I’m on the database engineering team, it’s all development work.

The negative side is I don’t get to play with the production. The positive side is I don’t get calls at 3:00 AM, so it’s a fair trade-off.

Carlos: [laughs] Now, being in Durham, does that make you a Duke fan or…?

Kevin: No, I…

Carlos: Did they make you swear allegiance on your…?[laughter]

Kevin: No, although I did love Christian Laettner’s shot. I’m sorry Kentucky fans, you all hate me right now.

Carlos: [laughs] You had to bring that up. OK, Carolina fans either, don’t leave us just yet. I’m sure Kevin has some very nice things to say about Chapel Hill. [laughs]Today we want to talk about the APPLY operator. Now, the APPLY operator, fair warning, it’s not a basic topic. There’s just a couple of things, some vernacular terms we’re going to throw around, that you should be familiar with. If these are completely foreign to you, then the APPLY operator conversation may not be suited for you just yet.

The first one, of course, they’re very commonly…a lot of the examples that you used, Kevin, were in dynamic management functions, dynamic management views. These are the views that Microsoft has made available for administrators, to be able to see under the hood, what’s going on. The other, table-value functions. We create functions that will return logic to us.

Kevin: A results set.

Carlos: A results set to us. We also talk a little bit about common table expressions, which, a good definition…Jeff Johnson would kill me. I’m going to say, a subquery that starts with a whiff. [laughs]

Kevin: That’s a way of doing it. I like to think of it as pulling the subquery out to the top of the full query.

Carlos: There you go. Then, I’m just going to say, subqueries. Where you’re joining to another select statement. Subqueries. Each of these are some scenarios that we use, or talk about, or differentiate, a little bit, with the APPLY operator. Hopefully, that’s at least somewhat familiar to you.We’re talking about the APPLY operator. In fact, the name of your session was “The APPLY operator.”

Kevin: “Apply Yourself,” yes.

Carlos: Oh, “Apply Yourself.”

Kevin: I had to sneak in the pun somewhere.

Carlos: [laughs] When we talk about the APPLY operator, we’re ultimately talking about two keywords that we can use in our queries. The first is the OUTER APPLY, which you mentioned was much like a LEFT OUTER JOIN, and then the CROSS APPLY.

Kevin: Which is semantically similar to, but not equivalent to, an INNER JOIN.

Carlos: There you go. Ultimately, to review why we would want to be using APPLY operators. Why even approach the subject?

Kevin: Sure. I’ll start by saying that the APPLY operator’s been around since SQL Server 2005. The reason that it was introduced was to allow us to join two functions. Functions had been around in SQL server since at least 2000. I believe they were introduced in 2000 but I’ve never used a version before 2000.

Carlos: We won’t hold that against you.[laughter]

Kevin: I saw a SQL server 7 version in an environment I worked in once but I didn’t have to touch it, so that was good.

Carlos: Lucky you.

Kevin: Yeah. The APPLY operator was introduced in SQL server 2005. The reason that it was introduced was to give us the ability to take a data set and iterate over the data set and perform some type of function. The simplest used case was dynamic management functions.A great example I like to give is we have a dynamic management view called sys.dm_exec_connections. That shows all of your open connections. All of the sessions that are currently open on your instance. That dynamic management view gives us some interesting details, like when the session started, how many reads, how many writes.

It also gives us this thing called the most recent SQL plan handle. Which is a binary blob that acts like a pointer. The binary blob itself isn’t very useful to us, but we can then take that pointer and pass it in as the parameter to a dynamic management function. A DMF called dm exec SQL text.

That will return to us the actual query being run, or that has recently been run whose pointer is that pointer we passed in as the parameter. We can do that for an entire result set. If we have 100 active sessions, we can see all the queries that are being run at present against our instance.

Carlos: Super helpful if we’re looking at execution plan details and the meta data, if you will, about those planes.

Kevin: There is another dynamic management function which will show you the execution plans themselves. Yes, absolutely.

Carlos: I guess the classic scenario where you would use an APPLY operator is in the connecting of those two for the view of the function?

Kevin: Yes. Or a table and a function. Generally, a dynamic management function is involved in there somewhere. You can also use user defined, custom, table valued functions. I found that the performance ramifications of using them, even just very simple table valued functions, there’s a significant performance overhead to using a function as opposed to in lining that SQL code.

Carlos: Some of the reasons why we would want to use an APPLY operator. In what scenarios would we want to do that? One of those we just talked about, right? Joining table valued functions. With an inner join we can’t pass a parameter but with a CROSS APPLY we have the option to do that.

Kevin: Exactly. If you’re back in SQL 2000 world, you probably learned at some point that you could do an inner join to a function and as long as you have all your parameters have defaults, you could use an inner join. But as soon as you needed to pass in a parameter, inner join won’t work anymore. It’ll give you an error. That’s where the APPLY operator starts to become important.

Carlos: Another scenario is in pre-aggregated data?

Kevin: Yes. The example that I give actually comes from a Jason Straight blog post. I’m glad that he put that out. It was part of a T-SQL Tuesday on the APPLY operator. The link is in my code demos because I wanted to give him credit for it.The general idea is I want to figure out, say from the AdventureWorks data set because everybody’s got AdventureWorks, how many times has a water bottle been purchased? There’s an easy way to write this query where you can select from the sales detail table. I can join a couple of other different tables to get some important information, but then I need to get my count.

My count of records in these sales order detail table, because that will tell me how many times somebody has bought a water bottle, but if I want to get any other detail like the product sub-category name, or the product name or even if you’re looking at say the month of purchase, I now have to group the resource set.

Group by class, it’s fairly standard, very straight forward, it’s got my known aggregated attributes, and that works. That works fine. There is no problem with that. However, there is a chance that we’re not writing the query in a way that will allow it to perform most optimally. In this scenario, the query returns back something on the order of 187,520 rows. I wish I didn’t remember these things.


Kevin: I’ve blocked out all the important things from my mind, but I know these numbers exactly. I’ll show an execution plan that sees, oh, I have a nested loop join of one row, because I only have one product, my water bottle, but it shows up in the sales order detail table a 187,520 times, so my nested loop join basically on the left hand side.I’m iterating through once, but on the right hand side I’m burning through at a 187,000 times, then I have a 187,520 rows going through the stream to a stream aggregate which smashes all that back down to one row, and returns that one row to the user.

By contrast, if we can find a way aggregating those 187,520 rows earlier on, we can join one row to one row, because I know that my water bottle. I’ve got one water bottle, and when I aggregate that product up, I should only have one row.

Join one row to one row, and I can end up getting better performance because it’s less effort of the CPU, it’s not less IO, we still need to hit those same rows, we still need to pull out that same information, but there is less for the CPU to do.

Carlos: Yes. That’s [inaudible 13:31] interesting, so we have basically the performance gain by using the operator and ultimately you talked about the optimizer being able to…that’s another term, right?The “optimizer”, that thing that helps translate your query into what the database is going to do, so the optimizer will see that and be able to make some logic assumptions about what your query is going to do in order to give that back.

Kevin: Exactly. It can understand that, “Oh, I know there is only one row up here, and I know I can aggregate this information earlier on, so I join one row to one row,” so it’s exactly what’s happening. You can use the Apply Operator to do this, you can also use other techniques.You could take a sub-query, and inside the sub-query you’re doing a count operation in there, and the optimizer can see that and say, “Oh, wait. That looks like I should pre-aggregate the data here, and then join one row to one row.”

Ideally, the optimizer will take any logically equivalent version of this query including the first one that I talked about, and say, “You know if I pre-aggregate this data, I’m going to get better performance.”

Unfortunately it’s not always perfect, it has to find a solution in a very quick amount of time, and it finds one that’s good, and says, “OK, this one will work,” because we get the same results back and in many cases it is still going to perform well enough, but sometimes you need to [inaudible 15:11] out that extra bit of power.

Carlos: There you go. Using the other APPLY operator will give it the hint that it needs to help, just get a little help if you will.

Kevin: Yes, exactly.

Carlos: Now, interestingly enough for it so any time we talk about performance, whether it’s data professionals or, “Hey, you know we want something that will help perform much faster.” I guess we need a throw up of a flag of caution that the APPLY operator in this case is not necessarily always going to be the fastest scenario.

Kevin: Absolutely not. Basically, the APPLY operator is going to try to force you to do a nested loop join. Not a hundred percent of the time, but 99 percent of the time. Because…

Carlos: You can almost kind of expect that.

Kevin: Yes, because you go back to what the APPLY operator does. For each record on the left hand side on your table, we’re going to take row and perform a function. We are going to perform some type of operation and get some outputs.That lands itself very nicely to a nested loop join, where with a nested loop, we’re taking each row and then we’re looping through the right hand side, it’s just that normally we’re going to have one record come back from the right hand side, or relatively few records come back from the right hand side.

That can be beneficial when you’re looking at say I want to get a relatively small percentage of a very large table. By contrast, the other join at operators, a hash match or a nested merge join, they may involve pulling all of the table into memory, and when all they need a tiny percentage of that table, they’re generally not going to perform as well.

Carlos: A little bit more overhead to get what you need, and so you actually mentioned an interesting statistic one that I was not familiar with, and so your discussion was about 20 percent or more of the records returned, so skinning that table. Basically bringing in a lot more of that table, you’re seeing results that are a little bit faster there. It’s just like the merge join or even the hash join, would can get you there.

Kevin: Yes. The [inaudible 17:38] that I’ve seen is that, if you have less than half percent of your table being retrieved, you generally want to use an index seek operation. If you have more than 20 percent of the table being retrieved, you want to use an index scan.If you’re somewhere between, either one could possibly work, it really depends on your data size, there is no hard cut off point for, “I need a scan here or a scan there,” but half percent, 20 percent, those are pretty good rules of thumb.

Carlos: This is, if we’re just stepping back even for a moment, this is always one of those where trying different ways about peeling that banana, if you will, or skinning that rabbit, you may have to try a couple of different ways because data, your situation will be a little bit different, your mileage may vary, and so you’ll need to understand that, play around with it, and use some of…show statistics and show your time options, so that you can actually see what the queries are doing and…

Kevin: Yes. You definitely want to check execution plans, you want to see what those statistics look like, because even you’re using the APPLY operator, you might end up getting a larger number of IO operations, because I’m using a nested loop join, so for a table with say generally just a few million rows, it will be about 3 read operations to grab a row, if you’re using an index seek.It’ll take three read operations to get to the information that I need from that row. If I’m pulling back 20,000 rows, I’m looking at upwards of 60,000 read operations, may be fewer because if data is clustered together, it can do a brain scan, but that’s not something that I want to get into too much here.

If we’re looking at the 60,000 reads, a single scan of a section of a table probably is going to be fewer than 60,000 operations. However, it may take longer. It may take up more memory. It may push out pages that other queries need. It may make your other queries slower.

Carlos: I guess, as always, we can take this only an isolation. Then you look at the overall system and go from there.

Kevin: Exactly.

Carlos: Another scenario, very quickly let’s talk about the example that you used — I want to see the last five orders or some information about the last five orders from a customer that they APPLY operators as an option to use here as well.

Kevin: Yes, this is where the operator really starts to shine. If I want to get the top “N” records from some table, I can very easily write a CROSS APPLY or an OUTER APPLY and inside that APPLY operation I can put in a derived function or an ad hoc function. I am going to use the terms interchangeably.Now inside the derived function, I will have my operation to grab the top rows. Let’s say that I have a set of customers, and I want to get their latest orders. Customers are on a customer table, orders are in an order table. I want to get the last five orders for each active customer.

My query is going to look at from customers, and then I can CROSS APPLY. Inside the CROSS APPLY, I write my derived function that says, “Select top five” and whatever order information I need from orders where customer ID is equal to the customer table’s customer ID.

That will allow me to very quickly write a query. If I only have three orders for a customer, I’ll grab the three. If I only have two orders, I’ll grab the two. If I have seven orders, I’ll grab the five latest. It’s a nice easy way, very easy to understand the query, very simple syntax. That will give me back exactly the rows that I need.

By contrast, if I tried to grab them all together, I can’t use the top operator because the top five of customers joint or orders that’s only going to give me five orders total.

Carlos: Right.

Kevin: Alternatively, I could use something like a common table expression or a subquery and inside there grab records for each customer using something called a window function. Do you want to get into window functions here?

Carlos: No, we will just say basically that’s a way of ordering that. That sounds like maybe another option not to talk about…

Kevin: That’s right.

Carlos: [laughs] …another topic to talk about at another time.

Kevin: Another day, yes. We can grab the five latest orders using some other method. Syntactically speaking, code, beauty speaking, I actually prefer the APPLY operator just a little bit, because I think it’s a little easier to read.

Carlos: Your biases are coming out a little bit there Kevin [laughs] .

Kevin: Yeah.

Carlos: Speaking of readability that leads us into one of the other sections that you suggest there. That is on simplifying calculations. Particularly the idea, some of the codes that you’d shown again, the code examples will be sqldatapartners.com/podcast.The link will be there to show you some of these examples with some information about comparing and contrasting the different ways. One of those that made that process a little simpler to read was in calculations section.

Kevin: I’ve worked for companies which have not always had the best data design, have not always had the clearest written code. It happens. I’ve written some horrible looking code.

Carlos: Yeah, that’s right.

Kevin: At one point I was doing work on Medicaid data. I had subqueries inside subqueries inside subqueries inside subqueries to perform the calculations. My select clause was three quarters of a page long. I was really proud of myself, because look at all this code that I’d written. Then they told me to change something at it. I said, “Oh, boy! That’s not good.”

Carlos: Reusability factor went down a little bit.

Kevin: Yes, exactly and part of the reason in my select clause was so big was I would have little sections of code that were repeated. I’m building a mathematical calculation. I need to take A plus B and at that becomes something, and then I need to take that and multiplied by C and then I need to do divided by D and then I need to perform some other operations on this stuff.I need to take that quantity into two separate things to give me two different metrics and some copying, and pasting all that code over and over and over. One of the fantastic uses of the APPLY operator — my favorite use of it — is that you can really simplify calculations. A derived function does not need a “from clause.”

Let’s say that I need to perform some operation. I need to take A plus B. In a CROSS APPLY, I can say select A plus B as C. Now I can query this newly derived attribute and it is now called C, so I can select C. Let’s say that I need to multiply C times D and divide that quantify by E. I can now have another cross supply operation that says, “OK, select C times D divided by E as F.”

I can keep going on and chaining things. Now I’ll show in my select query instead of some really complex code, it will be more like here’s A here’s B, C, D, E, F. Really simple select statement, really easy to maintain and even easy to follow through. It’s sort of like showing your work when you’re doing high school math. It’s algebra at that point.

Carlos: I think as you went through what helped me, you think about that a little bit more as if I was going to be doing those steppingstones. I think traditionally I’d be doing temporary tables. A plus B, stick it on a table. Then do some other operations. Stick that to another table. You can have these groups or sections that you can logically step through.The APPLY operator, the CROSS APPLY operator gives you those sections to look at as well and you can say, “Oh, OK, here’s kind of what I’m doing here, here’s what I’m doing here, here’s what I’m doing there” — visual just makes it a lot easier.

Kevin: Yeah. Here’s the beautiful part. I’m glad that you mentioned temporary tables because with the temporary tables, I’m writing out to tempDB. I am writing at least a memory, probably to disc. Then I’m reading from disc. I’m writing disc. I’m reading from disc.While with the APPLY operation, you’re not writing to disc, you’re not creating new tables, you’re not creating new structures in memory. What happens is there’s a little scalar calculation that you can see in the execution plan, which if you’re already using some function, like a sum or counter something or you’re using a left or right or substring or any function like that, “Hey you’re already seeing that compute scalar.”

There is no real overhead to using the APPLY operator to simplify your calculations in that way. It’s all done in line. The Optimizer takes that and says, “Oh, this query that you’ve written that is human readable is exactly the same as this ugly looking query when you get down to it, so I’m going to give you the same execution plan.”

Carlos: There’s a couple of options. There are a few more that are around there. We’ll put those in show notes and make those available for you to take a peek. Of course, they’re in your coding options as at Companeros. I hope you’ll check those out. Kevin, I think it’s been a great discussion about the APPLY operator.I know I learned some things and I think those listening will as well, to use Kevin’s pun to APPLY yourself a little bit better. Before we let you go we do have a couple of other questions that we want to ask you.

Kevin: Sure.

Carlos: Ultimately one of the things we want to do is create value. We want to talk about some things, help folks be better about in their positions. We also like to talk about what tools you’re using. Tools that help us to perform a little bit faster. What’s your favorite SQL tool? There could be a paid tool, free tool, but why you like it and how you use it?

Kevin: I’ve two of them, but I’m sure there’s somebody has already talked about “SQL Centric Plan Explore” before.

Carlos: It’s come up once or twice.

Kevin: Yeah. I’ll let that one slide. The other tool that I really enjoy is by a company called “Devart.” It’s called “dbForge SQL complete.” The reason that I like this tool, I actually ended up paying for it out of my own pocket. The reason that I like the tool is because it will auto-format your code.Now there are a number of other tools that will do auto-formatting of code, but my current employer, we have some very strict coding standards. We’ve some very specific coding guidelines and I would try out the other tools. They would fail for various reasons.

Sometimes, for example, one of the things that we have is that all store procedure parameters must be fully left aligned, but parameters in other circumstances should be indented and some of the tools would say, they’ve all got to be left aligned. Some of them will say they’ve all got to be indented.

This one actually allows me to get down to the level of detail that allows me to have the output look pretty much exactly what I want. It’s not perfect but it gets 95 percent of the way. More importantly, when we’re building up Dynamic SQL, I’m looking at the outputs and it can be just a mess of gibberish.

All I do control K, control D and now it’s at least human readable or closer to human readable, depending on who wrote the code.

Carlos: [laughs] Those formatting options can be very nice, very handy. We have one more question, but before we ask you that we want to talk a little bit about another opportunities that folks have to learn about SQL Server.If you’d been listening to the show, you know we talk about SQL Cruise — sqlcruise.com — you take the best of the SQL Server training. You put that on a cruise ship and you’ve got SQL Cruise done by Tim and Amy Ford. They put together a great training experience. You have four or five top-notch instructors there on the boat. Now, Kevin are you a cruiser?

Kevin: I have not. I need to do it at some time.

Carlos: There you go. I have an offer there for you and for anyone else who wants to go, Tim’s offered the listeners of the podcast, a $100 off if you want to go. If you go to sqldatapartners.com/sqlcruise, you can follow the instructions there about how to get your $100 off and Tim would be very excited to have you on the cruise.Who knows, maybe I’ll see you on board there. Kevin again, thanks for all this good information. We do have one last question for you and that is — if you could have one super hero power, what would it be and why would you want it?

Kevin: I would want “phase walking.”

Carlos: Phase walking? We’ll have to explain that to me.

Kevin: Yes, it’s OK. My favorite X-men character was Kitty Pryde, “Shadowcat.” She had the ability to walk through objects to be able to go up and down, left and right, walk through walls, go through floors. I’ve always loved that concept. I think that would definitely be my power.

Carlos: Why would you want that power?

Kevin: First of all, I need to go over to that other room. I’m lazy. [laughs] I don’t want to go to the door. I’ll just walk right through it.

Carlos: OK, very good.

Kevin: Second, I’m driving. Someone’s going to hit me. I hope that doesn’t happen but, if I could phase walk, if I can get my car to be able to just go through something, I don’t have to call the insurance companies.

Carlos: It can make traffic a much more easier to bear.

Kevin: Lots scarier for everyone else, but yeah [laughs] .

Carlos: Very good. We’re Companeros. I hope you appreciated the conversation, got some good value out of it. Again, if there’s something that we said, that you’ve questions about, you can hit us up on Twitter @CarlosLChacon. Kevin you’re also on Twitter?

Kevin: I am, it is @feaselkl.

Carlos: We do appreciate you Kevin for you taking your time to be with us.

Kevin: Glad to be on here.

Carlos: Companeros, I will see you on the SQL Trail.[music]

Children: SQL Data Partners.

Episode 12: Database Corruption

1400Database corruption is something that will eventually come to all data professionals.  While I have been fortunate to only come across is a handful of times, I was glad that I was able to get back the corrupted data and get everyone back on track.

In May of 2015, Steve Stedman set the data community ablaze with his database corruption challenge.  Some of the biggest names in the community jumped in with solutions to his challenge and I knew I wanted to have Steve on the show.

If you are interested in hearing more about his experience and his suggestions for dealing with database corruption, this episode is for you!

Show Notes

Database Corruption challenge
Database Health Monitor

Transcription: Database Corruption

Carlos L. Chacon: Welcome to the SQL Data Partners podcast. My name is Carlos L. Chacon, your host. This is Episode 12. I’m super excited to have Steve Stedman on the show with me today. He’s the Database Administrator from the Pacific Northwest. We’re talking about corruption today.

Certainly, that’s a challenge that all data professionals need to consider. Steve had a corruption challenge earlier this year and it was very heavily attended. Lots of big names threw their hat into the ring there. If they were paying attention, I know it’s something that I can learn from and I hope that you will today, too.

We are always looking for feedback. We hope you’ll give us some feedback either on iTunes or on Stitcher. You can hit us up on Twitter. I’m @carloslcachon or you can shoot me an email at [email protected]. We hope you’ll reach out to us. Let us know how we’re doing. If there’s topics you want to talk about, let us know that too.

If you’re headed to Seattle for the summit, the PASS Summit this October 2015, we are going to have ribbons. For my compañeros who are listening to the podcasts, it will say, “SQL Compañeros.” As you see me, please make sure to get your ribbon at summit. It would be fun to meet up there.

Also, hopefully you’ve noticed we have a new mic. We’ll be using this new mic from here on forward. Hopefully, the music and the words, there will be quite such a distinction. There will be a better listening experience. Again, thanks for tuning in compañeros. As always, welcome to the show.

Recorded Voice: SQL Data Partners.[music]

Carlos: With that Steve, welcome to the program.

Steve Stedman: Thanks, Carlos. Glad to be here.

Carlos: Thanks. I know you’ve talked a little bit about this in some previous forums, but go with us briefly into how you decided to create the Database Corruption Challenge?

Steve: That was interesting. In the beginning, I didn’t really think of it as a challenge. It started out as just a blog post that I was doing on my blog at stevestedman.com, explaining the specific type of corruption. After spending a few minutes on the blog post, I decided that rather than just explaining the concept, or showing how to do it, I thought it might be a lot of fun to throw it out as a challenge to see if anyone would like to participate.I posted that first challenge on a Saturday morning, and announced it on Twitter, and LinkedIn. Nobody responded. I didn’t get any feedback.

Carlos: [laughs]

Steve: After about six hours, the first solution was submitted by Brett Ozar, who then Tweeted about it, and it spread from there. I think his help really got it moving at that point.

Carlos: OK, very good. He’s known to be a mover and a shaker, that guy. [laughs]

Steve: Who knows, if he hadn’t been the first to submit a winning solution, it may not have gone anywhere.

Carlos: Interesting. You ultimately come up with 10 corruption challenges, and so I’m curious now. I’ll admit that luckily, and I’m knocking on wood here [knocks], I’ve only experienced corruption in a handful of cases. You’ve mentioned that you started out with a blog post. Did you think you were going to end up with 10? How did you know that you would be able to come up with 10 ideas, or different forms of database corruption?

Steve: Honestly, in the beginning I didn’t know I would be able to come up with 10. I just started doing it, and then I think it might have been week two, or week three where I figured out that I’d limit it to 10, and just go until that point.

Carlos: OK, very good. Now that you’ve had the challenge, you’ve had people submit, and of course you’ve graded them and announced a winner, and whatnot. Do you feel like your experience with corruption has increased? What’s been your learning experience through this event?

Steve: I think the learning experience has been great. I think going through it with some of the big names who were actually participating was kind of like being judged each week myself.

Carlos: [laughs]

Steve: When I would create the challenge, as to “Did I do it well? Did I do it right? Was it worthy of throwing out to the community?” I absolutely learned something every single week. I was forced to go for more and more difficult corruption challenges. I made mistakes going through it.If I made a mistake the people participating certainly caught it, and let me know. One of the things I had to do, because everyone out there has different versions of SQL server, when I put out a corruption challenge I myself had to solve it first to make sure that it was solvable with no data loss, but I also had to make sure that on SQL server 2005, 2008, or to 2012, and 2014 that it was doable.

Carlos: [laughs]

Steve: Some of them that were using features that didn’t necessarily work on 2005, for instance.

Carlos: I see. Can you give us an example of a database corruption challenge, that maybe you weren’t as familiar with before the challenge started? Do you have one of those?

Steve: Yeah, I am thinking probably week four week five. I forget which week it was. It was a database corruption challenge that involved no backup files. Let me think, the data had been corrupt and the only way that we were able to get the data back or to get the things back once the corruption was fixed was using Change Data Capture, which had been enabled on the database.I had not done a lot with Change Data Capture before that. There were a few things that I learned in doing backups and restore with CDC enabled. I am a lot stronger right now. However, I sure didn’t know what I was doing with them at that point.

Carlos: [laughter] OK. So let’s get into a little bit about that, about how you would go about attacking a database corruption. Again, my experience is that once you have identified that you actually have corruption, you need to then go in and identify how widespread the corruption is. Are there other components like that in the challenge as well?

Steve: Yes, absolutely. The big part of figuring out how to eliminate the corruption without losing any data is first addressing what data is actually corrupt. If that data is corrupt, how can you save some of it off, pull it from an existing backup or bring it in from an index or even extract it out of a DBCC page output, if you have to.How can you pull the data in a way that once you eliminate the corruption and when you run CheckDB or CheckTable for instance, and it just purges pages as they come across corrupt pages, how are going to refill those again? I think first understanding exactly what you’re going to lose, and then figuring out how you are going to get it back is one of the key things in being successful, at fixing that database corruption.

But the other thing that you should keep in mind is, once you are at that known point where you have corruption and you’re going to attempt to go fix it, you need to have a way to get back to the exact same point, in case what you do makes it worse because sometimes you try something, and you realize, “Wow, I didn’t realize half the table was going to go away.”

Carlos: [laughs]

Steve: “I thought it was just going to be a dozen rows.” If you’ve got a way to roll that back or to get back, even just backing up that corrupt database that can be restored and used again, if you’ve got a way to get back to the stage, it really reduces the risk associated with working on database corruption.

Carlos: Sure. That’s an interesting point. I think potentially, you may have referenced, I think it’s the DBCC command. Is it restore with corruption?

Steve: There is DBCC, CheckDB or CheckTable and with those, there’s different options. For instance, one of them is Repair Allow Data Loss.

Carlos: Repair Allow Data Loss. That’s the thing.

Steve: What that does, when it goes through, it finds pages that are corrupt. Again, a page is an 8K chunk of memory in SQL server. That 8K chunk may contain several to many rows of data out of a given table, and when it finds even a single corrupt row in that page, what it does generally is, it throws out the whole page, then rebuilds the indexes around it as if those pages had never existed.So you end up, even if you have a single corrupt row, you might end up losing 15 to 50, who knows how many rows to clean up that single corrupt row.

Carlos: That would be an example of you missing the data, then maybe you wanting to get some of that back?

Steve: Yes, absolutely.

Carlos: So taking a backup of the database at that point before you start to make changes would be part of your repertoire of fighting database corruption?

Steve: Absolutely. I’ve been making a backup for at least like, stopping the SQL Server, copying the MDF and LDF files off, so you can use them again if you had to as well. [inaudible 10:08] other options, we are just doing a regular backup.

Carlos: Just help me with that. I guess my understanding is wrong there. I was under the impression that if you stop the services when it went through its initial check, to bring the database back online, it would see the corruption and you might not be to get the database back online. I guess this is not your experience.

Steve: It depends on, on the type of corruption. Now, there certain corruptions, like there’s a certain page, page nine, in file one. That’s the boot page for the database. If that happens to be corrupt, you certainly don’t want to take your database offline or anything. There’s other things you want to look at first, before dealing with that one, but generally if it’s in an index or in a table somewhere, you can usually stop the database and restart it without any difficulty.

Carlos: Without problems. OK, very good. That’s good to know. How would you go with backup? We’ve now taken a backup of our database. We are ready to start making some changes to it. How those steps that we take will determine, I guess on the type of corruption that we identify in the database?

Steve: Yes, absolutely. The one thing is you’ve got a backup or a way to get back to the known bad state. You want to assess what good backups you have. There might be a case where the corruption occurred three or four weeks ago and somehow nothing caught it and all of your backups you have for the last three weeks are bad.

Carlos: They’re no good.

Carlos: There’s other cases where you might run into the corruption and you have a backup from yesterday. This is just fine. However, there could be transactions where a lot of data that’s changed in that times. You want to assess what you’ve got as your available tools to bring that data back in, a good backup or a bad backup necessarily.Then once you have a point or when you get back to the point in time that you’re right now, a way to start over if something goes wrong basically, you know what’s missing and you know what other backups or other places you have to pull data in from, then you want to figure out exactly what’s going to be lost when the corruption is fixed.

One way you can do that is you can do the DBCC CheckTable for instance, inside of a transaction. Let it fix the corruption. Look at what’s left in the table. If it’s not what you expected, roll back that transaction.

Carlos: I think that’s a very interesting point but not one that I would have used. That’s interesting that you put that out there. I’m curious, in your challenge you’ve come up with these little nuggets or these little potentially ways of fighting corruption based on the different options.Now I know that there are the parameters of the challenge in there. Did you take the time to write up, “Here’s what I would’ve done to fix it”? I guess the solutions are there as well.

Steve: Yes. For each of the 10 challenges, the way I worked each week was I posted it and I usually gave people from three to five days as an opportunity to solve it themselves, submit their solution and then once all the solutions were submitted, I posted the results of who the winners were each week. In some cases, I would post my solution, “Here’s how I solved it.” In some cases I would even post, “Here’s how I corrupted the database.”

Carlos: OK.

Steve: There’s usually a lot of warnings on that one that says, “Don’t try this at home.”

Carlos: You’re right. [laughs]

Steve: Then I would find when there was either the winning solution from one of the participants each week, I would post that to my blog as well or if there was someone who was not the first one to solve it, but they had a really interesting solution that I hadn’t seen before, I would post those as well. Of the 10 different competitions, for every one of them is at least two posted solutions and sometimes three or four.

Carlos: OK. Very good. So folks that maybe wanted to sharpen their corruption skills, they could still use this to go out, get the corruption challenge and try to solve that by themselves and then take a peek at the answers, if they get stumped?

Steve: Yes, absolutely. If you go to stevestedman.com/corruption challenge, that will take you to a summary page that shows you here is the 10 weeks of the competition and it will include links to the initial challenge as it was posted, and the solutions from each of the winners each week.Then from there you can jump to other solutions. But as long as you don’t really cheat and jump off to that solution before you’ve tried to solve it yourself, you’ll be in good shape.

Carlos: We should say that, those links to Steve’s site and the database corruption challenge will be in the show notes as well. So, if you come across those on the podcast page, there’ll be links there.You mentioned being careful with corrupting the database in your own environments. Any thoughts or parameters about doing that or would you recommend folks just grabbing what you’ve created for them as the best first step?

Steve: As a first step, I would certainly recommend grabbing what I’ve created out there and using that. However, once you’ve mastered all of those you want to go on and try some more, you can give it a try to corrupt your own database. However, it’s much more challenging than you might think.For each of these challenges that I did, some of them took me 20 hours of preparation to put together the challenge, test it, confirm and post all the details on it. Others didn’t take quite as long. They only took six or seven hours.

There’s a lot of work that’s gone into there and you don’t think you can just start messing around in memory in SQL Server and think you’re going to end up with something that’s going to be usable, a corrupt databases that could be recovered in any way.

Carlos: Recoverable. Having said that, how many instances that you start to muck into the SQL Server and not be able to recover?

Steve: Let me think back. There were a couple that I tried where I realized after I got into it that it wasn’t going to work in a specific environment. For instance, the Change Data Capture one that I enabled, that’s a feature that tracks changes on certain tables in SQL Server.That was one that I originally built it on, I think SQL Server 2008R2 and then I realized.”Oh, that features is just not there in the older versions of SQL Server. So, I’m not going to able to make it work.” But going through it, there were certainly ones where I corrupted the database and even knowing how to fix it, it took me two or three different tries to get to a solution that actually worked. But before doing it I always make sure I had a way to get back to that known states, so I could try again if I had to.

Carlos: That’s an interesting point,. I guess. I know we’re going to talk a little bit about generalizations. If folks wanted to deep dive they should got out to your site, take a look at those different instances. But I think if you were to generalize that corruption and being able to repair from corruption, I think a couple of things you talked about are important.One, being able to get back to that state and then two, would you actually recommend trying your change in that restore copy, if you will. You talked about testing your data to make sure that’s what you wanted it to happen or you go ahead. Let’s assume that this is a normal “active transactional system.” People are using the system at the moment. Are you going to be making your changes in that prod environment or you’re going to be making that in the copy environment?

Steve: Eventually you’re going to have to make it in the production environment, to replace that corrupt database. But whenever possible, I would want to be working in an alternate environment and that alternate environment may be a second copy of that same database with a different name, running on the same server. Or maybe just to restore that database running on a different SQL Server depending on what environment you have around to be able to work on those things.But I certainly wouldn’t want to start in production on that live database as the first point. However, there may be certain scenarios where you’re working through and transactions are failing or the system is just not usable because of that corruption, and you may want to just block users out and say the system is down until the corruption can be resolved.

It really depends on the scenario. If it’s something that the corruption’s there but it’s not really posing a problem for other people, it might be OK to let it run for half a day or a day while you actually work out the problem.

Carlos: Yeah. Particularly if that corruption is in older data, for example, archive data, things like that.

Steve: But if that corruption is in a bank account type database where it has recent transactions in it, and every time somebody tries to use it, it fails and money is lost somewhere, you’re going to want to block people out until that gets resolved.Once you’ve got your plan on what you’re going to do, what you think you’re going to do to fix it, and if you can run that plan by someone else who knows little bit about corruption, maybe more than you do, whether it’s somebody who competed in the challenge or even me or somebody you work with, if you can run that plan by someone, often times when people dealing with corruption or this type of issues, it’s not always when you’re well rested and bright-eyed and bushy tailed. It’s when you’re late at night or sleep-deprived.

Sometimes, if you just run it by someone else, they can ask questions or check to make sure you’re not going to do something really foolish. It happens. People make mistakes, but anytime you get someone else’s opinion on that, it might help reduce those mistakes.

Carlos: We are talking a little bit in generic terms, but are there things that as you went through the challenge that you may be found were like, “Oh, my thought process was that I would overwrite the table or let’s say overwrite the data from the backup and I know now that that’s the best way to go about that.” Do you have some no-no’s of database corruption restoration that you learned as a result?

Steve: The one that I would say to stay away from at all costs is the undocumented DBCC WRITEPAGE command. If you want to use it to go create a corrupt database, it might be a good thing to use, but if you’re trying to fix corruption or actually try to bring a system back, there’s a lot of danger in using that command because, really, what it’s doing is it allows you to write data over any chunk of memory inside of the database.One of the tendencies of people who learned about it in the corruption challenge was just to use that command to overwrite what might be broken somewhere, but there are some negative things that can happen with that, and I don’t know it’s for certain, but it is rumored that when you use that, somehow it tags something in your database so that Microsoft is no longer able to support your database, and they’ve kind of got an out if you ever had to call support.

I don’t know if that’s true, that’s only a rumor I’ve heard, but I would recommend staying away from DBCC WRITEPAGE unless you’ve used up every other possible option, but I would never use it on a production database. You want to use it on a copy, and then maybe copy the data out of that other database.

Carlos: There you go, very good point. That’s a good advice for database administrators in general, data professionals. Always give yourself a way to get back.Another component that we wanted to talk about with corruption is finding a source of the corruption. Now this is a little bit different in your scenario because you are the one causing the corruption. [laughs] So what are some of your thoughts about how you go about finding the source of corruption, once corruption has happened?

Steve: I think that probably falls into two different camps to look out there. One would be that one off corruption that hits and you fix it, and then you don’t see it again. Oftentimes, you may not be able to track down when it occurred.

Carlos: Just like [inaudible 23:38], maybe there was hard reboot of the system, something quite yet written quite right.

Steve: Yeah. Then there’s the ones that you get corruption for some reason, it gets fixed, and then a few days to a week later, it comes back again. When those kinds of things happen, generally, there’s something going on that really needs your attention to get fixed.For instance, it could be a networking issue between your SQL Server and your storage on the sand. There’s a story I heard one time about a janitor who, when they came in to do work in an office, they unplugged the SQL Server every night to vacuum around it, and then they plugged it back in. Those kind of things can certainly cause corruption.

But being able to figure out when it occurred and…once you figured out when it occurred, then you can start looking at other events related to that. For instance, if you know when it occurred, you can look at the error logs, you can look at the Windows error logs, as well as the SQL Server error logs, and try and figure out what events are happening at that point.

If you have it occurring again and again, is it happening at the same time everyday or every week or is there something going on at that point in time that’s causing it or is it just completely random? Which is going to be a little bit harder to track down.

Carlos: We’ve talked a little bit about what you do about the corruption, just identifying the source. Now that you have gone through this, any changes into how often you are checking for corruption?

Steve: Absolutely.

Carlos: [laughs]

Steve: What I’ve learned with this is that the sooner you can detect corruption, the better off you’re going to be trying to resolve it. For instance, if you have corruption today and you find out about it tomorrow, and you’ve got a backup from yesterday that you can use, that you can pull data back in from, you’re a lot better off than if it’s six months out that you discover you had corruption.My recommendation is that you do DBCC CHECKDB or CHECKTABLEs or certain subsets of those as often as you can without having a negative impact on your system. I don’t mean hourly, but if you can run it on key databases at least once a day, that’s great. In the real world, that doesn’t always happen, so I’ve set up scripts and sometimes with databases when you can run a subset as them every day of the week, and by the end of the week, every one of your databases has been checked.

The other thing I like to do with it is to make sure you run DBCC CHECKDB against the system databases, the non-user databases, Model, Master, TempDB, those type of databases, MSTB, because if those get corrupt, it may have a different impact. If your user data gets corrupt, but you want to know about it so you can fix it as quickly as you can.

Carlos: I had a master database corrupt on me a couple of months ago. [laughs] May be just because it’s master, and we didn’t know…so this was a consulting engagement, I wasn’t with them. They called me up and were like, “Hey, we have this error and it looks like our master database is corrupt.”They didn’t know about it till they restarted SQL Server and then the database servers [inaudible 27:11] because master was corrupt. That’s definitely what you want to know about [laughs] before you start treading things off because you may not be able to come back.

Am I naïve or am I incorrect in my thinking that the master database, if there was a corruption issue, I don’t want it’s more important, but a little bit more bothersome than a user database or is that not good thinking?

Steve: It’s certainly more bothersome, and that you might not be able to restart your SQL Server without quite a bit of difficulty.

Carlos: Yeah, that’s a bit of a bothersome.

Steve: I think, additionally, most people aren’t as experienced at dealing with backups or restores or rebuilding of the master database if they have to. I think people are a lot more familiar with, “Maybe I’ve got corruption, I’ll just pull in a backup from last night on my user database.”

Carlos: Yeah, a little more everyday issue, and having to overwrite that.

Steve: As part of the corruption challenge, I did not issue a corrupt master database to anyone because I didn’t want to destroy their environment. [laughs]

Carlos: The community thanks you for that. [laughs] I think this has been a great conversation, definitely being able to learn more about corruption. I know that I still have a little bit of homework that I need to do and get in there and read some of those posts. Is there anything else that you would like to bring up as a result of challenge during corruption that you think that the listeners would [inaudible 28:57]?

Steve: Plan ahead. I like to think that fixing database corruption is a lot like when people have to do CPR. At the event that you need to do either of those things, if you have to go online and do the research and figure out how you’re going to do it, you’re probably not going to do it as well as if you trained ahead of time.They’re both situations that can have pretty negative outcomes, although CPR would certainly have a more negative outcome than most database issues. At least with database issues, if you do it right, you can try again.

Carlos: That’s true. You get multiple opportunities.

Steve: Train, plan ahead, and act like you’re going to have a corruption tomorrow and train for that.

Carlos: Thank you, Steve. We do appreciate the conversation. We do have a couple of other questions. One of the things that we like to do is try to provide some value to folks and see what other data professionals are using and doing. So one the things we like to talk about is your favorite tool. This could be paid tool, it be a free tool.I know you have actually written one, so I am not sure you want to talk about that. But what is your favorite tool and how are you using it?

Steve: I am a little bit biased on this one. My favorite tool is Database Health Monitor. That’s a tool that I’ve spent the last five years building. Earlier this summer, we got to version 2, finally, and then just a week or so ago, I released version 2.1. It’s a free tool that I’ve make available to the SQL Server community. What it does for me is really a couple things.First, it provides a quick way to get the status of your SQL database, your jobs, your queries, your user connections, general health of the database without having to go through some of the more challenging ways of getting that out of SQL Server Management Studio.

Second thing that it does is you can install a monitoring database that will keep track of historic waits over time, and with the small job, it will track wait statistics and then reports that will show you, historically, back in time from when you first started monitoring what was going on, what was causing SQL Server to be slow, what was causing it to wait on things.

If someone says, “Hey, yesterday at two o’clock, the database was slow. Do you know why?” you can go back and see what queries were running and why it was waiting at that point in time. There’s a lot of other tools out there like that but they are very expensive, and this is a free thing that I did to give back to the SQL community.

Carlos: I think from that history perspective, [inaudible 31:49] the value add of those third party tools is the capturing of the data. It gives you that history perspective, and then some basic reporting on top of that, so to have a tool that folks can use to get both of those things is very, very helpful.A lot of people come to the database profession in a variety of different ways, and we all love it for different reasons. Right? I’m interested to get a story or some background on something that helps illustrate why you enjoy being a database professional.

Steve: One of my favorite things is working on performance tuning on SQL server. I know we talked about corruption, but performance tuning is a lot of fun too. The thing I love about this is when you can solve problems for people where you get immediate feedback at how valuable it is.An example of this was, a couple of years ago I worked on a system that had about 200 users who each had to perform from a repetitive task about 15 different times a day. That repetitive task took about five to seven minutes to complete, waiting on a single query to run. I did some performance tuning on the database, and I was able to reduce the time it took that from five to seven minutes, down to three to four seconds.

Carlos: [laughs]

Steve: It was a good gig. I got paid for it, and all that, but what made me really excited about it was when the client sent me a response that was basically an equation that showed how much time was saved by their staff every day. It worked out to be something like 20 to 30 full time employees by fixing that one query.The client was happy, I was happy, and it was just a few hours of work. I was able to save them so many hours of work every day. I was able to make a significant impact on their business. That’s the type of work I love to do, where you can do something, and get feedback, and it makes a difference in people’s lives.

Carlos: That’s right. That’s the nice thing, and the scary thing, all at once. That you have access to be able to affect that change, but then you also have the responsibility to come along with it. The corruption that we just talked about is one of those areas that you are then responsible for. Always being able to help, I think that’s a very common theme throughout the data profession.

Steve: Yes.

Carlos: Very good. Before we let you go, Steve, we do have one more question for you. That is, if you could have one superhero power, what would it be, and why do you want it?

Steve: Wow! That’s an interesting one.

Carlos: [laughs]

Steve: I think I would have to say time travel on that one.

Carlos: There we go.

Steve: The ability to jump back in time, and then to return to the present, kind of a “Doctor Who” type time travel, where when somebody comes to you, and they’ve got database corruption, you can just jump back in time and figure out what happened when it occurred.

Carlos: That’s right. [laughs]

Steve: I think that would be a lot of fun.

Carlos: I live in Richmond, which is a city that has lots of history in it. I’ve always wanted to kind of jump back in time to see what it was like. If all the stories they tell us in the history books are really true. [laughs] I think it would be very interesting.

Steve: There is so much to learn.

Carlos: That’s right. You have a little bit more noble pursuit in avoiding database corruption, but hey…[laughs]

Steve: I would do a bunch of fun stuff with it too, I’m sure.

Carlos: No question. [laughs][background music]

Carlos: Steve, I really appreciate you being on the show today.

Steve: Thanks for having me. It’s been a good time.

Carlos: Yes, and compañeros, I’ll see you on the SQL trail.[music]

Recorded Voice: SQL Data Partners