Episode 97: SQL Server Replication

Carlos L Chacon

If you have ever been responsible for a replication environment, you know what a pain that can be.  Many of the guests of the show have pointed out their frustration with replication and many are loath to use it–myself included; however, there are a few brave souls out in the world who do use replication regularly–and even use it well.  Chuck Lathrope is our guest today and he talks us through some of the components of replication and why we might not need to be so skittish about this long-standing feature.

We cover all the basics, including an overview of transactional replication, merge replication, peer-to-peer replication, and snapshot replication, and the concepts of publisher, distributor, and subscriber.  Most importantly, Chuck gives us some good business cases for why we might use replication, and I think you will find the discussion interesting.  Thanks for joining us on the SQL trail!

Episode Quote

“I like to talk about it to waylay fears and give out some best practice knowledge”

“Stick to people who know what they are doing”

“The real ultimate goal of replication is to offload work from your main source of truth database server”

Listen to Learn

– Why you might use SQL server replication
– Different types of replication
– Use cases of transactional replication
– Replication and trace token
– Troubleshooting tips
– Where to get good information







About Chuck Lathrope

Chuck Lathrope is a Data Architect for Limeade, a SaaS health and wellness platform. He is a two-time SQL Saturday speaker on replication and was a Top 5 nominee for the Red Gate Exceptional DBA Award in 2009. With over 20 years of experience in IT, he has used many different operating systems and worked with many different applications. He was also a Windows system administrator for 12 years.

Carlos: Chuck, welcome to the show.

Chuck: Thank you very much guys, glad to be here.

Steve: Yeah, it’s great to have you on the show. I know we’ve kind of been talking about this for a few weeks trying to get together, and I’m happy to have you here as a guest.

Chuck: Yeah, I know. Glad to be here guys.

Carlos: Yeah, even if you are a west coaster you’re still accepted here.

Chuck: Oh, great!

Steve: Anyhow, Carlos, there are more west coasters than there are east coasters right now.

Carlos: Oh, man.

Chuck: Yeah, we’re winning.

Carlos: Well, what I lack in numbers I make up for in ego, apparently.

Steve: Alright, so today’s topic is SQL Server replication. I know this is a topic I first heard you talk about at a SQL Saturday in 2012, and I found your presentation there really interesting, though I didn’t put it to use right away. But when the time came that I needed to use replication, I leaned back on some of those things I learned from you back then, so it’s great to have you on the show to be able to talk about replication today.

Chuck: Yeah, great, thank you! Yeah, I love talking about SQL Server replication at SQL Saturday events. I try to do it as often as I can because I have all the battle wounds from having a very large environment with transactional replication. Whenever I can I like to talk about it to waylay fears and give out some best practice knowledge.

Carlos: Well, that’s interesting, because I think there is a lot of fear around replication. We brought it up in Episode 56, and nobody wants to touch it with a 10-foot pole, and here you said you have a large environment with replication. So I’m interested to get in here and talk about some of your experience and why maybe we have so much fear around it.

Chuck: Yeah. I think the difficult part about replication is that it does a great job of retrying even when there is a failure. So if you’re not monitoring it correctly it will go sideways, it can get all kinds of bad, and then you get into stuck situations like: now I’ve got this gigantic log file or distribution database, how am I going to fix it? And then you go on the net and google stuff, and there is so much bad advice out there that would basically destroy your entire environment, and you will pick up a lot of wounds from trying to follow some of that advice. It seems to me like most of the advice out there is for people doing it in a dev environment, not a real production environment, so it’s not great. I usually stick to people like Hilary. Anyway, there are some great people out there, from the Microsoft side and from the consultant side, who give great recommendations for replication. Stick to people who know what they are doing, not random posts on the internet.

Carlos: Well, so I guess shall we go ahead and tee up some of the components around replication and talk about some of the moving pieces?

Chuck: So there are three types of replication. There are actually four, but one has been deprecated: updatable subscriptions. There is transactional replication, which is always running, always sending out data to your subscribers. Then there is merge replication, which allows you to do bidirectional updates, and then there is peer-to-peer replication, and then there is snapshot replication. With snapshot, you do it on a less frequent basis rather than all the time like transactional. So in a data warehouse scenario I could do it every 6 hours, or every 12, or every 24 hours, and get that data over to my data warehouse; that’s what snapshot does. And my expertise happens to be in transactional replication.

Steve: Ok, so then what are some of the business cases or real use cases around transactional replication then?

Chuck: Yeah, so the real ultimate goal of replication is to offload work from your main source-of-truth database server. You’ve got multiple databases, and they can all be part of different publications that send out to subscribers, which are your clients that receive that data. The main goal there is just to offload work off of that main production server; that’s its goal in life.

Steve: So occasionally in high availability or disaster recovery conversations somebody will bring up replication as a possible solution and I’ve never had a good feeling about that and I’m just curious what your thoughts are.

Chuck: Yeah, well, just like you guys were talking about in Episode 59 on high availability solutions, replication wasn’t really designed to be a high availability solution. It was really designed to replicate some data off to other servers so they can process that same data for whatever purpose you may need it for. With replication you can publish out tables, or if you have Enterprise you can publish out a view, and you can even publish out your stored procedures and functions and whatnot, and your indexes. But you don’t have the security that goes along with that, and there is no automatic failover or anything like that. Your best option there is to stick with Always On availability groups or log shipping or database mirroring for high availability.

Carlos: Ok, so if I really want to just move some data off to another server for reporting purposes that’s where transactional replication or snapshot replication might be a good solution but not for the HADR.

Chuck: Yes, and a nice thing about replication is that it’s available on all editions and versions of SQL Server, so I can have Standard Edition of SQL out there and use that. I can even use that for my distributor server. So to give the bigger picture of transactional replication: you have the publication server, which holds your main source-of-truth data, and then you have the distribution server, which holds the distribution database that sends that information to the subscribers. So there is always a three-way path: it goes from publisher to distributor to subscriber. You can have many subscribers, and then you have one distribution server, which could actually be on the publisher or the subscriber side; you don’t have to have a dedicated one. But when you start getting into bigger environments you definitely want a dedicated distribution server.

Steve: Ok. Now, I know you said it works with all of the latest versions of SQL Server; when was replication first introduced?

Chuck: It’s been around forever. I think it even came from the Sybase days. It’s basically just pulling data from your log files, looking to see which of those transactions are flagged as marked for replication, and then it sends that on down the path, so it’s been around for ages.

Steve: Yup. So with replication going from a publisher to the distributor to the subscriber, does the subscriber have to be on the same version of SQL Server as the publisher?

Chuck: No, it doesn’t. You can have a mix. Your distributor just needs to be at or above the level of your publisher, so you could have a 2008 R2 publisher, a 2014 distributor, and a 2016 subscriber. The distributor just needs to be at a level matching your publication or higher; you can have a higher level on your subscriber, but you can’t go lower on your distribution server than your publication server.

Carlos: So we’re talking about some of these terms, right: publisher, distributor, subscriber. The original documentation release that I saw mentions this newspaper hierarchy, which some of our millennials may not even know what that is. And one of the things that was brought up is just the overhead of maintaining some of these. Where has some of the love been? Because it’s been around so long, where has some of the love from Microsoft been with replication?

Chuck: Yeah, good question. So the latest feature added to replication is support for Always On availability groups, such that your primary server can move to any of the other secondary servers and replication will go along with it; that’s the main recent addition. The same team also handles change data capture and change tracking, so some of those features have been receiving a lot of love in the past few editions of SQL Server. But SQL replication has been around for such a long period of time that there aren’t a lot more features they could add that I can really think of, other than maybe helping out with the supportability of replication, so most of the love has gone that route.

Carlos: Got you. And that’s where I would have to agree: like you mentioned, there’s a lot of bad advice for fixing or troubleshooting. I would like to see a little bit more love there, because I feel like a lot of the fixes are just “start over again.” Being able to tap into that a little bit easier and feel better about what’s going on would be super helpful.

Chuck: Yeah, it would, especially for people just getting into it. It can be overwhelming. I mean, I was so passionate about it in the past that I helped author a chapter of a SQL book from Redgate just on replication: here is how you can monitor it and here are some best practices for it, just to get more information out there, because there have been relatively few books on it. I can give a link to that. And then also Sebastian Meine, who I think you guys had on a recent podcast, created Fundamentals of SQL Server Replication. It’s a free download from Redgate’s books, too, and a great starting point for getting into replication.

Steve: Yup. Now, on the front of troubleshooting replication, don’t you have some scripts that you built that are available on GitHub that can help with that? I know I’ve used them a couple of times; I just forget where exactly they were.

Chuck: Yes, my GitHub repo has some replication monitoring scripts for live use, built from my lessons learned in replication. Replication gives you the ability to create what are called tracer tokens. Basically you insert this tracer command into your log file; it’s tacked on at the very end, and you watch it go through the entire process to your subscribers to see the latency involved. You can do that with the Replication Monitor tool, which is a GUI tool that you would use to monitor and maintain replication. But the trouble with the tracer token is that if you’ve got a busy environment that’s kind of overwhelmed currently, that tracer token might take eons of time to get through. You mainly do it manually, or you can script it automatically: the Microsoft IT guys actually put up, on one of the free Script Center sites, a tool that would just continuously insert tracer tokens so you could see the health with it. But I found that a little bit limiting, so my trick to tracking replication is to look at the distribution database and figure out how many commands are waiting to be distributed out to the subscribers. I monitor that on a frequent basis, like every 15 minutes or every half an hour, and it sends me an email if it’s gone above some sort of threshold. That threshold is a variable amount I can put in as a parameter, because in some environments being 5,000 records behind is a big deal, while another environment could be hundreds of thousands of records behind if you have a really gigantic installation. So that’s what my monitoring does: it monitors that queue sitting up there on the distributor, and then I also monitor for the common pain points that replication has.
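For readers who want to try the distribution-queue approach Chuck describes, a minimal sketch run on the distributor might look like the following. The server, database, and publication names are placeholders; `sp_replmonitorsubscriptionpendingcmds` and the `MSdistribution_status` view are standard parts of the distribution database, but check the parameter values against your own topology.

```sql
-- Run on the distributor, in the distribution database.
USE distribution;

-- Pending command count (and estimated catch-up time) for one subscription.
-- All names below are illustrative placeholders.
EXEC sp_replmonitorsubscriptionpendingcmds
    @publisher         = N'PUBSERVER',
    @publisher_db      = N'SalesDb',
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSERVER',
    @subscriber_db     = N'SalesDbReporting',
    @subscription_type = 0;  -- 0 = push, 1 = pull

-- A rougher aggregate: undelivered commands per distribution agent.
SELECT agent_id,
       SUM(UndelivCmdsInDistDB) AS pending_cmds
FROM   dbo.MSdistribution_status
GROUP  BY agent_id;
```

A scheduled job could compare `pending_cmds` against a per-environment threshold parameter and send mail when the trend keeps climbing, which is essentially the alerting pattern Chuck outlines.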

Carlos: I apologize. Before you get into that, I want to stop and make sure everybody knows that all the scripts and things we talk about today will be available at sqldatapartners.com/replication, so we’ll make sure to put all those things up there. But you talked about tracer tokens, and all of a sudden I’m thinking of a trace, and that’s not Profiler, but something like it came to mind. This is new to me, so it almost sounds like this is an object I want to create in SQL Server that’s going to start capturing commands, which again kind of sounds like Profiler or Extended Events. Is it something that is specific to replication? I guess help me define what the tracer token is.

Chuck: Yeah, it’s very specific to replication. You basically stick in this special command. You can use T-SQL to create this little tracer token, or you can use the Replication Monitor tool to insert the tracer token and just watch it in the GUI. It shows you the time from your publisher to the distributor and then the time from the distributor to the subscriber; it basically goes through that path. You don’t see it, it just happens behind the scenes, a little hidden thing, but you can programmatically watch that information and see the results of it.
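The T-SQL route Chuck mentions is a pair of standard procedures run at the publisher, in the published database. A minimal sketch, with `SalesPub` as a placeholder publication name:

```sql
-- Run at the publisher, in the published database.
DECLARE @token_id int;

-- Drop the "dye" into the log for the 'SalesPub' publication (placeholder name).
EXEC sys.sp_posttracertoken
    @publication     = N'SalesPub',
    @tracer_token_id = @token_id OUTPUT;

-- After giving the token time to flow through, read its latency history.
-- Result columns include distributor_latency and subscriber_latency (in seconds).
EXEC sys.sp_helptracertokenhistory
    @publication = N'SalesPub',
    @tracer_id   = @token_id;
```

Scripting this across many publications avoids the click-heavy Replication Monitor workflow Steve describes later, though on an overloaded topology the token can take a long time to arrive, which is Chuck’s caveat.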

Carlos: Ok, so I don’t need to create it or modify it. It’s already there. It’s just what I’m using to collect the information from.

Chuck: Yeah, you have to basically interject it. It’s like putting a dye into a stream: you put this dye in, a little drop, and you can just do one at a time. So you manually put that dye drop in the stream and you watch that drop go the entire path to the ocean, I guess.

Steve: But in this example, that dye drop, as you refer to it, is really a record being injected into the transaction log. And when it’s complete, that means it’s been through the part that reads the transaction log and moves it to the distributor, and then actually ended up on the subscriber side. Conceptually it would be almost like if you had a table and you inserted a row into it, and then you knew on the other side that that row actually arrived.

Carlos: Because that’s the one you’re looking at.

Steve: Yup.

Chuck: Some people even use what I call a canary table, where they create a generic table and then update the date in it. It will be just a table with one date column holding the current date. You replicate that out and use it like a canary table, so you can look at the subscriber and see how latent it is, because that date should be as close to the current date as possible. That’s another method you could use for monitoring.
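The canary-table idea can be sketched in a few lines of T-SQL. The table and column names here are made up for illustration; the heartbeat `UPDATE` would run from a SQL Agent job on the publisher, and the table would be added as an article in the publication:

```sql
-- On the publisher: a one-row heartbeat table (names are illustrative).
CREATE TABLE dbo.ReplCanary
(
    CanaryId      int          NOT NULL CONSTRAINT PK_ReplCanary PRIMARY KEY,
    LastHeartbeat datetime2(0) NOT NULL
);
INSERT dbo.ReplCanary (CanaryId, LastHeartbeat)
VALUES (1, SYSUTCDATETIME());

-- Scheduled heartbeat (e.g., every minute, from a SQL Agent job):
UPDATE dbo.ReplCanary
SET    LastHeartbeat = SYSUTCDATETIME()
WHERE  CanaryId = 1;

-- On each subscriber, latency is just the age of the heartbeat:
SELECT DATEDIFF(SECOND, LastHeartbeat, SYSUTCDATETIME()) AS seconds_behind
FROM   dbo.ReplCanary;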

Carlos: Interesting. Yeah, I haven’t heard of that one.

Steve: So with the tracer token, in my experience of using Replication Monitor to go through and inject tracer tokens on a few dozen different publishers, it can be painful clicking through all the screens, and it sounds like you can do the same thing with T-SQL to just inject the tracer token. That’s one I wasn’t aware of, and it seems like it could really speed up the process of troubleshooting what’s working and what’s not working with replication.

Chuck: Yeah, I mean, the scripts that I have are pretty useful. They don’t typically go down to the subscriber level, though, so I have the typical monitoring of jobs and failures. The standard best practices that all DBAs follow for all production databases: you need to do that for all the components in replication, including your distribution database and your subscribers. All the replication jobs are basically kicked off as SQL Agent jobs, so you need to monitor for failures of those and be watching that.

Steve: So you mentioned these jobs are SQL Agent jobs, and it seems like there are several of them created when you initialize replication. What are all the jobs that are created, and what specifically do they do?

Chuck: So there is the log reader agent job, which is basically sitting on the database you’re doing the publication on, reading the log file. Then there is the distribution agent, which sits in one of two locations: at your subscriber or at your distributor. The reason is what we call the push and pull scenarios for getting the data to the subscriber. When you do a push, all the work is happening on your distributor server, and when you’re in a pull, it’s done on the subscriber side. So if you have a really gigantic environment with a dedicated distribution server and you want to offload some of the work off of it, you use what are called pull subscriptions. Basically think of it as: I’m the subscriber, I’m going to pull the data from the distribution server, so I’m expending all the work to figure out what data I need to grab, and I’ll go to the distributor and grab that information. The push scenario is usually for your smaller environments, and by smaller I mean you’ve got 5–10 subscribers. In that scenario the distributor is doing all the work and pushing that data down to the subscribers. So that’s one of the performance tricks: switching from push to pull.
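The push/pull distinction shows up directly in how the subscription is registered. A hedged sketch, with placeholder server and database names; note that a pull subscription also needs a matching `sp_addpullsubscription` call run at the subscriber, which is omitted here:

```sql
-- At the publisher, in the published database.
-- Push: the distribution agent runs at the distributor.
EXEC sp_addsubscription
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSERVER1',
    @destination_db    = N'SalesDbReporting',
    @subscription_type = N'push';

-- Pull: the distribution agent runs at the subscriber, shifting that
-- work off the distributor (the subscriber side still needs its own
-- sp_addpullsubscription / agent setup).
EXEC sp_addsubscription
    @publication       = N'SalesPub',
    @subscriber        = N'SUBSERVER2',
    @destination_db    = N'SalesDbReporting',
    @subscription_type = N'pull';
```

Switching an overloaded dedicated distributor’s subscriptions from push to pull is the performance lever Chuck describes for bigger environments.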

Carlos: Right. Now, what’s interesting is we talk about these jobs, and you mentioned reading from the log, right? The log is a sequential operation in the way that it writes. And then you’ve talked about workload. Now, obviously the answer is “it depends,” but as a generalization, is the reason replication gets out of whack that it just gets overloaded and can’t catch up?

Chuck: That is one of the scenarios, and if you’re monitoring properly you should be good. You could also have not picked the proper server for your distribution server; maybe it’s a poor performer, or doesn’t have enough RAM or CPU. Typically it’s an I/O problem. So you find out quickly that you have tons of transactions you’re trying to replicate and you’re overwhelming your distribution server. In smaller environments you can set it all up on one server if you want to; I don’t know why you would ever want to do that, though, because the whole point is to offload work off of your publication server. But you could overwhelm any of those…

Carlos: So getting back in there, I think you made a major point, at least in my mind, and that is the reason you would use replication: the offloading component. I think a lot of times people jump to replication because they’re trying to keep two systems in sync or something like that. Some of the newer features may be a little better at that, so you’re potentially using replication for the wrong reason. And then replication gets installed on that main server, and it just has that much more work to do to keep up with all of that.

Chuck: Yeah, that’s all true. I think people try to use it because it’s available on all editions. It’s like, “Oh, I can use Standard Edition and save a lot of money.” I think database mirroring can be done in Standard Edition with two nodes; I can’t remember now. But now in 2016, Always On availability groups can do a two-node kind of cluster.

Carlos: Right, though they don’t give you the readable secondary; you can’t read from it.

Steve: So jumping back a minute to when we were talking about the different jobs on the publisher, distributor, and subscriber. One of the tips I heard recently, and I haven’t actually tried this myself, was that for those jobs you go in and add a second job schedule, so if something goes wrong and the job fails, there will be a second schedule there that would restart it. Is that something you’ve heard, or is it a common best practice?

Chuck: No, I mean, replication does retry and retry, and the distribution agent will automatically retry when you hit a failure. If you’re automatically scheduling something like a snapshot and it fails, that could be a scenario where you might want multiple schedules. But, like you said before, it’s the proper care and feeding of your SQL Server: monitoring those job failures. I could see it in maybe a snapshot scenario, but I wouldn’t in transactional replication, because it will just continue to retry. Basically it retries and fails, retries a minute later and fails again, and it will do that forever until you fix the problem. And usually the error messages it provides are pretty useful.

Steve:  Okay, great.

Carlos:  Well, keeping in theme with that, and again the sky is the limit, but is there a general rule, percentage-wise, beyond that process just getting overloaded? Why does that stuff get out of sync?

Chuck:  Yeah. You know, I think it’s these random hiccups that happen when you have these bigger environments. The common error that I see, even in my environment, is that the row doesn’t exist at the subscriber. Well, why doesn’t it exist? So why don’t you just add it?

Carlos:  Right, exactly. You’re subscribing, you should get that row. That’s part of the process, right?

Chuck:  What the heck happened to that row? And so then you have to spend time figuring it out, because everything starts backing up once you have that one error. So you have to do tricks: you can skip that one row, or maybe you figure out what row it is. There’s an sp_browsereplcmds procedure that can get you information about the transaction that was being applied to the subscriber, and you can figure out what row it is and go manually add it. There’s also a command-line tool, tablediff.exe I think, that can do a data diff between your subscriber and your publisher, tell you what rows are missing or kind of screwed up, and actually fix them for you.

Carlos:  Interesting. So, to that point, with the row being missing at the subscriber, which is the common one I’ve seen, you’d be okay with just scripting that out and inserting that row?

Chuck:  Yep, just to make it go past that error. But usually I find it’s a group of commands that somehow didn’t make it to the subscriber. There are different agent profiles you can set up on basically all these jobs. The one I typically switch to in the short term, if I just want to skip errors, because I know it’s trying to update a row that got deleted for some reason and doesn’t exist, is the skip-errors profile. So you can set up the SQL Agent job to use this agent profile, as it’s called, change it to skip errors, and it will skip a bunch of errors. But then you really want to use that tablediff tool to figure out which rows are out of sync, and you can just manually update the subscriber. Sometimes it looks like somebody on the subscriber actually went in and deleted a row, but in reality I know that’s typically not the case, because I make the subscriber side read-only through security. So there’s no way it got deleted, but somehow the row is missing, so you have to go update the subscriber, and you have full rights to do it. You can do whatever you want on the subscriber side and replication won’t stop you. You can even make a setting that says, I don’t want to delete any of the data: when a delete happens on the publication, you can turn on a replication setting that says don’t delete this row, if you want that for your workload.

Steve:  Okay, something we talked about at a SQL Saturday recently, and just to clarify to make sure I have this right: if a stored procedure was updating a million rows, and you didn’t replicate the stored procedure call, it would send all million of those updates across through replication. But you could instead push the stored procedure call, so it gets called on the other side, and then it happens independently on both sides rather than pushing it row by row. Is that right?

Chuck:  Yeah. Yep.

Carlos:  Okay, so now help me connect the dots there, because in my mind I’m thinking replication is by object, right? I go in and I replicate this column or this table and whatnot. How do I set up, or how do I visualize, the process of saying: when this stored procedure gets executed, that’s what I want you to send down instead of the updates and deletes to this table?

Chuck:  Yeah. So you basically configure replication and tell it: I want to replicate the calls to these stored procedures. Honestly, I haven’t used that use case, because usually in my environment I can’t describe the work as one transaction; it’s related to other data within the system, or the data comes from somewhere else, so I’ve never been able to do it. But you basically just tell replication you want to replicate these stored procedure commands, and it will. If everything’s right, it will use that stored procedure call and replicate that rather than all the individual changes that happen to your table.
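In the standard transactional replication setup, publishing a procedure’s execution is an article type on `sp_addarticle`. A hedged sketch, with all object names made up for illustration:

```sql
-- At the publisher, in the published database: publish the *execution*
-- of a stored procedure instead of the row-by-row changes it makes.
EXEC sp_addarticle
    @publication   = N'SalesPub',          -- placeholder publication
    @article       = N'usp_CloseOrders',   -- placeholder article name
    @source_object = N'usp_CloseOrders',   -- placeholder procedure
    @type          = N'proc exec';         -- replicate the call itself
```

There is also a `serializable proc exec` type, which only replicates the call when it ran inside a serializable transaction; that stricter mode exists precisely because of the consistency caveat Chuck raises, where the procedure’s effect depends on other data that may differ between publisher and subscriber.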

Carlos:  Right. Well, I think that’s a pain point for a lot of people. And going back to the whole subscriber thing, checking when the subscribers have issues: say you have more than a couple, even just three. Now I have to troubleshoot one, the source system; two, my distributor, which you said should be another system; and then each of the three subscribers. That’s five servers I have to go and do something with, and I think that’s maybe one of the pain points around replication. Does it give you a good way to do all of that centrally?

Chuck:  Right. Well, Replication Monitor allows you to see all those servers and helps you with troubleshooting. You know, I was in an environment where we had like 8 publications and 30 to 40 subscribers with a dedicated distributor. We used to have Replication Monitor up on our NOC window, and people would look for errors and stuff happening there, but that was just unrealistic, and that’s why I created those monitoring tools: to look for those kinds of errors, or replication latency where you’ve got tens of thousands of commands that haven’t been pushed to the subscriber. It sends me an alert, and then every 15 minutes I can see the trend: if the trend is going down, everything’s good; if the trend is going up, something’s wrong, because replication will continuously retry even though it might be hitting an error. And if you don’t monitor that well enough, you come to the point of: it’s been broken for 48 hours, how come? You don’t want to be in that scenario.

Steve:  So with the transactional replication on the subscriber side, if I’m using that as a reporting mechanism to offload some of the work from the publisher, if I need specific indexes on the subscriber to help with whatever the reporting is doing on that side, do I need to add those on the publisher and replicate them over or can they just exist on the subscriber?

Chuck:  Yeah, that’s a great question, because that’s one of the most powerful features of replication: I can say either replicate the indexes that exist on the source publication or don’t replicate them. The great thing about not replicating them is that your publication server has its own specific workload and use case, and your subscribers are typically always different. What you can do in the replication setup is what’s called a post-snapshot script. It’s basically one file that can contain any SQL code, which will then create anything you want on the subscriber side. It is run after initialization of the publication; initialization is just the process you go through to publish all your data and get it to the subscriber. There are actually two scripts you can supply, a pre and a post, around that initialization. So what I typically do is figure out my tuning on the subscriber side, what indexes I need, and I just add them to that script. The script checks whether an index exists already and, if not, adds it to the table. That gets applied after your initialization has happened, so your table has been populated with BCP; that’s what’s used behind the scenes. Then it creates indexes if you told it to, and if you told it not to do the indexes, it will go to your script if you have one; you don’t have to have one. Then you just add them, and I think that’s the super hidden benefit right there for replication.
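The pre/post scripts Chuck describes correspond to the `@pre_snapshot_script` and `@post_snapshot_script` parameters of `sp_addpublication`. A minimal sketch; the publication name, file share paths, table, and index names are all placeholders:

```sql
-- At the publisher: point the publication at pre/post snapshot scripts.
-- The snapshot agent runs these around initialization at each subscriber.
EXEC sp_addpublication
    @publication          = N'SalesPub',
    @pre_snapshot_script  = N'\\fileshare\repl\pre_snapshot.sql',
    @post_snapshot_script = N'\\fileshare\repl\post_snapshot.sql';

-- Inside post_snapshot.sql: idempotent, subscriber-only index creation,
-- applied after the tables have been bulk-loaded via BCP.
IF NOT EXISTS (SELECT 1
               FROM   sys.indexes
               WHERE  name      = N'IX_Orders_CustomerId'
                 AND  object_id = OBJECT_ID(N'dbo.Orders'))
    CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId);
```

Keeping reporting indexes only in the post-snapshot script means the publisher never carries them, which is the hidden benefit Chuck points out.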

Steve:  Oh yeah. It really seems like that could eliminate a lot of the replication bandwidth by not having to have those indexes in two places and synchronized over.

Chuck:  Yeah. I mean, it’s not going to replicate over all your index changes anyway; it’s just whatever happens in the data tables themselves.

Steve:  Right. But on the initialization, if you’re replicating the indexes, it’s going to have to bring over the whole index, right?

Chuck:  No, it won’t transfer that data. It replicates out the script of what the index is, and it will create it after the fact.

Steve: Okay. So then you’d use this post-snapshot script if the indexes were only needed on the subscriber side.

Chuck:  Yeah.

Steve:  Wow. Lot of great information here. This is good stuff.

Chuck:  Yeah we only scratched the surface.

Steve:  Oh yeah.

Carlos:  Oh yeah, that’s right.

Chuck:  There’s so much. It’s crazy, the amount you can get into. And just to carry on that earlier thought of having those three subscribers: you’d typically have those behind a load balancer. So I have this common name, like, this is my order processing system, and my application stack just points to that one load-balanced name and gets the information it needs from behind the load balancer. Then when I have maintenance, I can tell the load balancer to take one server out. That allows me to create a new snapshot, and then I’ll pause all three of them. So all three will become stale, but once I do a new snapshot, a new initialization, everybody tries to grab it and pull it down, so you’d cause an outage if you didn’t pause the distribution agents. So then I work on that one server, get the snapshot applied, and make sure the indexes are applied. Then I add it back to the load balancer and take my next server out so it can get up to date. And I may decide that I’m going to point everything at the one server that’s now up to date, or I may have one stale, one up to date, and one in maintenance. That’s kind of what it looks like in a bigger environment.
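A sketch of the pausing step Chuck mentions, assuming the common setup where each distribution agent runs as a SQL Server Agent job on the distributor (the job names below are placeholders following the typical auto-generated naming pattern, not real agents from the episode):

```sql
-- Hypothetical: pause the distribution agents before re-initializing,
-- so subscribers don't all try to pull the new snapshot at once.
USE msdb;
EXEC dbo.sp_stop_job @job_name = N'PUB1-OrdersDB-OrdersPub-SUB2-1';
EXEC dbo.sp_stop_job @job_name = N'PUB1-OrdersDB-OrdersPub-SUB3-2';

-- ...apply the new snapshot to the out-of-rotation subscriber,
-- verify the post-snapshot indexes were created, then resume...

EXEC dbo.sp_start_job @job_name = N'PUB1-OrdersDB-OrdersPub-SUB2-1';
```

Stopping the Agent job simply halts delivery; the distributor keeps queuing commands, so the paused subscribers fall behind (go stale) rather than break.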

Steve:  Okay, great. Well shall we wrap it up with the SQL Family questions?

Chuck:  Sure sounds good.

Steve:  So Chuck how did you first get started with SQL server?

Chuck:  Yeah. I didn’t really pursue databases as a career until that dotcom crash in the early 2000s, when unemployment paid for my MCDBA certification classes. Before that I was completely scared to write SQL when I was doing Access work; I avoided it.

Carlos:  Sure.

Chuck:  And after some of those classes I was like, oh, this isn’t so bad. So I did some consulting work on SQL 7 and 2000. Then I became a contractor at Microsoft supporting SharePoint, just the worldwide internal SharePoint farm, and worked with some guys like Mike Watson, who are kind of well-known players in that space, and they looked to me as the go-to SQL guy. I’m like, “Holy crap, I’ve never done SQL clustering before, or load balancing stuff.” I’d just done that MCDBA course, and that kind of gave me the bug, so I became a developer after that. And then I had a boss who said, “No, you can’t be the DBA, because I don’t think you’ve got the skills to do that, so just keep doing your SQL development stuff.” So I basically quit and found a new job doing SQL developer and DBA work. There’s a hidden lesson there for managers: it’s pretty easy to drive off a developer, or any ambitious person looking to learn new things, by telling them no. At my next gig, I worked for 7 years as a DBA and a DBA manager and supported just a gigantic replication farm.

Steve:  So, doing all the things your previous boss told you you weren’t qualified to do?

Chuck:  Yes. And I even got picked as a top 5 finalist for Redgate’s Exceptional DBA Award. It was kind of the ultimate response to “no, you can’t do that”: okay, sure, I can totally do this, it’s not a problem. So now I’m a data architect, so I’ve kind of gone the spectrum from developer to data architect, but I’m still a DBA as well. It’s tough to find good DBAs.

Steve:  It’s really interesting how telling someone they can’t do it will motivate them to just be wildly successful at doing it somewhere else.

Chuck:  Yeah. And I’m really thankful he did that, because I don’t think I’d be where I am today otherwise. Yeah, I love being a DBA. It’s a great job.

Carlos:  Now as great as SQL server is, if there was one thing you could change about SQL server, what would it be?

Chuck:  Only one?

Carlos:  We’re going to take your top one today. Maybe the next time you can, we’ll dig deeper.

Chuck:  Yeah, right. I think we need to get licensing more in tune with modern hardware specs, because, you know, SQL Express caps you at 1 gigabyte of RAM and a 10 gigabyte database. I mean, you’re kidding me. You might just as well say, “Go to MySQL, because we don’t really care about you.”

Steve:  Yes. There are very few environments where SQL Express is actually a good fit.

Chuck:  It’s useful for some things, right. You can even use it in a replication environment, which is kind of cool. But I mean, my phone’s got more power and capability than SQL Express allows. And then Standard edition is kind of the same way. I mean, you’re limited to, what is it, 64 gigs of RAM, and they say you can use more outside the buffer pool in the later editions. But still, it’s like it was designed by non-technical marketing people, with no rhyme or reason to where these numbers come from. So if we’re going to limit it to one thing to change, I think we’ve got to fix that.

Carlos:  Yes.

Steve:  Okay.

