Compañeros! Welcome back to the SQL Trail. How would you feel if you could automate all of your mundane, routine SQL Server tasks? If you’ve ever wondered how to get started in PowerShell, then this episode is for you. In Episode 51 of the SQL Data Partners Podcast I chat with Janakiraman Sattainathan about learning PowerShell the use cases where its implementation makes a huge impact. Jana was able to migrate over 20 terabytes of data and 4 thousand tables using a PowerShell script he wrote in just two weeks.
Episode 51 Quote:
“[A] lot of DBAs out there already do that. They download partial scripts that are available online and they run them as a one-off thing. When they need it to do that one specific thing they won’t come up with it. But once you pick up a book and start reading through the scripts, then it will all make sense and then you can tweak things a little here and there, add things, and make your own and that’s the way to do it. “ – Janakiraman Sattainathan
Listen to learn…
- How Jana went from PowerShell “kid” to programmer in two years
- Why SSIS isn’t enough for large automation tasks
- How to discover new use cases for PowerShell
- Why a collection of scripts that took two weeks to build beat out all other tools
- How Jana migrated over four thousand tables using PowerShell
As always, please join me on the trail by starting a conversation below or connecting on Twitter. If this podcast is helpful to you, please leave a review on iTunes so others can find it as well. Subscribe via iTunes, Stitcher, and Podbean.
Show Notes
Resources
Jana on Twitter
Learn PowerShell in a month of lunches
Questions About PowerShell Basics That You Were Too Shy To Ask
Microsoft Powershell Course on MVA
True Multithreading in PowerShell
Windows PowerShell Basics (MSDN)
About Janakiraman Sattainathan
Jana is a production DBA for Scor and lives in the Charlotte North Carolina area. He is certified in both Oracle and SQL Server. He is a database and PowerShell aficionado. Follow him on LinkedIn.
*Untranscribed introductory portion*
Carlos: So Jana, welcome to the program!
Jana: Thank you so much Carlos, thank you for having me.
Carlos: So we’ve gotten to know each other through the SQL Saturday and the PASS organization. We’ve bumped into each other at several different events, and as we were talking you mentioned to me a kind of interesting scenario that you had in your work environment and that you had wanted to share that with others. I think ultimately today our conversation is going to revolve a little bit around around PowerShell, which we have talked a little bit. We had Mike Fal on the show earlier talking a little bit about it. But today we’re going to be talking a little bit about some of the use cases that you’ve seen where PowerShell can be useful where you have to administer data. So I gues, let’s jump into that. When you think about PowerShell and how you’re using it in your environments, what are the use cases that you’re now using PowerShell for?
Jana: Well, let’s start with a question to you. So what is a common situation in your job where you’re doing something that is repetitive and manual that’s killing you every day?
Carlos: Wow, repetitive and manual. There are many administrative type tasks that are repetitive, and hopefully we have some kind of scripting for us. I think about backups, I think about index care and feeding, you know, looking at our database corruption checks, things like that. And of course, any kind of monitoring potential.
Jana: Exactly. Those are perfect candidates, but those are the usual suspects. There are a lot more that could be done. Mundane stuff that we don’t even think could be automated but we keep doing it manually, and those are good candidates for automation. In my case, we have about four new instances at work and these systems make up about 125 databases or instances in SQL Server, 200 in Sybase… we also have MongoDB and MySQL. I don’t claim to know anything about those. SQL Server and Oracle those are the two things I know. Sybase I could do some things because it’s similar to SQL Server. Managing that many instances with any tool is difficult. That’s why we are using PowerShell. So we have the past three of all these database names what kind of application information to use, and from PowerShell we can hit all instances as if they were just one.
Carlos: I see. So basically, you mentioned 200 SQL Server instances. I type one script and apply it to the 200 servers?
Jana: Exactly, so if we wanted to get the same information, for example all of the dev instances updated, 200 plus instances, then I could just issue a command saying “select a version” and I say “against all these” and then I’ll get back those servers. And that is a good use case.
Carlos: So there’s no question that there’s value there in being able to administer, especially as you have that number of instances. One of the bars of entry for a lot of folks, I am in this camp, where I came from more of an admin side. I was a network guy at one point in time, and then kind of came over. And then you think about the scripting components and maybe from a developer’s perspective, you have a slightly easier way of getting involved. I guess, what was your experience in getting into PowerShell so that you could learning it and so you could then apply it in your environment?
Jana: Well, frankly I am a kid when it comes to PowerShell, because I started learning PowerShell in December of 2013 and I really started PowerShell only in 2014. I used to go to sessions, I believe the first edition at the time, and they used to show all these cool demos and it was very inspiring at the time. I would come back and sit down and plan to do something and the max thing I would do is the equal [command]. And that’s it. And I didn’t know how to go to the next step. And then I figured, “maybe I’ll fix something that we do manually and maybe we’ll augment that.” And I picked up a book, a nice book, entitled, “PowerShell in 30 days”. And this book was a fantastic book. So I started in day 1 and spent an hour, and the next day spent a chapter 2, and after 3 or 4 days I couldn’t wait. I just went from chapter 6 to chapter 30 in the next couple of days. I worked through the examples and at the end of it I was really comfortable, because it started from- it assumed no knowledge of PowerShell. And then I bought the next book in their series, “PowerShell 2 in 30 days”, which is a power advanced version of the same. And after that, soon I was just in the middle of everything and I was hooked.
Carlos: Another interesting component is that there is no replacement for getting in and getting your hands dirty. Now the community has done a good job of making some things available, and there are pieces of scripts almost like T-SQL queries that are out there but it’s ultimately up to the individual to take the time and figure out how all of the pieces fit together.
Jana: Yeah, a lot of DBAs out there already do that. They download partial scripts that are available online and they run them as a one-off thing. When they need it to do that one specific thing they won’t come up with it. But once you pick up a book and start reading through the scripts, then it will all make sense and then you can tweak things a little here and there, add things, and make your own and that’s the way to do it.
Carlos: And so it’s 2014 and you’re kind of getting into PowerShell, and this leads up into an interesting dilemma or issue that you had in your work environment.
Jana: Small things I had been updating and I was getting really comfortable and I have the confidence that I can do anything that can be done in PowerShell because I think it’s all docs and classes. And we have an on-going migration which is about to be completed in the next couple of weeks and it involves moving from Oracle to SQL Server. We initially manually had to move 200 tables and about 13 million rows. At the end of everything it turned out to be 35 billion rows and 4,000 tables, so you can imagine how it just kind of ballooned. At the time, they had a consulting company come in and they were trying to work on some things and the end result wasn’t usable. And so my boss asked me to step in and do something. So in two weeks we had a prototype in PowerShell and it ran for two weeks in two puny VMs, but we could get it done. We initially named it “PowerPump”. It was simple. It was a list of tables that we needed to move: here’s the source, here’s the destination, in a CSV form. This thing would just look through the list and for each of the tables it would further divide them up into chunks and say, it’ll load five chunks at a time. Soon I was tweaking it a little bit here and a little bit there.
Carlos: So I’m guessing the source was Oracle and the destination was SQL Server. And that’s like a 1 to 1 mapping, so part of this, at least from my experience, is “Why wouldn’t I use SSIS for something like that?”
Jana: That was what was attempted. They built an SSIS, but the problem is that the manageability aspect of it.
Carlos: So how do you toggle on and off, and get those chunks?
Jana: That is one aspect, but also you have the threading. Well, SSIS doesn’t have a great threading mechanism. So you can spin your [inaudible], but at the time there wasn’t anything that you could use to thread SSIS.
Carlos: So if you were to estimate the number of lines of code, what are we talking about?
Jana: I would say about 20 thousand lines.
Carlos: [laughing] So that’s substantial, right? That takes a little bit of, two years of PowerShell to get in there and say, “I know it and can make these tweaks and get under the hood and make it happen.”
Jana: They’re all like this and start small. It’s all functions. So you start with a little function that says, “Give me a connection to R.” I get the connection, that’s the function. “Give me a connection to SQL Server.” You get the connection, that’s the function. So now you have two or three sets of functions that say, “Give me a connection to SQL, Oracle..” and then I stack them up and say, “Give me a DB connection,” and I give you a database type. So similarly, instead of using a DB command to get a DB result, and then all these things I had about 50, 60 calculations sitting at the high level but 50-60 at each table level, and you start building in all of these. And once you have that, then the business is like, “Yeah, you say you’re copying everything but we don’t have a clue and we want to test it.” So okay, how can you prove that you’ve copied everything? So then I came up with a process to validate it.
Carlos: So ultimately, I think at the end of the day, with PowerShell I can get my objects, I can extract some of the difficulty, and break that into small pieces, and then test it.
Jana: Yes, absolutely.
Carlos: So then in your case, you were able to migrate 20 terabytes of data, 4 thousand tables, over to SQL Server.
Jana: Yes. And that was all in PowerShell.
Carlos: And how long would you say it took you to put all those scripts together?
Jana: We always had a working version from the get-go, but it wasn’t fast.
Carlos: You mentioned that initial two weeks. So after two weeks you had something that could move the data from Oracle to SQL Server?
Jana: Right. So it ran aground at that point, and I had to go and accept issues and clean up the data and restart it from where I left off. It was all manual.
Carlos: Well that’s a neat user story. And use case. Very cool. Should we do SQL Family?
Jana: Yes, absolutely.
Carlos: So of course one of the things I like to talk with people about is their favorite tool. We’ve been talking a little bit about PowerShell, and that’s okay if that’s your favorite tool. But what’s your favorite SQL tool and what does it do for you?
Jana: My SQL tool, my favorite SQL tool, is something that I built. And the reason I say that is my favorite tool because there isn’t one in the market like it that I could use and I wish I could show it to you. So the tool is like SQL Server CMS, but it’s a CMS on steroids. I built it for both SQL Server, Oracle, and Sybase.
Carlos: I see, interesting. If you could change on thing about SQL Server, what would it be?
Jana: I know the community is going to hit me for saying this, but…. Be more like Oracle.
Carlos: Agh! Yes, that’s the death sentence.
Jana: In a lot of ways, 2016 is changing the game. But in a lot of other ways, T-SQL is way behind compared to PL-SQL, which is the native language of Oracle. And the partitioning, we only range partitioning.
Carlos: So what’s the best career advice you’ve ever received?
Jana: I’ve received a lot of advice, I’m not sure which is the best one. I went to one Steve Jones’ sessions, and it was about blogging and giving back at the same time making yourself known in the community. And I attended his session and only after that did I start blogging.
Carlos: Our last question for you today, Jana: If you could have one superhero power, what would it be and why would you want it?
Jana: The one thing I would love to do is to do time travel. You could kill two birds with one light travel and I would live to see how things will be.
Carlos: Sure.
Jana: And that’s one reason I love dreams, not the dreams as I’m seeing something, but lie down asleep and you dream something.
Carlos: They go anywhere.
Jana: Yeah, you can go anywhere to places and oddly enough, you can even control your dreams. Let’s say you go back to sleep and try to continue that same dream. You can’t. And time travel is like that. It’s a dream. That’s one thing I would love to do. And the other thing is, once you’ve seen the future, you could go back and buy some of those stocks.
Carlos: That’s right. Well very good, well Jana thanks so much for being here today.
Jana: Thanks s
[…] SQL Data Partners Podcast Episode 51: A PowerShell Story (Carlos Chacon Jr.) […]