It is easy to feel like the odd man (or gal) out in the SQL Community. We are constantly hearing about new features, products, and architectures and our companies may not always have the same enthusiasm when we describe what could be.
This is probably even more true in the Analytics space as much of the marketing has been pointed in that direction. Even PASS tried to create a conference just for Analytics. While there are some amazing new technologies out there–we might–GASP–not need them in every scenario.
Our guest today is Kevin Wilkie, a BI architect with Innovative Architects and he talks to us about some of the tools he uses and spoiler–they aren’t all that fancy. He also talks with us about how he manages the tech and business users so each feels comfortable and each can get their work done.
“Excel’s that tool that’s sort of magically hiding in front of everyone.”
Listen to Learn
- Some of the tools Kevin uses in his analytics role
- The criteria you should use to determine whether you should use a new tool
- Why using simple tools might hold an advantage over ‘better’ tools
- The data profiling task in SSIS
About Kevin Wilkie
Kevin Wilkie is BI architect with Innovative Architects. He is a community participant and helps organize SQL events in the greater Atlanta area. He is a graduate of Southern Polytechnic State University. He also gives the SQL Data Partners podcast two thumbs WAY up.
Transcription: BI Tools
Carlos: Okay, well Kevin, welcome to the show.
Kevin: Thank you Carlos. Good to be here.
Carlos: Thanks for coming on with us today. So compañeros for those you on the east coast, you might have run into Kevin, he’s just out the Atlanta area. A bit of a big, I’ll say supporter of the show, particularly, kind of early on getting some supportive feedback from Kevin was helpful to me. I appreciate now, that we’re able to have him on and talk a little bit with him about some of the things that he’s doing. Ultimately, I think our idea today, or our focus is going to be that, the analytics environments, have started to take over more and more of the attention that the data world is getting. Lots of focus is being placed there, and I think as administrators we want to be able to start looking there, how do we potentially move over into the analytics world, and what tools are those analytics folks using. Ultimately, I think that’s where our focus is, tools, analytics, how and when to appropriately apply them.
We talk about these tools, and ultimately we’re talking about the Microsoft stack, right? These are going to be tools you’ve heard of and probably installed before. Let’s just go ahead and start that conversation, Kevin, why don’t you talk about some of these tools that you’re using for data analysis.
Kevin: Thank you Carlos. I think one of my favorite tools, since I’ve been doing SQL for years, is SQL itself. I just get into the actual SQL itself to play with the database and that simple queries just so that I can see the overall form of the database of many of the tables. Don’t get really into the nitty-gritty as much, I try not to in SQL Server itself, but I like to at least the overall form, we see things like the foreign keys, the primary keys, any constraints that the table will have. You know kind of where the data will lie. That’s where we’re going to try to stay, just at that level with SQL server itself, and for data analysis in the overall distinct values there are, how far the field exists in the table, like looking in the max, or things like that.
Carlos: I think maybe, let’s just step back just for a moment, and talk about, we’re talking about analytics, you mentioned you like to spend your time in SQL Server, I think that most people would say that is true. Ultimately, when we need to start looking at tools, is either because we have either questions that we want to ask the data, that we can’t get currently. Or we want to look at the data a different way, or present that to other people so that they can review it and ask questions of that data. Right?
It’s interesting, you went from the SQL Server, and then kind of into SSIS, but you used that as more of an ETL, it almost sounds like a data cleansing, or a data integrity operation, rather than moving it from one place to the other.
Kevin: In this case, you’re right. It is just, for this one SSIS is fairly good at just, like the task itself says, its profiling data. You’re just seeing the basic outlines of how far it can go from min to max, maybe a few averages and standard deviations, but not overly heavy on what you can do with it, and how much you can see with it.
Especially, the biggest thing I see is how many nulls were in the table, because if you have a four hundred thousand row table, and three hundred and ninety eight thousand are nulls, usually the business needs to know that, we need to rethink why that field exists in the table.
Carlos: Sure. There’s not enough data there to validate, or use that as a valid option.
Steve: Just a question around that Kevin, I know the data profiling task in SSIS that you mentioned, that’s not something that I have myself used in the past, but the things you’re describing there, are things that I would normally have done in SQL Server and TSQL. I’m just curious, what are the benefits of getting it that with the data profiling task, rather than just running queries to do it?
Kevin: Because with the data profiling task, I can do it a little bit faster, and I don’t have to do as much work … As much work to actually seek for each, let me back up … For example, if I wanted to do the min/max, standard deviations, and averages, I have to do a little bit of work on a number, for each field, I have to write different equations for SQL.
Kevin: Whereas, data profiling task, I can do it in one shot.
Steve: Okay, that makes sense, sounds like something I need to go try (laughter).
Carlos: Now also in your role, you mentioned being able touch SQL server, and then also be able to use SSIS. Now generally, while that line is getting thinner and thinner, there has traditionally kind of been a border between, the administrators, or the ‘keepers of the data’, and then what we’re going to allow them users access to. Now, we think in analytics, we’re talk about end users and I guess there are the Power BI folks we see in the community, but generally, we’re think of our business users. Any heartache with giving them access to SSIS? Or is that something more purely for the administrator in kind of a prep move?
Kevin: It’s really a tool I use when I’m setting up, or doing the beginning phases of a warehouse. I don’t really like to go, I wouldn’t like to go a normal user access, because there obviously are multiple tools that the SSIS world can do. And even SQL sometimes will intimidate the normal user too much, that shows us run away.
Kevin: I like to keep these two tools are really the IT use. If you really know what you’re doing, these two tools are great, but for the normal everyday user, not so much.
Steve: Okay, so then, if you’re considering SSIS and then TSQL, and SQL server for the IT side of things, and then the normal users come along and they want to be able to go and do some analysis, what are you recommending, or how are you going about doing that with other tools?
Kevin: Well, there are two tools, well two branches of users, I like to call it; the more advanced users, I would have in Power BI, Tableau, something that there’s a lot of features, almost IT, but not quite. IT can still have security on it so they can make sure they’re not going crazy, and adding a bunch of stuff that they shouldn’t be, or even looking at data they shouldn’t be. Where in both those worlds, you can still do 98% of everything you’ll ever want, at least for the Power users, but, also, the second class of users, will mix in with the Power users because, I like to give them a tool that they really know rather well, it’s Excel.
Carlos: (Laughter). Oh boy, we’re going to have to wash your mouth out with soap now Kevin.
Steve: Excel’s that tool that’s sort of magically hiding in front of everyone.
Kevin: Yeah, but it has a lot of features that everybody’s gotten use to, everybody can play with, but the sad thing is everybody likes to think they’re a expert at it. With some of these tools, you really, really do have to know what you’re doing, but at least you’re in Excel … You can play around some other ways and still get the same data, but there are lot of easier ways. Even, VLOOKUP for some people at performance, automation, right? Repeatability. Now all of a sudden, we think about Excel, I don’t think about any of those things (laughter), in that term.
Can you give us some thoughts, or under what circumstances should we be using Excel, versus all these automated things we like to be using?
Kevin: I like to use Excel, for more of proof of concept. To actually see, “oh yeah, we can do this. This makes more sense this way. I have only like twenty-thousand rows.” Excel is great for that.
Kevin: Especially if it’s already in a spread, in less space, it’s 99.9% of our stuff is spreadsheets. Why did I take it out of Excel and put it in SQL Server, massage it a little bit, then run through with it? Let’s just go ahead and do what we need to in Excel, that’s what people want, let’s give it to them in Excel.
Carlos: You know, one area I do think that we could probably be a little, or we might leverage some of those Excel users, and that is, in that data validation parts. Because the business users are coming to us with these reports that that want written, they’re going to be showing us that data in Excel, and I think we could potentially help leverage that to say, “well look, this is the process that I went through, let’s validate that this is, I guess, maybe before I build it, let’s validate that process with them. Right? I take this data, I do this to it, I apply this, and then I add that, and this special sauce, and then bam! There’s my number.” I think that could potentially help accelerate that report development process.
Kevin: Right. Heck, even Excel itself can do a lot of what-if scenarios. It can do the T-test, it can do a lot of Nola tests that more of the data analysis sites will use. You can do all your tests there without having to go to ‘R’ or choice is.
Steve: When you’re talking about Excel for data analysis at that point, do you find that the Excel Solver tool is being used quite a bit for that?
Kevin: I’ve used Solver, not as much as I’ve used the data analysis itself.
Kevin: For those who don’t know, both data analysis and Solver are just very simple add-ins that are added, are simply Microsoft Excel add-ons. They’re almost loaded straight out of, from … When you loaded Excel, but it’s like one or two clicks and they’re there.
Steve: Right, so for instance, I’ve seen the Solver used in data analysis to run the what-if scenarios, where it has to work through a number of variables and figure out, “well what if we adjusted this in this way, or in that way.” I’ve seen different times, trying to port that same kind of thing into SQL Server, once somebody’s figured that out in Excel and finding that that can be very challenging.
Kevin: Yeah, it’s not simple because Excel is made for statistics and numbers, it’s made, “if this variable changes, and this variable changes, and this variable changes, what’s the one, I can get the most I can get out of this other number?” Excel works great for that. I used to do that SQL, it would take a little while to figure that one out, because it changes for each scenario.
Steve: Yup. As we go through this, we’ve talked about different tools, and we’ve mentioned SSIS, and Excel, and SQL, what else do you see being used for this analysis side of things?
Kevin: We’ve started to use SSRS, especially the new SSRS 2016. It has, the great mapping capabilities that more and more of our clients are starting to really like, because they’re not just seeing the data that’s actually in their little area, these companies that want to do more and more stuff outside of Georgia, the south, United States, they want to do stuff all over the world, so it’s great so they can see, how is their product being bought, in Atlanta, in Richmond, in Seattle, wherever. They can do all kinds of analysis even down to the, at least the city level, I’m sure some of these people have done it even further down than that.
that you are in the architect role, at least that’s one of those hats that you wear, but how much of a learning curve is, building SSIS reports, particularly ones that come from stored procedures, I think are kind of the very, I won’t say basic, but that’s like, a lot of people cut their teeth on those kinds of scenarios. Right?
Now going to a report that you mentioned that includes the geography components, what kind of learning curve is involved there?
Kevin: For SSRS, at least for me, there wasn’t a huge amount, because for the most part, once you have the city, most of the data will have cities, states, zip, at worst it will have … Most of our stuff will have that data in there …
Carlos: Mm-hmm (affirmative)
Kevin: You’re good. Thankfully SSRS picks it up, runs it with it.
Carlos: Oh wow, okay, so it’s really just, I like this type of map, here are my columns and there you go, I can see the data.
Kevin: Yup, because it’s all built in and going for bangs, it’s great for that.
Carlos: Got you.
Steve: So it’s basically using Bing as your presentation engine, wrapped in SSRS?
Kevin: Yes, that’s exactly what it does. At least from what I’ve been able to tell, it is using Bing to just roll the data out there, and it figures out “oh this is where Atlanta this is, this is where Richmond is, this is where the counties around it are.” Figures it out rather quickly, much more quickly than I would expect.
Carlos: Now another tool that I’ve heard talked a lot about, and I’ve heard scenarios, but I haven’t, it seems like only in very niche situations, or with people that really know their stuff, have been able to get Master Data Services to work well. Am I off base there, or why is there option of Master Data Services.
Kevin: I wish I could tell you (laughter). I think there’s a couple things going on with Master Data Services, that makes it where people don’t want to use it as much. First part is, it can be highly taxing on the actual business user themselves.
Carlos: Mm-hmm (affirmative)-
Kevin: Because they have, somebody has to be on it every day making sure, “yes, this one is a real valid answer, let’s put it into our data warehouse. No, this one really means this.” These are the people that are the think we’re actually pushing it as much as we should be, because, especially well, let me back up, with the small ten, fifteen people shop, there’s really not a big use for this. I wouldn’t think, there are probably scenarios where yes it would be great, but for the big, major companies, and some of the groups that are within it, it can be useful, but I don’t think that the people that are in charge, see the value as much as it should be.
Carlos: Okay compañeros, I just want to ask a question to all of you, would you be interested in hearing more about Master Data Services? Again, I’ve talked with lots of people that are mostly product managers, so if you’re interested in us doing an episode on that, let me know. I’ve kind of shied away from it, just because, again, it kind of seems like you really have to spend quite a bit of time to get up to speed on it. But if it’s something you think we should cover, let us know, hashtag SQL podcasts.
Steve: It’s interesting on that one is that, with Master Data Services, it seems like there’s two categories of people I’ve seen. There’s the people who swear by it, and it’s the best thing they’ve ever used. And there’s the people who swear at it (laughter), and it’s the most painful thing they’ve ever used. Maybe it’s that there’s some misunderstanding there, but I think that there’s certainly a lot of value there that people could, if you have to do the data cleansing side of things, you could get with that.
Kevin: I agree. Especially since, it’s been out since 2008, R2, it’s been available for a rather long time, more and more people should be at least heard of it, like it enough to at least try it, but it’s just not happening, for whatever reason.
Carlos: But I think some of that, it needs to maybe get a little more love from Microsoft, or from an interface perspective, it still feels a little bit form-ish, and I think that may be a part of it as well.
Kevin: Now Carlos, you’re not talking about the extremely early versions of it, it was, I agree completely bad, 2008 R2 was horrible. Twenty, well was a little better, I haven’t seen the 2016 version of it, but I believe they’ve updated it some.
Carlos: Okay, I admit I haven’t looked at 2016, I’ve heard that they’ve been doing some updating, but I haven’t seen any screenshots.
Kevin: Yep, my wife is going to Leblanc, who’s also from the Atlanta area, on earlier, talking about Power BI, and of course that continues to change. I guess, thoughts about introducing that, or making that part of the repertoire of analytics.
Kevin: I think, just like everything else, you have to play with it, you have to work it in, once you see how, I don’t want to say life-shattering, but it can really help your world by using Power BI, or all the tools that are available, especially even MDS, just to see if it’s available. Test it, play with it, see if it works for you, some of these things just won’t …
Carlos: Mm-hmm (affirmative)
Kevin: But others will, and you’ll see, it’s much easier once you actually start playing with it, and know what you can do with each of these tools. That’s why I’m encouraging, just try some of these things, try different scenarios, see what this works, see what works, see what doesn’t.
Carlos: I feel like some of the rub with the tools as well, is that, they don’t … It takes a cross functional team, right, to make that happen. We talk about wearing different hats, and some people can wear all the hats, but even in smaller organizations, there is some divide, you’re going to have to share that workload, and I think where we run into some heartburn is that, not a single person is going to know all those tools, or able to do all of those steps for the analytics. I think maybe, that’s the way sometimes, at least on the administrator side, we tend to balk at it a bit.
Kevin: Agree. I think a lot of the DBA’s I’ve talked to, they want to, it’s not just DBA’s, it’s any programmer at all will try to say, “oh, I do this, I am good, I know what I’m doing, I can do it all.” Especially in the data analysis world, you really can’t, there is so much stuff out there, you may have a tool that can do the job, but it may not be the best tool. But you can do it, and you can get around to it, so you still like, just like SQL, you have to have friends out there who know different areas, who can help you with whatnot. That’s one of the things, sometimes you will get just try talking it out with them, sometimes that’s the best help.
Steve: Yeah, that’s a good point. On the different tools, I kind of look at it as the tools that somebody might have in their toolbox. You got, maybe you got a hammer, and a drill, and a screwdriver, and Excel, and Power BI, and SQL are kind of the equivalent of those, where one of them might be the right tool to do one job, but maybe not the right tool to do another job. You use a screwdriver to screw in a screw, you could probably get that same screw screwed in with a hammer, or at least hammered in, but it might not be the best way of doing it. Sometimes with Excel versus Power BI, or the other tools, it’s kind of the same kind of comparison where you might be able to do something really quick in Excel, that might take you more time in Power BI or vice versa. already there, just do it there and be done with it.
Steve: Exactly, and I think there’s lot of the business users out there, who may not have that SQL Server experience, but if they can pull the data into Excel, then they can use it in an area that’s comfortable for them.
Kevin: Not even bother with the IT department which …
Kevin: Somebody has thirty thousand things going on, one thing that’s easier to do will be helpful.
Steve: That’s a very good point.
Carlos: (Laughter) Okay, so last thoughts, or as we kind of wrap this up.
Kevin: I just want to encourage more people to actually play with different tools that are available. Don’t just, “let’s play with SQL Server, we can do it all on SQL Server.” Now you have thirty thousand tools out there, not just what is normally setup for the IT, but for, you have Excel, you have a lot of open source things out there that you can get and use, that use SQL Server, or Bing to get better. Great tools, lot of people have put a lot of time into some of this, and some of them may be quicker than your regular tools you normally go for. I mean, even at worst, you’ll learn something new about yourself, or about the tools, or what scenarios it can be used, and what it shouldn’t be used for.
Carlos: Some of that I think though, has to be, it comes from a business culture perspective, because I think, like you mentioned, the business users are going to bring us an Excel file, or they’re going to bring us some handmade report, if you will, that came from who knows where, and are going to be like, “I want this.” So I think part of that is, maybe a little interpersonal, or personal relationship skills, to say, “you know what, let’s talk about this, let’s see if we can’t play with this other tool, or look at this in a slightly different way to get where we need to go.”
Kevin: Right, in that case, you really have to work with your business team. If, sometimes I have work with teams to, this is always how it’s always been, and this is how it’ll always be, I need it copied over at Thesaurus. I want a report everyday, at such and such time, do not talk to me about any other things, this is what I want, and this is what I need. End of story.
Kevin: And then there’s the one’s you can work with.
Carlos: And there is an advantage, at least, to having the users know what they want. Right?
Kevin: Yes, and know how to get there. That’s great help to all of us.
Carlos: Right. Okay, should we do SQL family?
Kevin: I like it.
Steve: Yes, let’s do SQL family.
Carlos: So Kevin, how do you go about keeping up with technology, and all the changes that are happening, especially, in the SQL Server area?
Kevin: I read lots of blogs. I have a old school, I have a RSS feed, lots of those, and listen to lots of webcasts. My favorite is just SQL Server user groups. I learn probably more through those than I will through any amount of reading, or anything I can think of to get all the information that I could ever want. Plus, it let’s me see friends that I haven’t seen in quite some time. User groups are the number one way of getting, my way of getting data. It’s everything.
Carlos: If you could change one thing about SQL Server, what would it be?
Kevin: All right, this is where the DBA out of me comes out. The auto close feature, that is to me one of the dumbest things I have ever seen out there on SQL Server. So many of the brand new DBA’s will try to say, “oh, let’s just hit that, and we’ll be fine, we won’t ever have to use the database again.” Oh my God, it would drive me insane when I would start looking through the logs, and I start seeing “Closed, closed connection, closed connection” What? No, let’s just get rid of that feature once and for all. I don’t want to see it at all, I don’t, I’ve never even seen why it’s even there.
Steve: Good point. Okay, so can you tell us about the best piece of career advice you’ve ever received?
Kevin: This was from my wife, who listens to this podcast along with me every week. She actually said, early on, even when we were starting dating, she told me that, at night, how did she put it … No matter what happens, as long as the systems are still running, and you’re just doing performance tuning, it’ll all still be there in the morning. Because, like all SQL developer, well auto-developers period, we try to get two or three more seconds faster for all of our short procedures. It can wait, don’t just stay in the office until nine, or even your own work at home office, don’t stay down there until midnight, or two in the morning. See your kids, see your family, see somebody. Don’t just stay in the office and work yourself to death. Get out.
Steve: That’s a really great point, and I think that, when people do get out, or have something else outside of just their work life, they end up doing a much better job in the time that they’re there doing their job.
Steve: Very good point.
Carlos: Our last question for you today Kevin, if you could have one superhero power, what would it be and why do you want it?
Kevin: Teleportation, yup. I notice y’all have been lately talking about “I want to be able to fly.” Oh heck no, I’m going to be teleporting because I’m going to be there now (laughter). So flying towards there, heck with it, I want to teleport.
Steve: (Laughter) I like that.
Carlos: There you go. Well Kevin thanks so much for being on the show with us today.
Kevin: Oh, thank y’all, that was great. That was a great time.
Carlos: Yeah, we do appreciate it, you taking a little time with us.