What’s in a name? That which we call a rose, by any other name would smell as sweet. We could say the same for analytics: there are plenty of similar or related words we use to describe the role or process of working with data. In this episode, we talk with Nicki Tinson about her journey into analytics, how others, like you, might break into the field, and what has remained the same even when the names change. I am joined by Kevin Feasel and Eugene Meidinger, so you know we keep the conversation lively.
Be sure to check out Nicki’s Facebook if you are interested in making the transition into analytics. Tell her Carlos sent you.
Episode Quotes
“Understanding that DBA side [is] really critical. For any of us that like the analytic side, that’s the really exciting bit, but 90% of the other stuff is what takes your time, because you’ve got to get those foundations there.”
“I think it really is about trying to get very proficient in one thing before you start adding lots of other pieces of tech to it. There’s things you do in Excel which [is the same concept as] writing SQL code.”
“Getting out and understanding your business and how it works is so crucial, in my mind, to understanding business processes. If you focus on solving people’s problems, you will naturally learn the tech as you go along.”
“[One challenge is] helping analysts to think about, ‘what could you do in your business to do things differently’, rather than just waiting for people to give you work to do.”
“I think the challenge will be [finding] people who have that skill set, who can come and walk with people who know their business, but don’t understand this advanced analytic side, to help them see the potential [and] walk them through it.”
Listen to Learn
Nicki’s Story https://empoweredanalysts.thinkific.com/pages/about-me
Resume/CV Builder https://empoweredanalysts.thinkific.com/courses/cv-builder
Facebook Page: https://www.facebook.com/EmpoweredAnalysts/
00:40 Intro
02:09 Compañero Shout-Outs
02:47 Conference
03:45 Intro to the guest and topic
08:34 Is there really a big chasm between working with data and analytics?
11:46 Separation of roles between data engineering and data science
16:32 Excel is a great starting point into analytics
19:08 Transitioning into regular analytics instead of advanced analytics
22:20 Non-technical people may have an edge
24:10 Do you need a stats background to get started in analytics?
27:02 Nicki’s work – project-based versus free-flowing
29:48 Different kinds of businesses need different kinds of output
32:35 The point you can take the initiative in your career to look for new challenges
35:52 Training for the next level
38:23 Resources for getting started in analytics
42:30 Kenneth Fisher’s crossword puzzle
44:15 SQL Family Questions
50:15 Closing Thoughts
51:46 Bonus conversation: The difference between artificial intelligence, machine learning and data science
Kenneth Fisher’s crossword puzzle: https://sqlstudies.com/2015/08/03/sql-crossword/
About Nicki Tinson
Nicki Tinson is a Business Intelligence Manager, with a passion for using data to make a difference to any business or organization. She uses SQL Server extensively, developing and querying data warehouses, as well as using a variety of visualization tools, all with the intention of bringing insight to the business and driving action from it. She was delighted to pass the 70-461 exam last year, whilst on maternity leave.
Her background in Education was the starting place to help others reach their fullest potential. Now she has set up Empowered Analysts to help coach and guide analysts through the course of their career, from entry level all the way into management.
She lives in the UK, with her husband and two very energetic young children. She has recently taken up gymnastics and can’t wait for the time when the whole family can start mountain biking together.
*Untranscribed introduction*
Carlos: Hello, compañeros, this is Carlos L Chacon, your host, and welcome to Episode 137 of the SQL Data Partners Podcast. It is good to have you on the SQL Trail again! Thanks, as always, for tuning in. We appreciate it. Thanks for listening, thanks for connecting with us and giving us a little bit of your time. Our guest today is Nicki Tinson. She’s a business intelligence manager, coming to us from the UK. Nicki, which we’ll get into in the conversation, did not start in IT. Got into IT after going through a different route. One of the things that she has done now is, she started up what’s called Empowered Analysts to help coach and guide analysts through the course of their career and to give others the opportunity to come into analytics, to give them the framework, if you will, of what they might need to do if they wanted to make the move. That’s what our conversation is centered on today, is this idea of, “hey, so you want to get into analytics?” Talking a little bit about that and what her current workload looks like. We are joined again by Kevin Feasel and Eugene Meidinger. They are with us and they give their take. These are interesting guests as well, because they have done a very similar thing or are trying to do that, and so we talk a little bit about the challenges, struggles, and current status, if you will, of where everybody is, there.
We do want to give a couple of Compañero Shout-Outs. The first to Peter Hall and to everyone digging on my Las Vegas suit on social media. I attended the GE Centricity Conference in Vegas. One of the things we sponsored and one of the things we were looking to do as a small fish in a big pond is to stand out and so I posted up a picture of me in the suit and it got a lot of responses on social media, but also in person as well. I got a lot of different looks in addition to the ones that I normally get with that suit.
For time’s sake, we’re going to skip the SQL Server in the News this week, but I do want to talk about the conference, the SQL Trail. I need to stop calling it a conference. The event, the gathering. The data platform gathering of the East Coast in Richmond, Virginia, October 10th through the 12th. We have put together or compiled a YouTube video from what happened last year with some attendee feedback and whatnot. Kind of a promotional video, but I think it does give some insights into what it is that we’re trying to do, and it’s always nice to have some feedback and others’ experiences. So that’s up on YouTube and we’ll make sure that it’s up on the site as well. If you’re interested in looking at that, you can head out to sqltrail.com. I am happy to announce that we do have our Friday lab in place. I’m not going to announce that just today. We actually talk about it next week. Kevin, Eugene and I are going to do an episode by ourselves and we’re going to talk about it a bit more there, next week.
Our URL for the show notes for today’s episode is going to be sqldatapartners.com/analytics and that is going to be an “s”, or sqldatapartners.com/137. With that, let’s go ahead and get into the conversation with Nicki.
Carlos: Okay, Nicki, welcome to the program.
Nicki: Hi, Carlos. Thank you for having me. It’s very exciting.
Carlos: Yes, it’s always nice having a guest from the UK to join with us.
Nicki: Thank you.
Carlos: Your topic is extremely interesting as well. It’s one that we get asked quite a bit as we see some of the marketing shift and where the new jobs are going to be happening and this idea of how do I get into analytics and what can be my course, if I’m not currently doing that? Why don’t you give us a little background about how you got into analytics and why you think others might find it an interesting career path?
Nicki: Yeah, well, great question. I’m going to be really honest. At the start of my career, I chose to go to university and I had absolutely no intention of being where I am now. If you’d asked me 15 years ago that I’d be doing what I’m doing now, so I’m currently working as a business intelligence manager in the UK. There is absolutely no way I would have envisioned myself doing this. Originally, I was going to go into educational psychology, which meant working with children. I did a psychology degree, so I actually did some statistics, inadvertently planned, obviously knew what was coming, so it’s been a while since I’ve got to do any standard deviation, just to preempt any questions coming up later. So, I just happened to do that in my degree, so I was going to go into Ed Psych, and I had to do teaching and I was going to do a Masters. When I left my degree, I did a teaching qualification, and then when I left that, and it was amazing, all of my work experience was with kids, everything. Nothing was with data at all. Then when I was trying to find a job, I couldn’t get a job. We had a massive funding problem over here and loads of teachers were made redundant. That first year after leaving uni was just really awful and you obviously appreciate how much money you spend at uni, so I come out of this degree and I was like, “uh, what is going on? I knew exactly what I was going to do. I was going to do this,” and it was a massive curveball. I really didn’t know what to do. I guess after about a year, I ended up working for a local government and I was working with an organization called Sure Start. I think in the US, it’s similar to Head Start, I think of sort of years gone by. It’s all helping children in areas of deprivation and health outcomes and all of that kind of stuff. There was a lot of data that they collect to prove that that funding is making a difference. 
That was really, actually, the start of my data career, because I ended up working in a team that were very heavily on the data side. They worked with Access databases back then and collating information, bringing it together, giving that to management, and so that was really the start of my journey. I think what was really interesting about that was my desire to work with that particular subject was really something that inspired me and that kind of connection to the why. “What problem am I solving here? Why am I doing this?” I ended up moving away from local government and into the private sector and actually ended up working on mortgages where I covered predictive analytics, in that kind of role. It was a really interesting start to that and it’s just really gone on from there. So now, I work as a business intelligence manager. Honestly, the journey’s just been really interesting. I’ve constantly been doing things I never thought I would do. I do a lot of data warehouse development now, as well. Again, 5 years ago, that would have terrified me. It’s just–
Carlos: Sure. Now we say data warehouse development, you’re talking about ETL, maybe SSIS packages, things like that?
Nicki: Yes, yeah, a bit of that, yeah. I’ve used a bit of SSIS packages, we don’t really need it so much for our data. We tend to use a lot of stored procedures and we can get the data in those kinds of ways, so yeah, inserting data and that kind of more BI developer type of role, but all really important, again, solving problems with trying to make your queries performant and not taking forever for data to run. That’s a huge challenge in the data field, making sure that you’ve got good, solid foundations there, before you can even make anything meaningful out of it.
Carlos: Right. I think that’s also interesting that you mention some of those tasks, because I feel like it’s not a super gap. I feel like a lot of times those who aren’t in analytics but are working with data, feel like, “oh gosh, there’s this huge chasm to where I need to get to be able to start working with analytics.” But the chances are that moving data from one system to the other is something you’re going to have to encounter pretty quickly when working with data? And that’s just kind of an evolution of that or getting better at those skills, at least part of it, right?
Nicki: Yeah, absolutely, and I think when I’ve looked at it, data science is the new term that everyone loves. It covers so much stuff, and actually when you look at the kind of really broad definition of data science, it really encompasses all of the things that we’re looking at now. So, your construction of tables, making sure it’s performing, understanding that DBA side because it’s just really critical. You can’t wait for hours for data to run, especially when you’re dealing with big data and really understanding that foundation. It also covers the collation, understanding the business processes, getting out to the business, talking to people, getting a real sense of what that data means, and then the predictive analytics and the machine learning side of it, which is a bit like, I guess for any of us that like the analytic side, that’s the bit you really want to get to. That’s the really exciting bit, but it’s like 90% of the other stuff is what takes your time so that the bit that’s really what you maybe want to do, is quite a small proportion of time, because you’ve got to get those foundations there.
Carlos: Right. I’m going to use this as an excuse to bring in Kevin and Eugene here, because I know that they have moved over more to that analytic space as well. So, guys, would you agree that you have to have that base before you can move over? What’s been your experience and how much have you been able to retain, if you will, as you’ve moved into more analytic roles?
Kevin: Sure, I’ll go first. Hi, Kevin Feasel, professional podcast guest and happen to be running a predictive analytics team, but that’s kind of a side project. Being a podcast guest is the main deal. Hearing the description, I’m completely in agreement. I think that a lot of what we think of as really advanced analytics work is the stuff that we’ve been doing for decades. It’s ETL or maybe ELT if you’re going to be fancy, that way. It’s moving data from one system to another system. It’s understanding data models, data rules, data contexts. It’s trying to figure out how things fit together and most importantly, most of your time is going to be spent working with the awful, awful data that’s on hand. The dirty data that needs to be cleaned up. Nicki threw out the number 90%, completely in agreement. Actually, I broke it down. One of my questions was going to be how much time you’d spend data cleansing on projects. I have written down 90, plus or minus 5.
Nicki: Oh, awesome!
Kevin: I might be underestimating the amount of time, too.
Nicki: Yeah, that’s good. We must be right, then. What are the chances?
Kevin: Two independent points.
Nicki: Yeah, exactly.
Kevin: But I think a question that I want to lead into here, is do you see a separation there from data engineering versus data science, the two sets of roles? Or do you see this as one role?
Nicki: I personally think that it’s going to depend on the business. I mean I love working for probably mid-size companies because you get this more end to end experience. The role that I’m in at the moment, I get to go from that BI developer, the ETL side, the DBA parts of it, all the way to literally getting into the business, speaking to people who are not tech at all and understanding the business processes and looking at how we can add value. For me, having that complete end to end experience, one is very rewarding because I get to see that when I’m sat in front of code trying to make it more performant. It’s not something that I want to do all the time. I understand why I’m doing it, and I understand the end game, so for me, I find it very helpful having that. I guess where there’s a challenge for more enterprise-level businesses, is that the data is so vast, the teams and the business is so massive that it probably is more difficult for that to happen, realistically. But I guess that’s the question for a bigger company, what’s your opinion of that?
Kevin: I also probably should give my quick definitions of the two for people in the audience who don’t know what data engineering is versus data science. We just kind of talked about it without actually explaining it. For me, data engineering is the stuff that we’ve been talking about that we do a lot. It’s taking data, moving it from source to source. It’s working with models, the data cleansing, it’s doing the stuff that is plumbing. We’re data plumbers. We used to call them ETL specialists. We’ve called them other things along the way, but today the term that will get you the most money is data engineering. The difference is, now we’re expecting to learn new things like, “oh, I need to be able to pull data from a Spark cluster.” Or “I need to be able to run Kafka streams and get data from lots of devices through very quickly.”
Carlos: Right, so the number of sources has just increased.
Kevin: That’s the fundamental difference. Yes, it’s not the style of the job, it’s just the boxes that you’re checking that you know how to do. I see, by contrast, data science is more of a statistical analysis. There’s the old joke that a data scientist is a data analyst who lives in California. Basically, you add the term data scientist, like, “I’m an analyst, eh, I’m a data scientist now. Pay me three times as much. I’ll do the same work.” But when you have the opportunity to specialize, these are the people who aren’t necessarily doing as much of trying to figure out why row number 4 and row number 5 are supposed to have the same value but don’t because they have different keys somewhere and more time trying to figure out how many layers this convolutional neural network really should have.
Carlos: Did we lose you, Kevin?
Kevin: No, no, I think I just killed all of the conversation, is all.
Eugene: You got really excited about some of the nitty gritty.
Carlos: It sounded like you weren’t quite done. I was like, “okay”.
Eugene: You had us at convolutional neural network, Kevin.
Kevin: Somebody’s bingo card just got checked off, like “I win!”
Eugene: Spark, Kafka, I’m waiting for GDPR. I think that’s the center square, cause–
Nicki: Oh man, don’t get me started on that one. But it’s interesting, actually. Sorry, I’m going to make a comment about GDPR, but that just goes to show you how valuable data is. And that’s just going to explode, I think, over the next few years as well, analysts within the GDPR area. It’s huge and it’s, for anyone starting off, I sort of feel sorry for people in a way because where do you start with all of this? If you’ve got no experience of any of this, it’s like there’s a lot of just tech. Even if you’re just looking at just the predictive analytics and the machine learning and that data science, the algorithms and really looking at that data in that way. That is a vast arena and to become an expert in that alone is, I probably feel now, in my career, I’ve got a good foundation and now I can start to look into that and lean into that. But to even go there, is a big undertaking, I think.
Carlos: Yeah, that is interesting. You do have to have that base to do some of these things. I guess we’ve talked about some variations here. So, coming from maybe a different path than maybe we’ve come. Maybe closer to what you came from, Nicki, and that is, I want to call them Excel masters.
Nicki: Yeah, absolutely.
Carlos: The folks that can slice and dice Excel, do you feel like that could potentially give them at least a good enough base or at least an understanding that they know how to work those numbers, so they could start making some of those leaps?
Nicki: Yeah. I would actually say Excel is a really, really good starting point, because there’s actually a lot you can do in Excel now. Even in the last 5 years, things have developed and you’ve got things like Power Query and Power Pivot in there. So, to be able to connect data in that way within Excel, that’s a massive leap. I used to use Access databases. I wouldn’t need to do that, now, to get all of that data together. For a lot of businesses, they’re not even in a place where they necessarily have their data in one system. I’m lucky because I work in an organization where we use Dynamics NAV, which is an ERP system, so as an analyst, I’ve got all of my data in a database. It is in one place. I have no random spreadsheets elsewhere, so that in itself is a huge hurdle. But again, for slightly smaller sized businesses, they don’t necessarily have that luxury, but they still need that information and they need to get it understood and read in. So, yeah, going back to your question, I think Excel is a really, really good starting point. I think pretty much all of the fundamentals are there. I think it really is about trying to get very proficient in one thing before you start adding lots of other pieces of tech to it, because there’s things you do in Excel which, when I eventually started writing SQL code, it’s the same concept, isn’t it? An IF statement or a CASE statement, conceptually, you get what you’re trying to do with the data. I think visually, if you’ve not come from a technical background, which I hadn’t, I got visually the way data strings together and connects, so then when I started writing code, it was like, “oh, okay, that makes so much sense.” It just becomes very straightforward. I think that if you try and deal with too much in one sitting, you don’t get anywhere. That’s the danger. I think it’s becoming a bit of an expert in one thing at a time.
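To make the IF-versus-CASE comparison above concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module to run the SQL side. The table name, column names, labels, and the 1000 threshold are all made up for illustration; none of them come from the episode.

```python
import sqlite3

# Hypothetical order amounts (illustrative data only).
orders = [(1, 250.0), (2, 1200.0), (3, 1000.0)]


def excel_style_if(amount, threshold=1000.0):
    # Mirrors an Excel formula such as =IF(B2>=1000,"Large","Small").
    return "Large" if amount >= threshold else "Small"


def sql_case_labels(rows, threshold=1000.0):
    # The same rule written as a SQL CASE expression,
    # run against an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
        conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT id, CASE WHEN amount >= ? THEN 'Large' ELSE 'Small' END "
            "FROM orders ORDER BY id",
            (threshold,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print([(order_id, excel_style_if(amount)) for order_id, amount in orders])
    print(sql_case_labels(orders))
```

Both functions apply one rule to each row and return a label; only the syntax differs, which is roughly the point being made about carrying Excel concepts over to SQL.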
Eugene: I think if the goal was to try and help some of those people who are looking to transition to analytics, I think we should probably make a distinction and I don’t know what the right term is, but maybe between some traditional BI or regular analytics and what Microsoft a lot of times calls advanced analytics. I’m kind of stuck in the middle between the two and I want to make the leap into advanced analytics. But I think with a lot of regular BI, you know, business intelligence, you can lean on the business for validation so you have a tighter feedback cycle and you have a shorter distance to go. Because when someone in accounting just wants to see year over year growth, they can validate that data. They can validate the result and go, “that doesn’t look right.” They have a way to check that, whereas when you start to move into advanced analytics, you need extra education, extra training because you need to be able to validate the model or the data outside of the business, once you get into either predictive or prescriptive. So, Kevin was talking about a neural network, well, how do you know it’s working right? How do you know your model’s working right? If you make a model that says, “okay, we want to reduce customer churn, so we should intervene at this point”, well, the business can’t look at that and go “well, that makes sense” because you’re telling the business what to do. So, you need this extra understanding of statistics and data science and modeling and that sort of thing. I think for a lot of people, they really should be focusing on that first step, that first foundation that you talked about, of just regular analytics, regular business intelligence, where it’s maybe a little bit more descriptive or a little bit more summarization and KPIs. Versus what the cool new hotness is now, the machine learning, the data science, the thing that you’re supposed to have a PhD in statistics in. 
I don’t know what the right term is to make that split, but I think for people looking to transition, a lot of them are maybe looking at the fact that data scientists are making 130 grand, when really, they should look at, “okay, well, how can I learn, like you said, Power Query and Power Pivot?” That can get you pretty far, especially if you’re in a non-coding role.
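The year over year growth figure Eugene mentions is a good example of a descriptive number the business can check by hand. A minimal sketch in Python follows; the function name and the revenue figures are hypothetical and only illustrate the arithmetic.

```python
def year_over_year_growth(current, previous):
    """Return growth as a fraction of the previous period, e.g. 0.25 for 25%."""
    if previous == 0:
        raise ValueError("previous period is zero; growth is undefined")
    return (current - previous) / previous


# Hypothetical revenue totals by year (made-up numbers).
revenue = {2016: 200.0, 2017: 250.0}
growth = year_over_year_growth(revenue[2017], revenue[2016])
print(f"{growth:.1%}")  # prints 25.0%
```

Someone in accounting can verify a result like this against their own figures, which is the tighter feedback cycle described above; a predictive model offers no such simple check.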
Nicki: Yeah, definitely. I was thinking about this the other day. The best example I could give to explain all of this is, did you guys watch the Karate Kid? The 1980’s version of it?
Carlos: Of course!
Eugene: A long time ago.
Kevin: The good version.
Nicki: Good. So, when Mr. Miyagi, I think it is, he’s teaching Daniel, the Karate Kid, how to do karate, and at the end, he’s obviously amazing at it. But he gets him to go and wash his car and it’s like wax on, wax off and he’s like, “why are you getting me to do all of this rubbish stuff? This is really boring! I don’t want to do this!” And it’s kind of this is how I feel about data and data science, because there is a lot of stuff you’ve just got to get stuck into that you don’t necessarily want to do, that isn’t as high profile, but you have to do it to get good at what you do. At the end, yeah, you can become like a Karate Kid in data science, I guess. But it’s that there is a lot of effort and time going into that initial work. We haven’t really even touched the soft skills here, because getting out and understanding your business and how it works is so crucial, in my mind, to understanding business processes.
Carlos: Right, and this is where I think the non-technical folks have an edge, because they are a bit more attuned to some of those, and it wouldn’t be so irregular to adopt a vertical. Whereas, from a tech, you think the tech can apply everywhere and you’re like, “well, if I only do health care, or if I only do government, or I only do finance, somehow I’m pigeonholing myself.” But you’re non-technical, that idea of having that vertical is almost, that’s the first option or that’s the first choice that you make and then you go and find out more about that and figure out how you can apply skills to that area.
Nicki: Yeah. I think that kind of, I guess, a challenge for all of us is like what’s the problem we’re trying to solve, being the first point and everything else kind of comes to you, I think. Because when you’re focusing on the problem you’re trying to solve, like I was thinking of Netflix, for example. They will use algorithms to work out what sort of movies you might want to watch. But ultimately, they’re doing that to try and keep you engaged and get you coming back for more and continuing to pay the monthly subscription. So, when you get really super focused on what behavior are we trying to encourage, then you kind of step back from there. It’s not like you just go analyze a load of data and look at algorithms, etcetera. You look at the business problem you’re trying to solve and you move back. That’s again, the people who are not as deeply technical, that’s a bit less scary, because it makes conceptual sense, I think, and then you can tag on those other elements to it more naturally.
Kevin: So, when it comes to learning these basic skills, getting started, how much of that is the statistics itself? Or in other words, how much of a stats background do we need in order to get started in analytics from some other field?
Nicki: My view is, and this might be controversial, so feel free to disagree with me, but I don’t think you need to have a stats background. I happened to have one when I did my psychology degree, because that was, you know, psychology’s a science, essentially, so there’s statistical significance and you learn about chi squared and t-tests and regressions and all of that kind of stuff. So, I happened to have come from that background, but I, ironically, found that when I went to work for government, I was like, “oh, I understand t-tests” and because they got me in for the data sort of side, and I was really quite excited about bringing that in. Honestly, it kind of went down like a lead balloon because I wasn’t meeting people where they were at and I wasn’t trying to solve their problems. I was trying to go, “look at what I know! I know all of this stuff!” and force it on people. I think where I had to learn was to step back and go, “okay, just hold that for when it’s helpful.” So, like when I worked in mortgages, they’re very heavy on the predictive analytics with the credit scoring, so that was perfect. That stuff started to naturally come in because again, that was relevant to what problem they were solving at that point. But when I worked in local government, just trying to explain how a bar chart worked was a challenge for me. Just even the concept of averages, at times, was quite difficult. So, I think that, again, those are kind of touching on the softer skills when you go in, of just being really sensitive to the environment you’re in, wherever that is. I honestly believe that in your career, if you focus on solving people’s problems, you will naturally learn the tech as you go along. You’ll do it in a much more manageable way. It won’t feel like it’s overwhelming because you are going to pick it up incrementally. 
Going back to the statistics stuff, if you happen to work in an industry where, for example, mortgages and the finance world, predictive analytics is massive. Again, they’re going to drip-feed things to you, so you will learn what you need to in the job. We did a lot of scorecard monitoring, but I actually didn’t need to have done my degree and have that stats background, because I naturally was mentored through it. Again, I think if you’re in an industry where it’s important, they’ll have relevant training for you, to help you through it.
Kevin: Okay. And following up from this, on a completely different subject, so it’s the ultimate follow-up. Is a lot of your work, does it tend to be more project-based, or is a continual free-flow of “gotta do this one thing real fast and then I gotta do another thing real fast”?
Nicki: It feels like a bit of a mixture of both, at times. It is essentially project work, where in the past I’ve had regular tasks to do, to use a database term, more of a front-end role. Now it is a lot more project-based because I’m in an IT environment.
Kevin: Okay, so what does one of those projects look like?
Nicki: It might be about looking at reducing costs in a certain area of our business, so that might mean that in order to provide management with information that they need about costs in that area, we might need to look at, some of it can be quite detailed logic at times. Getting that into the data warehouse might be making that decision about putting it there. So again, you’re right back to those foundations and getting that data in, in a way that we can, before we can even analyze it, doing some prototype work. Then feeding that prototype work into the data warehouse where we can make it a bit more stable, and then, providing the regular reporting off the back of there. Because I’m in an IT environment now, my mindset is very much about automation, so I don’t want to be churning out the same reports if people can just access them, themselves. We use Power BI a lot. Once we’ve got all of the data in the data warehouse, we can get it built up in Power BI and then people can slice it how they need to. Again, those data visualization tools, they’ve come on phenomenally in the last 5 to 10 years. That makes your job much more automated so then you can start focusing on edging towards more of the advanced analytics, like you say. Going back to what you were saying before, one of the challenges is also that. It can be easy to fire-fight because we might get ad hoc analysis every now and again and if you’re not careful, just taking a step out of it, if you’re not careful, you can get dragged into that. So, you’re not necessarily focusing on what’s going to add the most value to a business, because you’re just kind of giving everybody all the information they want all the time. There’s always information you can provide, it’s endless. So, I think you have to have quite a lot of control in this area, because you can literally create reports forever and ever and ever. It’s definitely a challenge.
Kevin: How often do the outputs of your projects end up being services instead of reports or dashboards or visuals?
Nicki: When you say services, what sorts of things do you mean?
Kevin: I’m loading the question by getting into microservices, but in other words, a process that’s running that maybe you have a real-time predictor or some process that you’re expecting somebody else to call and pass in input data and you give them back out the outputs that you’re generating?
Nicki: So, you mean the machine learning side of things, is that what you’re thinking?
Kevin: Yeah, a predictor function or something that some other part of the organization is going to plug into, to fill up their Excel spreadsheet with results for what the next quarter’s budgets are going to look like.
Nicki: Yeah, in the role that I’m in, I don’t tend to use it so much, but I would say that there are areas where I think we could go into. I work in more of like a factory type of production environment. I’m in the automotive industry, and so we refurbish vehicles and it’s all about getting those cars in and out of the factory as quickly as possible. For us, it’s very different to an organization like a mortgage company or an organization like Amazon, for example, which are kind of like putting things, “buy this, this is the kind of information we know about you, and we can kind of encourage your behavior” and that kind of thing. What I feel, I don’t really tend to do that kind of work where I am, but I think that’s a mindset, partly. I think some organizations are very traditionally focused on predictive analytics and really lean toward machine learning, and I think finance is a massive one. For example, if you ring up for a mortgage or a credit card and they do a credit scoring on you and you get that, “we’re going to approve you straightaway” or you know, “you guys need to see an underwriter”. Those kinds of environments really lend themselves, naturally I think, towards that kind of advanced analytics. I think there are other areas that don’t as much, or haven’t historically, but I think there’s a shift in mindset. I think there are areas where we could do that, but I think that, again, trying to empower businesses to think differently about machine learning. So, really helping analysts to think about, “what could you do in your business to do things differently”, rather than just waiting for people to give you work to do. I think that’s quite a challenging area. I don’t know whether you guys see that in the work that you do.
Eugene: Actually, about that, and kind of a selfish question, but I think it ties into a lot of the things we’ve been talking about. I think there’s definitely, at least in the work that I do, there’s a lot more space for thinking analytically and using some of this technology. You talked about how it’s not necessary, a lot of times, to have a statistics background to get started, and a lot of times you can learn a lot of stuff on the job. So, the question I have for you is, in your mind, where’s the point where the job isn’t doing enough to keep pushing you forward, and you should start, maybe looking outside and say, “okay, well, maybe we’re not doing very much or anything with machine learning our job, but if I could actually just learn how to get started with it, then it might be an easy sell to my boss” or something like that? At what point do you think it makes sense to not depend on the job to drip-feed you the next thing, but instead say, “okay, I need to start taking some more initiative, here?”
Nicki: Yeah. I think if you’re the person that, you know, in that scenario, where you’re going, “okay, when do I take the initiative?” I think there’s probably like a tilting point in someone’s career where they are just really ready for those next challenges and I generally would say they’re in the more senior analyst type of roles. I think when you’ve essentially focused on solving the problems of your business, when things are a little bit more settled and you can get the attention of either your direct boss or the people above them, then I think that’s often a good point to bring this stuff in. So, if you’re in a position in your business where you’re going in on a weekly basis, queries are running really slowly, there’s loads of locking going on, that’s probably not going to be a good time to go, “oh, can we do some machine learning?” Because, you know, the people that you’re speaking to might be like, “well, we can’t. Look, we’re in a fire at the moment and no.” So, I think when things are settled and there’s always going to be things to do, so this is the risk, but I think when you’re not fire-fighting, when things are fairly stable, then that’s the point at which you can even of yourself just kind of think about, “well, if I was in charge, what would I do differently? What would potentially add huge amounts of value to the business? How can we bring in that extra million pounds?” And think blue sky, because, for me, I think as an analyst and having that background, having that entrepreneurial mind-set really starts to set you apart. I think those predictive analytics and the advanced stuff is really about thinking about things that you can’t really see, yet, so it’s seeing the opportunity where other people don’t. 
Those kind of skills, mentally, and just giving yourself some time to even think about that, then I think that you might be able to find a couple of areas in the business where you could potentially bring that in, and then that’s a good starting point to go and speak to your boss about even some initial pieces of training. I guess the question would be, then, what training would you recommend?
Eugene: You know, it’s funny, I can say what I’m looking at right now. I think Kevin’s probably going to have a better answer, because he does more of this next level stuff, but I think if nothing else, it sounds silly, but just starting to go through Khan Academy, just so you understand some of the vocabulary of some of this stuff, if you don’t have that statistics background. I don’t know, as I’ve been going through it, it’s kind of amazing how much of it interlocks into each other. Just there’s so many of these different pieces and it’s like when you’re programming, you want to get to the point where an IF statement is so you don’t even have to think about it, you just breathe it. Or when you’re writing SQL, I’m at the point in my career where I can write a SELECT statement at three in the morning, sleep deprived, without any problem. I can write an IF statement, sleep deprived, without any problem. Telling you the exact difference between standard deviation and variance at three in the morning, not quite there yet. But you could probably tell us the difference between a t-test and a, was it chi, chi, I can’t even pronounce it.
Nicki: Oh, chi squared. Well, I probably couldn’t. It’s been a while.
Eugene: That’s fair, but you get the idea where, I think if someone’s just starting to look at that next level, probably just Khan Academy and just starting to pick up some of the very, very basics, so you have that vocabulary to even think about it. There’s so many pieces of just vocab you need to move forward.
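For readers picking up the vocabulary Eugene mentions, here is a minimal Python sketch of how variance and standard deviation relate, using the standard library's statistics module. The sample values are made up for illustration and do not come from the episode.

```python
import math
import statistics

# Hypothetical sample values, chosen so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: the average squared distance from the mean.
variance = statistics.pvariance(values)

# Population standard deviation: the square root of the variance,
# which puts the spread back into the same units as the data.
std_dev = statistics.pstdev(values)

print("variance:", variance)
print("standard deviation:", std_dev)
print("sqrt(variance) matches:", math.isclose(math.sqrt(variance), std_dev))
```

With these values the mean is 5, the variance is 4, and the standard deviation is 2. The sample versions (statistics.variance and statistics.stdev) divide by n - 1 instead of n.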
Nicki: Certainly stuff when I’ve connected with other analysts, there is this feeling like people really want to get into this, the neural networks, which I know nothing about. I just want to say that it sounds really interesting, but it’s kind of like you either do a PhD or like a week’s course somewhere, which, it’s like one extreme to the other. Clearly, you can’t learn, you can’t be an expert in a week. So, it’s kind of like that mentorship side of this as well is so important, having people to help walk you through and show you that. I think that’s going to be the challenge, particularly over the next 10 years in this space, of people who have that skill set, who can kind of come and walk with people in businesses, who know their business, but they don’t understand this advanced analytic side, to come and help them see the potential, but also walk them through it because realistically, I wouldn’t do a PhD right now. I don’t know whether I’d ever do a PhD anyway, but I wouldn’t do that right now, and it just seems a shame to miss out on this area that’s very exciting and very, very useful.
Kevin: On that topic, are there any other resources that you might recommend for somebody who’s interested in getting started in the field?
Nicki: Do you mean literally just getting started out?
Kevin: If you have a little bit of a background, but you’re not doing it for a living, you don’t have the strong academic background, but you’re interested in moving forward and building up some of these skills. But you don’t necessarily have a mentor directly around you?
Eugene: So, if Carlos wants to stop being a DBA. He’s so excited by this podcast episode, he wants to become a data analyst.
Carlos: I’m already working on my resume, by the way.
Nicki: Well, on that note, Carlos, I do actually have a CV building course for people who want to be a data analyst, so if you’re interested, I can send you the link.
Carlos: There you go, I will make sure we put it up on the podcast, the show notes for today’s episode.
Nicki: Yeah, I would say places like edX and Coursera. There’s actually some good free courses out there, so we talked about Excel earlier. I definitely think if you’ve not got any other experience, start there, yeah, edX, Coursera. If you’re in a business and they’re happy to give you some training, Pluralsight, Pragmatic Works, I’ve used their training. That’s very, very focused on the Excel and Power BI and that kind of arena. There’s some really good free courses out there, there’s some very good paid ones. I think that going back to my analogy of the Karate Kid earlier, it’s about practice, so I would recommend that people don’t try and just saturate themselves with learning. Don’t try and learn everything about Excel, for example, because there are things about Excel I don’t know and I know quite a lot about Excel. It would be like, focus on vlookups and pivot tables, because the concept of joining data sets together and summarizing them, you’ve got that in those two elements. Get a couple of data sets and use some vlookups to get some data connected to them. That’s a pretty good place to start, really, and practice. Practice, practice, practice.
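The idea that a vlookup is a join and a pivot table is a grouped summary can be sketched in a few lines of plain Python. This is only an illustration under assumed data: the orders and vehicles tables, their column names, and the vlookup and pivot_sum helpers are all hypothetical and not from the episode.

```python
from collections import defaultdict

# Hypothetical "orders" sheet and "vehicles" lookup sheet.
orders = [
    {"order_id": 1, "vehicle_id": "V1", "cost": 120.0},
    {"order_id": 2, "vehicle_id": "V2", "cost": 80.0},
    {"order_id": 3, "vehicle_id": "V1", "cost": 50.0},
    {"order_id": 4, "vehicle_id": "V9", "cost": 30.0},  # no match in the lookup
]
vehicles = [
    {"vehicle_id": "V1", "make": "Ford"},
    {"vehicle_id": "V2", "make": "Audi"},
]


def vlookup(rows, lookup_rows, key, column, default="#N/A"):
    """Attach `column` from lookup_rows to each row, matching on `key`."""
    index = {r[key]: r[column] for r in lookup_rows}
    return [{**row, column: index.get(row[key], default)} for row in rows]


def pivot_sum(rows, group_by, value):
    """Sum `value` for each distinct `group_by` entry, like a simple pivot table."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)


# Join step (the vlookup): bring the make onto every order row.
joined = vlookup(orders, vehicles, key="vehicle_id", column="make")

# Summarize step (the pivot table): total cost per make.
cost_by_make = pivot_sum(joined, group_by="make", value="cost")
print(cost_by_make)
```

The unmatched order keeps a "#N/A" placeholder, much as a failed lookup would in Excel, so missing reference data stays visible in the summary instead of silently disappearing.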
Kevin: Yeah, a couple more resources I would throw out, especially on the side of learning with R or Python, DataCamp is usually very good. They have a set of good courses, some of them free, some of them paid. A question that I have is what’s your opinion on doing Kaggle competitions?
Nicki: Is this the, they have data sets? I have come across Kaggle.
Kevin: Yes, Kaggle is a website that they have a set of competitions. Some of them are data sets that are generally publicly available and the competition is going on forever. It really is just a way to test out your skills. They have one data set which is housing prices in Ames, Iowa. They also have a series of active competitions where companies will pay money. They put out a prize that says, “the best model to solve this problem will get this much money.” One big example was there was a major airline in the US which put together a competition to try to reduce the delay times for flights, under the idea that every second that they could save is millions of dollars over time. So, any way to improve the way that they could schedule and reduce delays and keep planes in the air for longer, saves them so much money that they’re happy to pay out a few hundred thousand dollars.
Nicki: That’s amazing. That is great. My thoughts are just that working with data is an incredible career. I think we literally have gold at our fingertips. There’s so much opportunity there. I mean, there’s an organization called DataKind, which are encouraging data scientists to volunteer to work with different companies to literally end poverty in certain parts of the world or whatever those organizations do. It goes to show you what we can do with data. So yeah, the sky is the limit.
Carlos: Okay. Very good. Awesome. Thanks again for being here, Nicki. Before we let you go, I’ve started doing Kenneth Fisher’s crossword puzzles. I’ve started here with this Best Practices. I want to throw a couple of these out to the group here and see if we can come up with an answer. I am looking at 4 Down, and admittedly, this is mostly for DBAs, but so it says, “adding this option will almost always speed up your backup.” Eleven characters.
Eugene: Buffer count?
Carlos: Oh, I was going to go with compression. I guess I should say the second letter is an O.
Carlos: I think compression.
Carlos: Let’s see, I’ve got 8 Across. It says, “do this regularly, or corruption may sneak up on you.”
Kevin: How many letters?
Carlos: Let’s see, seven.
Kevin: CheckDB?
Carlos: CheckDB. I think that’s right. Yeah, that makes sense to me. I have the wrong group here. I’m asking a bunch of DBA questions to analytics guys. So, my last question, because honestly, I don’t know what this is.
Nicki: It doesn’t bode well for us.
Carlos: It’s a three-letter word that says, “how long will it take you to recover?”
Eugene: RTO, come on.
Carlos: Oh, it’s an acronym!
Eugene: That’s Database 101. That’s like in the MTA certification for databases.
Carlos: Tada! I was not thinking of acronyms.
Eugene: Oh, come on.
Carlos: So, man, look at that. See, now you’ve embarrassed me, Eugene.
Eugene: I’m not even a real DBA. I just play one.
Carlos: There you go. Okay. So, Nicki, shall we go ahead and do SQL Family?
Nicki: Let’s do it.
Carlos: So, all-time favorite movie, of course, that you wish to publicly declare.
Nicki: Oh, I managed to whittle it down to two, so you’re going to have to take this.
Carlos: Okay, here we go.
Nicki: I’ve got Inception, which had Leonardo DiCaprio in it.
Carlos: Ah, is that why? Let me just list the Leonardo DiCaprio movies? Is that–?
Nicki: No, no it isn’t. No, it’s all about the subconscious mind and I kind of like that stuff anyway, so I like it.
Carlos: Gotcha, okay.
Nicki: Kept you on your toes. But then also, because I have to bring something more humorous into it, Guardians of the Galaxy as well. It’s just so funny. So, there’s the two.
Carlos: There we go. So, Karate Kid, was Karate Kid like in the top 5?
Nicki: No.
Carlos: No?
Nicki: That’s just good for analogies.
Carlos: Okay, very good. Okay, a food that reminds you of your childhood?
Nicki: Oh, well, in British style, it’s got to be roast beef and Yorkshire pudding.
Carlos: There you go. Okay, now I’m curious, when you’re looking for roast beef and Yorkshire pudding, where do you go to get it?
Nicki: Oh. Well.
Carlos: You’re making it?
Nicki: No, no, pubs do it so well.
Carlos: Okay, okay.
Nicki: Yeah, gives you a good excuse to go out for a meal.
Carlos: Very nice. The city or place you most want to visit?
Nicki: Well, this is a hard one as well. Such tough questions! Apart from the crossword; that was really hard. I’m a bit generic about place, but I would like to see the Northern Lights at some point. So yeah, that would be amazing.
Carlos: Now tell us, how did you first get started with SQL Server? We know you came from the education space, so when was the first experience you had with SQL Server?
Nicki: I managed to pick up a bit of SQL when I worked in the mortgage sector, because it’s very common to use SAS, the statistical package, as opposed to SSAS, the Analysis Services. So, within SAS, there was a procedure called PROC SQL and that was my introduction to SQL, then the SQL construct. And then I thought, well, I was sort of looking to be more challenged, so I realized that a lot of companies use SQL Server. I thought, “well, if I can use PROC SQL, it’s the same constructs, it’s just in a slightly different environment, so that’s fine.” The very first time I used it was in a test that was sprung on me at an interview. I wasn’t told that I was going to use it. There were just very slight variations like whether you need to use semicolons and that kind of stuff that make the query not run, so my technical test was not great, but I managed to charm my way into the job and I’ve been there ever since, so it was all good.
Carlos: Well, there you go, very nice. If you could change one thing about SQL Server, and I guess we could add, since you’re using some of the other tools as well, maybe SSIS or SSAS as well, but if you could change one thing, what would it be?
Nicki: Oh, they’re all hard questions. This was particularly difficult because, probably from using 2012 onwards, I’ve been happy. Prior to that, I would say them not having the ability to use the LEAD and LAG functions. I know you could use CTEs to get data from different rows, but it was a bit long-winded, so prior to then I’d be like “give me the LEAD function” that kind of thing, and now they’ve got it, so I’m pretty happy, actually. So yeah.
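To show what the window functions save you, here is a small sketch comparing LAG and LEAD with the longer CTE and self-join pattern. It is an assumption-laden illustration: it runs against an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions), not SQL Server, and the monthly_sales table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for illustration.
conn.execute("CREATE TABLE monthly_sales (month INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [(1, 100), (2, 130), (3, 90)],
)

# Window-function version: one pass, the previous and next row side by side.
window_rows = conn.execute(
    """
    SELECT month,
           amount,
           LAG(amount)  OVER (ORDER BY month) AS previous_amount,
           LEAD(amount) OVER (ORDER BY month) AS next_amount
    FROM monthly_sales
    ORDER BY month
    """
).fetchall()

# Longer alternative: number the rows in a CTE, then self-join on row number.
cte_rows = conn.execute(
    """
    WITH numbered AS (
        SELECT month, amount, ROW_NUMBER() OVER (ORDER BY month) AS rn
        FROM monthly_sales
    )
    SELECT cur.month, cur.amount, prev.amount, nxt.amount
    FROM numbered AS cur
    LEFT JOIN numbered AS prev ON prev.rn = cur.rn - 1
    LEFT JOIN numbered AS nxt  ON nxt.rn  = cur.rn + 1
    ORDER BY cur.rn
    """
).fetchall()

for row in window_rows:
    print(row)
print("same result:", window_rows == cte_rows)
```

Both queries return the same rows, with NULL (None in Python) where there is no previous or next month. The window-function form states the intent directly, which is presumably the convenience being described.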
Carlos: Very nice. What’s the best piece of career advice you’ve received?
Nicki: I haven’t received it directly myself, it was what I read in a book. Years ago now, I read ‘Lean In’ by Sheryl Sandberg, who’s the COO of Facebook. ‘Lean In’ is all about women in leadership roles and trying to encourage women to think about their career and all of that stuff. There were a couple of pieces in that that were really, really interesting and you sort of touched on it in one of your podcasts around the imposter syndrome. The whole concept of ‘Lean In’, which is if you get the opportunity to be around the table, lean in, get involved, particularly in a data analysis environment. Often you are the most junior person in the room at times. You might be with managers and directors and CEOs and it can be really unnerving, but they are all amazing opportunities. So, that was one thing from the book. Also, this concept of she uses the term Tiara Syndrome. This kind of thing of people think just sit there and don’t say anything and just work really hard and you’ll get noticed, but she says, just go for it and again, it’s this concept of leaning in. Don’t just wait for things to happen. Go and get them. I love it.
Carlos: Very good. Good advice. We didn’t really touch too much on that in this discussion, but that idea of kind of encouraging women to get into the field. It is wide open and there are lots of access points.
Nicki: Yes, absolutely, and again I think you’ve touched on this in podcasts where everyone’s got a place, it’s just really valuing yourself in it, knowing that you’ve got something to bring and that you can add value. Yeah, definitely.
Carlos: Nicki, our last question for you today, if you could have one superhero power what would it be and why do you want it?
Nicki: I’ve gone for breathing underwater and being able to explore oceans. I did kind of think about the flying, but yeah, breathing underwater and exploring, there seems to be a lot of places we don’t know about.
Carlos: There you go, I guess living on an island will do that. Might do that to you, right? Well, awesome, Nicki, thanks so much for being on the program today. We had a great time.
Nicki: Thank you. It was amazing, thank you so much. I feel like we’ve had some good discussions about data science.
Carlos: Thanks again, Nicki, for joining with us today. We really appreciate it and of course Kevin and Eugene, for jumping in and their contributions as well. We did extract a little bit of the conversation from this podcast, and if you want to hear Kevin, Eugene and Nicki bat around the idea of the differences between artificial intelligence versus machine learning versus data science, you can hang on to the end and we’ll play that snippet after the end. If you want to take a peek at Nicki’s Empowered Analysts, it is actually a Facebook page, so facebook.com/EmpoweredAnalysts. You can take a peek at her course and some of the materials that she has. For those of you who want to get more into analytics, or ways to get started, that might be a great resource for you, there. That’s going to do it for today’s episode, compañeros. Thanks again, as always for tuning in. We do appreciate it. We hope that we’ll see some of you in October at SQL Trail. We do still have some slots available. If there is something that you think we should be talking about on the program, please let me know. You can reach out to me on social media or on LinkedIn. I do enjoy connecting with you on LinkedIn. I’ve gotten to know most of you, we have quick conversations and I’d love to keep those going. But you can reach me @carloslchacon and we’ll see you on the SQL Trail.
*Bonus Conversation*
Kevin: My last question that I’ve got. In order to be fully buzzword compliant in the year 2018, could you please tell me the difference between artificial intelligence versus machine learning versus data science? Are they really the same thing or are they really different things?
Nicki: You’re mean. It’s very late in the UK.
Carlos: Oh, look at the time!
Nicki: You’re breaking up. I can’t hear you. I’m going to be really honest, I’m not sure I could explain artificial intelligence as well as I know you can. But my understanding of data science would be more of a broad, from the architecture side. You’re working with a lot of data, so the big data and the Hadoop and Spark and all that, whatever data, and bringing it together and understanding the business processes and then the predictive analytics and the machine learning come into the advanced analytics. I would say that data science covers all of those things and that’s a pretty huge skill set to have. I think the machine learning is what was being discussed earlier about this feedback of we have this statistical understanding of the data and this is the recommendations that we’re making out of it, and then potentially, automatically feeding that back into an application. That would be my summary. What’s your take on that?
Kevin: I’m going to defer to Eugene and see if he has anything.
Eugene: Oh, the difference between machine learning, artificial intelligence and data science.
Kevin: That is the question.
Eugene: That’s the question, right?
Kevin: It’s totally not an interview question.
Eugene: Sheesh, as a non-expert. Oh, well, can I work at Channel Advisor if I get it right?
Kevin: No, no.
Eugene: I hear you guys have an opening. All right, sorry. So, in my mind, the distinction between all three is the first two describe a set of technologies and mathematical principles. So, machine learning primarily refers to unsupervised learning, in my mind. That’s basically, you give it a data set and you’re either saying, “okay, here, do some clustering.” Or maybe you do have a training set, so maybe it’s more supervised learning, you’re saying, “okay, this 20% is right and this 80% isn’t” but in either case, you’re basically able to give the algorithm some data and you tell it, “go run with it. Give me an answer.” Artificial intelligence, in my mind is a much broader set of algorithms that maybe someone did some hand-coding for, but it’s replacing something that we would normally associate with human intelligence, so, a good example is facial recognition. I don’t think you’re going to get to facial recognition from just a lot of base principles. You’re going to have to have a lot more hand-coding, a lot more human intelligence injected into there. Then data science, I think isn’t really a set of techniques and principles as much as an overlap of three important areas of career. It’s programming, statistics, and domain knowledge of what it means to work with the business. If I was in a mock interview and I had to answer that question, that’s how I would try and break it up.
Kevin: Okay, that’s interesting.
Eugene: Does that mean “wrong” interesting?
Kevin: Alright, look. Some of this is going to be opinion-based. I’ve heard so many different people talk about this in different ways and some people say they’re three totally different things. Other people have said, “ah, they’re very related, very close to one another.” I think some of it is opinion and there’s somebody currently raging, probably driving right now yelling at the radio that I’m totally wrong on this and that they have the correct answer. Keep yelling. I totally agree with you, person raging on the drive. For me, I can get behind that concept of data science as a confluence of a few different skills, or an emphasis particularly on the statistical analysis side. The machine learning? Agree with Nicki, this is a feedback process. It’s an algorithm that you give information, it spits out results and as a process of finding out whether or not the results were correct, so matching reality versus prediction, it then is able to take those changes and find a way to improve the algorithm itself in the process. An example, this is a method that’s called the Online Passive Aggressive Algorithm, which is how I Tweet. The concept is that you have, say, a real easy example, two categories. Is this a cat, is this not a cat? You show it pictures and say, “this is a cat” and it eventually will predict, “this is a cat” or “this is not a cat”. Based on a set of weights that we won’t get into, it predicts that something is a cat and it’s really a turtle, so you tell it, “no, this is not a cat”. What will happen is that when you give it the feedback that “no, this is not a cat”, all of the weights change to the point where the marginal decision becomes “no, this is not a cat” and then you go to the next picture. “Is this a cat?” And every time it’s correct, it’s passive, it stays, all the weights stay the same. Every time it’s wrong, it’s aggressive. It aggressively changes to make that last answer correct.
So, it’s consistently learning based off of results that are fed and can learn from actual responses to predicted events. That’s the machine learning part. It’s learning from its results. Artificial intelligence, I take a much more behavioralistic approach to AI and think of it as agents who are acting. Where an agent is some independent process, actions derived from some thought process or some set of rules or heuristics that are built into the agent or that the agent learns over time based on stimuli and then it performs some action. It will select that this is a cat or not a cat, or that it will go find all of the cats in the room and move them over to a certain location. So, this is, in my mind, it’s something that is not so much a technique as much as it is a precursor for some ‘thing’ performing actions. And that was Philosophy Corner with Kevin.
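The cat-or-not-a-cat description of the Online Passive Aggressive Algorithm can be sketched in a few lines of Python. This is a minimal illustration of the basic binary variant, not anything shown in the episode: the class name, the three made-up features (has_whiskers, has_shell, and a constant bias term), and the example inputs are all hypothetical. One detail to note is that the textbook version stays passive only when the example is classified correctly with enough margin (hinge loss of zero), which is slightly stricter than simply being correct.

```python
def dot(a, b):
    """Dot product of two equal-length lists of numbers."""
    return sum(x * y for x, y in zip(a, b))


class PassiveAggressiveClassifier:
    """Basic online passive-aggressive binary classifier (labels are +1 / -1)."""

    def __init__(self, n_features):
        self.weights = [0.0] * n_features

    def predict(self, features):
        return 1 if dot(self.weights, features) >= 0 else -1

    def learn(self, features, label):
        """Update on one labelled example; return True if the weights changed."""
        # Hinge loss: zero when the example is on the right side with margin >= 1.
        loss = max(0.0, 1.0 - label * dot(self.weights, features))
        if loss == 0.0:
            return False  # passive: leave the weights alone
        # Aggressive: smallest change that puts this example exactly on the margin.
        step = loss / dot(features, features)
        self.weights = [w + step * label * x for w, x in zip(self.weights, features)]
        return True


# Hypothetical features: [has_whiskers, has_shell, bias]
cat = [1.0, 0.0, 1.0]
turtle = [0.0, 1.0, 1.0]

model = PassiveAggressiveClassifier(n_features=3)
first_guess = model.predict(turtle)              # untrained model guesses "cat" (+1)
changed_on_mistake = model.learn(turtle, -1)     # feedback: "no, this is not a cat"
after_feedback = model.predict(turtle)           # now "not a cat" (-1)
changed_when_correct = model.learn(turtle, -1)   # already right: no change
model.learn(cat, +1)                             # teach it what a cat looks like

print(first_guess, after_feedback, changed_on_mistake, changed_when_correct)
print(model.predict(cat), model.predict(turtle))
```

After the single correction, the turtle is classified as not a cat, and showing the same turtle again leaves the weights untouched, which is the passive half of the name.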
Carlos: Yeah, there you go. I was going to say, “until Elon Musk says I can start using it, I’m going to treat it with a grain of salt.”