Episode 24: Query Store

One of the new features being added to SQL Server 2016 is the Query Store, which will allow for greater access to the performance history of your databases. This is going to be extremely helpful when you are approached about ‘What happened this morning?’ With the Query Store feature, you will now be able to provide some insight into that question.

I chat with Borko Novakovic, the program manager for the Query Store feature, and he shares with you, Compañeros, details about the new feature and how those interested in performance monitoring will be able to use it to their benefit.

Even if you aren’t going to run SQL Server 2016 as soon as it comes out, you will find this information interesting and I hope it helps you along the SQL trail.  Have fun out there.

Transcription: Query Store

Carlos L Chacon: This is the “SQL Data Partners Podcast,” my name is Carlos L Chacon, your host and in episode 24 we’re talking about the Query Store, the new 2016 feature that will enable you to get better history of the performance of your database.

I’m joined today by Borko Novakovic, from Serbia. He is the program manager for the Query Store feature. In other sessions that he has given he compares the Query Store to the flight data recorder for your database. I know this is going to be very useful for a lot of administrators out there that want to get access into that and see why queries may have changed in their performance over time.

After twenty-some podcast interviews you’d think that I’d have all the sound quality issues down, and I am embarrassed to say that something happened on my side. Borko fortunately sounds very good, but my side…

Anyway, something happened. I made some edits so that Borko is a little bit more featured there. I do apologize for that, and I hope that you’ll bear with me and continue to tune in.

If we miss something, or if you have some other feedback to send my way, of course you can hit me up on Twitter @CarlosLChacon, or you can shoot me an email at [email protected].

Compañeros, it’s always good to have you. I hope that you have enjoyed the holiday season and are looking forward to a great 2016. I know I am. It’s good to have you, and as always, compañeros, welcome to the show.

Audience: SQL Data Partners. [music]

Carlos: Borko, thanks for being here. Welcome to the program.

Borko Novakovic: Thank you, Carlos.

Carlos: Let’s go ahead and jump in. Tell us why the query store will be such a cool feature.

Borko: Query Store is a new feature, or a tool, in SQL Server 2016 and Azure SQL Database that dramatically lowers the bar for query performance troubleshooting. In a sense, you can think of it as a flight data recorder for your workload. It captures and collects all the relevant data that is important for query performance troubleshooting, so that the data is always available for you when you need it.

Carlos: Right.

Borko: Basically, the Query Store does two fundamental things for our users. It collects all relevant compile-time and run-time data related to queries and execution plans. Based on that data, you can do any kind of analysis and take corrective steps if you have performance problems. That’s one thing. Another thing is that it really simplifies the process of plan forcing in SQL Server. Prior to Query Store you would need to use either plan hints or plan guides, and neither of those features is that intuitive for the average user. Now, with the Query Store, you are just one click away from forcing the plan that you believe is the best one for your query.
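As a rough sketch of the plan forcing Borko describes: the Query Store ships a pair of stored procedures for forcing and unforcing plans. The query and plan IDs below are purely illustrative; in practice you would pull them from the Query Store catalog views first.

```sql
-- Inspect the plans the Query Store has captured for a given query,
-- then force the one you believe is best. IDs here are hypothetical.
SELECT p.plan_id, p.query_id, p.is_forced_plan, p.last_execution_time
FROM sys.query_store_plan AS p
WHERE p.query_id = 48;      -- hypothetical query_id

EXEC sp_query_store_force_plan @query_id = 48, @plan_id = 103;

-- Undo the decision later if the forced plan stops being the best one.
EXEC sp_query_store_unforce_plan @query_id = 48, @plan_id = 103;
```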

Carlos: Yes, that’s going to be interesting. I think one of the big differences between the Query Store feature and, for example, the DMVs we have now, is that with the DMVs we can get some insight into the last query that was run, and the cumulative nature of that query, whereas the Query Store is going to give me point-in-time data: this query has run ten times today, and I want to see stats about each of those ten query attempts.

Borko: One of the problems with the DMVs is actually the fact that when you look for a specific plan in the plan cache DMV, in some situations you might not be able to find it, because for many reasons SQL Server might decide to evict the plan from the cache. It’s no longer in memory. In many cases that’s exactly when people are doing troubleshooting: customers are complaining about the performance, the DBA is doing performance analysis, and he cannot conclude what went wrong. With the Query Store, that cannot happen, because the full history of query execution is persistently stored inside the user database. Even if you had the performance problem a day ago, or a couple of days ago, you can go back in the past and see how the query was executed, how many times it executed, what all the different plans used for the query execution were, and compare them side by side.

Carlos: I think that’s a common scenario where a user will come up and maybe nonchalantly say, “Hey, this morning I was having trouble. It seems fine now after lunch. But what was going on this morning?” [laughs]

Borko: Exactly. That is one of the main use cases for the Query Store. And of course, having the data for the analysis gives you the opportunity to open up new scenarios. You can analyze, “What are the top resource-consuming queries today, during the last week, during the last month, ever?”, which was completely impossible to do with the DMVs.

Carlos: The DMV data is only cumulative. You can only look since the last restart, or since the last reset. You can’t say, “Only give me last week.”

Borko: And only under the condition that the plan is currently in the cache, which might not be the case. Now you have statistics both for the queries that are currently in the cache and for those that are not in the cache, for ad hoc queries. You can analyze top resource-consuming queries, but also you can understand your workload better. You can understand the nature of your workload. You can conclude whether your workload is parameterized. What is the number of different queries? What is the number of different plans? How many plans do you usually have per query in your workload? Stuff like that. Then you can distinguish between queries that execute very frequently in your system versus those that execute once a day, or once a week. Then you can apply different techniques to optimize all of these.
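The kind of “top resource-consuming queries over a window” analysis Borko describes can be sketched against the Query Store catalog views. The TOP count and the seven-day window below are arbitrary choices, not a prescribed report:

```sql
-- Top 10 queries by total duration over the last 7 days.
-- avg_duration is in microseconds; total = average * execution count.
SELECT TOP 10
    qt.query_sql_text,
    SUM(rs.count_executions) AS executions,
    SUM(rs.avg_duration * rs.count_executions) AS total_duration_us
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id
JOIN sys.query_store_runtime_stats_interval AS i
    ON rs.runtime_stats_interval_id = i.runtime_stats_interval_id
WHERE i.start_time > DATEADD(DAY, -7, SYSUTCDATETIME())
GROUP BY qt.query_sql_text
ORDER BY total_duration_us DESC;
```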

Carlos: Now, we’re familiar with including the actual plan results when we execute a query, but that process of collecting that information takes effort for SQL Server. How are we getting around some of the performance impacts of now collecting this information for all of those queries over the course of, say, even a week?

Borko: I could translate your question if I understood it correctly. “What is the average performance impact of running Query Store with the workload?” Right?

Carlos: Fair enough.

Borko: This is the bottom-line question. Basically, of course, the usual answer for this question is that it depends on your workload, right? But I will now try to explain how it depends. The main factor that is critical for the performance overhead is how many different queries and how many different plans you have in your system. What happens is, if your workload is predominantly an ad hoc workload, you have, let’s say, thousands or hundreds of thousands of different queries. Then your workload is actually difficult both for SQL Server and for the Query Store. It puts a lot of load on both, because SQL Server will be constantly compiling new queries and new plans, which probably means you are not using parameterization as much as you could.

In that case, you might observe some performance overhead. But if you have really well-packed, really well-designed workload — let’s say you have a couple of thousands of different queries, and then you have plans that are being reused — then the overhead is really, really minimal.

When I say “minimal,” what we observe in Azure SQL Database — and we are running Query Store on thousands of databases for months now because Query Store is really a cloud-first feature, and we really learn a lot and improve a lot based on Azure feedback — what we see is really negligible performance overhead. On average, it is a few percent — one to two percent, even less.

Now, it doesn’t mean that every workload in the world will have that performance overhead. As I mentioned, you can construct synthetic workloads that can have five percent, maybe even more, overhead. But what I’m trying to say is that what we observe across thousands of different databases is really minimal overhead. The design we implemented is such that we are minimally blocking the current workload.

What happens is, actually, there are two integration points between every workload and the Query Store. Every time a query is compiled or recompiled, the new plan and compile-time statistics are sent to the Query Store, using the message channel that exists between the query optimizer and the Query Store.

Every time a query gets executed, then the query processing pipeline sends the runtime statistics and Query Store then calculates statistic information for the interval over which query was executed.

Now, going back to my previous point: if you compile rarely and execute frequently, then the only point where overhead exists is actually during the runtime. That really is a minimal overhead.

Carlos: OK.

Borko: If you’re executing different queries all the time, then what you’ll be doing — what SQL Server will actually be doing — is SQL Server will be compiling plans all of the time, and then send plans and compile time statistics to the Query Store, and that will basically incur significant performance overhead. The most expensive overhead comes from plan compilation and sending the compile time statistics.

Carlos: Sure. For those maybe using Entity Framework and other things, you might be a little bit more in trouble. If you’re using stored procedures and whatnot, you’re…

Borko: Exactly.

Carlos: …A little bit safer. I think you could use the DMVs, because the DMVs will give you the object information if the plans are coming from an object like a stored procedure or a view. If it’s ad hoc, that information will be there too. You can get a sense, overall, for the percentage of ad hoc versus object plans, if you will, plans that come from stored procedures. That would give you a sense for what my performance impact might be.
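One rough way to gauge the ad hoc versus object-plan mix Carlos describes, using the plan cache DMV (grouping on `objtype` is just one convenient cut of the data):

```sql
-- Count cached plans by origin: 'Adhoc' and 'Prepared' entries
-- versus 'Proc', 'View', 'Trigger', etc. A plan cache dominated
-- by Adhoc entries hints at a poorly parameterized workload.
SELECT cp.objtype,
       COUNT(*) AS plan_count,
       SUM(CAST(cp.size_in_bytes AS bigint)) / 1048576 AS cache_mb
FROM sys.dm_exec_cached_plans AS cp
GROUP BY cp.objtype
ORDER BY plan_count DESC;
```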

Borko: You can use the Query Store, of course, to easily identify whether your workload is ad hoc or not.

Carlos: Once you’ve set that up, then obviously you’re collecting that information, and I’d expect the Query Store to tell you that.

Borko: Yes. Then, if you have that information in front of you, you can decide what to do, because there are some techniques that you can apply, even without changing your workload, to force parameterization and then reduce the number of different plans that will be used during query execution.

Carlos: Sure. The benefit of having that information is being able to force the parameterization, as you mentioned, to reduce the number of plans.
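One such technique, forcing parameterization at the database level, can be sketched as below. The database name is a placeholder, and this setting changes plan behavior workload-wide, so it deserves testing before production use:

```sql
-- Make SQL Server parameterize literals in ad hoc statements
-- so similar queries can share plans.
ALTER DATABASE [MyDatabase] SET PARAMETERIZATION FORCED;

-- Revert to the default behavior.
ALTER DATABASE [MyDatabase] SET PARAMETERIZATION SIMPLE;
```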

Borko: Yes, and just to mention one thing. The overhead of running an ad hoc workload is not only in the fact that too many compilations and recompilations happen, but also in the fact that the Query Store is being flooded with a large number of similar queries and plans. And then what can happen in that case is that you can very easily exceed the size quota the Query Store has inside every user database. Maybe I didn’t explain that part yet, but basically the Query Store stores the data inside the user database. There is a size quota that you can set and configure to any value that you think is reasonable. We start with 100 megabytes, but basically if you’re running ad hoc workloads, what we observe is that those kinds of workloads very quickly exceed the quota. They are generating more and more new queries and new plans, which are all similar, but not the same.

In that case, what happens is the Query Store goes into read-only mode, because it exceeds the quota, and then you are not getting the latest data. If you go to the Query Store you won’t see the latest data, because it will have stopped collecting it.
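To check whether a database’s Query Store has hit its quota and dropped into read-only mode, something like the following should work against the 2016 catalog view (the database name in the ALTER is a placeholder):

```sql
-- Run in the user database. If actual_state_desc is READ_ONLY,
-- readonly_reason hints at why (e.g., the size quota was exceeded).
SELECT desired_state_desc,
       actual_state_desc,
       readonly_reason,
       current_storage_size_mb,
       max_storage_size_mb
FROM sys.database_query_store_options;

-- Raising the quota (value in MB) is one way to get back to read-write.
ALTER DATABASE [MyDatabase]
SET QUERY_STORE (MAX_STORAGE_SIZE_MB = 500);
```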

Carlos: Once it gets to about 100 megs, it stops.

Borko: Yeah. Which is a second bad thing which happens, and yet another reason why you need to do something about ad hoc workloads.

Carlos: Let’s just close that loop for those who are experiencing the Query Store for the first time. You mentioned how the data gets stored. Previously, if you compare that to the Dynamic Management Views, the information gets stored in the resource database; it’s not visible to the end user. Now you’re saying I enable the Query Store on a database, and whatever goes into the Query Store for that database actually gets stored inside that database, and then that information is available to me via SQL Server Management Studio.

Borko: Yes, and what happens is the data is persisted across backups and restores, so even if you back up your database and store it in some other place, you get the Query Store data inside. I can just quickly and briefly explain how the Query Store works and what data the Query Store collects and captures inside the user database. Let’s start with queries. Every unique query text is captured and stored inside internal structures of the Query Store. First of all, the Query Store uses internal tables to persist the data, and all the data is exposed through a set of system views that you get as part of SQL Server.

If you go and search for sys.query_store, you will find a couple of views that expose query text, queries, plans, and run-time statistics. We started with text. Every unique query text is stored in the Query Store.
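A quick sketch of that search from a query window; the IN-list reflects my understanding of the 2016 Query Store surface:

```sql
-- The Query Store surface in SQL Server 2016 is a handful of catalog
-- views whose names start with 'query_store', plus the options and
-- context-settings views.
SELECT name
FROM sys.all_views
WHERE name LIKE 'query_store%'
   OR name IN ('database_query_store_options', 'query_context_settings')
ORDER BY name;
```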

Now, a very important thing: one unique query text can produce multiple records in the Query Store. This is because, as we know, if you execute the same query text using different context settings — we call them “plan affecting settings” — you can actually produce different results. If you change the language, or if you set ANSI_NULLS differently. If you change some of these SET options, the same query text can produce different results.

Therefore, we store the same query text under the different SET options as different query entries in the Query Store. That means you can analyze and tune those queries independently. For us, it’s also important whether some query text is used inside or outside of some containing object, like a stored procedure, function, or trigger.

Say that you have a query text. The same query text used as an ad hoc query and used as part of a stored proc will generate two query entries. You will be able to analyze them independently, and if you need to tune that query text, you actually have the option to tune your ad hoc query independently of the stored procedure query.

Then, for every query entry in the Query Store, we collect every execution plan, every plan that we get from the compilation phase, inside a separate table, and expose it through a separate view. A set of compile-time statistics is collected, like the CPU time used for the compilation, and some other metrics.

Then, the most important part. That was the compile-time part; now the run-time part. Every time a query gets executed, one of its plans is executed. What the Query Store does is divide execution into time windows. We call them “run-time intervals.” This is something that you can control and configure. You can say, “I want to track execution every minute,” or every five minutes, or every hour, or every day. It defines the granularity that you’ll have inside the Query Store.

For every query that was executed on that interval, you will have exactly one row in the Query Store with aggregated run-time statistics for that plan.
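Configuring that granularity is a single database option; a minimal sketch, assuming the 2016 syntax and a placeholder database name. As I understand it, the option accepts a fixed set of values such as 1, 5, 10, 15, 30, 60, or 1440 minutes:

```sql
-- Aggregate run-time statistics into 15-minute buckets.
ALTER DATABASE [MyDatabase]
SET QUERY_STORE (INTERVAL_LENGTH_MINUTES = 15);
```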

Carlos: If I understand what you’re saying right, I have the option of saying I’m going to keep this information for a month, but I want you to aggregate it in chunks of one hour, so if the query runs ten times an hour…

Borko: Yes, in chunks, right.

Carlos: …I’m going to see the average…am I going to see average results, or cumulative intervals?

Borko: You’ll see a couple of metrics. You’ll see the number of executions, that is, how many times the query was executed in that interval; then you’ll have the average, and then you have the maximum, minimum, and standard deviation for the duration. Using the number of executions and the average, you can easily calculate the total duration of that query. Those are the statistics that you have for your analysis. It wouldn’t be feasible to store each and every execution of the query, because there are queries that execute thousands of times in a minute.
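Those per-interval statistics live in sys.query_store_runtime_stats; deriving the total duration from the count and average, as Borko suggests, might look like this (durations in this view are reported in microseconds):

```sql
-- Per-plan, per-interval statistics, with the total duration
-- derived as count * average.
SELECT rs.plan_id,
       rs.runtime_stats_interval_id,
       rs.count_executions,
       rs.avg_duration,                                   -- microseconds
       rs.min_duration,
       rs.max_duration,
       rs.stdev_duration,
       rs.count_executions * rs.avg_duration AS total_duration_us
FROM sys.query_store_runtime_stats AS rs
ORDER BY total_duration_us DESC;
```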

Carlos: Let’s get into some of that. You mentioned your experience with Azure SQL Database. You’ve seen the usage and some of the scenarios. What is a general guideline? Let’s say I’m just starting out. I want to get some better insight into my environment. What are some of the scenarios or — “best practices” may be a strong word just yet — the setup components that I’d want to look at, that I’d want to implement?

Borko: Usually what we explain to our customers is how to get started with the Query Store. The number one thing that we advise our customers is, “You need to install the latest version of SQL Server Management Studio,” which means the 2016 Management Studio. You probably know that as of CTP2, you have a standalone installer for Management Studio, which means you can install Management Studio independently from SQL Server.

Which is especially interesting and useful if you don’t have SQL Server running [inaudible 19:53] in your environment, but you want to use the Query Store in Azure. The reason we suggest people install the latest version of Management Studio is because Management Studio comes with a set of great UIs that you can use both to configure and to analyze the data that the Query Store collects.

As a starting point, if you go and click the properties of your database, then on the left-hand side you’ll notice a new tab — a Query Store tab — for the database. It opens a form that you can use to activate the Query Store and set the operation mode to Read Write, which means, “I want the Query Store to start collecting the data.”

There are a set of very important parameters that you need to configure for the Query Store. For many people it’s actually good enough to take defaults because SQL Server Management Studio comes with defaults, but some settings are important and I want to emphasize those.

I already mentioned the maximum disk size; we start with 100 megabytes. Another setting that people need to consider is how long you want to keep the data inside the Query Store. I think we came out with a pretty conservative default, which is one year and two days.

Carlos: OK.

Borko: Which I think is a really long period of time for many users. My advice for users is, if you don’t plan to analyze data for such a long period of time, you don’t need to keep the data, because it will be overhead to the system.

So, if you want to analyze data only in the last month, then just set that value to 30 days and that’s completely fine.
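Turning the Query Store on with a 30-day retention window, as suggested, can be sketched in T-SQL. The database name and the other option values are placeholders; as Borko notes, the SSMS defaults are fine for many people:

```sql
ALTER DATABASE [MyDatabase]
SET QUERY_STORE = ON
    (OPERATION_MODE = READ_WRITE,
     MAX_STORAGE_SIZE_MB = 100,                          -- default quota discussed above
     CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30), -- keep 30 days of data
     INTERVAL_LENGTH_MINUTES = 60);
```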

Carlos: Then we download it. So, what about the case where — I’m a good DBA, right, I want to review my systems, let’s just say on a quarterly basis — I keep it for three months. Is there a way to persist that data, or even export some of it, to say, “OK, I’ve made these changes. I’ve applied parameterization to some of these queries, the ad hoc queries that all look the same. I don’t necessarily want to keep that data, but I do want to get some thoughts on whether the changes that I’ve made have been good for my system.” So, would the advice be to just keep the data for the two quarters so that you can compare them, or is there a way to export some of it and say, “Here’s where I was. Here’s where I am now.”

Borko: You can use different techniques to export the data from the Query Store to any external structure, because we do expose the views, so you can use SSAS. You can use any other export procedure to export the data to your custom structures. You can use the backup and restore mechanism to export the data. Now, we don’t have any specific mechanism today to export only the Query Store data to some format and then import it from that format into another database. We don’t have that mechanism today. I must admit that we are thinking about having such a mechanism, but it doesn’t exist.

So, say you want to compare. You always want to keep the last month of data, but you made some plan forcing decision in the past and you want to analyze how your database today compares to one year ago. Then you need to do the analysis on your own.

If you exported the data previously into some structures, some tables, and you then compare it with the current data in the Query Store in the same database, you can do that analysis.

I must admit that today you are pretty much dependent on your own skills for exporting and analyzing the data.
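Since the data is exposed through views, a do-it-yourself snapshot can be as simple as SELECT…INTO an archive table; the archive table name here is hypothetical:

```sql
-- Capture the current run-time statistics, keyed by query text and plan,
-- into a plain user table for later before/after comparison.
SELECT q.query_id,
       qt.query_sql_text,
       p.plan_id,
       rs.runtime_stats_interval_id,
       rs.count_executions,
       rs.avg_duration
INTO dbo.QueryStoreSnapshot_Q1        -- hypothetical archive table
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id;
```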

Carlos: The MSDN site has lots of great information on how to set this up and use it, and we’ll have that information in the show notes today at sqldatapartners.com/podcast. You can go in and get some of this information. Some of the step-by-step details on how to set those things up will be there, available for you as well.

Borko: Yeah, thank you. That’s awesome.

Carlos: We had mentioned that this has been in Azure SQL Database as a cloud-first feature. Are there going to be any differences between the Azure feature and the on-premises feature when it gets released? Which I know is a little bit hard to tell because it’s just CTP3. Current thoughts, right, in November 2015, of what differences there might be.

Borko: If you compare today the Query Store in Azure SQL Database and in SQL Server [inaudible 24:44] CTP3, you won’t find any difference. The feature set is completely the same. As of the beginning of November, we are actually at the GA level of quality in Azure SQL Database. We call it “generally available,” which means we are really at the production level of quality in Azure SQL Database, which automatically means we have the same level of quality on-premises, because we are using the same code base. Everything is the same.

Carlos: Now, one thing that we have maybe danced around just a little, and that is, once I have collected all of this information — and you mentioned the newest version of SQL Server Management Studio — there are the analytics that I’ll have inside of Management Studio. There’s actually reporting…

Borko: Yes.

Carlos: …and some charting available for you. So, it’s not just looking at raw data, if you will. It will guide you on that path to decision making.

Borko: The visualizations that we provide in SSMS are actually reports that help you go through the main scenarios. Although you could analyze the Query Store views on your own — there are six different views that you can join and combine the data from — it’s of course much easier to go and look at the predefined reports, because they visualize the data in a very intuitive and simple way. I will quickly cover the three most important views and reports that we provide as part of SSMS. The first one we call “regressed queries.” That is the view that you can use to easily and very quickly analyze which queries behave worse today compared to, let’s say, the average over the last seven days.

If you will, the default is, “Give me the queries whose total duration is worse compared to the last seven days.” Of course you can change the metric that you analyze. By default it’s duration, but you can use CPU time, you can use memory, you can use IO metrics.

Instead of writing complex queries against the Query Store, you can go and configure the report and it will give you the top N queries that actually need your attention.

Then, you can go and analyze from a different perspective, which we call “top resource-consuming queries.” Those are really the most expensive queries that you have in your system. Again, you are free to specify the metric that you are interested in, whether it’s total duration, or CPU, or memory, or IO, and the period of interest. Again, even if those queries are not regressed, and maybe they behave in a completely predictable way, they are still the most expensive, which means those are the guys that you need to pay attention to, right?

Because you might have thousands of queries, but usually you have a handful of these that are very expensive, and then you need to pay more attention to them. And in some cases you can easily see, “Hey, this query was executed with three different plans,” and, “In the last month, this one was the best one, but for some reason SQL Server was switching between those three and I need to analyze why.”

Of course, when you click on the plan, it’s very handy: at the bottom of the visualization you have the plan visualization. Very complex data is visualized in a very intuitive way, and you have everything that you need: query and plan execution history, and a single-plan drill-down.

Just to conclude, the third scenario we call “overall resource consumption” — and some people even start from that one, right? — which is, “This is my database, this is the time axis, I want to see the total CPU for my database in the last day.”

I have intervals and I can see the chart of what CPU, duration, or memory consumption looked like over the last day, and then I can double-click an interval. From that point, I’ll actually go to the top resource-consuming queries for that interval.

So, using SSMS will actually save you a huge amount of time.

Carlos: It’s a complete feature, right? Not only the collection of that data, but then giving you the tools to drill down and make sense of what information is being collected.

Borko: Exactly.

Carlos: You presented at PASS this year. It was the last session of the conference Friday afternoon at 3:30 pm. The room was packed. It was probably, gosh, a thousand people in there. More. So, I know that Query Store is going to be a very interesting feature. You had mentioned some metrics actually in looking at the 2016 features and kind of where Query Store fit in there. You want to talk about that for a second?

Borko: Based on the data that we got from people who wanted to share the information with us, the Query Store is really among the top three features currently, among the top features of SQL Server 2016, and we are very happy about that, because we know it is something with which we can really help our customers optimize and understand their workloads much better than before. Yeah.

Carlos: So, it will be exciting. We were chatting, and I’ve seen the demo, but Borko smacked my hand and said I need to download CTP3 and get in there and play around with it a little bit more. So, I have my homework assignment. Compañeros, I hope that you’ll take the opportunity to do the same. Of course you can use Azure SQL Database to do that if you have access there, or you can download the CTP version.

Borko, thanks for chatting with us. We want to transition just slightly into what I’m now calling the “SQL Family” section of the podcast; it’s an opportunity just to chat a little bit more about you and how you do your work. We were talking about the Query Store, and it’s OK — I know that you’re on the program team for that, so if the Query Store is your favorite tool, that’s OK. But what’s your favorite SQL tool?

Borko: I think that answer is easy for me. The favorite tool for me is really SQL Server Management Studio.

Carlos: OK.

Borko: Not because I work for Microsoft, but because…I mean, I loved SSMS even more before I joined Microsoft. I joined Microsoft four years ago, but before that I was in the industry for maybe 14 years. I was first a developer, and then a product manager and also a software architect in an ISV company that built different solutions on top of the Microsoft technology stack using SQL Server, and I found SSMS extremely useful. Why? Because I had everything that I needed in a single place. Now, this Query Store visualization, which is really extraordinary, makes me love it even more.

Carlos: Sure.

Borko: And I think to be completely honest I think we are doing a great job to improve SSMS in the future.

Carlos: Yeah. That was funny. I hear that SSMS in SQL Server 2016 has gotten a little bit of a face lift in its next version, right? [laughs]

Borko: Yeah.

Carlos: Very good. Very cool. You mentioned you’ve been with Microsoft for a while. Being over in Serbia, you’re going to have a slightly different experience than, for example, myself here in the United States; we all take different paths on the SQL trail. What is the best piece of advice you’ve ever received?

Borko: Basically it’s interesting. I graduated from Belgrade University. I actually took a course in electronics. I was supposed to work in hardware industry. [laughs]

Carlos: [laughs]

Borko: The best advice that I got from my very good friend and colleague was, “You should become a developer. [laughs] It will give you more freedom, more flexibility. You’ll be more creative.” That was the best piece of advice. I never worked as an engineer in electronics. I started immediately after university as a simple SQL developer. At some point I worked for a company that built a solution on top of SQL Server; that’s how, about ten years ago, my SQL journey started.

Carlos: Our last question for today. If you could have one superhero power, what would it be, and why would you want it?

Borko: That’s a difficult question for me. Probably going back to the past and seeing what history looked like is something that I would really like. I’m a really big fan of history. I’m really into history. I like history. In my free time, I actually explore and read a lot about history. I think I would go to old Rome and see how things actually looked, because I think there are a lot of myths, a lot of beliefs about history. The reason is not just my own interest, but also maybe to explain to people how things looked. I do believe that learning from history is a great way to pave the road for the future.

Carlos: Very interesting. I enjoy travel myself. I think you’re in a great location in Europe to get access to a few of those spots, at least — of course, not as they were.

Borko: Greece is near, Rome is near, so history is near. [laughs]

Carlos: [laughs] Very cool, Borko, thanks again for chatting with us today. You’ve been great.

Borko: Thank you, Carlos. [background music]

Carlos: We’ll have, again, show notes available at sqldatapartners.com/podcast if you want to check that out. We’ll probably be following up on any additional questions, just because this is an area that has a lot of people interested. As we get closer to RTM we may reach out to Borko and have another chat with him. Thanks again, compañeros, and we’ll see you on the SQL Trail.

[background music only]

Audience: SQL Data Partners.

Episode 23: IO, IO, It’s Off to Tune We Go

If I were a betting man, I would wager your disk IO system is probably the biggest bottleneck in your SQL Server right now. Sure, there are lots of moving pieces, but if you can tame the IO situation, you are putting yourself in the driver’s seat. So how do you go about putting together the pieces you need to identify your IO constraints?

I chatted with Theresa Iserman, a Premier Field Engineer at Microsoft about some of her thoughts and why IO in general is an issue.  She gives us some guidelines and even some metrics to check out.  Join us in the conversation, and I’ll see you on the SQL trail . . .

Show Notes

Theresa on Twitter
Paul Randal’s Wait Stats Query
Perfmon Template with Counters Mentioned in Episode (Coming Soon)
Performance Analysis of Logs (PAL) Tool
Pssdiag and Sqldiag Manager

Transcription: IO Tuning

Carlos L Chacon: This is the “SQL Data Partners” podcast. My name is Carlos L. Chacon, your host, and this is episode 23. Today we’re talking about IO tuning, and I have as my guest, Theresa Iserman.

Theresa is a SQL Server Premier Field Engineer with Microsoft. I’m always glad to have the Microsoft folks on the podcast. They are at the forefront of a lot of what we do as SQL Server professionals.

Today we’re going to talk a little bit about IO tuning, how to capture it, how you’d go about deciding what changes you need to make.

Theresa mentions a couple of very specific counters or measures in some of her comments. Obviously, you’re in your car, you’re walking your dog — you’re not going to have an opportunity to write those down, so we’ll have the show notes available at sqldatapartners.com/podcast. You can check out today’s episode, episode 23.

In addition to that, I’m actually going to put together a template based on some of the counters that she suggested.

That way if you want to go ahead, just put that right in. You just start taking a look at some of those items and begin measuring that over time. Compañeros, it’s always good to have you. I do appreciate you tuning in.

If there is something that you’d like to hear about or something that we might need to improve about one of our episodes, you can reach out to me on Twitter, @CarlosLChacon, or you can reach me via email at [email protected]. Let’s get into it and welcome to the show.

Children: SQL Data Partners. [music]

Carlos: Theresa, welcome to the program.

Theresa Iserman: Thank you, Carlos.

Carlos: It’s great to have you. We happened to meet up in Portland, and I saw your session with a very interesting name, “Anatomy of an IO Tuning Exercise,” and I thought, oh, man! A lot of us have IO problems. We know that our systems are limited by the resources that they run on. More often than not, IO — those disks — is probably going to be the biggest bottleneck in our systems. I wanted to have you on the show today to talk a little bit about this, your experience with tuning some of these IO problems.

Interestingly enough — a side note here — I’ve done 20-something of these episodes, and you’re our first female guest.

Theresa: Oh, wow.

Carlos: [laughs] You just happened to beat out Jessica Moss. In fact, I think maybe I’ll have to have you ladies back to back because, just from scheduling, we couldn’t quite get her on. So, we talk about IO and looking at how we’re going to go about tuning some of that — what are some of the components? What are some of the system processes that we have to be familiar with to help us really identify what’s going on in our systems?

Theresa: Yes. We want to look to see where we have a problem. A lot of times, when we identify an issue related to IO, we’re looking at latency. Latency is a measure of time — the time it takes to return a page or an extent from disk to memory. The amount of time that takes really equates to time that our users are waiting. We want to know what’s normal in our systems by taking baselines — we want to know what’s normal in terms of latency. When we identify problems, users might tell us the server is slow. We want to identify where that problem is.

We would start typically with — everybody has their own tuning methodology, but one of the tools that you could start with would be wait stats. That would give you an idea of what your system is waiting on. If you identify a wait that’s related to IO, like WRITELOG or PAGEIOLATCH, you might want to dig in and look at virtual file stats, which is a dynamic management function that gives you file-by-file information about your reads, writes, and stalls. You can look at latency by file.

Carlos: Let’s actually dive into that a little bit, into those waits. Why would my SQL Server be waiting? You mentioned a couple of things — it could be waiting on IO, and you mentioned a couple of wait examples. Under what circumstances would my server be waiting?

Theresa: We always have waits in our system. They happen by design, because we have lots of things happening in our system and resources are sometimes unavailable to process the things that SQL Server needs to do at a given time. We have a process called scheduling. We all know about the processors that we have in our servers. Each of those processors is associated with a scheduler. The scheduler queues up threads, and these threads work their way around different states on the scheduler until all the resources that they need to perform their task are available.

Any time a resource is not available, the thread is not on the processor running. The thread is waiting until what it needs to perform its task is available.

Carlos: These wait stats — ultimately we are using some dynamic management views, some DMVs, and SQL Server is collecting the amount of time that it’s waiting on all of these different components.

Theresa: Exactly. We have a couple of DMVs that are particularly helpful when we’re looking at wait stats. We’ve got sys.dm_os_waiting_tasks. This is typically the first DMV that you want to run when somebody tells you that a problem just started.

This is a picture of what’s happening right now on your server. It’s not cumulative like many of our DMVs are. You can also look at sys.dm_os_wait_stats, which gives you a picture of all your wait stats since the server was last restarted or the wait stats were cleared.

Typically we want to look at this DMV with what we call a delta. We want to take a snapshot at one point and then a snapshot at another point and compare the difference, so that we can get an idea of what we were waiting on during that particular period of time.
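The snapshot-and-delta approach Theresa describes can be sketched like this (a minimal illustration — the one-minute delay and the temp table name are arbitrary choices, and the excluded wait types are just a few of the benign ones):

```sql
-- Snapshot 1: capture the cumulative wait stats
SELECT wait_type, waiting_tasks_count, wait_time_ms
INTO #wait_snapshot
FROM sys.dm_os_wait_stats;

-- Let the workload run for the period you care about
WAITFOR DELAY '00:01:00';

-- Snapshot 2: subtract snapshot 1 to see only this window's waits,
-- filtering a few benign wait types out of the results
SELECT ws.wait_type,
       ws.waiting_tasks_count - s.waiting_tasks_count AS tasks_delta,
       ws.wait_time_ms        - s.wait_time_ms        AS wait_ms_delta
FROM sys.dm_os_wait_stats AS ws
JOIN #wait_snapshot AS s
  ON s.wait_type = ws.wait_type
WHERE ws.wait_time_ms - s.wait_time_ms > 0
  AND ws.wait_type NOT IN (N'WAITFOR', N'LAZYWRITER_SLEEP', N'CHECKPOINT_QUEUE')
ORDER BY wait_ms_delta DESC;
```

The community queries mentioned in the episode filter a much longer list of benign wait types; this is only the shape of the technique.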

Carlos: OK.

Theresa: We have to do some computations on this DMV to remove the benign wait types. If we have WAITFOR commands in our code, for example, we don’t want to include that. We don’t necessarily need to include lazywriter or checkpoint waits, either. I think Paul Randal has a really great query that you can use that includes information about the wait types and which ones to filter out, if you search on the sqlskills website.

Carlos: Yes, that was one of the ones I was going to bring up. Ultimately, with a lot of these DMVs, the work of getting to the nuts and bolts of what you’re looking for has been done and is out there, available from our community. I think his post is even called “Tell Me Where It Hurts.”

Theresa: Exactly.

Carlos: We’ll have that available. A couple of these scripts and some information will be available at sqldatapartners.com/podcast. You can look at today’s episode on IO tuning. We’ll have all of that information available so that you can go and grab those DMVs and get started. Now that we have this information — we’ve used the DMVs to collect the wait information, and you mentioned ultimately we’re going to be looking for those IO-specific waits — we looked at another DMV you mentioned, the virtual file stats, to get that file-by-file comparison. Why is looking at that information important in your database system?

Theresa: That’s a great question. We want to know exactly where to pinpoint our troubleshooting efforts. We want to know which file in our system is hot — which one has the latency that’s affecting our workload — and what our storage can deliver. If we identify where the bottleneck is, then we are able to remediate our issue by drilling down further and finding out if there’s a specific query that’s impacting things, if a query needs to be tuned, or, if that query has to stay the way it is, whether we need to make some adjustments on our storage side.

We go into virtual file stats, and you can look at one individual file or at all of the files in your system to identify reads and writes on each file and IO stalls. You can look at your transfer size. The biggest thing here is latency — you get to look at latency without having to go to Perfmon.
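A minimal sketch of that file-by-file latency query against the dynamic management function (the column aliases are mine, and NULLIF guards against dividing by zero on files with no activity):

```sql
-- Per-file read/write latency since the last restart
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
```

Passing NULL, NULL returns every file in every database; you can pass a database ID and file ID instead to focus on one file.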

Carlos: I’ve found that this is extremely useful. Generally we have several databases on our system, and we’re not sure which one is being the slow guy. We tend to think of our larger databases as being the ones that get the most usage, or whatnot. I’ve been in a couple of scenarios where, looking at the file stats of the databases, I found a smaller database was actually taking up a much larger percentage of the IO. When we actually looked at doing some things — including moving that smaller database somewhere else — the whole system benefited from that move.

The size of the database may not be a great correlation to the IO that that database is using.

Theresa: Right, that’s a great point. We can’t make any assumptions when we are doing troubleshooting. We really have to let the data determine the direction that we go when we’re looking at things.

Carlos: We’ve collected some wait stats. We’ve looked at our file stats. We’ve tried to figure out which of our data files or databases are causing the most problems. What other collection tools do we have available to help us continue to gather stats?

Theresa: Another tool I like to use is Perfmon. Perfmon is a Windows tool that lets you monitor operating system and SQL Server resources so we can diagnose performance problems. We can use it both in real time and by collecting data that we can use later to do some analysis. Let me say one more thing about Perfmon.

Carlos: Sure.

Theresa: With regards to Perfmon, we have all these counters that are available to us, and the counters let us monitor specific activity related to a subsystem or to SQL Server in general. We can look at things like SQL Server statistics, Windows processes, process information, and logical disk — these are all objects that are available in Perfmon.

We look at these counters, and we can correlate information from them to give us a picture of what’s happening on our system. Specifically related to IO, we can look at the logical disk counters.

There are hundreds of counters to look at in Perfmon, but when we’re looking at IO, I want to highlight a few in the logical disk object. We were talking a little earlier about how we want to look at latency.

We want to look at latency — the amount of time it takes to deliver an IO from disk to memory. We also want to look at IOPS, and we’re going to look at throughput. Now, these Perfmon counters are not literally called latency, IOPS, and throughput — they’re all part of the logical disk object, and you just have to know, from experience, which counters are associated with each.

For latency, there are three counters that we want to look at: Avg. Disk sec/Read, Avg. Disk sec/Write, and Avg. Disk sec/Transfer.

If we’re looking at IOPS, we want to look at Disk Reads/sec, Disk Transfers/sec, and Disk Writes/sec. We’re looking here at how many IOs we can read or write in a second, with transfers being the combination of reads plus writes.

The last one is throughput. Throughput we can think of as the quantity of stuff that we put through the pipe. Technically, it’s the number of bytes that SQL Server is reading from storage or writing to storage.

The counters that we look at here are Disk Bytes/sec, Disk Read Bytes/sec, and Disk Write Bytes/sec.

Carlos: I think that’s where we get a good correlation to the file stats, to see which files are moving data in and out of the system more often than others.

Theresa: Right. We can use all three of these metrics to compare to the file stats — especially latency; that’s the first place that’s going to give us an indication of what’s problematic. When I think of latency, I think of acceptable values for latency, and I know that if I see something that’s higher than what I would consider acceptable for my system, then I want to drill in there.

A general rule of thumb, which might be different on each person’s system, is that for data files on an OLTP system you probably want less than 10 milliseconds of latency, and for log files you want zero to two milliseconds of latency. For a warehousing workload, the metrics are going to be a little bit different, because you’re looking at the quantity of information that needs to be delivered.

Carlos: Right.

Theresa: When you see numbers that are higher than these baselines, then you know that there’s a pertinent place to dig in.

Carlos: You mentioned each environment is different, but I know you’re in a lot of different environments. Let’s say someone is on some older hardware — are these speeds for the new, faster stuff, or do you feel like even some of the older disks, say five years old, could still hit these kinds of numbers?

Theresa: I think that you want to target these for any system that you’re working with. Obviously, faster disk is going to deliver different metrics, but the basic premise is that your users are going to be waiting for a certain amount of time, and for each system you have to think about how long it’s OK for users to be waiting. It might be OK that your users wait 20 seconds for a particular request, and we have to work with our application teams to find out what’s acceptable on their side, but if we’re looking at different tiers of disk, this is a general place that gives you an idea of where to drill in.

Carlos: Sure. Now, I know that opening Perfmon for the first time, there are thousands of counters, and it seems like it can be a little bit tricky to get started. Obviously, you’ve pointed us in the right direction from an IO perspective, but I’m wondering, have you ever used, or do you use, the PAL tool at all?

Theresa: Somebody on my team actually developed PAL. I haven’t used it a ton, but it’s a great starting place — I guess some people use it for more than just starting — and it gives you an opportunity to put the results of your Perfmon analysis into the tool and get some recommendations and graphs that help aggregate information to do your troubleshooting.

Carlos: Right. David Pless, I think, has kind of taken ownership of the SQL Server counters that are used in the PAL tool. David, if you’re listening, we’d still love to have you on the show. The other nice thing that you mentioned was the charting. Sometimes, just looking at the pure numbers — since we’re the detectives, the numbers can make a little bit more sense to us — but sometimes we then have to convince someone to spend in this area because we need to increase our server capacity, or what have you. Having those charts, I’ve found, can be very, very helpful [laughs] because they provide some thresholds for you, like these numbers that you gave, and you can chart that over a week or even a day. Then show them, “Look, this is what’s happening during our peak load.”

When you’re running these specific reporting queries and things like that, and then giving them that visual representation, it kind of clicks: “Oh, I get it now. Let’s go invest in another disk, or what have you.”

Theresa: The numbers are what tell the story. I think when you can substantiate what’s going on in your system with some hard metrics, it really helps drive the decisions that the business is making about the investments that they’re putting into hardware and systems.

Carlos: We’ve collected this information. We’ve looked at our wait stats, our DMVs. Now we have some Perfmon metrics. We think that we have identified a specific area that is the problem. How do you go about identifying whether the changes that you’ve made have helped or not?

Theresa: Great question. Whenever we’re doing performance troubleshooting, we want to have a methodical way that we do our analysis, and then we want to methodically go through remediation. We want to analyze the efforts that we put into remediation the same way that we analyzed our issue. You have this process that you used to find out what your issues were, and after you do your remediation you go through the same process and compare the numbers to see what the improvement or degradation was.

Baselining is something that really helps, so that you know what’s normal in your system. But in the absence of baselines, if you identify a problem, you look at where your numbers are during the problem-identification process, and then after remediating you’re going to look at what the values are to see if you have some improvement.

Carlos: Maybe honing in on the performance stats, for example, that you gave us is a good place to start, but occasionally you’ll have other metrics that might change because of something that you’ve done. I guess my question is this: how do you balance an increase, let’s say in CPU, when what you’re after is a decrease in IO? How do you balance seeing one number increase while another number gets lower?

Theresa: Over time you get familiar with what you can correlate, but performance troubleshooting is a process that I’ve heard somebody describe like the game Whac-A-Mole, where you remediate one problem and then another one comes up. You focus on what you’re trying to remediate first, and then it’s expected that you’ll probably see something else come up, because very few systems are tuned perfectly.

Carlos: That’s a repeating process — one step at a time, continuing to measure as you make gradual improvements.

Theresa: Exactly.

Carlos: What are common scenarios that you have seen, specific to IO operations, that folks might be able to start digging into or looking at in their systems to see if they are experiencing them?

Theresa: One of the things that I demonstrated in the presentation that you and I were together for was the location of log files. We were seeing log files and data files on the same storage. The characteristic of what you need to do for log files is to write sequentially. If you’re using spinning media, you’re writing sequentially to that disk spinning around, but when you’re accessing your data files for an OLTP workload, you’re typically doing random access on your spinning disk.

That disk head is going to be moving all around the disk for what you’re doing in your data files, but it’s going to need to write sequentially for the log files. Those two things that you’re trying to do kind of conflict.

Your log writes are going to slow down, and so we hear a lot about how we want to separate our log files and data files on spinning media — that’s because the characteristics of the workload are different for each file. That’s something we see a lot as people configure systems.

I would say that’s probably one of the common things we see: storage that can’t deliver at the speed that we need it to for the workload.

Carlos: They’ve probably taken some defaults there. It takes a little time to split those out.

Theresa: Right.

Carlos: Thanks for this information. Of course, we’ll be posting some examples and some access to the DMVs on the website, sqldatapartners.com/podcast. Now, we want to switch gears a little bit, and we’re going to go to what I call the SQL Family portion of the program. We like to get to know you a little bit better and some of the things that you’re doing in your career.

The first question I have is, what’s your favorite tool? Normally we like to talk about SQL tools here, but I’ll expand that — this could be a free tool or a paid tool. How do you use it? What do you like about it?

Theresa: I love Perfmon — we talked about that a little bit — so I’ll focus on another tool called SQLdiag. SQLdiag is a tool that comes with SQL Server, and we use it a lot in my job for diagnostic information. When I work with a customer, I don’t have the benefit of going back to the system like you do as a DBA on premises, where you have a problem and you say, “I’m going to look at this configuration setting,” or, “I’m going to look at what’s happening on this drive.”

In my job, I have to collect all the information up front. SQLdiag is one of the tools that I use that gives me a whole bunch of information in one fell swoop. It doesn’t take a lot of time to gather, it’s fairly lightweight, and it’s free — it comes with SQL Server.

Carlos: It used to take profiler traces as well — does it still do that?

Theresa: It still does. You can get a profiler trace. You can get Perfmon information. You can get your application logs, your event logs, and your SQL logs. I don’t capture all of that, but some people do. I like to capture the Perfmon and trace information carefully, because trace can be a bit of a heavyweight tool.

Carlos: In the Microsoft group, you visit with a lot of customers. You’ve been on a long journey and spent a lot of time in SQL Server. Take us back a little bit — what’s one of the best pieces of career advice that you’ve received?

Theresa: I think to rely on people in the community, to find a mentor, to give back. When I first started, I thought that I could just figure this all out on my own. What I have found, as I’ve gotten to know more people in the community and had different mentors over the years, is that we can’t learn it all on our own, and each person has experiences to share that we haven’t had. We can add so much more value to the customers that we’re working with by knowing what other people’s experiences are. They help point us in the right direction when we’re solving a problem.

What we learn, we can share back with other people, so there’s this really nice symbiotic relationship. Finding a mentor is really important.

I still have mentors now, but as I was getting started I had a mentor that helped me with some VB6. Now I have mentors that help me learn about different scenarios I haven’t been exposed to, and people ask, “How do you find a mentor?”

You can ask people to be a mentor. I’ve read people’s stories — people blog a lot in our community about their experiences. One person just started speaking at PASS, for example. She asked someone to mentor her, and she’s come a long way just from having that experience.

Carlos: Those in the SQL Server community are extremely fortunate because, as you mentioned, folks are extremely generous with their time. If you’re willing to put in a little bit of effort — reading the articles, at least trying to understand the concepts — then there are people who will help you get to the finish line.

Theresa: Exactly.

Carlos: Thanks again for being here. We do have one last question for you and that is, if you could have one superhero power, what would it be and why do you want it?

Theresa: Oh, what a great question, Carlos.

Carlos: [laughs]

Theresa: If I’m thinking career wise, I would love to have a photographic memory…

Carlos: Oh, here we go.

Theresa: Because our industry is changing so fast and I love learning and I wish I could keep up with everything as it comes out. I feel like we really have to focus on one area. If I had a photographic memory, I could know all sorts of things. I could be all knowing.

Carlos: There you go. Lots of MSDN articles rattling around in there.

Theresa: Exactly.

Carlos: You mentioned career focus. I guess there’s another power that you’d want that’s not so career focused? [laughter]

Theresa: I guess to see into the future, and then I wouldn’t have to think about career at all. We could just know what the lottery numbers were going to be.

Carlos: [laughs] There you go. Yes, you can’t have too many of those going around — you’d be sharing all of your winnings with [laughs] the same folks. Theresa Iserman, thank you so much for being here.

Theresa: You bet, Carlos. It’s a pleasure talking to you. Thanks for having me.

Carlos: It’s been great. So, compañeros, again, sqldatapartners.com/podcast — we’ll have additional information there. If you missed something or you have a question, I will have Theresa’s Twitter information; I’m sure she’ll be willing to answer any questions that you have, and we’ll see you on the SQL trail. [music]

Children: SQL Data Partners.

Episode 22: Eight Years of sp_whoisactive

In what has to be one of the most downloaded SQL Server scripts on the planet, with over 73,000 downloads, sp_whoisactive has been providing active session information to data professionals for eight years.  I catch up with Adam Machanic to talk a bit about how the procedure came to be, how he uses it, and what is in store down the road.

If you are using SQL Server and haven’t used sp_whoisactive before, I invite you to go and download it right now, then come back and listen to the program.  We will wait for you.

Have fun on the SQL trail compañeros.

Transcription: sp_whoisactive

Carlos L. Chacon: This is SQL Data Partners Podcast. My name is Carlos L. Chacon, your host, and this is episode 22. Today we’re talking about sp_whoisactive with Adam Machanic.

Adam was one of the first people I reached out to when I started recording my podcast in the summer of 2015. Because of his schedule, we weren’t able to meet immediately. He asked me to reach out to him in a couple of months, which I did, and he was gracious enough to sit down and chat with me.

There aren’t too many bigger names in the community than Adam, and in listening to the recording, you can tell I was a little starstruck, so I apologize for that. In some of my questions, I stammer a little bit too much — I try to prepare and have those ready ahead of time, but this time it just didn’t go quite as well as I had planned.

It is interesting to sit down and chat with Adam about his thoughts on how we should be using sp_whoisactive, how it was designed, and some of its future plans.

As always, we’d love to get your thoughts and feedback. If there’s a question that you think we should address, hit me up on Twitter at @CarlosLChacon. It’s always good to have you, compañeros, and welcome to the show.

Children: SQL Data Partners. [music]

Carlos: Adam, welcome to the program.

Adam Machanic: Hi.

Carlos: We were discussing, before we got started here, when this project started. You actually created a series of posts about sp_whoisactive, which you can see out on sqlblog.com. In the very first post, you have, “Hey DBA, why is my application so slow? Hey DBA, why is my query taking forever to return the results? Hey DBA, something is broken. Fix it quick.” The worst thing a DBA can do is to base a decision on an uninformed guess, and hence this tool, sp_whoisactive, was created. Take us back — when did this project officially begin for you?

Adam: I started working on it back in 2006-2007, something like that. Just like everyone else, I had been using SQL Server since 1999 or so. I started working on version 7.0 — SQL 7.0, whatever it was called, whatever version that was — then SQL Server 2000. Obviously, throughout that time, I was using sp_who, sp_who2. They basically gave nothing in terms of… no insight in terms of what’s going on. It will tell you that a SELECT is running or that an INSERT is running or something like that. It gives really confusing output, basically. It doesn’t tell you anything to debug the system.

When SQL Server 2005 came out, we had these DMVs, Dynamic Management Views, and I started playing with them right away, and it took me quite a while — it took me two years — to understand them and see the power and figure out where to get things.

I was working on some project with a bunch of consultants, basically working for an insurance company, and there were eight of us all working on this little tiny box — we all had to share the same server.

For some reason, the company would not give us…They were paying us probably a lot more in aggregate per hour than the server was worth, which is interesting to think about. Anyway, we kept killing each other. One person would run a query, and it would take over the whole server, because it was tiny. No one really knew what was going on.

Because I would have to run something and wait for it to finish, I had a lot of downtime on the project. I started writing this script to figure out who was running what and to try to work out some way that we could all survive and co-exist on the same server.

So over about six months, I played with this quite a bit and built up the very first version that I released on my blog. It was just a script at that time, just a big, huge chunk of script that I released on my blog on December 31, 2007. From there, I started getting a bunch of feedback. The response was really positive, and I started bolting more and more stuff on it.

At some point, I turned it into a stored procedure, and I had people asking for all kinds of different options and switches and things like that. It just organically grew into what it is today.

Carlos: Wow. That’s something to think about, the staying power of it. Looking on your site, I can see comments from October 2015. You are now on version 11.11?

Adam: Yeah.

Carlos: Obviously, it has some staying power.

Adam: Yeah, a little bit. Interestingly, that version 11.11 has been out for quite a while now. It’s actually been the most recent version since 2012. There are a couple of small bugs — they’re not even really bugs, they’re misbehaviors that I’m aware of. There are a couple of outstanding requests — actually, several outstanding requests — but not too many bugs that exist in that version, so I haven’t really felt a need to write a new one. It’s just been sitting out there since 2012, very stable at this point, with a couple of very minor issues. It’s been pretty cool, actually, because I use it every single day.

You talk in the software industry about eating your own dog food, and I do that every single day with Who is Active. For every single piece of work I do, I’m constantly using it in monitoring my own processes and my own work. I think that’s one of the reasons it’s worked out really well.

Carlos: It’s the go-to when people are asking those questions: “Hey, this query has been running, and it’s been running for a while. What’s going on?” That ability to get insight into the server, as you mentioned — to see what’s happening, to see if there’s blocking, to see if it’s just [inaudible 06:56] taking a long time — has been super, super helpful. Now, as you mentioned, it’s a stored procedure, so people can download it, install it, and run it, and of course there are default parameters, so you don’t have to supply anything with it. It will give you an output of what’s currently running on your system.

Adam: Yes, that’s correct. That’s the whole idea. I tried to make the default parameters as generically applicable as possible, so no one really has to stop and think about what they put into it. I don’t get as many of these anymore, but I used to get a lot of questions like, “I installed it, and it didn’t return anything. I keep running it, but it just doesn’t give me any output.” The reason is, it’s not supposed to give you any output unless there’s something interesting to show you. People try it on the dev server first, which you absolutely should do. Test it — don’t trust me. Don’t trust anyone with something you’re downloading off the Internet.

You are going to put it on your test server. You are going to try it there first. Make sure it doesn’t crash anything. But of course, there’s nothing running on your test server, and it’s by design — it’s called “Who is Active.” It’s only supposed to show you something that’s actually active. It’s not supposed to show you all the background stuff that’s constantly running on your server, because, honestly, who cares?

What you want to see is who’s active and who’s making the server slow right now. When you run it on your test server, and nothing is running, it doesn’t show you anything, because it says, “I don’t have anything interesting to show you.” I used to get a lot of emails from people who would say, “I think it’s broken. It doesn’t give me anything. What’s up?” That’s a bit of a learning curve for some people.

Carlos: There you go.

Carlos: Yes. I think it's particularly useful in a dogfight, if you will; when people are starting to ask questions about the database, then it becomes your friend. I know that we have options to add some additional details that get returned, for example, stored procedure names or objects that are running, in addition to just the query that's running at the moment. We can see execution plans in addition to that, things of that nature. But as I mentioned, I admit to normally running it with no parameters. Are there other functions that you've been using? You mentioned using it on a daily basis. Which ones do you think have been the most helpful for you?

Adam: Yeah, definitely. The first one is @get_task_info. Basically, there are three different modes that Who is Active can use to collect wait information. First of all, there's a very lightweight mode. The second mode is the mode where it collects the top wait per request, and then the third mode is a mode where it collects all the waits per request. The default is the middle mode, where it just collects a single wait per request, and it basically ignores certain types of waits. For example, it ignores parallelism waits and prioritizes blocking waits and lock waits and things of that nature, and then it surfaces what it believes is the most important wait for you to focus on. That's the default.

If your server is under heavy, heavy duress, sometimes your server will be so loaded down that sp_whoisactive itself won't run very quickly. If that happens, you can put it into lightweight mode, in which case it won't capture any waits, as wait collection is a little bit expensive.

Normally when I'm running it, I actually run it with full wait collection, where it collects all the waits, and you control these three modes with the same switch. It's called @get_task_info. @get_task_info = 1 is the default, and that's the one where it just collects a single wait per request.

@get_task_info = 0 puts it into lightweight mode. @get_task_info = 2 puts it into heavy, collect-everything mode. And that’s the one where you’ll see all of the parallelism waits, if there are any, all of the waits across all of the threads that are running on your request.

You'll see how many threads are running on behalf of your request. You'll also get a couple of additional columns added to the output. You'll get a physical IOs column that I capture from one of the thread DMVs and a context switches column that I capture from the same one.

You’ll get a lot more information at the expense of, obviously, a little bit heavier run time. And that’s usually the mode I actually use when I’m debugging things, because I want to get as much information as I possibly can. So that’s a really, really useful one, especially if you understand how to read waits well and how to interpret them. It gets very, very heavy.
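To make those three modes concrete, here is a minimal sketch of the calls Adam describes, using sp_WhoIsActive's documented @get_task_info switch (try this on a test instance first, per his own advice):

```sql
-- Lightweight mode: skip wait collection entirely. Cheapest option,
-- for when the server is under heavy duress.
EXEC sp_WhoIsActive @get_task_info = 0;

-- Default mode: collect the single most important wait per request
-- (parallelism waits are de-prioritized in favor of blocking/lock waits).
EXEC sp_WhoIsActive @get_task_info = 1;

-- Full mode: collect all waits across all threads, plus extra columns
-- such as physical IOs and context switches. Heavier, but best for debugging.
EXEC sp_WhoIsActive @get_task_info = 2;
```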

A second option that I like to turn on is called @get_transaction_info. By default, sp_whoisactive only tells you whether your SPID has open transactions, that is, how many transactions your request has opened at the moment. It doesn't give you any more information aside from that.

If you turn on @get_transaction_info = 1, a new column is added to the output that actually shows you which databases your session has open transactions in, how many log records have been written in each database, and how many bytes have been written to the log on behalf of your transaction in that database.
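A quick sketch of that call; the extra transaction detail comes back as an additional column in the output:

```sql
-- Adds a column showing, per database, how many log records and how many
-- bytes have been written to the log by each session's open transactions.
EXEC sp_WhoIsActive @get_transaction_info = 1;
```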

Carlos: Oh wow, OK.

Adam: And that’s super, super useful. I do a lot of ETL work, a lot of batch processing. Sometimes the transaction logs will start growing, and we want to know why is that happening.

Carlos: Sure

Adam: Sp_whoisactive @get_transaction_info = 1, and I have the answer in a second. Another thing I've noticed is that if you are involved in a large, fully logged transaction, you can actually figure out how far into the transaction you are based on reading the log record counts. So for example, if you have an insert running, and it's fully logged, it's actually going to log one log record per row that's being inserted, on average. Sometimes there will be page splits and other things like that, which will mess with the numbers. But basically, it's going to be one log record per row per index that's being inserted into.

If I know, for example, that I'm going to insert 10 million rows, and I know that the table I'm inserting into has three indexes, then I know that there are going to be approximately 30 million log records generated on behalf of my transaction.

So if I’m running my insert and someone comes and says, “Is that insert almost done?” I can run Who is Active and say, “Looks like we’ve done 20 million log records, therefore we’re about two thirds of the way done. So it’s been two hours now. I’ll talk to you in an hour.”
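Adam's back-of-the-envelope progress estimate can be sketched in T-SQL. The one-log-record-per-row-per-index ratio is an approximation (page splits and minimal logging will skew it), and the numbers here are just his example values:

```sql
DECLARE @rows_to_insert   bigint = 10000000,  -- rows the insert will write
        @index_count      int    = 3,         -- indexes on the target table
        @log_records_done bigint = 20000000;  -- from sp_WhoIsActive's transaction info

-- Expected total log records ~= rows * indexes (30 million here),
-- so 20 million records done means roughly two thirds complete.
SELECT CAST(@log_records_done AS float)
     / (@rows_to_insert * @index_count) AS approx_fraction_complete;
```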

Carlos: OK, very interesting. Yeah, very helpful. Anytime you can provide that information right to your users to give them that status, it’s super helpful, and it makes you look super informed.

Adam: Right, exactly. That’s the point.

Carlos: Yeah. Now, you had mentioned an important concept, and that is using it in your testing.

Adam: Right

Carlos: I have to admit that I usually do a little more firefighting with it. And you mentioned a specific scenario, this long insert, for example, looking at the logged transactions to see where you are and how much volume you have. When do you normally start bringing that into your testing as you're developing, for example, the ETL packages or ETL processes you mentioned?

Adam: I'm running this thing all the time, so basically if I write a batch process, and I'm running it, I'm testing it, even from the very first test, I'll be running sp_whoisactive in another window, capturing query plans, waits, and so on and so forth. Basically, I'm a very performance-minded developer. When I write code, I don't want to have to go back and play with it later. I want it to perform from day one and keep performing.

Carlos: There you go. That’s good advice right there.

Adam: It's good and bad. It depends on what you're doing, to be perfectly honest. Most of the projects I work on, I'm brought in because I am a performance-minded person, so it makes sense there. For some other projects or some people, it might not make as much sense to focus quite that much. I would say exercise caution in that area. Sometimes you need to get the job done. And if the query is a little bit slow, as long as you've gotten the job done, you don't need to spend three days carefully tweaking every single aspect of it to make it as fast as humanly possible. That's more often than not actually what I do, because that's my world.

Carlos: Oh, OK.

Adam: That’s not everyone’s world. That is my world.

Carlos: Sure. So maybe like an 80:20 rule can apply there as well.

Adam: Yeah. I don't always do that, because it's sometimes a waste of time. But by and large, that's the way I operate. I'll have that thing running all the time while I'm testing processes that I'm writing, while I'm testing anything I'm writing. I find it's just much easier for me to use that tool, since I'm very familiar with it and I know how to read the output really well, than to turn on a trace or do something else that other people would generally do. I see a lot of DBAs still turning on Profiler, or extended events or whatever, to capture information. I don't find myself doing that anymore, because I just use Who Is Active to capture all that stuff, or occasionally some other DMV queries.

Carlos: As you become more familiar with the wait types, you can troubleshoot a little bit more easily. I guess my question is, once you have that list come back to you, and you're not as familiar with all of those waits, what are the next steps? Where would you go next? Now I have this list of my top three waits, if you will. Where would I take that information?

Adam: There are lots of great articles online about different waits; just Google, effectively. Plug any wait into Google, and you're going to get some information most likely. You're going to get someone's blog. MSDN has some stuff. Some of the MSDN blogs have some stuff. You're probably going to get something. If you don't get something, then you go to Twitter, to #SQLhelp, and you ask there. And then if you still don't get something, then you have to call Microsoft.

Carlos: Hopefully it won’t come to that. But yes, the community is very anxious and willing to share that information.

Adam: Yeah, and I have a couple of things on my blog as well. There's so much information out there. It's actually gotten to the point where it's more important to look at all the information and figure out what's real and what's not. There's so much information now that there's almost misinformation in certain cases. It gets difficult. Some of the waits are much more complex than people realize. Things like IO waits, for example, are actually very complex. They're not necessarily always about IO. They can be about other things, depending on what else is going on in your server.

It’s important to really be careful with what you’re looking at and make sure that you use an evidence-based model and not guess.

Carlos: Sure. And that brings up one of the tools we can use to look at some of those: the execution plans. I know you can get the execution plan back as a result if you enable the parameter for it. But one of the things that might be a little surprising is that you're looking at the plan in the cache, right? It's not exactly the same thing as running with Include Actual Execution Plan in Management Studio and then seeing that result.

Adam: Well, yeah. There are two kinds of plans in SQL Server. There is the estimated plan and the actual plan. I think these are really badly named. They're both misnomers in their own way. The estimated plan, if you look at it, will be the plan shape that's going to run. It's going to be cached, unless there is a recompile that occurs just before the plan runs. Most of the time, the estimated plan is in fact the actual plan shape that's going to run.

Then we have this thing called the actual plan. The actual plan is generally, almost always, the same exact plan shape as the estimated plan, but it includes some additional information: number of rows per iterator, number of executions per iterator, and some other information. It's the same plan; it just has a little more information.

Who Is Active, since it pulls the data out of the cache while the plan is running, does in fact show you the actual plan shape that's running, even though it is called an estimated plan.

That’s why I say that. I think those are really badly named concepts in the SQL Server world.
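A sketch of pulling those cached plans, using sp_WhoIsActive's documented @get_plans switch:

```sql
-- Returns the cached plan for each active statement as an XML column.
-- The plan shape matches what is actually running, but per-iterator
-- row counts are not included, since those are expensive to collect.
EXEC sp_WhoIsActive @get_plans = 1;
```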

Carlos: Sure, sure. Fair enough. I guess my thinking was that I wouldn't necessarily see the actual rows and estimated rows, for example, for an operator when pulling it from the plan cache, right?

Adam: That's correct. You will not see any of that data. That data is actually very, very expensive for SQL Server to collect. That's the reason it's not available by default. In SQL Server 2014 and above, there's a DMV that will actually tell you all of those numbers on a per-iterator, per-thread basis, but the big caveat is that your session actually has to have plan collection enabled. You have to have returned the actual plan, with set showplan XML on or whatever that is, enabled for your session, or you have to have either a trace or an extended events session collecting the actual plan.

Either way, you're adding a lot of overhead to your query for SQL Server to collect that information. Unfortunately, we can't get that data too easily.

Carlos: Another reason to start using it in your development process, where you don't have the load of the production system. You can spare the overhead there and collect that information before you put it into production.

Adam: Correct, correct. Of course, I don’t know if it’s Murphy’s Law or some other law, but once you go to production, everything is different.

Carlos: That's right. Nothing like doing it in a production system, that's for sure. We talked about version 11.11. There is a version 11.111, an 11.1 if you will, that was created for Azure. Now that a lot of the DMVs are available in Azure, and they're working all those things out, it will work. But interestingly enough, you mentioned a general lack of Azure support. Has that changed at all for you personally?

Adam: First of all, just to clarify, I did release version 11.111, and it did work in Azure, and now it doesn't anymore. What they did was they released all the DMVs finally in Azure, then they ripped one of them out that sp_whoisactive does use. First, it had a reference to MSDB, because it pulls some information about SQL Agent jobs. I had to pull that out, because there's no MSDB in Azure. Then it did work, and then they pulled another DMV out, which is sys.dm_os_sys_info, a DMV that has information about the system.

It only has one row, with a bunch of columns of system-level information. It provides information like how much memory is on the system, how many threads there are, how many physical cores there are, how long the server has been up, a bunch of other things like that.

Anyway, they ripped that one out. Unfortunately, that's pretty deeply embedded in some of the logic in Who is Active, and there's no replacement that I can find. So I'm going to have to rewrite part of it if I do want it to run on Azure.

To get back to your question, yes, I do want it to run on Azure. I'm getting a lot more requests these days, actually, for an Azure version. It seems like there's finally starting to be some traction around Azure, especially with some of the newer SQL Server features they've announced

that can really take advantage of it: backup to Azure, Stretch Database, things like that. Is it called Stretch Tables? I don't remember exactly.

Carlos: They are at the table level. But yeah, stretch…

Adam: Stretch Tables. Some of these features, I think, are really going to start driving adoption. It's getting easier and easier to set up an Azure database. They've done a really good job in terms of giving people free or nearly free access. I think we're going to see quite a paradigm shift in that direction in the next few years, so I definitely will be helping to support that with Who Is Active.

Carlos: Do you have any download numbers for us for version 11.11? How many times has it been downloaded?

Adam: This one has been downloaded about 73,000 times. In total, all the versions out there have probably been downloaded half a million times or something like that, because there have been a ton of different betas and other versions. But this one, 73,000 times. It's interesting to think about those numbers. I was saying to a friend the other day, "They have a thousand servers in this company, and all thousand servers are running sp_whoisactive." Obviously, there are a lot of people who downloaded it multiple times, but a lot of those 73,000 downloads must be DBAs taking it and distributing it across their environments.

I would love to be able to count the total number of instances the sp_whoisactive was on. It’s got to be in the high hundreds of thousands, I would guess.

Carlos: No, I would agree. Of the community tools that are widely available, it's got to be probably the number one tool for a lot of DBAs.

Adam: Yeah, it's pretty cool. I'm glad to have been able to provide that.

Carlos: Well, very good. Adam, thanks for chatting with us about sp_whoisactive. Of course, you can find that on his blog at sqlblog.com, and we'll have that, with probably a bit more information, in our show notes, which you can find at sqldatapartners.com/podcast. Adam, before I let you go, we want to switch gears here just a little and get some insights into some of the other things that you've done. One of the things we always like to talk with our guests about is their favorite tool. Now, we've been talking about sp_whoisactive tonight, and that can be your favorite tool. But do you have a favorite free or paid tool?

Adam: Obviously, I use sp_whoisactive all the time. I don’t use a lot of other tools, actually, in my work. I use Management Studio. I live in Management Studio. I use Visual Studio. I live in Visual Studio.

Carlos: You've done quite a few things over your career in the performance space, if you will, and I know you've worked in finance organizations as well. But what's the best piece of career advice you received, and how did it help you?

Adam: Early on in my career, one of my managers recognized that I had a personality that would lead me to be in leadership positions over the course of my career. I was really fighting it pretty hard, the idea of being in a leadership position of any kind. Basically, I purposefully forced myself out of those positions by doing things that weren't very good for that kind of career. Let's just put it that way. I'll keep it vague.

Anyway, this manager sat me down and said, "Listen. You are going to be promoted into leadership positions, either in this environment or others. You really need to learn to adopt it and embrace it, and learn to really trust other people, and learn to delegate to other people and really work in a team environment."

It took me a long time to absorb that feedback. It's tough for people with my personality to really delegate work to others. It took me a really [inaudible 28:04] to realize the importance of it.

After several years, I finally was able to understand the value of that. Now my approach is that I want to delegate work, even though it's frustrating to me.

My personality: I'm a perfectionist. Delegating work and then having someone not do it quite as well as I think I could do it myself is very frustrating to me. That's one of those things that, thanks to that feedback I received, I've really, really worked on.

Now, I realize it’s not about the fact that the work isn’t necessarily being done badly, it’s about in fact the other person is learning, and I’m helping them get to the point where they can do the work extremely well. That’s very satisfying, actually, helping other people grow and learn and get better with their career.

It helps me get better as well, because someone can only really be as good as the team that’s around him.

Carlos: Sure. It has a dual purpose there.

Adam: Yes, it did.

Carlos: It helps everything move forward or get better.

Adam: I think so.

Carlos: Very nice. Well, Adam, our last question for you tonight. If you could have one superhero power, what would it be, and why would you want it?

Adam: That's an interesting one, too. I guess if you think about superheroes, it doesn't get much more badass than Wolverine.

Carlos: Oh.

Adam: I'm going to have to go with his healing ability. I guess the Adamantium skull would be nice as well, but that's not really a power. That's just something he happens to have.

Carlos: I guess it's because of his healing that he was able to be pumped full of that stuff.

Adam: He's always been my favorite superhero, by far. I'm going to have to go with his healing abilities. I guess it makes him immortal as well, right?

Carlos: Yeah, that’s a great question. I’m not exactly sure.

Adam: I think he can be killed, technically, but I think he will live forever as long as someone doesn't cut his head off or something. Pretty close to immortal. He heals up, and he's a badass. That sounds cool to me.

Carlos: Very good. Adam Machanic, thank you for being with us tonight.

Adam: Thank you. I appreciate that.

Carlos: Again, you can check out sp_whoisactive at sqlblog.com. You can get show notes from tonight's episode at sqldatapartners.com/podcast. We'll see you on the SQL trail.

Children: SQL Data Partners.

Episode 21: Azure Data Factory


As one of the newest Azure offerings, Azure Data Factory might at first be compared to SSIS.  I don't think we are quite there yet with the comparisons; however, I talk with MVP Reza Rad and discuss some of the similarities and differences between the two.

A consultant now living in New Zealand, Reza has lots of experience with data management and gives us the scoop on the initial offering of this new Azure tool.  I hope you enjoy the episode.  If you have some feedback or comments, feel free to reach out on Twitter; I am @CarlosLChacon.

Transcription: Azure Data Factory

Carlos L. Chacon: This is a SQL Data Partners podcast. My name is Carlos L. Chacon, your host, and this is Episode 21. Today we’re talking about Azure Data Factory. As you may know, I am spending a lot more time in the Azure environment and wanted to follow up with some of the Microsoft teams and other people who are doing things in Azure.

I happened to bump into Reza Rad– no, I didn’t happen to– I sought out Reza Rad at the summit. Reza is from New Zealand and a very, very nice and generous guy. He had a session on doing differences between Azure Data Factory and SSIS.

As you might guess, the Data Factory is a transformation tool that you can use in the Azure environment. We sit down and we discuss some of the differences.

As always, comments are welcome, [email protected]. If there are topics you want to hear about, please let me know. It’s always good to have you, compañeros. I hope you’re learning something new every day, and welcome to the show.

Children: SQL Data Partners.

Carlos: Reza, welcome to the show.

Reza: Thank you. Hi, I’m Reza Rad, and it’s an honor to be here and to be in this podcast.

Carlos: Ultimately today, you gave a presentation at the summit, and it piqued my interest, since I personally am trying to do a lot more with Azure. You gave a presentation on comparing SSIS to the Azure Data Factory. That’s the conversation that we wanted to have today. Get us started, first maybe we should talk about what is the Azure Data Factory.

Reza: Yes. Azure Data Factory is a data-ingestion tool. It's not an ETL tool like SSIS; it's more like an EL tool. We do extraction and load with Azure Data Factory, and that works quite well with large amounts of data, with big data, with HDInsight, with all those things. It can transfer massive amounts of data from source to destination, but it is not actually a data transformation tool, so it shouldn't be compared with a data transformation tool. It has its own strengths. When you compare it with something like SSIS, SSIS has its own strengths, and Azure Data Factory has its own. It is a Cloud-based service, so you pay as you use.

Carlos: That's an interesting concept. I hadn't thought about taking the "T" out of that equation. I think a lot of people, particularly, will look at the Azure SQL database. You don't have SSIS or the jobs component to run some of those things, so I think there might be an inclination to think, "Oh, the Azure Data Factory can do that for me."

Reza: Yes.

Carlos: But it’s not quite there?

Reza: Exactly that. Yes, I also have quite a lot of clients that say they want a Cloud version of SSIS. They have Azure SQL database, they have Azure storage, and they want to do things on Azure. But they don't actually want to have an Azure VM and then set up SSIS on that, because that would be quite a different story.

Carlos: Much more expensive.

Reza: Yes, more expensive as well. Azure Data Factory can do that, but it's not that, let's say, powerful in data transformation. Actually, it is powerful; it just doesn't have the good, rich development experience for data transformation. When you do it in SSIS, you just drag and drop transformations: this is my lookup, this is my join, all those things. It's just drag and drop. It's really rare that you write scripts in SSIS. But in Azure Data Factory, if you want to do data transformation, you should write the scripts. These scripts might be Pig, Hive, C#, or even a SQL stored procedure, and not just any stored procedure, but a special type of stored procedure.

This is really helpful, especially in solutions that are based on Azure, because a specific customer or client doesn't actually want to pay for an on-premises SQL Server when everything is on Azure. Why should they pay for SQL Server Enterprise on boxes and the environment for that when they do their data movement on Azure?

So Azure Data Factory is actually helping on that side. This is why Azure Data Factory was invented as part of Cortana Analytics: to actually work with Azure Machine Learning, Azure Event Hubs, all those Azure components, to actually provide that.

Carlos: Yes. I guess you bring up the use case for Azure Data Factory: the reason it was developed was to help with the integration of some of the Azure offerings that are in the Cloud. You mentioned machine learning and whatnot, right?

Reza: Yes, yes.

Carlos: So while it can do some integration between my source systems A and B, it's really more for, correct me if I'm wrong, when I have some source data, a data warehouse, that I then want to integrate with Azure Machine Learning.

Reza: Yes.

Carlos: And those are the components where it's going to find its function and use?

Reza: Yeah, that's right. Yeah. Because in most cases, if you are only doing things on premises, let's say, or even if you just do things on premises and then, at the end, you deploy or transfer those things to the Cloud just to show that I'm part of the [inaudible 06:21] or something like that, then you can do most of those things with SSIS. When we talk about Azure Data Factory, we talk about something like Azure SQL Data Warehouse, big data warehouses that actually need a processing or compute engine to transfer these things into an engine that can analyze them later on, or that can be used for Azure Machine Learning batch scoring, those kinds of things, or Azure Data Lake, and all of these things.

This is actually the purpose of building Azure Data Factory, to do these functions, rather than doing data transformation the way SSIS does [inaudible 07:12]

Carlos: For integrating from on-premises to Azure, then, ultimately the recommendation, or your thought, is, "I could continue to use SSIS for that."

Reza: You can use either SSIS or Azure Data Factory. Actually, each of these has its own [inaudible 07:29]. If you use SSIS, you get a rich list of data transformations. You can do whatever you want. In Azure Data Factory, you can do whatever you want as well, but you have to script things. But pricing-wise and environment-wise, those kinds of things favor Azure Data Factory, because you don't pay for a SQL Server Enterprise license for doing such things, and you don't need a big server or an administrator who actually installs SQL Server and takes care of all those kinds of things. It's just a service that you use.

Also, if you have a massive amount of data, Azure Data Factory usually works better at that scale.

Carlos: Sure. Because of the ability to scale, right? That's one of the things that I hear over and over and over, talking with the Microsoft folks, or folks like yourself who are dealing with that. Truly, we're dealing with hybrid approaches, so your individual scenarios are going to determine the tools you should use going forward. You bring up a good point with the licensing. For SSIS, if I'm moving things into Azure, I may not want the cost of a SQL Server license to keep that. I do have Azure Data Factory to help me, but I may not have all the bells and whistles that I'm used to.

Reza: Yeah, but then I have to write more scripts to actually do the same thing.

Carlos: You mentioned, and let's get into the scripting a little bit, that there was a stored procedure option, this special stored procedure it will let me use.

Reza: Yes.

Carlos: So tell me what that means, special.

Reza: When I say special: when you write a stored procedure in SQL Server, you can do everything. That's the ultimate power of the stored procedure. But in this type of stored procedure, you should return something.

Carlos: So that’s a little more like a function.

Reza: Yeah.

Carlos: Like a result set, or something that should get returned?

Reza: I'm not exactly sure about that, but I think it should be a Boolean type of result that is returned, as well as some other results it can return. This is an edge case, so I'm not at a point to say for sure, but that's how it is right now at this point in time. That's one part of it. There are some, let's say, development issues around that as well. Say you have your function, and it returns something like that in the stored procedure. Then you come back to your stored procedure later on, a couple of months later, and the return of that function has changed; the returned data type is something else.

Then you have to make some changes in your script, in your activity pipeline in Azure Data Factory, as well. It doesn't actually pick up that change automatically. It's not that good a development experience yet. It needs more development effort.

In the end, it [inaudible 10:46] the stored procedure; as long as you provide the right output, you can write whatever you want and do whatever transformation you want.

Carlos: I know that some of the Azure technologies are maturing, and they're starting to be very comparable, particularly Azure SQL Database and SQL Server 2016. They're baking things into the Azure product that will then be released for the on-premises product. You mentioned in your presentation yesterday that it seems like Azure Data Factory is still in V1 a little bit.

Reza: Yeah. This is going to get much better from now on, I believe. It was released in August 2015, just three months ago, something like that. Matt Masson, who was the SSIS program lead before and then moved to Master Data Services, is assigned back to Azure Data Factory now, so we should expect a lot of big changes in Azure Data Factory, and big improvements, because those guys know what they're doing, and they have a big pipeline of things to be done.

Carlos: We could go into, maybe let's do that, go into some of the nuances between SSIS and Azure Data Factory. I think we've picked up on a couple of them. First of all, obviously, there's the GUI, or the GUI experience.

Reza: By the GUI experience, you mean?

Carlos: The development options. In SSIS, you can drag and drop. I don't know, I am not a developer, but I can go in there and create an ETL package.

Reza: That's right. That's the main thing. On both the administration side and the development side, SSIS is quite powerful, because it uses SQL Server Data Tools.

Carlos: That’s right. You have that suite of…

Reza: You can even create an import-export package from Management Studio. That's easy: right-click, import, export, from this to that. But it's not that easy in Azure Data Factory, because you have to write JSON. JSON is easy, but you have to be familiar with it, and you have to write a script, so you cannot expect a DBA to come up here and say, "OK, let's write this JSON."

Carlos: So the DBAs may have to get cozy with their JSON developers, right?

Reza: Yes, that's right. But I believe there will be very fast enhancement of that JSON authoring. Probably there will be something that will generate that JSON for you, something like that. There are Visual Studio templates for Azure Data Factory, but the template doesn't do that much. It just creates the project with folders for linked services and so on. When you right-click on a folder and say, "OK, I want to create a linked service," it will put the JSON skeleton template there for you, and you have to fill that in.
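To give a sense of what that hand-written JSON looks like, here is a minimal sketch of a Data Factory linked-service definition from the v1 era Reza is describing. The name and the connection-string placeholders are illustrative, not from the episode:

```json
{
    "name": "MyAzureSqlLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Server=tcp:<server>.database.windows.net,1433;Database=<db>;User ID=<user>;Password=<password>;Encrypt=True"
        }
    }
}
```

The Visual Studio template generates a skeleton like this, and you fill in the `typeProperties` for your specific store.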

Carlos: It’s almost like using the templates in SQL Server Management Studio, the parameters, so it gives you a basic script, and then you have to fill it in, fill in the blanks.
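To give a sense of what filling in those blanks looked like, here is a hypothetical linked service definition in the early Azure Data Factory JSON format. The name and connection string values are made-up placeholders, and the exact property names depend on the service version in use:

```json
{
  "name": "AzureSqlLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": "Server=tcp:myserver.database.windows.net,1433;Database=MyDb;User ID=user@myserver;Password=<secret>;Encrypt=True"
    }
  }
}
```

The Visual Studio template generates roughly this skeleton with empty values, and the developer supplies the type and the connection details by hand.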

Reza: That’s right. In terms of development tools or administration tools, Azure Data Factory is still far behind SSIS.

Carlos: There is one nifty feature that I’m going to group into the GUI category in Azure Data Factory, and that is the hierarchy or dependency function. Why don’t you tell us about that?

Reza: Data lineage, you mean.

Carlos: Data lineage.

Reza: Data lineage. In terms of data lineage, that’s the strong point, the main power of Azure Data Factory against SSIS. In SSIS, we can have many data flows, many packages all around the place, and these can talk with each other: a table can be filled in one data flow, emptied in another data flow, and joined with something else in a third data flow. It’s really hard in SSIS to track down a specific table and say, “I want to know where this table is coming from.” You can’t trace it unless you are able to open each package and work out, “Oh, I’m filling that in that data flow because of this.” You don’t have the data lineage to see where the data is coming from.

In Azure Data Factory, on the other hand, we have data lineage, so we can have a massive data pipeline, which is the equivalent of a data flow in SSIS. In that massive pipeline, if we click on a specific data set, this will highlight the whole path down through only those data sets and pipelines that actually produced this specific data set, and it’s much easier to track.

For example, if there’s a troubleshooting activity and you want to know why something is wrong, you can just go to those pipelines and data sets. That’s the strong point. The Microsoft team wants to implement something like that in SSIS, but that’s still in their pipeline of things to do.

Carlos: The visual experience, what you demonstrated was very cool. It would highlight that for you, so you have several different steps in your package. You click on that visualization, and then the other ones will go grey, and so you can more visually see that. That’s going to be nice once it’s a bit more usable. That’s going to be handy.

Reza: I’ve heard some rumors, I’m not sure if this is going to be available, that Azure Data Factory will support data lineage for SSIS as well. I don’t know how. Maybe importing those into the cloud or something like that, but this is just a rumor that I heard.

Carlos: Sure, we can only hope. What are the other comparisons we want to talk about?

Reza: In terms of security, they look much like each other. They both support role-based security. As you do in SSIS, where you can define a role that can deploy packages and another role that can read packages or execute packages or whatever, same in Azure Data Factory: you can do the same role-based security. One of the other differences, I have to say, is compatibility with HDInsight and Azure Data Lake. HDInsight is something that SSIS supports as well in its latest extension, the Azure Feature Pack for SSIS. It supports Azure HDInsight tasks that can do some part of the work, but it’s more built in and embedded in Azure Data Factory.

You can just define a compute: in this compute, I want an HDInsight cluster engine, and in this engine, I want 16 nodes. You can define all of these in your JSON schema, and this will build that HDInsight cluster for your compute, for your data factory, only for that, and it will tear that down at the end as well.

It’s an in-built engine to work with HDInsight, with Pig and Hive and all those other things. You can do all of these things in SSIS, but you need to do some manual work.
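As a sketch of the on-demand pattern Reza describes, an HDInsight compute linked service in the early Azure Data Factory JSON format looked roughly like the following. The names, the cluster size, and the time-to-live value here are illustrative assumptions, not values from the episode:

```json
{
  "name": "OnDemandHDInsightCompute",
  "properties": {
    "type": "HDInsightOnDemand",
    "typeProperties": {
      "clusterSize": 16,
      "timeToLive": "00:30:00",
      "linkedServiceName": "MyStorageLinkedService"
    }
  }
}
```

Data Factory provisions a cluster of this size when an activity needs it and removes it once the time-to-live expires, which is the automatic build-and-close behavior Reza mentions.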

Carlos: How do they determine the costs of Azure Data Factory in Azure?

Reza: The cost of Azure Data Factory, let’s first compare it to SSIS. The cost of SSIS is actually based on the features that you use. If you use just import/export, the cheaper editions cover those things. If you use Lookup, those kinds of default transformations and joins, then actually you should pay for the BI edition of SQL Server. If you use a special type of transformation, like CDC for doing incremental load or those kinds of things, then you have to pay for Enterprise, which is quite expensive, depending on the cores and CPU and all those things.

Carlos: When we say CDC, we’re talking about change data capture?

Reza: Change data capture, yes.

Carlos: Those features are only available in the enterprise version?

Reza: Yeah. There are different ways of, let’s say, implementing incremental load in data warehousing. One of the good ways of doing that is through CDC. If you want to use that feature, usually you need the Enterprise edition of SQL Server.

Carlos: Personally, and correct me if I’m wrong, I always thought of change data capture as being something on my database. I have the Enterprise edition of SQL Server, so I have change data capture. Now I have SSIS, and I guess a lot of the time those packages are going to run on a second server. You’re saying that a SQL instance that runs my SSIS, to pull in change data capture data or do something with it, would also need to be Enterprise edition?

Reza: The database server has to be Enterprise, because that’s required for CDC, change data capture.

Carlos: My source data? My OLTP?

Reza: Yes. For the SSIS side of that, yes, it has to be that. You can develop something with a CDC source in SSIS, and you can run that perfectly. But if you deploy that to production, it has to be an Enterprise edition that works with that.

Carlos: I did not know that.

Reza: There are also some other special components, like Fuzzy Lookup, Term Lookup, text mining, those things, that are only available in Enterprise.

Carlos: I’m not an SSIS guy, but I have done just enough packages. I’m sure there are certain features, but I guess I feel like Fuzzy Lookup was available to me. But I’m in so many different environments. I can never remember if I’m on enterprise or standard.

Reza: It might be. I’d have to check. But yeah, there are some features that are only available in Enterprise. So if you go with the fully featured version, usually you need Enterprise.

Carlos: To extend that, a lot of the third-party add-ons, for example, there are several out there that will help you with SSIS by providing things like FTP-ing and special email functions and whatnot, because SSIS is not so great at those out of the box. Do you know of any? I guess there are no third-party integration tools to help you in Azure Data Factory?

Reza: No, there are not.

Carlos: It’s super new?

Reza: That’s another strong point of SSIS, because there are good components from companies like CozyRoc or Pragmatic Works. They have their own SSIS components that you can plug into your package and use. But the main reason for that is that SSIS has been there since 2005.

Carlos: That’s right. We have 10 years of development experience versus three months.

Reza: Yeah. And I believe there are quite a lot of things for Azure Data Factory in the future aspect.

Carlos: Sure. That’s always the downside for folks, for compañeros that are listening, that are tuning in. We’re here in November of 2015. If you’re listening to this in the middle of June 2016, you are going to want to revisit this, because things will have changed between now and then. I don’t want to call that a downside of working with Azure, but things are still in constant flux.

Reza: Continuing with pricing: with SQL Server, you pay for features. Here in Data Factory, you don’t pay for features. You actually pay for usage. You have all the features available; it depends on how many activities you use and the frequency of the activities. In Azure Data Factory, we have the pipeline, which is the data flow, and in each pipeline we have activities. We can say activities are like transformations. Not exactly transformations, since an activity can be just a copy, but some of them are transformations. So it depends on how many activities you use in your Azure Data Factory and how frequently you run them.

Let’s say, for example, you run 10 activities in your data flow, and you run those 10 activities 10 times a day or something like that, so that’s 100 activity runs a day. Then you run that for a month, so that makes 3,000 activity runs per month.

Then you fall into one of the tiers of Azure Data Factory pricing. Let’s say, for example, if you run this amount of activities per month, this is the price that you pay. It does not depend on the actual size of the data, so you can transfer a massive amount of data.

But usually for a massive amount of data, you need an HDInsight cluster. An HDInsight cluster takes space, so you pay for that size.

Carlos: Storage cost.

Reza: Yes, but it’s not that much, again. Comparing these two, usually you get quite a lot more pricing and cost with SSIS or services like that against Azure Data Factory, which is really low cost based on the activities that you use. You can actually pause the activities that you don’t use and say, “OK, I have this pipeline, but I don’t use that.”

You pay, I think, 80 cents per month for pipelines that are not in use, which is fine. It’s a much, much lower cost than SSIS.

Carlos: The upfront costs aren’t there. It’s usage-based cost, and that’s one of the nice things. You can let it go dormant, and you’re not continuing to pay for it.

Reza: That’s right.

Carlos: Well, great. I think we’re almost through. There are a few more items we could go into here. If folks want to know a little bit more, we’ll point them to your presentation for that information. On the website, of course, go to sqldatapartners.com/podcast and look at today’s episode on Azure Data Factory, and we’ll have those notes available with additional information should you want to take a peek at that. Before we let you go, we have a couple of questions we want to ask you. But first, we’ll listen to one additional way that folks can learn about SQL Server.

Host: Have you considered mixing SQL Server training with great friends on a boat in beautiful destinations? If you have, I invite you to check out sqlcruise.com for more information about the great training, the team, and the itinerary they are putting together. They will be setting sail for the Caribbean in January of 2016, and they want you on board. They’ve offered a $100 discount to listeners of this podcast, so you can check out sqldatapartners.com/sqlcruise for more information there and my experience on SQL Cruise, and who knows, maybe we’ll see you on board.

Carlos: Reza, thank you for being here. Before we let you go, we have a couple of questions for you. We’ve talked about Azure Data Factory and SSIS. If that’s your tool, that’s OK. But we want to know, what’s your favorite SQL tool?

Reza: I like SSIS. Everyone knows that I like SSIS. I’ve been working with that since 2005, and before that with DTS. Why do I like it? I think it’s my passion for data movement and data transformation. I actually like doing these things.

Carlos: I know you’ve been working with this for a long time. You have lots of experience in the industry. Take us back, if you will: what’s the best piece of career advice that you’ve received and would want to share with others?

Reza: I have been in IT for, I think, 15 years, something like that. One of the very first pieces of advice that I received, and which was useful to me, was from my very first manager at a company. It wasn’t a consulting company; it was a health company where I was a developer, and I had a manager. At that point in time, I didn’t know about many of the technologies, about SQL Server, about lots of things I hadn’t learned yet.

He said to me, “You’re learning these things, and this is great. But after you learn these things, you might think that you know them and you don’t need to learn anymore. But this is not how IT works. In IT, you should always learn. If you don’t keep learning, you’re losing.”

That’s a really good piece of advice, because in IT, with SQL Server, SSIS, Azure, all of these things, there are always new things coming up.

Carlos: Constant change.

Reza: Yeah. If you don’t learn, you’re missing the new features that are available, and you’re still sticking to the traditional way. I remember in this past keynote, Joseph Sirosh also said something like that. I don’t think I can remember the exact sentence, but he said our industry doesn’t respect tradition, our industry respects innovation.

Carlos: Compañeros, that’s a great piece of advice. Our last question, if you could have one superhero power, what would it be, and why do you want it?

Reza: I’ll tell it this way rather than overthinking and changing the question. One of my favorite superheroes is Batman. Batman is an ordinary person who doesn’t have any special superhero power like Superman or those guys. He uses normal things, like a special, I don’t know, costume, and he can fly and those things. But these are all based on normal things. What I usually like doing in my career is the same. I like to show people how they can do really good and powerful things with tools which are available, and I try to do that in my presentations and those things as well.

Carlos: Very cool. That’s what I like about you, Reza Rad, always trying to reach out, super friendly. You befriended me and my wife when we met last year, and so I do appreciate your time in being here with us. Compañeros, again, take a peek at sqldatapartners.com/podcast for today’s show notes and additional information. Thanks for tuning in. If you like what you heard, let us know. Leave us feedback. Of course, we’re on Twitter. I’m @CarlosLChacon, and we’ll see you on the SQL trail.

Children: SQL Data Partners.

Episode 20: The Role of Mentors


We all start somewhere, and if you can find a mentor who will help you on your way by pointing you in a direction you had not considered, you are a lucky person indeed.  This episode I chat with some fellow podcasters about their experience with mentors and the characteristics we think good mentors have.

Experience is simply the name we give our mistakes. -Oscar Wilde

Be brave. Take risks. Nothing can substitute experience.  -Paulo Coelho

Experience is what you got by not having it when you need it. -Author Unknown

[A mentor and mentee] is the third most powerful relationship for influencing human behavior (after the family and couple relationships) if it is working. -Richard E. Caruso, PhD

Part of the reason I have the podcast is to help bring to light some thoughts and ideas you may not have considered.  I hope this continues with today’s episode.  I hope you check it out.

Transcription: mentor

Due to the number we had in our group, the transcription may not attribute the dialog to the correct person 100% of the time.

Carlos L. Chacon: This is a SQL Data Partners podcast. My name is Carlos L. Chacon, your host, and this is episode 20. Today we are going to be talking about mentoring.

At the PASS Summit, I had the opportunity to rub shoulders with some other podcasters. I wanted to do a group session together. I thought it would be interesting to get some different perspectives.

We got together and started talking about mentoring, some of the people that we relied on as we started in our careers, how they affected us, what we might be looking for in mentors, how that search continues, and how mentors change and develop over time.

Due to the length of the initial recording, we actually recorded for about an hour. I trimmed that down a little bit. There is some slight jumping, but I hope that’s not too annoying. It shouldn’t be a problem, but I apologize if I didn’t get it quite right. With that let’s get into it. Compañeros, welcome to the show.

Children: SQL Data Partners.[music]

Guy Glantser: My name is Guy Glantser, I’m from Israel. I’m the founder and CEO of Madeira. We provide data solutions around Microsoft Technologies and others as well. I also host the SQL Server radio show together with Matan Yungman. Matan Yungman.[laughter]

Matan Yungman: I host the SQL Server Radio show with Guy Glantser. I am the CTO of Madeira. Just like Guy said, working around the data on SQL Server and other platforms also. Having tons of fun on SQL Server radio. We sometimes talk to each other, we sometimes talk to other guests, and generally have lots of fun.

Carlos: Very good, and you’re on sqlserverradio.com.

Matan: Yeah, that’s right.

Richie Rump: My name is Richie Rump. On Twitter I am @Jorriss, J-O-R-R-I-S-S because nobody knows how to spell it.[laughter]

Matan: I still don’t know how to spell.

Richie: I know right.[laughter]

Matan: Even you said it now, I still don’t know how to spell it.

Carlos: Slow that section down.[laughter]

Richie: When we get the recording, I will give it to you and then you can have it.[laughter]

Richie: I am the co-host of Away From The Keyboard. We are a podcast that talks to technologists, not about technology. We get the inside story of how they learned, how they grew, and how they became famous or not so famous.

Chris Bell: I am Chris Bell, I’m wondering why I am here. I started a podcast a little while ago on my own site just to get ideas out and thoughts. Some interviews to get to know people better. Other times just to talk about weird stuff, like my lawn mower blowing up and how to think like a manager to deal with it. Which was kind of fun.

Matan: It sounds more interesting then.

Chris: It sounds more interesting than it is. We’ll put it that way, right? How you get to be a speaker, you write a really good abstract and you get in and you’re just like, “Oh.”

Carlos: Chris, your podcast is called?

Chris: WOxPod.

Carlos: It is at?

Chris: It is at wateroxconsulting.com/podcasts/ all the different series in there.

Carlos: I’m Carlos Chacon. I’m the host of the SQL Data Partner’s podcast. We’re a little bit more boring and we just focus mostly on SQL Server. One of the things we’re doing here, we’re at Summit this year, the PASS Summit, trying to learn, connect, and share.

Chris: Have fun as well.

Carlos: Is PASS insisting that you have to use their tagline?

Chris: I would say, “Yes,” but unless they give you any money then it’s “No.”

Speaker: [crosstalk] .

Carlos: Then cut it out.

Chris: Or you say, “Yes”, and if you don’t get the money then you cut that section out.

Carlos: It’s a great tagline though; I’m not going to lie about that.

Richie: Fantastic.

Guy: Carlos is going to go with, “No comment.”

Chris: Suddenly, dead air.

Carlos: Since we’re here, we’ve talked about getting together, right? Talking a little about some of the things we’re doing in the community, our podcasts are one. Then the role of mentors helping us along the way. We’ve all started from kind of staring and screaming at the screen because we can’t quite get things to work.

Matan: I still do that. Is it just me?

Chris: Job requirement. It is actually just there, other duties as assigned.

Guy: Now, through mentor powers, the words on the screen change magically.

Matan: Of course.

Carlos: Normally if I spend a little bit of time and I can’t figure it out, I’m calling somebody. Who can I call? I have a few more tools in my tool kit now.

Guy: Absolutely.

Carlos: We wanted to talk a little bit about mentorship, mentoring and who’s helped us along the way and the ways in which they have done that. I want to get your thoughts on mentorship.

Guy: Well, I can think of several mentors, but there’s one that really is the best for me: Itzik Ben-Gan. Itzik Ben-Gan is the one I learned a lot from. I started my career, I think, in ’97. I went to a course at one of the local colleges in Israel, and Itzik was the teacher. It was a wonderful experience for me. I was just learning SQL, so I was watching how Itzik teaches, how he transfers the knowledge and shares everything. Not only the way he did it, but also during the breaks or during the exercises, he was going through his email, and it was all connected to the big screen so we could see everything.

He’d go to his email and go through the questions that he gets from people. He’d go to forums and answer those people. He was always working with other people, connecting, sharing, learning. I think he invented this back then, and I said, “Wow, I want to be like Itzik, I want to be like that guy.”

This is where we started. I went to another class and I started to go to forums.

Carlos: How did you develop your relationship with Itzik?

Guy: After the course?

Carlos: Through the course or through that mentoring process, did you go up to him and say, “Hey. I want you to be my mentor.” Or, “Can we…”

Guy: Initially, I went to another course, and I made sure that Itzik was the teacher.[laughter]

Chris: He’s saying he bought his way into it.

Guy: Yeah.[laughter]

Guy: Sort of. We started to be in touch. Itzik started the Israeli chapter, the Israeli SQL Server User Group, at more or less the same time, ’99, I think. I started to go to meetings and meet him there. We started to collaborate. One thing led to another. I was in touch with him all the time. I learn a lot from him to this day.

Today, when Itzik goes and delivers a presentation anywhere, I’ll go and see it. It’s the best. That’s my answer.

Carlos: Very good. Very good.How did you connect with your mentor, Richie?

Richie: Oh, wow. There have been a few. Right now, I don’t have that one person that is…I’m looking, right? I’ve been looking for years. The one person I would consider a mentor really shaped the last five years, I would say, of my career. That was a CIO that I had. His name was Chris Daily. I worked for him for about six years, maybe, probably longer than that. He left the company, came back, and became CIO. At that time, I was promoted to project manager/architect. We all know what “slash” means. You do everything.

We implemented a supply chain management system for a quick-serve restaurant company. Very large, one of the top five quick-serve restaurants in the world.

I’m going to say, “me”. I’m going to take credit for this one. Rewrote that whole thing from top to bottom. It was in PowerBuilder 5. We rewrote that all in .NET and did all that other stuff.

Carlos: How did he provide mentorship in that experience?

Richie: He fired me.

Carlos: Ah. OK.[laughter]

Matan: Right.

Richie: That makes sense.

Guy: That’s mentorship.[crosstalk]

Richie: He fired me. After we did those three years of the whole thing. His door was always open. I’d go late at night, he would be there, and we would talk over things. It came to the time where I was not happy there. I had spent three years of my life building this project, this product, and I was done. Sixteen-hour days, the whole thing. He saw it and he let me go. He gave me the package. Essentially, he told me, “You cannot be successful here. You’ve done all you can. Now go be successful somewhere else.”

Carlos: You’re not going to be happy here.

Richie: He didn’t say it that way, but that’s how I read it. That’s how I will continue to read it. Someday, he’ll tell me the full story what was going on there. That was a big learning for me.

Carlos: Matan, what about you?

Matan: Related to what Richie said, your mentors actually change over time.

Carlos: That’s a good point.

Matan: You’ll have a mentor for a few years and maybe later on you’ll have another mentor. You can have a few mentors…

Guy: I’m holding on to Itzik.[laughter]

Matan: There can be a few of them at the same time. They may not know that they’re your mentor. They may not know you at all. My example of that is Brent Ozar, whom I’ve been following for the past few years. I learned a lot from him, both on the technical side and about how to speak, how to write, how to think about your job and your world. Brent talks a lot about your brand, how to brand yourself. That is for independent consultants, but also for DBAs who just have a day job: how to talk, how to write your resume, and so on.

The other guy is someone called Pat Flynn. I think you know him also, Carlos?

Carlos: Yes.

Matan: I got to know him. He has a blog called Smart Passive Income. It’s not really only about passive income. He also talks a lot about productivity, how to manage your emails, and lots of other stuff. I got to know him when I searched for how to do a podcast. He has a free podcast course on YouTube, six lessons. I learned everything from him. Those are people that take you higher.

Carlos: That’s interesting. Have you actually met Pat in person?

Matan: No.

Richie: Yes. I have.

Carlos: Oh, you have?[laughter][crosstalk]

Matan: Did you talk to him?

Richie: Yeah.

Carlos: Interestingly enough, a mentor could be someone that you don’t necessarily interact with, but if they’re willing to share with the community, you can still benefit from their experience.

Matan: From here I’m going to San Diego and he lives in San Diego so, theoretically, I might bump into him.[laughter]

Matan: Most likely not.

Guy: What about you, Carlos?

Richie: Once he walks out of his house and you’re stalking him, then yeah…[laughter][crosstalk]

Chris: Stalking online versus in-person, a little bit of a boundary thing there, but…

Carlos: Interestingly enough, my experience was similar, in that through a job opportunity there was someone who was willing to share their time and energy with me. I think my mentorship experience was really about going as far as I could, trying to understand the concepts, and when I needed some additional help, going and saying, “Here’s what I understand. Here are the dots that aren’t connecting.”

Then that person would help me connect those dots or point me to say, “Well, I don’t know anything about that dot, but you know who does? This person. Go talk to them.” Providing some of that guidance.

I think it was interesting that Matan mentioned, your mentor may not know. It may not be an official, “I am the mentor, you are the mentee,” type relationship.

I think particularly in the SQL Server community, we have lots of people who are willing to give their time and talents to do that and, ultimately, all you have to do is ask. Be approachable, and be willing to take some of that feedback.

Other attributes that would describe a good mentor?

Guy: Makes you have ah-ha moments.

Carlos: Ah-ha moments, OK.[laughter]

Richie: I’d say they want to challenge you, right?

Carlos: They challenge you, right.

Richie: In conjunction with that is also respect. If I have a mentor I don’t respect, I’m not going to listen to what that person has to say.

Carlos: Fair but tough.

Chris: If you actually have a mentor that’s acknowledged your existence, so not a stalking one, if you actually have… [laughs] Poor Brent.[laughter][cross talk]

Chris: If you’ve got that one, they can start to pick up, through your email or your talk on the phone or in person, on those subtle cues and realize there’s something that’s not coming through just yet. They can pick up on that and guide you to what that is. I’ll pull an example from last night. I’m at the bar, at karaoke, where you can’t hear a thing. A guy comes up and just starts talking. He’s like, “Yeah, I really want to give back to the community.” I’m like, “What are you doing?” “I go to user group meetings.” OK, but that’s not…

Richie: So you’re taking from the community. Congratulations![laughter]

Chris: Taking! You want to give back. “OK, how are you doing it?” He’s like, “Well, I don’t know.” Then, “Do you know the people that run your user group?” “Yes.” “Go tell them you want to do a Lightning Talk.” “Well, we’re just starting, we don’t…” “Great! They need speakers. Say, ‘I want ten minutes to talk about Bom.'” He’s like, “Well, I don’t know what to talk about.” I’m like, “What do you know?” He’s like, “Well, RGRPL.” “There you go. Done! Ten minutes.” That’s all you need.

Richie: Go tell him you want to run communications. [laughs] Nobody wants to do that!

Chris: There you go! I’m just picking up on this as he’s saying “I want to do this but I’m not quite sure da-da-da-da-da-da.” He starts talking about these other things that he’s interested in, like “I thought speaking would be fun.” Well, do it.I’m picking up on that, but you didn’t. I wasn’t even realizing I’m being…What is it, mentoring? “Mentol?” Whatever we want to call it. [laughs] Just to go and say, “Just do it.”

The term I’ve heard a lot this week at PASS is the “voluntold.” If you’ve run a user group, you’ve got people you kind of know. For people who have run user groups, there’s a lot of truth in that here. You see those people that sit there every week, every month; they’re there, but they never get up and do the thing.

They have this passion, it’s there, but even as a chapter leader you’ve got to realize you are a mentor to these people. You’re the one up in front and they’re going, “I could never do that.” You can just go up and say, “You know what? We don’t have anyone for January. Could you just do ten minutes?”

Guy: We were there in the beginning. I was there. Sitting in the audience and…[crosstalk]

Matan: I started talking because of this Guy.[laughter]

Richie: He was mute until then!

Matan: I sent him a message on LinkedIn. We didn’t really know each other very well, me and Guy, because there was no Skype, no [inaudible 15:27]. I sent him a message on LinkedIn that I had an idea. Guy had just started to be the chapter leader in Israel. I sent him that I had an idea. He said, “Yeah, I’d be happy to hear your idea. Let’s sit for coffee.” We sat for coffee. I told him about my idea. He didn’t really know what I wanted from him…

Guy: I told him, “Are you stupid?” Something like that, right?[laughter]

Matan: And I told him, “Yes.”

Guy: Actually, I go, “You’re stupid.” [laughs]

Matan: But other than that, he asked me would I want to start speaking? I said, “Yeah.” The rest is history, I guess.

Carlos: We’ve talked about how mentors can change. Obviously the person that helps us at the beginning of our career is very fundamental to our trajectory to where we are today. We also talked about the need for ongoing or continuing support and maintenance. We’re needy like databases. [laughs] Thoughts on, not that you necessarily are going out for an official, I’m not picking on Paul Randal here, but “Hey, who needs a mentor!”

[laughter]You want to reach out to people in similar circumstances. Thoughts on how you go and find people that you want to glean information from?

Chris: Who we want to learn from?

Carlos: Who do you want to learn from? How do you go about choosing a mentor now?

Richie: The people you look up to. You know who they are. You read their blog posts and you listen to their podcasts. Maybe these are the people you talk to in your office when you have a problem. Those are the people you need to approach.

Matan: Or you come to PASS Summit and you go to some presentation and you find someone that really looks like he can help you.

Richie: That’s right. Like this guy on Entity Framework at 4:45 today in Room 611.[laughter]

Matan: It’s too late.

Richie: Oh, dang it![laughter][crosstalk]

Carlos: Ultimately, the people that are putting out information are willing to share. Go out, find a little bit more about them. Reach out to the people who are putting out content. Obviously you want to talk with them because you know they are knowledgeable. They’re trusted. I was listening to, was it Cathy, who’s, mother, mom, Mother SQL? I can’t remember what her name was. Each individual person will have their own voice. You want to find your people. [laughs] Your voice.

I think even here, our mentor experiences were very different. We looked up to that person or we established that relationship for very, very different reasons. It’s ultimately finding someone who matches that in the community.

Richie: Remember that they’re a human, too. They’re just like you. You can look up to them all you want. All of us are in the same spot. You can go [sings] “ah!” and we’re all of a sudden going [hollers] “augh!” [laughs] We’re humans.

Guy: There are some exceptions. I’m not sure Itzik is human.

Richie: Fair point.

Guy: Occasionally I have a really complex statistical problem. I send it to Itzik at 2:00 AM, and I get an answer within like five minutes. It’s not human. [laughter]

Chris: He’s actually a holographic projection, a translator-like supercomputer.

Richie: He has an optimizer in the back of his head. Like Master Chief with Cortana.

Chris: I know. Exactly, there you go. [laughter]

Richie: You know what’s a good way to connect, now that we have mentioned that? Go and speak. You need to go where these people are and have interactions with them. It’s not just going to be, “Hey, can you be my mentor?” the first time you meet them.

Chris: Hint number one. Don’t ever go and say that. “Will you be my mentor?”

Richie: No, it’s multiple connections. When I first met Brent Ozar… Brent and I are…

Chris: Is there any other Brent in our world? [laughter]

Chris: Sorry to all the other Brents but you know what I’m talking about.

Richie: It was at one of those speaker deals, right? I started speaking because I wanted to meet those people. Not anyone in particular but, “Hey, there’s a speaker den?” When I found out there was a speaker den, where they are all in one spot and I can essentially have my own personal session with really bright people, I was all in. I didn’t have any technical sessions to give. I was still a project manager. I gave a project management presentation. I talked about what I knew. Then a year later, or a few months later, I was able to do some research and put together something technical. It’s just a matter of getting out there and being in these places where these people you look up to are.

Chris: I do that with SQL Cruise, Alaska. My wife and I had that on our bucket list. We needed to do an Alaskan cruise. Then I saw it came up. Actually, Brent Ozar at the time, on one of the blogs I’m reading all the time, said, “We are doing our company retreat on the Alaskan cruise.” I was like, “Boom.” I texted my wife and said, “These people are doing this, it’s an Alaskan cruise. You want to go?” I swear by the time I hit send, the OK came right back, an immediate response. Got hooked on that.

I’m going to shamelessly plug SQL Cruise. I know you’ve been mentioning it on your SQL Data Partners podcast. It’s a life-changing experience because you get with these icons, SQL royalty, if you will. You get to be the SQL courtesans.

Richie: No, that’s different. That’s Andrea.

Chris: All those big names, Kevin Lebrun, Kevin Kline, Brent Ozar, those people. If you want to get in the room, you better get there half an hour early. People cram into those. Then there’s like us, maybe the courtesans, that get stuck outside trying to woo and get into that room.

Richie: I’m the janitor of the SQL Server community, OK?

Chris: You’re the stable boy. Things like SQL Cruise, it’s unique because you’re on a boat. You can only get so far away from somebody. You’re in a boat, in the middle of the ocean. If you want to talk to Brent, he can only run so far. There’s a lot of swimming.

Richie: You can also run off the boat, which I’m surprised he didn’t do that.[laughter]

Chris: You know what’s worse? They bring their families. You wind up with a small group, with less than 30 people. Half of them are these icons or whatever you want to call them, that you look up to. You’re sitting there. That’s when you realize that they are humans. You are hanging out and it gets, this is going to be an inappropriate word almost, intimate. [laughter]

Richie: Lots of things happen. [crosstalk]

Chris: Exactly. What happens on the ship, stays on the ship. You get to know these people and interact. It’s so different. I know that I sat there across from Brent Ozar, which we all mentioned. He’s everywhere. He’s like marketing. You get him on media, boom, he’s there. That’s what it is. He owns Google.

Carlos: You want a case study of content marketing. [laughter]

Chris: It’s brilliant how he does it. It was brilliant what he did. I think we all want a mentor like that. We all want to have that kind of thing. I remember sitting with Brent, and the one real mentor thing I got from him, sitting there. I look at him and said, “Why should I bother blogging or podcasting or video when everybody’s going to Google and look it up and the first 10 pages are all Brent Ozar?” [laughter]

Chris: He gave me two pieces of feedback. One, it works like a resume on its own. You can actually say, “Yeah, I wrote this stuff.” Don’t go copy and paste other people’s stuff, because that can be found out really quickly. Write your own. In writing your own, the second point was, you have your own voice. The way you say the exact same thing, someone else will interpret it better from you than from me. That helps kick-start things, but it also builds those relationships with these people. You get the mentorships, even though you don’t realize you’re getting them.

Osmosis works fantastic in those situations. It doesn’t mean that you have to press against them. It’s just in the general atmosphere that it just evolves and happens.

Carlos: I would like to come back to the mentorship. You don’t know what will happen when you start to reach out, or you find your people, or you find your voice. I’m going to pick on Matan here for a second, only because that initial, “Let’s meet for coffee,” turned into a business opportunity. Correct?

Matan: Yeah.

Carlos: Do you want to tell us a little bit about that and how that evolved to what it is?

Matan: Yeah, we sat for coffee and we talked about starting to speak. We talked about podcasting, and it took some time. We needed to figure out how it gets done, and then we bought the microphone. We did a few pilots. We started in Hebrew. It was tons of fun.

Chris: I actually listened to yours in Hebrew, because you had mentioned Statistics Parser. I was like, “Let me hear it.” I didn’t understand anything of it. I heard “Statistics Parser.” I was good with it. [crosstalk] [laughter]

Richie: Any Google alerts? [laughs] Do you recognize those words?

Carlos: True story. Absolutely true.

Chris: I think any language is determined just like that.

Matan: At some point I went to Amsterdam for SQL Rally and I interviewed Brent, Denny Cherry, and Adam Machanic. It was an English episode, obviously. One of the listeners told me, “Why don’t you do an English podcast also?” We said, “Yeah, let’s do that.” It went very, very well. At some point a business opportunity came when we got to know each other. At some point it was pretty much clear that I would come and work with you guys.

Carlos: Interesting, that progression. I didn’t realize that that progression was actually you reached out for the chapter meeting. You guys talked. Then you started saying, “Hey, let’s do a podcast together.”

Matan: No, we sat for a meeting because we wanted to talk about the podcast. Guy also offered me to start speaking.

Carlos: OK.

Matan: Eventually the business opportunity came.

Guy: Actually, in the beginning, I thought it was a really bad idea. I didn’t think that anyone would want to listen to people talking about SQL Server. Initially it sounded like a very bad idea. I told Matan, “Let’s forget about it. Let’s just do something else. Let’s just speak in the after-meetings.” [laughter]

Guy: I need speakers. He insisted. He didn’t give up. Then we said, “Let’s give it a try.” Initially, we didn’t even think to do it ourselves. We looked at it as an initiative: let’s find people that can talk on the radio, and we’ll just manage it.

Matan: I thought about it. I actually thought about doing it myself. You are less skeptical. [laughter]

Guy: Then we didn’t find anyone, and we didn’t know who to go to. We said, “Let’s try it ourselves. Let’s have one show and see how it goes.” It went really well. We enjoyed it very much. Then we got some feedback from people. We did another one and another one. One thing led to another. Now I can say it’s a brilliant idea.

Carlos: So glad you thought of it. [laughter]

Guy: I’m so glad he insisted on it. Really, today we have thousands of listeners every show. It’s crazy, and people see us in conferences like this one. “Hey, you’re the guy from SQL Server Radio.” It’s crazy.

Matan: What I learned from this occasion with Guy and from other incidents is that a little bit of guts generally helps. Generally speaking, things are less scary than you think. It’s like something that Finn says: if you feel uncomfortable with something, most likely, if you do it, it will be good for you. If you get over yourself and reach out and do the thing that scares you, in most cases, it will be very good.

Richie: You also ask yourself, what’s the worst that can happen? What’s the worst that can happen if I put a podcast out there? Nobody listens to it. OK, and I was able to interview someone that I probably wouldn’t have a conversation with otherwise. Well, so that’s a benefit and no one heard it. That’s OK, right? I can move on from there.

Chris: You have a much brighter outlook on life than I do.

Carlos: Well, then to close the circle, I don’t know what’s your business history there. You started this relationship. You guys started working together. Ultimately, you saw some value in Matan’s work ethic or even knowledge or whatever.

Matan: I’m a good liar. [laughter]

Richie: That’s a good trait to have.

Carlos: Maybe I’m making an assumption here. You were kind of serving as the mentor a little bit.

Guy: I don’t know.

Carlos: What made the decision?

Matan: In many cases, yeah.

Carlos: What made the decision to say let’s go into business together basically?

Guy: Well, I got to know him very well from the radio. We have been working together and speaking together. You get to know a person very well, because we have those, it’s that intimate, some intimate conversations between us. I got to know him very well. When you get to know someone like that, you can appreciate them, and you know you want to work with this guy. I knew I wanted to work with him. It took maybe three, four, or five meetings to get to know someone and say, let’s move on, let’s take it higher.

Carlos: That’s interesting. I think it’s part of, when you reach out to your mentor, these people you want to connect with, you also have to give of your time. Yes, you’re retrieving information as you start to engage. If you’re not willing to take what it is they are giving you, then why bother?

Matan: I’m speaking here at Summit in three hours, talking about not being scared and taking risks, what’s the worst thing that can happen. I’m going to do my presentation dressed up as Superman. I can say it now. [laughter]

Matan: It’s going to be a surprise.

Richie: Can I throw kryptonite on the stage? [laughter]

Matan: Yeah, please do. My wife told me that this is a really bad idea. Don’t do that. [laughter]

Chris: So is Bradley Ball your mentor then? [laughter]

Matan: I’m taking the risk and what’s the worst thing that can happen? I don’t know.

Richie: You never speak at PASS again. [laughter]

Richie: Maybe that’s not such a bad thing.

Carlos: Last year, they had sharks and alligators or something.

Richie: I think the thing about Summit for me is that it’s not about the technical teaching. I think that’s what gets people in the door. It’s really connecting with the community. I don’t think there are many technical communities that have the kind of interactions that the PASS community has. I think that’s what keeps bringing people back more and more.

Matan: Very many interesting things happen outside of the lectures, of the sessions, in the hallways and during such gatherings.

Chris: I’m notoriously bad for coming here, and I cheat. I’ll go to two sessions and then I’ll go to all the Lightning Talks. Tonight I went to a dozen sessions. There were five in one hour. [laughter]

Chris: Yeah, I went to a dozen sessions, and that was great. I really went to three or four or whatever. Get the video. Buy the session recordings. There’s no way you’re going to make it to 230-whatever sessions every year. You’re going to miss the ones you want to see. I’m going to manage it even though no one’s going to listen. [laughter]

Chris: You use the technical aspect to sell it, to get your business to pay for you to go. You can learn the tech online. You can read a book, do whatever, but come to meet the people and do the networking. Build that soft skill set that you get, and all that experience. Don’t worry about losing your job because you didn’t go to every single session. You’ll have built a network strong enough to help you get something else quicker. I’m not saying you go and say, “I quit,” that’s it.

No. Just know that you built that. It actually makes life more comfortable and easier in those aspects, versus just going and coming: I went to this session. I went. I left. They gave me a couple free drinks at the end of the night. Then I went back to my hotel room and watched whatever Superman movie, we’ll say.

Please don’t do that. If anything, sleep during the actual sessions, do everything else, and be out until morning, because that’s where you make those connections.

Richie: Unless you’re speaking, then don’t do that.

Carlos: Oh yeah.

Chris: Unless you’re speaking the day after, other than that.

Carlos: Yeah, so.

Matan: My first time, two years ago, I went to a lot of sessions. I just wanted to learn and learn and learn. I learned a lot, but I missed all the networking stuff, all the collaborations around it; I wasn’t there. Last year it was different for me, it was like 50/50. This time I’ve been to one session so far.

Richie: I think my experience has been similar to everyone else’s in that there are other people out there in your same boat. Whatever that boat is, you are not alone. It may feel very tough sometimes, right? Maybe you think you’ve got the crappiest company to work for, or your boss is horrible, or you don’t feel you have the career advancement opportunities available to you. Right? I can assure you there are people out there that have had those experiences and been able to overcome them.

Being able to connect and talk with them, because you never know what other opportunities or what other things will come of that. I can’t say that we’ve invented Wonder Bread because of these experiences, but I do have those relationships, even though I have run into problems.

I know that I do have a network of people that I can at least reach out to: “Hey, what do you think about this? This is the experience I am having. What are your thoughts?”

Carlos: Yeah, I know.

Richie: They will give me some feedback.

Carlos: I think how you connect isn’t going to be through the technology. That’s what brought us all here. How you connect is through what you like outside of the technology. Last year they had a board game night. There was a group of us that had never met each other, and we started playing a game of Pandemic. We still chat all the time on Twitter for the rest of the year because we played one game of Pandemic, which we lost horribly, right?

This year we all saw each other and it’s like, “Hey, remember that game?” and all this other stuff. We had an interest outside of the technology with which we connected.

Chris: Does this mean we all have to keep in touch now?

[background music]