Episode 22: Eight Years of sp_whoisactive

Episode 22: Eight Years of sp_whoisactive

Episode 22: Eight Years of sp_whoisactive 150 150 Carlos L Chacon

1400In what has to be one of the most downloaded SQL Server scripts on the planet with over 73,000 downloads, sp_whoisactive has been providing active session information to data professionals for eight years.  I catch up with Adam Machanic to talk a bit about how the procedure came to be, how he uses it and what is in store down the road.

If you are using SQL Server and haven’t used sp_whoisactive before, I invite you go and download it right now, then come back and listen to the program.  We will wait for you.

Have fun on the SQL trail compañeros.

Show Notes

Adam on Twitter
sp_whoisactive download
Example of @get_task_info
Example of @get_transaction_info
Can Wolverine Die?

Carlos L. Chacon: This is SQL Data Partners Podcast. My name is Carlos L. Chacon, your host, and this is episode 22. Today we’re talking about sp_whoisactive with Adam Machanic.

Adam was one of the first people I reached out to when I started recording my podcast in the summer of 2015. Because of his schedule, we weren’t able to meet immediately. He asked me to reach out to him a couple of months, which I did, and he was gracious to sit down and chat with me.

There aren’t too many bigger names in the community than Adam. And in listening to the recordings, you can tell I was a little starstruck, so I apologize for that. Some of my questions, I stammer a little bit too much, and I try and prepare and have those ready ahead of time. This time just didn’t go quite as well as I had planned.

It is interesting to sit down and chat with Adam, what his thoughts are as far as how we should be using sp_whoisactive, how it was designed and some of its future plans.

As always, we’d love to get your thoughts and feedback. If there’s a question that you think we should arrest, hit me up on Twitter at @CarlosLChacon. It’s always good to have you compañeros, and welcome to the show.

Children: SQL Data Partners.[music]

Carlos: Adam, welcome to the program.

Adam Machanic: Hi.

Carlos: We were discussing, before we got started here, about when this project started. You actually created a series of posts about sp_whoisactive of which you can see out on sqlblog.com. The very first post, you have, “Hey DBA, why is my application so slow? Hey DBA, why is my query taking forever to return the results? Hey DBA, something is broken. Fix it quick.”The worst thing a DBA can do is to base a decision on an uninformed guest, and hence this creation of this tool, sp_whoisactive, was created. Take us back. When did this project officially begin for you?

Adam: I started working on it back in 2006-2007, something like that. Just like everyone else, I had been using SQL Server since 1999 or so. I started working on version 7.0, SQL 7.0, whatever it was called, whatever version that was, then SQL Server 2000.Obviously, throughout that time, I was using sp_who, sp_who2. They basically gave nothing in terms of…No insight in terms of what’s going on. It will tell you that a SELECT is running or that an INSERT is running or something like that. It gives really confusing output basically. Basically, it doesn’t tell you anything to debug the system.

When SQL Server 2005 came out, we had this DMVs, Dynamic Management Views, and I started playing with them right away, and it took me quite a while. It took me two years to understand them and see the power and figure out where to get things.

I was working on some project with a bunch of consultants and basically working for an insurance company, and they were eight of us all working on this little tiny box that we all had to share the same server.

For some reason, the company would not give us…They were paying us probably a lot more in aggregate per hour than the server was worth, which is interesting to think about. Anyway, we kept killing each other. One person would run a query, and it would take over the whole server, because it was tiny. No one really knew what was going on.

Because I would have to run something and wait for it to finish, I had a lot of downtime in the project. I started writing this script to figure out who was running what and try to work out some way that we can all survive and co-exist on the same server.

So over about six months, I played with this quite a bit and built up the very first version that I released on my blog. It was just a script at that time, just a big, huge chunk of script that I released on my blog on December 31, 2007. From there, I started getting a bunch of feedback. The response was really positive, and I started bolting more and more stuff on it.

At some point, I turned it into a start procedure, and I had people asking for all kinds of different options and switches and things like that. It just organically grew into what it is today.

Carlos: Wow. That’s something to think about, the staying power of it. Looking on your site, I can see comments from October 2015. You are now on version 11.11?

Adam: Yeah.

Carlos: Obviously, it has some staying power.

Adam: Yeah, a little bit. Interestingly, that version 11.11 has been out for quite a while now. It’s actually been the most recent version since 2012. There’s a couple of small bug coming. They are not even really bugs. There are misbehaviors that I’m aware of.There are a couple outstanding requests. Actually, several outstanding requests, but not too many bugs that exist in that version, so I haven’t really felt a need to write a new one. It’s just been sitting out there since 2012, very stable at this point, but very minor, a couple of issues. It’s been pretty cool, actually, because I use it every single day.

You talk about in the software industry, eating you own dog food, and I do that every single day with SQL as I do. Every single piece of work I do, I’m constantly using it in modeling my own processes and my own work. I think one of the reasons it’s worked out really well.

Carlos: Is the go-to, when people are asking those questions, “Hey, this screen has been running, and it’s been running for a while. What’s going on?” That ability to get in-site into the server, as you mentioned, to see what’s happening, to see if there’s blocking, to see if it’s just [inaudible 06:56] taking a long time has been super, super helpful.Now, as you mentioned, there are stored procedures, so people can download it. They install that, and they can run it, and of course there are some default parameters, but you don’t have to supply anything with it. It will give you an output of what’s currently running on your system.

Adam: Yes, that’s correct. That’s the whole idea is. I tried to make the default parameters as generically applicable as possible. No one really stopped to think about what they put into it. Unfortunately, I don’t get as many anymore, but I used to get a lot of questions like, “I installed it, and it didn’t return anything. I keep running it, but it just doesn’t give me any output.”The reason is it’s not supposed to give you any output unless there’s something interesting forward to show you. People won’t stop on the dev server, which you absolutely should do. Test it, don’t trust me. Don’t trust anyone with something you’re downloading on the Internet.

You are going to put on your test server. You are going to try it there first. Make sure it doesn’t crush anything. But of course, there’s nothing running on your test server, and it’s design, it’s called ‘Who Is Active.’ It’s only supposed to show you something that’s actually active. It’s not supposed to show you all the background stuff that’s constantly running on your server. Because honestly, who cares?

What you want to see is who’s active and who’s making the server slow right now. When you run it on your test server, and nothing is running, it doesn’t show you anything, because it says, “I don’t have anything interesting to show you.” I used to get a lot of emails from people who would say, “I think it’s broken. It doesn’t give me anything. What’s up?” That’s a bit of a learning curve for some people.

Carlos: There you go.

Adam: Yes. I think particularly very useful in a dog fight, if you will, when people are starting to ask questions about the database, then it becomes your friend. I know that we have options to add some additional details that get returned. For example, stored procedure names or objects that are running in addition to just the query that’s running at the moment.We can see execution plans in addition to that, things of that nature. But as I mentioned, I admit to normally running it with no parameters. Are there other functions that you’ve been using? You mentioned using it every day basis that you think have been the most helpful for you?

Adam: Yeah, definitely. The first one is @get_task_info. Basically, there are three different modes that Who is Active can use to collect and weigh information. First of all, there’s a very like weight mode. The second mode is the mode where it collects the top weight per request, and then the third mode is a mode where it collects all the weights per request.The default is the middle mode where it just collects a single weight per request, and it basically ignores certain types of weights, for example. It ignores parallelism weights and prioritizes blocking weights and lock weights and things of that nature, and then it surcharges you what it believes is the most important weight for you to focus on. That’s the default.

If your server is under heavy, heavy duress, and sometimes your server will be so loaded down that sp that was active won’t run very quickly. If that happens, you can put it into like weight mode. In which case, it won’t capture any weights, as weight collection is a little bit expensive.

Normally when I’m running it, I actually run it with four weights collection where it collects all the weights, and you control these three modes with the same switch. It’s called @get_task_info. @get_task_info=1 is the default, and that’s the one where it just collects a single weight per request.

@get_task_info = 0 puts it into lightweight mode. @get_task_info = 2 puts it into heavy, collect-everything mode. And that’s the one where you’ll see all of the parallelism waits, if there are any, all of the waits across all of the threads that are running on your request.

You’ll see how many threads are running on behalf of your request. You’ll also get a couple of additional columns added to the output. You’ll get a physical IOs column that I captured from one of the thread’s DMVs and a contact switches column that I captured from the same one.

You’ll get a lot more information at the expense of, obviously, a little bit heavier run time. And that’s usually the mode I actually use when I’m debugging things, because I want to get as much information as I possibly can. So that’s a really, really useful one, especially if you understand how to read waits well and how to interpret them. It gets very, very heavy.

A second mode that I’d like to turn on is called @get_transaction_info. By default, sp_whoisactive only tells you whether your SPID has, or how many transactions your request has opened at the moment. It doesn’t give you any more information aside from that.

If you turn on @get_transaction_info = 1, a new column is added to the output that actually shows you which databases your session has open transactions in, how many log records have been written in each database, and how many bytes have been written to the log on behalf of your transaction in that database.

Carlos: Oh wow, OK.

Adam: And that’s super, super useful. I do a lot of ETL work, a lot of batch processing. Sometimes the transaction logs will start growing, and we want to know why is that happening.

Carlos: Sure

Adam: Sp_whoisactive @get_transaction_info = 1, and I have the answer in a second. Another thing I’ve noticed is that if you are involved in a large, fully logged transaction, you can actually figure out how far into the transaction you are based on reading the log records.So for example if you have insert running, and it’s fully logged, it’s actually going to log one log record per row that’s been inserted, on average. Sometimes there will be page splits and other things like that, that will mess with the numbers. But basically, it’s going to be one log record per row per index that’s being inserted into.

If you note, for example, I’m going to insert 10 million rows, I know that my table that I’m inserting it to has three indexes, then I know that there’s going to be approximately 30 million log records generated on behalf of my transaction.

So if I’m running my insert and someone comes and says, “Is that insert almost done?” I can run Who is Active and say, “Looks like we’ve done 20 million log records, therefore we’re about two thirds of the way done. So it’s been two hours now. I’ll talk to you in an hour.”

Carlos: OK, very interesting. Yeah, very helpful. Anytime you can provide that information right to your users to give them that status, it’s super helpful, and it makes you look super informed.

Adam: Right, exactly. That’s the point.

Carlos: Yeah. Now, you had mentioned an important concept, and that is using it in your testing.

Adam: Right

Carlos: I have to admit that I usually do a little more firefighting with it. And you mentioned a specific scenario, this log instance, for example, looking at the log transactions to see where you are and how much volume that you have. When do you normally start bringing that in into your testing as you’re developing, for example you mentioned your ETL packages or ETL processes?

Adam: I’m running this thing all the time, so basically if I write a batch process, and I’m running it, I’m testing it, even from the very first test, I’ll be running sp_whoisactive in another window, capturing off query plans, waits, so on and so forth.Basically, I’m a very performance-minded developer. When I write code, I don’t want to have to go back and play with it later. I want it to perform from day one and keep performing.

Carlos: There you go. That’s good advice right there.

Adam: It’s good and bad. It depends on what you’re doing to be perfectly honest. Most of the projects I work on, I’m brought in because I am a performance-minded person, so it makes sense there. For some other projects or some people, it might not make as much sense to focus quite that much.I would say exercise caution in that area. Sometimes you need to get the job done. And if the query is a little bit slow, as long as you’ve gotten the job done, you don’t need to spend three days carefully tweaking every single aspect of it to make it as fast as humanly possible. That’s more often than not actually what I do, because that’s my world.

Carlos: Oh, OK.

Adam: That’s not everyone’s world. That is my world.

Carlos: Sure. So maybe like an 80:20 rule can apply there as well.

Adam: Yeah. Don’t always do that, because it’s sometimes a waste of time. But by and large, that’s the way I operate. I’ll have that thing running all the time while I’m testing processes that I’m writing, while I’m testing anything I’m writing. I find it’s just much easier for me to use that tool since I’m very familiar with it.I know how to read the output really well than to turn on a trace or do something else that other people would generally do. I see a lot of DBAs turning on a profiler still or extended events or whatever to capture information. I don’t find myself doing that anymore, because I just use Who Is Active to capture all that stuff, or occasionally some other DMV queries.

Carlos: As you begin to become more familiar with the wait types, you can troubleshoot that a little bit easier. I guess thoughts around once you have that list come back to you, if you’re not as familiar with all of those, I guess thoughts about maybe next steps. Where would you go next? Now, I have this list of top three waits, if you will. Where would I take that information?

Adam: There are lots of great articles online about different waits, just Google, effectively. Plug any wait into Google, and you’re going to get some information most likely. You’re going to get someone’s blog. MSDN has some stuff. Some of the MSDN blogs have some stuff.You’re probably going to get something. If you don’t get something, then you go to Twitter to #SQLhelp, and you ask there. And then if you still don’t get something, then you have to call Microsoft.

Carlos: Hopefully it won’t come to that. But yes, the community is very anxious and willing to share that information.

Adam: Yeah, and I have a couple of things on my blog as well. There’s so much information out there. It’s actually gotten to the point where it’s more important to look at all the information and figure out what’s real and what’s not.There’s so much information now that there’s almost misinformation in certain cases. It gets difficult. Some of the waits are much more complex than people realize. Things like IO waits, for example, are actually very complex. They’re not necessarily always about IO. They can be about other things depending on what else is going on in your server.

It’s important to really be careful with what you’re looking at and make sure that you use an evidence-based model and not guess.

Carlos: Sure. And that brings up one of those tools that we can look at some of those as the execution plans. And I know you can get the execution plan back as a result, as a parameter, if you enable the parameter to get the execution plan back. But one of the things that might be a little surprising is that you’re looking at the plan and the cache, right?It’s not exactly the same thing as running include the execution plan in your SQLs or management studio, and then seeing that result.

Adam: Well, yeah. There are two kinds of plans in SQL server. There is the estimated plan and the actual plan. I think these are really badly named. They’re both miss numbers in their own way.The estimated plan, if you look at it, will be the plan shape. It’s going to run. It’s going to be cashed unless there is recompile that occurs just before the plan runs. Most of the time, the estimated plan is in fact the actual plan shape it’s going to run.

Then we have this thing called the actual plan. The actual plan is these generally, almost always the same exact plan shape as the estimated plan, but it includes some additional information: number of rows per iterator, number of executions per iterator, and some other information. It’s the same plan, it just has a little more information.

Who Is Active, since it pulls the data out of the cache while the plan is running does in fact show you the actual plan shape that’s running, even though it is called an estimated plan.

That’s why I say that. I think those are really badly named concepts in the SQL Server world.

Carlos: Sure, sure. Fair enough. I was referring. I guess my thinking was that I wouldn’t necessarily see the actual rows and estimated rows, for example, from an operator pulling it from the plan cache, right?

Adam: That’s correct. You will not see any in that data. The data is actually very, very expensive for SQL Server to collect. That’s the reason it’s not available by default.In SQL Server 2014 and above, there’s a DMV that will actually tell you all of those numbers on a per iterator, per thread basis, but the big caveat is your session actually has to have plan collection enabled, so return to actual plan, so set showplan XML on, and whatever that is has to be enabled for your session, or you have to have either trace or extended event session collecting the actual created plan.

Either way, you’re adding a lot of overhead to your plan and the SQL Server to collect that information. Unfortunately, we can’t get that too easily.

Carlos: Another reason to begin using that in your development process is where you’re not having the load of the system there. You can spare that overhead, and it collected information for you if you put into production.

Adam: Correct, correct. Of course, I don’t know if it’s Murphy’s Law or some other law, but once you go to production, everything is different.

Carlos: That’s right. Nothing like doing in a production system. That’s for sure. We talked about version 11.11. There is a version 11.111, so 11.1, if you will, that was created for Azure. Now that lot of DMVs are available in Azure, they’re working all those things out. It will work. But interestingly enough, you mentioned a general lack of Azure. So is that changed at all for you personally?

Adam: First of all, just to clarify, I did release version of 11.111 and did work in Azure, and now it doesn’t anymore. So what they did was they released all the DMVs finally in Azure, then they ripped one of them out that sp_whoisactive does use.First there was, it had a reference of MSDB, because it pull some information about SQL agent jobs. I had to pull that out, because there’s no MSDB in Azure. Then it did work, and then they pulled another DMV out, which is sys.dm_os_sys_info, which is a DMV that has information about.

It only has one row and has a bunch of columns in software link system. It provide information like how much memories on the system and how many threads there are and how many physical cores there are and bunch of other information, how long the server is going up, things like that.

Anyway, they ripped that one out. Unfortunately, that’s pretty deeply embedded in some of the logic, and it was active, so I’m going to have to, and there’s no replacement that I can find. So I’m going to have to, firstly for rewrite part of it, if I do want it to run on Azure.

Then to get back to your question, yes, I do want to run on Azure. I’m getting lot more request these days, actually, for an Azure version. It seems like that’s finally starting to be some attraction around Azure, especially with some of the newer SQL Server features they’ve announce.

I can really take advantage of it — backup Azure, Stretch Databases, things like that. Is it called Stretch Tables? I don’t remember exactly.

Carlos: They are at the table level. But yeah, stretch…

Adam: Stretch Tables. Some of these features, I think, are really going to start driving adoption. It’s getting easier and easier to set up an Azure database. They’ve done a really good job in terms of giving people free or nearly free access.I think we’re going to see quite a paradigm shift in that direction in the next few years, so I definitely will be helping to support that with Who Is Active.

Carlos: Do you have any download numbers for us, version 11.11? How many times they have been downloaded?

Adam: This one has been downloaded about 73,000 times. I think, probably, in total all the versions into there, probably half a million or something like that times, because there’s been ton of different betas and other versions. 73,000 times. It’s interesting to think about those numbers.I was saying to a friend the other day, “They have a thousand servers in this company, and all thousand servers are running sp_whoisactive.” I have to assume that obviously, there’s a lot of people who downloaded it multiple times, but a lot of those numbers, a lot of those 73,000 downloads must be some DBA is that taking it and distributing it across their environment.

I would love to be able to count the total number of instances the sp_whoisactive was on. It’s got to be in the high hundreds of thousands, I would guess.

Carlos: No, I would agree. I think we have community tools that are widely available. It’s got to be probably the number one tool for lot of DBAs.

Adam: Yeah. Some are pretty cool. I’m glad to been able to provide that.

Carlos: Well very good. Adam, thanks for chatting with us about sp_whoisactive. Of course, you can find that on his blog. It’s sqlblog.com will have that with probably bit of information on our show notes which you can find at sqldatapartner.com/podcast.Adam, before I let you go, we want to switch gears here just a little, get some insights into, additionally, some of things that you’ve done. I know, and one of the things we always like to talk with our guest about is their favorite tool. Now, we’ve been talking about sp_whoisactive tonight, and that can be your favorite tool. But do you have a favorite free or paid tool?

Adam: Obviously, I use sp_whoisactive all the time. I don’t use a lot of other tools, actually, in my work. I use Management Studio. I live in Management Studio. I use Visual Studio. I live in Visual Studio.

Carlos: You’ve done a quite a few things over your career in the performance space, if you will, but I know you did both the finance, the organizations. But what’s the best piece of career advice that you received, and how did it helped you?

Adam: Early on in my career, one of my managers recognized that I had a personality that would lead me to be in leadership positions over the course of my career. I was really fighting it pretty hard, the idea of being in leadership position of any kind.Basically, I actually was purposefully forced myself out of those positions by putting myself, by doing things that weren’t very good for that kind of career. Let’s just put it that way. I’ll keep it vague .

Anyway, this manager sent me and said, “Listen. You are going to be promoted into leadership positions, either in this environment or others. You really need to learn to adopt it and embrace it and learn to really trust other people and learn to delegate to other people and really work in a team environments.”

It took me a long time to absorb that. When I received that feedback, it’s tough for people with my personality to really delegate work to others. It took me a really [inaudible 28:04] to realize the importance of it.

After several years, I finally was able to understand value of that. Now my approach is more I want to delegate work, even though it’s frustrating to me.

My personality, I’m perfectionist. Delegating work and then having someone not do it quite as well as I think I can do it myself is very frustrating to me. That’s one of those things that I have to really…thanks to that feedback I received, that I’ve really, really worked on.

Now, I realize it’s not about the fact that the work isn’t necessarily being done badly, it’s about in fact the other person is learning, and I’m helping them get to the point where they can do the work extremely well. That’s very satisfying, actually, helping other people grow and learn and get better with their career.

It helps me get better as well, because someone can only really be as good as the team that’s around him.

Carlos: Sure. It has a dual purpose there.

Adam: Yes, it did.

Carlos: It helps everything move forward or get better.

Adam: I think so.

Carlos: Very nice. Well ,Adam, our last question for you tonight. If you could have one superhero power, what would it be, and why would you want it?

Adam: That’s interesting one, too. I guess if you think about superheroes, it doesn’t get much more bad ass than Wolverine.

Carlos: Oh.

Adam: I’m going to have to go with his healing ability. I guess Adamantium Skull would be nice as well, but that’s not really a power. That’s mean he is able to have.

Carlos: I guess because of his healing. He was already pumped with that stuff.

Adam: He is always been my favorite superhero, by far. I’m going to have to go with his healing abilities. I guess it makes him immortal as well, right?

Carlos: Yeah, that’s a great question. I’m not exactly sure.

Adam: I think he can be killed technically, but I think he will live forever as long as someone doesn’t cut his head off or something. Pretty close to immortal. Heals up, and he’s a bad ass. That’s sound cool to me.

Carlos: Very good. Adam Machanic, thank you for being with us tonight.

Adam: Thank you. I appreciate that.

Carlos: Again, you can check out the sp_whoisactive at sqlblog.com. You can get show notes from tonight’s episode at sqldatapartners.com/podcast. We’ll see you on SQL trail.

Children: SQL Data Partners.

Back to top