One of the new features being added to SQL Server 2016 is the Query Store which will allow for greater access to the performance history of your databases. This is going to be extremely helpful when you are approached about ‘What happened this morning?’ With the Query Store feature, you will now be able to provide some insight into that question.
I chat with Borko Novakovic, the program manager for Query Store feature, and he shares with you, Compañeros, details about the new feature and how those interested in performance monitoring will be able to use this for their benefit.
Even if you aren’t going to run SQL Server 2016 as soon as it comes out, you will find this information interesting and I hope it helps you along the SQL trail. Have fun out there.
Transcription: Query Store
Carlos L Chacon: This is the “SQL Data Partners Podcast,” my name is Carlos L Chacon, your host and in episode 24 we’re talking about the Query Store, the new 2016 feature that will enable you to get better history of the performance of your database.
I’m joined today by Borko Novakovic, from Serbia. He is the program manager for the Query Store feature. In other sessions that he has given he compares the Query Store to the flight data recorder for your database. I know this is going to be very useful for a lot of administrators out there that want to get access into that and see why queries may have changed in their performance over time.
After 20 somewhat podcast interviews you’d think that I’d have down all the sound quality issues and I am embarrassed to say that something happened on my side. Borko fortunately sounds very good but my side…
Anyway, something happened. I made some edits so that Borko is a little bit more featured there. I do apologize for that, and I hope that you’ll bear with me and continue to tune in.
If we miss something, if you want some other feedback to send my way, of course you can hit me up on Twitter @CarlosLChacon, or you can shoot me an email at [email protected].
Compañeros, it’s always good to have you. I hope that you have enjoyed the holiday season and are looking forward to a great 2016. I know I am. It’s good to have you, and as always, compañeros, welcome to the show.
Audience: SQL data partners.[music]
Carlos: Borko, thanks for being here. Welcome to the program.
Borko Novakovic: Thank you, Carlos.
Carlos: Let’s go ahead and jump in. Tell us why the query store will be such a cool feature.
Borko: Query store is a new feature or a tool in SQL Server 2016 and Azure SQL Database that dramatically lowers the bar for the query performance troubleshooting. In a sense, you can think of it as a fly data recorder for your workload. It captures and collects all the relevant data that is important for query performance troubleshooting, so that the data is always available for you when you need it.
Borko: Basically with the query store, it does two fundamental things for our users. It collects all relevant compile and run-time data related to query and execution plans. Based on that data, you can do any kind of analysis and take corrective steps if you have performance problems. That’s one thing.Another thing, it really simplifies the process of plan forcing in the SQL Server. Prior to Query Store you would need to either use plan hints or plan guides. Both of these features are not that intuitive for average users. Now with the Query Store, you are just one click away from forcing the plan that you believe that is the best one for your query.
Carlos: Yes, that’s going to be interesting. I think one of the big differences between the Query Store feature and the current, for example, DMVs we have now, is that we can get some insight into the last query that was run, and then the cumulative nature of that query, where that the Query Store’s going to give me point in time over, this query’s run ten times today, I want to see stats about each of those ten query attempts.
Borko: One of the problems with the DMVs is actually in the fact that when you look for the specific plan in the plan cache DMV, in some situation, you might not be able to find it, because for many reasons, SQL Server might decide to evict the plan from the cache. It’s not longer in the memory. In many cases people do troubleshooting basically. Customers are complaining about the performance, DBA is doing performance analysis, and he cannot conclude what went wrong,With the Query Store, that cannot happen, because the full history of query execution is persistently stored inside the user database. Even if you had the performance problem a day ago, or a couple of days ago, you could go back in the past and see how the query was executed, how many times it executed, what were all different plans that were used for the query execution, and compare them side by side.
Carlos: I think that’s a common scenario where a user will come up and maybe nonchalantly say, “Hey, this morning I was having trouble. It seems fine now after lunch. But what was going on this morning?” [laughs]
Borko: Exactly. That is one of the main use cases for the Query Store. And of course, having the data for the analysis will give you opportunity to open new different scenarios. You can analyze, “What are the top resource-consuming queries today, during the last week, during the last month, ever?” which was completely impossible to do with the DMV.
Carlos: It is only cumulative. You can only do it since last restart, or since the last reset. Say, only give me last week.
Borko: Under the condition that the plan is currently in the cache, which might not be the case. Now you have statistics both for the queries that are currently in the cache as well as for those that are not in the cache, for ad hoc queries. You can analyze top resource-consuming queries, but also you can understand better your workload. You can understand the nature of your workload. You can conclude whether your workload is parameterized.What is the number of different queries? What is the number of different plans? How many plans you usually have per query in your workload? Stuff like that. Then you can distinguish between queries that execute very frequently in your system versus those that execute once a day, once a week. Then you can apply different techniques to optimize all of these.
Carlos: Now we’re familiar with including the actual plan results when we execute a query, but that process of collecting that information takes effort for SQL Server to capture that information. How are we getting around some of the performance impacts of now collecting this information for all of that queries over the course of, say, even a week?
Borko: I could translate your question if I understood it correctly. “What is the average performance impact of running Query Store with the workload?” Right?
Carlos: Fair enough.
Borko: This is the bottom line question. Basically, of course, the usual answer for this question is it depends on your workload, right? But I will now try to explain how it depends. The main fact that is critical for the performance overhead is how many different queries and how many different plans you have in your system.What happens is, if your workload is predominantly ad hoc workload, you have, let’s say, thousands or hundreds of thousands of different queries. Then, actually, your workload is difficult both for the SQL Server and the Query Store. It actually puts a lot of load both for the Query Store and the SQL Server, because the SQL Server will be constantly compiling new queries and new plans, which probably means you are not using parameterization enough as you could.
In that case, you might observe some performance overhead. But if you have really well-packed, really well-designed workload — let’s say you have a couple of thousands of different queries, and then you have plans that are being reused — then the overhead is really, really minimal.
When I say “minimal,” what we observe in Azure SQL Database — and we are running Query Store on thousands of databases for months now because Query Store is really a cloud-first feature, and we really learn a lot and improve a lot based on Azure feedback — what we see is really negligible performance overhead. On average, it is a few percent — one to two percent, even less.
Now, it doesn’t mean that every workload in the world will have that performance overhead. As I mentioned, you can construct synthetic workloads that can have five, maybe even more, overhead. But what I’m trying to say, what we observe on a thousand of different databases is really a minimum overhead. The design we implemented is such that we are minimally blocking current workload.
What happens is, actually, there are two integration points between every workload and the Query Store. Every time when the query is compiled or recompiled, then the new plan and compile time statistics are being sent to the Query Store, using the message that exists between the query optimizer and the Query Store.
Every time a query gets executed, then the query processing pipeline sends the runtime statistics and Query Store then calculates statistic information for the interval over which query was executed.
Now, if you go to my previous point. If you compile rarely and execute frequently, then the only point where overhead exists is actually during the runtime. This really is a minimal overhead.
Borko: If you’re executing different queries all the time, then what you’ll be doing — what SQL Server will actually be doing — is SQL Server will be compiling plans all of the time, and then send plans and compile time statistics to the Query Store, and that will basically incur significant performance overhead. The most expensive overhead comes from plan compilation and sending the compile time statistics.
Carlos: Sure. For those, maybe using some Entity Framework and other things, you might be a little bit more in trouble. If you’re using store procedures and whatnot, you’re…
Carlos: …A little bit safer. I think you could use the DMV because the DMV will give you the object information if the plans are coming from an object that’s per procedure, or views. If it’s ad hoc, that information will be there. You can get a sense, overall, for what’s the percentage of ad hoc versus object plans, if you will — plans that come from storing procedures, then.That would give you a sense for what my performance in fact might be.
Borko: You can use the Query Store, of course, to easily identify whether your workload is ad hoc or not.
Carlos: Once you’ve set that up, then obviously you’re collecting that information, and I’d expect the Query Store to tell you that.
Borko: Yes. Then, if you have that information in front of you, then you can decide what to do because you have some techniques that you can apply even without changing your workload for parameterization, and then reduce the number of different plans that will be used during the query execution.
Carlos: Sure. The benefit there of having that information is being able to force the parameterization, as you mentioned, to reduce then the number of plans.
Borko: Yes, and just to mention one thing. Overhead of running ad hoc workload is not only in the fact that there are too many recompilation that happens, or compilation that happens, but also in the fact that Query Store is being flooded with a large number of actually similar queries and plans. And then what can happen in that case is that very easily you can exceed the size quota Query Store has inside of every user database.Maybe now I didn’t explain that part, but basically Query Store stores the data inside the user database. There is a pretty fine size quota that you can set and figure to any value that you think is reasonable. We start 100 megabytes, but basically if you’re running ad hoc workloads, what we observe is that those kind of workloads very quickly exceed the quota. They are generating new and new queries, new plans, which are all similar, but not the same.
In that case, what happens, Query Store goes to the read-only mode, because it exceeds the quota, and then you are not operating the latest data. If you go to Query Store you won’t see the latest data, because it will stop collecting the latest data.
Carlos: Once it gets to about 100 megs, it stops.
Borko: Yeah. Which is a second bad thing which happens, and yet another reason why you need to do something about ad hoc workloads.
Carlos: Let’s just close that loop for those who are experience the Query Store for the first time. You mentioned how they get stored. So previously, if you’re comparing that to Dynamic Management Views, the information gets stored in the resource database, it’s not seen to the end user.Now you’re saying I enable Query Store on a database, the query is formed — or what goes into the Query Store for that database, actually gets stored inside that database — and then that information is available to me via SQL Management Studio.
Borko: Yes, and what happens is the data is persisted across backups and restores, so even if you backup your database and you store it in some other place, you get the Query Store data inside. I can just quickly and briefly explain how Query Store works and what is the data that Query Store collects and captures inside the user database.Let’s start from queries. Every unique query text is being stored and captured inside internal structure of the Query store. First of all, Query Store uses internal tables to persist the data, and all the data is being exposed to set of views — system views — that you get as part of the SQL server.
If you go and search for sys.querystore, you will find a couple of views that expose query text, queries, plans, and run-time statistics. We started with text. Every unique text is being stored in the Query Store.
Now, it’s very important thing. One unique query text can produce multiple records in Query Store. This is because, as we know, if you execute the same query text using the different context setting — we call it “plan affecting settings” — you can actually produce different results. If you change the language, if you set the different set different ANSI_NULLS. If you change some of these set options, the same query text can produce different results.
Therefore, we store the same query text under the different set options. We store it as different query entries in the Query Store. That means you can analyze and tune those queries independently. For us, it’s also important whether some query text is used inside or outside of some containing object, like store procedure, function, trigger.
Say that you have a query text. The same query text that is used as a ad hoc query and the same query text that is used as part of the store proc will generate two query entries. You will be able to actually analyze them independently, and if you need to tune that query text, you have actually option to independently tune your ad hoc query versus store procedure query.
Then, for every query entry in the Query Store, we do collect every execution plan, every plan that we get from the compilation phase, inside a separate table and expose as part of a separate view. A set of compile time statistics are collected, like the CPU, time view for the compilation, and some other metrics.
Then, the most important part. This is the compile time part. Every time when the query gets executed, some of its plans is being executed. What query store is doing, it divides execution in time windows. We call it “run-time intervals.” This is something that you can control and configure. You can say, “I want to track execution every minute,” or every five minute, or every hour, or every day. It defines the granularity that you’ll have inside the Query Store.
For every query that was executed on that interval, you will have exactly one row in the Query Store with aggregated run-time statistics for that plan.
Carlos: If I understand what you’re saying right, is that I have the option of saying I’m going to keep this information for a month, but I want you to aggregate it in chunks of one hour, so if the query runs ten times an hour…Borko. Yes, in chunks, right.
Carlos: …I’m going to see the average…am I going to see average results, or cumulative intervals?
Borko: You’ll see a couple of metrics. You’ll see the number of execution times. How many times query was executed in that interval, then you’ll have average, and then you have maximum and minimum, and standard deviation for the duration. Using the number of executions and average you can easily calculate the total duration of that query.Those are the statistics methods that you have for your analysis. It wouldn’t be feasible to store each and every execution of the query, because there are queries that are executed thousands of times in a minute.
Carlos: Let’s get into some of that. You mentioned that your experience with being in SQL Azure Database. You’ve seen the usage and some of the scenarios. What is a general guideline?Let’s say I’m just starting out. I want to get some better insight into my environment. What are some of the scenarios or, “best practices” may be a strong word, just yet, but the setup components that I’d want to look at, that I’d want to implement?
Borko: Usually what we do explain to our customers is how they get started with the Query Store. The number one thing that we advise to our customers is, “You need to install the latest version of SQL Server Management Studio,” which means, 2016 Management Studio.You probably know that as of CTP2, you have a standalone installer of the Management Studio, which means you can install the Management Studio independently from the SQL Server.
Which is especially interesting and useful if you don’t have SQL Server running [inaudible 19:53] in your environment, but you want to use Query Store in Azure. Why we do suggest people to install the latest version of Management Studio, because Management Studio comes with a set of great UIs that you can use both to configure and analyze the data that Query Store actually collects.
As a starting point, if you go and click the properties of your database, then on the left-hand side you’ll notice a new, basically a tab– a Query Store tab — for the database. It opens a form that you can use to activate the Query Store. To set the operations mode to retry it, which means I want Query Store to start collecting the data.
There are a set of very important parameters that you need to configure for the Query Store. For many people it’s actually good enough to take defaults because SQL Server Management Studio comes with defaults, but some settings are important and I want to emphasize those.
I already mentioned the maximum disk size. So, we start with 100 megabytes. Another settings that people need to consider is for how long you want to keep the data inside the Query Store. I think we came out with a pretty conservative default, which is one year and two days.
Borko: Which I think is really long period of time for many users.My advice for users is if you don’t plan to analyze data for such a long period of time you don’t need to keep the data because it will be overhead to the system.
So, if you want to analyze data only in the last month, then just set that value to 30 days and that’s completely fine.
Carlos: Then we download it. So, what about the event that, so, I’m a good DBA, right, I want to review my systems, let’s just say on a quarterly basis. I keep it for three months.Is there a way to persist that data or even export some of that data to say, “OK, I’ve made these changes. I’ve affected parameterization on some of these queries. It’s ad hoc queries that all look the same. I don’t necessarily want to keep that data but I do want to get some thoughts on if the changes that I’ve made have been good for my system.” So, would the advice be to just keep that for the two months, or the two quarters, rather, so that you can compare those or is there a way to export some of that and say, “Here’s where I was. Here’s where I am now.”
Borko: You can use different techniques to the data from the Query Store to any external structure because we do expose views so that you can use SSAS. You can use any of other export procedures to export the data to your custom structures. You can use backup restore or backup mechanism to export the data.Now, we don’t have any specific mechanism today to export the data. Only the data that is related to the Query Store to some format and then import into the same format to another database. We don’t have that mechanism today. I must admit that we are thinking about having such a mechanism, but it doesn’t exist.
So, if you want to compare. You always want to keep last month of the data, but you had some plan forcing decision in the past and the you want to analyze how your database today compares to one year ago. Then you need to do analysis on your own.
If you export the data previously in some structures, in some tables and then you export the contents data to the Query Store in the same database, you can do analysis.
I must admit that today you are pretty much dependent on your own skills for exporting and analyzing the data.
Carlos: The MSDN site has lots of great information of how to use and set up that and we’ll that information all with the show notes today. It’s sqldatapartners.com/podcast. You can go in, get some of this information. Some of the step-by-step details you can get on how to set those things up, will be there, available for you as well.
Borko: Yeah, thank you. That’s awesome.
Carlos: We had mentioned that this has been Azure SQL Database, with cloud-first feature. Are there going to be any differences between the Azure feature and the on-premise feature when it gets released? Which I know is a little bit hard to tell because it’s just CTP3. Current thoughts, right, in November 2015, of what differences might there be.
Borko: If you compare today query storage at Azure SQL Database and in SQL server [inaudible 24:44] CTP3, you won’t find any difference. The feature set is completely the same.As of beginning of November, we are actually at the GUI level of quality in Azure SQL Database. We call it “globally available,” which means we are really at the production level of quality in the Azure SQL Database which automatically means we have the same level of quality on prime because we are using the same code base. Everything is the same.
Carlos: Now, one thing that we maybe had danced around just a little and that is, once I have collected all of this information — and you mentioned SQL Server Management Studio adding the newest version — and that is the analytics that I’ll have inside of SQL Management Studio. There’s actually reporting…
Carlos: …and some charting available for you. So, it’s not just looking at raw data, if you will. It will guide you on that path to decision making.
Borko: The visualization that we provided in SSMS actually are reports that help you go through the main scenarios. Although you could analyze Query Store views on your own — there are six different views that you can join and combine the data — it’s of course much easier to go and look at predefined reports because they visualize the data in a very intuitive and simple way.I will quickly cover what are the three most important views and reports that you provide as part of SSMS. First one, we call it “regress queries.” So, that is the view that you can use to easily and very quickly analyze what are the queries that today behave worse than compared to the, let’s say, average on a period of last seven days.
If you will, the default is “Give me the queries which total duration is worse than comparing to the last seven days.” Of course you can change the metric that you analyze. By default it’s duration, but you can use CPU time, you can use memory, you can use IO metrics.
Instead of writing complex queries against the Query Store, you can go and configure the reports and it will give you top-end queries that actually needs your attention.
Then, you can go and analyze a different perspective and we call it “top resource consuming queries.” Those are the really the most expensive queries that you have in your system. Again, you are a free to specify what is the metric that you are interested in. Whether it’s total duration, or CPU, or memory, IO, and what is the period of interest. Again, even if those queries are not regressed, maybe they behave completely in a predictable way, they are still the most expensive, which means those are the guys that you need to pay attention to, right?
Because you might have thousands of queries, but usually you have handful of these that are very expensive and then you need to actually pay more attention. And in some cases you can easily see, “Hey, this query was executed with three different plans,” and, “In last month, this one was the best one, but for some reason SQL server was switching between those three and I need to analyze why.”
Of course, when you click on the plan, it’s very handy on the bottom of the visualization, you have plan visualization. Very complex data is visualized in a very intuitive way and you have everything that you need. Query and plan execution, and a single plan drill-down.
Just to conclude, the third scenario, we call it “overall resource consumption,” which means, “I don’t want to go into a” — and some people start even from that one, right? — which is, “this is my database, this is the time axis, I want to see what is the total CPU for my database in last day.”
I have intervals and I can see the chart and how CPU duration or memory consumption look like in last day and then for interval I can double click. From that point, I’ll actually go to the top resource consuming queries for that interval.
So, using SSMS will actually save you a huge amount of time.
Carlos: It’s a complete feature, right? Not only the collection of that data, but then giving you the tools to drill down and make sense of what information is being collected.
Carlos: You presented at PASS this year. It was the last session of the conference Friday afternoon at 3:30 pm. The room was packed. It was probably, gosh, a thousand people in there. More. So, I know that Query Store is going to be a very interesting feature. You had mentioned some metrics actually in looking at the 2016 features and kind of where Query Store fit in there. You want to talk about that for a second?
Borko: Based on the data that we got from people that wanted to share the information with us, Query Store is really among top three features currently. Among top features from SQL Server 2016 and we are very happy about that because we know that is something that we can really help our customers to optimize and understand their workload much better than before. Yeah.
Carlos: So, it will be exciting. We were chatting, I’ve seen the demo, but Borko smacked my hand and said I need to download CTP3 and get in there and play around with that a little bit more. So, I have my homework assignment for me. Compañeros, I hope that you’ll take advantage to do that.Of course you can use the Azure SQL database to do that if you have access there, or you can the download the CTP version.
Borko, thanks for chatting with us. We want to transition just slightly into what I’m now calling the “SQL Family” section of the podcast and it’s an opportunity just to chat a little bit more about you. How you do your work, and things. We were talking about Query Store and it’s OK. I know that you’re the program team for that. So, Query Store is your favorite tool, that’s OK. But what’s your favorite SQL tool?
Borko: I think that answer is easy for me. The favorite tool for me is really SQL Server Management Studio.
Borko: Not because I work for Microsoft, but because…I mean, I loved SSMS even much more before I joined Microsoft. I joined Microsoft four years ago, but before that I’m in the industry for maybe 14 years. I was first developer and then I was product manager and also software architect in ISV company that work and built different solutions on top of Microsoft technology stack using the SQL Server, and I found SSMS extremely useful.Why? It’s because I had everything that I need in a single place. Now, this Query Store visualization, which is really extraordinary, makes me love it even more.
Borko: And I think to be completely honest I think we are doing a great job to improve SSMS in the future.
Carlos: Yeah. That was funny. I hear that the SSMS that SQL server Azure 2016 is got a little bit of a face lift in its next version, right? [laughs]
Carlos: Very good. Very cool. You’ve mentioned, you’ve been with Microsoft for a while. Ultimately you are over in Serbia, you going to have a slightly different experience. For example for myself, being in the United States we all take different paths on the trail of the SQL trail. What is the best piece of heard advice you’ve ever received?
Borko: Basically it’s interesting. I graduated from Belgrade University. I actually took a course in electronics. I was supposed to work in hardware industry. [laughs]
Borko: The best advice that I got from my very good friend and my colleague was, “You should be called developer. [laughs] It will give you more freedom, more flexibility. You’ll be more creative.” That was the best piece of advice.I never worked as an engineer in electronics. I started immediately after the faculty. I started as a simple SQL developer. At some point I worked for a company that built a solution on top of SQL Server, that’s like 10 years ago started my SQL journey.
Carlos: Our last question for today. If you could have one superhero power, what would it be, and why would you want it?
Borko: That’s the difficult question for me. Probably going back in the past, and see how history looked like is something that I really like. I’m a really big fan of history. I’m really into history. I like history. On my free time, I actually explore and read a lot of about history.I think I would go to the old Rome and see how actually things looked like because I think there are a lot of needs, a lot of beliefs about the history. The reason not because of me being interested only, but also maybe to explain to people how things look like. I do believe that learning from history is the great way to pave the road for the future.
Carlos: Very interesting. I enjoyed travel myself. I think you’re on a great location in Europe, to get access to a few of those spots at least of course not as they were.
Borko: Greek is near Rome is near so history is near. [laughs]
Carlos: [laughs] Very cool, Borko, thanks again for chatting with us today. You’ve been great.
Borko: Thank you, Carlos.[background music]
Carlos: We’ll have again…short notes are available at sqldatapartners.com/podcast if you want to check that out. Any additional questions we’ll probably be following up just because this is such an area that has a lot of people interested.As we get closer to RTM we may reach out to Borko and have a chat with him. Thanks again. Compañeros and we’ll see you on the SQL Trail.
[background music only]
Audience: SQL Data Partners.