Episode 122: Isolation Levels

Does not play well with others. Is this description appropriate for some of the queries in your database?  In this episode, we explore isolation levels–why they are important, what they help prevent, and why so many people try to get around the rules they enforce.

Isolation Levels

We want our databases to be responsive and provide us with the data as quickly as possible; however, the database has to juggle everyone trying to make changes (updates, inserts, and deletes) with all the reporting activity (selects).  To do this, SQL Server has a mechanism to maintain order when there are multiple requests at the same time.  In some cases this requires locking records to ensure only one request can modify a record at a time.  These locks can lead to blocking, and this is when people start trying to do crazy stuff.
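
To make the locking and blocking concrete, here is a minimal sketch of that scenario using a hypothetical dbo.Accounts table and two separate sessions (the table and values are ours, not from the episode):

-- Session 1: change a row inside an open transaction and hold the lock.
BEGIN TRANSACTION;
UPDATE dbo.Accounts SET Balance = Balance - 10 WHERE AccountId = 1;
-- (no COMMIT yet, so the exclusive row lock is still held)

-- Session 2: under the default READ COMMITTED isolation level, this read is
-- blocked until session 1 commits or rolls back.
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;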

Dirty Reads

Reading data that is being modified by another transaction before that transaction has committed, ignoring any effect those uncommitted changes may have on your results.
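
As a rough illustration (again using a hypothetical dbo.Accounts table), a dirty read can be reproduced like this:

-- Session 1: an in-flight change that has not committed yet.
BEGIN TRANSACTION;
UPDATE dbo.Accounts SET Balance = Balance - 10 WHERE AccountId = 1;

-- Session 2: READ UNCOMMITTED (the same behavior a NOLOCK hint gives per table)
-- returns the modified balance even though it may never be committed.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;  -- dirty read

-- Session 1: the change disappears, but session 2 already saw it.
ROLLBACK TRANSACTION;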

Phantom Records

Rows that are read twice or missed entirely when data moves (for example, a page split or index reorganization) while your query is scanning, or when another transaction commits changes while your transaction is running. These can badly skew report totals.

Episode Quotes

“Isolation levels are about the degree of isolation you want an individual transaction to have.”

“If everybody is only reading and there is no write to the database then isolation levels don’t matter.”

“An optimistic model works well if you don’t have conflicts, and the pessimistic model works well if you have conflicts.”

“It pushes load on TempDB, and if TempDB is already a bottleneck then, yes, you could potentially have issues there.”

“Memory-optimized tables support snapshot isolation, repeatable read, and serializable.”

Listen to Learn

00:05 Introduction of the guest speaker (Jos de Bruijn)
00:24 Episode Topic: Isolation Levels
00:52 What are isolation levels and why are they important?
04:18 Dirty reads and Phantom reads
09:30 Isolation levels when building a database application
11:51 Optimistic Model and Pessimistic Model implementation
18:15 TempDB bottlenecks, locking and blocking in user databases
21:09 Difference in utilization between read committed snapshot and general snapshot isolation
26:41 Implementation of repeatable read and serializable isolation on memory optimized tables
34:43 Locking hints and when to use them
36:31 SQL Family questions

About Jos de Bruijn

Jos de Bruijn is a Senior Program Manager in the Database Systems team. He works on the SQL Server Database Engine and the Azure SQL Database service, focusing on query processing and data storage. Jos has led the query and transaction processing for In-Memory OLTP in SQL Server 2014, and delivered In-Memory OLTP enhancements in SQL Server 2016/2017 and Azure SQL Database. His current focus is performance and scalability in Azure SQL Database. In a previous life, he obtained a Ph.D. in knowledge representation and semantic web technology, and worked in academia for several years as an assistant professor.

Transcription: Isolation Levels

*Untranscribed introductory portion*

Carlos: Well Jos, welcome back to the program.

Jos: Well, thanks Carlos! I’m happy to be back.

Carlos: Yeah, it’s great to have you. It’s always great to talk with Microsoft folks and we enjoy you taking some time out to talk with us. It’s been almost a hundred episodes, and so it’s nice to have you back on.

So today, we’re going to be talking about isolation levels, which might seem like a slightly daunting task. But I think it’s one of those very important features, and I guess let’s go ahead and maybe set the stage for why isolation levels are important and what they help us to do. Right, so I guess how do we want to frame that in this conversation?

Jos: Yeah. So isolation levels, so we are talking about transaction isolation levels, and isolation levels are, as the name suggests, about the degree of isolation you want an individual transaction to have. Because database systems, SQL Server included, support a high degree of concurrency. You can have lots and lots of different transactions, user operations, happening at the same time. So the question is, if you have a transaction that is operating on some data, reading some data, making modifications, to what degree is it isolated from another transaction that is touching the same data, reading the data or modifying the data, at the same time. So that is why isolation levels are important, because what you don’t want is that you are making an update on some data. Let’s say you have banking software, a very classical example, and you are doing a transfer between one account and the other. You want to make sure that the overall change between the two is zero, right? Subtract 10 from one account, then add 10 to another account. At the end of the day, you want to make sure the balance shows that only $10 shifted; you don’t want another transaction to come in halfway and look at the data as if there was a higher total amount. You want the other transaction to only see the result of your transfer and not…

Carlos: The in-between state. Ultimately, if you’re writing queries to the database, it’s how those queries behave with each other. Right, that’s kind of the idea of our conversation today.

Steve: And maybe just to phrase that a little bit differently from the developer’s perspective, when a developer has an application that’s talking to the database, it is how those queries that are coming from that application are going to behave. And the different transaction isolation levels may change the behavior of those queries slightly. Is that a fair way of saying it?

Jos: Yeah. It is actually about the interaction between the queries and the write operations. If everybody is only reading and there is no write to the database then isolation levels don’t matter, and there is no difference whatsoever in the behavior between different isolation levels. It’s only in the interaction between the reads and the writes. So if you have a query and some other transaction is doing writes to the database, your isolation level essentially governs what you can see, what states of the database you are allowed to see.

Carlos: And what we are trying to solve here goes back to basic transactional database logic, right from the very beginning, almost the reason these transactional systems were put together. These are kind of the rules that the database agrees it will follow when coordinating those updates and reads.

Jos: Yeah, that’s right.

Steve: So two terms that seem to always come up when we start talking about isolation levels are the term dirty reads, and the term phantom reads. Can you touch on what those exactly mean from the SQL perspective?

Jos: Yes. So a dirty read means reading data that has not actually been committed to the database. So usually when you have a transaction that does writes, you do the write operations, you make the changes in the tables, and then at the end of your transaction you do a commit. And when you receive an acknowledgement from the server that that transaction has been committed, only at that point in time is the data really part of the database. At any point before that, it is actually dirty data, and if anybody reads that data it will be a dirty read, because before the transaction is committed it can roll back at any time. And if the server crashes or fails over before a transaction is committed, it will always roll back. So a dirty read means you are reading data that has not yet been committed to the database and you don’t know if it will ever be committed to the database. That is a dirty read. You’re getting into very flimsy territory there, and usually dirty reads are what you really want to avoid. Phantom reads, that is something very different. Phantom reads, that is data that has been committed to the database, but it has been committed essentially while your transaction has been running. It’s a subtle kind of interaction between transactions.

Steve: When you’re selecting from a table and you’re reading through page by page, as a query does, and something happens that causes pages to shift around in that table. For instance, an index is being reorganized, or somebody has inserted into a page that’s full and that causes a page split. What can happen when that occurs is you get records that are either duplicated or missing. So, phantom records are basically when rows are either being counted twice, or they’re not there at all. And what that can do, if you’re doing reporting and it has a table with this going on, is you can get very skewed numbers in your results. So, if you’re doing, like, here’s 10 rows and then at the bottom there’s a summary that says here’s our total line, if your total line is being calculated and a row moves, some rows can be counted twice and some rows can be missed. If you’ve got a clustered index on a GUID or a last updated date or something that can change, and that changes, that causes the placement of that row in the table to move. And if you’re doing something like a select count from that table, with locking, you make sure that you get the right number of rows. But if you use select count with no lock, you can then end up with those phantom records, where you have either rows missing or rows duplicated, because using no lock says “I don’t care about other people’s locking, I’m just gonna scan through the pages and grab whatever’s there, even if it’s not fully committed.”
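
As a rough sketch of the counting problem Steve describes, assuming a hypothetical dbo.Orders table:

-- Honors other sessions' locks, so the count reflects committed, consistent data.
SELECT COUNT(*) FROM dbo.Orders;

-- Ignores other sessions' locks; a concurrent page split or index reorganization
-- can cause rows to be counted twice or skipped entirely.
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);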

Carlos: Exactly, so while that work is happening, it’s still going to give you results, regardless of what other people are doing.

Steve: Yep, and what that means is you can grab the same row twice, or you can miss a row completely.

Carlos: This sometimes can be difficult to pinpoint when you come at it from a developer’s perspective. You know, when you think about testing, you’re writing your queries against your own database. They happen very, very fast, even when you try to “load test” and maybe you have some different windows or connections open and you’re trying to replicate some of this stuff. But really it gets important when that concurrency, the number of transactions at the same time on the database, increases. And that’s when you start to see all these rules come into force. I know when I was first starting with this it seemed a little abstract, but again that idea of who gets to go first and what’s going to happen plays an important role.

Jos: Yeah, absolutely. But it is important when you are working with a database application, especially if you have sensitive data that you are working on, things like financial data. It is important to really think through, when you are reading the data, how important it is to isolate the reads from concurrent operations. How important is it that when you do, for example, a range query, when you query all the account data for a single user, nobody adds any additional account data to it, right? So that is what you need to think about when you are building a database application.

Carlos: Yeah, and again I think this could be a little bit blurry, particularly when you have all of this streaming data, right? All of these NoSQL solutions that are coming on the scene. Again it’s that balancing act of how you want these interactions to play out. And these are core fundamental issues. Now, while we want our transactions to play nicely together, like you mentioned with the account, my bank information, I don’t want somebody to see that I’ve taken 10 out before it has been deposited into the other account. But in order for the database to be able to enforce that, one of the things that I’m going to have to deal with is blocking. That’s saying, “No, I’m sorry, you can’t go until I’m finished.” And that’s kind of the mechanism that the database is going to use to help enforce that traffic.

Jos: Yeah, that’s right. I mean, there are essentially two ways that this can be enforced. So first off, you need to make the choice of what isolation level is right for your transactions and for your read operations. Usually you pick an isolation level either for your whole application or for individual transactions, but it is even possible to pick one for individual read operations. So that is the first part, choice. And then there is how these isolation levels are implemented. There are generally two ways to implement them. One is the optimistic way, the other is the pessimistic way. Pessimistic is the default, or has been the default for a long time. The optimistic way was introduced first with snapshot isolation for general tables, and then later with memory-optimized tables; if you pick repeatable read or serializable there, you also get optimistic. Essentially, the pessimistic model means that you are going to assume that there are going to be conflicts, that different operations will get in each other’s way. We are always going to assume some other transaction is going to write something to the database that’s going to violate your transaction isolation, which means that you need to take a lock on a row or range of rows. So if you pick, for example, repeatable read: in a repeatable read scenario, if you read a row you always want to be able to read that same row in the same state at a later point in your transaction. Or serializable, meaning that you essentially have a serialization path between different transactions, so you don’t see any interference between transactions, meaning in practice that if you read a range of rows or you read an entire table, no rows will have been inserted into or deleted from that range. So in the pessimistic model you actually take a lock to make sure that nobody can change those rows or that range of rows. Whereas in the optimistic model, what you say is, “Well, I’m going to anticipate that there are not going to be any conflicts between these different operations,” and if there is a conflict, for example two concurrent transactions want to update the same row at the same time, you get a failure right away. If there is some concurrent transaction that commits but violates that serializable isolation, then your commit fails. An optimistic model works well if you don’t have conflicts, and the pessimistic model works well if you have conflicts. But then of course you have these locks, and especially if you use serializable you can have locks on large ranges of data and then you can have transactions that start to wait for each other.
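
For reference, a minimal sketch of picking these levels per transaction in T-SQL (the table and values are hypothetical):

-- Pessimistic repeatable read: rows you have read cannot be changed by others
-- until this transaction ends.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;
-- ... a later read of the same row returns the same state ...
COMMIT TRANSACTION;

-- Pessimistic serializable: no rows can be inserted into or deleted from any
-- range this transaction reads until it ends.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT SUM(Balance) FROM dbo.Accounts WHERE CustomerId = 42;
COMMIT TRANSACTION;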

Carlos: Right, and there are again some interesting ideas that have been floating out there. Steve and I were chatting about this before we got started. So you mentioned SQL Server uses a pessimistic model by default, right? That isolation level is read committed, and we know that it wants to protect the reads that are happening. So we are going to start putting locks there, and then we start getting blocking. Well, we have this, I don’t know if it’s secret sauce, but there is this idea out there that people can use a hint called no lock and that somehow is going to get them by. It’s going to ease their burden and they are not going to have to worry about all these things.

Jos: I think when we start talking about this level then you are already kind of assuming that you are already using a pretty low level of transaction isolation. So earlier, while we were talking about transaction isolation, we were talking about higher isolation levels like serializable or repeatable read. Those are pretty high isolation levels. Then you get to the lower end, you get to read committed, which is the standard isolation level in SQL Server. It’s the default. That is the isolation level where you still have some integrity in the database, where you avoid dirty reads. So every row that you read has actually been committed to the database, that is what read committed means, so you have a minimal level of consistency. You can still have locking and blocking, because when you’re updating a row you need to take a lock on that row so that nobody reads it. But at the same time I will throw in here as well that we do have read committed snapshot isolation, which is not used by enough of our customers. So for locking and blocking, read committed snapshot. Unfortunately, there are some corner cases where the application can observe differences in behavior between read committed snapshot and traditional read committed. But in most cases you’ll be fine, so that is what I would point at first.

Carlos: So I guess what we are suggesting here is that if your organization, as its rule, is using no lock everywhere by default, you probably want to take a look at your isolation levels before you keep slapping that on all of your transactions.

Jos: Yeah, so first I would say, look at read committed snapshot. What happens is, in SQL Server, read committed snapshot came to SQL Server in 2005. Unfortunately, because we have legacy, it is still not the default. So if you create a new database in SQL Server, read committed snapshot is still off for that database. Now, what does it mean to have read committed snapshot on? Read committed snapshot means that every individual query that you run within a transaction operates on a snapshot of the database, the snapshot as of the start of the execution of that query. This means that if there are rows that are being updated by a concurrent transaction, you are actually reading an older version of those rows. Which means that you can just continue reading without being blocked by updates that are happening at the same time.
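
Turning the option on is a one-line database setting. A hedged sketch, using a hypothetical database name (the termination clause is there because the statement needs the database to itself for a moment):

-- Enable read committed snapshot for the database.
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;

-- Verify the setting.
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MyDatabase';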

Steve: Now, that row versioning of keeping track of those older versions of the rows that’s done in TempDB, is that correct?

Jos: That’s correct, yes.

Steve: Ok. I mean, one of the things that I’ve run into as I’ve come across databases is where I think, this would be great to run in the read committed snapshot isolation level, because people are already using no lock everywhere and it would be great to make it so that they didn’t have to do that. But then on those same systems, because they are already overloaded a little bit to begin with, and that’s why they started using no lock, they find out when I look at it that TempDB is their biggest performance bottleneck. And I guess, how big of an impact will turning RCSI on have? I assume that’s going to put quite a bit more load on TempDB, and if TempDB is already a bottleneck that would not be a good idea to turn on.

Jos: Well, indeed. It pushes load on TempDB, and if TempDB is already a bottleneck then, yes, you could potentially have issues there. However, usually locking and blocking in a user database is a different performance issue from TempDB bottlenecks. So indeed, if you have lots of TempDB bottlenecks and on top of that you have these blocking bottlenecks in the user database, then yes, maybe it will not help you. Maybe TempDB will become an even bigger performance bottleneck.

Steve: But if you have locking and blocking in your user databases and TempDB is not overloaded, then that’s a good scenario to turn on the read committed snapshot isolation level.

Jos: Exactly.

Steve: Ok, I just want to make sure I got that right.

Carlos: That’s where we’re looking at things. I think, Steve, to your point, the idea is that you’re going to trade one problem for another, right? So I’m going to trade my locking and blocking for potential TempDB contention. As long as I know that’s taken care of, good care and feeding there, maybe you’ve moved that over to your SSDs or whatever floats your boat. There is no magic bullet. I guess the idea is that you want to make sure that you understand the trade-offs and that you’re prepared to deal with them.

Jos: I do want to add another point to this, and there is a difference here in utilization between read committed snapshot and general snapshot isolation. Because with read committed snapshot, you use that snapshot only for the running of a single query, not for your entire transaction. If you have traditional snapshot isolation, you use it for the entire transaction. So if you have traditional snapshot isolation, you need to keep the row versions in your TempDB for the entire duration of the transaction, and if you have long-running transactions in the system, that puts a lot of strain on TempDB. But if you use read committed snapshot, you only need to keep the versions for the duration of the query, which means that there is a lot less pressure on TempDB compared with full snapshot isolation.
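
For contrast with the read committed snapshot example above, a minimal sketch of full snapshot isolation (database and table names are hypothetical):

-- Allow snapshot isolation at the database level.
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Use it for a whole transaction: every query sees the database as of the
-- transaction's start, so row versions must be kept in TempDB for the full
-- duration of the transaction.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT SUM(Balance) FROM dbo.Accounts;
SELECT COUNT(*) FROM dbo.Orders;
COMMIT TRANSACTION;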

Steve: That’s a great point.

Carlos: I feel like read committed snapshot, however, is an introduction to the in-memory isolation levels.

Jos: To some extent. So if you look at the isolation levels which are supported by memory-optimized tables, memory-optimized tables support snapshot isolation, repeatable read, and serializable. Read committed is actually not supported; the lowest level is snapshot isolation. But with memory-optimized tables the row versioning is not done in TempDB; instead it is done in the memory space of the user database. It is built into the table data structures, so the row versioning is a lot more efficient than for traditional disk-based tables.
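
A minimal sketch of what that looks like, assuming a database that already has a memory-optimized filegroup (table and column names are hypothetical):

-- Create a memory-optimized table; its row versions live in the user database's
-- memory space, not in TempDB.
CREATE TABLE dbo.AccountsInMemory
(
    AccountId INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Balance   MONEY NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Inside an explicit transaction, reads against the table typically use the
-- SNAPSHOT table hint (or the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database option).
BEGIN TRANSACTION;
SELECT Balance FROM dbo.AccountsInMemory WITH (SNAPSHOT) WHERE AccountId = 1;
COMMIT TRANSACTION;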

Steve: So then with the read committed snapshot or the snapshot isolation levels, those can be turned on in existing databases without changing the structure of the tables. But for the memory-optimized tables, that involves changing over to Hekaton, or the memory-optimized style of table.

Jos: That’s right, yeah. And talking about the way these isolation levels work for memory-optimized tables, they use the optimistic model. So first you have the basic snapshot isolation on memory-optimized tables, which works functionally very similarly to disk-based tables. You see a snapshot of the data as of the start of the transaction. If there are concurrent writes to, let’s say, the same row, you have conflicts that are detected, and throughout your transaction you always read from the same snapshot of the database. Now, where it gets interesting with memory-optimized tables is the repeatable read and serializable isolation levels and how they are implemented, because they are implemented using the optimistic model, which means that you have the same guarantees. So for repeatable read, nobody has changed this row between the time you read it and the time you commit the transaction. For serializable, you essentially don’t see any other transaction committing data in any of the ranges that you read from. Those are the basic guarantees.

Carlos: Now, you’ve actually been bringing up serializable and repeatable read quite a bit, but you also mentioned read committed snapshot as something that probably more customers should take a look at. Now admittedly, I think I may have had one database that I had to put in repeatable read, but I’ve never worked on a database that has been in serializable. How common is that? I mean, I feel like I only know it because it gets used for exams and stuff.

Jos: Yeah, that is true. So serializable is the only true transaction isolation if you want to look at it from a purist kind of point of view. That is where the research comes from. But then when you hit reality, you have performance, right, and concurrency. With serializable, with the pessimistic approach where you take the locks, you have this range of rows that you’re reading, and if your query does a full table scan you take a lock on the entire table. And between that time and when you commit the transaction, nobody else can make any change in that table. So that would be very restrictive, so there is a lot of locking and blocking. And if you don’t absolutely need it for a particular read operation then you don’t do it. That is why the lower isolation levels are more popular, and that’s why read committed is the default isolation level in SQL Server. And actually, one fact that not everybody realizes is that you can pick and choose isolation levels at the query level and even at the table level within a query. So within your query you can add a hint to a table and say, “The read on this table needs to be under serializable, but the other tables can be read under a lower isolation level.” So you could say, for example, my accounts table I always want to read under serializable, but other tables I can read at the lower levels.
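
A hedged sketch of that per-table choice, with hypothetical table names; the SERIALIZABLE table hint applies only to the table it is attached to:

-- The read on Accounts is serializable; the read on Customers stays at the
-- session's isolation level.
SELECT c.CustomerName, a.Balance
FROM dbo.Accounts AS a WITH (SERIALIZABLE)
JOIN dbo.Customers AS c
    ON c.CustomerId = a.CustomerId
WHERE c.CustomerId = 42;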

Steve: Well, so then with that, I mean independent of what your overall database isolation level is, if you have that control on a table-by-table basis then you can, as long as you understand what all those mean, get the behavior you need between different tables, where you have one table where no lock is ok and another table where you need serializable, depending on how often things are changing, right? Is that correct?

Jos: No lock is ok? I think there are very few scenarios where no lock is ok. I’ve heard of some cases where you have an application that, for example, has two connections to the database, and one connection needs to see operations that the other connection did in the database without it actually having committed the transaction. In that case, no lock, or rather, I would prefer people use read uncommitted in that case, because then you actually say what your intention is, even though implementation-wise it’s the same. Then that kind of thing might be ok, but otherwise it is usually used as a workaround for locking and blocking. Semantically, people don’t usually like to read dirty data, right? It is more of a performance workaround. I would say consider read uncommitted only if read committed snapshot puts too much pressure on your TempDB or your application somehow cannot deal with the snapshot semantics of the individual queries.
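
As a small illustration of stating that intent (hypothetical table name), the session-level form and the per-table hint behave the same for reads:

-- Declares the intent for the whole session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Orders;

-- Equivalent per-table form.
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);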

Steve: Ok, thank you. And perhaps I was baiting you a little bit there with the no lock comment, because I am a big opponent of no lock. It causes so many issues from what I see in day-to-day organizations. There is so much misunderstanding from people out there who just turn it on as a go-faster switch. And that is really not the right behavior.

Jos: Yeah, read committed snapshot is the go-faster switch.

Steve: Absolutely, alright. Love it.

Jos: It’s always good to please your host.

Steve: Well, no. I guess when I said I love it, I love it because now I can refer back to this section of the podcast and help solve debates that I’ll have in the future.

Jos: Let’s talk about, besides no lock, which I guess you should never use, the number of locking hints that people would be familiar with that are useful in some cases, where you can say, “I want an update lock on this row” or “an X lock on this row”. These locking hints are typically used to pick and choose at the query or statement level or table level what you actually want. So I would suggest, before going that route, think about whether you can solve the same problem with isolation levels, because with isolation levels you actually declare your intent, what you want to do, rather than specifying the mechanism the server should use. That’s often easier for the user to understand.
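
For context, a hedged sketch of the hints mentioned, with a hypothetical table; they are usually taken inside a transaction that will later modify the rows:

BEGIN TRANSACTION;

-- Update lock: blocks other writers (and other UPDLOCK readers) but not plain readers.
SELECT Balance
FROM dbo.Accounts WITH (UPDLOCK)
WHERE AccountId = 1;

-- Exclusive row lock: blocks other lock-based readers and writers.
SELECT Balance
FROM dbo.Accounts WITH (XLOCK, ROWLOCK)
WHERE AccountId = 2;

-- ... perform the updates, then commit ...
COMMIT TRANSACTION;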

Carlos: Sure, from a troubleshooting perspective, when somebody comes in and takes a look at it, they know from a system setting, like you mentioned, here is my intention, rather than having to find a query and realize, oh, in this scenario it’s different.

Jos: Yeah, exactly. I do want to acknowledge that there are always scenarios, there are always cases where people do have to resort to locking hints, but I would say do that as a last resort.

Carlos: Sure. I mean you have that in your bag of options but don’t make that the first one probably.

Jos: Yeah, exactly.

Carlos: Ok, so shall we go ahead and do the SQL Family?

Steve: Yup. How did you first get started with SQL Server?

Jos: I got started with SQL Server, when I started to work for the SQL Server team. That was in October 2010. I moved to Seattle and started working in Redmond on SQL Server query processing.

Steve: And you had not worked with SQL Server before that?

Jos: Nope.

Steve: Wow! Ok.

Carlos: So what made you make that leap, I guess you know when you’re applying for this job and it would be like, “Hey, you want to come work on our database?” “Sure.”

Jos: Yeah. I was working in academia before. I was working at a university doing research, doing some teaching, but I had kind of decided I wanted to move to industry, to product development, to have a more immediate impact on the world and also to work more closely in collaborative teams. And I wanted to work on interesting technology, so impact, interesting technology, and SQL Server had both. I knew someone from university who had moved to the SQL team earlier. He told me about an opening. I applied and that’s what happened.

Carlos: There you go. Now that’s interesting. Ok, now I understand where all the serializable comments came from. Obviously, you are working on a team, right? Maybe we won’t ask what you would change about SQL Server, but maybe we’ll ask you, what’s the next thing you’re working on?

Jos: I actually find the first question interesting. I do want to give an answer for what I would change in SQL Server, something I would change if there were no legacy. I know making changes while there are legacy applications is very hard, but I really wish we could change the defaults on a lot of things. Like, for example, the ANSI session settings, like ANSI_NULLS, ANSI_PADDING, all that stuff. I would like to make read committed snapshot, which we talked about, the default, and a number of other best practices that people have developed over the years. I would like to be able to make those the default in SQL Server. But it’s very difficult because we have a legacy and we cannot break legacy applications, of course, because it would ruin our reputation.

Carlos: Right, sure.

Steve: Very interesting. Ok. So was there something you wanted to talk about regarding what you’re working on next, or do you want to jump into the next question?

Jos: No, let’s jump to the next one.

Steve: So what’s the best piece of career advice that you have ever received?

Jos: The best is, work on something you’re interested in. That is the best career advice I’ve received and it’s also the career advice I would give, always do something you find interesting.

Carlos: Our last question for you today, Jos. If you could have one superhero power what would it be and why do you want it?

Jos: What would it be? I think that’s a pretty easy one for me. I would like to be able to slow down time or stop time for everybody besides myself, so that I could either do some extra work or, more likely, take a nap while nobody notices.

Carlos: True superhero power, right? You want to take a nap without anybody noticing.

Steve: I would like to take advantage of that one myself.

Carlos: Well, Jos, thanks so much. We do appreciate the conversation and thanks for giving us some of your time.

Jos: Yeah, thanks for having me on, always a pleasure.

Episode 121: What makes a good manager?

I have heard it said “People don’t quit jobs–they quit managers”.  At a recent event, I decided to ask people what they thought the qualities of a good manager are.  Perhaps you aren’t thinking of getting into management; however, at some point in your career you will have the opportunity to lead.  We share your thoughts and comment on the challenges of having the qualities of a good manager.

Episode Quotes

“The best trait in a manager is someone who can give you enough strength to do your job.” – Doug Parnell

“The best attributes for good manager is trusting their good people to do the things that they hired them to do.” – Matt Gordon

“A good manager sets clear expectations, understands how to hold people accountable, and doesn’t micromanage.” – Royce Cavitt

“A good manager is someone that can lead and direct the team to success… a critical management skill is communication.” – Jim McCullen

“You also have to build that trust as the manager or between the manager and the employee.”

“There is an old expression it’s called ‘complain up’, if you got something to complain about, you should bring that to your manager.”

Listen to Learn

00:26 Brief introduction about the episode topic – What makes a good manager?
00:58 Companero shout outs
01:59 Reminder on “Tips and Tricks”
03:56 Upcoming Database Health Monitor webcast
04:50 Show notes links
05:24 Interview recordings about attributes of a good manager
15:10 Priorities perspective of a manager and the manager’s boss
23:02 Empathy and Accountability
26:25 Importance of communication
29:14 Relationship and Trust

Transcription: What makes a good manager?

*Untranscribed introductory portion*

Carlos: Companeros, welcome to another episode. This is Episode 121 and it’s good to have you on the SQL Trail again.

Steve: Today’s topic is on “What makes a good manager?” That’s an interesting one.

Carlos: That’s right. So, when I go to SQL Saturdays I like to try and get people’s thoughts about different things, and I was in Charlotte, I guess it’s been a month or two now. I thought, hey, let’s talk to people about what makes a good manager. And then this topic evolved a little bit for me and I will get into that as we begin our discussion.

Steve: Alright, but before we do that do we have any companero shout outs this week?

Carlos: Yes, so we’re going to mention a few people, obviously the folks that we’ve talked to, and we are appreciative of those who have lent their voices to this episode. It’s interesting, however, I have been thinking about something that was actually vocalized by another blogger. They were specifically talking about Twitter, but it seems like social media in general, the idea of the “like” button and how engagement tends to be a little bit lower but likes continue to go way up. How do you kind of judge that when you put content out there? Obviously, companeros, we are very interested in hearing from you, your thoughts are not stupid, and so obviously we’re appreciative of those who are willing to engage with us. You know, drop us a line, whatever that might be.

Steve: Oh yup, definitely. We always like to hear from people.

Carlos: That’s right. So a reminder, we’re still collecting tips and tricks. We want to start this new segment in January and we invite you to leave those comments either via social media or on the website of the podcast page. You can leave them there and we’ll put them in the mix and we’ll start that up in January of next year.

Steve: Yup, so what we’re looking for there is really, what are your ideas on things that you know that other people don’t always know when you’re getting around in SQL Server or Management Studio?

Carlos: Just let us know. You will always be surprised what people don’t know.

Steve: I see this like if you’re sitting next to someone at a keyboard and you’re helping them work through something, and they are doing something the hard way and you try and explain how to do it in an easier way. Sometimes it’s not that easy to explain and we’re going to try and take some of those on.

Carlos: That’s right, exactly. I even think, and so it was interesting, we were actually talking about a function today, right? So functions might sometimes play into this as well. If you have a favorite function that people ask you about. Like, “Hey, why are you using that?” We would love to know that too.

Steve: Yeah. I definitely agree on that. There is this STUFF function in SQL Server and I was explaining it to somebody years ago and she said, “Well, stuff, what’s the real function you’re using?” She thought I was just using the word stuff as a placeholder for a real function name. I came back to point out that there is indeed a SQL Server function called STUFF.

Carlos: There you go, perfect example.

Steve: Yup. So we also have an upcoming Database Health Monitor webcast on the 14th of December.

Carlos: That’s right, so for those of you who are catching this early there is still time to register. Obviously if you are listening to this after the 14th you can watch the video recording when it comes out.

Steve: Yup, and that will be at 10AM Pacific or 1PM Eastern on the 14th. And we’ll be going through and talking about the Quick Scan Report in Database Health Monitor and things you can do there. And you can find the link at sqldatapartners.com/webcast. You can go there and sign up now.

Carlos: Yeah, so we are looking forward to putting on more of those and hopefully doing those this second Thursday of every month. Let’s see, so for our show notes today, you can reach us at sqldatapartners.com/manager.

Steve: Or sqldatapartners.com/121 for the episode number.

Carlos: Yeah, so as I mentioned, we started by getting some people, kind of man on the street style; I took my recording equipment and went to Charlotte for the SQL Saturday down there in October and collected some thoughts. Let’s go listen to those and then we can go from there.

[START OF INTERVIEW RECORDINGS]

Gita: My name is Gita and I work for a financial institution. As to the question you are asking, the best attribute: I think a good manager can understand your strengths and gives you opportunities to explore and shine where your strengths are. Also, he develops a very good trust relationship where it’s not micromanagement. He gives you more responsibility to do things effectively by yourself without managing whether the things are happening. And if you’re working without your manager micromanaging you, and still you’re working, then that’s the best manager.

Doug: Hello! My name is Doug Parnell. I work for Elon University. And the best trait in a manager is someone who can give you enough strength to do your job, do some research, but then pull you back in when there is a high priority item or a task that needs to get completed. I might have that now and it’s great. They give you some freedom, but also, I know when there is an email sent, or that passing hallway conversation, like, “Oh ok, that deserves my attention.” Rally back, get it done and then continue on.

Sandra: Hello! My name is Sandra Peel. I am a software developer at CMI Solutions. I think the best qualities in a boss are integrity and being ethical in everything they do with employees, customers and any business dealings. Thank you.

Rick: This is Rick Hieges. I think one of the best attributes they can have is for them to listen to you. Although they may not be able to do anything about it at least they can let you know that you’re being heard and showing that you’re important to them.

Matt: Hi! I’m Matt Gordon. I am a data platform solution architect for DMI. So the best attributes for good manager is trusting their good people to do the things that they hired them to do.

Carlos: So what does that trust look like to you?

Matt: Lack of micro management. If you assign a task, you hire them to do it. Let them do it within parameters but let them do it. That’s what you’re there for, that’s what they are there for.

Carlos: I’m going to extend this question a little bit because we have this idea of micro management come up a bit. What then maybe is the responsibility of the employee to help the manager not micro manage? Does that make sense?

Matt: Yeah. Well, you have to earn that trust. I mean, you know, if you’re new to a company generally they’ll let the leash out in sections. And you need to accomplish the task as you go. Even if you’re walking off the street with all the certifications, credentials, experience, you’re still getting use to a new firm, new manager, new structure so you have to earn that trust as you move forward. And if you don’t, then they are not going to let the leash out in that section.

Guest: The best qualities of a manager, I would say, are somebody who listens and somebody who is attentive to his direct reports. A good manager, I would say, is somebody who you can sit and have a conversation with, somebody who is also a risk taker like you, and somebody who helps his direct reports, you know, the people who report to him. And somebody who is understanding, because sometimes you go through stuff, and there are a lot of events that happen in life, you know, family, or whatever the case may be; somebody who is understanding towards that. Work is work, definitely you are here to work, but you want a manager who really understands the outside pressures besides work.

Jeff: Jeff Garbus, CEO, Soaring Eagle Database Consulting. I’ll go for two. The first is that a good manager empowers his or her employees to be able to do the tasks he needs them to perform. That means not only do you give a task, but you give the person the ability to complete the task without having to jump through additional hoops, without curtailing the person’s creativity, without putting limits on his or her ability to perform.

Royce: I’ll add. A good manager sets clear expectations, understands how to hold people accountable, and doesn’t micromanage. Allows them to do their job and encourages them to make the right decisions.

Joseph: Joseph Miller, Soaring Eagle Account Executive. A good manager is transparent. A good manager is somebody that cares not only about the task but about making the person that’s under them successful. As well, a good manager is candid, somebody that’s candid with the person that they are supervising or working with.

Jason: My name is Jason. I work for Idera. When I think about managers, I’ve had good managers and I’ve had bad managers. The quality that I like the most about good managers, by far, is that they empower you. They listen to what you have to say about the challenges you face in doing your job, and they empower you. They listen and they empower.

Vera: I think there are a lot of qualities, but I think the main one would have to be understanding, because you need to understand what your people are going through. You have to be able to relate to them. So understand and be relatable, so understand what they are dealing with. Trustworthy. A leader; you’ve got to be able to lead your people and relate to them.

[END OF INTERVIEW RECORDINGS]

Carlos: So I also opened this up to Twitter not too long ago, and we had a few people chime in. Mike Armentrout, a former guest, he was with us on the Quilk episode. So Mike says, the fortitude to run interference, so work can be prioritized and focused on without constant firefighting.

Steve: And you used #enforcepriorities on that. So that’s a really good point there. I mean, I’ve seen managers who claim they have a priority list for you to work on, but your number one priority changes six times a day. Sure, there are some days where that’s just the way it’s going to be, but if that’s your everyday occurrence then somebody is probably doing something wrong. I think being able to run interference and keep people focused so they can actually get something done, rather than losing time to the thrash of shifting gears between projects, matters.

Carlos: Right, and I think again, we’re going to come at this from a manager’s perspective here a little bit, but I can see how this can sometimes be tough, because, we’ve talked a little bit about this, some people just like to be firefighters. And so this also plays in and can make it difficult from a priorities perspective.

Steve: Right, but I think that in dealing with that you always need to consider the amount of thrash that’s involved when you are switching from one project to another. You want to make sure that whatever you’re working on, if you have a new #1 priority, you don’t lose what you’re doing. You save it off so you can come back to it later.

Carlos: And the ability to communicate that, right? You get an email, you see the email notification pop up, and then they just show up in your cube, “Hey, did you get my email?” “Ahh, yes, the notification just popped up. I haven’t read it, consumed it, and come up with a response for you yet.”

Steve: You know, that’s interesting. It sort of comes down to, as well, how often do you read email? I know one of the things that I do when I’m heads down actively working on something that I need to get done, well, everything needs to get done obviously, but to get done quickly, sometimes I’ll ignore my email for an hour or two. People text and call, saying, “Well, did you get my email yet?” “Well, no, I’m working; in a minute.” I think the expectation that email is an immediate response mechanism is not always a valid assumption.

Carlos: Exactly. Yeah, you’re right, so culture kind of plays a role in there, right? Are you helping the culture when it comes to helping with priorities? So Jen and Sean McCown from MinionWare, they chimed in as well. They talked about the ability to listen and to hear. It seemed like that was a fairly common response when I was talking with people. So common answers: they don’t want to be micromanaged, they want their manager to listen and to help them succeed. And so I thought it was interesting that we tend to think of managers, those in authority, as people that are put there in a position to help us. I’ll say it’s a bit of a narcissistic view, almost, in a sense. Maybe not quite, but that idea of what I think you should be doing is helping me, and it’s easy to think that way because obviously you report to them. You have a responsibility to them and things like that. They have a responsibility over you. There are things that they can do for you that you may not be able to do for yourself.

Steve: But then also, if you look at the perspective of your manager’s boss, your manager’s boss’s view of your manager is that the manager is there to help them.

Carlos: Yeah, exactly. We actually then went another level. So I started asking some of the managers, “Ok, managers, what do you think your role is?” And so we have one voice recording and then we’ll get into some of the other emails that I got.

Karla: This is Karla Shields, the Executive Director of Computer Technology Institute of Central Piedmont. What makes a good manager is involving your team, engaging them, and being their biggest advocate so that they can succeed in this world of technology.

Carlos: So then I went ahead and reached out to a couple of my previous managers and just asked them. I said, “What do you think it is to be a good manager?” And one of them said that management is about watching the right things and supporting at the right time, which I thought was kind of interesting. That idea made me think a little bit about almost like an alert or a dashboard. You kind of get a pulse, or a beat, on what’s going on and then respond accordingly. And to take that scenario even further: are you a good alert? Or are you the kind of alert that’s eventually going to get ignored because you’re popping up all the time?

Steve: Right. Are you giving your manager the right information so that they can support you at the right time?

Carlos: Exactly. And I know a lot of times that’s very difficult. There is a balance here, right, because I know for a lot of folks, and we’ve talked about this kind of subject before, a lot of folks only feel like they’re having that discussion at the annual review. And that’s just not enough, right? That’s like I am not watching the dashboard. But then if you’re constantly popping in or sending emails every day and saying, “Hey, what’s my status? What’s my status?”, that’s probably not helpful either. I guess, how much information do you give so that they can feel like they know what’s going on, without also giving them the sense of, “Hey, by the way, I’d like you to micromanage me”, I mean, be more hands on than maybe you’d want.

Steve: That’s right, and I think that’s something where, if a manager is having a regular meeting with each team member individually, more often than their annual review, whether that’s once every other week, or once a month, or once every six weeks, who knows, I mean, I’ve seen any combination of those, it’s a good way for the manager to be able to talk and say, “Well, what do you need?” or “What can I do to help you?” But also, from the employee’s perspective, it’s a great time to be able to push the manager and say, “What can I do to better support you?”

Carlos: Exactly. There is that two way street.

Steve: When I was doing a lot of management, I was a big believer that when you do an annual review there should never be any surprises. Everything on there should be things that have been talked about throughout the year. That’s not always the case. There are a lot of surprises when it comes to annual reviews.

Carlos: Yeah, unfortunately. Ok, so on to another of my managers. A few things that he has learned: empathy and compassion go a long way, which I thought was kind of interesting. And then the second one, accountability; holding people accountable helps keep everyone honest, including yourself. And those two things almost seem a little bit at odds, empathy and accountability. I’m not saying they are opposites, but they can seem like opposites when you’re at one end of the spectrum or the other.

Steve: Right. Oh yeah, and I think that there is the right place for both of those. I mean, with the empathy and compassion there are certain things that are appropriate as a manager and certain things that are probably very inappropriate as well. I think it’s the difference between something tragic happening in your life and you coming in late today because of that, or you were in a car accident, that’s something tragic and you’re late, versus someone who is continuously or habitually late for meetings. Looking at that, I mean, that’s one of those where I’ve seen, and I’ve probably done it myself, a good job and a bad job on both sides of those.

Carlos: Yes, that’s hard, right, and they are learning too. I guess that is one of the interesting things. I mean, just like we, not all of us, myself included. I’m definitely in this camp, I didn’t become a DBA because I had deserved it. It kind of fell to me a little bit in a sense, and managers are sometimes the same way, right? There is a need and they get caught up in it.

Steve: And you know on that I think there is a lot of people who end up as managers because that’s sort of the career path that’s set. And that you’re a stellar technical person.

Carlos: And you’ve been around for a while.

Steve: And you’ve been around for a while, and a manager leaves, and the team grows so they need another manager and they say, “Oh, grab him or her.” They’ve been here … And sometimes you’re taking your absolute best technical person and putting them into a role that they may not want to do, but then oftentimes they associate a pay raise with that, so suddenly they want to do it even though it may not be something that they really want to do.

Carlos: Or really great at, because that’s different. I mean, the things that we’ve been talking about, soft skills, building culture, all these types of things, have nothing to do with keeping databases up or maintaining things like that.

Steve: Yup, absolutely.

Carlos: So Jim McCullen, who is a former guest from when we did our CIO Panel – Episode 54, said, “A good manager is someone that can lead and direct the team to success.” And then he gave a couple of follow-up ideas around how that could happen. So in order to do that they need to clearly understand the goal to be achieved and they need to understand how to motivate and communicate with their team and with the stakeholders. Like, ok, communication, big deal. He goes on to say, “I think a critical management skill is communication. When a project fails or a team gets demotivated it is usually due to lack of communication.” Now this was interesting because nobody else really brought this up. They all talked about listening, and listening is part of communicating, but I think the outbound stuff can also be important.

Steve: Right, and I think with that too, a manager with too many direct reports, and all the meetings combined with that, may not always be able to put out the communication that’s needed, which leads to that demotivation that was mentioned.

Carlos: Right. I think it can also be tough because, for example, let’s say you have ten reports, which is probably pushing it, but some teams are that large. If the manager has conversations with two or three people because they get along with them or whatever, communicates some idea to them, and then the team kind of disseminates that, it is not the same as the manager communicating with all ten, however that might be, whether that’s email, a team meeting, or whatever it might be.

Steve: An interesting quote that I heard many years ago when I was actually, I think it was in a management training class. The quote said, “People don’t quit jobs, they quit managers.”

Carlos: I’ve heard that.

Steve: I think that, I mean, it’s so true, because oftentimes you might love your job but hate your manager and you don’t want to be there. Or you might hate your job but you love your manager, so you’re willing to put up with the job you hate because of that.

Carlos: Right, because you know they’ll go to bat for you or whatever.

Steve: Yeah, and I think that can make all the difference in the world. And it’s just one of those things that I think about: people don’t quit jobs, they quit managers. From the employee perspective or the manager perspective, keep that in mind.

Carlos: Yeah, that’s right. I think an interesting idea here is that what makes a good manager is that relationship, and then of course it takes two to tango on that. So you’re going to have to put in some work on the relationship. It can’t be just one sided, and the same obviously goes for the manager. Don’t be afraid to give that input, talking about culture, things that have worked for you in the past, if they are open and willing to receive that. But at the same time we have to recognize that they have a lot on their plate as well.

Steve: But you also have to build that trust as the manager, or between the manager and the employee, to the point that you can be open. You can say, “Yeah boss, I’ve got a problem with this,” and know it is not going to be turned around and twisted on you for not getting your job done or whatever it may be. I’ve seen some managers who, when you bring up something like that, do everything they can to try and help you with whatever it is you’re having an issue with, whether it’s technical or soft skills or personal or whatever. But I’ve seen others where you say, “Oh, I’m having problems with this”, and they write it down as a goal – must fix this problem. And then you’re expected, ok, if you don’t fix that problem you’re in trouble. And that just sort of burns the bridge of trust there, I think.

Carlos: Yeah, so that idea takes me back to Episode 110 talking with Richard Campbell – Building Trust with My Team. Yeah, that does take time and you have to make some investments there.

Steve: Yup. I can remember a manager, just a short story on that, who burned the trust bridge really quickly, and this was probably 15 years ago. We had a web conference or GoToMeeting type call going on. We got into the call and got talking about ok, who’s here. I was in the role of a team manager at that point and the VP-type manager was above me at that point. And I had four or five people on the call. I was on the call, the team was on the call, and the VP, my boss, was supposed to be on the call but he wasn’t there. So we did the whole call, because we figured, ok, he’s not there, we’re going to proceed without him. And we get to the end of the call and as soon as the call ends, he phones me directly and just reams me out for how I behaved on the call, for how I’d worked with the team on the call.

Carlos: So he was listening in but did not state that he was there.

Steve: Yes. There were several times during the call when things came up that he should have chimed in on had he been honest about being there, but instead he was just there to sort of entrap people, to get people. So rather than just jumping in and saying, “Well, what about this, what about that?”, he just kept notes, and afterwards he came back and sort of made a list of all the things that he didn’t like that I did on the call.

Carlos: Obviously, we would probably not include that in the list of good managers.

Steve: And that was interesting because I quit that manager. In fact, I resigned from the company I was working for, and when I resigned I said, “I’m happy to work here but I will not work for this person anymore.” And they reassigned me to a different place. And I ended up working there for a couple more years and I enjoyed it.

Carlos: Interesting, yeah.

Steve: You got to be able to sort of be understanding but also to realize when the person you’re working with cannot be trusted.

Carlos: Yeah, you can’t stay there. And that’s a topic that we kind of get into later: what to do if your manager is not giving you that option. Obviously you need to build trust, but going to HR, if your organization is large enough to have one, and most at least have someone responsible for payroll, gives you, if nothing else, a shoulder to cry on. Having some of those avenues to talk matters, because I think one thing that can be damaging to culture and to that trust is to start talking about your managers behind their back, that water cooler talk, because you never know if that’s going to get back to them and it can be really damaging. If you don’t like something, voice it to the manager, but don’t voice it to everybody else and tell them how crappy the manager is; it won’t do you any favors.

Steve: Yeah, there is an old expression called “complain up”.

Carlos: Oh, there you go.

Steve: Where if you’ve got something to complain or gripe about, you should bring that to your manager, not take it to your peers or co-workers.

Carlos: Right. And I remember, now this was for a startup and they were going through some growing pains. I can’t remember exactly how many people, but I want to say there were about 40 people at that point. But they were growing, they were moving to a new office space, things like that. And I remember one of the things they always kept asking for was patience. You give us a suggestion, you give us feedback; just because we don’t turn around on it in 10 days doesn’t mean that we forgot about it. And that could be one of those things too, very similar to that, again back to that dashboard alert. You want to give that feedback again if you let some time pass, if enough time has passed and you feel like you haven’t gotten a response. Feel free to try that again; maybe you’ll get an, “Oh yeah, I’m sorry, here’s a status update.” Get that pulse and let your manager know that you’re still thinking about it.

Steve: Yup.

Carlos: Well, interesting. Thanks everybody for chiming in and giving us some of your thoughts. We do appreciate it. We like to have these kinds of collaborative episodes and this is just another good example.

Steve: Yeah, definitely. What do we call it? The non-technical, sort of more soft-skill side of things. Those topics have been interesting lately. I think we’ve done a few of those over the last several months. It’s just a different take on things. I like it a lot.

Carlos: As always you can give us your thoughts and feedback on social media. You can hit me up via email, [email protected] You can reach out to us on LinkedIn. I am at Carlos L Chacon.

Steve: And I’m on LinkedIn at Steve Stedman. And we’ll see you on the SQL Trail.

Episode 120: SQL Injection

Shouldn’t they have fixed that instead of putting out all these new features?  That might be what you all thought when you saw the title for today’s episode.  SQL Injection is still a big deal in today’s databases and we are pleased to have Bert Wagner on the program to talk with us about how it can affect you and the applications you protect.

One of the most difficult aspects of dealing with SQL Injection is deciding who is responsible for it.  Bert does a great job giving us some insights on what he has seen work.  We invite you to give us your comments about how you have gone about trying to prevent a SQL Injection attack.

Little Bobby Tables

Episode Quotes

“SQL Injection is essentially when you have a dynamic string that you create in SQL that’s getting executed and it ends up doing something that you didn’t intend to do.”

“When it comes to security it never solely depends on one person.”

“It doesn’t even matter if your database is kind of public knowledge or not, someone is going to be able to guess it.”

“The best thing you can do to protect yourself against dynamic SQL Injection attacks is just get rid of dynamic SQL.”

“Once again injection attacks only can happen with dynamic string execution.”

Listen to Learn

00:04 Introduction of the guest speaker (Bert Wagner)

00:38 The famous SQL Injection meme

01:19 What is SQL Injection and possible SQL Injection attacks

02:45 How to know if there is SQL Injection attack in your system?

07:43 Thoughts about dynamic strings, sp_executesql, dynamic SQL

10:38 Dynamic SQL and parameter sniffing issue

16:37 Misconceptions about SQL Injection

23:20 SQLMap

23:58 Tips on how to prevent SQL Injection

34:21 SQL Family Questions

SQLMap

About Bert Wagner

Ever since watching hackers first try to break into his website in the late 90s, Bert has been fascinated by the world of security. When not building secure web applications and working with SQL Server by day in Cleveland, OH, he enjoys blogging and vlogging about SQL Server at bertwagner.com. Away from a computer screen, Bert is an avid outdoorsman and all around do-it-yourself-er.

Transcription: SQL Injection

*Untranscribed introductory portion*

Carlos: Bert, welcome to the program.

Bert: Hey guys, thanks for having me.

Steve: Yeah, it’s great to have you on the show. I know we chatted at PASS Summit at our SQL Trail event. I think that was a lot of fun, good to meet you.

Bert: Yeah, it was a lot of fun.

Carlos: Yes, and this is one of the very cool things about having a podcast like this is that we get to have a little bit of pleasure, right, it’s the Trail Mix, and then do a little business as well and talk about our favorite subject, which is SQL Server, so we are glad to have you on. Now ultimately, our topic today is SQL Injection, and I’m reminded of the meme out there. It’s stick figures, and you get a parent or something on a phone and then you see the caption, it’s like, “Did you really name your son Robert'); DROP TABLE Students;--?”, right?

Steve: Yeah, why did you name your son that?

Bert: It was a famous little Bobby Tables, right?

Carlos: There we go, little Bobby Tables, that’s right. And so that might set the stage a little bit for that idea of SQL injection. Talk to us about SQL injection. What it is and some of the problems and then we’re going to start from there.

Bert: Sure, so SQL Injection is essentially when you have a dynamic string that you create in SQL that’s getting executed and it ends up doing something that you didn’t intend to do, right? A user is passing in some parameter value that is then changing the content of that dynamic string that you built and is causing the query to perform an action that you weren’t originally intending. In essence, that’s what a SQL Injection attack is. To give an idea of the range, it could be something not very malicious at all. You could inject just random code that won’t really do anything, just to maybe test out whether the server is vulnerable to SQL Injection or not. Then on the opposite end of the spectrum, you can go all the way to querying system tables to learn more about the data, or querying and retrieving the full content of other tables. You could modify and manipulate data, so it’s not just read-only. It’s really any command that you can think of that you can potentially execute through a SQL Injection vulnerability.
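
To picture the pattern Bert is describing, here is a minimal T-SQL sketch; the table, column, and input value are made up for illustration:

-- Vulnerable pattern: user input concatenated straight into a dynamic string
DECLARE @SearchName nvarchar(200) = N''' OR 1=1; --';   -- attacker-supplied value
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.Customers WHERE LastName = N''' + @SearchName + N''';';
PRINT @sql;      -- the WHERE clause has been rewritten by the input
-- EXEC (@sql);  -- executing this would return every row instead of one customer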

Steve: Interesting. Now, with that, it seems like there are two categories there: one category that could do damage, and the other category where they are just browsing, they are borrowing, they are stealing, they are taking some internal data. And I think with that, is there necessarily any way to even know if someone has done that to your system? If there was a vulnerability there, to know if anybody ever hit it?

Bert: Right. I mean, the only way you’d be able to tell is through logging. If you are having users input free-form data into a website or your application, which is then maybe kicking off a stored procedure or some ad hoc query, and you’re logging that information, you’d be able to tell. But if you’re not doing that kind of logging then you might not know. It really depends. Not knowing whether you have an injection attack or not becomes a big issue because if you don’t know, then you don’t really know the validity of the data that is on your server, right? A lot of times people think SQL Injection, why does it matter, and the first big thing that comes to mind is someone is going to steal all our database data. They are going to steal our usernames, our passwords, our highly sensitive data about our customers, and obviously that’s a really big problem, but that’s not the only problem that you get. Like what you’re alluding to, data validity might be a problem if you don’t know that someone is maybe manipulating data on your server. You’re running a shopping cart and they want to give themselves a really good discount, so they are updating the prices in your products table.

Carlos: Our airline ticket is a little too expensive, right?

Bert: Yeah. I mean, that’s a major problem there if you don’t know what’s happening. There is no guaranteed way of finding out. And then to just round it off, another major issue with SQL injection is just that availability of your server or your application. That’s another thing most people don’t think of is if you’re able to write any kind of SQL code you want and inject it, you could potentially write code that will tie up your server or potentially disrupt access for other users if you just lock everything in the database and no one else can access your app. That’s downtime for your application and that causes another big problem too.

Steve: Yeah, that’s interesting because when people start chasing a performance issue or blocking issue, very rarely do you ever think, “Oh, could it be a SQL Injection attack”, that somebody is messing with you.

Carlos: Right, and to that point, who ultimately then is the owner of this? Now, most of our listeners, most of our companeros, are data folks in general. We do have some developers out there that listen, but I can see this very quickly turning into a, “Well, that’s not my problem” type of issue. It’s almost like, no, it’s a security issue; no, it’s whoever is writing the stored procs’ issue. And so I guess maybe we should start digging into, from a DBA perspective, how can I know, or what should I be looking for, to see if SQL Injection is even a problem for me. Is there a good test there?

Bert: Yeah, I guess to answer your first question with finger pointing. I feel like when it comes to security it never solely depends on one person.

Carlos: Sure. It takes a village.

Bert: It takes a village with security. The more layers you have, the better off you are typically. And so whether you’re a DBA and you have injectable code in a stored procedure that’s on one of your boxes, obviously you are responsible for that even though it could be a developer who wrote that code. But it should also be their responsibility to not write this type of code. And then if you’re in a large company that has whole groups devoted to security, it should be on their radars too if they are running different pieces of software that profile the types of data going into your servers. It’s on them too. Really, I don’t think any one person is responsible. I think we are all responsible. And in terms of a review, maybe if you are a DBA and you’re getting a new server, you don’t know what’s on that server. You want to see if you’re vulnerable, right? Because the last thing you want is to get an email saying, “Hey, why did you cause our data to get lost?” Yeah, I mean, there are a few different things you could do. None of them are 100% foolproof. You know, I’ve written some scripts on my blog that basically look at various system views and search for queries inside your procedures and functions that may have dynamic string execution occurring. You could pretty easily search the definitions of procs and views and everything else to see whether dynamic strings are being executed, and then that will help you narrow down where you could start looking to see if you have injectable code.
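
Bert’s scripts aren’t reproduced here, but a rough sketch of that kind of search against the system views might look like this; the LIKE patterns are only examples:

-- Find procedures, functions, triggers, and views that mention dynamic execution
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS SchemaName,
       OBJECT_NAME(m.object_id)        AS ObjectName
FROM sys.sql_modules AS m
WHERE m.definition LIKE '%sp_executesql%'
   OR m.definition LIKE '%EXEC (%'
   OR m.definition LIKE '%EXEC(%';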

Carlos: Ok, so I’m looking for “executesql” in my stored procedures and then I could start testing there. Is that basically the …?

Bert: Yeah. The reason that’s not foolproof is even though you can say, find me where my definition text is like “execute” or like “sp_executesql”. One, if you find things it doesn’t necessarily mean that they are injectable. But two, it doesn’t account for all the ad hoc queries that might be coming to your server. I mean, that’s only searching your procedures, but if a developer has hard coded a SQL query into their app, you’re not going to catch that in the system definitions there.

Carlos: Which I think is not trivial because I think about all the ORMs, right? That’s 90% of what they are doing is creating that code for the developer into the database.

Steve: So jumping back to the whole sp_executesql piece for instance. One of the things that I ran into a couple of years ago, I was dealing with parameter sniffing issues. And I’d been to PASS Summit, I think that was in Charlotte that year, and I’d seen Kimberly Tripp talk about using dynamic SQL as a way to work around some of the stored procedure parameter sniffing issues you had. I came back, learned from that, and adapted some stored procedures to work that way using dynamic SQL, safely with parameters of course. But then immediately everyone jumped on it saying, you can’t use sp_executesql ever because you are going to allow parameter sniffing. Sorry, I said that wrong, you are going to allow SQL Injection. And I think that’s one of the sort of misconceptions: simply using dynamic SQL that’s being executed doesn’t necessarily equate to, yes, you are allowing SQL Injection in. Would you agree with that or do you have any thoughts on that?

Bert: Right, so I mean, dynamic SQL exists for a reason. I know, like you’re alluding to, there is a lot of negative association with it because of the injection problem. But there are really good things that you can do with dynamic SQL, like you’re saying, right, parameter sniffing. That’s one way to potentially solve the parameter sniffing problem. There are other things where if you have an application and you need the ultimate performance to be extracted from it, sometimes the only way you can get that performance is by writing a dynamic SQL query. Or maybe you need to vary the output of your result set, right? And dynamic SQL is the only way to do it. There are a lot of very valid scenarios to use dynamic SQL. It’s just that SQL Injection could be a side effect that you need to be careful of.

Steve: Right, and I think that’s the key there is making sure that where you are using a dynamic SQL that it is safe from the SQL injection perspective.

Carlos: Right, so what are some of the red flags there that you are looking for? Once you’ve identified a stored procedure as using dynamic SQL, what’s the next step?

Bert: Yeah, so it’s an interesting problem. I personally think that the best way to understand how to protect against it is to fire up your own test database. I mean, don’t do this at work or anywhere you might get flagged by security; try this kind of stuff out at home, unless you are in info security and it’s your job to test it. Try it out and that’s the best way you’re going to be able to learn how it really works and how to protect against it. But one thing you can look for is whether you are concatenating parameters into your strings. That’s probably the biggest thing to watch out for, because if that’s happening you are basically allowing input data coming from a user, malicious or otherwise, to be appended to the SQL string that you’re building dynamically.

Carlos: Ok, so knuckle-dragging Neanderthal that I am, I feel like I have to ask this question just from our previous conversation. The whole reason I’m using dynamic SQL, and a very simple example, is SELECT * FROM table WHERE parameter = my stored procedure parameter. Right? It will allow people to pass that value in, so if that’s just the one value, if it’s an equals, is it ok? But if I’m adding like parameter1 + parameter2, is that where I get into trouble?

Bert: Yeah, so if you think of that exact example you gave, where you’re building a string SELECT * FROM table WHERE parameter =, and that’s all a string, and then you’re concatenating in a parameter, that is potentially vulnerable to SQL Injection. And I would argue that in that specific example you shouldn’t even be concatenating a parameter to a simple string like that to begin with. That is a query where you’re parameterizing the value of a WHERE predicate, for example. That’s something you can parameterize and use, for example, sp_executesql to safely execute. You shouldn’t necessarily be using dynamic SQL to execute that kind of statement to begin with. And that’s a problem that I’ve seen with people coming from maybe developer backgrounds, not to single anyone out; full disclosure, I’m a developer.
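
As a rough sketch of the parameterized version Bert is suggesting (again, the object names are made up):

-- Safe pattern: the value travels as a parameter, never as part of the string
DECLARE @SearchName nvarchar(200) = N''' OR 1=1; --';   -- same malicious input
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.Customers WHERE LastName = @LastName;';
EXEC sys.sp_executesql
    @sql,
    N'@LastName nvarchar(200)',
    @LastName = @SearchName;
-- The input is treated strictly as a value, so the injected text never executes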

Carlos: Oh man! We’re going to have to vet our guests a little bit better, Steve.

Bert: I mean, depending on your background building a dynamic string where you’re concatenating user input values to a query might be acceptable in whatever language you’re coming from. Actually, if your knowledge in that subject is maybe a little older, so it’s not necessarily that people are doing this knowingly or they want to write injectable code. It could just be that their background is that’s the correct way to do it or that’s an okay way to do it, that’s how I’ve always done it. But it’s not necessarily safe secure code.

Steve: Yup, and I think from the perspective of application code that’s making a call into SQL Server, usually when somebody gets started and learns a new programming language or a new interface to talk to the database, the examples are there without parameterization; they just show you concatenating something. So when somebody jumps in and is just learning it and they haven’t learned the value behind parameters and how to use them, it’s just that somebody doesn’t know any better.

Bert: That’s so true, Steve. I mean, so SQL Injection just as a quick background has been around forever, right? This is not something that’s new. This is not something that’s even in the past decade. This has been around since the 90’s. It’s been around with SQL Server from SQL 6 and 7. It’s been a problem for that long and it continues to be a problem for that long. And I think you’re exactly right. A lot of those beginner tutorials that you follow. They are just trying to teach concepts of here is how to do something or here is how to write a query. And they’re kind of foregoing the whole security aspect of it and that’s unfortunate.

Steve: And whenever I see one of those I always try and go to that next step to understand how to use parameters. Not just from the SQL Injection perspective but also from the performance and reusability perspective.

Carlos: Are there any kind of misconceptions out there that you see around SQL Injection that people are commonly confused with or get wrong?

Bert: Yeah. I definitely interact with people where maybe they are aware of SQL Injection and kind of what it is, but they think, “Ok, this doesn’t apply to me for a bunch of reasons.” One of the things I hear is, “It’s ok. I don’t need to really protect against SQL Injection because the structure of my database isn’t public, so my attacker isn’t going to be able to know what to query.” That’s a huge misconception for multiple reasons. One is that a lot of the databases that we use have really easy to guess table names and column names, right? A lot of databases probably have a products table or a users table. It doesn’t even matter if your database structure is kind of public knowledge or not, someone is going to be able to guess it. But then taking it a step further, there are great ways in SQL to find out the structure of your database. Like sys.objects and things like that will just tell you all the tables and columns in your database. Malicious users know about that, and so even if they don’t know the structure of your database they can very easily find it out. Another common misconception I hear, if I follow up on that, is, “Ok, well, I obfuscate my table names”, which I hope you don’t do. But once again, using something like sys.objects or sys.columns is going to reveal that information. So it doesn’t matter if your columns are called A1, A2, A3. To a hacker or someone trying to get your data, it’s not going to stop them at all.
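
For reference, the kind of metadata query Bert has in mind is as simple as this:

-- Enumerates every table and column the current user is allowed to see
SELECT t.name AS TableName, c.name AS ColumnName
FROM sys.tables AS t
JOIN sys.columns AS c ON c.object_id = t.object_id
ORDER BY t.name, c.column_id;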

Carlos: Now having said that, if, and that’s a big if, you’re using an application user that just has read and write to that database, aren’t those objects then no longer available?

Bert: Yes, that’s a great point, Carlos. That’s one of the things when I’m trying to write secure code and trying to protect an application from SQL Injection. You know, one of the number one things that you want to do for all the code that you’re writing that accepts user input parameters is to lock down the user that’s executing the code, to kind of minimize damage. It still might not fully protect you from SQL Injection, but it’s going to limit what that malicious user is able to find out or do in your database.

Carlos: Right, so they are going to work a little bit harder which may or may not be their prerogative.

Bert: Right. I mean, if you take that user and you only give it read access, there is no way they are going to be able to modify data, delete data, anything like that in your database. They might still be able to read the contents of a table, but it will be limited to that table or that schema or that database depending on how well you protect that login.
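
A minimal sketch of that kind of lockdown, assuming a hypothetical application login named AppReader already exists on the server:

-- Map the login to a database user and grant read-only access
CREATE USER AppReader FOR LOGIN AppReader;
ALTER ROLE db_datareader ADD MEMBER AppReader;
-- No db_datawriter or db_owner membership, so an injected UPDATE, DELETE,
-- or DROP from this account fails with a permissions error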

Steve: So one of the misconceptions that I came across, I’m just curious what your thoughts on it might be. Let’s say it’s a web system and there are thousands of webpages that are accessing the database, and you’ve got to go through and obviously make sure that every one of those is SQL Injection safe. But one of the misconceptions that I experienced in a management situation was that we found there were SQL Injection problems in a system. We presented it to the management team and their response was, well, there are two types of pages in the site: there are pages that you can see before you actually log in to the system, and there are the pages you can see after you log in to the system, so we know who you are. And the response was just, “Well, let’s just make sure all the pages that you don’t have to be logged in to see are safe.” And this is a public site used by thousands of people across the world. But let’s just make sure the pages that are public, that don’t require a log-in, are SQL Injection safe. We’re not going to worry about the other ones because those are logged-in users and they would never do anything like that.

Bert: Yeah. I mean, there are a bunch of red flags there, right? But yeah, you need to protect against SQL Injection everywhere. It depends on your application, but I’m sure many people have created multiple Twitter accounts or multiple Facebook profiles. So what’s going to stop someone from creating a fake account in that system, right? And even though they are logged in, it doesn’t mean anything. People who want to get into systems are really good at covering their tracks. It doesn’t matter if they are authenticated into your app, unless you’re tying your users to their passports or something like that and you’ve got some very highly secure, verified application where you’re not just letting anybody register. But even then you still want to protect against injection.

Steve: Right. And this is one where anyone could just sign-up for a demo account or trial account in the system they were in.

Bert: Yes, that’s a big red flag, especially because nowadays, another misconception I’ve heard a lot is, “My website, my application is so small. No one would ever try to attack me.” Like, I’m selling boutique garden gnomes online and I have 50 customers a year and they are all really into garden gnomes. I know none of them are malicious, right? The fact of the matter is that it’s not like someone needs to be actively searching for injection vulnerabilities on your site by hand, like going into the log-in form and trying different things. There are plenty of tools that hackers have to automate that kind of attack, and they basically just go scan the internet and have these tools automatically test for injection vulnerabilities, just to find which sites out there have them so they can potentially get the data and do various things with it.

Carlos: Yes, scary stuff.

Bert: Yeah, but it’s also cool. So one of the tools I want to mention is called SQLMap. It’s an open source tool that’s used for automating SQL Injection testing. If you have an application and you really want to test it out, you can use these kinds of apps yourself. These apps work kind of both ways. They help out the attackers, but they are also going to help you out on the defensive side, to actually test your own applications to see very quickly and easily whether your application is vulnerable to SQL Injection attacks, so it works both ways.

Carlos: Right, very cool.

Steve: Well, I guess if you’ve got someone who’s listening and maybe this is their first exposure to the topic of SQL Injection, is there anything that you might recommend or any tips you may have that may help with preventing it? Like where would you start first if this is brand new to you?

Bert: Sure, so usually this is kind of how I evaluate and try to protect against SQL Injection. First thing is, do I need to be using dynamic SQL? Because like we kind of talked about earlier, a lot of times it could just be that someone wrote a query that’s dynamic and is concatenating parameters because that’s the only way they know how to do it. But if you’re just concatenating a WHERE predicate value, you don’t even need to be doing that. Then just get rid of your dynamic SQL. You can just pass in a parameter to your query and it will evaluate perfectly fine without needing to build a dynamic query string. So honestly, by far the best thing you can do to protect yourself against dynamic SQL Injection attacks is to just get rid of dynamic SQL. The first thing to check is always: do you actually need to be using dynamic SQL?
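
In other words, when all you are doing is filtering on a value, a plain parameterized procedure like this hypothetical one needs no dynamic SQL at all:

CREATE PROCEDURE dbo.GetCustomersByLastName
    @LastName nvarchar(200)
AS
BEGIN
    -- No dynamic string is built, so there is nothing to inject into
    SELECT CustomerId, FirstName, LastName
    FROM dbo.Customers
    WHERE LastName = @LastName;
END;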

Carlos: Yeah, why was this put in place?

Bert: Just kind of a common sense sanity check. Can I write this a different way and still get the same result without making myself vulnerable? Because, once again, injection attacks can only happen with dynamic string execution. If you don’t have that dynamic string execution, you’re good to go.

Carlos: Sure. I guess I do want to make one point where we’re kind of talking about how everybody needs to pay attention. One additional thought that I had, and Troy Hunt was talking about this, was more about SSL certificates on small sites. But the idea was that, yes, going back to the garden gnomes, they might not be trying to attack your site, but they may be trying to get into your site to then send malicious stuff to somebody else. And that’s an even bigger problem because now you get blacklisted, and so your 50 customers go to 0. Somebody’s going to blacklist you and all that other stuff.

Bert: Right, that’s a great point. That was a great blog post from Troy Hunt there about that. And just on SQL Injection in general, if you ever want to know numbers, it’s hard to get numbers, but Troy Hunt runs this website “haveibeenpwned”, which, if you’re not signed up for it, you should be. It’s basically a notification service for when your username shows up in a data breach that gets exposed. But if you go to their data breach page, you can just do a CTRL + F find on the webpage and search for SQL Injection. And you will see all of these companies, huge companies, I’m talking like Yahoo! and Sony, who specifically have had data leaked because of SQL Injection attacks. This is a really serious deal that affects everybody. It stinks for those companies, right, for their users’ data to get released. It’s not just you by yourself. I mean, this is a major problem that affects everybody.

Carlos: I agree. And then of course if you are subject to it, that’s not fun. That’s not fun for the managers because you don’t know the extent of it. You may not even know where it is. So then all of a sudden they are kind of throwing money at a problem to try and stop it and they don’t even know where it’s going.

Bert: Yeah. I mean, so you want to do your best. Option one is just get rid of that dynamic SQL if you don’t need it. After that, let’s say you evaluate your app and you do need dynamic SQL; you’re doing one of those things where it is valid to use dynamic SQL in your database. The thing you would want to try to do is use something like sp_executesql, which will parameterize your dynamically built queries. And so that is the safe way to allow input parameters that you pass in to be executed as part of a dynamic SQL query string without falling vulnerable to that injection attack. Now, sp_executesql has its downsides though. Although you can pass a dynamically generated SQL query into it to execute, you still can’t parameterize everything with it. So things like table names, right? You wouldn’t be able to pass a table name in as a parameter; even using sp_executesql, it won’t work. A lot of times a table name might be something that you do want to parameterize.

Steve: Oh, and that’s an interesting one. Do you have a good option for how to do that?

Bert: Yeah. The best option would be to use the QUOTENAME function in SQL Server. And that just basically escapes characters. By default, if you don’t pass in any parameter besides the string that you’re escaping, it puts brackets around it to kind of make it a delimited object name, and that will protect you for sure. The downside to using QUOTENAME, so QUOTENAME is the best solution if you can’t use sp_executesql, the downside is that its input is limited to 128 characters. So if your input for some reason is longer than 128 characters, you need to start getting a little creative with what you do, and that opens you up to potential problems.
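
A quick sketch of that pattern, with a made-up table name to show what QUOTENAME does to hostile input:

-- QUOTENAME wraps the value in brackets and escapes any ] characters inside it
DECLARE @TableName sysname = N'Customers]; DROP TABLE dbo.Students; --';
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.' + QUOTENAME(@TableName) + N';';
PRINT @sql;  -- the whole input becomes one bracketed object name, so the injected
             -- DROP never runs; the query simply fails to find such a table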

Carlos: Right, then you start concatenating and putting all these things together. Yeah, you’re kind of back to square one.

Steve: So there is one example I saw where they were passing in a table name that needed to be concatenated into a string. And there were only 5 or 6 options, 5 or 6 possible table names they could pass in. So the solution they came up with was, instead of just concatenating the parameter, they put it inside of an IF statement that said, if it’s table name 1 or table name 2 or table name 3, it injects not the parameter but the actual text of what that table name was. If it doesn’t match one of those known table names, it falls through and aborts out of the stored procedure.
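
That whitelist approach might look something like this sketch; the procedure and table names are hypothetical:

CREATE PROCEDURE dbo.SelectFromKnownTable
    @TableName sysname
AS
BEGIN
    -- Only table names on the known list are ever placed into the string
    IF @TableName NOT IN (N'Customers', N'Orders', N'Products')
    BEGIN
        RAISERROR('Invalid table name.', 16, 1);
        RETURN;
    END;

    DECLARE @sql nvarchar(max) = N'SELECT * FROM dbo.' + QUOTENAME(@TableName) + N';';
    EXEC (@sql);
END;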

Bert: And that’s, I mean, if you’re able to do that, that is great. And that’s not just on the DBA side; if you’re working with your developers, the first line of defense for these types of attacks is the developers. It’s the app code. They should be doing all these things too. They should be sanitizing their inputs. They should be checking that the input data is a valid entry. Like if there are only 6 table names, it should be one of those 6 names, and then if you’re able to do that similar kind of check with an IF statement in T-SQL, all the better. The problem starts to crop up when you have a more complex input, or a larger quantity of values that you’re inputting. If it’s 6 tables or 10 tables it’s pretty easy to handle. But once you get to the realm of many more than that, and you start wanting to write what at the time seem like smarter validation functions or sanitizing functions, that’s where you get yourself into trouble, because it’s really hard to write a function that’s 100% secure that kind of validates data like that.

Steve: Very good point.

Bert: And so what I’ve seen a lot is people will use the REPLACE function, for example. One common technique to prevent SQL Injection is to sanitize your input of single quotes, right? Because if you are trying to inject some code, usually that injected code is going to use a quote in it somewhere to end one statement and help start another statement. So what people will do is they will try to write a REPLACE call that replaces single quotes with a set of two single quotes, to kind of escape that quote and prevent the attacker from succeeding at what they wanted to do. While that works great for some scenarios, it doesn’t work in all scenarios. That’s the big caution with trying to write your own sanitization functions in SQL Server using something like REPLACE: it’s not always going to work, and it’s nearly impossible to think of every scenario that an attacker might try. Even if you’re somehow able to do it, that doesn’t mean that whoever ends up maintaining your code is going to stay on top of it forever as new features come to SQL in the future. That’s just really a big problem there. And like we mentioned, locking down the user account that is executing your SQL queries makes a big difference. That’s something I would implement in all scenarios for sure.
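
For context, the home-grown escaping Bert is cautioning about usually looks something like this; note that it only covers quoted string values:

-- Doubling single quotes handles plain string literals, but it is easy to miss
-- other contexts (numeric predicates, identifiers), so it is not a full defense
DECLARE @Input nvarchar(200) = N'O''Brien; DROP TABLE dbo.Students; --';
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.Customers WHERE LastName = N'''
    + REPLACE(@Input, N'''', N'''''') + N''';';
PRINT @sql;  -- here the whole input stays inside one string literal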

Steve: Oh yeah. I don’t know how many times I’ve seen a web system at a place that runs as the SA user. That’s one of the first things I always want to get changed because it’s just so dangerous.

Carlos: Oh yeah.

Bert: Yeah. I mean, that opens you up to everything.

Carlos: Should we go ahead and do SQL Family then?

Steve: Let’s do it.

Carlos: So Bert how did you first get started with SQL Server?

Bert: Well, I started my database expeditions in MySQL probably when I was 11 or 12 years old.

Carlos: Oh, starting young.

Bert: Yeah, just running a PHP website, coding my own blog, having tons of SQL Injection vulnerabilities there, and that’s actually where I learned. That’s where I learned about SQL Injection, actually, with MySQL. I would look at the logs and say, “Ok, what’s all this weird 1=1 input that people are submitting?” That’s my start with databases. I obviously didn’t know much back then. I still don’t think I know much now, but that kind of opened the doors to getting hired at a Microsoft shop where they have SQL Server. Yeah, sure, I know all about relational databases. I’ve built websites using MySQL, and so that’s kind of how I got started there.

Steve: Pretty cool.

Carlos: If you could change one thing about SQL Server, what would it be?

Bert: Well, if you had asked me this question a year ago I think I would have a different answer. But I’ve been really impressed with how much the speed of development has picked up with SQL Server in the past year. My big wishlist item would have been to just get more features out faster. It seems like they are doing that. I’m really satisfied with that. I guess the one thing I would still like to change is for them to take a release, maybe not a major release, but just take the time and really polish the existing things that are in there. I’m talking about things like maybe making error messages more user friendly, so instead of just telling me some data got truncated, point me to that data so that I know and I don’t have to figure it out on my own. Or if I run out of space.

Carlos: Which line? Dang it!

Bert: Yeah. You know, for me, if I’m using a tool every day those little kinds of things make a big difference in how happy I am and how happy I am to use the tool. So that would be huge for me, because all the features are great, I’m happy with them, but just polishing all the rough edges would be great.

Steve: Ok. I like that.

Carlos: What’s the best piece of career advice you’ve received?

Bert: My favorite career advice that someone told me once, I guess the popular one is, “Fake it till you make it.” But someone has their own modified version. There’s this photographer, Chase Jarvis, and he always talks about “Make it till you make it”, which is basically: just keep doing what you’re doing and eventually you will get to where you want to be by continuously improving and getting better at whatever your craft is, right? In his case it was photography. But for me it’s, I want to become better at SQL Server. The only way to do that is just to keep doing things with SQL Server, pushing myself to learn new things, and blogging.

Carlos: Making mistakes and then learning from those mistakes. That’s the big thing, right? The fear of failure can hold us back sometimes.

Bert: Right. I mean, hopefully I don’t make any injection mistakes in production, but you always get to learn.

Steve: Ok, so our final question. If you could have one superhero power, what would it be and why would you want it?

Bert: So thinking about this, I think I would want to be able to control time. Not be able to go back in time 10 years or 20 years or something like that, but have like an undo button where I could just kind of go back in time some limited duration, like maybe three minutes or five minutes. Not only would that prevent lots of “Oops” scenarios where I delete something that I don’t want to, but I guess I could always put in those last-minute bets on crazy sports event outcomes that no one expects. I think that would be pretty cool. Then you don’t have to deal with all the ramifications of changing history and the whole future outcome being different. So I think three minutes back wouldn’t be too bad.

Steve: Ok, so the time control undo stack. I like that.

Carlos: Well, awesome. Well Bert, thanks so much for being on the program today. We do appreciate it.

Bert: Yeah, thank you guys. It’s a pleasure.

Steve: It’s great to have you Bert. We learn some things along the way, too.

Episode 119: Should I get involved with GitHub?

At first glance you might think our podcast topic is from way out in left field; however, as you think about it there are a number of reasons we should be talking about GitHub.  Even if you don’t need a public source control repository for creating code, GitHub is becoming the preferred place for creators to publish their content and the ability to interact with others is extremely valuable.  In this episode we give some of our thoughts around getting started with GitHub and how it might differ from some of the other options out there.

 

Don’t forget to leave your ‘Tips and Tricks’ on the podcast page.  If we get enough submissions, we will start that new segment in January.

Episode Quotes

“Although it’s not directly a SQL Server thing, it’s one of those things that is really important to be able to understand.”

“That’s an interesting way to use GitHub because you’re almost using it as a distribution mechanism.”

“It’s probably the most popular repository now.”

“When people see public repo on GitHub it’s really just assumed that it’s there for community distribution.”

“As long as you’re using source control to keep track of what you’re doing and you’ve got the appropriate ways to distribute it, GitHub is great for that, but it’s not the only tool out there.”

Listen to Learn

00:12 Introduction about the episode topic (GitHub)
02:34 Companero shoutouts
03:53 What’s coming up: Tips and Tricks and Database Health Monitor Webcast
06:53 SQL Server in the News
09:55 Show notes links
10:42 GitHub as a repository for DBAs and IT professionals and other usage of GitHub
16:40 Private or public repository – pros and cons
24:41 GitHub licenses
26:38 The “Read Me” file
28:30 GitHub interaction (Browser, Desktop App, Bash Shell commands)
31:45 RedGate SQL Source Control, Visual Studio
35:15 Defining differing terminologies
35:55 Thoughts about using GitHub if you’re the only one working on a project?

Lock down your Azure resources

Transcription: Should I get involved with GitHub?

*Untranscribed introductory portion*

Carlos: Companeros, welcome aboard the SQL Trail, this is Episode 119.

Steve: Yes 119, and today’s topic is getting started with GitHub. That’s quite a few things lined up to cover here today.

Carlos: That’s right, and so obviously GitHub being a source control product, and there are a lot of flavors here. When you think about GitHub you can probably put yourself in one of a couple of buckets here. One, you’ve never heard of GitHub.

Steve: Right, and at this point you are wondering what the heck we are talking about.

Carlos: So we hope to provide a little bit of information there and maybe some at the end of the program, if you haven’t got started, now you’ll have at least some direction to go further. Maybe the other bucket is you’ve heard about GitHub but you’re scared.

Steve: Maybe scared is not the right word but you might just not know where to go.

Carlos: Sure, yeah, intimidated is the word I would have used in describing myself, and I still kind of feel that way to a certain extent. So maybe you’ve used source control, but you don’t know Git per se. And then there are those folks who should be doing this episode with us.

Steve: Right, the real pros who want to share the information.

Carlos: Yeah, exactly, who are using it. I’m not sure how valuable this episode will be for them. I mean, obviously we are going to share some opinions or thoughts on some things, but we are not going to be uncovering, particularly in an audio format, any deep dark GitHub secrets. And then the last group: those who have drunk the Kool-Aid and basically it’s GitHub or bust.

Steve: Right, yeah, and those are the ones if you’re using anything else you are unworthy basically.

Carlos: Our conversation is kind of, with an emphasis on GitHub, but we admittedly are going to break away a little bit from that. We will be referencing things like TFS as well. And so if GitHub is your thing and hearing about other repositories is going to make your ears bleed, this episode is not for you either.

Steve: But before we really get into the details there, do we have any companero shoutouts this week?

Carlos: We do have a shoutout. I want to give a shoutout to Nikki V from the Netherlands. Nikki V left a comment on the website talking about Episode 115. He mentioned, “I love your fun and honest review of the SQL Trail.” And so we appreciate that and I hope to see you Nikki on the SQL Trail someday.

Steve: Nice, and then we have another one from Andy Levy. He liked Episode 117 which is our wrap up of Summit.

Carlos: That’s right, and we enjoyed kind of hearing some of the comments there, and we enjoyed meeting him at Summit obviously. Another fellow that I met, I’m not sure if you’re connected with him Steve, but Justin, who I believe is in Alabama now. We were talking and he mentioned a script that he had to help find potential foreign keys and he shared that with me. I haven’t been able to go through it just yet, but I’m grateful to him for sharing it with me and it may actually come in handy on a project that we are working on, so interesting idea. We mentioned it last week, we will mention it again: now through the end of the year we want to start collecting your tips and tricks. We’re going to start adding a tips and tricks section, very similar to SQL Server in the News. However, we want this section to come predominantly from you the listener.

Steve: And that could be really in two flavors there. I mean, one could be, here’s your tip or trick and how exactly how to do it or the other could be you saw somebody do this and it was really awesome but you didn’t quite figure out how to do it, and maybe we can help with those too.

Carlos: Yeah, there you go, any of those combinations. You can go to sqldatapartners.com/podcast. On that page you can either leave in the comment for submission of ideas or tips and tricks or you can use the online recording and record a segment of what’s your tip or trick is.

Steve: Yup. Then coming up on December 14th we have a special event plan.

Carlos: That’s right, and I should say probably a recurring event. Something else we’re trying or adding into the mix here.

Steve: Yup, and this is the Database Health Monitor Webcast. And when we say it’s recurring, we’ll probably do it every month or maybe every couple of months but it’s not going to be the same every time. We will cover different topics or different parts of the program.

Carlos: That’s right, so we’ve tested out our video equipment and it hasn’t broken just yet, and so we’re thinking we will add a little video to the mix here. These are going to be short 30-minute segments, you know, very similar to the podcast, but it’s going to be focused on Database Health Monitor and kind of monitoring those interactions with your system.

Steve: Yup, and with that it will be a live broadcast.

Carlos: That’s right. We will record it and make it available through YouTube or whatnot afterwards.

Steve: But people can attend live through the go to meeting connection, correct?

Carlos: Correct. So at sqldatapartners.com/webcast, you can go there and sign up for the next episode, and we’re going to do those the second Thursday of every month, is the plan.

Steve: So then if somebody has questions, we’ll do our part of it and then at the end we will open it up to Q&A. That’s going to be fun.

Carlos: That’s right, so we’re hoping everyone finds that valuable and it’s just another way to connect with you all, something in a live setting and again not hours and hours. But very similar, you’ve seen other folks out there in the community doing something similar and we’re looking to put our spin on it. And with that, I think it’s time for a little SQL Server in the News. And it’s in this segment only because, one, we were just talking about it with a client, and then I happened to be looking at the Microsoft blog and there was actually an article about this last month. That’s when I said, “Oh well, if they are talking about it, we should be talking about it.” Ultimately, it’s this idea of securing your Azure VMs or Azure SQL Database and restricting access by IP address. And so you and I were talking, and ultimately we had a scenario where a client wanted to give us access to their VMs and we got a username and password. And I’m like, “Well, I haven’t given you my IP address, how am I going to connect to this thing?” I’m like, well, let me just try really quick, and I could connect. And I was like, yeesh. So we talked a little bit about this and I’m not sure. I guess I was thinking maybe that was just with Azure SQL Database, but it sounds like they even changed that. Maybe it’s just my process now, right, where I’m locking that down automatically, adding those IP addresses, so that it’s just something that I do. But basically the idea is that by default those are exposed to the internet. So if you know the IP address you can basically start to try to get in there, and that’s not a good idea.

Steve: Yup. But I think part of what they do is they have to do it that way so that you can get connected there to begin with, before you’ve actually done anything to lock it down. So I think a lot of the time people just overlook that, and I think that goes for Azure SQL databases and for Azure VMs.

Carlos: Right, well I guess I feel like I can do that in the dashboard. I don’t actually need to connect in order to make those firewall rules enabled so I guess that’s a puzzling thing.

Steve: But I would bet that if they made it so that you had to set that up before you could actually connect, so you limit where you can connect from, a lot of people would not get past that because it’s not entirely intuitive how to do it.

Carlos: You know, I agree, and there are a million ways to go ahead and do that, right? So if you have a point-to-point VPN then that’s different than me connecting from my home and just wanting to give my IP address, because they’re going to ask who’s going to be connecting to it. There are some complexities in there and I kind of see that, but I think if that’s something that you’re working with, if you are doing something in Azure, that’s a conversation you want to start having. Don’t start spinning up those databases and just leave them there, right? You need to make sure you’re talking to your network folks and making sure they get locked down.
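
For Azure SQL Database specifically, a server-level firewall rule can be added in the portal or with a call like this sketch, run in the master database of the logical server; the rule name and IP addresses are placeholders:

-- Allow connections only from a known address range
EXECUTE sp_set_firewall_rule
    @name = N'OfficeNetwork',
    @start_ip_address = '203.0.113.10',
    @end_ip_address = '203.0.113.20';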

Steve: Yup, absolutely. Alright.

Carlos: So our show notes for today’s episode is going to be at sqldatapartners.com/github.

Steve: Or at sqldatapartners.com/119 for our episode number.

Carlos: Ok, and so as we are putting this together, obviously we are recording just before Thanksgiving, so happy Thanksgiving. And hopefully everyone had a happy Thanksgiving, because this will come out after Thanksgiving. And we were thinking about what we would be talking about, and so this idea of GitHub, at least in my mind, ties into the changing nature of the DBA position, and I feel like I am hearing just a lot more about GitHub. Yup, and so I thought we’d talk a little bit about why that is and then kind of get into it from there.

Steve: Well, what’s interesting here Carlos is when you first brought up the topic of GitHub I kind of thought to myself, “Well we’re doing the SQL podcast. Why are we talking about GitHub?” And then after looking a little bit and thinking about it a bit more I realized how much we actually use it as DBAs. Although it’s not directly a SQL Server thing, it’s one of those things that’s really important to be able to understand what it is and how that works.

Carlos: That’s right. And I’m reminded, I’m not sure if the quote is attributed to him, but I heard it first from Andy Leonard, and the quote he had is, “There are really two types of DBAs or IT professionals. There are those who are using source control and there are those who will be using source control after some disaster.” And again, ultimately that’s what we are talking about, a source control technology, but it almost seems in a sense like that is becoming the de facto way that people are starting to push content, and when I say people, particularly Microsoft, right. The documentation is now on GitHub. You can still see it on the website, but as far as being able to interact with it, having people contribute to it, that’s now an option. That’s how the people who are writing documentation for Microsoft are actually contributing, pushing to this GitHub repository.

Steve: Right, and I think that’s an interesting way to use GitHub because you’re almost using it as a distribution mechanism. And some extent with the ability to get feedback and I guess if you’re accepting code changes from the community then that’s different because you’re actually taking changes in.

Carlos: Right, and I think we can get into that, and maybe the pros and cons or good and bad of what that means. But yeah, I think you’re right. We are kind of using it as a way just to distribute that information. It seems like that’s where the code samples live anymore. If you want a sample database, they came up with WorldWide Importers or Wide World Importers, whatever it was; it was kind of the next version of AdventureWorks, another sample database. But they’ve actually come out with others since then, and basically they’re like, “Oh, go and get it on GitHub.” I haven’t seen links for, “Hey, download it here.” It’s like, go to GitHub. There are all the samples, and people are changing them and kind of massaging them there. I think about the Cosmos DB samples and things like that, for example. It’s all there and you’re like, “Oh, gosh! I don’t know how to interact with that.”

Steve: And you know what’s interesting is it might be maybe a little bit more trusted there, because you can go and see the history and go and get more information on it. I mean, of course, coming from Microsoft is different, but if it’s coming from some third party tool vendor out there that you may have never heard of, do you trust what’s on GitHub more than what’s on their website? I don’t know, maybe.

Carlos: Yeah, that’s interesting. And I think again kind of the influence of the open source community, in that we should have visibility into how the soup is being made, if you will, definitely contributes to that, so that is interesting. So a couple of other reasons quickly. If you’ve been using CodePlex, for example, you could put together a tool, throw it up there, and kind of share it with the community. As of December 1, that’s going to be read-only. So if you want to keep doing things with, for example, the PAL tool, one of the tools that I have used in the past, the creators or maintainers of those applications won’t be able to use CodePlex anymore and they are moving everything to GitHub. And the last one is, you see a lot of employers starting to ask for that in the interview process.

Steve: Yeah, this is one I saw a couple of years ago where they actually said, “I don’t care about your resume, just give us your GitHub username and we will look at what you’ve done.” Which I thought was maybe a little bit much, because you miss a lot of stuff; there are a lot of people who have done amazing things in their career that didn’t necessarily happen on GitHub publicly.

Carlos: Yeah, exactly. That kind of emphasis probably focuses too much on a particular technology stack, and potentially a lot of people would fall by the wayside. But I think you have to agree that it’s probably the most popular repository now, right?

Steve: Oh yeah, absolutely.

Carlos: And so because of that, it’s almost like if you’re not on Facebook, right? It’s like, “Come on!” Everybody is there. Even if you’re not posting there all the time just that ability to log in and kind of connect with the rest of the world. I kind of feel like that’s what GitHub has become in a sense.

Steve: I think that, I mean, there is a lot of different uses there for what you can do on GitHub. I mean one, is something big like Microsoft distributing sample databases. Another option is you’ve got something like Brent Ozar does with his sp_blitz and other scripts that he does where he distributes those but he also get a lot of feedback and changes from people. And I think that’s a great way to use it, to be able to take those changes back and be able to integrate them in your base code there.

Carlos: Yeah, I mean, DBA Tools is another example of that.

Steve: Yup. But then there is like let’s say you’re a company and you’ve got your private intellectual property that you want to keep private for whatever reason, you’re making a product. There are GitHub options that you can pay for that will give you private repos, private on-site repos with GitHub. That’s very different than sort of the free open source collaboration method there too.

Carlos: Yeah, exactly. And I guess that’s an interesting question. Let’s go ahead and dive into that for a bit. Ultimately, with GitHub you can have a repository; that’s kind of your folder, your project, that’s where all the files are going to go for that project. And you can have private versus public. I think it makes a lot of sense: I am a private company, I’m selling a piece of software, I want to keep that private, it’s the IP that I have. Now the question is, the way we kind of frame it or talk about it is that it’s open source. You want people to see what you’ve done. Even if maybe you’re not having people collaborate with you, you know that body of work may still be important to you. And so does everything have to be public, or do you still have things that are private versus public? Admittedly, everything that I have out there is public, which is not very much. Let’s just say I have two repositories, so please don’t think that I am slinging all this code or anything. The two repositories are for the Zero to SQL book, where I put up the scripts to create the database and made it available from a distribution perspective, very similar to what we’ve talked about. And then the other one was our database setup scripts, and we are hoping people will contribute to that and maybe get that started.

Steve: But there are other uses too for private repos. I mean, I have quite a few more than you do there. One example of what I do is I have one that’s called presentations, and it’s private, but it has every presentation that I have done at a conference, at a SQL Saturday, PASS Summit, whatever, for the last 7 or 8 years. So if I remember, “Oh, that was in a presentation I did 5 years ago, I want to go find that code,” it’s all there in that private repo. Now, some people might argue that it would be better to put that out on a public repo, but there is a lot of stuff that I’m working on there that may not be ready for public consumption yet.

Carlos: Right, and that's kind of another interesting idea. It's almost like the policy of your repository and its purpose, because you almost have little islands, different reasons why people are putting out those repositories, and they are not all the same. Like you mentioned, everything might not be ready for primetime, and so that's an interesting take. And however you decide to set that up, I think you need to be clear about it if you're going to make a public repository. You need to be clear, like, "Hey, this is just for me," or "This is part of my body of work," or however you want to say that maybe a bit more eloquently. But make that known upfront. And another vein could be even as simple as, "Hey, here is our policy on things like case or formatting," coding standards, there you go.

Steve: But I think when people see public repo on GitHub it’s really just assumed that it’s there for community distribution and that you’re hoping that people will fork it and then use it on, and then contribute and push back into what you’ve done. I think that’s kind of the assumption but that’s not always the case there with public repos.
Carlos: Sure, yeah. Maybe that’s a philosophical question at this point. But why would you put out a public repo if you didn’t want anybody to contribute to it?

Steve: That’s a great question.

Carlos: But we do know that there are people that are doing that.

Steve: If you have a public repo and you didn't want anyone to contribute or add to it in any way, would you be bothered if they did? Like if somebody came along and came up with some super cool enhancement for whatever it was you had there, and they did a pull request so that you could review it and maybe integrate it back into your code base. I mean, there are some out there I've seen where they're just going to say, "No. I don't want it. I don't want anything anybody contributes, because that's not mine." That's not my mentality, but there is some of that out there.

Carlos: Well, and in their defense, we've talked about this on other episodes; go back to the community contribution episode that we did, where we interviewed Brent and Chrissy. It takes a lot of time. I mean, it's one thing to write the code, but anybody who's done it knows about branching and merging all the stuff back in, and then testing. You are now taking responsibility for all of that as the owner of that repository. There are pluses and minuses there as well.

Steve: And some of that may depend on how complex it is. I mean, there's a lot of code out there in that DBA Tools project. It's many orders of magnitude different from the setup scripts that we put in our public repo. I mean, one is a single page and the other is gigantic.

Carlos: Right, lots and lots of functions in some of those. Ok, so ultimately, dealing with that, whether you go private or public, you have that ability to say, "Yes, I want to share with the world" or "No, I don't." Like you've talked about, most people are thinking, or the expectation is, that you go public because you want to share it, you want people to contribute to it, but then there are potentially some consequences with that. Having said that, I guess I would say, typically if you're new, and I still consider myself to be new to that community, you know, we've put those things out there. We have like a couple of watchers but we haven't actually had anybody contribute, and so I don't think you should be so afraid. I wouldn't let that stop you, I guess is what I'm saying, because it's going to take a lot of time and branding and all of those things until it gets noticed or picked up. Almost like your blog in that sense.

Steve: Right. But the other thing to be aware of too, if you're looking at someone's public repo on GitHub, is how often it's been maintained. When was it updated? When was it last changed? I know, for instance, I had a project from probably 6 years ago that was like a weekend thing, where I thought, "Oh, this is going to be cool. I want to make it available for everybody," and I put it on GitHub and I haven't touched it since. I hope that people who are looking at that are not judging what I know based off of what I did on that project, because it was just sort of a weekend thing and I never quite finished it. I probably should have gone back and deleted it, but I never did.

Carlos: Yeah, that's right, so there are all kinds of projects out there. Now, another thought or question I had was around licenses. At least in GitHub, when you go up and create that repository it's going to ask you about a license, which admittedly I don't know very much about. I usually just take the default and put it up there. If you're just getting started and you want to put something up there, should you be concerned about the type of license you choose?

Steve: Well, I think only if you care about any kind of liability around your code or where it can go or who it can be used by whether people can make money out of your work. I mean if any of those things matter to you then yes you should be concerned about the licensing.

Carlos: Yeah, got you.

Steve: There are some that say, "Use this in whatever way you want, but not for commercial use. You cannot make money off of it." And then there are others that say, "Use it in whatever way you want, but you can't modify it." There are appropriate licenses for almost every scenario you can imagine out there.

Carlos: Yeah, but again, I guess I wouldn't get too bogged down in that. I think there are defaults, which I can't remember off the top of my head anymore.

Steve: Yeah. There is the GNU licensing, there is the MIT licensing. There are a handful of other licensing options in there. I don't remember them off the top of my head either. Just take a look at it at least to see what it is, but don't let that stop you. It might be that you create a repo and you realize, "Oops, that's the wrong license. Maybe I have to go back and change that later."

Carlos: Yeah, there you go, and asking around never hurts, particularly if you know somebody else in the community. Your developer friends will be able to help you with it. The other thing, and I don't know if it's specific to GitHub per se, I'm not sure there is an equivalent in TFS, but one of the things it can do for you when you create that repository is a little checkbox to create that "Read Me" file.
Steve: Oh yeah, and that’s super helpful.

Carlos: I think that "Read Me" ultimately, again thinking under the premise of it being a public project, is there to share the rules, right?

Steve: Right, but even more than that, that “Read Me” is really the main visual representation of all info around your project when somebody first visits your project page.

Carlos: The kind of why.

Steve: Yeah. And then that may link off to many other pages but it sort of like what is your project, why are you doing it, like what are the rules around it.

Carlos: Right, exactly. And put some time and thought into that. Again, it doesn't have to be fancy, but at least let people know, "Ok, the only reason this is here is because I want to add this to my body of work," or "Hey, this is what I'm looking to build, I would love for you to join me," that kind of thing.

Steve: Yup, definitely worth looking at for any GitHub project that someone else created that you're viewing.

Carlos: Yeah, and that would probably be another great way to get some ideas, you know, peek around at some of the other ones. Like we mentioned, Brent's stuff is out there, the DBA Tools stuff is out there. I know that there are some others, like all the CodePlex stuff that's moving over. If you can find your project now in GitHub you can take a peek there, because they are all going to have "Read Me" files and things like that. Now, once we've kind of put all this together, obviously my first interaction with GitHub, and I would say 90% of my interaction with GitHub, has been through the browser, but there are plenty of tools available that you could use so you don't have to go through the browser. I don't know if it's the most popular or the most common, this is according to Carlos, I don't have any numbers to back that statement up, but there is the GitHub Desktop app. You can download that, get started, and it will allow you to easily upload and modify your files and interact with your repository.

Steve: Yup. Another thing available there is the Git bash shell, which is what I personally use, and it's basically a bash-type command shell if you're familiar with UNIX or Linux bash. It has all the Git commands in there and you can do everything. Kind of like a DOS prompt, but in a shell window that lets you not have to use your mouse and click through the GUI. I think it comes down to what you like more yourself. I mean, are you more of a mouse type person or are you more of a keyboard person? For me the bash shell is great because I like the keyboard more than the mouse.

Carlos: There you go. And I think once you get the commands down then it becomes a little easier. Now, the one knock against the desktop app is that you'll use it and you'll never learn those commands. There is kind of a jump that you may have to take, but a lot of tutorials, like on YouTube and whatnot, go over the common ones. I mean, bringing the repository down locally so you can work on it, saving the file, and then getting it back up there. I think that's half the battle right there.

Steve: And then sort of halfway in between those is Visual Studio. If you are working in Visual Studio you have an integrated Git project there, and you can do a lot of the Git commands through just the normal Visual Studio check-in section.

Carlos: Yeah, those are nice things, at least from the Microsoft perspective. If you're looking to get started and you have those tools, I think the barrier to entry is a little bit lower because it's going to be in an environment that you're more familiar with. And we talked a little bit about those repositories already, so now I'm curious: you came from a developer background. Did you even mess around with the desktop app, or did you go straight into bash?

Steve: No I didn’t even try the desktop. I just went straight to bash. I mean, I’ve been using source control for 27 or 28 years now. I mean, most of it has been command line so that’s kind of what I learned on as a kid and grew up with.

Carlos: Right, ok. From a slightly different perspective, I did not grow up writing bash. If you're familiar with the RedGate stuff, their source code tool.

Steve: The SQL Source Control.

Carlos: Yeah, SQL Source Control, there you go, that's the name of it. That might be your first step. It's a very SSMS-driven interface that allows you to easily get those objects and kind of put them up, then you get into the integration with TFS if you're using that, and then kind of going into the Visual Studio stuff. I think more and more people are going to start going that route. If you haven't already, your teams are kind of expecting that; that's their interface. I know we had problems, for example, and I'm sure there is a way around it, but it was difficult for us working with developers that were also writing the scripts and putting them in the RedGate tool. They wanted to use a single tool, and at least at that time, this was a couple of years ago, Visual Studio and RedGate were not quite compatible, and so we ended up just having to use Visual Studio, and I think that's probably common for a lot of people.

Steve: Oh, interesting, because I know I've used the RedGate SQL Source Control product in a software development environment where developers were making code changes and checking it all in through Management Studio using RedGate's product.

Carlos: Yeah, exactly. But then their code on the frontend side they’re going to check in through Visual Studio, right?

Steve: Oh yeah, absolutely. And some of it was done through other tools besides Visual Studio. But that's true, the beauty of it is that when you use something like RedGate's product it's just part of a source control project. It doesn't have to be that way for everything. You can use Visual Studio, you can use other stuff. Git is very open to a lot of different tools like that.

Carlos: Ok, yeah. There you go. I think where I was going is that they wanted, and for whatever reason, the developers didn’t want to have two tools to be able to check in code and so they just wanted to do everything through a single repository.

Steve: Got you, so that makes sense. Right.

Carlos: I guess we kind of talked about this, but I'll bring it back, so if you haven't gotten into that. Again, I only have two repositories, so it's not like I'm a GitHub pro or anything, but I feel like get out there, at least create an account. Now, this brings up the question: maybe you're not quite ready. Maybe you have some things that you're working on, some scripts, so again, I'm coming at this from a body of work perspective; I want my future employer to know who I am and be able to see some of the things that I've done, right? I am happy to share, but I'm not really interested in the contributions side of it. Should we still encourage people to use GitHub if they are the only ones working on the project?

Steve: Let me back up a little bit on a couple of terminology things before we answer that one.

Carlos: Ok, here we go.

Steve: Ok, so we've got Git, and Git is basically a source control product. And then we have GitHub, which happens to be a hosting vendor that has free repos, paid repos, and private ones. There are also other vendors out there like BitBucket, which is another Git-hosted, cloud-based solution. The UI tools are different when you're working through the browser than what you have with GitHub, but at the command line it's exactly the same thing as when you're working from the GitHub desktop. So when we say you're working on a project on your own and you don't really want contributions on it: I would recommend yes, you should be in source control. Does that have to be public on GitHub? No. Will it help you next time you're in a job interview and someone wants to look at code examples you've worked on? Yeah, perhaps it will. But if there is intellectual property you've created in that source control that you don't want to give away to the world, then you should still have it in source control, but it probably shouldn't be public in that case. Does that answer what you're getting at there?

Carlos: Yeah. I mean, I think so. I think that question comes up particularly if there are scripts that you've written, and I'm thinking, for example, of Justine's script that he gave to me. I'm not sure that he is ready to share that with the world just yet, but it's one of those things that you could put out there. You know, he didn't put that together with the intention of making money on it. If you have those types of things that you could put out there, I think it makes a lot of sense to try it out. Be involved in the community and just help your name get out there.

Steve: Yup. You know, I can think of one. At PASS Summit, one of my presentations was on backup and recovery, and I've shared with people a script that allows you to build your restore script through your log chain. That was one that I have in source control in my private repo, and occasionally I just put it in a .zip file and put it out on my website. However, thinking through that, that's the wrong way to do it. I should just take that one and put it in a public repo, because all the source code is available to anyone who wants to use it just by downloading the script, and it would be a lot easier to maintain if somebody has feedback or a change they want to make. That one would be much easier to have in a public repo.

Carlos: Right, and so that's interesting. We kind of talked about this, actually before we started recording, this idea that in certain cases I wouldn't want to go to GitHub, I'd just download it from the person's website. I think some of that was maybe pre-GitHub, or pre open-source source control if that's the right terminology, because that was the only way to distribute some of that stuff. Now we kind of have this de facto way of doing it: instead of me having to take the code, put it all together, compile it if that's what I have to do, and then zip it up and put it on my website, once I have uploaded it to GitHub, everybody else can just kind of take it from there. You save yourself a little bit of hassle.

Steve: Yup, and I think the difference there is, like that restore script I was talking about from my demo, man, that's a single file. If somebody wants to go and get it, all you have to do is click one link to download it on GitHub, or click one link to download it on my website. Either way you're just getting a single file. But if it's something more complex, like Brent's blitz scripts where there are quite a few of them that do different things, it might be easier to just grab it off of a website as a zip versus grabbing a handful of different files to download. And I haven't really looked at his stuff there too much, but maybe he's got a bundle or an install package now.

Carlos: Right, well I think even in that case there are probably different scripts. Are you going to use all of them? It's almost like, I only care about this specific code right now, so instead of downloading four files I'm only going to use one of them.

Steve: Yup, and I guess I probably should have gone and looked at Brent's GitHub setup before we talked about it. I would have been much more informed on that one. But I know I looked at it when he put it out there several months ago. I just haven't been back since then.

Carlos: Sure. Ok, so ultimately I thought this was kind of an interesting conversation around how we can start interacting better with GitHub. Again, I have an account. It's not like I have tons of repositories out there, but I feel like I'm connected; at least try to go through the motions. I think more and more it's going to become, I won't say required, it won't be required, but I think we'll just continue to see it more and more. However, they are not the only game in town. There are lots of different options, we talked about a couple, and I think ultimately our message is, "Yes, use source control, and use what makes the most sense for you."

Steve: Yup, absolutely. As long as you’re using source control to keep track of what you’re doing and you’ve got the appropriate ways to distribute it, GitHub is great for that, but it’s not the only tool out there. There are lots of them.

Carlos: Yeah. Ok, I think that's going to do it for today's episode. Our music for SQL Server in the News is by Mansardian, used under Creative Commons. As always, you can connect with us on social media. We would love to hear your thoughts and feedback. You can leave comments on our website, and we do still allow reviews on iTunes, so those of you who are iTunes users, we'd love for you to leave a comment there. You can connect with us on LinkedIn. I'm at Carlos L. Chacon.

Steve: And you can get me on LinkedIn at Steve Stedman and we’ll see you on the SQL Trail.

Episode 118: How has index maintenance changed?

After having attended PASS Summit and seeing all the new features coming out, it can be very easy to overlook the basics–they are boring.  These features have been around forever.  This may be true, but they still play an important role, and ignoring them won't help get you those fancy new features any time soon.

In this episode, our topic is index maintenance, with an emphasis on what has changed or what might change.  Is the Query Store feature going to impact the way we maintain our indexes?  We invite one of our favorite guests, Sean McCown, back to talk with us and give some of his thoughts.

Episode Quotes

“You can’t really change the math of re-indexing and maintenance.”

“And part of the problem is not with that, but with not having time to do maintenance is allowing business people to run the show.”

“Rebuilding an index is absolutely rebuilding the index pages and putting them in the order that they are supposed to be.”

“It’s not the stuff that they add that I think is ridiculous. It’s the stuff they don’t put in.”

Listen to Learn

00:06 Introduction about the speaker and brief outline of the episode topic
01:12 Updates on MinionWare and what's changing on re-indexing
05:00 Problems that arise from having no time to do maintenance
10:18 Fragmentation of indexes
15:04 Do SSDs and other premium storage change data access?
19:17 How to determine how the fill factor should be different?
22:51 SQL Server: Analytics on database
27:49 The difference between re-indexing, meaning rebuilding, reorganizing an index, or statistics
33:02 What stands out with minion re-index?
43:27 Re-index 2.0 features
49:00 SQL Family

About Sean McCown

Sean McCown is a Certified Master in SQL Server 2008 and a SQL Server MVP with 20 years of experience in databases. He is also founder and co-owner of the MidnightDBA.com website, where he records free SQL Server training videos and co-hosts the popular web show DBAs@Midnight.

 

Transcription: How has index maintenance changed?

*Untranscribed introductory portion*

Carlos: So Sean, it has been a little while, welcome back to the program.

Sean: Thanks guys! It has been a while but not that long. I’m still getting people talking to me about the last episode I was on.

Carlos: Is that right?

Sean: Oh yeah, I get a lot of people at SQL Saturdays and whatnot telling me how much they listen to you guys and that they heard about me and the stuff I'm doing through you guys, so yeah, good for you all. I didn't realize you were so popular.

Steve: Well, it’s always good to have you on the show, and I think that helps with the popularity.

Carlos: That's right, our MVP guest. Yes, and I guess it's been over 20 episodes; episode 90 was the last time we had you on.

Sean: Oh wow!

Carlos: Talking about DBCC CheckDB. Yes, good to have you back on, and we're always trying to keep up with what you're doing. So today, ultimately our topic is going to be a bit about re-indexing. What's in store for that? I think everybody has a story about re-indexing or an opinion about re-indexing, and I think even the future of where that's going might change a little bit. So I guess let's just jump off: why don't you give us a little update on the MinionWare stuff. What's changing on the re-indexing side?

Sean: Wow, so first I want to say we’ve been watching a lot of Voyager recently so I want to say your opinions about re-indexing are irrelevant.

Steve: What? The resistance is futile?

Sean: That's right. Well, because to me, I see this a lot. Companies seem to take pride in it. I'm going off on a tangent a little bit, and it may even have been the one I went off on last time, but I see companies do this a lot, because you're talking about opinions. You get these companies that brag about how busy they are. "Oh well, our business is so slammed, we don't have time to do re-indexing. We don't have time for maintenance. We are so slammed." You know, while that's a cute sound bite, you can't really change the math of re-indexing and maintenance, right? I mean, it's nice to think that you don't have time for it, but that's like saying you don't have time to take care of yourself but you're going to have time for the heart attack.

Carlos: So in that event, I guess, do you believe that that is one of the reasons, an additional reason perhaps, that people are looking to the NoSQL world? Again, not to say there aren't indexes there, but they definitely seem to be second-class citizens.

Sean: Yeah, people were looking to the NoSQL world because of the hype. They are advertising a gazillion transactions per hour and it's so much easier. It takes a lot less maintenance because it's got so much less overhead and all of that. But now they are getting bit in the ass by what? The fact that there are no keys on the table, so your integrity isn't enforced and whatnot. So they are like, "Hey, but I've got corrupted data now." "Oh, duh, you think so? Maybe you should add some of that overhead back in." The stuff is here for a reason. I mean, NoSQL has its place. There are some times when it just doesn't matter, when that integrity is a lot less important or when you have a threshold for that sort of thing, right? For having what we call logical corruption. And you need the speed more than you need to have absolutely everything lined up. But most of the world needs some sort of integrity.

Carlos: Right, you want some protection, you want some help in keeping your data, if you want to keep it around, right?

Sean: Yeah, so no SQL is really more hype than anything if you ask me. Yeah, that’s all I have on that.

Steve: Ok, so jumping back then to sort of "our business is so busy that we don't have time to do that." I think oftentimes I hear that and I see that, and quite often it comes down to a misunderstanding of some of the fundamental maintenance items you need to do. I see oftentimes somebody will go through and do a complete rebuild of all their indexes and then, right after that, they will rebuild all their statistics.

Sean: I see, yes.

Steve: And they will rebuild the statistics with like a 10% sampling or something like that, when, if you rebuild your indexes, your statistics get updated too. So you're actually being wasteful rebuilding those statistics right afterwards.
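To make Steve's point concrete, here is a minimal T-SQL sketch; dbo.SalesOrder is a placeholder table name. An index rebuild already refreshes that index's statistics with the equivalent of a full scan, so a sampled statistics update right afterward only replaces good statistics with worse ones.

-- Rebuilding the indexes also refreshes their statistics with a full scan.
ALTER INDEX ALL ON dbo.SalesOrder REBUILD;

-- Doing this right afterward is redundant and actually downgrades those
-- statistics to a 10% sample.
UPDATE STATISTICS dbo.SalesOrder WITH SAMPLE 10 PERCENT;

-- If anything still needs attention afterwards, it is the column statistics
-- that are not tied to an index.
UPDATE STATISTICS dbo.SalesOrder WITH FULLSCAN, COLUMNS;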

Sean: And part of the problem is not with that, but with not having time to do maintenance is allowing business people to run the show. I mean, you get PMs and you get Sales Managers, and you get VPs, you got C-Level people and they are saying they don’t understand the database and they say, “We just don’t have time for that. We just can’t do it. I don’t have time, I can’t do that.” “We’ve got customers. We need to keep them online.”

Carlos: Well, in their defense, it's because they've probably tried to do it. They haven't done it in 10 years, then they try to rebuild, and it hits them because the business is impacted in some way, because it runs over. It's still running at noon and nobody can get into the database, and so that's where they're like, "We'll never do that again."

Sean: Or they have a DBA, and this kind of merges into what you're saying, a DBA that just doesn't know how to do maintenance well.

Carlos: Right, and we've all been in that camp. I think I may have actually told this story the last time we were on. I had been hired by a law firm to help with a conversion of the financial application, and that was like the flagship project that I had come on for. There were two DBAs. The other one was actually in London at the time; this was in Richmond, Virginia, where we're located, so he was in London helping with some office and they were doing some conversion over there. And then the document management system had been having problems, so a decision had been made, somewhere, by somebody else. The DBA in London had talked about reorganizing the index, and that's what had been approved. And for some reason, we're talking and we decided we're going to rebuild the thing instead of reorganizing it.

Steve: Let me guess on Standard Edition as well.

Carlos: Standard Edition as well. So the issue was not what you may think it was. The issue wasn't locking everything else out. The issue was disk space. It started to rebuild, and it creates that second copy, and there wasn't enough disk space on the drive to hold both copies of the index, and that had already taken like 2 hours or some such thing. An attorney had needed to get into the system to get a document and they couldn't. And then we were waiting for it to roll back, and so that caused a big hoopla because we were not paying attention to what we were doing. We were all tired. It was like midnight when this was happening. We were both involved in our own projects and we made kind of an in-the-moment decision when that probably wasn't the best choice.

Sean: Right, I had a situation years ago with a big client who hadn't done maintenance ever, and they were being charged something like $40,000 a month in missed SLA fees from their customers. And they were about to both close the site, and then the other sites were about to go to Oracle because SQL just couldn't keep up with their level of business. Ok, sure, if you say so. And so I got involved, and they were 99% fragmented on almost every table in the database. So I got their maintenance taken care of and even did a little bit of scale-up for them, put some indexes on different LUNs and whatnot. And I told them, and they were a true 24/7 shop, I said, "Now we have to go down 4 hours every Sunday to do maintenance. We have to do this." And he was like, "I can't do that. I can't afford that time." So okay, fine, right?

Carlos: What about the $40,000 a month?

Sean: So like a week later, Sunday, they decided to skip their maintenance. That following Tuesday, they had to pull an emergency downtime to do their maintenance because the database had gotten so slow. Now they had gotten a taste of the good life, and the database had gotten so slow they could barely work. And this is the way they'd worked for years. So they did their maintenance that Tuesday as an emergency. The following Sunday they got ready to do their maintenance and the Plant Manager said, "We can't bring everything down. We can't do this. We've got to skip it." And the General Manager said, "Stop. No. Take it down, do the maintenance every Sunday. Don't ever miss another one." So we made a believer out of him for sure.

Carlos: Yeah, that’s funny. I think maybe we should pause just for a second because I’m thinking all of a sudden about Eduardo Cervantes and some of our other developer friends who are listening. I guess I feel like maybe we should just revisit or rehash some of these terms that we’re talking about. And so I guess even the idea of index fragmentation, right, you mentioned 99%. Maybe let’s just take a couple of minutes and talk about the very beginning, so we have a table, if there’s a primary key on it, you have the clustered index and obviously you have other non-clustered indexes. How do we go about getting fragmentation on our indexes?

Sean: Right, so my favorite analogy to this is with the CD case and it seems to be the one that everybody can associate with.

Carlos: You’re showing your age there a little bit, Sean.

Sean: I’ve got 4 CD books sitting over there.

Carlos: Ok, for you millennials, right? We used to carry our music around.

Sean: Yeah, exactly.

Steve: Yeah, and it wasn’t on our phone.

Sean: That's right. So let's say you've got a nice big CD case with all your movies in it, back before Amazon had it all digital. And this is going to dovetail into fill factor as well. So you alphabetize all of your movies in there, so you've got them in a certain order. That, by the way, is your clustered index: all of the movies are physically in order in this book. And so let's say they come up with another Batman movie, Batman Forgot His Comb in Penguin's Lair, and that page is full. What do you do? You go to one of the pages at the end of the book, you put the Batman in there, and then you put a little post-it note on the current B page that says, "Bs continue on page 75." And you get enough of those where you're having to put them in the back of the book and make different markers to different things, and that's fragmentation, where you don't have everything in order. You've got to go here for part of it, then over here for part of it, then over there for part of it, because everything starts getting out of order because you haven't left room to put anything else in there. So that's what we mean by a fragmented index.

Steve: And the effect of that being out of order is if you’re trying to find that Batman movie or something else that starts with B, it’s going to take you a lot longer to find it because you’re starting in your B section and you have to flip to the back and jump around quite a bit more than if they were all in the expected location.

Sean: Absolutely, and if you put a different letter on that same Batman page and then you get another B, well, that would be way out of order too, and you're going to leave a post-it note on that one to say where the next B section starts. I mean, it can get way out of hand. However, let's say your CD book held four CDs a page and you were to only put in two CDs on each page, because you know you might get some more CDs. They might come out with another Batman movie out of nowhere. So on that first page you will only have the first two Batman movies, and you've got two free. When they come out with a new Batman movie, you can just slide it in there right past the last one you have, and you still have room for another Batman movie. Anything after that and you're going to have to put it on another page, right? So that's your fill factor: you're leaving space for inserts, and once it fills up then you have to either start creating new pages or you have to re-index and recreate it with that free space again. A good example of that would be: they come out with two more Batman movies and your page is full, then they come out with another one. Well, now you have to tear your CD case apart and redo everything, leaving two free per page, so now you've got your expansion again, right?

Steve: While you’ve got that all ripped apart, if someone wants to find a movie, it’s going to be very challenging for them to find that.

Sean: Yes. That's why classically they lock the table, right, nobody is allowed, because we don't know where this is right now.

Carlos: Under construction, right?

Sean: Yes, we don’t know where it is.

Carlos: We got the yellow tape.

Sean: Exactly.

Carlos: I think another reason why there's maybe some confusion or misinterpretation. I think what you're saying is correct, but on the flip side, people are going to say it only really affects you when it takes a long time to do those lookups. So with the advent of SSDs and all this premium storage, does that change the game at all?

Sean: Well, it helps, because what you're doing now is taking what should have been a sequential lookup and turning it into a random lookup. So here you're looking for everything in, say, the Batman range; you've got a range there and you should have a sequential lookup, which is going to be very fast, especially off the index. But now it's random, because the rows are scattered across all these different pages and maybe even all these different extents, and you've got these pointers that have to go everywhere, which SSD does really well. It's a lot less important when you don't have an actuator arm moving back and forth trying to access spinning data. But you're still pulling extra I/Os. I mean, that's still I/O that isn't as efficient as it would be if the data were arranged properly. But it has made it a lot better.

Carlos: So ultimately, the overall speed of getting your data is the crux of the problem, and fragmentation and fill factor play into that. I guess there is one more thing we should say about fill factor, and that is, so we talked about pages, right? You mentioned leaving a little bit of space there. Now, SQL Server does read everything through memory, so you have to be careful how much you leave. And again, I'm thinking about my CD case that has those four slots on a page. I don't want to just put in one CD or DVD and leave the other three blank, because those are a lot of pages that I'm not using, right? When those pages get loaded into memory, that basically means I have a lot of blank space in memory, so I guess there's a tradeoff.

Sean: So yeah, it’s trading one set of complications for another as I like to say about almost everything. You’re giving up space to gain the speed, and to gain concurrency and to gain efficiency in I/O but you’re giving up other resources to make that happen, so yeah.
Carlos: And I guess what's right really comes down to understanding the constraints on your system. If you're already constrained on memory, you're going to make it even worse if you need more pages to be loaded to get to the data. But if it's one of those things where you've got way more memory on your server than you'll ever need, the unicorn world I guess, well, maybe it's not so much of an issue, and I think that more memory can mitigate the constraints of your disk if your disk is already …

Sean: You know, maybe you’ll be lucky and you’re on a system that can use BPE or something like that.

Carlos: I’m not familiar with that acronym. BPE?

Sean: Buffer Pool Extension.

Carlos: Oh, got you. So we’re using SSDs, you can extend that beyond your memory capacity.

Sean: Exactly. But I'm also compelled to say that one mistake a lot of people make is thinking that every table needs to have the exact same fill factor. They tend to set that once, and that is propagated by some re-indexing solutions out there, but certainly not Minion, which is where all of this started to begin with, and which is what we're doing new in Minion, right? A lot of the re-indexing solutions out there, both free and paid, don't make it very easy at all for different objects to have different settings.

Carlos: Well, give us the short version of how you make that determination as to how the fill factor should be different.

Sean: Right, so that is unfortunately an actual discussion, and the short version still takes some work, right? The short version is you have to monitor the fragmentation on the table, so you have to have some sort of monitor out there pulling frag stats from the DMV, say on an hourly, twice-daily, or daily basis, whatever you think your threshold needs to be. And then see the rate of fragmentation per day or per hour that these tables are going through, and when they start misbehaving, as in when performance starts to tank. Then you can note the fragmentation level, and if you know that this thing has a fragmentation level of, say, 5% a day, then you can start with a 90% fill factor and see if that gets you through to your next re-indexing. That's what this is all about: getting you through to the next big re-indexing, and doing a little bit every day even. I'm not a very big fan of the big re-indexing every week. I'm a fan of taking them as they are needed.
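For reference, the kind of DMV pull Sean is describing might look something like this; the page count threshold is a placeholder, and logging the rows into a history table on a schedule is the part that lets you see the rate of change per table.

-- Capture fragmentation for the current database; run this hourly or daily
-- and store the results to watch how fast each table fragments.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id
   AND i.index_id = ips.index_id
WHERE ips.page_count > 500   -- ignore tiny indexes; placeholder threshold
ORDER BY ips.avg_fragmentation_in_percent DESC;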

Carlos: A more incremental approach.
Sean: Right, but setting the settings to what I will need. If a table needs to be defragged at 20%, I mean re-indexed at 20%, then why make it wait three more days for the Saturday night maintenance? It's only going to get worse from here.

Steve: I think the key there is knowing when they need to be re-indexed. I think so often people take the approach of "we have to re-index everything on a weekly basis," and that's one of those things that, I don't know, I disagree with.

Sean: Right, and of course the frag monitor that I just talked about, if you've got a big shop that can be overkill, or if you don't have the skill to write something like that, it can be overkill. If you want to take the poor man's SWAG approach, you can set your big table to 90% and then, if it misbehaves before the next re-indexing, drop it down to 85%. If it misbehaves again before the next re-indexing, drop it down to 80%. Keep dropping it by 5% until it makes it through to your next re-indexing, as long as the extra page loads in memory don't have adverse effects. There's going to be that give and take. I would never recommend going to something like a fill factor of 60% or 50%. I think that's a little much for me.
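As a minimal sketch of that incremental approach, each step is just a rebuild with a lower fill factor; the table and index names here are placeholders.

-- Start conservatively; only give up more free space if the index keeps
-- fragmenting before the next maintenance window.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (FILLFACTOR = 90);

-- If it still misbehaves before the next run, step down by 5 and try again.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (FILLFACTOR = 85);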

Carlos: Yeah, that would be a little too much. So now it's interesting, because you mentioned that idea of getting insight into fragmentation and making adjustments, right? One of the things I asked when we were putting this together was about the future of re-indexing, and obviously you have the new version of the MinionWare product coming. So one of the things that's interesting to me, and we've talked about it on the show before from a SQL Server in the News perspective, is that SQL Server is starting to put analytics into the database. Looking at your query plan, and we've talked about this with, like, your table-valued functions: the first time it runs, it doesn't know how many rows are going to be affected, so it guesses one. It runs and it says, "Oh, I actually had a thousand." The second time it runs, it's going to reevaluate that plan and say, "Oh, I'm not expecting one, I'm expecting a thousand," and adjust accordingly. And I have a feeling that that may impact this idea of how it goes back and gets those records. Do you think that that feature, that idea of the database basically deciding what to do and how those plans are going to get generated, is going to affect this decision at all?

Sean: The decision of what exactly?

Carlos: Of whether to re-index.

Sean: Oh, I see. You know, I think in a perfect world that's what's going to happen ultimately, and they are nowhere near that right now. Because right now they can look at some plan regressions, and they're simple plan regressions right now, based off of CPU. If it's going to cause a regression of greater than 10 seconds, I think, then you can kick it into forcing the last known good plan. And that's really the only thing you can go by automatically, based off of that or based off of auto tuning.

Carlos: That’s the next thing they have added, so there’s the auto tuning. Basically that says, hey my stats have changed and I’m getting some differences then use the old plan.

Sean: Now, you’re putting something into that I don’t think exist. You’re making an assumption. It says if the plan is going to regress not if the stats have changed. It’s looking solely at the plan. So it says if the plan is going to take longer than 10 seconds, I think it’s going by CPU.

Carlos: The execution of the actual…

Sean: It's based off the execution of the actual plan. If it's going to change significantly, I'll say right now, because I don't remember exactly what it is, but it's 10 seconds or 10 milliseconds, something like that. If it's going to take like 10 seconds longer for the plan to run, then it's going to look for the last known good plan. That's completely different. You're inserting causality.

Carlos: What I think what I meant to say was it’s looking at the stats of the execution, not the statistics. I apologize, that was unclear.

Sean: Yeah, it’s looking at a stat of the execution.

Carlos: A stat of the execution. Yeah, that’s right.

Sean: Well, I take that back. There are two conditions that can kick it in. It can be kicked in by the plan taking longer and by error count. If the error count of the last known good plan is one, then the current error count has to be greater than one, so we're looking at timeouts and stuff like that. So it can be based off of error count as well, and they call that error prone. It's just a bit: is this plan, is this query, error prone? Yes. Then use the last known good plan and go about your business. Yeah, I think that in a perfect world they would have to be able to tell the difference. They would have to know why a query is going bad. Why has the plan regressed? And right now in SQL as an engine, at least not exposed to us, there's no way to tell the difference between a query that's bad because the stats are out of whack, or because there is fragmentation, or because there's memory pressure or whatnot. There's no place to look to automatically say, "Oh, well, instead of rebuilding the index I need to update stats." There is nothing in there that tells you that. It's just kind of a jack of all trades sort of thing. It's a shotgun at this point. That's why so many people, quote, re-index needlessly.
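For context, the automatic tuning behavior Sean is describing is surfaced roughly like this in SQL Server 2017 and later; the regression thresholds are managed by the engine, and the query and plan IDs below are placeholders.

-- Let the engine force the last known good plan when it detects a regression.
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Review what it found and why (CPU regression, error-prone plans, and so on).
SELECT reason, score, details
FROM sys.dm_db_tuning_recommendations;

-- The same correction can be applied manually through Query Store.
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;  -- placeholder IDs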

Steve: On that point, Sean, one of the things that I think comes up in conversation a lot, and that people get a little bit confused on when they haven't experienced it, is the difference between re-indexing, meaning rebuilding or reorganizing an index, and what statistics are. Can we take a minute to talk about that? Because I think that plays into the whole conversation here.

Sean: Absolutely. Rebuilding an index is absolutely rebuilding the index pages and putting them in the order that they are supposed to be in. It's like taking your whole CD case apart, leaving the two slots free on every page, and making sure that everything is in order. So it's literally completely rewriting those pages and rearranging all the data so it's completely alphabetized again, if you will, whatever alphabet you're using, whether it's GUIDs or first names or social security numbers, whatever, right? So it's literally just putting the data back in order and making sure that everything is contiguous again. But your stats are data about that data. So let's say you have a numeric column, and the first row on your page is a 1 and the second row is a 5. The stats would say something like: this value is 1, and there are four steps between this value and the next value. If the next one is a 6, it would say there is one step between this value and the next value. And if the next value after that is a 9 or a 10, it would say there are three steps between this value and the next value. So it's data about that data, to let the engine estimate where data is going to be located so it can know whether it has to do a scan or a seek. Because there are times, I mean, I see this all the time where people say, "You want to get rid of all scans. You don't want to do scans, you want to do seeks." No, not necessarily. There are some times when it's actually cheaper to do a scan than to do a seek. So that's what stats are: data about the data. And where that starts going wrong is when the stats haven't been updated in a while. Say we go back to that first example: you've got 1, 5, and 6 as a data set, and the stats say there are four steps between the first value and the next value, so you've got one 1 and one 5. And then you insert twelve more 1s. Well, now your stats are thrown completely off, because the distribution between that 1 and that 5 looks nothing like it did. Your metadata about that data is thrown completely off. And then let's say that somebody comes in and deletes that 5 altogether. Well, now you've got 5 listed in the stats but it doesn't even exist, so the engine is probably going to have to do a scan in order to find things, because the stats are completely wrong, right? So when you let those stats get that far out of whack you are basically clipping SQL's legs out from under it and saying, "I want you to find this data, and by the way, it's none of your business what the data looks like. Just go find it and make it as efficient as possible." And it's like, "But I need information." So the stats are kind of like a trimmed-down index that tells the engine what the data looks like so it can access it more efficiently. The more the data changes, the more those stats are going to get out of whack, because you're going to have updates, I mean, if somebody comes in and updates that 1 to a 2 or a 12. And stats are only updated after 500 rows plus 20% by default.
So imagine having a table with a million rows in it or 200,000 rows in it and trying to join that with another table that’s got say 40 million rows in it and your stats are extremely out of date and SQL doesn’t know anything about it, that’s a pretty long lookup time.
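To put the distinction Steve asked about in concrete terms, these are three separate operations; the table and index names are placeholders, and the histogram Sean describes can be inspected directly.

-- Rebuild: rewrites the index pages in order, applies the fill factor,
-- and refreshes that index's statistics with a full scan.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

-- Reorganize: shuffles existing pages back into logical order in place;
-- it is online and interruptible, but does not update statistics.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;

-- Statistics: refresh the "data about the data" without touching the pages.
UPDATE STATISTICS dbo.Orders IX_Orders_OrderDate WITH FULLSCAN;

-- The histogram steps Sean walks through (value ranges and row estimates).
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate') WITH HISTOGRAM;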

Steve: Yup, ok. So then let's say, I mean, I know enough to be dangerous and I'm going to go use some built-in maintenance plans to do my re-indexing, or I know a little bit more and I'm going to build my own solution, or I go and grab someone else's maintenance tool out there. I mean, there are a lot of different things that can happen there in how those get rebuilt. But as you talk about your Minion Reindex, what is it that's really different or stands out with what you're doing?

Sean: So much, and even more in version 2. Dude, let's not get out of here without giving you a couple of nuggets from version 2, ok? But right now, currently, the whole thing is table based, so all of the settings are table based, and that makes it really easy to have different settings at both the database level and the table level without having to have any extra jobs or job steps. It's just an insert into a table. So let's say I want the entire database to have a fill factor of 90. Well, I can set that in the database-level table, and then let's say I want certain tables in there to have different fill factors, one at 85, one at 80, one at 75. Well, I can set those; I can just put in rows in the table-level table. And now they've got all their settings, and those settings extend to every single one of the re-index settings. Not just a handful of them, but every single one of them. Some of them can rebuild stats after a reorg. Some of them can be sorted in TempDB. Some of them can use drop existing. Some of them can have online re-indexing, some of them cannot, and so on and so on. One of the biggest things that we do in that sense is give you a high level of configurability without any extra jobs or job steps. I think that's the main thing that differentiates us. And we've done what we call prep only. I think we talked about this a long time ago, but the big problem with re-indexing on large databases is, let's say you've got a 4-hour window. It can take you an hour and a half to two hours, and sometimes longer, just to pull your frag stats. In a big database it can take a long time, right? So in Minion, you can pull those frag stats earlier in the day and use them in your maintenance window. You could start two or three hours before the maintenance window starts and pull the frag stats. And during the maintenance window you can say, "Use those stats that I've already collected," and it won't try to pull them again, it will just use them. And now you're using your entire window for actual maintenance instead of using half of it to find out what maintenance you were supposed to do.

Carlos: Right, so it's almost like a rules-based approach that gives you the ability to finely tune all the individual knobs and settings.

Sean: Right. And you know, something I wasn't very proud of for a while in Reindex 1 is that, in order to do that, prep only was one of the things you did have to have a separate job for. So you had to have a job to run the prep only and then another job to run the actual re-indexing itself. But now we've brought that into the scheduler table. We just released 1.3 about six weeks or two months ago, something like that. So we just released 1.3 and we've got that scheduler table now. So now you could say something like: at noon I want you to run this as a prep only, and then at midnight I want you to run the re-index using the stats that I collected at noon. And now those are just entries in a scheduler table, so we're back down to having no extra jobs at all. You could run prep only for 40 different databases at 12 different times throughout the week and it will all just be table entries. It's all still just one job that does the running.

Carlos: Got you.

Steve: Yeah, that’s really interesting because I know if you have a different job for each of those or even a different job schedule for those different events it can get really messy in the SQL jobs for the SQL agent.

Sean: We had a client with 75 jobs just to manage their re-indexes. On one SQL Server they had a ton of databases, like 3,000 or 5,000 databases on there, and that's 75-something jobs just to manage the re-indexes, because they had special things that they had to do in there. That was on somebody else's solution.

Steve: Wow.

Sean: Also in 1.3 we gave you the ability to, well, we could always do this, but we made it a setting to re-index heaps. So now it is a setting for how you want to handle heaps, at both the database and the table level, so you can handle it like it always has been and just re-index the indexes themselves, or you can rebuild the heaps themselves.

Steve: Yeah, and you know, that's interesting, because that's something a lot of people don't think about when they are re-indexing as well. What's going to happen with the heap?
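For reference, outside of any particular tool, rebuilding a heap is its own statement; dbo.StagingData is a placeholder name. Rebuilding only the nonclustered indexes on a heap does not clean up the heap's own forwarded records.

-- Rebuild the heap itself; this also rebuilds its nonclustered indexes.
ALTER TABLE dbo.StagingData REBUILD;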

Sean: And like I said, you could always do it, but it followed the, I don't know if I want to call it the hack attack method. We've always had this way of being able to alter the run. Now, you asked me before, not on film but the other day when we were talking on Skype, you asked me if we had something coming where we're going to be able to limit the re-indexes to, like, a page count.

Carlos: That's right, so basically my small indexes I don't care about.

Sean: Right, exactly. Now, I can actually do that today, and the way we do that is actually pretty clever. But it's the way that you had to do the heap re-index before, so you will get the idea. The method to do that now is to do a prep only, which takes all of your information and puts it in a table, and then insert a step that deletes rows out of that table based off of any criteria you want.

Carlos: Oh, see, do the little prep and purge anything you don’t want and then let it run in the maintenance window and you got it covered.

Sean: Exactly. Now, that gives you so many things, right? Yes, you can do it off of index page count, you can do it off of row count. It allows you to reorder the indexes because that’s another thing that we really do well, is that we allow you to do your tables in any order you want, so you can order the tables.

Carlos: But I guess I still don't like the idea that I have to be the one to go back through there and do it. I just feel like there should be some minimum thresholds.

Sean: Hold on, I’m getting there.

Carlos: Ok, ok. Here we go.

Sean: Now, so there are a couple of things going on here, right? I was talking about ordering because we talked about this a long time ago. I remember telling you guys that the problem I have with re-index jobs that stop after 2 hours is that very seldom is that actually necessary. What you really mean is, I am not able to control the order the tables go in, and so in order to keep my important tables from running past that 2-hour window I have to stop the whole thing. Whereas if I make sure that my 7 or 15 most important tables get done first, then I don't care if those other little tables run over two hours. Who cares, right? They don't take long enough and they are not used enough. So being able to order your tables is a big deal. So within this no man's land, I guess we'll call it purgatory, between the prep only and the actual run, you can not only delete stuff out of there, you can also reorder the tables based off of anything you want. And we've got some SPs that we've written, I don't know where they are, whether they made it to the community zone or the queries folder or what, but we've got some SPs that we've written to help you with this. So you can reorder the tables based off of usage, so you can make sure that the most used tables get done first, or the least used; you can reorder them by schema, or by anything that comes up in the stats DMV, or any other criteria that you would like. And this is where I'm getting at, Carlos: right now with other solutions, and I hate to keep saying that but that's what it keeps coming back to, you get a hard-coded "you can do this above this page count or below this page count," and that's it, that's the only decision you get to make. But what if I want to do it based off of, like I said, usage? What if I want to order them based off of usage, or delete the ones based off of usage or whatnot? Anything that gets below a certain usage, don't even bother re-indexing it; I just want to kill that altogether, right? And I don't want to put exclusions in the table, because that's an exclusion list that I have to manage. I wanted it to be more dynamic, so we have a big dynamic component to the way we're doing this. So right now you can go code your own solution or you can alter one of the solutions that we've already done. We give it to you in SPs so you can just pass in a couple of parameters and you're done. But in Reindex 2, which I'm working on now, I am following that exact same method. What I've been telling my users to do all this time, I'm building into the product. If you look at it right now, the way it works currently is it pulls all the frag stats and then in the exact same SP it runs the re-index. So frag stats get pulled, then it runs the re-index. Well, that's no different from if I were to do a prep only earlier in the day and then run the re-index during the maintenance window; now we are only arguing about the amount of time that happens between the prep and the re-index, right? So in Reindex 2 I'm going to have two steps in the job. Well, maybe three, but we'll talk about that in a second. The first step is going to do a prep only, every single time, without fail; even if you're just running the re-index it's going to do a prep only. The next step then runs off of that prep.
So that’s the exact same as pulling the frag stats and running the re-indexes, only now I have split it up into two different job steps. But one runs after the other, so you’re talking about a difference of milliseconds. And what that’s going to allow me to do now is build into the system these inserts, these purgatory inserts. So now, let’s say that you want to limit the run to anything that’s over 500 pages. Ok, great. It will be a checkbox for you, and behind the scenes, after the prep-only, it will run that SP that deletes anything out of there that’s under 500 pages. And then let’s say that you also want to order it by the most used index; well, after that, it will run the one that orders them by most used index. And let’s say you only want to do the ones that are in a particular schema. Well, hopefully you would have passed that into the table as a parameter, but you see what I’m doing. So I can put in ghosted records, I can do anything I want, in that no man’s land, because the prep has already run. I’ve got a list of tables and indexes that I’m going to run, and then you can delete, you can add, you can subtract, you can rearrange. You can do anything you want to before that run actually happens.
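
*Editor’s note: a minimal sketch of the kind of “purgatory” step Sean describes, assuming a hypothetical work table dbo.ReindexWorkQueue with ObjectId, IndexId, PageCount, and RunOrder columns. This is not Minion Reindex’s actual schema, just an illustration of pruning and reordering the prepped list before the run.*

-- 1. Drop anything under 500 pages; not worth re-indexing.
DELETE FROM dbo.ReindexWorkQueue
WHERE PageCount < 500;

-- 2. Reorder what is left so the most-used indexes get done first.
;WITH usage AS
(
    SELECT object_id, index_id,
           user_seeks + user_scans + user_lookups AS reads
    FROM sys.dm_db_index_usage_stats
    WHERE database_id = DB_ID()
)
UPDATE q
SET q.RunOrder = x.rn
FROM dbo.ReindexWorkQueue AS q
JOIN
(
    SELECT w.ObjectId, w.IndexId,
           ROW_NUMBER() OVER (ORDER BY COALESCE(u.reads, 0) DESC) AS rn
    FROM dbo.ReindexWorkQueue AS w
    LEFT JOIN usage AS u
           ON u.object_id = w.ObjectId
          AND u.index_id  = w.IndexId
) AS x
  ON x.ObjectId = q.ObjectId
 AND x.IndexId  = q.IndexId;

*Each statement could live in its own SP and be swapped in or out between the prep step and the run step, which is the flexibility being described.*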

Carlos: These features you mentioned, is this 1.3 or is this talking about 2 now?

Sean: Well, ok, so it’s available in 1.3, but you’ve got to code it yourself or get the code that we’ve written. 2.0, which I’m writing right now, is where I’m going to build that stuff in, and I’ll build in a handful of those purgatory SPs. I’m going to stick with that for now: I’m going to build that handful of purgatory SPs, and then I’m going to try to build in a mechanism for you guys to write your own and even upload them to the community zone so that everybody can share. So you can see the method; it’s going to be much more flexible than anything else you’ve got now, because you can do more than index pages or row count. You can do anything that crosses your mind. I mean, even going down to saying, “This index took this long last time and that’s too long, so I want to delete it from the run.” And the last thing I want to say about Re-index 2.0, as a little bit of a teaser: look for some good multithreading.

Carlos: Ok, there we go. Do we have an estimated date yet? Or is it still in the works?

Sean: Oh, it’s still in the works. I wouldn’t even come close to having a date, but I’m working on it diligently. But I will say, because one of the problems, remember, we were just talking about frag stats and how long they can take to pull, I just got the multithreaded frag stats engine worked out. So say you’ve got a database that’s like 4TB or 6TB. You’ll be able to pull frag stats in as many threads as you want, so I can split the database up into 10 pieces and pull frag stats for 10 tables at a time. You can do that in tens and just work through that database and cut your time drastically. I’m seeing beautiful times. I’ve got a database with something like 53,000 tables in it on one of my test boxes, and on my laptop it takes like 2½ hours to pull all the frag stats for it. It gets done in like 12½ minutes when I split it up into like 10 threads. That’s a beautiful thing. So I’m looking forward to it and more.
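
*Editor’s note: a minimal sketch of the idea behind splitting frag-stats collection across workers, assuming a simple modulo bucketing scheme. This is not Minion Reindex’s actual partitioning logic, just one way a single worker could pull its share of the stats.*

DECLARE @bucket int = 0, @buckets int = 10;   -- this worker handles bucket 0 of 10
DECLARE @objectId int;

IF OBJECT_ID('tempdb..#FragStats') IS NOT NULL DROP TABLE #FragStats;
CREATE TABLE #FragStats
(
    object_id    int,
    index_id     int,
    avg_frag_pct float,
    page_count   bigint
);

-- Walk only this worker's slice of the tables and pull frag stats per table.
DECLARE tbls CURSOR LOCAL FAST_FORWARD FOR
    SELECT object_id
    FROM sys.tables
    WHERE object_id % @buckets = @bucket;     -- each worker gets a different bucket

OPEN tbls;
FETCH NEXT FROM tbls INTO @objectId;
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #FragStats (object_id, index_id, avg_frag_pct, page_count)
    SELECT object_id, index_id, avg_fragmentation_in_percent, page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), @objectId, NULL, NULL, 'LIMITED');

    FETCH NEXT FROM tbls INTO @objectId;
END
CLOSE tbls;
DEALLOCATE tbls;

SELECT * FROM #FragStats;  -- this worker's share of the frag stats

*Run ten of these with @bucket values 0 through 9, each in its own session, and you get the ten-way split described above.*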

Carlos: Shall we go ahead and do SQL Family?

Sean: Let’s do it.

Steve: Yes, since you’ve been on the show a few times before, we sort of shook up the SQL Family questions a little bit this time.

Sean: Alright, and I haven’t heard any of these, so this is going to be great.

Carlos: Yes, that’s true.

Sean: You were supposed to type them in the chat and I haven’t seen anything, so this is on the spot. I like that idea, let’s do this.

Carlos: Do you want to go first, Steve?

Steve: Yeah, so I guess you’ve been around SQL Server quite a while and you’ve done a lot with it, and the first question I have is not what would be the one feature you would change, but what do you think is the most absurd or ridiculous feature that was ever added to SQL Server?

Sean: You know, it’s not the stuff that they add that I think is ridiculous. It’s the stuff they don’t put in. Things I’m going to call the half-assed features. Things like forcing us into dynamic SQL by not allowing us to use variable names as objects. You know, SELECT * FROM @table, being able to do stuff like that. Or forcing us into XML for everything and then not standardizing that XML. Like if you’ve ever tried to programmatically parse a deadlock XML graph and you have 14 attributes in one node, and then in another node some of those attributes don’t exist, and in other ones the attributes don’t exist either, so not standardizing that XML. I mean, put them there, have them be null, but put them there so at least they are there for me to parse programmatically. Stuff like that, where they just don’t go all the way and let us do the things we need to be able to do. They force us into ridiculous decisions. We don’t even have thread safety in stored procedures.
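
*Editor’s note: a minimal illustration of the complaint above, assuming a hypothetical table name held in a variable. Because SELECT * FROM @table is not legal T-SQL (unless @table is a table variable), you end up building and executing dynamic SQL instead.*

DECLARE @table nvarchar(256) = N'dbo.Orders';   -- hypothetical table name
DECLARE @sql   nvarchar(max);

-- Build the statement by hand, quoting the parts to reduce injection risk.
SET @sql = N'SELECT * FROM '
         + QUOTENAME(PARSENAME(@table, 2)) + N'.'
         + QUOTENAME(PARSENAME(@table, 1)) + N';';

EXEC sys.sp_executesql @sql;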

Carlos: So this is going to take you back a little bit, right? When was the first time that you ever failed to close a transaction?

Sean: I can tell you the last time. Just the other day I was like, “Why the hell is this taking so long?” And I went into the DMV and I was like, “Now, what is blocking this? What is blocked?” Yeah, that was just last week. So the first time? I really don’t know, man. You know, I’m going to tell you honestly. I have always been in the class of what I call deep-shit coders. I make so many ridiculous mistakes. I troubleshoot stuff for half a day before I realize I’m troubleshooting on one server and writing on another server. I do that kind of stuff all the time. When was the first time? Dude, I wouldn’t even begin to know; too many times to count.
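
*Editor’s note: one common way to make the DMV check Sean mentions, pulling current requests and who is blocking them; a minimal query, not the exact one he used.*

SELECT r.session_id,
       r.blocking_session_id,        -- non-zero means another session is blocking this one
       r.wait_type,
       r.wait_time,
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;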

Carlos: Oh, ok. I guess it was very public for me, so that’s why I remember it. If it’s on the dev server, that’s something different, right? You know, that first time that you did it and the email from the manager comes.

Sean: I can definitely tell you the first time I brought a cluster down.

Carlos: Ok, let’s hear it.

Sean: Ok. I was working for Pilgrim’s Pride and we had some 80-something locations all feeding into a central location, and we only had 2 DBAs at the corporate office, me and the DBA manager. I was fairly new to clusters, I didn’t really understand the concept, and I didn’t really know what I was doing with clusters, and I went in there and did something that I shouldn’t have done and brought the cluster down for everybody, because every site in the entire network fed into ours, right? So transactions across 80-something sites were failing in the busiest part of the workday. And I went to tell the boss what I had just done, because I didn’t know how to get the cluster back up. I just walked in there and I said, “Hey, dude, I’ve got to tell you something.” He goes, “Ok, what?” And his phone rang and he goes, “Can you wait just a minute?” That call was actually about me. And he looked at me and said, “What did you do?” And I was like, “Oh god, uhmm, I brought the cluster down. It’s down right now.” And yeah, that was back when it took 25-45 minutes to spin up all the disks in the array. It wasn’t a pretty picture. At least I didn’t corrupt anything, but it was not great.

Steve: Ok.

Carlos: So we’ve had your wife Jen on the program here before as well, so a family friendly version of how you two met?

Sean: Wow! Ok, you know, that is actually a bittersweet story.

Carlos: Oh boy, if we don’t want to include that one, Sean, we don’t have to.

Sean: No, no. There’s nothing wrong with it. We met when I was in my late teens. I was 18 or 19 and she was 13. We were both going to the same youth group. And we became friends because she was fairly introverted and I was, I don’t know if I was introverted or not, but I wasn’t a cool kid, and there were some definitely cool kids in our youth group and I wasn’t one of them. So we just kind of started hanging out together and we became friends and we were friends for, I don’t know, 20 years I think before we actually got together as a couple.

Carlos: Wow, ok, so it blossomed late.

Sean: Yeah.

Carlos: All is well that ends well.

Steve: And with that, which of you was the first one to dive into SQL Server?

Sean: Me.

Carlos: Even with the chef thing.

Sean: Oh yeah, yeah. I quit cooking. Well, I had my own catering company even after I left the chef I apprenticed under for so many years. I had my own catering company for a few years even after I was in IT. So I got her involved in SQL. I don’t know how long I was in it before I got her involved, but she was an intern at FedEx at that time and I started talking to her about SQL because I guess she was dissatisfied with the way her current position was going. We’d have to ask her specifically what the catalyst was, but knowing her, probably just to please her man.

Steve: Well, that wraps up my SQL Family questions for today.

Carlos: That’s right. Thanks Sean for being on the program. We do appreciate it.

Sean: Anytime because I love talking.

Steve: Yeah, and as usual it’s been a pleasure. Thank you!

Episode 117: PASS Summit Retrospective

Have you ever had a week without a Wednesday?  Me neither, so for this episode, we recap our experience at PASS Summit and talk about all the amazing compañeros we met and give some thoughts around our very first meetup.  It was so much fun to meet everyone and both Steve and I gave sessions, which was a great experience–and a first for me.

One thing we both realized after the fact is that we are not good at social media or taking pictures!  We apologize we don’t have a bit more to share on that front, but hey–we do have faces for radio.

Episode Quotes

“I find when I’m at a conference I get so busy and I enjoy talking with people. I just don’t think to take the camera out for a shot.”

“It’s nice to have live music but at the same time it made it a little difficult to hear sometimes.”

“Watching these keynotes from the hotel is super convenient.”

“Good questions always make you feel good as a speaker.”

“Be prepared, bring two laptops and a backup plan, and a backup to your backup plan.”

“The last 5 minutes before a session starts, even if everything is ready to go and going smoothly, the last 5 minutes is the stress factor.”

Listen to Learn

02:32 Companero Shoutouts

07:47 SQL Server in the News

10:50 PASS Summit Retrospective, being on social media

12:47 Tuesday night in Seattle – Carlos’ Mardi Gras suit, the live music

15:56 Volunteer Party

18:46 Thoughts about the livestreaming of keynotes

20:13 Wednesday sessions, SQL Trail Mix event

25:10 Thursday – Steve’s session on Database Corruption

31:15 Thoughts about the Game Night

32:29 The Sponsor Area

34:07 Friday – Steve’s session (Backup and Corruption) and Carlos’ session (Establishing Baselines)

40:23 End of conference and unfortunate story at dinner

Transcription: PASS Summit Retrospective

*Untranscribed introductory portion*

Carlos: Companeros! It’s nice to have you on the SQL Trail again.

Steve: Yes, and welcome to Episode 117. This is our PASS Summit retrospective episode, which I’ll tell you we got to meet a lot of companeros last week.

Carlos: Yes, and it was great meeting a lot of you and getting to chat with you. I feel like I’ve been doing a little bit of retrospective, a couple of retrospectives here. It is the conference season so we’re kind of giving our thoughts around the conference, what went well and what we enjoyed being out there in Seattle.

Steve: And I know I say last week, but yeah, it was last week when we were recording this; it will be two weeks back when it actually airs.

Carlos: That’s right. Man, how time flies.

Steve: Yeah, and I guess with you being on the East Coast and me being on the West Coast we don’t have that many opportunities where we’re actually physically in the same location.

Carlos: That’s right. And we did, again you’ll see as we get into this, just some of the activities that we had. But our schedule was jam-packed and some client issues arose, so we didn’t have as much time as I was hoping. So for example we didn’t get to do an episode, I don’t want to say live, but face to face, which has been something that we thought about but didn’t plan well. But this works and hopefully the companeros will enjoy it.

Steve: Maybe next time we’ll have to have you come a day earlier or something just to do that. Or I’ll go a day early depending on where we’re going to.

Carlos: Right, there you go. There is always next year, right, sound like a bad sports team. Ok, so getting into some companero shoutouts, and because of Summit and running into so many of you. We’re actually going to break this up a little bit, so if we don’t mention your name now don’t freak out just yet because we may be working some of that into the actual story or our retrospective. So first we have Justin Wailey, who I think I incorrectly identified as being from Georgia. His Twitter account says he is in Alabama. It was good to connect with Justin. I had met him two years ago. I think it was his first Summit back then and it was good to reconnect with him.

Steve: Yup, and we’ve got Tom Norman on the list too. I don’t remember meeting Tom, do you?

Carlos: Yes. He was on our Trail Mix, but I did run into him at the opening social. He was with a buddy and so we chatted. Tom is here from the East Coast and it was nice to connect with him again. Now, another one that we both talked to in sessions, so actually in the convention center, is Brent Unruh. I’m pretty sure I’m pronouncing that right. He told me how to pronounce it; hopefully I’m getting that right, Brent. He’s BrentU on Twitter and he is the one who brought us the question, because many people talked about flying, and his question was, “Would you still choose flying if you could only go 3 miles an hour?”

Steve: Oh yeah, that was a long time back on the podcast, in the SQL Family segment, wasn’t it?

Carlos: That’s right. So you’re a long-time listener, Brent. Thanks for reaching out and connecting with us.

Steve: Yup, and then we also talked with Andy Levy from Rochester.

Carlos: Yes, and we’ll probably bring him back up maybe when we get to the sessions but nice to meet him. Contributor on the DBA Tools team and it was nice to connect with him.

Steve: Yup, and I know he and I have been back and forth on LinkedIn a little bit since the conference talking about Database Health Monitor.

Carlos: Oh, got you. Ok. Another DBA Tools regular is Constantine Kokkinos. It was good to meet him. And he is the one who got me the, well, I guess Andy had them, he was the one who brought them, but I love the DBA Tools.

Steve: You know, there were quite a few members of the DBA Tools group there and it was good to meet Constantine.

Carlos: Yes. I don’t think Claudio was there, but I did talk with Rob. Unfortunately I did not meet Chrissy; I wanted to do that but for whatever reason it didn’t happen.

Steve: Oh! Yeah, I got to meet Chrissy in the speaker room, sort of while they were practicing. I heard her talking and I did not know what she looked like, but I just totally recognized her voice from the podcast.

Carlos: Oh, got you. Ok.

Steve: So I got to say hi and met her there as well.

Carlos: Oh very nice. And then a couple of MVPs we want to give shoutouts to: Argenis Fernandez, Brent Ozar, Cathy Kellenberger. Thanks for coming out to the Trail Mix event and talking with us, giving some of your thoughts. I know everyone appreciated getting your thoughts on some things and mingling with us, so we do appreciate that.

Steve: Yeah, it was an interesting group to get to know there.

Carlos: Sean Barry is another one. Good to meet him. Kevin Wilkes, long-time supporter and previous guest on the show. Adam Jacobson, who is actually someone who has introduced us to a client and happened to be there, so it was nice to be able to catch up with him.

Steve: Yup, I did not get to talk to him as much as I would like to but we did talk briefly.

Carlos: Yes, and that was a recurring theme for Summit, and it was a surprise. I mean, particularly with the podcast and at the SQL Saturdays, I feel I want to be connected. That’s one of the things that I want people to know me for, if you will: being connected. And I was surprised at just how many people I wasn’t able to talk to, so I kept thinking, “Oh yeah, I’ve got to follow up with them,” and then it ended up, I’m at home.

Steve: Well, you know, on that point. I mean, you talk to some people and they think, “Oh, you’re going to a conference for the week. It’s like a vacation. Time to relax.” But, holy cow, it was busy. I mean, it never stopped, it seemed like, which is awesome. I really enjoyed it, but it’s certainly no vacation. It’s busy.

Carlos: That’s right, that’s right, and then from our perspective as well, we are trying, like everyone, and not saying that it’s unique to us, to understand the ramifications of all the new features that Microsoft is dumping out there on us, the keynotes. We had that keynote on Cosmos DB. That’s something that I think all of us are trying to understand: ok, what’s our future going to be like, what’s it all looking like, and understanding some of those concepts. That takes brain cycles as well.

Steve: And speaking of that future, I mean does that bring us right into SQL Server News?

Carlos: It does. Yes, let’s do the SQL Server in the News. So we ran into Travis, and Travis is the PM for the Linux migration, and I’m not exactly sure how his role has evolved. I think it’s still something that he is doing. But we were chatting and of course they had mentioned many of the new features. We won’t cover all of them today; we’ll save some of those and kind of spread them out over future episodes. What he mentioned to me is that, beyond everything they have just put out, and SQL Server 2017 obviously went live at the beginning of the month, they are going to announce vNext potentially as early as next week.

Steve: Wow! So if you’re not on SQL Server 2017 yet you will soon be behind.

Carlos: Yeah, that’s right. You just made it to 2017, and here comes 2018.

Steve: VNext will be taking over in the near future perhaps 2018.

Carlos: Yeah, so again, just kind of continuing that cycle. And you just have to think about the platform going onto Linux and all the additional features that they are trying to integrate there, and what a monumental challenge that is.

Steve: Oh yeah. It’s really interesting because part of it is there are so many new things going into SQL Server at the rate that it’s going out right now and other Azure related databases and things. I mean, it’s really tough to figure out, ok what are the things that are really going to stick and are going to be the fundamental pieces that we’re going to need to know going forward, and what are the things that are maybe a little bit of hype and maybe don’t end up being that big of a deal.

Carlos: You know, that’s right. I think the comment is we don’t need to fear. And even us as consultants, I feel like we’re in a pretty good position because we get to be in lots of different environments and touch some of this new technology sometimes. But a lot of times I can feel like I’m in the Stone Age because of all the new things that people are trying to do. You know, cloud this and cloud that. I think, companeros, you don’t need to be afraid if you’re not on vNext. Don’t despair.

Steve: Right. And I think that’s part of the benefit of the conference too: you go and you’re exposed to a lot of this new information. And it sort of pushes you out of that comfort zone that you’ve been in for the last couple of years perhaps.

Carlos: Sure, exactly. And if nothing more, you get to be educated about how things are changing and the types of problems that people are trying to solve. Your current organization may not be trying to solve that problem, but if you find a problem that you’re interested in, again, you’re creating value by then trying to go out and solve that for a different place or a different organization. Who knows what will happen, but being in a position to be able to do that is valuable.

Steve: Yeah, absolutely.

Carlos: Ok, so ultimately today it is Steve and I doing the retrospective, and we will have some shownotes. Now, I would say pictures, and I guess we can get into this a little bit, but as busy as we were, you could tell that we’re not great social media people and do not take nearly enough pictures.

Steve: Yup, and you know, that’s the thing I find when I’m at a conference: I get so busy and I enjoy talking with people. I just don’t think to take the camera out for a shot, whereas there are others who say, “Oh look, I got a picture with this person and that person.” Wow, I just didn’t have time to do that. I think I need to make time for that next time.

Carlos: I think we are showing our age there a little bit, right? We’re not the selfie first kind of people.

Steve: Yeah.

Carlos: And that’s the same for Twitter. I mean, I like to be on Twitter and put some things out there, but I’m definitely not one of those people that has to put every waking-moment detail on there. And then you get busy if that’s not the habit, right? If you haven’t trained yourself to do that, then the day goes by and you’re like, “Oops.” In fact, today I was actually replying to some tweets that happened during the Summit, so that was kind of embarrassing.

Steve: Yeah, it’s like you need an extra 4 hours each night to go back and catch up on everything.

Carlos: That’s right. My old man disease is already kicking in pretty good. Particularly with the West Coast time, right? I’m there in Seattle, but my body is on East Coast time. I’m in Seattle and I’m ready to go to bed. But the shownotes for this episode are going to be at sqldatapartners.com/summit.

Steve: Or at sqldatapartners.com/117 for the episode number.

Carlos: Yeah, so as far as some of the other things that we did: Tuesday was kind of low key once we were actually in Seattle. I know that there were things like the user group meetings or the chapter leader meetings or whatnot. We actually had some client engagement meetings, so we took advantage of some of that time, particularly with me being on the East Coast, to do a little bit of that. But once we did get checked in, I started meeting people.

Steve: Yup, and on Tuesday night there was the Welcome Reception for everyone, and we went to that. That was a lot of fun and got to see a number of people that I hadn’t seen in a year or so.

Carlos: Yes, and if you missed me, I’m sorry I was trying to stand out as much as possible in the Mardi Gras suit.

Steve: Right, some people might have thought you were an entertainer of some kind with that suit, Carlos, rather than actually an attendee.

Carlos: Yes. I was asked more than once, “Are you hired to…” “Nope. Just me.”

Steve: Well, that was a pretty awesome suit and if you haven’t seen Carlos in his I guess Mardi Gras suit we could call it?

Carlos: Yes, that’s right.

Steve: Yeah, take a look online and there are some pictures of him on that.

Carlos: Yes, we actually did get those up on social media. Now, one thing I thought was interesting, and I don’t remember, well, I missed last year’s Summit, but I don’t remember them doing live music at the welcome.

Steve: No, that was different, because I think there was not a live band last year. It was recorded music and they were doing a slideshow or something like that last year.

Carlos: So I thought it was nice to have live music, but at the same time it made it a little difficult to hear sometimes.

Steve: Right, and we were standing chatting with someone and you realize that it’s so loud you’ve got to be six inches from their face just to hear them. It makes it tough to communicate. If I had any say in it, I would love to say, maybe we’re getting old, but turn down the music. That sounds like an old man story there.

Carlos: Yeah, I really have to rename this “The Old Man Review”.

Steve: I guess our age is showing through.

Carlos: I agree, it’s one of those things. I don’t know if we could have had a little bit more time. Of course people come in and out of that as well, but maybe pausing that music or giving us an hour before it plays. I don’t think they did take a rest, but yeah, I don’t know.

Steve: And of course, Tuesday night was the Volunteer Party.

Carlos: Yes, and so again, more old man stuff. I stayed at the welcome reception a little longer than you did. I know you and Derek went over to that. I did not want to walk over there, and I guess I was too cheap to take an Uber. Oh, and that was when Andy Leonard actually gave me a book. So I had this book and I’m like, “What the heck am I going to do with this?” So I ended up taking it back to the hotel room to drop it off and then deciding, “No, I’m not going to make it out there.”

Steve: Yup. Well, so I ended up going to the volunteer reception, or volunteer party I guess. What I like about it is a couple of things. One, free food. Two, getting to meet some of the other speakers, some of the other PASS chapter leaders, some of the other volunteers; I mean, it’s a great group of people. What I would say is sign up and volunteer in some way if you’re not part of that, because then you could be part of it.

Carlos: Yeah, that’s right. And you met the Costa Rican guys, right? The Hispanic guys?

Steve: Yeah. I guess there were 4 or 5 of them and I think I ran into them that same party last year. Yeah, good group of guys. Bright guys from Costa Rica.

Carlos: Yeah, good times. And I made the mistake of then trying to go out afterwards, because the DBA Tools team had a reception I saw on Twitter. I’m like, “Oh, that’s just right here. I’ll walk the block.” When I got there they were actually all headed to SQL Karaoke, and I then felt stupid because I had just gotten there and it was ending. Here I didn’t want to walk to the volunteer thing, and then I ended up walking all the way to SQL Karaoke, and if you know where that is, it’s like on the edge of Seattle, so I walked a lot more than I wanted to that evening. But it was all good, and one of the guys I got to meet in line, actually. Again, feeling stupid because I didn’t have a ticket to get into SQL Karaoke, so there was that whole thing. But the SQLBits, no, not SQLBits. It’s Rodney from Florida, the guy who wrote the SQL Server 2016 song.

Steve: Oh yeah. I forgot his name but he was at SQL Saturday at Redmond last year.

Carlos: Right, so he was in line there behind me and so we started chatting and that was a lot of fun.

Steve: And he was going to Karaoke even though he is a real musician.

Carlos: That’s right. But we ended up, “Ok let me see. I’ll show you how this is done.”
I also ran into another name at the opening reception, Bert Wagner. It was funny because I walked up to Bert like, “I know you. I can’t remember where I know you from, but I feel like we’ve met.” And then it wasn’t until Wednesday, when he actually came to our SQL Trail Mix event, that I went, “Ah! Now I remember, from the sign-up list.”

Steve: Oh yeah. And I remember meeting Bert as well and talking a little bit about his job and what he does. Yeah, definitely cool.

Carlos: Right. So one of the takeaways, and one of the things, now, I’m not sure that we should be promoting this idea, but we found that watching these keynotes from the hotel is super convenient. Now, that does mean that you kind of miss out on breakfast and have to go your own way for breakfast. But that’s something that we did and it just worked out really well.

Steve: Oh yeah. And what I liked about that was, normally you go to the keynote and it’s jam-packed full, and it’s hard to see. I mean, depending on where you’re sitting it could be a challenge. So watching the live broadcast from the hotel room, you’ve got the chance, while you’re doing that, when they mention something, to go look it up. Go online and take a look at it while you’re watching. And I think I like doing it that way rather than being there where you’re just sort of stuck in a seat and glued to the room.

Carlos: Right. I’m not sure if it was my tablet or the mood of the laptop. It seemed to be fine, but you do sometimes hit streaming issues. But once they got it going, PASS did a good job on their side to at least make sure that plenty of streams were available.

Steve: Yup. So then Wednesday there were some great sessions and I know I got to see a couple of those. Then later in the day on Wednesday we had the SQL Trail Mix event.

Carlos: Yes, we weren’t quite sure what to make of this, right? How was it going to turn out? And special thanks here to Sean and Jen McAllen. We actually reached out to them to get their thoughts since they’ve been doing several events there at Summit. And we thought, ok, because you do hear about all these different events, would there be room enough for ours?

Steve: Would people want to go to ours?

Carlos: Would people come? That’s right. Would it just be Steve and I sitting there eating peanuts? These are all the kinds of thoughts going through our heads. So we said, “Ok, let’s do it. We’ll reach out obviously to the companeros and to our podcast guests and we’ll kind of go from there.” I think we were pleased with the turnout. To those who were there, thank you; companeros, it was great to meet you. I guess there are a couple that we want to specifically point out here, but we had a great time chatting with you.

Steve: And I think what really surprised me was the small amount of promotion that we did, the small amount of time we had to promote it, and how many people actually showed up in that small amount of promotion time.

Carlos: Right, that is true. If you didn’t get an invite for whatever reason, we apologize, because we did fill up way faster than we thought we would. Again, we were kind of thinking 25 people would be super successful; that was kind of our number. We knew some people wouldn’t come. We ended up cutting off registration at 45, and that happened in a matter of about three hours, so it was super quick.

Steve: Yup, and I guess just a little bit of background for those who don’t know what the SQL Trail Mix was. This was an event that we hosted at a restaurant/bar where we reserved some space. We got a number of tables and we bought appetizers for everyone, although we didn’t buy drinks for everyone; everyone was on the hook for their own drinks. And it was just show up and chat. Show up and meet other podcast listeners. Meet other podcast guests and give us feedback on the podcast. I had a great time. It was awesome.

Carlos: I agree. It was good. For example, rubbing shoulders with James, James Youkhanis. He has been on Twitter, we’ve exchanged emails, and he has suggested several podcast topics, so to finally be able to meet him and rub shoulders with him was actually pretty cool. The guys from Connecticut, Will Conklin and Chris Albert, long-time listeners of the podcast, so a shoutout to you guys. Thanks for coming up and chatting with us. And then Josh Simar. I can’t remember all of a sudden where Josh is from, but he had mentioned that he’s been listening to the podcast and it was great to chat with him as well.

Steve: And then also, Andre Ranieri is a friend of mine that I’ve known through PASS events for quite a while, and he showed up at the event and I had a chance to catch up with him a little bit.

Carlos: Right, and that was an interesting thing. While we had our registration, we probably ended up with 30-35 people, because people did come in and out. Registration kind of helped, but it didn’t account for everyone, so I guess there is some thought we need to put into how to do that. We just want to make sure that we can provide for everyone who does show up.

Steve: Yup, and if we’re going to do it again, and I hope we do it again, I think maybe going a little bit bigger, but not massive, would be good.

Carlos: What was interesting, because I think it kind of worked in our favor potentially, is that we didn’t realize we’d also be going up against game seven of the World Series. I think that impacted the bar itself, because when we talked to them it was, “Yeah, our Wednesdays are dead. Why don’t you come, it would be great to have a group.” And then we get in there and I’m like, this is not dead.

Steve: Oh yeah, it was packed. It really was.

Carlos: Ultimately we would very much like to continue to meet up with you guys, and I’m not sure how well this would work at other events, but if we have opportunities to do something like this, of course we’ll try to do it again.

Steve: So then that brings us to Thursday.

Carlos: Yes, that’s when Summit got real, in a sense, because you had your session. You had the first of two sessions.

Steve: First of two sessions. This was the first time at PASS Summit that I actually had two sessions to present. The first one was on Database Corruption, and I was on in the early afternoon, I think. It was one of those things where as a speaker you always worry about getting connected, at least I do. Is the presentation going to work? Is the laptop going to work, and all that? The laptop I’ve had for about 2½ years and I’ve never had a problem. Everywhere I’ve gone, you plug it in, it works just fine. It even worked fine at two previous PASS Summits on similar equipment. But still, every time I go I try it out. There is this speaker ready room that they have where you can try it out on their projector and make sure it works. I tried that all out ahead of time, then got to the session room and, for some reason, the video output would not connect to the projector. And of course, I’m the kind of guy when I present, that as soon as the previous speaker is off the stage I am there, taking every minute I can to make sure that it’s going to work. And it didn’t work, and it didn’t work.

Carlos: And the AV people came up and they were trying to help you.

Steve: Yup, and some of our friends, Derek Bovenkamp and Randolph West, I mean, they were in the audience jumping in trying to help with, “Try this, try that. Here is a USB stick,” things like that. And finally, I was about 7 or 8 minutes late when I started my session, which I apologized for, and I felt so bad for doing that. But I finally just put it on a USB stick, put it on a different laptop, and did the presentation from PowerPoint on the other laptop. Not optimal, but what I learned from that, I guess, is be prepared. Back to the Boy Scout motto, right?

Carlos: We’re going to the Boy Scout motto, there you go.

Steve: Be prepared, which I thought I was, but be prepared with two laptops, which isn’t a financial thing because I had an extra laptop sitting at home. I could have prepared and brought it with me. But be prepared, bring two laptops and a backup plan, and a backup to your backup plan, because whatever you think might go wrong, whatever could go wrong, would go wrong eventually, and it did in that session. I got a lot of feedback at the end that people were happy with it, but I just felt bad getting started 7 or 8 minutes late because of that technical difficulty.

Carlos: And thanks to PASS for providing another laptop, right, with PowerPoint loaded, so that you could move over pretty quickly. I think it worked well. The way that you gave that was kind of story based. It was an introductory session, and so as you told the stories you kind of walked through some of those concepts. I thought your audience was super interactive. They were engaged; they were, “Urgghh.” When you told of a struggle, they made it their own struggle, or you could sense that they had been there, those kinds of things. And so I thought that it went really well.

Steve: Yeah, and what was different with that session was that it was an introduction to database corruption. I hadn’t planned to do the sort of deep-dive, bit-level corruption fixing that I’ve done in the past. If I had planned that, it wouldn’t have gone so well with the technical difficulties we had. But because it was more the high-level what is corruption and how do you deal with it, I think it worked out really well, and it was different than how I have presented corruption in the past as well. I had a good time with it and I think people enjoyed it. At the end, of course, everyone comes up to ask questions of the speaker and you’re trying to get off the stage and all that because there was another speaker coming in, so we took the conversation outside of the room and continued for about 20 minutes with different people that had questions about corruption. It was really cool. I had a good time with that.

Carlos: I think that’s when we ran into Brent as well, Brent Unruh. Yeah, so good questions, that always makes you feel good as a speaker, when people come up afterwards and want to continue the conversation. It’s a mixed bag, right? As an attendee you want to get that conversation in, but there are other people in line and you want to go off to your next session as well. And then as a speaker you want to try to accommodate as many people as you can, but we know that some of you probably slipped out without being able to have that conversation, so we apologize. I think another reason why we want to have an event like that is so that we can have that connection time in a less time-constrained environment.

Steve: Right. And also I’d like to give a special thanks on that to Randolph West, because last year at my corruption presentation and this year with the corruption presentation, at the end when you’re done speaking and everyone comes up to ask questions, he helped field some of those questions. And that was really awesome because it gave me a chance to get a drink of water and listen to what some of the other questions were while I was packing up my laptop at the end. Thanks, Randolph, I appreciate that.

Carlos: Yes, very cool. So Thursday night, a couple of shoutouts to the guys that helped organize the game night. Now, ultimately it’s done through PASS, but Kevin Hill was instrumental. I guess that’s how I found out about it. No, that’s not true. I found out about it through you, and then after investigating it, Kevin Hill was kind of helping promote it and organize it a bit. And then he and Matt Cushing, someone I have interacted with on Twitter; it was nice to meet him and shake hands with him.

Steve: Yeah, the game night was interesting because you come in and it’s just a bunch of tables and board games. And you first walk in the room and it’s a bit intimidating to look at all the people playing board games and wonder, “Ok, where am I going to sit down?” And you just kind of wander around a little bit, “Hey, there is a game that’s starting right now. I’ll jump in and play.” And it was awesome. I had a good time.

Carlos: Right. I think that worked really well. I think there was some confusion because people didn’t realize they had to pay to go. And I think that $12 was to buy a drink. Again, as somebody who doesn’t drink alcohol, maybe they could have done without that, but I think it’s one of those things people liked. It did cause some confusion for folks, though.

Steve: Right, and I think also some of the $12 of that Game Night goes to the cost of the staff coming in and taking down regular seating and putting up tables because that was in the presentation rooms, so they had to come in and change out the room for that event.

Carlos: Right, and that makes sense, so the venue probably has costs associated with that as well. You’re exactly right. That makes a lot of sense. And then all of a sudden we’re at Friday. Well, the other thing I guess I do want to say about Thursday before we get to Friday is that I didn’t spend a ton of time in the Sponsor Area, but Thursday was the day that I was able to do a bit of that. I guess a special shoutout to the Windox guys, so Paul and his team. Got to meet with his partner and talk with him a little bit. It seemed a little bit smaller this year than in years prior, the Sponsor Area.

Steve: Yeah. That was interesting and I wonder why. I think the big players were there that you normally see, and they had big booths. And then some of the smaller players that you see, like Sean and Jen from MinionWare, I mean, they had their booth there. But overall it just didn’t seem quite as large as what I remember in the past. Maybe it was just a different layout that made it appear that way too.

Carlos: Yeah, perhaps. On the other side, we’ve also gone to some very large conferences like the Microsoft Partner Conference or Ignite. These are like 25,000-person conferences, and so all of a sudden you get that magnitude, right? Not to say that, I can’t remember what the attendance was, but it was between 4,000 and 5,000. Not to say that that’s small, but you’re talking about a quarter of the size, and so it can seem smaller pretty quickly.

Steve: And then Friday, so we both had sessions. I had a session that was right around 11:00 – 12:30, so right before lunch, and that was on Backup and Corruption. And although my laptop didn’t plug in quite as easily as I would have hoped, because I spent hours working on it the night before to make sure it was going to work, it worked. It was just something to do with the video switching, switching from full-screen PowerPoint over to Management Studio. I had a little difficulty there, but it was better than the day before, and the topic was Backup and Recovery. The thing that I had a lot of fun with was the automatic restore script, where when you write your backups, it writes the restore script at the same time. So that if you ever have to recover from backup, you just find the restore script associated with the backup or transaction log file, open it, uncomment a couple of lines, and you’re running your restore. That was sort of the crowd pleaser, I think, in that presentation, and it was a lot of fun.
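
*Editor’s note: a minimal sketch of the general idea, not Steve’s actual script. The database name and backup path below are hypothetical; the point is that the same code that takes the backup also emits a commented-out RESTORE statement you could save next to the backup file.*

DECLARE @db      sysname       = N'MyDatabase';                       -- hypothetical database
DECLARE @stamp   varchar(15)   = CONVERT(varchar(8), GETDATE(), 112)
                               + '_' + REPLACE(CONVERT(varchar(8), GETDATE(), 108), ':', '');
DECLARE @bakFile nvarchar(260) = N'X:\Backups\' + @db + N'_' + @stamp + N'.bak';   -- hypothetical path

-- 1. Take the backup.
BACKUP DATABASE @db TO DISK = @bakFile WITH CHECKSUM, INIT;

-- 2. Emit the matching restore script, commented out so it is safe to store;
--    in a real solution this text would be written to a .sql file beside the backup.
SELECT N'-- RESTORE DATABASE ' + QUOTENAME(@db) + CHAR(13) + CHAR(10)
     + N'-- FROM DISK = N''' + @bakFile + N'''' + CHAR(13) + CHAR(10)
     + N'-- WITH REPLACE, RECOVERY;  -- uncomment these lines to run the restore'
     AS restore_script;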

Carlos: Oh very nice. Yes, so I was running a bit behind, and I know I had offered my laptop as a substitute if you needed it, but by the time I got in there you were underway, the presentation issue was fixed, and your corruption session had been full. But the backup session was even fuller. I mean, there were lots of people standing, and I thought, “Hmm, I’m not going to be able to get into this one.”

Steve: No, no. So the laptop problem was resolved and the presentation went smoothly from that perspective so that was a lot of fun. So then your session, Carlos.

Carlos: Yeah, so I was at the end and.

Steve: Establishing Baselines, right?

Carlos: Yup, managing performance through baselines was the idea, taking baselines and thoughts around that idea, and it was the last session of the day. I mean, obviously at that point Summit is wearing on people and everybody is a bit tired. So first I will say thanks if you were there, so again, to Will and Chris, who I know were there. Of course you were there, Andy was there, Mariano came, so thanks to you for coming out. That was my first time speaking at Summit, and I feel like I do pretty well in group presentations. I’ve given lots of them and feel at home. I wasn’t so nervous until I actually got up on the platform. And for whatever reason, that visual, like, being elevated, it was like, “Whoah.”

Steve: I’ll tell you, I personally think the last 5 minutes before a session starts, even if everything is ready to go and going smoothly, the last 5 minutes is the stress factor, right there. And then once you start talking. I mean, I could see you getting nervous ahead of time, but once you started talking it really smoothed over and everything was cool, I think.

Carlos: Right. So to help with some of that, one thing I tried, which didn’t seem to go so well, was creating a PowerPoint, some way to potentially engage the audience a little bit beforehand. So I had a little PowerPoint I call “Where in the world?”, just a photo, and then the idea was you guess where it was. That didn’t seem to engage too many people, which I thought was curious.

Steve: But there were a lot of interesting pictures that kept people occupied for that 5-10 minutes before the session started.

Carlos: Right, I guess there is that. But then once we got into it, I felt a bit more comfortable. You know, you’ve given that session several times and that’s kind of when the habits start to kick in. One thing I was happy about with the session is that we were still able to have some conversation. One of the things that I don’t particularly, I should say, care for, my style, I guess that’s a better word, is I like my style to be more interactive. I feel like if I’m the only one talking in a presentation then I’m doing something wrong. I was glad that we were able to get some feedback, really some comments, and kind of start a little bit of dialogue there.

Steve: Yeah, with that it almost seems like the larger the room is, the more difficult it is to get people to interact because they are afraid to speak out in the bigger room.

Carlos: Exactly right. That’s right, and I guess, to that point, the room wasn’t that humongous, because some of those rooms were big. I don’t remember exactly how many people would fit in there, but we probably had, I don’t know, I’m going to say 50 people in there, and I guess that was just on the cusp of people still being able to interact with each other. Once you get more than that, like that Backup and Recovery session for example, it would have been a lot more difficult to do, I think.

Steve: It was a good session, and I think it was the last session of the conference, which is a tough timeslot, but it went really well and I enjoyed it.

Carlos: Yeah, well thanks.

Steve: I thought one of the things I caught that’s really interesting is you were talking about different products and tools and you brought up DBA Tools. I forget exactly which feature you mentioned, but it was something with DBA Tools that you demoed, and then Andy Levy came up afterwards and said something about it, because he works on it; he said he actually worked on that thing, that feature you were showing.

Carlos: That’s right, so he had checked in code. To him it was kind of a cool thing that code he has worked on was shown at Summit. Only the best for the companeros, right? So it was kind of cool to be able to show his work, and then that was it. We kind of went home. We went out to dinner with Derek and I headed to the airport.

Steve: Yes, so unfortunate story at dinner. We made a new rule that anyone who chokes and needs the Heimlich Maneuver has to buy dinner.

Carlos: Oh man, that was a bit scary.

Steve: We sat down for burgers and onion rings and I took the first bite of an onion ring. It wasn’t that big of a bite, but somehow it got lodged in my throat. I wasn’t breathing. I did not have a working airway at that point.

Carlos: It was blocked. I was talking with Derek and you started kind of making a noise. I looked at you and your face was red and I was like, “Hey, are you ok?” I kind of patted you on the back and looked at your throat and I was like, “Oh man, we have a problem.” And we were in a booth, which made it worse; we were sitting next to each other and I was actually surprised at how quickly you were able to get out of that booth.

Steve: I wasn’t breathing. I had extreme motivation at that point because I had an onion ring lodged in my throat.

Carlos: Yes, so that Boy Scout training came back. That was a bit of a scary moment.

Steve: Yes, so very special thanks on that one, Carlos. I really appreciated that and I ended up buying dinner because of it.

Carlos: We’re glad that you’re still here and we hope that we never have to invoke that rule at dinner again.

Steve: Yes, I hope that it never happens again either, to me or anyone I’m dining with.

Carlos: Yes, that’s right.

Steve: But what a way to finish off a week.

Carlos: That’s right, with a bang, that’s for sure. Ultimately that was our conference. We appreciate those of you who were there for sharing it with us. For those of you who weren’t, we hope to connect with you at some other time.

Steve: Yes, and maybe next year at our SQL Trail Mix.

Carlos: That’s right, yeah. I think that’s going to do it for today’s episode. Our music for SQL Server in the News is by Mansardian, used under Creative Commons. You can reach the shownotes for today’s episode at sqldatapartners.com/summit.

Steve: Or at sqldatapartners.com/117.

Carlos: And as always, if you have thoughts about what you want us to be talking about on the podcast, you can reach out to us on various social media channels, one of them being LinkedIn. You can reach out to me; I’m Carlos L Chacon.

Steve: You can get me on LinkedIn as Steve Stedman and we’ll see you on the SQL Trail.