I chat will Louis Davidson about database design and get his thoughts on some strategies about how to go about making your design and what tools are available to help you. It may surprise you that, according to Louis, the data type is not the most important component of database design. I hope you enjoy the show.
Transcription: Database Design
As always, we are interested in your feedback and you can leave that on iTunes or on Stitcher. If you would like to reach out via twitter, I am @CarlosLChacon. Please be sure to check out all the show notes at sqldatapartners.com/podcast and as always, Compañeros, welcome to the show.
Carlos: Today our guest is Louis Davidson. Today we are talking about logical and physical database design. Louis has been in the IT industry for about 18 years, as a corporate database developer and architect. The majority of his experience has been with the Microsoft SQL Server in every version that has been released since 4.21a.Now I didn’t take the time to look that up, but I have no idea when version 4.21a came out. I’m going to take a guess that it’s been a while ago.
Louis Davidson: It was a while ago.[laughter]
Carlos: Currently he is the senior data architect for the Christian Broadcasting Network. Louis hails from Nashville Tennessee and he’s claimed fame as he can put his pants on two legs at a time. Louis, welcome to the program.
Louis: How’s it going?[laughter]
Carlos: If you want to connect with Louis, he’s on Twitter @drsql and he blogs at SQLblog.com. Today Louis like we mentioned we want to talk about database design and this can be a big topic. One that I’m not sure that a lot of our data professionals are paying as much attention to as maybe they should.One of the reasons I wanted to have you on the program today was you’ve written a couple of books on this subject, the most recent being, “Pro SQL Server 2012 Relational Database Design and Implementation.” You even invited Jessica Moss who hails from Richmond Virginia.
She’s near and dear to our community as our heart to write a chapter in there about reporting design. I can only assume that bringing those books together is quite a bit of work?
Louis: It’s not easy. It’s really hard. I had lots of different co-writers. I really liked Jessica.
Carlos: Very good.
Louis: Some others and I actually invited her to write more, but I couldn’t get her to do it. You really need more than one chapter on reporting design if that’s going to really inform you.
Carlos: Very good. That book is available on Amazon.com. It’s also available from Apress if you’d like to go check that out. Let’s start off with talking about those two items, right? Logical versus physical design, what those differences might be, and why folks might want to use both.Now my experience is; done some development work for a dot com, right, for a start-up, the business folks seem to understand better the logical model, and by the logical model I think we mean all the words are spelled out, trying to save syntax or space, kind of talking about big picture stuff, and really those containers.
Louis: Yeah, and let’s even start before that.
Louis: There’s a conceptual model, is probably the best place to start, and there’s a lot of discussion about what that means, and whether it’s just a napkin design or if it’s something more formalized. I like the definition that it’s basically tables and relationships. You sit down with your user, or your documentation, where ever you’re getting your design; your requirements from, magical requirements that everybody has, beautifully documented.
Carlos: Right, that’s right. That’s exactly what you want.
Louis: [sarcastically] Yeah, it’s in like two or three pages of prose someone’s put together for you nicely, and then the Easter Bunny brings you lunch, because all sorts of magical things are occurring here.You sit down with somebody, and you say, “What do you have?” “We have customers, and they order products.” You write a little box, and you write “Customer,” and you connect it to “order” and you go OK. What does that mean? How do you do that? How do they pay for it?
Can they pay for it with multiple ways? Can they do this? How do you invoice people? How do you ship people? What do you make, or what if it’s partial shipment? You start drawing that out, until you have this picture of what the database is going to look like, you know 80% right?
Louis: You test it, you have your requirements, you have your scenarios that customers want to do, and you say “Customers needs to order three products, this way, and they need to pay for it with two credit cards and a cheque”, and you just say “Can I do that?” OK I would put this in this table, this in this table, this in this table. So once you run through all these scenarios, you have that. You said wireframe earlier, you have that frame right?
Louis: For how the database is going to look, so once you get to that, you then go to the logical model, and if you’ve done a really, really great job at doing the conceptual model, and working with the customer, the logical model’s easy, it’s just decorating it with attributes about things you want to store.You have a customer, how do you know their name? They have first name, last name, middle name? Oh well, only Americans have that one, Only Americans have that one, but that’s our standard and there’s other standard around the world.
There’s different ways you would do that. Some people might have a table for each name, or you might have 15, what the customer wants to get you start modeling that out and start dealing with the details, and that’s where you get a lot of annoyances with the people who want to build and the people who want to see something is…life is messy.
And all those messiness are going to come out sooner or later.
Carlos: I think one important concept that you mentioned there was the scenarios.Too many times we do maybe try to get in there and start throwing some structure around some things, but being able to understand how a user is going to interact with this application, all of the…maybe not all but at least 80% was what you…the 80/20 rule that you mentioned.
Here are the major ways in which this person is going to interact with them and the things that the business is going to want either capture or be able to report on, describe, offer to that end user and that will help dictate how that goes…
Louis: …because guys who are building this are starting to think about an architecture and if you’re used to our procedures one of the things I might do is go to those scenarios and say user clicks web page, web page causes this store procedure.I don’t necessarily have all the tables designed or ready to go or anything, but I know we need an interface that looks like this and it needs to have these answers. When you have all the scenarios that we’re going to implement, you have all the interfaces you need to build, everything else is just, details.
Carlos: We have a little bit of a chicken and egg scenario here a lot of times and I know a lot from the data professional side or those who mostly deal with data, and then we have those the writers, the developers and then the business, they kind of all each have their unique mindset of how they want to start putting all this together.Again as being a data person I thought or had the opinion at one time that, yes we should get all of the data structure complete before we then pass on or engage the business or the developers because once it kind of gets out, once the cow gets out of the barn, you can’t get it back in that thing.
Louis: Wouldn’t that be nice? OK, here’s done. Then we move on the next project, you guys just implement my design and you can’t do that. The argument between the people that have the waterfall myth, I find horrifying, in some respects because other steps of the waterfall method are necessary.You can’t build something if you don’t know what the requirements are and do a little design. Obviously we can’t spend too years designing something either, so we do little bits and pieces along, but you still need to think about where you’re going or you do end up…because nobody destroys, it’s as bad as building a house.
You want to build a house, you don’t start with one bedroom and then add a bedroom and then add another bedroom, and then now we need a bathroom, so you add a bathroom. I’ve seen houses that are built like this and they’re a nightmare.
The walls are all a different shade because there’s something going on, and so what ideally you do is just tear down the house and build another one. You can’t do that because of the state of the data and the code.
We already had that code written, why would we replace that? Just add another table, add another or that, and so you end up with this mess because we didn’t realize we’re going to have more than 100 customers. We didn’t realize that they needed to order multiple products.
When we built this thing we only had one product so we didn’t have to worry about other multiple product line orders or something. You can go too far with that, because you might also want to say, “Maybe one day we’ll have thousands of products and that we’ll need to be able to order thousands of products on a single order.” Nobody does that.
There’s a reasonable real limit to what people do. You have to balance the future versus getting something done, but there’s the level in there that’s the right level, that feels right, you don’t have to completely rebuild everything every spread if you are using agile or whatever methodology you’re using.
Carlos: I think it’s a little bit more of an art than a science. I think the important thing is to get some agreement right from the different groups that you’re willing to at least move forward with these things, but I think as someone who’s responsible for design, you always kind of want to be looking for the growth or for the new opportunities or new requirements that might come your way and don’t paint yourself into a corner so to speak.
Louis: It’s a lot like sports. Let’s take basketball. If you have five great players and none of them are working together, you’re going to lose. If you have five mediocre players playing those five great players, but they’re all working together on the same page. And they’re thinking the same way, they’re going to do a lot better.The real problem I find is every DBA makes fun of developers. Every database developer makes fun of the developers and back and forth, “The DBAs are evil and we need to go around them and get things done.” If you have people who work together, “I know what to do with the database. I know how to make it work for you so that you have to do less work.”
While they know how to make the UI do things. I don’t want a bad UI. I don’t want to go back to the clunky days where we had these green-blue screens that had a bunch of…And if you had five tables, you had to go to five different screens to get to the data.
Nobody wants that. Just because you can have two addresses, doesn’t mean you have to shove that to everybody. 99 percent of your customers have one address. Build a UI for that, you can’t not build through the second address if you allow it as a business.
You can’t have to have some really horrible method of going to the back end and entering it, and then every report’s messed up. I’m bringing up reports because that’s the number one reason why we store data. It’s to report it.
Carlos: Yeah, get it out and report on it. It’s like we were saying…
Louis: We just take cash and…Give us your product, give us some cash and move on. [laughs]
Carlos: It’s interesting that I get that idea, which I agree with, is the siloed nature of our development teams. They can be very territorial. Instead of sharing and passing the ball, to use your basketball example, they want to hold it. They’re like, “No, this is mine and you can’t have it.”There’s probably a little bit more soft skills that are needed in the database design than, “Here’s the ABCs of how to put that together.”
One of the components that I’ve seen, again working at a dot com was, before I wanted to have all the data, have all that completed. We used the word “wire-framing.” I think of wire-framing as drawing up your application on a napkin type. You get the business has a little bit of input to say, “OK, I think it should flow like this,” or, “This is what would happen when they click on this screen.”
A lot of that comes because we don’t necessarily have our scenarios well-defined. Doing a little bit of that wire-framing with the screening and what it would look like. Putting the data together at the same time helps things move forward and I think it also helps force the teams to talk to each other, to work together.
If being a data modeler is not my full-time position, I’m a DBA, or, I’m even a developer. What are some things that you as a database developer or database design modeler might recommend that these folks know or become a little bit familiar with? So they might better contribute to the application for being better?
Louis: I always suggest people understand normalization. I know it’s a boring topic and it sounds arcane. It’s been around for longer than I’ve been doing databases. If you understand that and you get a taste of why databases are built the way they are, you’ll probably do OK.You’ll know that storing five bits of data in the same row, that should be in five different rows. Is going to mess you up, is going to be a problem. To the way you work.
Carlos: When you want to get it back out, like you mentioned with a Georgia reporting.
Louis: Yes. I always like cards’ 12 rules, the card rules for the database, to just talk about. That’s one of the things I do when I do a pre-count on this, on the database design subject. You talk about what the idea is behind building databases.There’s two fundamental properties that I hope I’ll get right here because I’m pulling this out of my head. One is logical data independence, and that is, I shouldn’t have to know anything about the data, other than what SQL can tell me. In physical data independence, I shouldn’t know the physical attributes of the data.
I should just be able to ask it. At one level, you should be able to write an application that runs on the Express edition or the Enterprise edition of SQL server and they both just run, depending on how much data you have. There are some features in enterprise edition that it doesn’t quite fit, maybe partitioning.
Most of the things are physical attributes that ideally you should be able to change it run-time. If you were to not have any hands, or any of that stuff and that’s your primary goal. Then you’re in much better shape than if you have…If you thought like a program. I was writing a cursor today to run through a procedure, run this procedure that’s built.
I was thinking, “This is so much easier in some respects than writing SQL because it’s iterative, I get to see each row, every row I was putting out, I was sticking it out to a bunch of variables and I’m thinking, “This is why people do this because it’s easy, and it’s natural as a programmer.”
It was something I couldn’t write in regular SQL because I had to call an interface server procedure. If you think like that, you sort of write like that. Really understanding that is the power of SQL. To be able to put things in one statement and have it do millions of things per your one statement is a fantastic thing.
Understanding how SQL works and why SQL is written the way it is. Understanding that while SQL server lets you do things in many different ways, they keep adding features to the product that say, “Let’s make it easier for the cursor. Let’s make it easier for people to do this, let’s put documents in there in a certain way.”
That’s not the fundamental choice to go for a really normalized database first. There’s things you can’t do in that paradigm. Sometimes, it would be better to just use Excel, sometimes it’s probably better to use no SQL.
Don’t try to use SQL Server for everything. Understand what it is your target is and how it works and the best way to use it. Just learn everything and do it good.
Carlos: [laughs] No problem, I know that’s been a bit of a challenge. As an SQL Server guy, if you will, I think SQL Server should be able to do it all. I have learned, particularly if you mention no SQL. That big data approach…That there are instances where SQL Server’s not the right place.I think this kind of goes back to our team approach is that if you can get a few more heads into the mix and those with some different experiences then you’ll be able to make the best of that judgment there. Ultimately I think we’re doing all these database design, we’re trying to hold hands with everybody, sing kumbaya.
Ultimately, a good database design will lead to good performance, a good application. Or at least, an application that we can tune a little bit easier. Is that…?
Louis: Yes. Once you have a properly normalized database, if you have one table for every concept, generally adding information about a concept is simple and normal. Or we discover that customers have such and such an attribute.If you have to go in and say, “Oh, we have a new concept,” if it’s really a new concept. Then adding a new table makes sense, adding new screens and changing your business makes sense because it’s a new concept.
If it’s not a new concept, and it’s just something you didn’t think of like, “Oh, customers have more than one phone number? Oh, that’s interesting, we didn’t even think about that.”
Maybe you may have said, “The rules said, ‘No, don’t do it,’ this is exactly how we want it, it would be more problem.” If it fits the requirement, and it’s too much to implement, then OK, fine. That’s a change in business. You shouldn’t be smacked up with just little changes to your process. This is true for any design.
Carlos: What happens when you, and let’s use that telephone example just because that’s fairly straight forward, what happens when you come across that instance where, “Oops, I didn’t know about cell phones.” [laughs] “I didn’t know I could get a number through Skype.” And these changes come…
Louis: Then you’re stuck; you want to say, “OK, now we have a customer, we should have a customer phone number table.” A lot of times you end up in a compromise, and they go, “No, we can’t build another table, just add another phone number column.” OK great. Now we want to want them to have three phone numbers. OK, great.We’re allowing them to have one home phone, one cell phone, one fax number, OK great. We can put those. Then they go, “No, you know, sometimes people have two phone numbers.” Now you need to make some data about that.
You end up with phone number one, phone number one status, phone number on type, phone number one this. You end up with fifty, sixty columns out there, we actually had this in a database we work on. It was a third-party thing, with email addresses. This is this email address, and it’s this status and this status. It does this and you can use it for this purpose but can’t use it for that purpose.
It just becomes a mess instead of having just one table that it kind of flows into. Again, you should be ready to…If you’re going to change the business, be ready to change the database, change the data structures.
Carlos: I’m curious to get your take. Fortunately, I think the business…A lot of times we get into a jam because they see the level of effort as the percentage change of a screen, “Well, it’s just another button,” right?
Carlos: How hard can that be? Any ideas or any thoughts on how you can approach the business and say, “You know, what you’ve requested here is a fairly large change. We’re happy to do it, but we’re going to have to create the phone numbers table and makes some changes there.”
Louis: That’s tough. Usually, I’m thinking…The business users I can build that table, I can have that thing changed. I can have those structures set up in about a couple of hours. Or it may take a couple of days, but it won’t take weeks.
Carlos: There’s two components there. One is the data component which…Sure, we can add the table, move it around. But now we’re putting the developers in a pinch because they may have to change one of their processes or store procedures or things like that, reporting.
Louis: Yeah, I’m always wondering why it takes long truthfully. A lot of times just because you have…It’s just like a bad database. If you have a poorly-built infrastructure for your application, you’re going to have the exact same problems as a poorly-built database. It’s unchangeable. It’s too rigid.
Carlos: My thought there is kind of going back to the team approach. You’ve got to make sure that you’re not trying to throw anybody under the bus. [laughs] That you state, that “This is what I’m seeing, I can make these changes, but I also think it will affect this other group.”Instead of saying, “I can have my changes done by the close of business. I’m not sure why they can’t have it done by close of business, but I can do that.” I think, again you don’t want to silo yourself to saying that, “Yes, I can do this”, not including the whole team. Is that fair?
Louis: If I’m in a happy place, of never having to actually be in those conversations.
Carlos: You must just be in a very happy place Louis. [laughs]
Louis: Those are tough conversations, the ones we have to deal with either the politics of it all…I really hate politics, you have to deal with the timing and the…
Carlos: When everything’s going to happen.
Louis: The money and all. I’m a developer, an architect/developer. I’m not a politician.
Carlos: Getting back into some of that design, let’s talk a little bit about how users of SQL Server can start looking at their logical designs, their data models. In SQL Server, we have database diagrams. My question is, are you using database diagrams or you’re using a third-party tool?
Louis: [laughs] I would never use database diagrams.
Carlos: [laughs] Never use database diagrams, Oh boy.
Louis: I’m transitioning from one database diagram to another. We’ve had a bit of time getting it set up and using the high-Faluting…Where you could save it in a back-end repository.We’ve had a little issue in getting that setup. I’m actually building a database right now using what I call the Paul Neilson method. I don’t know if you’ve ever met Paul.
Carlos: No, I’m not aware…
Louis: He’s used to be big in the community. He wrote the “SQL Server Bible Series” up to 2008. He was the lead on those up until 2008, I think it was. That is just, build one script, have a bunch of drops at the top, a bunch of creates come next, load some data, have some tests go, all in one script.The database diagrams don’t show you enough and you don’t feel…I’m not saying that people can’t make use of them. You can’t connect to the database you’re using because they’re it’s a live editor. The worst part about them is, everything you want to show, you have to put in there.
When you use a proper data modeling tool, you can put all sorts of stuff in the model that’s never going to be implemented. Documentation for starters, if you’re stuck in a place where they think foreign keys are bad, and they do exist.
Or maybe you’re using the newer memory stuff, and you can’t actually implement them. You can still put them in the model and see that they need to exist, and know you have to write code so you can make that change.
You can put all that stuff in the model and see it, and edit it, and refine it over and over and over until you get to the place where you want.
Carlos: If I’m hearing you correctly, if you’re serious about modeling or that’s something that you’re doing more and more of, you will have to use a third party tool. You mentioned you were changing…I use the Embarcadero Architect’s Tool. That’s my tool of choice at the moment.
Louis: I was trying no to be.
Carlos: [laughs] No, we can throw it out there, right?
Louis: That’s where we’re actually going to because they have a, SQL Server centric product. That is…
Carlos: Maybe that’s why I’ve been so happy with it. It’s worked out really nice for me. Not cheap but it does the job. My impression of database diagrams, and you actually went a little bit further that I’ve thought, was more of, If you don’t know what you’re data model looks like, you can use database diagrams to at least put some pieces together.Become more familiar. Either you’re a new developer on the team, or heck you don’t have any documentation to this point so you might as well throw something together, you can use this diagrams for that.
Louis: I mean it’s a solid viewer of the structure, but it’s not what I would use as a design tool.[crosstalk]
Carlos: …The viewer is probably the only way that I use Database Diagrams now. If I can’t reverse-engineer, into tool like Embarcadero.
Louis: You can use it on a clean database. Just generate the structures there, see it. You could use it that way if you wanted to. Not everybody has access to those Embarcadero tools because some of them are not exactly free or anything.
Carlos: They’re not cheap, no. I think your point was valid, we should reiterate that again. That the database Diagram tool is a live editor. You make a change to that. You’re changing your database.You can’t do that in your production environment unless you…That’s just the way you roll. [laughs] You have instant access to your developers like, “Hey, by the way, just made this change, FYI.”
Louis: That’s the DBA thing. Every DBA has done it at least once I think.
Carlos: That’s true. In a previous episode, I did admit to obfuscating changing some code around in production that I’d been working on at a test. I actually never made changes to SQL Server using database diagrams at a broad environment. Luckily, I guess I managed to figure that out in test soon enough, luckily for me.
Louis: That change manager would yell it’s all anyways because they add some objects into the database, [laughs] “What are these new objects in the database? Who did this?”
Carlos: “It was me probably. I hope you can’t audit that back to me.” [laughs] Thanks, I do appreciate the conversation Louis, again. For those who want to take a deeper dive into the “Pro SQL Server 2012, Relational Database design, and Implementation” book available from Apress or Amazon.Before we let you go, I wanted to ask, what’s your favorite SQL tool? This can be paid or free. Why do you like it and how do you use it?
Louis: The one that’s changed our lives the most has been SQL Source Control from Redgate. I always had a problem trying to figure out how to check things in. How to do a really good source control?You try to make scripts and then you check in the files. It does that for you in a way that works with the person whose used management studio forever.
You just go into management studio, you check it out, you make the changes, it sees the problems. Everything’s right there in the tool the way you like it.
Carlos: You can use it right there in SQL Server Management Studios without having to change anything or go to a different application to get your stuff. Your favorite experience as a database administrator, I guess, a story that would help illustrate why it is that you like to do what you do?
Louis: That one’s tough. I’m really that bad at telling stories or remembering what happened yesterday.
Louis: I like being in DBA because I like the ability to shape the data, and get things right. The best experiences have been working with another designer who’s building the front end, and we built these stored procedures. We built this UI, this whole system where the screens…The thing just came right off the stored procedures. We generated those out of a data modeling tool. It just, “Boom!” Everything worked with very little override.
Carlos: Very nice, very nice. Getting to work with people that know what they’re doing, and can help move things forward, always good experiences.
Carlos: To close out, if you had one superhero power, what would it be, and why would you want it?
Louis: [laughs] So many terrible answers to this question.[laughter]
Louis: I don’t know. For me, if I wanted to be realistic about it, it would be the ability to go to these conferences that I go to and actually enjoy the entire time. [laughs] But, I guess it’s not superhero because I see more than my superheroes, Caitlin, Brent, and Itzhak, all out doing these things. I don’t guess that’s a superhero. That’s just me not being old.[laughter]
Louis: Old and crotchety.
Carlos: OK. So, unlimited energy, or something like that.
Louis: Unlimited energy would be it, right?
Carlos: There you go.
Louis: Super speed, unlimited energy.
Carlos: Louis, thanks so much for being on the show. We do appreciate it. It’s been great having you. Compañeros, we’ll see you on The SQL Trail.