Episode 72: Testing Automation for Business Intelligence

In the data space we’re hearing a lot about DevOps and continuous integration, and the programmers are getting a lot of love there, with lots of tools to help them; if you’re on the data side, however, not so much. If you’re running Agile in your environment and trying to do data warehousing or other data development, we’re not quite getting the same love. Sure, there are a couple of unit testing software offerings out there, but they’re a little bit cumbersome. This episode our conversation focuses on testing automation in a business intelligence or data warehouse environment, even in an Agile system. Our guest is Lynn Winterboer, and she has some interesting thoughts at the intersection of business users, data professionals, and tools.

 Episode Quote

“I would say what we need the business users to do, if we’re on the IT side, what we need from our business users is their specific expertise and we need to be using them for things that require their specific expertise.” Lynn Winterboer

Listen to Learn

  • Where the burden of final testing should be
  • Roles for the business and the IT sides for testing automation
  • Ways to get started with testing automation
  • How to find the product owner for the data warehouse
  • How your test environments might look different than they do today
  • Testing automation is the goal, but there are several baby steps to get there
  • Why automation is not the most complicated piece of testing automation

Lynn on Twitter
Lynn on LinkedIn
Winterboer Agile Analytics

About Lynn Winterboer

“I’m primarily a business person who has also spent over 20 years working in the data warehousing world as a business analyst, project manager or being the business lead on a data warehousing project.”

Lynn is an Agile Analytics coach and trainer with Winterboer Agile Analytics, based in Denver, Colorado. She has two daughters and works with her husband on the business.

Untranscribed introductory portion*

Carlos L Chacon:               Lynn, welcome to the program.

Lynn Winterboer:            Hi there. Nice to be here.

Carlos L Chacon:               Yes, thanks for coming on with us today. Interestingly enough, we met in Kansas City at a SQL Saturday out there and your session topic intrigued me. In the data space, or I guess in the technology space, we’re hearing a lot about DevOps and continuous integration, and the programmers are getting a lot of love there, lots of tools to help them with that. However, particularly if you’re in the Agile space, or you’re running Agile in your environment and then you’re trying to do data warehousing or other data development, we’re not quite getting the same love. Sure, there are a couple of unit testing software offerings out there.

It’s a little bit cumbersome. It’s kind of hard to wrap your mind around so ultimately our conversation today is on that thought of testing automation in a data warehouse environment and then even a step further, those who are running an Agile. I thought you had some really interesting questions about it or thoughts around it and so we wanted to have you on talking a little bit about this. I guess, first just let’s kind of set the stage here and talk about why testing automation is so important.

Lynn Winterboer:            Okay. Thank you, first of all, for inviting me to be on the podcast. It’s great to be reaching your listeners. I do feel really passionate about this topic because … I’ll give you a little background about myself. I’m primarily a business person who has also spent over 20 years working in the data warehousing world as a business analyst, project manager or being the business lead on a data warehousing project. What I’ve found over the years is that, in general, the data warehousing industry doesn’t have a whole lot of focus on testing. We’re very good at things like data quality and data modeling and managing large volumes of data and managing complex transformations so we’ve got some really good strengths that lead to great business results but we also haven’t had as much support in the industry for testing as a discipline and as a skillset.

The reason that I really care about this is because it is not uncommon for the final testing to be on the shoulders of the business stakeholders, and they are the ones who know the data the best and who will do the best job of validating that what the data warehouse team has built is correct. However, it is not the best use of their time, nor the best way to build a strong relationship between the business and the development team, if what they’re catching are errors that we should have caught early on. We put a lot more burden on our business stakeholders than a lot of development teams do, I think, in my experience. You combine that with a lot of the Agile concepts that are really exciting and effective and are guiding projects really well: pulling your testing earlier and making testing, or quality I should say, something that the entire delivery cycle is involved in.

Some of the Agile concepts, like having acceptance criteria for each user story or each small requirement, where the acceptance criteria is expressed from the point of view of the business: “Here’s how I’ll know that you’ve met my need, when you can show me this.” If we can take that acceptance criteria, turn it into an actual test that everybody, the business, the developers, the testers, can agree is a good test to prove the development is complete, and then automate that, you have a really nice set of not only automated tests that prove to the business that you’ve done what they’ve asked but also regression tests that can be used down the road, which I would say is really critical to being able to move quickly and to be agile, regardless of what Agile framework you’re using.

Carlos L Chacon:               No, that’s right, and it’s interesting that you mention lessening the work of the business stakeholders because, interestingly enough, in our last episode, episode 70, we had Kevin Wilke on. We were talking a little bit about Excel and using it for the analytics piece, and we actually, from the tech side, came at it like, “Hey, the business users need to do more,” and so now it’s interesting to throw that back on us and say, “No, no, no. We have enough to do as well.” We need to figure out some way that we can move forward. I think it is interesting that we all do need to get together on the same page and put some of these things in place so that we can move faster altogether.

Lynn Winterboer:            I would say what we need the business users to do, if we’re on the IT side, what we need from our business users is their specific expertise, and we need to be using them for things that require their specific expertise. We can pull that expertise earlier by asking for acceptance criteria, and then clarifying that acceptance criteria. I did a webinar a couple of years ago where I said the most common acceptance criteria in the data warehouse team is the least useful, and that is: our acceptance criteria is that the data is correct. No duh. That’s our job, right? That is what we do. That’s what we focus on. That is what we’re dedicated to. How do we know what correct means from your point of view, Mr. and Mrs. Business Person? I do think pulling the users in early to have concrete discussions: I hear that you want this. What do you mean by that?

For example, if somebody says … I’m going to give a very simple example here just so that nobody gets lost in the domain. Let’s say they want a summary of product revenue by quarter, so revenue by product by quarter. We say, “Okay.” We have our common conversations where we say, “What do you mean by product? When you say ‘revenue by product’ is that the top three product lines the company has, or is that the 30,000 SKUs that we have?” We have that typical conversation, but it’s also helpful if we can come out of that conversation with them giving us a set of orders, for example: if you can calculate product revenue by quarter correctly on this specific set of records, I will know that you’ve calculated it correctly, because these records represent each of our major three product lines and they go down as far as all the right SKUs that I want to look at. Actually, in that conversation we might find out that it’s not just product revenue by quarter, which might be, say, three numbers if you have three product lines, three summaries. It may be that they’re looking for 30 numbers, because when they say “product” they mean throughout the whole product tree.

Coming up with concrete, actual examples is one of the most important things. Honestly, as a business analyst working in data warehousing, that’s been one of my go-to tools since the ’90s, and for lots of good BAs who work with data teams, that is their go-to tool: an actual example of what you want. If they can give us a handful of records, meaning a defined set of records … Even if it’s all the orders in Q4 of last year, that’s okay. It’s a defined set, and they can pre-calculate themselves, so this gets back to maybe the Excel comment you made earlier. If they can pre-calculate themselves what they expect that number to be for product revenue by quarter, or that set of numbers, then we have something we can test against. We can automate it. You have to know your expected result to be able to automate something. If it takes a human to look at it and go, “Yeah, that feels right,” it’s hard to automate. You can automate the running of the test, but you can’t automate the assessment of whether it passed or failed.
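The pre-calculated-expected-result idea Lynn describes can be sketched in a few lines: the business hand-calculates the answer for a small, defined set of records, and the automated test simply compares the warehouse calculation against it. Table, column, and product names below are invented for illustration, not from the episode.

```python
import sqlite3

# A small, fixed set of orders the business has pre-calculated by hand
# (e.g., in an Excel prototype). Schema and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product_line TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Widgets", "Q4", 100.0), ("Widgets", "Q4", 50.0), ("Gadgets", "Q4", 75.0)],
)

# Expected results, pre-calculated by the business for this defined record set.
expected = {("Widgets", "Q4"): 150.0, ("Gadgets", "Q4"): 75.0}

# Actual results from the warehouse-side calculation under test.
actual = dict(
    ((line, qtr), total)
    for line, qtr, total in conn.execute(
        "SELECT product_line, quarter, SUM(revenue) FROM orders GROUP BY 1, 2"
    )
)

assert actual == expected, f"Mismatch: {actual} != {expected}"
print("product revenue by quarter: PASS")
```

The point is exactly what Lynn says: because the expected result is known in advance, a machine can judge pass or fail; no human has to eyeball the numbers.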

Steve Stedman:                Lynn, what you’re describing there sounds a lot like what I’ve observed in development organizations where you have backlog grooming and planning meetings and the product owners are there and the developers are there discussing the user stories and acceptance criteria. Do you see that the business people are getting in the same room with the analysts that are in the same room with the developers that are building out the data warehouses or the BI structure and working in the same way development teams would there?

Lynn Winterboer:            Yes. That is what I’m talking about, and you mentioned the role of product owner, which is an important role; typically that person is from the business and is very knowledgeable about the business. I shouldn’t say “typically”; by requirement they are. A product owner is supposed to meet three criteria. One is the authority to make prioritization decisions, and typically that’s somebody who’s pretty high up in the business organization who has that authority, or that authority’s been bestowed on them by somebody high up. Two is the knowledge to do the deep digging to come up with, for example, the subset of records and what the expected results might be in this case. Three is the bandwidth to do all that work.

It’s pretty hard to find a person in a large data warehousing team supporting a large company. It’s going to be hard to find somebody who has the authority, the knowledge and the bandwidth to represent all the various business entities that that warehousing team supports. Even if it’s a specific project, let’s say for a finance team, a good product owner might be a financial analyst who’s worked at the company for a long time and who has the trust and support of the CFO to be making some of these prioritization decisions, or at least to be going out and getting the information on prioritization from the executives and bringing it back.

We do expect the product owner to be deeply knowledgeable in the business and the business needs; however, that doesn’t mean the product owner is hung out to dry all by him or herself. It means they are expected to bring in the people they need to meet the need of product ownership. I’m starting to talk about product ownership instead of product owners because I think it’s a need that needs to be met. Then, you have a product owner who’s that one point of contact, but they would bring into this up-front discussion the right people they need from within the business. Even if we-

Carlos L Chacon:               I think even another critical point that you made there is that a lot of times, depending on the organization, and Steve mentioned some of the steps that Agile takes to try to convey an idea, it still seems like a lot of times in the BI space they’re kind of like, “I need this report,” right?

Lynn Winterboer:            Right.

Carlos L Chacon:               They rip off the Excel spreadsheet, “I need this.” The IT folks start thinking about how, “Okay, how’s this going to look? Drop downs, drill downs,” all these kinds of things where they probably used to say, “Stop,” like you mentioned, “Let’s go back. I’d like to see you calculate this.”

Lynn Winterboer:            Yeah. You know, I actually think Excel is an incredibly useful tool for data warehousing. I know there are lots of people out there who would slap me for saying that, but if I can get somebody in the business to do a manual export from their source system and show me, in Excel, without re-keying anything, using pure formulas and referencing of cells, what they need, and that meets their need, that is a great prototype and it goes a long way toward helping us understand what’s needed. What I would suggest is that we limit the scope of the input data for acceptance criteria and acceptance tests. Again, it’s something you want to be able to pre-calculate.

I had a great customer I worked with. Our product owner at that point was a senior director in finance and, to make the story short, after lots and lots of testing and lots of going down rat holes and discovering that the rat hole was because of a data anomaly that’s not likely to occur in the future, she finally said, “You know what? Our company’s grown by acquisition for twenty years. We have a lot of weird data from the past. We might get weird data in the future, but in doing what we need to do to meet the company’s needs right now, I don’t want us spending time on stuff that’s not going to happen in the future, because we’re going to have a cutoff date and then we’re only going to be looking at future stuff.”

She came up with 16 orders such that, if we ran these 16 orders through and applied all the revenue recognition rules, of which there were many and they tripped all over each other a lot, she said, “If you can get these 16 orders to come up with the revenue recognition that I have allocated in this spreadsheet purely by formula, I will know that you’ve got 80 to 90% of the need met, and then we can go down the rat holes of the corner cases and decide which ones are important enough to tackle before we go live.” She said, “I don’t want anybody looking at any other data but these 16 orders until we’ve nailed them.” A big part of that, which is not uncommon in data warehousing, is that the business had a revenue recognition system that was 10 years old and had been built and maintained by a single individual who had just been taking orders from the business and doing the best she could, and it turned out there were very complicated rules and there were some conflicts.

There were lots of business decisions that needed to be made so this senior director of finance was able to say, “Once you get these 16 orders to flow through and give us this, then I’ll be happy and I’ll be confident I can tell the CFO we’re on track.” I think that was a great example and that was something we could automate. We were able to automate those tests so every time we put a new rule in, we could run the test suite and in a second know whether we had broken a prior rule or not.
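The "16 orders" regression suite Lynn describes can be sketched along these lines: a fixed set of orders with business-supplied expected values, re-run in full every time a new revenue recognition rule lands. The function, order fields, and rule below are all invented for illustration.

```python
# Hypothetical revenue-recognition function under test; on a real team this
# would be the warehouse transformation, not inline Python.
def recognize_revenue(order):
    # Toy rule: subscription orders recognize half of the amount up front.
    if order["type"] == "subscription":
        return order["amount"] / 2
    return order["amount"]

# The business's pre-calculated expectations for the fixed set of orders
# (Lynn's real example had 16; two suffice to show the shape).
FIXED_ORDERS = [
    {"id": 1, "type": "one-time", "amount": 100.0, "expected": 100.0},
    {"id": 2, "type": "subscription", "amount": 80.0, "expected": 40.0},
]

def run_suite(orders):
    """Return the ids of any orders whose calculated revenue misses the expectation."""
    return [o["id"] for o in orders if recognize_revenue(o) != o["expected"]]

failures = run_suite(FIXED_ORDERS)
print("regression suite:", "PASS" if not failures else f"FAIL on orders {failures}")
```

Each new rule change re-runs `run_suite` in well under a second, which is exactly the "did we break a prior rule?" check Lynn's team automated.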

Carlos L Chacon:               I was going to say that that’s where the next kind of transition or difficult part is, at least when I’m working with data warehouse teams: limiting them to a certain amount of data. It seems like they’re always saying, “We want a refresh of this data so we can do our testing, because if we can’t have all the data then we can’t do our tests.” It sounds like if you can identify what that good data looks like, then you can go off of that and then, like you said, look at other scenarios further down the line.

Lynn Winterboer:            We do … we still need that good set of data further down the line and so another challenge that I think data warehousing teams face is that our testing environments, typically there’s a couple of problems with them. The first one being they’re typically shared testing environments so if you’re doing an ERP implementation and you’re building reports for that ERP or off of the data in that ERP system, your testing environment from a data warehouse perspective will typically be the ERP’s testing environment.

If you think about people implementing an ERP, they’re testing out workflow processes. They want to know that it can handle all these different workflow scenarios, and so they’re basically creating a bunch of junk data to prove or disprove that the workflow works. That’s not useful in data warehousing. We need what we’ll call quote-unquote real data, and junk data doesn’t do us any good. It’s not junk to them; it serves a purpose for them, but we’re at cross purposes, so that’s the biggest problem I see. It’s not uncommon for large ERP systems or CRM systems to not want to refresh the data in their testing environment more than once every six months or once every year, because they’ve got all this testing in progress and it’s going to interrupt their testing.

First of all, I think we need separate testing environments. We need one that has a really good set of data that doesn’t change and that is reliably realistic as to how the business data looks, and the ERP team needs a different environment. They need a much smaller set of data; they just need to play around with creating records and pushing them through. The other thing is our testing environments tend to be much smaller and not configured like our production environments, and that causes us a big problem: if they’re much smaller, we can’t put a whole lot of data in them, and if they aren’t configured like our production environment, then we have all these potentially huge problems when we try to deploy to production, things we couldn’t catch in test because it’s a different configuration.

Then I think the third big problem for us is what kind of data we can actually pull into our test environment. With data breaches and data security being so critical these days, if your test environment is not in the same security configuration as your production environment, which it typically isn’t, you’re often not allowed to pull quote-unquote real data in to test because of PII data or financial data or HIPAA regulations. That makes it really hard for the data warehousing team to test on quote-unquote real data, because we’re not allowed to move it.

Some teams are deciding to … The teams who get the magic wand and can get whatever they really want are getting basically a copy of their production data warehouse, behind the same firewalls and with the same security standards as the production systems they’re drawing from, and they are able to have, whenever they want, a refresh of either a subset or all of the data from production. That’s pretty hard to get, financially, but I’m hoping that the benefits of test automation, which is one of the key components of continuous integration, will start to resonate within the executive world: if we’re going to support our software development teams in continuous integration and configuration management and code management and test automation, then we want to support our data teams as well. It might cost something we weren’t expecting, but we’ll see where that goes.

Steve Stedman:                Okay. Then at that point you’ve got your test data, whether it’s a complete replica of production or some subset of close to production or whatever it may be there, and you’ve got your acceptance criteria that you talked about and then you have the 16 orders that you mentioned that if these orders go through, that’s representative of everything. Then what is the path from there to really automate that and get it into a process where it doesn’t take someone to manually go through and … I don’t know, manually test that?

Lynn Winterboer:            Manually test it … Actually, that part is the easy part, frankly. Right now you have a business person saying that, but the demonstration that you saw, Carlos, at SQL Saturday, came from a group of my data warehousing friends. We were sitting around having coffee one day, talking about life and stuff, and I started asking them, “Come on, guys. Why is it so hard to do test automation?” They all looked at each other and said, “It’s not that hard. We just don’t do it.” I said, “Really? Okay.” One guy had done some test automation in the past and he’s like, “The mechanism’s pretty simple.” I can honestly say I think the reason a lot of data warehousing teams don’t do it is that we’re a very vendor-driven community, and have been for years, in the sense that …

The example I give to demonstrate this is: if we’re looking to hire an ETL developer, and the language that ETL developers typically use is SQL, we don’t put an ad out for a SQL developer like other teams put an ad out for a Java developer. We don’t do that. We don’t even put an ad out for an ETL developer, typically. We put an ad out for an Informatica developer or a DataStage developer or an Ab Initio developer, which are products that these developers use to do ETL, which is extract, transform and load. If a company’s looking for an Informatica developer and you’re a DataStage developer, they don’t want to talk to you. You won’t even make the first cut.

I have several friends who do development in ETL tools across a variety of tools and they say, “It’s not that hard to cross tools. They’re not that different,” however the industry still isn’t there. I think our vendors have done a good job of becoming very sticky. They have differentiated themselves from each other. They each have their own semantics. It was by design and I’m not upset with the vendors. I think they’ve done their job well. I think we need, as an industry, to start challenging that and stepping back and abstracting it a bit and saying, “In the end it doesn’t matter what tool you use. It’s the skill to know how to use that tool to do something as efficiently and effectively as possible.”

That is why there are very few tools out there that vendors have delivered for testing the data warehouse or for doing test automation. There are tools out there that talk about testing the data warehouse, but they’re really just where you store your tests so you can manually run them and record the results. For test automation there are really only a handful of vendor-created tools available to data warehousing teams. They’re pretty new to the market. It’s really a blossoming market.

Carlos L Chacon:               Right.

Steve Stedman:                Then those tools you’re talking about there, those would be created by specific vendors like Informatica for their specific product. Is that what you’re saying?

Lynn Winterboer:            Yes, Informatica does have some testing capabilities built into its tool, which is great, and for teams who are using Informatica and buy that additional module, I think that’s certainly worth looking into. I don’t know a whole lot of teams that are actually using it; maybe that’s just lack of exposure on my part to the right teams. Frankly, I don’t know a whole lot of teams that are doing test automation, to be honest. The other tools that I think are worth looking into: one is called iCEDQ, I-C-E like ice cube, D-Q for “data quality”. Another one is called Tosca, by a company called Tricentis. They’re out of Europe, and I have European data warehousing friends who are using it to test their data warehouse, so I know for sure that it’s being used and applied that way. Another company is called QuerySurge, and then another one is called Zuzena, Z-U-Z-E-N-A. Those are a couple of tools that I’ve been learning about lately that teams might want to look into.

Back to your question, Steve: “What does it take to automate it?” The automation piece is not the complicated piece. You really just need to define the test, store that definition in a database, store the test records, meaning which specific records you want to run through for that test, store your expected result for that test, and then have something do a comparison between what actually happened when you ran the test and what the expected result was, and record the result.

Then you have a tool that does the comparison: what actually happened when you ran the test versus what the expected results were. Typically we record all of the results, passes or fails, but it only notifies a human if it’s a fail, and there may be certain tests where you say, “I don’t really want to know right away if this one fails, but I want to see it in the log,” or something like that. For other ones you say, yes, you want the developer who last checked in the piece of development to get a notification immediately when something fails. It’s really not that complicated, the mechanism. The hard part in test automation, frankly, is deciding what acceptance criteria makes sense to the business and what the examples of records are that prove or disprove it. Not only do you want to say, “If you can run these five records through and they all calculate correctly, I’m happy,” but also, “Here are five others that shouldn’t be included in the query, and we want to make sure they’re left out,” or something like that. You have to do your positive and negative testing.
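The mechanism Lynn outlines, a stored test definition plus records plus expected result, compared against the actual result, with every run logged but only failures notifying a human, might look roughly like this. All names and the notification placeholder are assumptions for the sketch.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

def run_test(test):
    """Compare actual vs. expected, log every result, notify only on failure."""
    actual = test["run"](test["records"])   # execute the calculation under test
    passed = actual == test["expected"]
    logging.info("test %s: %s", test["name"], "PASS" if passed else "FAIL")
    if not passed and test.get("notify_on_fail", True):
        # Placeholder for emailing/chatting the developer who last checked in.
        logging.error("NOTIFY %s: expected %r, got %r",
                      test["name"], test["expected"], actual)
    return passed

# Positive test: these records should be included and sum correctly.
positive = {
    "name": "q4_widget_revenue",
    "records": [100.0, 50.0],
    "expected": 150.0,
    "run": lambda rows: sum(rows),
}
# Negative test: excluded records must not leak into the result.
negative = {
    "name": "cancelled_orders_excluded",
    "records": [],               # cancelled orders filtered out upstream
    "expected": 0.0,
    "run": lambda rows: sum(rows),
}

results = [run_test(t) for t in (positive, negative)]
print("suite:", "PASS" if all(results) else "FAIL")
```

Note the pairing of a positive and a negative test, matching Lynn's point that records which should be left out of a query need their own check.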

That testing mindset and the skillset that testers bring into a team is something that we really need to embrace in the data warehousing world and get them to come into our team. We have to have patience with them because they don’t understand data versus GUI and they have to have patience with us because we don’t understand testing discipline and strategies. If you work together, you can really come up with some powerful things.

Carlos L Chacon:               I think that’s the key point. You said, “Oh, it’s not that difficult.” I think it’s the coming together, because the IT teams can definitely build the tests, but they need to know what the rules are, right? They’re going to get the results from the business, and they can’t get that rule from a, “Here’s my report, I want you to build it.” They’ve got to understand it. It’s almost like back to the beginning, what we talked about. It’s, again, to use I guess another Agile word, cross-functional teams coming together and sharing information. I think that is what will get us headed in the right direction and, of course, the tools and how you go about that, you can fight that another day.

Lynn Winterboer:            Yeah.

Steve Stedman:                It’s almost like, at that point, it’s more of a culture shift for an organization to break down walls and have that cross-functional capability there.

Lynn Winterboer:            It is.

Steve Stedman:                That’s almost more difficult than the technical piece sometimes.

Lynn Winterboer:            Oh, I think it is. I really do, and when we talk about Agile, we talk about even a single data warehousing team or a BI team having cross-functional skills within a single team of, say, seven people. What we mean by that is we’d really love to see these data teams evolve to where, within a single group of, say, seven plus or minus two people, you’ve got the abilities to work with the business and do the analysis, then do the data analysis, the data modeling, the ETL, the BI metadata work or the cube work, the BI front end, and the testing and quality assurance. It doesn’t mean you need a human that does each of those things. It may mean that you grow your team so that a single human has skills across multiple categories there.

Then we could extend it even further and say a really agile organization is going to look at things from the point of view of a value chain. Let’s just say supply chain management; that is one of the typical value chains. What we would do there, if we were going to be a really agile organization, is have the data professionals, the people with these skills from the data and data warehousing perspective, be closely aligned with the people in the ERP system, who are also closely aligned with the people in any other system that plays in that value chain.

What I’d like to see is where data warehousing and business intelligence and reporting and analytics are part of the value stream of a business process. They’re not an afterthought of, “Oh, yeah, we also need reports.” They’re not a red-headed stepchild of IT. They’re really a valued part, so as you’re building out the software to meet a business need, you’re also looking at building out the reporting very soon thereafter, in small chunks.

Instead of an entire ERP implementation, for example, you might start … I’m just going to pick this one out of the blue. You might start with shipping data and say, “Okay, we’re going to make sure our shipping components work correctly in the ERP and meet the business’s need, and then we’re going to do our shipping reporting before we move on to order management,” for example. However your order is that you do these things, it doesn’t necessarily have to be the order in which the value chain flows. It might be the order in which the business rules are more clearly nailed down by the business. You tackle that first while the business is trying to figure out the business rules in the more complicated pieces of the value stream. That’s where I’d really like to see it, and I’d like to see that sliced very thinly, to run from a business need through front-end software through data warehousing to the end, before they move on to the next little slice. That’s my dream.

Steve Stedman:                Okay.

Carlos L Chacon:               We should all dream big there, right?

Lynn Winterboer:            Yeah, exactly.

Carlos L Chacon:               To put it all together and see some of these steps: I think, when we look at implementing data warehouse test automation, the first component is cultural, right? Getting together with your analyst, or whoever’s giving you the spreadsheets and saying, “This is what I want this report of,” and having them help you identify the rules and the process of how they get to what that report looks like. We can then take that set of data and put it into our test environment, if you will, and that’s what I’m going to run my development against. It doesn’t necessarily need to be a whole copy of my data warehouse, because I know the rules that are going to be in place and I can test against that core set of data. Do I come up with the same numbers I’m expecting? Yada, yada, yada, why not?

Then when I move on from there, I can do the automation and run it through the build. Then I go to the next step, and when I’m ready to actually present that to the user … Let’s just say we’re going to the next quarter. Because we’re sitting here in the fourth quarter of 2016, I’ve got data for the third quarter of 2016. When I start looking at fourth quarter data, I’m going to apply those rules, show that back to the business, and then they say, “Well, this number looks a little bit odd.” Then I can go and say, “Well, I’ve applied these rules. My third quarter numbers still look good, so what is it? Do we have a change? Is there a new product? Is there some other component or rule that we didn’t take into account?” Then we can kind of start from there. Is that a fair kind of-

Lynn Winterboer:            That’s exactly it.

Carlos L Chacon:               Yeah.

Lynn Winterboer:            Yeah, I think it is. I think what’s good about the process you just described is that if you’ve already validated you’ve got Q3 running right and it looks good, and then something looks funny in Q4, you’re right. What you’re doing is narrowing the range of areas you would look for an issue, so it’s going to help with troubleshooting to say, “We’re confident the math works correctly. What else could be skewing this? Is it a change in the business process? Is it a change in our source system, how it records things?” Any of those things, it’s going to narrow down the scope of what you have to go troubleshoot, and that is really, really useful to a data team and to the business people, because we can have confidence in certain things and then say, “Here are the unknowns that we’re going to go investigate.”

I think … Then, with regression testing as well, if you have regression testing at a very micro level … Not too micro, but unit tests, and then the next level up might be acceptance tests. If your regression test suite is a combination of unit tests on specific transformations, for example, and then acceptance tests that might be a little bit broader and more business-facing, and you’re automating those, you’re going to know pretty quickly what broke and why it broke, because it’ll be down to … Let’s say even down to a single field level. “This field is no longer flowing through the system the way we thought it should continue to. Nobody intended to change that, but now this field isn’t behaving as we expected.” It’s going to save you a ton of time in troubleshooting, and it also really …

I think test automation eventually … Steve, you mentioned the cultural changes. Test automation’s going to help reduce the amount of finger-pointing, or the anxiety people might feel of, “Oh gosh, something’s broken,” and suddenly everybody’s covering their own bottoms over it. If you’ve got test automation and the tests you’ve automated are pretty micro-level, you’re going to know exactly what broke, and there’s no need to be guessing or speculating or finger-pointing. You’re just going to go fix it. You’re going to go solve the problem. I think it does help take the emotion out of the reaction to testing results.
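The micro-level, field-by-field tests described here might look something like this minimal sketch. The `net_amount` rule and the baseline orders are invented for illustration; a real suite would cover each warehouse transformation this way against a small, hand-validated dataset (the “Q3 that we know is right”):

```python
# Hypothetical example: a micro-level regression test on a single
# transformation rule. When this fails, you know exactly which rule
# and which field broke -- no finger-pointing required.

def net_amount(order):
    """Transformation under test: gross minus discount, never negative."""
    return max(order["gross"] - order["discount"], 0.0)

# A tiny, known-good dataset with hand-validated expected values.
BASELINE_ORDERS = [
    {"id": 1, "gross": 100.0, "discount": 10.0, "expected_net": 90.0},
    {"id": 2, "gross": 50.0, "discount": 60.0, "expected_net": 0.0},
]

def test_net_amount_field():
    # Each assertion pins one field of one order to its validated value.
    for order in BASELINE_ORDERS:
        assert net_amount(order) == order["expected_net"], order["id"]

test_net_amount_field()
print("all field-level checks passed")
```

Wired into a build, a suite of tests like this becomes the automated regression safety net the conversation describes.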

Carlos L Chacon:               One last thought I wanted to ask here. Ultimately, this testing sounds a little easier when we are looking backwards but I know a lot of our listeners are going to be screaming at me and saying, “Well, what about, how do I do that for a new dimension? I want to create something that I don’t currently have rules for.” Does it still apply or …

Lynn Winterboer:            I think it does. I think you have to speculate on what … You have to create some data. Say, in my 16 orders example, you might create orders 17 and 18 that have the characteristics of the new type of order that’s going into production, or you might create a whole other 16 orders that add these characteristics and then see how that plays out. You have to mock up the data, basically. I think in the software world they use terms like “mocking” and “scaffolding,” and we can do that in data warehousing and in BI as well.

I do know some successful Agile teams where the BI team is separate from the data warehousing team but they’ll have a joint design session and the BI team will do some scaffolding. They will create structures that don’t yet exist in the data warehouse and populate those structures with data as they expect it to come from the data warehouse and then they’ll build from there. It takes some tight coordination between the two teams but at least the BI team can be moving ahead so that when the data warehouse does have quote-unquote real structures and real data for the BI team to pull into their tool set and into their environment, they’ve got a head start. It may not be perfect but then they tune it and tweak it to make it fit instead of starting from scratch.
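The scaffolding approach described here can be sketched roughly like this. The table name, columns, and rows are all hypothetical stand-ins for whatever structure the BI team expects the warehouse to eventually deliver:

```python
# Hypothetical sketch of "scaffolding": the BI team stands up a stand-in
# table shaped the way they expect the warehouse to deliver it, so report
# development can start before the real table exists.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE shipping_fact (          -- expected warehouse structure
        order_id INTEGER,
        ship_date TEXT,
        freight_cost REAL
    )
""")
# Mocked rows populated as the warehouse is expected to populate them.
conn.executemany(
    "INSERT INTO shipping_fact VALUES (?, ?, ?)",
    [(1, "2016-10-03", 12.50), (2, "2016-10-05", 8.75)],
)
# The BI report query is built against the scaffold; when the real table
# arrives, ideally only the connection needs to change.
total = conn.execute("SELECT SUM(freight_cost) FROM shipping_fact").fetchone()[0]
print(f"total freight: {total}")
```

When the “quote-unquote real” structures land, the team tunes the query and swaps the connection rather than starting from scratch, which is the head start described above.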

Steve Stedman:                Interesting. Okay.

Carlos L Chacon:               Should we do SQL family?

Lynn Winterboer:            Yeah, let’s do SQL family.

Carlos L Chacon:               How do you keep up with technology changes now? I guess you mentioned you’re kind of a business person, but obviously you’ve been doing a little bit of crossover here, and we’ve been able to get along, so we’re going to-

Lynn Winterboer:            Yeah.

Carlos L Chacon:               How are you kind of keeping up with changes, I guess even in your Agile coaching as business methods and things change, how do you keep up with that and kind of stay on your toes?

Lynn Winterboer:            It actually takes a lot of energy to do it. I am looking basically at two industries and then trying to find their overlap or their synergies. I have my Agile world and I have my data warehousing world and it’s important to me to keep up with both of them. I go to a lot of local meetups because that is where not only do I learn new things and hear how people … Somebody might use a term that I don’t understand and I can just ask them and say, “Oh, what do you mean by that?” Then they define it and I go, “Oh, okay, that’s something new that I haven’t heard about.” I also go to national conferences like the one … I guess SQL Saturday Kansas City, for me it was national because I had to fly there from Denver. I try to go to those types of conferences.

On the data warehousing side, I’m an analyst with the Boulder BI Brain Trust, which is a group that meets on Fridays. Not necessarily every Friday, but some months it’s every Friday from roughly nine to twelve-thirty, and we meet with vendors and they share their tools with us. It’s anything related to data warehousing or BI or analytics. We get a portion of time where we’re tweeting like crazy. It’s a public portion. We record those, and if you’re a subscriber to that organization you can go in and view the recordings.

We also then go into an NDA section where the vendors will ask us questions about what we’re seeing in the marketplace and what direction they should go, or they show us new stuff they’re thinking about rolling out and we can give them feedback. The vendors get some good feedback from this group, as well as us learning about them. I don’t go to all of them because it’s a big chunk of time, but I really try to go to as many as I can, especially if I think the tool or the vendor that’s presenting has anything to do with Agile enablement for data warehousing. I think one of the most important things is that I keep a list of experts when I meet them, so when I went to SQL Saturday in Kansas City, I met several people who I will now have in my own CRM as somebody who’s really good with this type of thing or that type of thing. If I have a need or a question, I know who to reach out to, and I maintain those relationships, and I hope they will leverage my insight and knowledge as well and reach out to me when they have questions.

Steve Stedman:                Okay. Lynn, what is the best piece of career advice that you’ve ever received?

Lynn Winterboer:            I’m actually going to give you two, but I think they’re related. The first one that comes to mind is really about trusting your higher power. The lady who shared it with me was a woman who was very successful in her data warehousing career and with her company, and I asked her at some point, “How do I get my career to be as good as yours? What is your secret to success?” Frankly, she said, “I pray. I pray a lot.” I would translate that to say: pray to your higher power, trust the universe, listen to the yearnings of your heart, because I think those are given from above, and that leads to finding the work that makes you want to jump out of bed in the morning, excited to go to work.

I know, for me, that is what led me to bringing Agile and data warehousing together. I love the data world. I am hooked. I’m a true data geek and I love the people who work in that world. People don’t go work in the data world if they’re lazy or dumb. Data warehousing is full of some really smart, hard-working people. The Agile piece is what brings the joy and the trust and the cultural changes that make it exciting to go to work, so I really decided about five years ago I just wanted to bring the two of them together, and that combination, those synergies, is what makes me want to jump out of bed every day. So, sort of a two-part answer.

Steve Stedman:                Okay. Great.

Carlos L Chacon:               Very nice. Our last question for you today, Lynn, if you could have one super hero power, what would it be and why do you want it?

Lynn Winterboer:            Let’s think. That’s a great question. I should’ve known after meeting you guys at the SQL Saturday that that would be one of the questions. I would say I would have the ability to gift to another human being the gift of deep empathy, to really be able to understand somebody who’s very different from you. Where are they coming from? What are their motivations? What are their fears? I think the world is a really scary and sad place right now, and I think empathy … I’ll distinguish empathy from sympathy. Sympathy is saying, “There, there, I feel bad for you.” Empathy is saying, “Wow, that must be really hard. I’m imagining if that happened to me.” I really think if I could do that, it would be the ability to give the gift of deep empathy. Knowing myself, I would have to not have a superhero costume or a cape, because I get very distracted by that external thing. It’d be better if I could walk around kind of invisibly, gifting empathy to people.

Carlos L Chacon:               I was going to say, if that gets out and there’s too much empathy, people might start running away and be like, “Ah, no!”

Lynn Winterboer:            I know. [inaudible 00:51:40]. Yeah, I’d maybe have to have the ability to be invisible and give empathy.

Steve Stedman:                Okay.

Carlos L Chacon:               Lynn, thanks so much for being on the show today.

Lynn Winterboer:            Thank you, guys. It’s been delightful.

Carlos L Chacon:               Yeah, we do appreciate it.

1 Comment
  • Excellent show! I’m curious if people are using open source tools such as tSQLt and/or Fitnesse for regression testing ETL jobs, and if TDD/BDD methodology is being applied to ETL development practices. Also, any links to best practices for migrating ETL jobs from one tool to another are appreciated.
