Two years ago blogger Jay Killeen observed, “And with modern languages and modern applications such as PowerBI and Power Query you should be safe to assume there would be modern tooling. You would be wrong.” In this episode, Eugene and I explore the challenges with Devops in Power BI and some ways we and others have implemented to help us deal with versions and team development of reports. Compañeros, you can also win a SQL Data Partners T-shirt if you post on social media using the hashtag #sqltrail.
Jay Killeen: Managing Queries in Power BI & Power Query using Git
Eugene Meidinger: Why the Raspberry Pi 2 is my main computer
Reza Rad: Power BI Helper
Tony McGovern: Power Query Version Control
Power BI Sentinel
The DataOps Manifesto
“When we think about devops, we think about trying to share code or merge. Power BI leaves us a little bit wanting, at the moment.”
“I’d say the situation’s a lot worse than it is with SSIS, because with the work that I’ve done with SSIS, you can put those xml files into source control.”
“We’re out in the Wild West, we’re figuring it out because Microsoft’s left us with like a rubber band and toothpick out in the desert and we’re trying to do devops.”
Listen to Learn
00:38 Intro to the topic
01:32 Compañero Shout-outs
03:46 How to win a SQL Data Partners Podcast t-shirt
05:32 SQL Trail announcements and request for opinions
07:10 Power BI leaves us wanting when it comes to devops
09:04 Publishing changes in pbix can be a lot harder than xml in SSIS
12:10 OneDrive is still the recommended source control for Power BI
13:21 There are people trying to fix this problem
15:32 And now there’s Power BI Sentinel from Purple Frog Systems
18:48 There’s no good way to collaborate with pbix files
21:25 Power BI Sentinel is a monthly charge and seems reasonable for what it can do
22:57 Closing Thoughts
Carlos: Compañeros! Welcome another edition of the SQL Data Partners Podcast! I am Carlos L Chacon, your host. This is Episode 170. Welcome, compañeros. I am joined today by Eugene Meidinger, welcome!
Eugene: Hey, how’s it going?
Carlos: And I know, compañeros, you won’t be able to see this, because it’s a podcast, but we’re actually doing some video today, which is not typical for us.
Eugene: It’s probably not good you can see my unshaven face. I’ve got the self-employed beard in full growth.
Carlos: Yeah, I’ll just keep my eyes on the notes for today’s episode.
Eugene: Yes, please.
Carlos: So our topic today is Devops with Power BI and we’d like to excuse Angela and Kevin, both couldn’t make it today, but we’re thinking about them. We’re talking about Devops with Power BI, so we’re talking a little bit about source control, how to share with others, get into a little bit of that and some of the challenges around this. But as always, compañeros, we first have a couple of shout-outs we want to give. I’ve got some, to Michael John, Meks Tadele, Josh Krodel, April Smith, Rafael Luis Lacerda, LeSean Greene, Jonathon Lipke and Katie Novotny. Thank you. So she actually reached out a couple of questions on a Power BI Embedded, for me, so I was appreciative of that.
Carlos: And so compañeros, maybe not a shout-out, but I do have an apology. I didn’t realize a couple of episodes ago, and it doesn’t matter, I’m sure, if you’ve listened, you were probably like, “oh my gosh, what the heck is going on?” We moved into an office, SQL Data Partners did, and in that transition, somehow my mic settings got reset and I didn’t realize until after the episode had been recorded that it was no bueno. So thanks to Julien, who did the best he could, given what I gave him, but I apologize for the bad noise quality, and hopefully this episode is a little bit better.
Eugene: I’ve got a shout-out to Drew Furgiuele for having an epic twitter thread on a cluster he made with a bunch of Raspberry Pi’s and a network switch. I think he’s still figuring out exactly what he’s doing with it. I saw he was playing around with making a VPN and Kubernetes and stuff, but now I want to dig out my Raspberry Pi’s and do something with them.
Carlos: There you go. You know, the hundred and fifty you’ve got in the closet somewhere.
Eugene: Well, I’ll tell you a quick, funny story. When I was dating my now wife–
Carlos: Your former girlfriend, yeah.
Eugene: My former girlfriend. So when she was my girlfriend, we were dating and I was buying Raspberry Pi’s at a rate that I told her, “well, it seems to be about one for every year we’re dating.”
Carlos: Oh wow.
Eugene: And so, we dated for three years and then I proposed and then I’m like, “I’m thinking about getting another one” and she’s like, “well, we’re not dating anymore, we’re engaged.” And I’m like, “ah, your logic’s irrefutable.”
Carlos: Had to give up the Raspberry Pis. That’s funny.
Carlos: Yeah, I still have yet to get on the Raspberry Pi bandwagon, but who knows, maybe this will prompt me. I doubt it.
Eugene: They’re fun. I’ve got an old blogpost we can put in the show notes. There was a brief period of time where I was using it as my primary computer, just to simplify life. It was interesting, yeah.
Carlos: Oh wow, that is interesting. Okay, so a couple of other announcements, so we haven’t talked about it. So, I started giving these out at SQLSaturday RVA, but compañeros, we now have, maybe not quite so hot off the presses, but we’ve got podcast t-shirts. They have the small logo on the front, and then on the back, they’ve got a much larger logo with the “I’ll see you on the #sqltrail” at the bottom. And so, you may be asking yourself, I know Kevin is asking me, “how can I get one of these t-shirts?” So we’re still working out some of the details, compañeros, and you’ll forgive me. I know that most of you live overseas, outside of the United States, rather, and so you’ll forgive me, but for right now, we don’t have a way to get it to you. We’re not quite ready to take on the international shipping costs required to get you these t-shirts and I know that that’s extremely pathetic, and you’ll forgive me, but right now, we’re not going to do that. So, if you’re in the States and you want me to send you a t-shirt, here’s how you can get one. On social media, you can reference us, Eugene, whoever it might be, an episode, something that you learned, whatever, but we’re going to ask you to use the hashtag ‘sqltrail’, #sqltrail. Each week we’ll collect those, maybe we’ll mention a couple and we will pick a winner, randomly select a winner from all of the social media posts that we get from that week.
Carlos: So we will do that, and then of course, when it comes to the point where we’re traveling internationally, I’ll try to let everybody know and we’ll see if we can get some from those areas. But right now, mention us on social media, #sqltrail, and we’ll randomly pick one, we’ll reach out to you and then work out all the details necessary to send you a t-shirt. One other thing, compañeros, we want to talk about SQL Trail, October 16th through the 18th in Richmond, Virginia. Still working out some of the details and I have a question. My question is, if you could choose a keynote speaker for an event, who would it be? I’d like to get some thoughts around this and I will be reaching out. Admittedly, I have been reaching out to some people, but if you have some thoughts about who you’d like to see as a keynote speaker, we’d be very interested to know. I know from a PASS perspective, we’ve gotten spoiled, almost, in some of the keynote speakers that we’ve had, in the past. So, not intending to reach out to Dr. DeWitt, so probably off the list. So you have to remember, compañeros, as a D-lister, you know, there’s only so many people I have access to.
Eugene: I don’t know, we need somebody who forced Oracle to change their licensing terms. Have you heard of the DeWitt clause?
Carlos: Oh, so I have heard of that, and I remember looking it up on Wikipedia.
Eugene: Yeah, same. So I forget some of the exact details, but basically, decades ago, he was doing benchmark testing and his benchmarks didn’t lean too favorably towards Oracle. And so, Oracle basically changed their licensing so that if you use Oracle, you can’t publish benchmarks with it, unless you’re like specifically allowed, like you’re the ACCD, or TC, whatever four letter acronym is for that benchmark.
Carlos: Right, whatever the benchmark standard is.
Eugene: Yeah, I think Microsoft has similar licensing, but it’s called the DeWitt clause, because he made Oracle so mad. So, mad, mad props to that gentleman.
Carlos: Interesting, interesting. Okay, compañeros, our episode notes or show notes for today’s episode can be found at sqldatapartners.com/powerbi or sqldatapartners.com/170. Okay, so we’re getting in here, again Devops for Power BI is our conversation and so in doing a little research for this, I came across a two year’s old post from a guy named Jay Killeen, who we don’t know. I don’t know Jay, but hello, Jay, if you’re listening. Big fan, all of a sudden. Anyway, he talks about Power BI and Power Query and how the M language and how’s it’s so great. And then he says this: “And with modern languages and modern applications such as PowerBI and Power Query you should be safe to assume there would be modern tooling. You would be wrong.” And so when we think about devops, we think about trying to share code or merge. Power BI leaves us a little bit wanting, at the moment, Eugene.
Eugene: Yeah, Power BI and the devops space is a giant mess right now, to my understanding. I’m hoping that that changes eventually, but the situation is not great. I think a recurring theme with Power BI is they made this tool that’s really, really great for business users, and then they kind of worked their way back to things that were discovered in the past decade or two. We had part of that conversation with dataflows way back, where, “okay, we made a tool, we made Power Query so simple that a business user can use it without any particular training. Oh wait, we still need a data warehouse-ish kind of thing. Well, let’s make dataflows.” Yeah, same thing here. It’s very, very difficult to approach that goal of devops when you’re working with Power BI, unfortunately, for a number of reasons.
Carlos: Yeah, so it’s very akin, I feel like, to working with SSIS and source control. So, ultimately, we have json files, so SSIS is xml and so when you think about devops, we’re trying to compare and merge. So you and I are both working on a pbix file, and now we need to publish. Well, if you’ve made let’s just say a report for easy scenario, there’s two tabs. I’m working on tab 1, you’re working on tab 2 and now all of a sudden, we have to publish. Then kind of a ‘who gets there’ and you know, how does that change process? There’s no good way to do that.
Eugene: Yeah, I’d say the situation’s a lot worse than it is with SSIS, because with the work that I’ve done with SSIS, you can put those xml files into source control. So the Development Studio, if I remember correctly, is based off of Visual Studio and it has direct support for Team Foundation Server or Git or that sort of thing. Like, so at my last job, anytime I would do stuff in SSRS or SSIS, I had those files in source control, and because it’s xml, yes it’s ugly. It’s not as clean as say like if you’re working with BIML or whatever, but it’s plain text. The benefit with plain text is that you can do a DIFF on the before and after pretty easily. It may look kind of crazy, but you can see, “okay, here are these five lines changed.” Like I’ve done that with SSRS where I’m like, “what did I change on this?” And then you scroll to the one line and it’s like the font color or something. The immediate problem with Power BI is the pbix format. Pbix is a zip file. Literally, you rename it .zip and you can go in. The problem is compression, the way that compression works is it removes redundancies. When you remove redundancies in your data, what happens is, it looks like random noise, because anything that doesn’t look like random noise could be compressed further. So now, you change the font color on one of your visuals, and because the file’s compressed, you may have changed the entire structure of the file. So now it’s opaque. So every time you make a change, you’re resaving your entire– maybe it’s 5MB, your entire 5MB file, instead of Git, the way that it works is it stores the deltas, it stores the differences.
Eugene: So let’s say that hypothetically, I don’t know how you would get this, but let’s say hypothetically you had a 100MB SSRS project or file or whatever and you go in and you just change the font color and change one line of code in that xml file. Well, Git just stores that as like, you know, 10KB or whatever is necessary, basically, to point to tell it, “okay, where did it change and what changed?” Tiny, tiny change. You do the same exact thing with a pbix file, you’re probably reuploading another 100MB, which just kind of balloons out of control. SSIS and SSRS are manageable because they’re plain text files and they have integrations with Visual Studio, so you can hook it up to Git or Team Foundation Service, or whatever.
Carlos: That’s right, that’s a great point, became at least the tooling lends itself to “hey, within the tool, I can connect to source control versus, my Power BI developer.”
Eugene: Yeah, there’s nothing.
Carlos: There’s nothing, so you have to be able to say, “oh hey, check this in.” So then I basically have to go to another spot, I’m managing files, wherever I’m saving that as, and I’m kind of managing that outside of my traditional development environment.
Eugene: Yeah, I was at a Chalk Talk three years ago, I think, at PASS Summit with Microsoft and at the time, their recommended source control was OneDrive for Power BI, and I don’t think that’s changed as of yet. They know it’s a problem, but I don’t think it’s changed.
Carlos: Yeah, and so you get a bit of that and as you search for source control and Power BI. SharePoint is another one that I’ve seen.
Eugene: Which fits in with the target demographic. So a lot of the people that they’re aiming at are comfortable with SharePoint, they’re comfortable with Office 365, and so that makes a lot of sense, working with a file library.
Carlos: And it gives you the version history, so if something like, oops, what’s happened, at least you can go back. So for comparison purposes, yeah, it makes a lot more challenging, but you do have a way to get back, if you save it up.
Carlos: Now, there are a couple of people trying to do something about this.
Eugene: Right. So a couple of things. I guess we should differentiate between the product that you’re hinting at and then kind of rogue efforts that people are doing. Starting with the rogue efforts, I saw a presentation by–
Carlos: Well, and I guess before we even get there, so we are no longer in Microsoft Tools. This is community or third-party.
Eugene: We’re out in the Wild West, we’re figuring it out because Microsoft’s left us with like a rubber band and toothpick out in the desert and we’re trying to do devops, yeah.
Carlos: With OneDrive, right? We have OneDrive.
Eugene: We have OneDrive, yes. Sorry, we have this OneDrive, a toothpick and a rubber band or a gumband, but no one would know what a gumband is, and we’re trying to find our way back to society. Yes, correct, so we’re outside of Microsoft-land. Alan Gilman presented at the Pittsburgh Power BI group and he talked about doing this stuff in Enterprise, and so it’s interesting, some of the stuff that he does where he just tries to pull a lot of these pieces out kind of manually. Reza Rad has a tool called the Power BI Helper, which allows you to pull out that M code and do some other things. He would take screen shots and put them in the Word documents, pull out the code, put them in the Word documents. I think there’s some tools from the SQL BI crew, so Alberto Ferrari and Marco Russo, that allow you to pull out some of the DAX or some of the dependencies, that kind of stuff. And so now you’re pulling out some of those things, so you can at least apply source control to those artifacts. Another thing that he talked about was making template files, so you have one file that is just the data set and you’re able to clear out the resident data. And then you have another file that reads off of that for the actual report design, so now you’re decoupling those two pieces. So you can make changes to the data set without having to worry about the report design and so you have a little bit more control that way. There’s a blog post we’ll have in the show notes by Tony McGovern, where he’s basically dynamically evaluating Power Query code from GitHub, which from a source control perspective is awesome. From a security perspective, it’s terrifying and the product team was like, “no, please, please don’t do this. This is a bad idea.”
Carlos: Don’t do it. Don’t go there. That’s right. Just because you can, doesn’t mean you should.
Eugene: Yeah. So there’s ways to adhoc it, but then what I’m really excited about is this new product that came out a few months ago called Power BI Sentinel. And it’s a product from the Purple Frog Systems folks and it seems like it’s really trying to fill in a lot of these gaps by giving you change tracking, automatic documentation, data lineage so you can see, “okay, if I change this data set, what’s it going to affect downstream?” Just all this stuff and I think it’s really exciting, because this has just been a gaping hole in the application lifecycle management piece when it comes to Power BI, so it looks really cool.
Carlos: Right, and I have not used it, either, so we’re kind of basing it off conversations and demos. So I think it is still very much, “I’m saving my pbix file to a central location and then I’m going to attach this.” So the Power BI Sentinel is a software-as-a-service, so you’re going to log into it and then connect to your reports and things.
Eugene: Yeah, if I’m reading it correctly, you publish your reports like normal to your workspace and then it talks to your workspace and saves backups to Azure Storage. So every time you push out a new copy, it’s saving a copy and so now you can compare and go back and revert and it’s so nice that it has this change tracking so you can change that one little font color, and in theory it can say, “this is what’s different.” Because, you know, you and I are doing work for a customer and there’s so many times where we’ll say, “okay, we built this thing.” And they say, “well, okay, we need you to change it.” Like, “okay, we’re not supposed to use TY and LY for ‘this year’ and ‘last year’, we need to use CY and PY for ‘current year’ and ‘previous year’ because it’s more multi-lingual” or whatever, and there’s always this risk that you make a bunch of changes. So if you’ve read the Phoenix Project, they talk about how, in a manufacturing plant, you want single-piece flow. You want to make single changes at a time so if something breaks, you can identify and fix it and remediate faster, versus, you make 50 changes, something breaks, and you don’t know what.
Carlos: That’s right, you’ve got to go through 50 iterations or whatever.
Eugene: Yeah, exactly, and the problem is these pbix files are these big blobs, so what happens naturally is you make 100 changes, you break something and now you’re like, “what did I do?” This tooling is really exciting because it allows you to take more of that single piece flow perspective where you make a change, publish, make a change, publish, and then if you break something, you can see what was different, what did you do and kind of do a whodunit. Again, since I haven’t used the tool, I can’t speak too much to it, but from what it promises to do, it’s really exciting for me, as someone who works every day with Power BI.
Carlos: Right, and of course, we will link to it. Again, going through the demo and whatnot, it’s not Visual Studio, so you’re not going to get side-by-side comparisons, things like that, and I’m not trying to knock what they’re doing. It’s very cool, we’ve said that it’s good stuff. It does, however, give you that ability to see at a high level, “hey, this was the setting that was changed.” And that, at least, then there’s all that drill-down because you’re drilling down into json. And they’ve tried to make that as easy as possible. But again, at least it’s a lot more accessible than nothing.
Eugene: Well, and something, too, you talked about how do you deal with collaboration and multiple people working on a pbix file. If you’ve ever used Git before, there’s a command called ‘git blame’ and in theory, we shouldn’t be blaming people and we should have no blame retrospectives or whatever it’s called, but you know, I can imagine people working on a team, multiple working on a report or something like that and you want to go back and say, “alright, who messed this up?” Or “who made this change?” Because maybe you change the name of a report or how it looked and you want to know what the justification was. Or, something else to consider, I used to do some consulting work for a Fortune 1000 company in Pittsburgh. And because they were a Fortune 1000 company, they had to be Sarbanes-Oxley compliant, which means any time you’re dealing with financial reports, there has to be this kind of organizational firewall between the report developers and the sysadmins who deploy it, because the thing is, you don’t want another Enron where people are cooking the books. And people need to be able to validate there’s controls in place that the financial documents are reliable. What it also meant is any time we had to make a change for something, we had to put in a change request. And it was fine, it was like a one-page document, but again, having a way to tie, “okay, here’s what we changed” to that change document really helps when you’re dealing in a heavily regulated kind of environment like financial reporting. So I think something like this where you can say, “okay, not only do we track the changes, but give me a custom field to pop in, here’s our change order number.” That would be fantastic for some places.
Carlos: That’s right. I mean, and it’s interesting, because obviously we’re seeing reports like for database, I think we mentioned Brent’s stuff if using Power BI now.
Eugene: He stopped because of this, I found out, actually.
Carlos: Oh, he did? Oh, okay, well there you go.
Eugene: Yeah, so he made a dashboard for his sp_Blitz stuff or some of those things and I remember he recently like– he stopped because of these issues, because of the fact that there’s no easy way to collaborate. And so what you have to do is to basically, you change the whole file, you hope no one breaks it and like his stuff is open source now. So how do you do pull requests for this big blob of a report file? It’s not ideal.
Carlos: Yes, challenging. My thought was that while yeah, I’m sure there are reports that are not financial in nature, virtually all the reports we have done on a customer-basis tie into financials, I mean, at the very beginning and then you’ve got to branch off from there.
Eugene: Well, you and I are working on one for commissions and you get commissions off by a penny and there’s heck to pay. So yeah, I agree.
Carlos: So the Power BI Sentinel, it’s unclear to me, it appears that this is a monthly charge.
Eugene: Yeah, it looks like it’s a software-as-a-service kind of model, which I think is reasonable. I mean the pricing looks pretty affordable, honestly, and I’m sure there was a lot of work that went into putting this in there, because you have to think, this document format, this pbix format, totally undocumented, totally unsupported. I’ve gone and done this myself where you’re just opening up a zip file and poking around and seeing what you can find and Microsoft’s like, “don’t go in there” and you’re like, “I’m going to do it anyway.” It’s funny, actually, so back– this was way back in I want to say like 2016 or something, when 2017 was still in preview. You could open a pbix file, take the data model file inside of it, rename it .abf and restore it to SQL Server SSAS 2017. So literally, for like the first year or two, it was just like SSAS Tabular backups was the file format, and then they changed it. Now it’s compressed with Express 9 or some Microsoft algorithm, so I haven’t found a way to do that again yet, but maybe I will tomorrow. But my point is to make a tool like this has got to take a lot of work because you’re in the Wild West. You’re out on your own with that toothpick and that rubber band trying to figure out “what does this file represent?”
Carlos: That’s right. So they do have a community version, you can’t use shared workspaces, looks like, is probably the big piece, but at least from a change-tracking perspective, so you get 10 reports in a single workspace and it allows you to kick the tires.
Eugene: It’s a great way to play around with it, yeah, I agree.
Carlos: Okay, well, compañeros, as always, if you have other ways or other thoughts about how you’re doing this, let us know. Do we have any other points we want to make in today’s episode, Eugene?
Eugene: No, we’ve covered that it’s a hot mess right now when it comes to devops in Power BI and I think as Power BI continues to pivot to supporting more of the Enterprise space, I think that it’s just going to be a thing they have to do, just like how they had to make Power BI on-prem an option. So, they’re starting off in business-user-land and they’re making their way towards Enterprise kind of governance, so we’ll get there.
Carlos: That’s right, I’m sure at some point, or maybe next April, April 1st, you heard it here, first, compañeros. April 1st next year, they’re going to add Power BI Development to Visual Studio.
Eugene: We’re doing the over-under on that, okay.
Carlos: Well, I figured April 1st either way. Right, if it doesn’t come out, I can put something out about it. Okay, compañeros, that’s going to do it for today’s episode. Thanks again for tuning in. Eugene, thanks for being here. If you want to reach out to us on social media:
Eugene: You can find me @sqlgene on Twitter or sqlgene.com for my blog.
Carlos: Or wherever fine Pluralsight courses are found.
Eugene: Mostly at Pluralsight, occasionally at Udemy when they get pirated. So if you go to The Pirate Bay, I’m sure you can find some of my courses, as well.
Carlos: Right, author must be Eugene Meidinger. You heard it here first, compañeros. And I can be reach at LinkedIn. Carlos L Chacon is my handle and compañeros, I’ll see you on the SQL Trail.