Moving data is fairly straightforward; however, mapping data from one source to another can be extremely challenging. In an attempt to help self-service BI users, Microsoft has introduced a Common Data Model and in this episode, we discuss the new model. This may not be as approachable for organizations which already have a data warehouse; however, for those who are in the Microsoft suite of tools and looking to report on that data in Power BI, the Common Data Model may be the jump start to get the model you need to get going.
“I think the naming’s a little bit misleading or maybe aspirational, because you look at the name, you assume, “oh, this is just like a traditional data warehouse or maybe like an OLTP, single source of truth or something like that.” And to me, it feels a little bit different.”
“Dynamics may not be as ubiquitous as SQL Server, however, for those organizations that do have Dynamics and then want to start integrating into some of these other pieces, I can see where it gives you that framework.”
“This is kind of a low-code data warehousing solution. And not only that, but one of the things that I’ve been interested in with this is, potentially it seems like a way to have like, platform as a service data storage for something like Power Apps or whatnot.”
Listen to Learn
00:38 Intro to the topic
01:14 Compañero Shout-Outs
01:56 Why do we need to have Common Data Model?
05:48 The Power community is still small, but growing quickly
07:33 This seems good for low-code business users
09:29 Eugene has not found a specific use case for the CDM
11:54 Your data is not living in a traditional relational database
14:03 The Open Data Initiative & competing with Salesforce
15:35 If you compañeros have experience with the CDM, please reach out to us
18:25 Closing Thoughts
Carlos: Compañeros! Welcome to another edition of the SQL Data Partners Podcast. I am Carlos L Chacon, your host. This is Episode 189. I am joined today by Eugene Meidinger.
Carlos: And Kevin is somewhere in the ether.
Eugene: He’s with us in spirit.
Carlos: He’s with us in spirit, that’s right. Let see, so today our topic is the Power BI Common Data Model. So, I threw in there Power BI, but the Common Data Model, I guess, is not necessarily geared just to Power BI, but a very common use would be in Power BI, so that’s why we are including it or lumping it in there.
Carlos: But before we get into that, we do have a couple of shout-outs. So, compañeros, shout-out to Joshua Hitchcock, Leann Harris and Arnoldo Montes de Oca for showing a little love for the podcast. We do appreciate it. For those of you who enjoyed the DBATools in a Month episode: Marty Governor, Alexander Rivera, Malcolm Monahan & Craig Burnham and to all of those lucky winners of the free book. A special shout-out to Joshua Hitchcock. He was one of the first ones to grab that copy. Okay, so today, this episode will be found at sqldatapartners.com/powerbi or sqldatapartners.com/189. Okay.
Eugene: Very good.
Carlos: Common Data Model.
Carlos: So, hm. We’re data people, right?
Carlos: How many times have you had to transform data or to look up “okay, what’s the ERD of this data? How do I get data from A to B?” Lots of people define things differently, and so Microsoft has come out and said, “you know what? We’ll give this a shot. We have applications, we have Dynamics and some of these other CRM sales. Why don’t we kind of put together our common data model for some of these pieces, and then see if we can build applications around those?” And I guess I jumped into that a little too quickly, but the thought of why? Why do we need to have a common data model?
Eugene: Yeah, so, I think the naming’s a little bit misleading or maybe aspirational, because you look at the name, Common Data Model, and you assume, “oh, this is just like, I don’t know, a traditional data warehouse or maybe like an OLTP single source of truth or something like that.” And to me, it feels a little bit different, or probably like you, I’ve been struggling for a long time trying to figure out, “okay, what’s the point of this?” And I think for most of our listeners, for their needs, they’re going to be a lot happier with just a regular data warehouse, facts and dimensions kind of stuff. Where I see this being of value is whenever we start talking about third-party applications or specifically stuff with Microsoft, so Dynamics CRM and that sort of thing. So, what it seems to me is that a lot of it is providing, I don’t know, let’s say you have data in Salesforce or Adobe or SAP or something, some common shape for that data to go into. As an example, let’s say that you’re in the service industry, and so at my last job we would service fire extinguishers and I know, actually, my former employer was planning on moving to a application on top of Dynamics 365. You would have work orders and you would have customers and you would have service locations and technicians, all these sorts of things, and so you need some sort of shape for that data. What does it mean to be an account or a work order or technician? And you need to be able to migrate your data into that and then be able to work off of it. And so, from what I’ve been able to see with the Common Data Model, a lot of it is, “okay we want to integrate between these different applications.” Like maybe you have some data in SAP and you have some of your own data and you want it to be this common format. What I wonder sometimes though is how popular is this going to be? I’m wrapping up a course on Distributing Excel Files which is got to be one of the most boring course names ever.
Carlos: No spoilers.
Eugene: Yeah, right. Well, one of the things I talk about is some of the formats you can export Excel files to. And so, there’s PDF, which everybody use and everybody has, and then there’s XPS which is like the open XML paper specification and it’s supposed to be like PDF but nobody uses it. Just Microsoft. Like I think some commercial printers probably use it, but I wonder how much CDM is going to be XPS, where it’s this Microsoft-specific kind of shape for things and then they say, “it’s common, everyone can use it,” but really like it’s mostly Dynamics data and then Power Apps. Right?
Carlos: Sure, sure. Yeah so that’s fair, so as I take another sip of my Microsoft Kool-Aid, and it tastes good, ladies and gentlemen.
Eugene: It pays the bills. Listen, me and my spouse are addicted to having a roof over our heads so bring another cup.
Carlos: Yeah, I almost wonder– so some of the things that we’ve seen even from a community level, so the SQL Server community is very strong, very vibrant, so now they’re trying to do this again, build up another community in Power BI and so what used to be called Flows which is now called– I can’t remember what it’s called.
Eugene: Power Automate, I think it’s called.
Carlos: There we go, Automate, and then you’ve got.
Eugene: They have to have Power in the name, yeah.
Carlos: Yeah, yeah. And then the Power Apps with the Power Platform pieces we’re starting to see that integrate much more right with the Dynamics pieces and I–
Eugene: Yes, definitely.
Carlos: And so while, yes, maybe that part of the world is very small, comparatively, when you think about SQL Server, Dynamics may not be as ubiquitous as SQL Server, however, for those organizations that do have Dynamics and then want to start integrating into some of these other pieces, I can see where, again, it gives you that framework. It’s like, well, hey, if you– particularly if you don’t have one. I mean–
Carlos: Luckily for us, we’re still building data warehouses for folks. So it’s like, “okay, well now we don’t have to necessarily reinvent the wheel, and if we go into organizations that have some of these other technologies, it kind of gives us a starting point that we don’t have to think about. Right?
Eugene: Yeah, no, I see a lot of value, and I agree that the Dynamics/Power Platform, maybe not community, cause it’s definitely not as strong as SQL Family in that regard. But definitely the group of people, professionals in that space, is probably bigger than you might think, especially if you ever go to like the Power Platform Summit and you see how much Dynamics adjacent stuff there is there. I think my frustration is that it’s easy to look at this and assume that it’s a Swiss Army knife. In reality, I think it’s a scalpel for a couple of specific use cases.
Carlos: Sure, current state, definitely.
Eugene: Yeah. And you hit one of them on the head is like, okay, if you’re coming from the SharePoint, Dynamics, Office 365, I don’t like code, kind of world, this makes so much sense. This is kind of a low-code data warehousing solution. And not only that, but one of the things that I’ve been interested in with this is, potentially it’s a way to have kind of– at least it seems like a way to have, I don’t know, I want to call it like, platform as a service data storage for something like Power Apps or whatnot. Cause that’s the thing with Power Apps is let’s say you want to make like a simple survey app or something. Like you’ve got SQL Trail and you want to have a Power App or something, ignoring the fact that I don’t think you can distribute them publicly. But, let’s say you could.
Eugene: Maybe you wanted to have a survey app for how to go, or maybe, instead, you give everyone, all the people in your organization, a Power App to record feedback from people. Someone’s like, “oh, this part was great,” and you want some way to capture that. Well, the Common Data Model seems like a way to have that data substrate without having to go, “okay, do I want to use Blob storage or Azure SQL Database or whatever?” So, I definitely see it as a way to potentially provide a data substrate for users who don’t want to have to deal with database administration. Although, I think, at least when I looked into it, it seemed like the amount of data storage you get was small. I think it was measured in megabytes instead of gigabytes or something like that, at least on a per user basis.
Carlos: Oh really? Okay, I guess I must have missed that there.
Eugene: I’d have to look into it again, but I think definitely that kind of low-code, more business user kind of audience, I think this makes a ton of sense for.
Carlos: Sure. And cause it also does provide, again, some of those back-end pieces, so with the Azure Data Lake Gen2, it’s just like throw everything up there and then we’ll help you figure it out?
Eugene: Yeah, so that’s actually really interesting is that the Power BI Data Flows and Azure Data Lake Gen2 piece, they had a really weird name for it at the time and they changed it. So, you’ve got the Common Data Service for apps, which is kind of what we think about a lot with the Power Apps piece, the Dynamics CRM piece, that whole data substrate that I talked about, and then they had something that was called Common Data Service for Analytics and then everyone was like, “what?” and so they renamed it in the Power BI Data Flows. But the idea is that when you create a dataflow which is basically just a reusable piece of M or Power Query code that lives in the cloud, that’s all that it is. That data gets stored on Blob storage, specifically Azure Data Lake Gen2 Blob storage and if you bring your own storage account you can then access that directly and so you can do analytics on that. You can do machine learning on that. And what’s interesting, too, with the whole CDM thing is it’s optional for Dataflows, you don’t have to say, “okay, this table is an account and so it’s going to have these columns and that sort of thing. I believe you can create your own entities if you want, but what’s interesting, too, is I haven’t dug into the exact specifics, I haven’t played around with it, but Matthew Roche has a lot of good posts on this stuff. But my understanding is you use a dataflow, you assign a Common Data Model entity to it and so what happens is now you have this folder that, I think, has all these CSV files for the data but then also has a JSON file for that schema, for that Common Data Model schema. So, it’s interesting the way that they’re doing that there. But yeah, I think the whole Common Data Model thing, they’re working on growing it outward. They’re adding more integration with the whole dataflows and the data lake kind of piece. So, I can see where it’s going to expand its just, for me I have yet to find a specific use case. I’ve yet to find a case for an existing customer where I’m like, “yeah, you should do this instead of a data warehouse” or something like that. But I think a lot of that’s going to lean on, you’re either doing something with Power Apps something with Dataflows or something with, again, I don’t want to say a third-party application, cause a lot of these are from Microsoft, but basically these application solutions, like an ERP system or CRM system.
Carlos: Right. Yeah, it’s interesting to see what that adoption will look like. I mean, again, so this one, I don’t know if different piece, is that it’s not something that you install. It’s really just, again, it’s that framework. I’m looking in here at the entity reference, and they’ve got, oh I don’t know, maybe 50 entities in here. And then, if you’re familiar at all with Entity Relationship Diagram and ERD, and then creating how the relationships and the column names, that’s kind of what we’re looking at.
Eugene: It’s interesting because again, to my knowledge with the whole Common Data Model stuff, generally that’s being stored as like Data Lake type storage. Like let’s say you’re doing Dynamics CRM. To my knowledge, and I could easily be wrong about this, but I think if it’s in the whole Common Data Model kind of piece and it’s in Office 365, then your data’s not living in a traditional relational database. And so, I wonder how performance works with that. And again, I could be totally off base here, but that’s my current understanding at least, so it’s interesting. And the other thing I wonder about with like those entities is at my last job I did a decent amount of work with ERP systems. So, we had data in Great Plains, we had data we migrated into like a homespun ERP system for the fire protection kind of stuff, and it’s just so hard to get the right columns in the right shape of the data when you’re talking about those things. There’s so much variation and so it seems like from what I’ve seen, the approach they’re taking is, well, one, you know we’ve done so much work with designing these systems. There’s so many different variations of Dynamics cause Microsoft kept buying companies. So, we have an idea of what like the ideal shape is now, but then also, it’s like, well we’re just going to have 30 optional columns. And so that’s how we’re going to address some of that issue but I’m really curious how that works out in practice.
Carlos: Yeah, so compañeros, this would be a good time. So, if you have been using the Common Data Model, we’d be interested in chatting this up a bit more with you. You know, they have a few use cases. Again, I think like Eugene’s already mentioned, it seems kind of niched fairly narrowly right now. But we’re interested to see how– but I guess I should say, one of the things we didn’t mention was that this was in partnership with Adobe and SAP.
Eugene: Right. Correct, yeah, the whole Open Data Initiative.
Carlos: That’s right, so I do see a way, and again, I’m not sure if this just, at least with Microsoft and SAP. Admittedly Adobe I know less about, but you think about the exchanging of data from SAP to, you know, all kinds of different components and I could see that being helpful in the long-term. Again, maybe not rebuilding what you’ve already have, but–
Eugene: Yeah, that announcement when it happened, I think it was two years ago.
Carlos: Yeah, September 2018, yeah.
Eugene: Yeah, it was around the same time that I went to the Power Platform Summit 2 years ago. I think, myself included, some of us saw that as a shot across the bow against Salesforce. That was kind of the impression I got was, “hey we’re going to partner up with some other big names and try and offer a competing product against Salesforce.” Salesforce is huge. Like their, I think it’s called Dreamforce, but their conference like is a whole town in terms of ju– it’s like, I want to say its tens of thousands of people. It’s a lot. So, yeah, I feel like that and then when they bought LinkedIn, I feel like that was also kind of a shot across the bow for Salesforce. So, I can definitely see Microsoft trying to expand its data footprint and have a common repository or common shape for it to go in. Where that stuff that they own, like LinkedIn or companies they can partner with.
Carlos: Right, so this idea of like, “oh hey, all the data is starting to go into the cloud, hey maybe we can consolidate some of this stuff.”
Carlos: Yeah, it will be interesting and so again, compañeros, I think if you have experience with the Common Data Model, we’d love to chat with you about it.
Eugene: Yeah, definitely.
Carlos: Yeah, so I guess last thoughts here?
Eugene: Yeah, last thoughts is, I’ve been humbled before. When I first ran into Power BI, in terms of like my first impression with Power BI was like, “wow this is a really crappy SSRS.” And for the things that SSRS is best at, Power BI isn’t great, but it’s not meant for that. And you know, when I learned about Power Query, I’m like, “wow this is like a watered down SSIS.” And again, it’s not meant for that. And so, I have that kind of vague suspicion that there’s some real business needs that are truly and deeply being met by the Common Data Model and I’m just not swimming in those waters and so that’s why I’m missing it right now. So, I look forward to getting illuminated on the subject.
Carlos: Sure, I mean my take, or my thought is that, very helpful. So, as more and more organizations need to leverage that data and want to ultimately go into a warehousing-like model, that it will make more and more sense. I think because we are kind of already on the other side, like we’ve seen what has been done, we’re not experiencing that pain, cause we feel like we would have a solution for that. Whereas this is trying to help bridge the gap for folks that haven’t, or maybe even aren’t interested in, like you said, building the warehouse.
Eugene: Yeah, well, that’s been the whole theme of Power BI, is you look at so many of these things and you’re like, “you’re just reinventing the wheel from 30 years ago.” Like, you look at Dataflows and as a database administrator, you’re like, “we already have data warehouses. Why are you making this janky version of data warehouses?” But one of the key realizations I had with Power BI in general was, they’re building it all from scratch, but in such a way that Susan in Accounting finds it accessible and approachable and doable and so I think they’re doing the same thing here. I think you’re definitely right on that.
Carlos: Well, and it’s also interesting, and we will have another episode on this later, but you think about, so Azure SQL Data Warehouse, where they pretty much– so it’s still there. You can’t buy it, but then they’ve branded it or put it into the Synapse offering, which is still in preview. But Synapse really just seems like it’s a wrapper around all these other services, you know, including Databricks, so that you’re still getting charged for the individual components, but they just kind of wrapped an interface all around it and they’re like, “hey you don’t need to worry so much about– you know, we don’t want you, I guess, to be building a warehouse. We want you to throw everything into Azure Data Lake, again, this Common Data Model could come into that, start thinking about what you want to get out of that and then we’ll help you connect the tools to move the data around and get you the answers that you need.”
Carlos: It’ll be interesting. Okay, compañeros, I think that’s going to do it for today’s episode. Thanks again for tuning in, we do appreciate it. It’s nice to have listeners, right, Eugene?
Eugene: I suppose so.
Carlos: You suppose so.
Eugene: Better than us just shouting into the void.
Carlos: Just– yeah. Yes, so compañeros, if you are part of that void, thank you for tuning in.
Eugene: Oh, my goodness.
Carlos: Okay, so as always, you can reach out to us on LinkedIn. Eugene?
Eugene: Yeah, I’m on Twitter @sqlgene, my blog is at sqlgene and keep an eye out on Pluralsight for Distributing Excel Workbooks, the most exciting course available.
Carlos: The title might not be exciting, but I’m sure the content is.
Carlos: And for me, compañeros, you can reach out to me on LinkedIn. I’m at Carlos L Chacon. And if you know organizations, either healthcare organizations using a Centricity application or printing organizations, particularly those who are using the EPMS app, and think they could use a helping hand, we encourage you to send them our way and we can see what we can do for them. That’ll do it for today’s episode. Thanks again, compañeros, and we’ll see you on the SQL Trail.