I’ve been meaning to look at Azure Data Catalog for a long time but somehow never got round to it until now. I’m not sure why – perhaps it’s because there’s so much other cool stuff happening in the world of Power BI and Azure, and Azure Data Catalog seems a bit, um, boring in comparison. Maybe it’s a bit too much like documentation and all that other stuff we instinctively try to avoid. Whatever the reason, now that I have looked at it I’m impressed and I can see its uses, not just for Power BI but also for traditional corporate BI scenarios. It certainly deserves a lot more attention from the Microsoft BI community than it seems to be getting at the moment.
First of all, what is Azure Data Catalog anyway? There’s lots of good documentation, videos and a tutorial here:
https://azure.microsoft.com/en-us/documentation/services/data-catalog/
…but the short answer is that it’s a cloud-based service for cataloguing all the data sources you want to use for BI. It doesn’t store any of the data itself (except for a small amount for preview purposes, and you can turn that off), just the connection details; it also stores metadata (column names, data types etc) as well as descriptions, documentation and contact details for people who own or know about the data. End users can search the catalogue for data and even, for some data source types like SSAS, click a button to open that data source directly in Excel or Power BI Desktop.
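To make that a bit more concrete, here’s a minimal Python sketch of the kind of record a catalogue like this might hold and how a keyword search over it could work. This is purely illustrative: the class names, field names and the `search` function are all my own assumptions and have nothing to do with Azure Data Catalog’s actual API or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class CatalogEntry:
    # Hypothetical fields, chosen for illustration only.
    name: str
    source_type: str   # e.g. "SQL Server", "SSAS"
    connection: str    # connection details only, not the data itself
    columns: list[Column] = field(default_factory=list)
    description: str = ""
    contacts: list[str] = field(default_factory=list)


def search(entries, term):
    """Return entries whose name, description or column names contain term."""
    needle = term.lower()
    hits = []
    for entry in entries:
        haystack = [entry.name, entry.description]
        haystack += [col.name for col in entry.columns]
        if any(needle in text.lower() for text in haystack):
            hits.append(entry)
    return hits


# Two made-up entries: an SSAS cube and a SQL Server table.
catalog = [
    CatalogEntry(
        name="Sales cube",
        source_type="SSAS",
        connection="Data Source=ssas01;Initial Catalog=Sales",
        columns=[Column("Revenue", "decimal"), Column("Product", "string")],
        description="Monthly sales measures",
        contacts=["bi-team@example.com"],
    ),
    CatalogEntry(
        name="Customers table",
        source_type="SQL Server",
        connection="Server=sql01;Database=CRM",
        columns=[Column("CustomerName", "nvarchar")],
        description="Master list of customers",
        contacts=["crm-owner@example.com"],
    ),
]

for entry in search(catalog, "revenue"):
    print(entry.name, "-", entry.source_type)
```

The point of the sketch is simply that the catalogue entry carries a pointer to the data (the connection string) plus the metadata and annotations people need to find it, and that searching works over that metadata, not over the underlying data.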
Is it any good? Well, I managed to get going with it very easily. As well as an Azure subscription, though, it does require you to have Azure Active Directory set up, which is a pain – some of my customers (mostly the ones that have bought into Office 365 and Power BI) have it already, but I know for others it would be a showstopper. After setting everything up I was able to register SQL Server, SSAS and other types of data source very quickly using both the desktop application and the browser interface. It’s certainly simple enough for a reasonably technical end user to use and the implementation is very solid. I do have a few minor quibbles (I found the search results interface slightly confusing at first but soon got to grips with it) and it’s clear that it’s still very much a work in progress (right now the browser interface supports registering more data source types than the desktop app, for example) but nothing very significant.
There is an obvious need for Azure Data Catalog if you are going to use Power BI and have embraced the whole concept of self-service BI: it makes it much easier for your users to find and consume the various data sources that you have available. However, as I said earlier, I think on-premises users of SSAS and SSRS could benefit from it too. It’s becoming increasingly common for me to see organisations with several SSAS cubes in production across various servers, each of them with a confusing list of measures and dimensions. Equally I’ve seen plenty of SSRS implementations with hundreds of reports, with nobody knowing what many of these reports do or even if they are used at all. Azure Data Catalog could provide a convenient central place to document and discover these cubes and reports. I guess it all depends on whether you can be bothered to do this documentation and annotation, though, rather than on whether you have a suitable tool for the purpose; I don’t think many organisations can be bothered, unfortunately.
One last thing: Azure Data Catalog is meant to include replacement functionality for the old Power BI for Office 365 Data Catalog but as yet it doesn’t allow you to publish or share Power Query queries. This is something that I and other Power Query/Power BI fans would love to see, especially if (unlike the old Data Catalog) it let you publish updates to your code without needing to manually update each Excel workbook/Power BI report that used it afterwards.
