Dyota's blog

Power BI: TMDL Data Model and the Tabular .NET namespace

This is a big deal.

I've put off converting my Power BI data models to TMDL ever since I found out about it because I was comfortable with working with model.bim.

model.bim is a a big, monolithic JSON file that holds all of the definitions of all of the "code" in Power BI. This means: everything in Power Query, everything in DAX, relationships, and more (it doesn't say anything about visual elements in the report though). Because it's JSON, I can manipulate it programmatically using PowerShell, using standard methods for PowerShell objects. I can do some pretty powerful stuff like rip out all of the code and store them in separate files, and so on. The JSON format makes this possible.

The main pain point with having a big monolithic JSON file is that it's difficult to use in scenarios where there are different branching versions of the same file. Merging different branches is a real pain, and requires careful manual editing to make sure that all of the changes are merged correctly, and that the file maintains its integrity (i.e. no accidental structural changes that will cause Power BI to refuse it).

The TMDL paradigm solves many of my problems. For one, each table is saved in its own file as a .tmdl file. The whole data model is defined in a folder structure. This is what I wanted all along! It's not exactly what I want, because each .tmdl file contains more information than I care about, but it's a massive leap forward, and it's an official part of the product.

However, the actual text format of TMDL is whitespace indentation, like YAML. This is very hard to work with natively in PowerShell. This is the reason why I have been avoiding converting to TMDL. All of my PowerShell functions will break, and it's too much for me to rewrite everything.

Today I found out about the Microsoft.AnalysisServices library that allows me to suck in a whole folder and convert it to an object in memory ("deserialise" it) and still work with the data model programmatically.

Getting Started

  1. Install Microsoft.AnalysisServices
  2. Save it somewhere
  3. Deserialise a definition folder like this:
Add-Type -Path "$env:LOCALAPPDATA\PackageManagement\NuGet\Packages\microsoft.analysisservices.19.107.2\lib\net8.0\Microsoft.AnalysisServices.dll"
$modelPath = ".\*.SemanticModel\definition"
$model = [Microsoft.AnalysisServices.Tabular.TmdlSerializer]::DeserializeModelFromFolder($modelPath)

$model # this is the model object, ready to interact with

Notes

Data types and methods

Let's talk about data types. In the [Tabular] namespace, there are two types that we care about:

Correspondingly, let's look at two methods on the [TmdlSerializer] class:

Let's look at an example of a model.bim file to relate this to.

// model.bim

{                                               // <-- This layer is the [Database] type
    "compatibilityLevel": 1702,                 
    "model": {                                  // <-- This layer is the [Model] type
        "annotations": { ... },
        "description": { ... }, 
        "expressions": [
            {
                "name": "...",
                "expression": [ ... ]
            }
        ], 
        "tables": [
            {
                "name": "...",
                "columns": [ ... ]
            }
        ],
        ...
    }
}

This is what model.bim looks like. This is the [Database] data type; note that it has the compatibilityLevel property, and that it also has the model property. Don't let the name fool you; "Database" doesn't mean that it literally has tabular data, it means that it's the metadata about the database.

When you use the DeserializeDatabaseFromFolder method, you get out the [Database] type. This is equivalent to reading the .bim file and converting it using ConvertFrom-Json.

The model property of this is of type [Model]. This is the one that we really care about, and that we will be working on. The DeserializeModelFromFolder gets straight at the model property.

Therefore, these two operations are equivalent:

using namespace Microsoft.AnalysisServices.Tabular

# Take out the model via the database
$tabularDatabase = [TmdlSerializer]::DeserializeDatabaseFromFolder($modelPath)
$model1 = $tabularDatabase.model

# Take out the model directly
$model2 = [TmdlSerializer]::DeserializeModelFromFolder($modelPath)

# Convert to JSON so that we can compare
$json1 = [JsonSerializer]::SerializeObject($model1)
$json2 = [JsonSerializer]::SerializeObject($model2)

# Compare and see that they are the same
$json1 -eq $json2 # True

Differences

There are some differences that arise between using the .NET serialiser classes, and working directly from the JSON.

Properties

The result of deserialising is not the same as converting a model.bim to JSON. A [Model] object has more properties on it than model.bim contains. You won't end up with the same object.

As far as we can see, model.bim only has two properties: compatibilityLevel, and model.

However, when we deserialise using [JsonSerializer]::DeserializeDatabase, we get way more properties.

using namespace Microsoft.AnalysisServices.Tabular

$bimfile = "$path$\model.bim"

$bimText = Get-Content $bimFile -Raw

# pull out the model via serialiser
$model1 = [JsonSerializer]::DeserializeDatabase($bimText)

# pull out the model via JSON converter
$model2 = $bimText | ConvertFrom-Json -Depth 20

$props1 = $model1 | Get-Member | ? { $_.MemberType -eq 'Property' } | % Name
$props2 = $model2 | Get-Member | ? { $_.MemberType -eq 'NoteProperty' } | % Name

$props1.Count # 46
$props2.Count # 2

Putting model.bim back together again

When working with model.bim, it is best to treat it just as a JSON file, and not deserialise it using the .NET methods. It will break.

In addition to the extra properties that the [Model] class will put on, serialising it will also result in a JSON structure that is not alphabetically sorted.

Whenever Power BI saves a model.bim file, it will save it with all of the properties alphabetically sorted. This will actually introduce a big headache when comparing diffs.

#powerbi #tmdl