Dyota's blog

PowerShell for Power BI: Scan for Unused Columns

In the beginning of a Power BI project, sometimes it's necessary to bring in many, or all, of the columns from tables from your data sources. You don't know yet which columns you will need.

At the end, it will have been established which columns really necessary, but it's sometimes hard to keep track of which ones ended up being used.

I like to keep things clean at the end of the project. It reduces the file size, the data size (faster refreshes), and reduces confusion down the track when you have to re-open the project and make modifications.

Third-party tools like Bravo will tell you which columns are unused in the data model, but won't tell you if they are referenced in report visuals or page filters.

This information however is contained in report.json and model.bim, if you save your Power BI file as a .pbip.

Below is a PowerShell script that takes all of the columns in all of the tables present in the data model, and searches the following to check if they are referenced:

(The following should really also be scanned through, but I didn't need to do it today:

The output of this script is a text file called no references.txt.

This helps with identifying unused, but doesn't help with actually removing them.

using namespace System.Data; # for DataTable
using namespace System.Collections.Generic; # for List
function getMeasures ($model) {
    # given a data model, rip out all of the measures from it
    # this will form a reference to check against
    [List[pscustomobject]] $measures = @()
    $model.model.tables.measures |
        ForEach-Object {
            if ($null -ne $_.name) {
                $measures.Add(
                    [ordered] @{
                        name       = $_.name;
                        expression = $($_.expression) -join "`n"
                    }
                )
            }
        }
    return $measures
}
function getVisualReferences ($report) {
    # given a report definition, rip out all of the visuals
    # this will form a reference to check against
    [List[pscustomobject]] $visualReferences = @()
    $report.sections |
        ForEach-Object {
            $page = $_.displayName
            $_.visualContainers |
                ForEach-Object {
                    $config = $_.config | convertfrom-json
                    $visualId = $config.name
                    $visual = $config.singleVisual
                    $visualType = $visual.visualType
                    # $columnReference = $visual.prototypeQuery.Select.Name # don't use this, this is not accurate and can stick to old table names
                    $prototypeQuery = $visual.prototypeQuery
                    $from = $prototypeQuery.from | Select-Object name, entity
                    $select = $prototypeQuery.select
                    $select | 
                        ForEach-Object {
                            # get either column or measure, depending on which one is there
                            if ($null -ne $_.column) {
                                $reference = $_.column
                            }
                            elseif ($null -ne $_.measure) {
                                $reference = $_.measure
                            }
                            $source = $reference.expression.sourceref.source # alias
                            $tableName = ($from | Where-Object { $_.name -eq $source }).entity # full table name
                            $property = $reference.property
                            $columnReference = "$tableName.$property"
                            if ($null -ne $columnReference) {
                                $visualReferences.Add(
                                    [ordered] @{
                                        page            = $page;
                                        id              = $visualId;
                                        visualType      = $visualType;
                                        columnReference = $columnReference;
                                    }
                                )
                            }
                        }
                    }
                    # page level filters also reference columns,so check these
                    $_.filters | 
                        ConvertFrom-Json |
                        ForEach-Object {
                            $entity = $_.expression.column.expression.sourceref.entity
                            $property = $_.expression.column.property
                            $visualReferences.Add(
                                [ordered] @{
                                    page            = $page;
                                    visualType      = "page filter";
                                    columnReference = "$entity.$property";
                                }
                            )
                        }
                    }
    return $visualReferences
}
function hasNoReferences ($table, $column) {
    # check measures and visuals for any references to the target column
    $measuresCount = ($measures | Where-Object { $_.expression.Contains($column) }).Count
    $visualReferencesCount = ($visualReferences | Where-Object { "$table.$column" -in $_.columnReference }).Count
    return ($measuresCount -eq 0) -and ($visualReferencesCount -eq 0) 
}
$model = Get-Content ".\ProjectName.SemanticModel\model.bim" | ConvertFrom-Json
$measures = getMeasures $model
# reset the out file
'' > ".\no references.txt"
# This is an array, in case the semantic data model has multiple children reports
$reportNames = @(
    'ReportNae'
)
$visualReferences = $reportNames |
    ForEach-Object {
        $reportname = $_
        $report = Get-Content ".\$_.Report\report.json" | ConvertFrom-Json
        getVisualReferences $report |
            ForEach-Object {
                $_ | Add-Member -NotePropertyName 'source' -NotePropertyValue $reportname
                $_
            }
        } 
$visualReferences | convertto-json > ".\visual references.json"
$tables = $model.model.tables
$tables |
    ForEach-Object {
        $table = $_.name
        $columns = $_.columns
        $columns |
            ForEach-Object {
                $column = $_.name
                if (hasNoReferences $table $column) {
                    $out = "$table.$column"
                    $out >> ".\no references.txt"
                }
            }
        }
# calculated cols
# pq

#model.bim #powerbi #powershell