Dyota's blog

Excel number conversion

Excel's number conversion is hell. In the context of ERP implementations and data migration, it's a potential project killer.

This is a problem that is most visible with CSV files, with values that "look like" numbers. For example, a cell value that is 0001, Excel will eagerly convert that to 1, because it thinks that the leading zeroes are not semantically significant. In fact, that 0001 could be a string, and the leadings zeroes are significant.

When a user opens a CSV file in Excel, Excel will convert all the number-like values to numbers. When the user closes and saves the file, even if they save back as a CSV, they will have unwittingly mutated the data.

Fully quoted values is not resistant to this.

In this kind of environment, maintaining data integrity is like staying close to the ledge on a cliffside so that you don't make a misstep and fall to your death. It requires engineering, discipline, and gatekeepers.

The unavoidable fact is that normal people are going to have to touch your data files and either make corrections or notes.

Here are some ways on hardening up your data.

Things the provider can do

Read-only

Before distributing any CSV files, make them Read-only. This will make them resistant to any kind of edits, either in Excel or in text (notepad, VSCode, etc).

Export-Excel

If using the ImportExcel PowerShell module, the Export-Excel commandlet also eagerly converts numbers.

Make sure to use the -NoNumberConversion flag. Note that this flag requires arguments - you have specify exactly which columms you want to have NoNumberConversion. To say "all columns", use "*".

Export-Excel -NoNumberConversion *

Things the viewer can do

Disable automatic conversion in Excel

Excel Options > Data > Automatic Data Conversion > untick "Remove leading zeroes and convert to a number"

Power Query

A safe way of inspecting CSV files without inadvertently mutating them is by leaving them alone and pulling them in via Power Query. Power Query will preserve the data exactly as it was, especially if the column is explicitly cast as text.

View the file in VSCode

Not as nice looking, and has poor utility, but there are extensions that will virtually align CSV files into columns.