The importance of explicit data type in Power Query
What could go wrong when we have a column of “ANY” data type in Power Query? Well, it depends. Depends on what? Sometimes luck! 😅
What’s wrong here?
This query has been working well all the time until one day…
A real story of myself
One day a colleague called me for the following error message that stopped the queries from running in her Power BI report.
I helped her build that Power BI report quite some time ago. She has been updating the data sources and running the report successfully every week without any error. One day, she was stuck with such an error without any clues.
The error message is clear. Power Query cannot join Number with Text.
Having said, it is not an easy task to locate where we have such a problem, especially on a large model. What I am showing here is a super simplified version that we focus on only three columns of a table.
We took some time to identify the data files that potentially provoked the bug. Just to mention, those files are system-generated Excel files. The reports run from the system returned an updated dataset in “exactly” the same layout and format, which I thought it was even after looking at the data file below.
Honestly, can you spot what went wrong in this file?
As this is a system-generated file, all values in column A are supposed to be “text”. However, one weird record missed this important property and stored as “number” (Row 13 in the screenshot above). There are more than 10k lines of records across multiple files. How are we supposed to spot this outliner?
The error message, although imperfect, helped indeed:
From the message, we know what to search for. 😉
Why did Power Query fail?
Because we did not define the column “PO Number” as text before merging it with another column. 😰
As it has always been “text” in the source files, the error message had never come out until the unexpected bug from the source sytem showed up from nowhere, that returned a column mixed with text and number.
How to fix it?
Indeed, it is easy when we know where the problem is. Just define the column “PO Number” as Text before merging it with other column(s).
I am sure that I “overlooked” that column somehow as I have all other columns defined explicitly.
Although my colleague and I both wondered why the source files had such weird behavior suddenly, I take this surprising bug as a good learning and a reminder to myself:
Be incredibly careful in defining the data type of each column in Power Query.
Want to learn more about data type in Power Query? Check this out: