Find and Replace is quite common in Excel. If you are not sure about what options we have for Find and Replace, please read my blogpost here.
In regular Excel, the Find and Replace action is one-off. Whenever we have new data, we need to perform the same action, repeatedly. If we want to do that by a click of refresh, then we should do it in Power Query. Find and Replace is called “Replace values” or “Replace errors” under the Transform tab of Power Query Editor.
As the name suggests, we can replace values or errors in Power Query. To perform Find and Replace in Power Query is similar to doing the same thing in Excel. However, there are a few details we need to pay attention to.
The above image is not surprising at all. SUMIF returns the expected result correctly. We all write SUMIF like this. (Yes, me too. 😅) It works fine most of the time unless we are facing something not so common. Say for example, the text strings all start with “>> “, like the screenshot below:
To be clear, the content in A3 is equal to the content in D2, however the SUMIF is returning the wrong result? Strange…… 🤔
Do you know what’s wrong with the SUMIF formula? It is the same as the SUMIF formula in the first example. Then why does the first one work, and the second one fails? That’s the question!
If you want to know just the quick fix, here’s the formula:
=SUMIF(A2:A4, "=" & D2, B2:B4)
'The secret sauce is "=" &
If you want to know why, you need to know the basics of writing SUMIF.
In regular Excel, all the above Power Query transformations can be done with simple functions, except for TRIM which Power Query does it slightly different from the regular Excel. You may read the previous post for more details.
You may see the corresponding Excel functions in the following table:
Regular Excel function
Capitalize Each Word
Trim (only remove leading and trailing spaces)
TRIM (also remove extra spaces between words)
You probably know all these Excel functions and wonder why we need to do it in Power Query. I had the same questions too when I first started with Power Query. If you try to clean / transform your data in Power Query, you will find it makes perfect sense NOT to do it with regular Excel functions on worksheet. We shall do all sorts of transformations in Power Query Editor. And most of the time, text transformations stated above are only part of the data cleansing that we need. Therefore, it is important for beginners of Power Query to be aware of their existence in Power Query. And the best thing is we, as users, can do it through merely the User Interface of Power Query Editor, i.e. with a few clicks.
Let’s watch it in action how we can do these text transformations with Power Query:
In Power Query, there are some functions that are very similar to Excel, which you expect the same results from them. However, it may not be the case because they indeed perform differently. TRIM is one of them.
TRIM is a common function in Excel that removes all leading and trailing spaces in a text string. On top of that, TRIM also removes all extra space in between texts. As a result, only single spaces between texts will be kept. Below screenshot shows the results returned by TRIM.
When we perform Transform –> Trim or use the Text.Trim function in Power Query, the results may be a surprise to you! (At least it surprised me when I first noticed it). Here’s the result from Power Query:
Whenever we see “strange” results, we should always check the definition of the function. Indeed, we should do so before we use a function (in the ideal world). But I admitted that I did not do so either. 😅
From the MS documentation, I realized that the Text.Trim function removes leading and trailing spaces only. That’s why.
Unfortunately, there is no such function as Text.TrimLikeExcelin Power Query. But you know, when we work with Excel, there will be workaround(s), most of the time. We may achieve the same with a few Power Query functions together, i.e a little bit of M.
When we import data from a data source, be it a data warehouse, a CSV/TXT file, or even an Excel Table, there are chances that there are many more columns than needed for analysis. This is particularly true when we try to create a fact table with only “keys” and “fact” in the table. Luckily, we can easily remove columns that we don’t need, or simply keep those columns we need with Power Query.
In Power Query Editor, there is a button of “Keep Duplicates” in the ribbon. It’s located under Keep Rows in the section of Reduce Rows in the Home tab of the ribbon.
With this, it is super easy to keep duplicates in our dataset. By the way, duplicates mean value appears more than once. Let’s take a look at the screenshot below, “A”, “B”, and “F” are duplicates,
while “b”, “C”, “D”, “E” are uniques (i.e. values appear only once).
When we apply “Keep Duplicates” to Column1, we obtain the following result.
The good thing about Power Query is, we do not have to worry about the M code auto-generated. Power Query handles the M codes, we handle the User Interface, which is mainly “select and click”. Easy for users, isn’t it?
When we import data into Power Query editor, we should set the scope of data that we need. That means selecting only the rows and columns required for further processing. In the following video, I will show you different ways of keeping the rows that you need with Power Query.
Let’s think about this scenario. In the first encounter of Power Query, you were presented with demonstrations that show all the powerful transformations done by Power Query. You were surprised and amazed with the demos. Normally we did not pay attention to subtle details about the powerful tools. We jumped straight into all the transformation steps by following examples. Sometimes it works, something it doesn’t. And then you feel frustrated as you don’t know why some calculations give you unexpected results. If this sounds familiarize, I will say, this is because we are all trying to run before walk. 😅
In this blogpost, and in the video, I will talk about a basic but crucial step in Power Query – Data Type.
Data Type in Power Query is NOT formatting in Excel
Formatting does not change underlying values, defining data type in Power Query does.
When we work with data using Power Query, defining data type of each column before loading the result is a best practice. Please don’t mix it up with formatting in Excel. Defining data type to a column is fundamentally different from applying a formatting. In Excel, when we apply a format to a number, the underlying value of the number remains unchanged. On the contrary, when we define data type of a column in Power Query, all the values under the column will be converted to the data type that’s being defined.
Power Query is not behaving like Excel when doing calculation
Nevertheless, defining data type in Power Query is not for data conversion only. We need the data type to be explicitly defined before some calculations happen. Otherwise, we may have unexpected results returned. The worst scenario is we don’t even know that. For example, when you try to add “10” to 10, Excel returns 20 while Power Query returns error. Similarly, when you try to add 10 to an empty cell, Excel return 10 but Power Query returns null. In general, number does not work with text; text does not with number in Power Query.
Close and Load to is the last step in Power Query, but the first step in your reporting process. When I first learned Power Query, I preferred loading the result to Table as I could “see” the result on a worksheet. That’s normal, right?
The more I use Power Query, the more I realize that I rarely load my result to table directly. Most of the time, I load all queries to connection only. This is particularly true when my goal of cleansing/transforming data is for further analysis using Pivot Table (or data modeling).
If your goal is to use the processed data for Pivot Table, load the queries to connection.
In this short video, I will show you how to change the load destination from Table back to Connection only. Also, I will show you why it is better to load the queries to connection only when you are planning to use the result in Pivot Table.
If we want to see when a workbook was last modified, we can see the information from File Explorer,
or Excel Info
Very handy indeed.
However, if you want the information to be put on a worksheet, there is no formula to retrieve that information directly. No worries, we can get that piece of information using Power Query and it will be super easy (if you are not going to rename your workbook nor move it to a new folder). In fact, it is not difficult at all to turn the solution into a dynamic one. Let me show you step by step.
As mentioned in the previous post, the awareness of Power Query is extremely low. In my experience, the hurdle for people who have learned Power Query but not feeling comfortable in using it is the fact that they are not well informed about the three major steps involved:
If we learned Excel long enough, we are all trained to move cells around on worksheets for reshaping data; copy and paste for data consolidation; setting formulas for data cleansing or calculations, and so on and so forth. In Power Query, we are doing the same but through quite a different process, listed above. That is normal for people having difficulty to start with Power Query.
So, from now on, please bear in mind that the first step when using Power Query is always getting the data.
Power Query empowers users to get data from multiple sources. It could be data sitting within current Excel Workbook, another Workbook, CSV/TXT files, Web, PDF, or even files from a folder and many more. One of the cool things in this process is that we don’t even need to open the files with the data we need. Remember this: Power Query can read the data from a file (or source) without “opening” it.
Once we get the data (connect to the data source), all the cleansing / transformation magic will happen within Power Query Editor.
It is particularly important to get ourselves familiarized with the User Interface of Power Query Editor because most of the transformation steps are performed via different command buttons on the ribbon.
To illustrate the process and the User Interface of Power Query Editor, I have made this video:
I hope you like it. If you do, please give it a thumbs up, share and subscribe to my channel.
More videos on Power Query are coming. Stay tuned! 😉
Depends on what? It depends on what you do and how you do it in Excel. 😉
Excel is the core software I use for work
I use Excel for data analysis
I need to deal with data from multiple sources (or from multiple Excel workbooks)
I spent hours cleansing/transforming data before I could start analyzing it
I do repetitive cleansing/transformation tasks on a regular basis
I consolidate data by “copy and paste” data from multiple files
I do most of the data work with VBA
If you answered TRUE to more than two questions, then you should really consider setting Power Query as your learning goal in 2022. Believe it or not, you don’t need one year to learn it before you can use the power of it. Although mastering Power Query could be a long journey, most of us are not required to be at the expert level. (I am not an expert of Power Query either.) Just the basics of Power Query are good enough to change your (work) life.
Let me share a quick tip of Power BI today coz I work with Power BI Desktop a lot recently. 😎
What I am going to share in this post is my favorite shortcut key in Power BI Desktop. Honestly, I know just a few shortcuts in Power BI Desktop (much less when comparing to Excel). Nevertheless, the following shortcut is absolutely my all-time favorite, even a lifesaver! What is it?
Finding and highlighting differences should be a common task in Excel. Most of the time, it’s solved by formula or conditional formatting. Either way, we need to construct a simple formula like the one below to compare the cell contents in two lists.
= A1 = B1
In this post, I am going to show you a non-formula approach which is very handy, especially when we are dealing with a one-off task. What we need is just a few keystrokes, or mouse clicks.
Let’s watch it in action:
Prefer reading to watching? No problem. You may refer to the blogpost HERE or continue to read.
Have you ever encountered something like this? This is quite common. We are sure that the VLOOKUP formula is correct. We are sure that the lookup values are available in the lookup table (101, 106 107 in our example above). What we are not sure is the reason of the error returned… What happened?
The answer is simple: The “101” in B5 is different from the 101 in F5. What?? What’s the difference? The former is text, while the latter is numeric value. This is a basic but important concept for all Excel users, which I had discussed in the post HERE.
To fix the problem, we need to fix the data type in the lookup table. In other words, we need to make sure that all values for Room No. in the lookup table should be text (because the list of the lookup values in column B is text.
You may be wondering; can we simply change the cell format to text. See below:
Unfortunately, it does not work!
We cannot simply convert a number to text by changing the cell format. If we want Excel to accept a numeric input as text, we should set the cell format to Text BEFORE, not after the value is input.
We can use the function ISTEXT to test whether a value is text or not. See below:
Now we understand the problem. It’s time to show various solutions using formula approach. The trick is to use the function TRIM to convert all values under the lookup column into text to make the formula work.