It is quite common to perform date-related calculations when we work with Excel. Before Power Query, we must do it with built-in functions. However, we may not get the results we need with a single function most of the time. In fact, a combination of functions as well as simple math are required. With Power Query, we can perform many date-related calculations via User Interface of Power Query Editor. In this blog post, I am going to show you both the Power Query way (in video) and the Excel formula way (text).
I have a couple of blogposts about how to convert dates stored as text in various formats into real date that Excel recognizes, and you can further work on it. There are many approaches such as text to column and formulas. You may refer to these posts.
These are examples that most Excel users deploy. Did you know… there is a powerful tool in Excel that can make this task super easy! Yes, that’s Power Query!
It is a common task to extract certain texts from a cell in Excel. For example, we may want to
get the length of the cell content,
extract the first or last N characters,
extract the text in a range,
or even a portion of text that is before/after or even in-between specific delimiters.
If you are good in Excel formula, you may be able to achieve this by constructing formula with Excel functions such as LEN, LEFT, MID, LEFT, etc. For simple scenario, it works fine. Nevertheless, let’s face the fact that not everyone is good at writing Excel formula. Also, we must admit that there are some scenarios where only “complex” formula can help.
Unbelievably, with Power Query we don’t need to write a single formula to extract values from a column under many different circumstances. Most of the tasks can be done via User Interface.
I came across this challenge from Excel On Fire. This challenge just happened related to the topic of my previous post regarding splitting column using Power Query. So i take the challenge and offer a different solution.
Don’t miss the video from Excel On Fire. Oz demonstrated many different Power Query techniques to solve the problem, with his unique way of storytelling! I love that!
And this is my solution to the same problem using different approach. The core techniques involved are splitting column into rows and then pivot the resulting column. And the challenge is to keep the records with no item and make the solution dynamic to handle new data in the future.
I hope you like it. Do you have a different solution? Please feel free to share in the comments below.
Find and Replace is quite common in Excel. If you are not sure about what options we have for Find and Replace, please read my blogpost here.
In regular Excel, the Find and Replace action is one-off. Whenever we have new data, we need to perform the same action, repeatedly. If we want to do that by a click of refresh, then we should do it in Power Query. Find and Replace is called “Replace values” or “Replace errors” under the Transform tab of Power Query Editor.
As the name suggests, we can replace values or errors in Power Query. To perform Find and Replace in Power Query is similar to doing the same thing in Excel. However, there are a few details we need to pay attention to.
The above image is not surprising at all. SUMIF returns the expected result correctly. We all write SUMIF like this. (Yes, me too. 😅) It works fine most of the time unless we are facing something not so common. Say for example, the text strings all start with “>> “, like the screenshot below:
To be clear, the content in A3 is equal to the content in D2, however the SUMIF is returning the wrong result? Strange…… 🤔
Do you know what’s wrong with the SUMIF formula? It is the same as the SUMIF formula in the first example. Then why does the first one work, and the second one fails? That’s the question!
If you want to know just the quick fix, here’s the formula:
=SUMIF(A2:A4, "=" & D2, B2:B4)
'The secret sauce is "=" &
If you want to know why, you need to know the basics of writing SUMIF.
In regular Excel, all the above Power Query transformations can be done with simple functions, except for TRIM which Power Query does it slightly different from the regular Excel. You may read the previous post for more details.
You may see the corresponding Excel functions in the following table:
Regular Excel function
Capitalize Each Word
Trim (only remove leading and trailing spaces)
TRIM (also remove extra spaces between words)
You probably know all these Excel functions and wonder why we need to do it in Power Query. I had the same questions too when I first started with Power Query. If you try to clean / transform your data in Power Query, you will find it makes perfect sense NOT to do it with regular Excel functions on worksheet. We shall do all sorts of transformations in Power Query Editor. And most of the time, text transformations stated above are only part of the data cleansing that we need. Therefore, it is important for beginners of Power Query to be aware of their existence in Power Query. And the best thing is we, as users, can do it through merely the User Interface of Power Query Editor, i.e. with a few clicks.
Let’s watch it in action how we can do these text transformations with Power Query:
In Power Query, there are some functions that are very similar to Excel, which you expect the same results from them. However, it may not be the case because they indeed perform differently. TRIM is one of them.
TRIM is a common function in Excel that removes all leading and trailing spaces in a text string. On top of that, TRIM also removes all extra space in between texts. As a result, only single spaces between texts will be kept. Below screenshot shows the results returned by TRIM.
When we perform Transform –> Trim or use the Text.Trim function in Power Query, the results may be a surprise to you! (At least it surprised me when I first noticed it). Here’s the result from Power Query:
Whenever we see “strange” results, we should always check the definition of the function. Indeed, we should do so before we use a function (in the ideal world). But I admitted that I did not do so either. 😅
From the MS documentation, I realized that the Text.Trim function removes leading and trailing spaces only. That’s why.
Unfortunately, there is no such function as Text.TrimLikeExcelin Power Query. But you know, when we work with Excel, there will be workaround(s), most of the time. We may achieve the same with a few Power Query functions together, i.e a little bit of M.
When we import data from a data source, be it a data warehouse, a CSV/TXT file, or even an Excel Table, there are chances that there are many more columns than needed for analysis. This is particularly true when we try to create a fact table with only “keys” and “fact” in the table. Luckily, we can easily remove columns that we don’t need, or simply keep those columns we need with Power Query.
In Power Query Editor, there is a button of “Keep Duplicates” in the ribbon. It’s located under Keep Rows in the section of Reduce Rows in the Home tab of the ribbon.
With this, it is super easy to keep duplicates in our dataset. By the way, duplicates mean value appears more than once. Let’s take a look at the screenshot below, “A”, “B”, and “F” are duplicates,
while “b”, “C”, “D”, “E” are uniques (i.e. values appear only once).
When we apply “Keep Duplicates” to Column1, we obtain the following result.
The good thing about Power Query is, we do not have to worry about the M code auto-generated. Power Query handles the M codes, we handle the User Interface, which is mainly “select and click”. Easy for users, isn’t it?
When we import data into Power Query editor, we should set the scope of data that we need. That means selecting only the rows and columns required for further processing. In the following video, I will show you different ways of keeping the rows that you need with Power Query.
Let’s think about this scenario. In the first encounter of Power Query, you were presented with demonstrations that show all the powerful transformations done by Power Query. You were surprised and amazed with the demos. Normally we did not pay attention to subtle details about the powerful tools. We jumped straight into all the transformation steps by following examples. Sometimes it works, something it doesn’t. And then you feel frustrated as you don’t know why some calculations give you unexpected results. If this sounds familiarize, I will say, this is because we are all trying to run before walk. 😅
In this blogpost, and in the video, I will talk about a basic but crucial step in Power Query – Data Type.
Data Type in Power Query is NOT formatting in Excel
Formatting does not change underlying values, defining data type in Power Query does.
When we work with data using Power Query, defining data type of each column before loading the result is a best practice. Please don’t mix it up with formatting in Excel. Defining data type to a column is fundamentally different from applying a formatting. In Excel, when we apply a format to a number, the underlying value of the number remains unchanged. On the contrary, when we define data type of a column in Power Query, all the values under the column will be converted to the data type that’s being defined.
Power Query is not behaving like Excel when doing calculation
Nevertheless, defining data type in Power Query is not for data conversion only. We need the data type to be explicitly defined before some calculations happen. Otherwise, we may have unexpected results returned. The worst scenario is we don’t even know that. For example, when you try to add “10” to 10, Excel returns 20 while Power Query returns error. Similarly, when you try to add 10 to an empty cell, Excel return 10 but Power Query returns null. In general, number does not work with text; text does not with number in Power Query.
Close and Load to is the last step in Power Query, but the first step in your reporting process. When I first learned Power Query, I preferred loading the result to Table as I could “see” the result on a worksheet. That’s normal, right?
The more I use Power Query, the more I realize that I rarely load my result to table directly. Most of the time, I load all queries to connection only. This is particularly true when my goal of cleansing/transforming data is for further analysis using Pivot Table (or data modeling).
If your goal is to use the processed data for Pivot Table, load the queries to connection.
In this short video, I will show you how to change the load destination from Table back to Connection only. Also, I will show you why it is better to load the queries to connection only when you are planning to use the result in Pivot Table.
If we want to see when a workbook was last modified, we can see the information from File Explorer,
or Excel Info
Very handy indeed.
However, if you want the information to be put on a worksheet, there is no formula to retrieve that information directly. No worries, we can get that piece of information using Power Query and it will be super easy (if you are not going to rename your workbook nor move it to a new folder). In fact, it is not difficult at all to turn the solution into a dynamic one. Let me show you step by step.
As mentioned in the previous post, the awareness of Power Query is extremely low. In my experience, the hurdle for people who have learned Power Query but not feeling comfortable in using it is the fact that they are not well informed about the three major steps involved:
If we learned Excel long enough, we are all trained to move cells around on worksheets for reshaping data; copy and paste for data consolidation; setting formulas for data cleansing or calculations, and so on and so forth. In Power Query, we are doing the same but through quite a different process, listed above. That is normal for people having difficulty to start with Power Query.
So, from now on, please bear in mind that the first step when using Power Query is always getting the data.
Power Query empowers users to get data from multiple sources. It could be data sitting within current Excel Workbook, another Workbook, CSV/TXT files, Web, PDF, or even files from a folder and many more. One of the cool things in this process is that we don’t even need to open the files with the data we need. Remember this: Power Query can read the data from a file (or source) without “opening” it.
Once we get the data (connect to the data source), all the cleansing / transformation magic will happen within Power Query Editor.
It is particularly important to get ourselves familiarized with the User Interface of Power Query Editor because most of the transformation steps are performed via different command buttons on the ribbon.
To illustrate the process and the User Interface of Power Query Editor, I have made this video:
I hope you like it. If you do, please give it a thumbs up, share and subscribe to my channel.
More videos on Power Query are coming. Stay tuned! 😉
Depends on what? It depends on what you do and how you do it in Excel. 😉
Excel is the core software I use for work
I use Excel for data analysis
I need to deal with data from multiple sources (or from multiple Excel workbooks)
I spent hours cleansing/transforming data before I could start analyzing it
I do repetitive cleansing/transformation tasks on a regular basis
I consolidate data by “copy and paste” data from multiple files
I do most of the data work with VBA
If you answered TRUE to more than two questions, then you should really consider setting Power Query as your learning goal in 2022. Believe it or not, you don’t need one year to learn it before you can use the power of it. Although mastering Power Query could be a long journey, most of us are not required to be at the expert level. (I am not an expert of Power Query either.) Just the basics of Power Query are good enough to change your (work) life.