I didn’t think of blogging about this topic before. I thought that it is so straight-forward. Nevertheless, there is another way of doing so which is even more straight-forward BUT may give you unexpected results later on. And to my surprise, that is the way most people (I encountered) do. Maybe it’s worth blogging about it.😁
It is not uncommon to filter out all values that are not equal to zero. Let’s say we want to filter all records with Unit Sold not equal to zero, it is very common for us to simply uncheck 0 from the list of values. I did that (long before) too. 😅
By unchecking “0” from the list, we thought we have told Excel to keep all values but 0. However, Excel interprets it differently. Let’s see by hovering on the filter icon for unit sold:
We have many different tables to be appended. It should be a simple task with Power Query. You may refer to my blogpost here for the basic of appending tables with Power Query. Nevertheless, life could be challenging in workplace. What if the tables do not have consistent column names? We need to convert all the column names into common names before appending tables. Are we going to read and rename all column names table by table, manually? Of course not! And this is exactly the reason for this post! 😉
Inspirations and the Thinking Process:
Before jumping to the solution, I would like to share with you the inspirations and the thinking process behind the scene.
Don’t make me wrong. I am not complaining Power Query at all. If you have followed me for a while, you should know that I am a big fan of Power Query indeed. It is simply powerful! With Power Query, lots of data cleansing, reshaping, consolidating tasks become easy and efficient. If you want to learn more what Power Query could do, have a look at my blogposts and playlist of videos related to Power Query.
To me, data cleansing is a process with seven stages:
Understanding the expected outcome
Studying the data on hand
Finding the pattern(s)
Cleansing the data (this is the stage most people interested in)
Checking the results
Refining the cleansing steps (most of the cases required, and need to go back to stage 2)
Repeat the above cycles until the expected outcome is obtained
Most of the time, we focus on stage 4, i.e. Cleansing the data itself. Nevertheless, stages 5 to 7 are critical to success, although often got unnoticed.
Let me show you an example here. (Note: this post is not a step-by-step tutorial).
In the previous blogpost, I showed you how to use INDIRECT to prepare a summary table based on values from different ranges across worksheets. In this post, I am going to show you how to do that with Power Query.
Why Power Query? Because it is simply powerful! 😁
With INDIRECT, the summary table won’t update by “Refresh”. When new sheet is added, you have to add the sheet name and extend the formulas manually. With Power Query, a click of refresh would do.
There are many #Excel functions that you may not find them useful when you first learn them. Sometimes you may even doubt why there are such functions. INDIRECT should be one of them. 🤔
What INDIRECT does?
It returns the reference specified by a text string.
The syntax is simple:
where ref_text is required. It is the text string describing the cell reference
[a1] is optional. When it is omitted or TRUE, Excel treat ref_text as A1-style reference; when it is FALSE, Excel treats ref_text as R1C1-style reference.
Common examples are in the simplest form.
Say in A1, we input something like B2, B4, D10, etc. (any valid cell reference). And then in another cell, say B12, we reference to A1 using
When we change the value in A1, the formula in B12 returns another value based on the text input in A1. The following screencast shows the effect:
Now I hear you:
Why don’t we simply use a direct reference? We can do the same by inputting the simple direct cell reference in B12
and we can change the formula to =B4, B6, D10, or whatever valid cell reference, easily whenever needed. What’s the point of using INDIRECT and reference it somewhere else?
I had the same question too… until I found a valid use case of it.
Applying filter in Excel is a very common task. We used to filter a dataset and focus on a subset of data that we are interested in. It is super easy to filter. The problem is, are you sure you have the right dataset being filtered?
I have a list of files stored in a folder. All files are generated automatically with the same filename convention: “FiscalYearWeekyyyyww.csv”, where yyyy = year; ww = week. The file started from week 01 of fiscal year 2018, all the way to the latest week. The latest file, say, is “FiscalYearWeek202105.csv”.
The task is to combine all files but those for FiscalYear 2018, using Power Query of course. (note this post is not about Power Query, but you may refer to other related post here).
Easy! We can exclude files of fiscal year 2018 by filter rows where ‘Name’ does not contain 2018, as shown in the following screenshot.
If this is what you do, please pause for 10 seconds and think carefully…
Have you heard about all the amazing new functions/features like Dynamic Arrays in Excel 365 but not sure what they are?
Or you have no idea at all how Excel 365 is different from standalone versions of Excel?
If you answer “Yes” to any of these questions, I suggest you take some time to read the blogpost Personal Stock Portfolio Dashboard by MYNDA TREACY , and more importantly watch the video there.
Because you will see how to build such a beautiful Excel Dashboard in less than 30 minutes using some of the new features available in Excel 365.
Personally, I think this is an excellent video demonstrating some core new features like Excel Data Types and Dynamic arrays. Not to mention how easy an Excel Dashboard can be built in 30 minutes (when you get the skills required). 😁
At the end of the blogpost, you will also see more Excel Dashboard post by MYNDA. You will learn a lot of Excel skills, tricks and tips by reading those posts and watching the videos associated with them.
If you are planning to uplift your skills in Excel Dashboard, consider the popular Excel Dashboard Course by MYNDA TREACY. There is a 20% discountuntil Feb 11, 2021. 👍
So, do yourself a favor and check out the course. The price is incredibly fair, the course is awesome and it’ll transform your Excel reports and possibly even your Excel career. 😉
Disclosure: I make a small commission (at no additional costs to you) for students who join Mynda’s course via my site, but as you know I don’t just recommend anything and everything. It has to be of outstanding quality and value, and something I can genuinely recommend. After all, if it doesn’t live up to what I’ve promised you’ll think poorly of me too and I don’t want that. Oh, and just watching the course videos won’t transform your career, you have to actually put it into practice, as if reading a cookbook won’t make you a chef.
Part 2 – What if we have inconsistent table structures across files?
Note: this is a long post. 😉
This is the continuation of previous post. Suggest you read that post first if you are new to Power Query. Moreover you may download sample data files there to follow along.
In the previous post, we used Power Query to combine all files in a folder. With that built, we can get new data file by a simple click of Refresh. The following screencast shows you that we have combined three data files. When a new file – Week 4 data, comes into the folder, we can easily get the new data by a click of Refresh.
Before Week 4 data, there were 63 rows of data. After we have 84 rows of data.
Isn’t it great? It is awesome! 😎
Then next week, we have data file for Week 5. We feel great as we expect a consolidated table by putting the data file into the same folder, followed by a click of Refresh… BUT…
we didn’t see the date of week 5, but (Blank)? What happened? 😨
Let’s have a closer look at the Queries & Connections Pane
Part 1 – What Power Query does for us in the process?
If you know about Power Query, you should know that combining files in a folder is a powerful yet simple task using Power Query, especially when there is a consistent pattern among all files. Like in the example above, there is only one worksheet “Sheet1” in each file. Moreover, all files share consistent layout with common column headers, which reside on row 1.
When new files come in week by week, a simple click of “Refresh” would get us the combined table for further analysis. As simple as that.
It worked like a charm until one day, your I.T. team changed the headers slightly… without getting consent from you (and all other users) 🙄
For whatever reason, the headers had been changed to below:
Excuse me? Breaking the headers from one row into two rows??? What’s the point of doing this? I have no idea but it happened. 😭
And you knew it… the query was broken.
The luckiest part is, the change was minor AND it’s confirmed that it will not changed anymore. Put it in other way, all the coming files will have two rows of headers, with the same data layout in the four columns in the same order in the future.
To fix the broken query, we could:
Modify the query to fit table structure for “new” files and manually change the headers in all old files for consistence or
Modify the query to fit the table structure for both new and old files
Which option would you go for?
If there is only one or two “old” files, I would probably pick option 1. What if there are tens of old files in place already? Option 2 is the preferred way. No doubt! 😁
Either way, in order to modify the query for combining files, we need to know what to modify, and more importantly where to modify.
Therefore it’s better to understand what is happening when we try to combine files in a folder using Power Query. This will be the focus of this blogpost.
Have you ever encountered a situation like this? How are we supposed to work with a PivotTable without seeing the data fields?
I was so lost when I saw this from a workbook on my friend’s notebook. He came to me for PivotTable help but I was not able to even work with the PivotTable he opened and presented to me… How embarrassing… 😑😅