I really like Power Query (in both Excel and Power BI). Honestly I do not know much about the M behind the scene (still learning it piece by piece). Nevertheless, I’ve got the POWER simply through the User Interface of Power Query Editor. Seriously, everyone can become a data wizard by using Power Query! Let’s watch the following two videos to feel the POWER of Power Query.
Here’s is a challenge (with solution) from Mr Excel on how to reshape data.
You may download a sample workbook from Mr.Excel YouTube channel to follow along.
And here’s my approach to solve the challenge:
If you prefer reading to watching, please continue to read this post.
Note: All screenshots used in this post are captured using Excel for Office 365. If you are using other versions of Excel, you may find the locations and look&feel of buttons a bit different. The concept is the same though.
Here’s the challenge – Reshaping the data
A picture tells thousand words:
Before Power Query, the first thing in my mind is…
OMG…. seriously? Do i have to re-shape the data by manual Cut and Paste? How many columns do we have? Is it one-off task? Reoccurred tasks with new data?
With Power Query, these questions bother me no more! 🙂
Here’s the step-by-step solution:
A. Turn the table into Excel Table
- Select the range of data (A5:Z11) in our example
- Go to Insert tab
- Select Table
- Uncheck “My table has headers”
- Click OK
B. Load the Excel Table to Power Query
Select any cell of the Table just created…
- Go to Data tab
- Select From Table/Range
C. Transpose the Table
This is our first transformation.
- Go to Transform tab
- Select Transpose
Now it’s the time to make a promotion 🙂 To promote the first row as header:
- Go to Home tab
- Select Use First Row as Headers
Now we’ve got a much better layout 🙂
If you are experienced in preparing data for pivot table, you should know that grand totals and/or subtotals are not necessary in the dataset. Same case here! We will remove those grand totals and subtotals later.
D. Remove Grand Totals
To remove the grand totals in this case is easy. Simply remove the top 5 rows:
- Go to Home tab
- Select Remove Rows
- Select Remove Top Rows
- Input 5
- OK
Nevertheless we need further steps before we can remove the subtotals.
E. Split the Employee from Quarter; and Add a column for Employee
As a general rule, we should have more than one information under one column. In our dataset we now have both Employee and Quarter under the same column “Category Description” (we will rename this header later). This is no good. We need to separate Employee from this column and create an column of Employee alone.
Let’s do it:
- Go to Add Column tab
- Click Index Column
A new column called “Index” is added. You will see a list of sequential number from 0 to 19 there. Nevertheless, this sequential list doesn’t help much at this stage. Look at the dateset, our records are organized in a way of 5 rows per record, with the first row being the employee name and the subtotal. So a list of repeating pattern from 0 to 4 would help!
Let’s further transform the Index column to get what we need:
- Select the Index column
- Go to Transform tab
- Select Standard
- Select Modulo
- Input 5
- OK
What Modulo does?
Based on the value input, the function Module will give the remainder for each value in the column. For a sequential list from 0 to 19, this function returns a list of 0 to 4 repeatedly.
Think about this:
- 5 divided by 5 –> Remainder is 0
- 6 divided by 5 –> Remainder is 1
- 7 divided by 5 –> Reainder is 2
- …
- 10 divided by 5 –> Remainder is 0
- 11 divided by 5 –> Remainder is 1
- …
Make sense? As a result, the Index column become this:
Now we need another helper column to extract the employee name:
- Select the Index column
- Go to Add Column
- Select Conditional Column
Then input the Condition and Output as below:
With the new column “Name” just created, we’ve got the employee name correctly on each row with “0” in the Index column. We need to fill the gap:
- Select the Name column; Go to Transform tab
- Select Fill
- Select Down
F. Remove Subtotals
With the two helper columns added in the previous section, we are now ready to remove the subtotals:
- Click the pull down icon on the header of “Index“
- Uncheck 0
- OK
Let’s rename the columns by double-clicking the header:
The column Index is no longer require. Right-click on the header and Remove it.
Thanks for its contribution! 👏
G. Unpivot Columns other than Employee and Quarter
- Select the column Quarter
- Also select the column Name (by holding Ctrl key)
- With the two columns selected, right-click on the header –> Unpivot Other Columns
Rename “Attribute” to “Category”
Tip: Now we’ve got a layout that is ideal for various data analysis and pivot table. We may stop here and get the required output by using Pivot Table. Please read the bonus part of my video for more details.
H. Pivot Column
- Select the column Quarter
- Go to Transform tab
- Select Pivot Column
- In the Pivot Column dialog box, Select Value
- OK
Almost there…
We need the total column for four quarters.
Let’s do it:
- Select the four columns Q1 to Q4 (tip: hold SHIFT key for selecting adjacent columns)
- Go to Add Column tab
- Select Standard
- Select Add (so obvious 😁)
Remember to rename to added column from “Addition” to “Total”
To make the final output 100% matched with the requirement, let’s re-order the columns by drag and drop:
Final touch-up, define the date type.
I. Close and Load
Now we are ready to load the output to spreadsheet!
- Go to Home tab
- Select Close and Load
- Select Close and Load to…
In the subsequent dialog box:
- Select Table
- Select Existing worksheets: and input the cell reference where you want to load the result
- OK
Here we go! Enjoy!
As you see, we can do the data-reshaping all the way with Power Query’s UI only. Not a single line of code written. Isn’t it cool? 😎
Although we can finish the transformation with Power Query, I’d highly recommend we stop at section G because the resulting layout there is ideal for further analysis. We can do it with regular Pivot Table with more flexibility to end users, provided that end users know how to use Pivot Table. 😉
You may watch the bonus session in the video to learn more.
I hope you can feel the POWER of Power Query and start falling in love with it.
Click HERE to download the sample file with the solution.
My solution was similar to yours. However, instead of pivoting the data, I used Group By.
let
Source = Excel.CurrentWorkbook(){[Name=”UglyData”]}[Content],
#”Transposed Table” = Table.Transpose(Source),
#”Promoted Headers” = Table.PromoteHeaders(#”Transposed Table”),
#”Removed Top Rows” = Table.Skip(#”Promoted Headers”,5),
#”Added Conditional Column” = Table.AddColumn(#”Removed Top Rows”, “Category”, each if not Text.StartsWith([Category Description], “Q”) then [Category Description] else null),
#”Filled Down” = Table.FillDown(#”Added Conditional Column”,{“Category”}),
#”Filtered Rows” = Table.SelectRows(#”Filled Down”, each Text.StartsWith([Category Description], “Q”)),
#”Grouped Rows” = Table.Group(#”Filtered Rows”, {“Category”}, {{“Data”, each Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Table.RemoveColumns(_,{“Category”})))), type table}}),
#”Expanded Data” = Table.ExpandTableColumn(#”Grouped Rows”, “Data”, {“Category Description”, “Q1”, “Q2”, “Q3”, “Q4”}, {“Category Description”, “Q1”, “Q2”, “Q3”, “Q4″}),
#”Inserted Sum” = Table.AddColumn(#”Expanded Data”, “Addition”, each List.Sum({[Q1], [Q2], [Q3], [Q4]}), type number)
in
#”Inserted Sum”
LikeLike