Today we'll see how to unpivot data using Power Query in MS Excel.
How can we transpose data that
are in multiple columns, but also has multiple headers,
which is more like a reporting type of layout into a tabular data set
so that it's easy for us to create a pivot table, or do
further analysis using formulas?
We want to do it dynamically, though so
that if we end up adding more columns to this, all we need to do is refresh the
end report and everything will pull automatically through.
So we have our customer names, we have
an article description, and we have a nice report about sales values by month,
which are recorded as dates and by scenario. What we want to do is to transform
the report into a proper tabular data set. In the end,
we should just have a single column for a customer, for the article.
There shouldn't be any gaps, then we
have a separate column for a scenario, followed by a separate column for a
month, and finally a single column for the sales values. But we have a few
challenges to overcome.
Challenges
Our first challenge is that we also
have totals in the middle of the report. And we need to exclude these from our
results. So the pattern is we have customer information and
then we have the total for that customer, and then all the way, in the end, we
have the grand total.
Now, in addition to this, our second
challenge is to somehow overcome the problem of multiple headers.
If we just had a single header, things would be a lot simpler. We can use Power
Query's Unpivot feature and get this done fast. But somehow we need to
deal with the multiple column headers.
Our third challenge is to get this to
be dynamic, so that if we add data for future months, so let's say for April,
all we have to do is refresh our end report and we get everything pulled
through in a proper format.
So let's see how we can set this up
with Power Query. The first thing we need to do is send the
data to the Power Query Editor, which means we can work
either with named ranges or with Excel tables.
Now I prefer to work with tables
whenever I can. But what I don't want to happen is to change the look of the
report. I want this to look the same, which means that when you want to work
with ranges that looked like merge cells, you shouldn't use
merge cells and I haven't used it in the report. Instead, what you need to do,
if I press Control + 1, in the Alignment options, is that you need to use Center
Across Selection. Because this has the same visual effect as a merged
cell, but it’s not a merged cell.
So when you create a table out
of the data set, it's not going to disrupt the look of your
report. Because remember, merge cells aren't allowed
in Excel tables. So if you create a table that
includes merging cells, these will be demerged. Because I'm using Center
Across Selection, I'm not going to have a problem.
So let's create a table out
of this but just so we can keep our eye on the bottom of the report as well,
let's split our view. So go to the View tab and add a split on the bottom side.
I'm just going to scroll down so we can keep our table in view, the top part
and the bottom part.
Now let's select everything,
press Control + A, everything is properly selected, and then press Control + T
to create a table out of this. Now, I don't want my
table to have headers because I don't want the actual and budget to be used as
headers. So I'm going to uncheck that and then click on OK.
To go back to the original look of the
table, let's start by removing the table style. Then let's
remove the header row. So these headers were automatically created by MS
Excel because I didn't have any headers there. So it just puts the
default column one, column two, et cetera. But I'm going to remove that, no one
needs to see that.
As a third step, let's give this table
a name. I'll call it TCustomer. Now that everything is set up, let's send this
to Power Query, Data, From Table
Range. Give the end report a name, I'll call it DataProper.
Okay, so Power Query went
ahead. It promoted the headers and it added a Changed Type step. I don't need
these steps. So let's reverse them.
Okay, so now let's think about the
steps that we need to do to get this into a proper format. Well, the first
thing is that we have to identify the anchor columns. The columns that we want
to keep as is, that's our customer name and the article description,
except, of course, there shouldn't be any gaps in them.
So first thing after you identify the
anchor columns, fill up any gaps that they might have. So I'm going to a right
mouse click, Fill and Fill Down. Now at this point, I'm not ready to use the
Unpivot feature on the data set, because I have multiple
column headers.
If I didn't have actual and budget
information, I could unpivot the months and I'd be done with this task.
But because I have multiple column headers, I need to follow a different set of
steps.
What I have to do next is to combine
my anchor columns into a single column.
So again, it doesn't matter how many anchor columns you
have, it could be more than two, but you just need to follow the same steps. It
also doesn't matter how many levels of column headers you have; the same steps
will also apply to those situations.
Okay, so after you identify the anchor
columns and fill in the gaps, you need to merge them to become a
single column. From the Transform tab, let's go into merge the columns. Pick a
separator that doesn't exist inside your data set, so that
you're not using in the descriptions.
So I'm not using a semicolon, I'm going
to go with that. You can just leave the default name there. We're going to be
updating the column header names in the end. So go on OK.
Now that we have a single column here,
we can transpose the dataset. So
from the Transform tab, select Transpose.
Now we have our different scenarios in their columns and our dates, which are
months in a separate column as well.
So the next step is to fill in any gaps
we have here. Now right mouse click, Fill and Fill Down. Now because we've
created a single column header, out of our multiple columns, we can promote
this row to become headers.
So use the first row as headers.
Now Power Query went ahead and automatically applied
the Changed Type step, but I don't need that right now, so I'm just going to
remove that. We're going to apply a Changed Type step in the end.
Okay, so now we have a separate column
for the scenario. There is a separate column for the date. And that is where
our data starts. So because we have a single column
header now, we're ready to use Unpivot.
So select the anchor columns, right
mouse click, Unpivot Other Columns. Now things are starting to look good.
Before we update the names, let's split our attribute column to the original
two columns that we had.
So under Transform, Split
Column, By Delimiter, the delimiter is
a semicolon, and OK. Now let's remove that Changed Type step as well. Next,
let's give the columns a name. The first one is scenario, then we have the
dates, next is the customer, then we have our article. And finally, we have
sales value.
Now one thing we still need to do is to
remove the totals. So anywhere we have the word total, we need to filter it
out. Let's go add a text filter for that and filter for everything that does
not end with the word total and OK.
Everything looks good, let's update the
data types for this scenario is text. For date, let's just keep the date
portion, text and text is good and sales value is currency.
Okay, everything looks good. Let's send
this to Excel, close & Load. It's going to create
a new table on a new sheet with our report
looking like a proper tabular data set. We have the actual
information followed by the budget information.
We can of course change the order of
this if we want customer and article to be first followed by scenario and date.
So let's go ahead and do that. Let's select both customer and article and pull
them to the front and update our report.
Okay, so everything is set up and one
thing we need to do is to make sure that it works if we add data for April. So
I add some April data. The first column is for actuals and
the second column is for the budget.
So let's test that, I'm just going to
add a column for actuals. And let's go and copy and paste
that in. So April is from P4 until P130. I'm going to cut it, so Control + X, go
to the table here and Control + V.
For budget, let's
just expand our table area and do the same for the budget.
So that's Q4 until Q130, Control + X and Control + V. Now I'm also going to
remove the split. So go to View and remove that split.
And now let's check if April shows up
in our end report once we press Refresh, so currently, we
have 596 rows. Let's right mouse click and refresh. We have
new rows pull in there, which looks good, so if we jump to the bottom, we have
April data automatically included.
Now it looks like we have some cells on
our report side that aren't picked up by Power Query as
null value cells. That's why they show up in our results. So, let's go and
check that out and see how we can fix that.
Let's go back to the Merged
Column step. Now, most of the empty cells are seen as null. So
when we unpivot them, we don't see these as empty cells. So these are not the
ones that are causing us problems, the ones that are, are these cells there.
Later when I go and unpivot, those
cells are going to show up as empty cells. So there's still not null value
cells. But after I update the type, and I change the Sales Value
column to currency, then these empty cells become null cells.
So to clean up my final report, what I
can do is go and remove any null cells and go with okay, send this back
to Excel. And I get back a report, that's a lot leaner,
okay?
Closing Comment
So that's it, those are the steps you
need to follow when you come across more complex unpivot situations. Hopefully,
you enjoyed the Power Query unpivot data example in MS Excel.

Weldon
ReplyDelete