This is the second part of a series of articles that looks at how Excel's Power Query/Get & Transform tools can be used for a great deal more than just acquiring data from external sources.
The first part of the series included an overall introduction to how these tools work, what they are capable of and how they might require a different way of thinking compared to 'standard' Excel techniques. As stated in that introduction, for convenience, we will refer to Power Query throughout, but Excel 2016 Get & Transform and Power BI Get Data operate in a similar way.
Dealing with dynamic ranges
The purpose of this series is to see how Power Query can help with worksheet, rather than just data, problems. One area that often causes an issue in Excel is the ability to cope with ranges of cells that need to expand to accommodate additional information – usually adding rows to a table of data. The introduction of Excel Tables in Excel 2007 helped to some extent by allowing references to ranges to expand automatically to include added rows.
However, there are still some operations where even the advantages of Tables don't provide a complete solution. For example, you might have several tables of information that you want to combine into a single table that you can use as the basis for a PivotTable or the use of Excel summary functions. If the number of rows in each table is static, you could do this by using normal Excel formulae and functions with references to each row, but as soon as you introduce the possibility of the number of rows in each table changing, the whole process becomes a great deal more complicated.
In this example, we have the sales results for three countries held in Excel Tables in three separate worksheets. We want to combine them into a single Table to use as the basis for a series of PivotTables that analyse the entire set of data:
Just to make things a little bit more complicated, some of our columns don't have consistent names as we can see for the Customers and Product columns.
Tables to Queries
We will use Power Query to consolidate our data. There are several different methods we could use to do this, but we'll start with what is probably the most straightforward. We will create queries based on each of our three Tables and then combine the queries. By default, each query will be given the same name as the Table name that it was based on, but this can be changed if required.
To create a query that includes the content of an Excel Table we just need to select any cell within the Table and choose the From Table command. This displays our Table data in the Query Editor. For the UK Table we can just use the Close & Load dropdown, Close and Load To… option to just create the connection to our Table without loading it back into another worksheet:
We do the same for our other two Tables but, before we use Close & Load, we double-click in the Customers.CompanyName and ProductName column headings to change them to be consistent with the UK Table:
We now have the data in our three Tables available as three separate queries. We can use the New Query dropdown, Combine Queries, Append command to join them together:
Until a recent update, combining more than two tables was a slightly messy process. Either you had to combine two tables and then combine each additional table to the previous combined query, or you could add the query names directly to the underlying code that was created. However, there is now a 'Three or more tables 'option which makes things much easier. We can just select each of our tables in turn and add them to our new query:
Because we ensured that each of our queries used the same column headings, the append operation will create one table with the same columns as each of our tables, but including the rows from all of them. Had our columns had different names, additional columns would have been created for each different column name:
We can see our three Table queries in the left-hand query pane. We have changed the name of our new query from the default of Append to Consolidated. We can then go ahead and Close & Load our query, this time choosing to load it into a new Excel worksheet:
Here we can see that our consolidated table includes the 641 rows from our original UK, France and Germany Tables.
Why not just use Copy and Paste?
There are benefits in setting up the process in this way. We haven't had to use any VBA code or type in any complicated Excel formulae or functions. Instead, we have just used some of the Power Query interface options to create our consolidated output. However, perhaps the greatest advantage is not the comparative ease of set up, but how we deal with changes to the data. If this had been a one-off requirement, it would have been quite simple to have just used Copy and Paste to create our single table. However, where we need to cope with changes to our data this would have meant repeating the process whenever the data changed. Because this involves a series of manual operations, it not only takes time, but also introduces several opportunities for making a mistake. Using the Power Query method can reduce the update process to a single click, or keyboard shortcut, or even automate the process completely.
To demonstrate this, we'll add a few more rows to our UK Table:
Because our data is in an Excel Table, as we add the rows our Table automatically expands to encompass the new content. If we refresh our queries, the tblUK query will include the new rows as being part of the Table upon which it is based, and our Consolidated query will included all the rows now within the tblUK query, and our other queries:
Clicking the Refresh All command in the Connections group of the Data Ribbon tab (or using the Ctrl+Alt+F5 keyboard shortcut) refreshes our queries and we can see that our Consolidated query now loads the additional 8 rows from our expanded UK Table.
We can go even further and set our Data Connections to refresh periodically, although it should be pointed out that this is still not the same as Excel's automatic formula recalculation. With queries, the refresh happens based on elapsed time, not on whether any 'precedent' value has been changed.
Although we have automated the process of dealing with changes in a defined set of Tables, we would still need to make manual changes to add any complete new Tables. In the next episode of the series we will see how to fully automate the process with variable numbers of Tables as well as just changing numbers of rows.