# New Excel functions: XLOOKUP and XMATCH

Liam Bastick looks at two new functions that will shortly be available in Office 365: XLOOKUP and XMATCH.

I have explained previously why **INDEX MATCH **was a more powerful LOOKUP combination than **VLOOKUP**, which to this day remains one of the accounting world’s most popular Excel functions. However, time has moved on. There are new kids on the block. At the time of writing, these functions are still rolling out in Office 365, but they are coming.

**XLOOKUP(lookup_value, lookup_vector, results_array, [if_not_found]**,** [match_mode], [search_mode])**

This function seeks out a **lookup_value** in the **lookup_vector** and returns the corresponding value in the **results_array**. Most of the time you will only require the first three arguments:

**lookup_value:**this is required and defines what value you want to look up**lookup_vector:**this reference is required and is the row or column of data you are referencing to look up**lookup_value****results_array:**this is where the corresponding item is you wish to return and is also required (even if it is the same as**lookup_vector**). This does__not__have to be a vector (ie one row or one column of cells): it may be an array (with at least two rows and at least two columns of cells). The only stipulation is that the number of rows/columns must equal the number of rows/columns in the column/row vector – but more on that later**if_not_found:**this optional argument allows you to replace the usual return of #N/A with something more informative like an alternative formula, text or a value**match_mode:**this argument is optional. There are four choices:**0:**exact match (default)**-1:**exact match or else the largest value less than or equal to**lookup_value****1:**exact match or else smallest value greater than or equal to**lookup_value****2:**wildcard match. You should use the special character**?**to match any character and*****to match any run of characters.

What’s impressive, though, is that for certain selections of the final argument (**search_mode**), you __don’t__ need to put your data in alphanumerical order! As far as I am aware, this is a first for Excel

**search_mode:**this argument is also optional. There are again four choices:**1:**search first to last (default)**-1:**search last to first**2:**what is known as a binary search, first to last (requires**lookup_vector**to be sorted). Just so you know, a binary search is a search algorithm that finds the position of a target value within a sorted array. A binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found**-2:**another binary search, this time last to first (and again, this requires**lookup_vector**to be sorted).

Let’s have a look at **XLOOKUP **versus** VLOOKUP**:

You can see the **XLOOKUP** function is shorter:

**=XLOOKUP(H52,F41:F47,G41:G47)**

Only the first three arguments are needed, whereas **VLOOKUP** requires both a fourth argument and, for full flexibility, the **COLUMNS** function as well. **XLOOKUP **will automatically update if rows/columns are inserted or deleted. It’s just simpler.

**HLOOKUP** has similar issues:

Here, this highlights what happens if I try to deduce the student name from the Student ID. **HLOOKUP **cannot refer to earlier rows, just as **VLOOKUP **cannot consider columns to the left. Given any unused elements of the table are ignored also, it’s just good news all round. Goodbye limitations, hello **XLOOKUP**.

Indeed, things get even more interesting when you start considering **XLOOKUP**’s final two arguments, namely **match_mode **and **search_mode**, viz.

Notice that I am searching the ‘Value’ column, which is neither sorted nor contains unique items. Do you see how the results have changed once more, depending upon **match_mode **and **search_mode**?

The **match_mode **zero (0) returns “Not Found” now instead of #N/A because there is no exact match and the formula has now stipulated what to do in such an instance.

When **match_mode** is -1, **XLOOKUP** seeks an exact match or else the largest value less than or equal to **lookup_value **(6.5). That would be 4 – but this occurs more than once (B and D both have a value of 4). **XLOOKUP **chooses depending upon whether it is searching top down (**search_mode **1, where B will be identified first) or bottom up (**search_mode **-1, where D will be identified first). Note that with binary searches (with a **search_mode** of 2 or -2), the data needs to be sorted. It isn’t – hence we have garbage answers that cannot be relied upon.

With **match_mode **1, the result is clearer cut. Only one value is the smallest value greater than or equal to 6.5. That is 7, and is related to A. Again, binary search results should be ignored, although it is worth noting “Not Found” occurs when Excel identifies the lookup value has not been found.

The **match_mode **2 results are spurious. This is seeking wildcard matches, but there are no matches, hence “Not Found” instead of N/A for the only **search_modes** that may be seen as creditable (1 and -1). It’s interesting to note a binary search causes errors which are not trapped by the new argument.

Clearly binary searches are higher maintenance. In the past, it was worth investing in them as they did return results more quickly. However, according to Microsoft, this is no longer the case: apparently, there is “…no significant benefit to using (sic) the binary search options…”. If this is indeed the case, then I would strongly recommend not using them going forward with **XLOOKUP**.

Whilst **XLOOKUP **wins hands down against **HLOOKUP **and **VLOOKUP**, the same cannot necessarily be said for **LOOKUP**. You may recall **LOOKUP** has two forms: an __array__ form and a __vector__ form. As a reminder:

- an
**array**is a collection of cells consisting of at least two rows and at least two columns - a
**vector**is a collection of cells across just one row (row vector) or down just one column (column vector).

The diagram should be self-explanatory:

The array form of **LOOKUP** looks in the first row or column of an array for the specified value and returns a value from the same position in the last row or column of the same array:

**LOOKUP(lookup_value, array)**

where:

**lookup_value**is the value that**LOOKUP**searches for in an array. The**lookup_value**argument can be a number, text, a logical value, or a name or reference that refers to a value**array**is the range of cells that contains text, numbers, or logical values that you want to compare with**lookup_value**.

The array form of **LOOKUP** is very similar to the **HLOOKUP** and **VLOOKUP** functions. The difference is that **HLOOKUP** searches for the value of **lookup_value** in the first row, **VLOOKUP** searches in the first column, and **LOOKUP** searches according to the dimensions of array.

If **array **covers an area that is wider than it is tall (i.e. it has more columns than rows), **LOOKUP** searches for the value of **lookup_value** in the first row and returns the result from the last row. Otherwise, **LOOKUP** searches for the value of **lookup_value** in the first column and returns the result from the last column instead.

The alternative form is the vector form:

**LOOKUP(lookup_value, lookup_vector, [result_vector])**

The **LOOKUP** function vector form syntax has the following arguments:

**lookup_value**is the value that**LOOKUP**searches for in the first vector**lookup_vector**is the range that contains only one row or one column**[result_vector]**is optional – if ignored,**lookup_vector**is used – this is the where the result will come from and must contain the same number of cells as the**lookup_vector**.

Like the default versions of **HLOOKUP **and **VLOOKUP**, **lookup_value** must be located in a range of ascending values.

Let me demonstrate with an example:

**LOOKUP **is a great function to use with time series analysis / forecasting. Dates are in ascending order and the **LOOKUP **syntax is remarkably simple. As a modeller, I use it regularly when I am modelling many more forecast periods than I want assumption periods.

Here, you can see I carry assumptions only for 2020 until 2024 (the final value is 2024, just with a “+” in number formatting). The formula

**=LOOKUP(G$74,$G$67:$K$68)**

returns the corresponding value for the period that is either an exact match or else the largest value less than or equal to the **lookup_value**. **LOOKUP** uses the top row of the table for looking up its data and the final row for returning the corresponding value. Simple. As for **XLOOKUP**:

**=XLOOKUP(G$82,$G$67:$K$67,$G$68:$K$68,-1)**

This formula is longer and requires two additional arguments (**match_mode **-1 is required to mirror the behaviour of **LOOKUP**). Indeed, given that an **IF **statement is required to ensure no errors for earlier periods, e.g.

**=IF(G$90<$G$67,$G$68,LOOKUP(G$90,$G$67:$K$68))**

it may be argued that **LOOKUP** is a simpler function to use here than its counterpart.

This isn’t the only time **LOOKUP **outperforms **XLOOKUP**:

Here, we do see a limitation of **XLOOKUP**. Whilst the third argument of **XLOOKUP**, **results_array**, does not need to be a vector, it cannot be the transposition of the **lookup_vector**. You would have to transpose it using the **TRANSPOSE **function, for example. This makes **LOOKUP **much easier to use – compare:

**=LOOKUP(H112,F105:F109,G102:K102)**

with

**=XLOOKUP(H112,F105:F109,TRANSPOSE(G102:K102))**

In this instance, **LOOKUP **wins.

**Useful Features of XLOOKUP**

**XLOOKUP** can be used to perform a two-way match, similar to **INDEX MATCH**:

Many advanced users might use the formula

**=INDEX(H40:N46,MATCH(G53,G40:G46,0),MATCH(G51,H39:N39,0))**

where:

**INDEX(array, row_number, [column_number])**returns a value or the reference to a value from within a table or range (list) citing the**row_number**and the**column_number**

**MATCH(lookup_value, lookup_vector, [match_type])**returns the relative position of an item in an array that (approximately) matches a specified value. It’s most commonly used with**match_type**zero (0), which requires an exact match.

Therefore, this formula finds the position in the row for the student and the position in the column of the subject. The intersection of these two provides the required result.

**XLOOKUP **does it differently:

**=XLOOKUP(G53,G40:G46,XLOOKUP(G51,H39:N39,H40:N46))**

Welcome to the wonderful world of the nested **XLOOKUP** function! Here, the internal formula

**=XLOOKUP(G51,H39:N39,H40:N46)**

demonstrates a key difference between this and your typical lookup function – the first argument is a cell, the second argument is a column vector and the third is an array – with, most importantly, the same number of rows as the **lookup_vector**. This means it returns a column vector of data, not a single value. This is great news in the brave new world of dynamic arrays.

In essence, this means the formula resolves to

**=XLOOKUP(G53,G40:G46,J40:J46)**

as **J40:J46 **is the resultant vector of **=XLOOKUP(G51,H39:N39,H40:N46)**. This is a really powerful – and virtually new – concept to get your head around, that admittedly **SUMPRODUCT** exploits too. Once you understand this, it’s clear how this formula works and opens your eyes to the power of nested **XLOOKUP **functions.

I can’t believe I am talking about the virtues of nested functions here! Let me change the subject quickly…

To show you how dynamic arrays can make the most of being able to create resultant vectors, consider the following example:

The formula

**=XLOOKUP(G77,I65:L65,I66:L72)**

again resolves to a vector – but this time is allowed to spill as a dynamic array. Obviously, this will only work in Office 365, but it’s a very useful tool that might just make you think it’s time to drop that perpetual licence.

Once you start playing with the dynamic range side, you can start to get imaginative. For example:

In this illustration, I want to calculate the sales between two periods:

This might seem like a simple drop-down list using data validation (**ALT + D + L**), but **XLOOKUP **has been used in determining the list to be used for the end months.

Let me explain. I have hidden the range of relevant dates in cell **H101** spilled across

**XLOOKUP **can return a reference, so the formula

**=XLOOKUP(G100,H94:S94,H94:s94):S94**

evaluates to the row vector **N94:S94** (since the start month is July). This spilled dynamic array formula is then referenced in the data validation:

(You may recall **$H$101# **means the spilled range starting in cell **H101**.) It should be noted that the formula **=XLOOKUP(G100,H94:S94,H94:s94):S94 **may not be used directly in the ‘Data Validation’ dialog, but this is a neat trick to ensure you cannot select an end month before the start month (assuming you are a rational human being that selects the start before the end!).

The formula to sum the sales then is

**=SUM(XLOOKUP(G100,H94:S94,H95:S95):XLOOKUP(G101,H94:S94,H95:S95))**

Again, this uses the fact **XLOOKUP **can return a reference, so this formula equates to

**=SUM(N95:Q95)**

Easy! Now I am combining two **XLOOKUP **formulae with a colon (:) to form a range. This joins other illustrious functions used this way such as **CHOOSE**,** IF**,** IFS**,** INDEX**,** INDIRECT**,** OFFSET, SINGLE (@)**,** SWITCH **and **TEXT**. First nesting, now joining – what’s next?

Seeking partial matches (sounds like an unfussy dating agency!) suddenly became a lot easier too. You can use wildcards if you want to – just set the **match_mode **to 2:

Here, I am searching for **J?n*n* **- which is fine as long as you know what the wildcard characters mean:

**?**means “any character”, but just one character. If you wanted to make space for two and only two characters you would use**??*******means “any number of characters’ – including zero.

For example, **M?n*m* **would identify “Manmade”, “minimum” and “Manikum” but would not accept “millennium”. Here, our formulae

**=XLOOKUP(G184,H174:H179,I174:I179,,2)**

**=XLOOKUP(G184,H174:H179,I174:I179,,2,-1)**

would locate the first and last items that satisfied the condition **J?n*n* **(i.e. “Jonathan” and “Jonny” respectively).

But what if you wanted an exact match with case sensitivity? You just have to think a little but outside of the proverbial box:

Here, we use another feature of **XLOOKUP **– its ability to search a virtual vector, i.e. one that has been constructed in memory, rather than physically within the spreadsheet cells. Consider the formula

**=XLOOKUP(TRUE,EXACT(H145:H154,G159),I145:I154)**

Here, the interim calculation **=EXACT(H145:H154,G159)**, looks at the range **H145:H154** and deduces whether the cells are an exact match for the selection ‘Sum Product’ in cell **G159**. The **EXACT** function would evaluate as

**{FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; TRUE; FALSE}**

Therefore, the formula coerces to

**=XLOOKUP(TRUE,{FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; TRUE; FALSE},I145:I154)**

and then the formula becomes simple to understand.

As I said at the beginning,** XLOOKUP **did not land in isolation. In addition to **XLOOKUP**, **XMATCH** has arrived with a similar signature to **XLOOKUP**, but instead it returns the index (position) of the matching item. **XMATCH** is both easier to use and more capable than its predecessor **MATCH**.

****New 077**

**XMATCH** has the following syntax:

**XMATCH(lookup_value, lookup_vector, [match_mode], [search_mode])**

where:

**lookup_value:**this is required and defines what value you want to look up**lookup_vector:**this reference is required and is the row or column of data you are referencing to look up**lookup_value****match_mode:**this argument is optional. There are four choices:**0:**exact match (default)**-1:**exact match or else the largest value less than or equal to**lookup_value****1:**exact match or else smallest value greater than or equal to**lookup_value****2:**wildcard match. You should use the special character**?**to match any character and*****to match any run of characters.

Again, for certain selections of the final argument (**search_mode**), you __don’t__ need to put your data in alphanumerical order

**search_mode:**this argument is also optional. There are again four choices:**1:**search first to last (default)**-1:**search last to first**2:**this is a binary search, first to last (requires**lookup_vector**to be sorted)**-2:**another binary search, this time last to first (and again, this requires**lookup_vector**to be sorted).

As you can see, it’s a fairly straightforward addition to the **MATCH **family. It acts similarly to **MATCH **– just with heaps more functionality.

*You can check out both functions in the attached Excel workbook.*

### You might also be interested in

Recognised by Microsoft as one of 104 Most Valuable Professionals (MVPs) in Excel worldwide by Microsoft, Liam has over 30 years’ experience in financial model development/auditing, valuations, M&A, strategy, training and consultancy. He has headed Ernst & Young’s modelling team in Melbourne and was an Assistant Director in their...

## Replies (1)

## Please login or register to join the discussion.

Looks good. It was a big move for me to go to index from look up but the more I used it the simpler the formulae become and its slashes the time and problems. This looks a step further. Ideal as I use excel as an alternative to bank feeds to cut down on bank processing time.