Modelling tips and tricks: Lookout for VLOOKUP and HLOOKUP

looking for something
istock_richvintage_ls
Share this content
Tags

Welcome to our series of applied tips and tricks in Excel for accountants and financial analysts. As a professional modeller, FCA and Excel MVP Liam Bastick takes a swipe at many accountants’ favourite functions.

Often we need to look up data in a table or list – and two functions most of us are very familiar with are VLOOKUP and HLOOKUP. But do you realise it’s very easy to make a mistake with these functions?

Let’s start with a refresher.

VLOOKUP(lookup_value,table_array,column_index_

number,[range_lookup]) has the following syntax:

  • lookup_value: what value do you want to look up?
  • table_array: where is the lookup table?
  • column_index_number: which column has the value you want returned?
  • [range_lookup]: do you want an exact or an approximate match? This is optional and to begin with, I am going to ignore this argument exists.

HLOOKUP is similar, but works on a row, rather than a column basis.

Example

I am going to use VLOOKUP throughout to keep things simple. VLOOKUP always looks for the lookup_value in the first column of a table (the table_array) and then returns a corresponding value so many columns to the right, determined by the column_index_number.

1

In this above example, the formula in cell G25 seeks the value 2 in the first column of the table F13:M18 and returns the corresponding value from the eighth column of the table (returning 47). You can follow all of these examples in the attached Excel file.

So what goes wrong? Well, what happens if you add or remove a column from the table range?

Adding (inserting) a column gives us the wrong value:

2

With a column inserted, the formula contains hard code (8) and therefore, the eighth column (M) is still referenced, giving rise to the wrong value. Deleting a column instead is even worse:

3

Now there are only seven columns so the formula returns #REF! Oops.

It is possible to make the column index number dynamic using the COLUMNS function:

4

COLUMNS(reference) counts the number of columns in the reference. Using the range F13:M13, this formula will now keep track of how many columns there are between the lookup column (F) and the result column (M). This will prevent the problems illustrated above.

But there’s more issues. Consider duplicate values in the lookup column. With one duplicate, the following happens:

5

Here, the second value is returned, which might not be what is wanted. With two duplicates:

6

Ah, it looks like it might take the last occurrence. Testing this hypothesis with three duplicates:

7

Yes, there seems to be a pattern: VLOOKUP takes the last occurrence. Better make sure:

8

Rats. In this example, the value returned is the fourth of five. The problem is, there’s no consistent logic and the formula and its result cannot be relied upon. It gets worse if we exclude duplicates but mix up the lookup column a little:

9

In this instance, VLOOKUP cannot even find the value 2!

So what’s going on? The problem – and common modelling mistake – is that the fourth argument has been ignored:

VLOOKUP(lookup_Value,table_array,col_index_num,[range_lookup])

[range_lookup] appears in square brackets, which means it is optional. It has two values:

  • TRUE: this is the default setting if the argument is not specified. Here, VLOOKUP will seek an approximate match, looking for the largest value less than or equal to the value sought. There is a price to be paid though: the values in the first column (or row for HLOOKUP) must be in strict ascending order – this means that each value must be larger than the value before, so no duplicates.
    This is useful when looking up postage rates for example where prices are given in categories of pounds and you have 2.7lb to post (say). It’s worth noting though that this isn’t the most common lookup when modelling.
  • FALSE: this has to be specified. In this case, data can be any which way – including duplicates – and the result will be based upon the first occurrence of the value sought. If an exact match cannot be found, VLOOKUP will return the value #N/A.

And this is the problem highlighted by the above examples. The final argument was never specified so the lookup column data has to be in strict ascending order – and this premiss was continually breached.

The robust formula needs both COLUMNS and a fourth argument of FALSE to work as expected:

10

This is a very common mistake in modelling. Using a fourth argument of FALSE, VLOOKUP will return the corresponding result for the first occurrence of the lookup_value, regardless of number of duplicates, errors or series order. If an approximate match is required, the data must be in strict ascending order.

VLOOKUP (and consequently HLOOKUP) are not the simple, easy to use functions people think they are. In fact, they can never be used to return data for columns to the left (VLOOKUP) or rows above (HLOOKUP). So what should modellers use instead?

Well, tune in next time.

Word to the wise

As stated above, HLOOKUP works like VLOOKUP but hunts out a value in the first row of a table and returns a value so many rows below this reference. However, it has the same limitations and should be used just as carefully.

About Liam Bastick

Liam Bastick

Recognised by Microsoft as one of 104 Most Valuable Professionals (MVPs) in Excel worldwide by Microsoft, Liam has over 30 years’ experience in financial model development/auditing, valuations, M&A, strategy, training and consultancy. He has headed Ernst & Young’s modelling team in Melbourne and was an Assistant Director in their strategic valuations team in London. He was also a senior member of the UK Post Office’s M&A and strategy teams and has worked for / assisted various other Australian modelling companies including BPM, Corality, Navigator Project Finance, PKF and SumProduct.

He has worked in the UK, Australia, Belgium, Denmark, France, Germany, Hong Kong, Indonesia, Malaysia, New Zealand, United States, Switzerland and Vietnam, with many internationally recognised clients, constructing and reviewing strategic, operational and valuation models for many high profile IPOs, LBOs and strategic assignments.  Liam is a Fellow of the Institute of Chartered Accountants (ICAEW), a Fellow of the Institute of Chartered Management Accountants (CIMA) and is a professional mathematician.

Replies

Please login or register to join the discussion.

avatar
12th Oct 2018 14:08

I often use Vlookup, it's so helpful to me. As soon as i have done a Vlookup, I instantly change the formula to values only, this way there is no issue with adding/removing columns later.

Thanks (0)
avatar
12th Oct 2018 14:56

index(match()) achieves the same results as v/hlookup, while allowing you to insert, delete and rearrange columns almost at will. You don't even need the lookup "key" to be to the left of the data you want returned, which gives far more flexibility for error checking.

Thanks (1)
avatar
12th Oct 2018 16:51

Another vote for INDEX/MATCH here - can be a bit more difficult to learn if you're not used to it, but once you understand how the logic of the individual functions work they can be so much more flexible and powerful than the H/VLOOKUP functions.

Thanks (0)
avatar
13th Oct 2018 08:26

A very good article, however as we have been advised on Accounting Web, spreadsheets are finished, so you have wasted your time ;).

On the upside those that say spreadsheets are finished, but never really explain properly whats going to replace them.

Thanks (0)
avatar
13th Oct 2018 09:56

Index match works great but when you have large sets of data being cross referenced in a single workbook, the number of calculations required and size of the workbook can increase at the expense of performance speed.

My way around this is to use combinations of sumifs to narrow down criteria and sometimes dynamic named ranges referenced by array formulas.

The problem with a lot of these methods is that for other users of the workbook it's very difficult to follow through the calculations if they need to pick it apart!

Is there a better way that allows ease of traceability?

Thanks (0)
to Graeme G
14th Oct 2018 16:22

Without wishing to pre-empt Liam's follow up article, it sounds like it would be worth investigating the Power Query/Get & Transform tools:
https://www.accountingweb.co.uk/tech/excel/supercharge-excel-part-4-vloo...
https://www.accountingweb.co.uk/tech/excel/supercharge-excel-vlookup-v-m...

Thanks (0)
13th Oct 2018 11:21

As someone who hasnt got further than =(sum....) thank you Liam for a clear understandable article. More please.
And AndrewV12 - may I direct you to this:
https://www.accountingweb.co.uk/tech/accounting-software/mtd-avalara-bri...
Spreadsheets are most definitely here to stay. Which is why HMRC backed down and now permits spreadsheet links under MTDVAT.

Thanks (1)
avatar
24th Oct 2018 11:47

When dealing with tabulated data, Tables ('Ctrl_t') (VBA: List Objects) should generally now be employed since you can refer to the table in functions by name and fieldname(s) (column headings). It enforces function consistency per field in the table (if your table contains functions) and maintains formatting when adding additional rows, and fields can be repositioned by dragging and dropping. It also makes reviewing look-up functions far more intuitive. Data itself should be independent of order (both rows and columns) otherwise it implied there is a missing 'orderNo' field. Problems encountered here in the vlookup example of having multiple choices indicate an issue with the data or the solution (as IDs should be unique), alternatively other fields should be employed to create combination keys. At that point using SumIf() of SumIfs() can be used (you could validate the number of values summed with CountIf()). The benefit of Vlookup/Index-Match is that it can return a non numberic values. Personally I prefer using SumProduct() as an multi-conditional boolean array in a Table
eg. =SUMPRODUCT((tbTransport[vehicleType]="Car")*(tbTransport[colour]="Blue")*(tbTransport[value])). The example here is using a Table (tbTransport) of three fields: 'VehicleType', 'Colour', and a numerical field 'value'. Note in all these examples the looked-up field position in the table is irrelevant.

Liam: I'm keen to read your promised follow-up article: ...'So what should modellers use instead?'.

Thanks (0)
avatar
10th Nov 2018 04:34

With all due respect, "In fact, they can never be used to return data for columns to the left (VLOOKUP) or rows above (HLOOKUP)" is simply NOT TRUE!!

All too often VLOOKUP and HLOOKUP are accused of deficiencies, but this is usually done in unfair comparison with two other functions used in combination (e.g. INDEX & MATCH) to overcome a deficiency/limitation in the naked function.

VLOOKUP and HLOOKUP can be used to look in the opposite direction (i.e. left or up respectively) by using the CHOOSE function to return the table_array argument, and the integer 2 for the column_index_num. This works in an almost identical way to an IDNEX/MATCH combo - even allowing non-contiguous ranges to be specified for the location of the lookup and return values!

Check out the tutorial on Mynda Treacy's MyOnlineTrainingHub blog: https://www.myonlinetraininghub.com/excel-vlookup-to-the-left-using-choose
or search the web where you'll find a number of variations on the same theme.

VLOOKUP and HLOOKUP are NOT as impotent as many espouse!

Thanks (0)