Excel tip: Use COUNTIF to manage duplicate list values

David H Ringstrom continues his exploration of Excel functions with an explanation of how COUNTIF can work with LARGE and SMALL functions to manage numerical lists.

My last tutorial demonstrated how the LARGE and SMALL functions can help you find the largest or smallest values in an Excel list. This follow-on piece shows how COUNTIF can help you deal with items that share the same value.

Last time around, LARGE and SMALL helped us identify the extreme values, and MATCH and INDEX functions were applied to return the item names that corresponded with those values. But if two items on a list share the same value, MATCH/INDEX will return the same name for both.

Instead, using the following formula determines how many times the value appears in the list and subtracts 1 from it:

=(COUNTIF($B$2:B2,B2)-1)*0.0001+B2

If the value appears on the list only once, there's no reason to change the original value. If the amount appears more than once, the formula will add .0001 to it based on the number of times that it has appeared previously in the list. As shown here, kiwi sales in cell C6 become 637.0001. If strawberry sales were also 637, its amount would become 637.0002.

Instead of banans occupying third and fourth place, COUNTIF has return both types of fruit with equal sales. I hope the trick will help you deal with similar “ties” you encounter in your Excel spreadsheets.

"Either you work Excel, or it works you!" says David Ringstrom CPA, the head of Atlanta-based software and database consultancy Accounting Advisors. He presents Excel training webcasts for CPE Link and contributes articles on Excel to AccountingWEB and Microsoft Professional Accountant's Network newsletter. He can be reached by email at david[AT]acctadv.com. You can find further ExcelZone tutorials from David H Ringstrom here.

Continued...

» Register now

The full article is available to registered AccountingWEB members only. To read the rest of this article you’ll need to login or register.

Registration is FREE and allows you to view all content, ask questions, comment and much more.

Comments
Dick Lloyd's picture

Ranking

Dick Lloyd | | Permalink

Would it not be easier to use the ranking function, then if two values in the list are, in fact, the same, they will have the same ranking, which is arguably better than ranking Kiwi 3rd and Bananas 4th merely bacause Kiwis appear in your list after Bananas.

The formula that I would enter in cell E2 on your example would be:

=Rank (B2,B$2:B$11,0), then copy the formula down column E.

I would then simply sort the list by reference to the rank value in column E.

 

However, I can see that COUNTIF would be useful to use if you wanted to identify genuine duplicated data so it can be removed.

 

Creates same problem...

duncanphilpstate | | Permalink

I think you'll find that RANK will also place Kiwis and Bananas with equal rank and so INDEX will only retrieve the first it finds? Which is very similar to the LARGE/SMALL issue under discussion. Whatever the start point, you need to prevent duplicates in what a programmer would call the key value.

Whether you rank Bananas ahead of Kiwis or vice versa is then personal preference and arbitray but they do have to be different.

Or use ROW()

duncanphilpstate | | Permalink

Rather than COUNTIF as illustrated I prefer to use ROW to achive this.

Each cell in your column C would be (B2 + ROW() / 1000000) which has the effect of adding its row number to each value but well to the right of the decimal point and immaterial.

If you've got a long list, I've found that the ROW method executes faster than a whole chain of COUNTIFs, as ithe calculation only needs to refer to current cell information and not to the entire list above it.

Not to say that using COUNTIF in this way is not a useful technique of course.

Error in following

AMHCPA | | Permalink

In repeating your example it was found that COUNTIF() displayed 2 beside BOTH Bananas and Kiwi. Is there a option I need to set to use this as you have shown? Thank you duncanphilpstate for the idea of using +ROW()/1000000 to increment the individual data.

Good thought on RANKING

dringstrom | | Permalink

Dick, 

RANKING is a great alternative, and I appreciate the enrichment your comment adds to my article. Thanks for reading!

David

 

Clever!

dringstrom | | Permalink

Duncan,

Great point on ROW(). I see users often learn a few functions in Excel and try to apply them to pretty much every task. I sometimes fall into that trap myself. I often use ROW for other purposes, but in this instance it is more effective than COUNTIF when working with large data sets. Thanks for reading, and for enriching my article!

David

AMHCPA,

dringstrom | | Permalink

AMHCPA,

The bit of nuance is that you must anchor the first range in COUNTIF by way of putting a $ in front the row number, i.e. =COUNTIF($B$2:B2,B2). The $ in front of B is extraneous in this context because we're not copying the formula sideways, so =COUNTIF(B$2:B2,B2) would work just as well. By anchoring the formula, COUNTIF looks at an ever expanding range, and that's what makes it work as a numbering scheme.

Thank you for reading my article, I hope you found it helpful!

David