Save content
Have you found this content useful? Use the button above to save it to your profile.

IT BASICS: Data validation for better spreadsheets. By Simon Hurst

4th May 2006
Save content
Have you found this content useful? Use the button above to save it to your profile.

AccountingWEB recently featured a discussion concerning the need for good spreadsheet design to help ensure the integrity and accuracy of all spreadsheets. A key element of this is ensuring that the correct data is entered in the first place. Excel 97 introduced the Data Validation facility. In this IT Basics article we will look at Data Validation and how to use it to increase the reliability of your spreadsheets.

However well you design your spreadsheet, when it comes to entering data, you are usually at the mercy of a user who might not understand what is required as well as the spreadsheet author. This could result in text being entered where a number or date is required, or the entry of values outside the range the spreadsheet was designed to cope with.

Excel includes functions that can check the contents of a cell ' the IS series including ISBLANK, ISNUMBER and ISTEXT - and these could be used to report on incorrect entries. However, it is usually preferable to stop incorrect data being entered in the first place and this is where Data Validation comes in. As its name implies, this feature is used to validate data that a user types into a cell.

Setting up data validation
To use the Excel data validation facility, select the data input cell or cells and then choose Data-Validation from the Excel menu.

Excel Data Validation dialogue box

The Data Validation screen has three sections. The first, Settings, enables you to specify the data that will be accepted. Input Message allows you to specify a message when the user moves to the cell, and the Error Alert screen defines the action to be taken when invalid data is entered, and the message to be displayed.

Data Validation can be extremely effective in preventing invalid data being typed in, but it doesn't stop invalid data being copied and then pasted in. We will look at how to identify such data later. Also, if the cells are not locked and the worksheet is not protected, the data validation settings could easily be altered or removed, or even accidentally pasted over.

We will look at the Settings screen first. The Allow box lets you specify the type of data you will consider to be valid, the Data box lets you set the type of comparison test, such as 'greater than', 'less than', 'not equal to' or 'between'. Then appropriate maximum, minimum or value boxes appear to enable you to set the range of values you want to allow. Note that these values can either be entered as actual numbers, or as references to other cells. Finally you can choose whether to allow blanks, or subject them to the validation process, and whether you wish to 'Apply these changes to all other cells with the same settings'.

Details of the Allow options are as follows:

  • Any value - This allows the user to enter any text or number so probably doesn't go a long way towards validating the data. It does allow the input message to appear when the cell is selected without imposing any restriction. If anyone knows any other brilliant reason for why this option exists please let us know.
  • Whole number - Whole numbers subject to value-based criteria
  • Decimal ' As for whole numbers, but with decimal points allowed
  • List - Items from a list held elsewhere on the same sheet (or a named range of cells). You can choose 'In-cell dropdown' to enable the user to choose from a drop down list in the cell.
  • Date ' Dates subject to date-based criteria similar to the value-based ones
  • Time ' Similar to dates
  • Text Length - Text of a length defined by value-based criteria again
  • Custom - A formula that evaluates to either true (valid) or false (invalid). For example if you want the text entered in cell C12 to start with the letter 'a' you could enter the custom formula: =left(C12,1)="a". The Custom validation criteria allows for great flexibility and very sophisticated data validation.

    Input message
    If you want to help the user out by giving them a bit of a clue about what they need to enter in the cell or cells that you are validating, then you can choose the Input Message tab and type in an appropriate heading and message to appear whenever the user selects the cell. So, for example, if we are assuming that we are entering sales orders and that we don't want to accept an entry of less than 1, or more than 100 items, we could set the input message options as follows:
    Title: Order quantity
    Input message: Please enter a value between 1 and 100
    Hopefully we would have set the validation settings to 'Allow' a 'Whole number' 'between' a 'minimum' of 1 and a 'maximum' of 100.

    Error alert
    Finally we define the message a user sees when they enter invalid data. Even if you do not set up an input message, it is important to set up an error message, otherwise a rather uninformative system-generated message will be inflicted on your user. If you have gone to the trouble to provide a suitable input message, you might feel an irresistible urge to enter something like:
    Title: Whey-faced loon
    Input message: Just read what it says on the screen why don't you
    However, it would be infinitely preferable to help the user with something that re-states the original validation criteria. Maybe along the following lines:
    Title: Incorrect order quantity
    Input message: Please ensure you enter a whole number between 1 and 100

    As well as setting up the error text, this screen allows you to determine what happens when invalid values are entered ' this is controlled by the Style setting, which is a lot more important than the Style description suggests. In fact Style controls what you can get into a field (so perhaps quite appropriate after all'). Only the Stop style actually prevents the typing in of data outside the validation criteria:

  • Stop ' prevents the entry of invalid data, only cancel and retry options are available
  • Warning ' displays the message with Continue ' Yes, No, Cancel buttons. Yes allows the entry of the invalid amount
  • Information ' displays the message with OK and Cancel buttons. OK allows the entry of the invalid amount.

    Highlighting invalid data
    As we mentioned, it is possible for people to paste in invalid data, or to apply Data Validation to an area that already contains data ' some of it possibly invalid.

    In order to highlight such invalid data you can choose Tools-Auditing-Show auditing toolbar. If you then click on the Circle Invalid Data button, cells containing invalid data will be circled in red.

    This can be a useful technique in its own right. If someone habitually sends you lists of figures where some are entered as text rather than numbers, you could highlight the range of cells, set up a data validation to limit the acceptable data to a range of decimal numbers, and then use the Circle Invalid Data button to identify the non-numeric cells. More recent versions of Excel have an Error Checking option that will automatically mark such cells with a rather less dramatic green triangle in the top left corner.

    Related articles

  • Understanding Excel formulae
  • How to design better spreadsheets
  • Spreadsheet errors and how to avoid them

    About the author
    Simon Hurst is a former chairman of the ICAEW IT Faculty and runs The Knowledge Base, a consultancy dedicated to helping practitioners make effective use of technology. He is also the author of AccountingWEB's Office ProductivITy Kit and '100 Time-saving Tips for Microsoft Office'. For more information, visit The Knowledge Base website. AccountingWEB members can also access his back catalogue and IT Clinic online email support via the Office ProductivITy service.

    Subscribe to the ExcelZone NewswireSubscribe to the ExcelZone newswire
    To keep up with all spreadsheet-related developments, click the button to subscribe to the ExcelZone newswire. You will return to the AccountingWEB home page when your name has been added to the mailing list.

Replies (2)

Please login or register to join the discussion.

Simon Hurst
By Simon Hurst
14th May 2006 21:33

Invalid data
Many thanks for your comments and the important warning regarding the audit tools and invalid data.

Kind regards


Thanks (0)
By RayButler
12th May 2006 13:35

Limitations on Auditing, Circle Invalid Data tool
Thanks for an excellent summary of an underused tool. Data, Validation is really useful when setting up spreadsheets for data input.

One small point about the auditing tool that circles invalid data - in Excel versions up to 2003 it will only circle the first 256 validation failures. Applying validation to existing data and using the audit tool will not give you the comprehensive check you're expecting.

To find invalid data reliably in a large population of existing data, it's best to either use conditional formatting to set the background colour of the cells in question to something distinctive if the condition that would make them fail the Data, Validation test is met, or use an =if formula to display "Error" if the validation conditions are not met. You can then use data, autofilter to display only the dubious values.

Much more on this in Patrick O'Beirne's excellent book on Spreadsheet Check & Control

Thanks (0)