Once you have a dataset in your Power BI desktop, one of the first things that you will want to do is to perform some calculations on your data.
While most of the time DAX will be very familiar to anyone with experience in Excel, Power BI offers you two different ways to do these calculations; measures and calculated columns. A lot of the time you can achieve the same result with either, but not always and you will very quickly need at least a basic understanding of when to use which technique.
Content seriesView full content series
There are many articles on this subject, but 18 months ago as a newbie to Power BI and it’s modeling language DAX, I found most of them simply confusing. For this reason, I have decided to write this article after much research and experimentation for the benefit of other newbies to the topic. In doing this, I may miss some super-technical subtleties that become important with complex models working on large datasets, but for most people working with accounting data from a system like Sage 50 these are not important.
Are you trying to calculate summary values or create new fields in a table?
The first question to ask yourself is what kind of calculation are you trying to do. Most of the time when you are working with accounting data, your calculations fall into two main categories:
Additional calculated fields for every row in a table (for example the margin for every line of a sales invoice)
Summary calculations over many rows of a table (for example the overall margin % for many lines of many sales invoices).
As a rule of thumb, you should use calculated columns in the first case and Measures in the second. If you are looking at your table and wish that there were another field in that table that somehow is calculable based on your dataset, then you should probably be thinking of using a calculated column. If on the other hand you are trying to compute a value that is potentially just a single number (e.g. total sales last 28 Days), then you should be thinking about using a measure.
In this example I have a table of sales invoice lines that contains products with their unit cost and quantity. I want to calculate extended cost for each item and I use a calculated column:
Extended cost = [unit cost] * [quantity]
In this next example I have have used an existing field called [sales] and combined this with my new calculated column [extended cost] to create another new calculated column called “margin”
Margin = [sales] - [extended cost]
Now I want to compute my margin %. It doesn’t work if I do this as another calculated column. I want to see the margin % for each row, as a summary total for all rows and possibly a filtered view of them using a slicer. This would be a very common requirement. I cannot just sum my margin % values. 10% margin on one item plus 15% margin on another does not equal 25% on both! I first need to sum all my margin values and then my sales values for the context in which I am looking at my data and then calculate a % margin. The final calculation needs to be done based on the overall context of my visualisation.
For this type of calculation where I am summarising data on the fly depending on the context I need to use a measure. In this example I have created a measure called margin %.
Margin % =
(SUMX('Sales Invoice Details',[Margin]),
SUMX('Sales Invoice Details',[Sales]),
As a side-note you will see that I have used the “safe divide” DAX function to handle any case where my denominator is zero.
This [margin %] value is computed on the fly when the visualisation is rendered and takes into account the context. In the example, I show two visualisations; a “table” and a “card”.
Notice three things.
The context of the table is [product ID] and my [margin %] measure is evaluated for each [product ID]. This gives me the correct % margin for each product.
At the foot of the table are totals. My [margin %] measure is smart enough to know to calculate the overall % based on the totals of the [margin] and [sales] columns - because this is the context of the totals.
The card is just a single value for margin % and again the measure takes the context from that page on the report and gives me the correct figure of 16.52% across all of my products.
Do a simple test if you are still confused or unsure
There are many situations where you can use calculated columns and measures interchangeably and still get the same correct results. If you are not sure which to use, then I would recommend that you do a simple test to check that you are getting the correct results.
To illustrate the point, in the same model I have created a new calculated column called [cMargin %] using the same formula that I used for the measure [margin %]. I get completely different results that are clearly not what I am trying to achieve.
For a beginner with Power BI, sometimes a quick test like this is much faster than trying to figure it out from first principles.
Use measures to avoid summary tables
Power BI is designed to be able to iterate very quickly over hundreds of thousands or even millions of rows of data. With this in mind you can often avoid creating special summary tables with calculated columns simply by applying measures to a big fat table that contains all of your records and base-level calculated columns. This keeps your model simple and will help you to produce more flexible visualisations that can be sliced, diced and summarised any which way.
All measures are available to all tables and calculations
Another difference between measures and calculated columns is that measures are available to all calculations and visualisations in your model.
If you are wanting to use a table field or calculated column in calculation that involves another table, then you must have some way of relating the two tables. This can be done through specifying a fixed relationship in your model under “manage relationships” or by using the DAX LOOKUPVALUE function. In accounting software a classic fixed relationship would be between a ‘customers’ table and a ‘sales invoices’ table. Every sales invoice is a “child” of a customer.
Measures don’t behave this way. They are calculated values, or arrays of values that are available to be used freely anywhere in your model or in any visualisation. They are calculated on the fly at the time of visualisation with filters applied according to the context in which you are using them.
In this example I have two visualisations. One based on [customer ID] and the other based on [purchase order ID] with the calculated column [margin] and measure [margin %] from my sales invoices. Why you would want to do the second visualisation is another matter, but it is here to show a point.
Now since my [margin] and [margin %] values are related to my customers, each value is summarised by [customer ID] in the first table.
Since there is no relationship between these values and my [purchase order ID], the second table shows the total for all sales records of £1,888,842,82 for margin and 16.52% for margin %. So far so good and the calculated column [margin] and measure [margin %] are behaving as expected.
Now to illustrate the point I have also created a new calculated column [PO amount x margin%] in my ‘purchase orders’ table:
PO amount x margin% = [net amount] * [margin %]
Now from a business point of view this is a useless number, but I can do it. And if I filter my report by customer this value will change.
In other words; IF I WANT TO, UNLIKE A CALCULATED COLUMN, I CAN INCLUDE ANY MEASURE IN ANY CALCULATION.
This means two things. It means that I can use measures to calculate constants that can be used anywhere in my model (e.g [number of days remaining this Year]). It also means that to organise my model in a nice friendly way I can put all my measures into a single “measures table’ for easy referencability. My measures created in my sales invoices table do not need to reside in my sales invoices table.
Take care though. Just because you can use any measure in any calculation does not mean that you should and that you will produce a useful answer (like my example with [PO amount x margin%].
Use measures to display constants
In my “Accounting Insights for Sage 50” model I do use measures to display a few constants (at least they are constant for the time that you are looking at the report and not affected by any filters. These are all calculations per se, but all exploit the ability to use a measure anywhere within my model or report regardless of context.
First day of current year
Days remaining this year
Days remaining this month
Today (just to display today’s date on a page)
Most recent invoice date
New customer wins YTD
New customer wins last year
New customer wins last YTD
New customer wins last 28 days etc..
Create a ‘Measures’ table to tidy up your model
Because your measures can reside in any table and be used throughout your model you can put them all together into a ‘measures’ Table where you can easily find them. I confess that I didn’t do this with my first iteration of my “Accounting Insights for Sage 50” content pack, but it makes sense if I want other people to be able to use the model more effectively.
I will add this to the next release. To do this is a bit of a fudge but it works and is very simple. How to do this is explained very well in this article. You may want to put all of your measures into the one table or group them by topic (e.g. ‘sales measures’). I would suggest is to pay careful attention to the name of the measure. So my [margin %] measure might be better re-named [sales margin %]. This would help with readability and I also suspect it would help with the “ask a question” feature on any Power BI service dashboard (to be checked).
Now for the techie bit. I am not going to go into detail here but your model will perform differently depending on the choices you make between calculated columns and measures. There are of course many other considerations that affect performance. What I can share though is my own practical experience. I am working pretty much exclusively with Sage 50 accounting data at the moment, so my datasets are so far all less than a million records in their raw form (most considerably less than that), and I cut out a lot of unwanted data by only taking the last 800-days of transactions.
Reading other articles on the topic and from my own experience, a model with a lot of complex calculated columns will take a long time to load whereas a report or dashboard that is over-dependant on very complex measures will take a long time (or even fail) to render certain visualisations.
Sage 50 Accounts does not really produce what you might call “big data” and so performance is not usually an issue. The only time that I really had a problem was with my sales forecasting calculations. These analyse the last two years’ sales invoice line items to produce forecasts that can be sliced by customer, product, sales rep, department, product category etc. To try to take into account seasonality what I ended up with was in effect a quadratic regression of sales by product and customer with calculations feeding into other calculations feeding into other calculations.
In my first attempt I was a little bit green and paranoid about how long it would take to render the visualisations. I took an extreme approach and pretty much built the whole thing using calculated columns. It worked ok, but it meant that in Power BI Desktop it took several minutes to load on my fairly regular laptop and that I had to close everything else on the computer.
Editing the model was painfully slow. When I tested deploying the content pack to the cloud on Power BI service it would take about 10 minutes to load and refresh the dataset (breaking Microsoft’s maximum requirements). Once loaded though it worked well but I had to look at rebuilding it to address the loading problems.
In my second attempt (more of an experiment) I took an opposite extreme approach and rebuilt the forecasting model pretty much using measures throughout. This would load very quickly and was well within Microsoft’s requirements for a content pack but the visualisations struggled to render and sometimes failed due to lack of resources.
In my third attempt I took a hybrid approach. For all of the base-level and intermediate calculations I used calculated columns and for the final stages that I would want to slice and dice I used measures. This works well now. I will still sometimes hit an issue with a large dataset with some over-ambitious visualisations.
What I wanted to do was to pack as much as I could into a single visual that you could drill down through a sales forecast from sales rep to customer and then product. With a really large dataset this still needs to be simplified and you need to decide by what parameter you want to show the forecast.
If you want to start working with Power BI Desktop, then one of the first things you need to get your head around (at least with some basic guidelines) is when to use measures vs calculated Columns. Or at least when it might matter. As a rule of thumb think of using calculated columns wherever what you really wanted was just another field in a table that may be used in other calculations or as an attribute to filter or slice your visualisations.
Tend towards using measures whenever you are performing summary calculations. Remember that the same formula will not necessarily produce the same results in a calculated column vs a measure. If in doubt, test it and double-check your answers.
You can take a look at the Sales Forecast model. In this case I have a demo/test dataset that is at the top end in size of what you would expect in a Sage 50 dataset and I have left in the drill-downs for the forecast visualisations. You will see that these visualisations take the longest to render. Cutting out the drill-downs dramatically improves performance for a large dataset.