Cutting Big Data down to size: Power BI visualisations
In the first part of this series we discussed possible definitions of Big Data and considered whether some of the technology and techniques involved could be of use to businesses of all sizes. In part 2 we used Microsoft Power BI Desktop to combine some internal accounting data with over 1.6m rows of data downloaded from the Ofcom website and plotted the resulting values on a Power BI Map visualisation:
This time we will explore some of the available data visualisations in a bit more detail using Power BI Desktop.
Power BI Desktop
As we mentioned last time, because the availability of the Excel Power BI tools varies with the different versions and editions of Excel, we will also be using the freely available Power BI Desktop to demonstrate many of the techniques that we will be looking at. This can be downloaded here. Note that Power BI is subject to frequent updates so if you don't see the options we refer to, or they appear differently, it may be due to differences between versions.
Power BI visualisations
So far, we have added just a single map visualisation to a single page of our Power BI report. We will look at how to add visualisations and how visualisations can interact with each other.
We'll start off by using some of the built-in visualisations shown in the top section of the Visualisations pane. The first thing to note is that to add a visualisation you must first make sure that you haven't got any existing visualisations selected; otherwise, instead of adding your chosen visualisation, you will change the one that is currently selected. Here we have used the sizing handles around our map visualisation to reduce the size and then clicked in the white space outside of it to deselect it. We can then click on another visualisation to add it to our page:
Once we have added our visualisation, the available chart areas and filters will be shown in the bottom section of the pane and we can drag our data fields to the appropriate areas.
Here, we have added column chart and a treemap diagram to page one and added the required fields from the clients table in our Fields pane:
Although so far the results don't look particularly different from charts in an Excel worksheet, unlike Excel, the charts are linked by default, so that selecting an item in one chart will affect the others. In this case, we have clicked on the 'B52s Bar' in our column chart and then hold down the control key before clicking on 'The Anchor' to select them both. Our other charts change. Our map just shows the two selected items and our treemap highlights the same two items:
We can change how each of our charts affects the others. With our column chart selected we have clicked on Edit interactions in the Visual tools, Format Ribbon tab. This displays icons at the top of each of our other charts. For our map, we can either turn the interaction off altogether, so that it will not be affected by any selections in our column chart, or leave it set to apply the selection as a filter. Our treemap has the additional, default, option to highlight the selected data. We have switched to filter to show the difference:
The lower part of the Visualisations pane has three sections which can be selected using the icons in the heading. So far, we have been looking at the chart areas section. We can click on the paint roller icon to see the formatting section. The options available here will depend on the type of chart selected. For our column chart we can, for example, choose the colour scheme, including setting the colour of each column independently, and also choose a background picture for our plot area:
The third section controls 'Analytics features' and, for a column chart, makes it easy to add a 'Constant' line, such as a line representing the breakeven threshold. For a line chart, further line options such as Min, Max and Percentile are also available. Here we have copied our column chart to a new page so that we can expand it to make it easier to see, and we have added constant lines at the Ofcom thresholds for Standard (10Mbps), and Superfast (30Mbps), broadband services:
Although there is a reasonable range of built-in visualisations, working with Big(ish) Data often requires some more innovative methods of visualising the message within the data. Custom visualisations have been available within Power BI for some time, but in the most recent (July 2017) update they have been made much more accessible through the inclusion of a Custom visuals group in the Home Ribbon tab. The From Store option provides access to dozens of alternative ways of presenting data, split into several different categories:
In order to demonstrate the use of one of these custom visuals to display less-structured data in a very different way to a normal spreadsheet chart, we will use a Word Cloud to analyse the frequency of words used in Facebook posts. Here we have made a completely random choice of a Facebook page, and created a connection to the Posts in it using the Power BI, Get Data, More, Online Services, Facebook option. This gives us access to the Facebook messages. We can add the Word Cloud visual and use it to visualise the contents of our message column using font size as the indicator of word frequency:
The format section of the Visualisation pane includes some options specific to the Word Cloud graphic. In particular, it is possible to use 'Stop Words' to ignore certain words. The Default option removes commonly-used words such as: to, the, a. You can also enter specific words to be ignored in the Words box.
Note that to use the Facebook connector you will need to login to your Facebook account. Also, if you do try and connect to a Facebook site that isn't your own, you may need to reduce the number of posts that you wish to work with by adding a 'limit' parameter to the code generated: