Excel news feed
Microsoft releases pricing for retail versions of Office 2010
Microsoft is departing from its standard pricing model for its Office 2010 software product by eliminating upgrade pricing. Instead, packaged versions will be licensed for two or three computers, while discounts will be available if a user unlocks a preinstalled version of Office 2010. There will also be two free versions of Microsoft Office.
Microsoft Office 2010 Technical Preview
Today, Microsoft is announcing a limited, invitation only Technical Preview program for Microsoft Office 2010 that will kick off in July. The program will provide you with the opportunity to experience early, pre-release versions of Office 2010 which will include the following applications: Word 2010, Excel 2010, Outlook 2010, PowerPoint 2010, OneNote 2010, Access 2010, InfoPath 2010 and Publisher 2010. Check out www.office2010themovie.com for signing up to be considered for the Technical Preview Program. <?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />
VBA Coding Contest
I'm passing along some information I think will be interesting to many of the readers of this blog.
Test your VBA coding skills and win!<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />MSDN® is sponsoring a coding contest for Office 2007, named OfficePalooza! This sweepstakes will run two weeks beginning April 20, 2009, and features ten fun Visual Basic of Applications™ (VBA) coding challenges in the form of puzzles and games. Each entrant will earn a chance to win one of hundreds of available prizes, determined by a random drawing at the end of the contest.
In conjunction with this contest, an Advanced Business User theme will run on Microsoft Office Online from mid-April to mid-May, and will showcase the automation and extensibility aspects of Office 2007 through macros, custom VBA coding, the Fluent UI, and Office Open XML. This collaborative effort will also highlight existing and newly-created content on the MSDN Microsoft Office Developer Center.
Modifying Conditional Formatting Color Ranges in Excel 2007
Today’s authors, Amit Velingkar, a Program Manager on the Excel team, and Bob Silverstein, a User Experience Researcher on the Excel team, talk about creating professional looking Color Ranges in Excel.
Introduction:
My fellow PM on Excel, Robin Wakefield, wrote a blog article on how to create professional looking charts in Excel 2007. Continuing with the same theme, we will investigate the idea of tweaking Color Ranges in Excel 2007. We will start with a default Conditional Formatting Color Range and modify it to give it more pastel look. Also, this would be an ideal forum to give us your feedback on existing conditional formatting colors used in Excel 2007.
As a part of my preparation for this article, I did two things – researched customer files that use conditional formatting and explored various options with Microsoft designers.
Looking at some customer files, I quickly observed a trend. I found out that many users wanted to highlight a small portion of their data. They were achieving this by changing some of the colors in their color ranges to white. Also, some customers preferred a pastel shade of the same color.
I discussed this trend with Microsoft designers and they taught me some neat tricks about manipulating colors. First, they told me that choosing the correct shade of white was critical while mixing colors. And last, an intuitive way of changing shades of colors was to use the HSL (Hue, Saturation and Luminosity) model.
My blog article attempts to bring all these lessons together. We will change some of the colors to white and tweak the colors shades to pastel. Finally, we will introduce a VBA macro that lets you quickly try out new color shades.
Procedure:
1. I started out with applying three of our default color combinations to some data. I choose the following color ranges: Red-Yellow-Green, Red-Yellow-Blue and Yellow-Green.
2. I replaced the yellow color with the appropriate shades of “white”.
Changing the yellow colors to pure whites created a high contrast. For creating whites with a good spectrum of colors, we crank up the RGB values all the way to 255 creating a pure white – Remember RGB (255,255,255) is white. Then, remove just a little amount of the participating color to create “target” white.
As an example while mixing Red and Green, we set our white to (250,250,255) by removing some red and green. The end result is a color range containing pure shades of Red and Green.
Color 1
Color 2
White
Notes
Red (255,0,0)
Green (0,255,0)
(250,250,255)
Remove some red and green
Red (255,0,0)
Blue (0,0,255)
(250,255,250)
Remove some red and blue
Red(255,0,0)
-
(250,255,255)
Remove some red
Green (0,255,0)
-
(255,250,255)
Remove some green
Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog change the yellow color to white.
3. Using HSL Model to change non-white colors
We can use the HSL model to manipulate color. HSL defines the color spectrum based on Hue (base color from the rainbow spectrum), Luminosity (brightness) and Saturation (purity of the color).
Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog, click on the color dropdown and choose More Colors. Click on the Custom tab and choose HSL in the Color model dropdown.
In this example, we will change colors using the HSL models as follows:
o Hue: Unchanged, this leaves the base color unchanged. Hence our Reds stay Red.
o Saturation: Decrease the Saturation to 150. This makes the colors more pastel.
o Luminosity: Increase the luminosity to 200. The goal is to increase the brightness to make the color more visible and showcase the text behind the color ranges.
Result:
By using shades of white and by tweaking the luminosity and saturation of colors, we changed the color ranges as follows:
HSL Color Manipulation using VBA:
Since I wanted to try out a lot of combinations of Luminosity and Saturation values to tweak color ranges, I decided to write a VBA macro to do this. I quickly learnt that while Excel allows HSL manipulation through the UI, it does not support HSL manipulation through VBA. I looked around for ways to manipulate HSL and finally found a blog post by Tony Jollans that did something similar. Although most of the code used is directly from Tony Jollans’s blog post, I wrote some additional subroutines, added some comments and re-factored some of the subroutines for my purpose.
The entry function for my macro is called ChangeColorRangeColors().
Sub ChangeColorRangeColors()
...
For Each clrScale In Selection.FormatConditions
For Each criteria In clrScale.ColorScaleCriteria
' get clr
clr = criteria.FormatColor.Color
' split RGB
SplitRGB clr, R, G, B
'skip whites - when all RGB componenst are above 240
If R < 240 Or G < 240 Or B < 240 Then
' get HSL
RGBtoHSL clr, H, S, L
' increase luminosity
L = (iLuminosity / 255) ' from 0.0 to 1.0
' decrease saturation
S = (iSaturation / 255) ' from 0.0 to 1.0
' get RGB
HSLtoRGB H, S, L, clr
' set new clr
criteria.FormatColor.Color = clr
End If
Next
Next
End Sub
At a high level, my HSL manipulator code works in the following ways:
- Get the RGB encoded value from the selected color ranges.
- Split the RGB encoded values into the Red, Green and Blue components.
- Skip the white colors – we do not want to change these.
- Covert RGB into HSL.
- Modify the Saturation and Luminosity, leaving Hue unchanged.
- Convert the modified HSL values back to encoded RGB.
- Apply the new RGB value back to the color range.
I won’t go into the details of converting between RGB and HSL values. These conversion routines are well documented and can be freely found on the internet.
However, another tricky part was trying to split the encoded color value that is used by Excel into individual RGB components. VBA provides a function, aptly named RGB(), that returns an encoded long value when provided with the R,G and B components. I have used this function in my macro to set the Color Range colors. However, VBA does not provide a reverse function to split this encoded long value. I have used a custom function called SplitRGB to do this.
Sub SplitRGB(RGB As Long, R As Double, G As Double, B As Double)
Dim HexString As String
' convert the long to Hex - bb:gg:rr
HexString = Hex(RGB)
' in order to get r,g,b components out of the string,
' we have to make it is atleast 6 characters long - bb:gg:rr
HexString = Right(String$(5, "0") & HexString, 6)
' get each individual color and convert to an double (range: 0 to 255)
R = CDbl("&H" & Mid$(HexString, 5, 2))
G = CDbl("&H" & Mid$(HexString, 3, 2))
B = CDbl("&H" & Mid$(HexString, 1, 2))
End Sub
Since the R, G, B values span from 0 to 255 and can be accommodated in 1 byte. Excel uses a 4-Byte long to encode the RGB values. The first byte is used to store the red value, the second byte stores the green value and the third byte stores the blue values. An easy way to observe these values is to convert this encoded long into a hexadecimal string. We then split this string using the Mid function to isolate the R, G, B components. Finally, we use a conveniently provided CDbl function to convert hexadecimal text to a number. Observe that appending the “&H” enables the CDbl function to identify the text as hexadecimal.
To recap, We can create professional looking Conditional Formatting color ranges by choosing the correct shades of whites and using the HSL model to tweak Saturation and Luminosity values. I am hoping that by using these techniques and subroutines, you will be able to automate the look and feel of your Excel Spreadsheets.
The attached workbook contains these color ranges and the VBA macros. We would love to hear your feedback on these colors.
A Robust Way To Reference Multiple Columns In a Table
I’m sure many of you have built a spreadsheet like this before: you’ve got a table of data; one of the columns in this table contains a formula which references a span of columns in the same table. The number of columns you need to reference may change over time as requirements change. How do you you build this table in such a way that the number of columns can be changed without breaking the formula that references them?
Let’s walk through a somewhat contrived but simple example to demonstrate the issue. Say I’m a real estate investor and I’m tracking a list of houses I’m interested in purchasing. I’m a demanding and detail oriented buyer so I want to know the square footage of the individual rooms in the dwelling. Based on this information, I want to calculate the count of rooms in the dwelling as well as the total square feet of the rooms combined. The table might look something like this:
The formula for the “#Rooms” column looks something like this:
=COUNT( Table1[[#This Row],[SqFt-Room1]:[SqFt-Room3]] )
Now, let’s say I want to add a new house, but this new house has four rooms. Simple, just add a column, right? Not so fast; depending on how we add that column our formulas for “#Rooms” and “Ttl Sq. ft.” may not update as expected.
Or let’s say I’ve changed focus to small condominiums and no longer have a need for the “Room3” column. If I attempt to delete it, then my formulas will break. How do we avoid this?
Here's one trick I use in such situations. It may not be the best answer, but I’ll share it with you here and if others have suggestions to offer you can add them to the comments.
Solution: add an extra “dummy” column before and after the span as end caps, and refer to those columns in your formulas. Using our real estate example, I would add a column before “SqFt-Room1” called “RoomsStart” and one after “SqFt-Room3” called “RoomsEnd”, like so:
I would keep these columns empty and never put data in them. This is important because otherwise my COUNT and SUM formulas may return the wrong results. Then I’d highlight the columns a different shade (this is totally optional but I personally like the visual effect of marking off the start and end of the span) and resize them to something very small so they are out of the way for the most part, like so:
Alternatively, you can hide the columns if you so desire. Then I’d update my formulas so that they referenced these columns instead, like so:
Now, when I want to add a fourth room, I select the “RoomsEnd” column and right-click \ Insert and I get a new column which I can name “SqFt-Room4”. This new column will automatically be included in my COUNT and SUM calculations. Similarly, if I ever get rid of “SqFt-Room4”, I don’t have to worry about breaking any of my formulas. It may not be the most elegant solution, but it gives me a virtually worry-free way to reference a changing number of columns in my table.
Analyzing Data: Functions or PivotTables
Today’s author, Monica Poinescu, a Software Developer in Test on the Excel team, discusses two different approaches to analyzing data in Excel.
Edit: I've attached a file at the bottom of this blog that contains spreadsheets of the examples discussed in this post.
My earlier blog on the new Excel 2007 function SUMIFS spawned a very interesting discussion (thanks to everyone who posted comments there): when trying to analyze/aggregate data in a table, how do we decide whether to use functions versus PivotTables?
This blog outlines reasons to use one option or another. To better illustrate the two alternatives I’ll consider a real estate inspired example: let’s say I have list of homes for sale and their corresponding characteristics:
and I’m trying to find the average price for those homes which have at least 3 bedrooms, a garage and are between 5 and 10 years old. Just by looking at the table, we see that only house3 and house4 satisfy all conditions and the average of their prices is 312,500.
Here is a functions based solution:
=AVERAGEIFS(G2:G6,C2:C6,">2",E2:E6,"yes",F2:F6,">1999",F2:F6,"<2004")
which returns $ 312,500.
To build a corresponding PivotTable, one can use several filters:
Advantages of PivotTables:
- While AVERAGEIFS is limited to using at most 127 conditions, PivotTables can handle more than 127 conditions.
- The elements of the set that fulfill all conditions are listed in the resulting PivotTable.
- PivotTables have a lot of flexibility: the ease of use offered by the new UI allows for a very quick detailed analysis of different available options. Nested layers in a PivotTable offer added results visualization.
- Several different approaches are possible: one can construct different PivotTables that answer the same question above.
- Particularly useful for large data sources: when relying on an external data source, you don’t need to bring all the data in Excel and one could, for example, use OLAP databases.
Advantages of using functions:
- It’s easier to see in one glance all the conditions being used either by looking in the formula bar, or by listing all criteria in separate ranges.
- The result updates immediately when adding rows to the source table, while PivotTables need to be refreshed.
- The criteria, when referenced in a cell, could be as well the result of another formula, while value filters in PivotTables can only use constants.
For example, in the formula above, one could replace
=AVERAGEIFS(G2:G6,C2:C6,">2",…
with
=AVERAGEIFS(G2:G6,C2:C6,I4,…
where I4 contains another calculation ( e.g. =">"&FIXED(SUM(1,1),0)).
The image below shows a corresponding PivotTable filter:
The last field will not accept formulas, only numbers. - Formulas take little space and are easy to move around in a sheet.
Note that in both cases you can use the wildcard characters to define criteria. Also both solutions deal in a similar manner with missing data or errors in the range.
I don't know if I can be fully objective on this question because I'm more of a formula person myself. If I've missed a reason you should pick one approach over another, feel free to let me know the reason you use formulas or PivotTables for summarizing data by leaving a comment.
Excel VBA Performance Coding Best Practices
Today’s author, Chad Rothschiller, a Program Manager on the Excel team, is back with a follow up from his previous post on VBA and Excel performance.
I want to start off this post by thanking everyone who sent in their examples in response to my January request. It is incredibly helpful to be able to look at what you all are doing with Excel! Not only did I see a huge variety in how Excel is being used, you also pointed out various tips and tricks for writing fast VBA code in Excel.
In this post I'm going to share with you the most important performance tips I know about. There are tons of sites, pages, and people who are experts as well on this subject, have performed their own tests, and shared their results and ideas. If you think I missed an important concept for how to optimize Excel VBA performance, or if you've got a valuable comment or link to share, please feel free to post here so everyone can benefit. Thanks!
Turn Off Everything But the Essentials While Your Code is Running
This optimization explicitly turns off Excel functionality you don't need to happen (over and over and over) while your code runs. Note that in the code sample below we grab the current state of these properties, turn them off, and then restore them at the end of code execution.
One reason this helps is that if you're updating (via VBA) several different ranges with new values, or copy / pasting from several ranges to create a consolidated table of data, you likely do not want to have Excel taking time and resources to recalculate formulas, display paste progress, or even redraw the grid, especially after every single operation (even more so if your code uses loops). Just one recalculation and one redraw at the end of your code execution is enough to get the workbook current with all your changes.
Here's some sample code that shows how and what to shut off while your code runs. Doing this should help improve the performance of your code:
'Get current state of various Excel settings; put this at the beginning of your code
screenUpdateState = Application.ScreenUpdating
statusBarState = Application.DisplayStatusBar
calcState = Application.Calculation
eventsState = Application.EnableEvents
displayPageBreakState = ActiveSheet.DisplayPageBreaks 'note this is a sheet-level setting
'turn off some Excel functionality so your code runs faster
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
ActiveSheet.DisplayPageBreaks = False 'note this is a sheet-level setting
'>>your code goes here<<
'after your code runs, restore state; put this at the end of your code
Application.ScreenUpdating = screenUpdateState
Application.DisplayStatusBar = statusBarState
Application.Calculation = calcState
Application.EnableEvents = eventsState
ActiveSheet.DisplayPageBreaks = displayPageBreaksState 'note this is a sheet-level setting
Here's a quick description for each of these settings:
Application.ScreenUpdating: This setting tells Excel to not redraw the screen while False. The benefit here is that you probably don't need Excel using up resources trying to draw the screen since it's changing faster than the user can perceive. Since it requires lots of resources to draw the screen so frequently, just turn off drawing the screen until the end of your code execution. Be sure to turn it back on right before your code ends.
Application.DisplayStatusBar: This setting tells Excel to stop showing status while False. For example, if you use VBA to copy/paste a range, while the paste is completing Excel will show the progress of that operation on the status bar. Turning off screen updating is separate from turning off the status bar display so that you can disable screen updating but still provide feedback to the user, if desired. Again, turn it back on right before your code ends execution.
Application.Calculation: This setting allows you to programmatically set Excel's calculation mode. "Manual" (xlCalculationManual) mode means Excel waits for the user (or your code) to explicitly initiate calculation. "Automatic" is the default and means that Excel decides when to recalculate the workbook (e.g. when you enter a new formula on the sheet). Since recalculating your workbook can be time and resource intensive, you might not want Excel triggering a recalc every time you change a cell value. Turn off calculation while your code executes, then set the mode back. Note: setting the mode back to “Automatic” (xlCalculationAutomatic) will trigger a recalc.
Application.EnableEvents: This setting tells Excel to not fire events while False. While looking into Excel VBA performance issues I learned that some desktop search tools implement event listeners (probably to better track document contents as it changes). You might not want Excel firing an event for every cell you're changing via code, and turning off events will speed up your VBA code performance if there is a COM Add-In listening in on Excel events. (Thanks to Doug Jenkins for pointing this out in my earlier post).
ActiveSheet.DisplayPageBreaks: A good description of this setting already exists: http://support.microsoft.com/kb/199505 (Thanks to David McRitchie for pointing this out).
Read/Write Large Blocks of Cells in a Single Operation
This optimization explicitly reduces the number of times data is transferred between Excel and your code. Instead of looping through cells one at a time and getting or setting a value, do the same operation over the whole range in one line, using an array variable to store values as needed.
For each of the code examples below, I had put random values (not formulas) into cells A1:C10000.
Here's a slow, looping method:
Dim DataRange as Range
Dim Irow as Long
Dim Icol as Integer
Dim MyVar as Double
Set DataRange=Range("A1:C10000")
For Irow=1 to 10000
For icol=1 to 3
MyVar=DataRange(Irow,Icol) 'Read values from the Excel grid 30K times
If MyVar > 0 then
MyVar=MyVar*Myvar ' Change the value
DataRange(Irow,Icol)=MyVar 'Write values back into the Excel grid 30K times
End If
Next Icol
Next Irow
Here's the fast version of that code:
Dim DataRange As Variant
Dim Irow As Long
Dim Icol As Integer
Dim MyVar As Double
DataRange = Range("A1:C10000").Value ' read all the values at once from the Excel grid, put into an array
For Irow = 1 To 10000
For Icol = 1 To 3
MyVar = DataRange(Irow, Icol)
If MyVar > 0 Then
MyVar=MyVar*Myvar ' Change the values in the array
DataRange(Irow, Icol) = MyVar
End If
Next Icol
Next Irow
Range("A1:C10000").Value = DataRange ' writes all the results back to the range at once
Note: I first learned of this concept by reading a web page by John Walkenbach found here: http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/
A previous Excel blog entry by Dany Hoter also compares these two methods, along with a selection / offset method as well: http://blogs.msdn.com/excel/archive/2008/10/03/what-is-the-fastest-way-to-scan-a-large-range-in-excel.aspx
...which leads me to my next point.
Avoid Selecting / Activating Objects
Notice that in the above-referenced blog post, the selection method of updating a range was the slowest. This next optimization minimizes how frequently Excel has to respond to the selection changing in the workbook by minimizing the selection changing as much as possible.
Range Example: Again, see the Excel blog post quoted above. It demonstrates that using selection is the slowest of the 3 methods discussed for reading and writing to ranges.
Shapes Example: Setup: I have 40 shapes on a sheet, and I want to write "Hello" in each of them.
Using the slower "selection" method, the code looks like this:
For i = 0 To ActiveSheet.Shapes.Count
ActiveSheet.Shapes(i).Select
Selection.Text = "Hello"
Next i
The much faster method is to avoid selection completely and directly reference the shape:
For i = 0 To ActiveSheet.Shapes.Count
ActiveSheet.Shapes(i).TextEffect.Text = "Hello"
Next i
The concepts illustrated by the examples above can also be applied to objects other than Ranges and Shapes.
Note: I first learned of this concept, in the context of shapes, by reading a web page by Ron de Bruin found here: http://www.rondebruin.nl/shape.htm
Related Performance Paper
See the "Improving Performance in Excel 2007" paper on MSDN: http://msdn.microsoft.com/en-us/library/aa730921.aspx
This is a fairly detailed and comprehensive paper that introduces the bigger grid and increased limits in Excel 2007, and primarily focuses on Excel calculation performance and debugging calculation performance bottlenecks. There's also a short section on how to write faster VBA macros.
Other Performance Optimizations
While the above optimizations are what I consider the most important, there are a few other "honorable mention" optimizations I will mention briefly for you to consider.
Consider the performance gains by implementing your code's functionality via XLL / C-API. An overview and supporting materials for the SDK can be found here: http://msdn.microsoft.com/en-us/library/bb687827.aspx .
Declare variables with explicit types to avoid the overhead of determining the data type (repetitively if used in a loop) during code execution.
For simple functions used by your code in high frequency, implement them yourself in VBA instead of using the WorksheetFunction object.
Use Range.SpecialCells() to scope down the number of cells your code needs to work with.
Creating Dynamic Validation Ranges
Today’s author, Dany Hoter, a Product Planner for the Excel team, shares a neat trick he learned recently for creating dynamic data validation ranges. The sample file used for this blog post can be found in the attachments at the bottom of this post.
The problem: Validating data entry based on hierarchical (parent child) data. The example used is regions and countries but it could be countries and cities, product categories and sub-categories, class and student name, etc.
You want to enter a region from a list of regions and in the next cell you want to select a country but only from the countries which belong to that region.
How do you define the list of countries to validate against? The trick is basing the country validation list on an expression which will point to a different range based on the region value.
Follow the instructions in the file itself (attached below) and see how it works.
Points to notice:
- In Step 2 named ranges are created automatically based on a selected range.
- The named range used for country validation uses relative reference to point to the list of values. In this way the same mechanism will work anywhere in the spreadsheet. The validation of the cell with country is based on the region value to the left of country. The actual expression is =INDIRECT(B13). For example the named range for countries in Europe is Europe. The indirect function uses the value of the region to point the relevant named range. B13 happens to be the cell to the left of the current cell when defining the name.
- The validation rules are propagated automatically to each new row in the table.
- The lists of regions and countries do not have to be in the same worksheet as the input cells.


