## Archive for the ‘Excel Functions’ Category

## Who’s On First?–Ranking Performance Using Excel

#### Introduction

No this isn’t a variation on the classic Abbott and Costello comedy routine. It’s a discussion of how to use an excel formula to arrange members of a group according to some sort of performance criteria.

One way of finding out who the top performers in a list are is to sort the list in descending order by the value that reflects the performance. Of course, that would mean having to re-sort the list every time you want to see the new ranking.

This article will demonstrate how to perform ranking without resorting the underlying list using formulas. These formulas use three excel functions, LARGE(), INDEX(), and MATCH() to rank individuals according to their total sales.

The companion workbook for this article, RANKER.xlsx can be downloaded from here. All data in the workbook is fictitious.

#### The Challenge

We have a large list of sales representatives. We want to know, *without sorting the list,* the names of the top 10 sales people. The leaders list should always display the current top 10 sellers based on current data in the sales representatives list.

At first glance, this problem seems to call for some sort of lookup functionality. Unfortunately VLookup() is not up to the challenge because that function can only use the first column of a table or list as an index to the list. Or list of sales representatives has the person’s id or employee number in in its first column.

The more general Lookup() function is also unsuitable because the index list it refers to must be sorted in ascending order.

#### The Functions

For complete descriptions of the functions used in this example refer to linked help pages, LARGE(), INDEX(), and MATCH().

**LARGE **The Large() function searches a list of values and returns the value that is in the relative position in the list that you specify. For example, you can find the largest value in a list without having to sort the list. The function is very flexible because it has an argument that lets you specify any specific relative position in the list from 1 (largest value) to the very last position (by specifying a number equal to the total number of items in the list. All this can be done without ever having to sort the list.

**INDEX **The Index() function returns the value found at a specific position in a list.

**MATCH **The Match() function searches a list and returns the number of the list row in which the value you have said to search for is found. The example used in this article specifically uses a one-column list. If the list contains duplicate values, Match will find one of the duplicates over and over for as many times as the value is duplicated.

#### Solving the Duplicates Issue

While the probability of two representatives having exactly the same Sales value may be low, the very possibility that a duplicate might exist necessitates a strategy to deal with duplicates should they occur.

In this example, we have used a bit of formula trickery to create a unique way of identifying duplicate values.

This formula when copied down a column will count the instance of a value in the specified range.

COUNTIF($E$2:E2,E2)

Notice the use of mixed references in the first argument. For this technique to work, the list may not be formatted as an Excel 2007/2010 Table. If the data is formatted as a Table, convert it to a range before entering the formula.

This expression is part of the more complex formula contained in cells F2 to F224 on the Demo_Notable worksheet. When copied down a column the reference to the start of the range remains fixed on E2 while the end of the range and the criteria cell reference are incremented for each successive reference.

However, that’s not the formula trickery I mentioned. In order to use the Large() function, we need numeric values. At the same time, in order to create a useful identifier that will not distort results, we need to work with numbers formatted as text. The trick is to convert intermediate calculations to text and then reconvert the final result back to a number, not just the textual representation of the number.

This is what the full formula looks like

=VALUE(TEXT(E2,"00000000")&TEXT(COUNTIF($E$2:E2,E2),"0000"))

When you copy this formula down the column, say to E3, Excel modifies the relative cell references. So in Row 3 the formula would become:

=VALUE(TEXT(E3,"00000000")&TEXT(COUNTIF($E$2:E3,E3),"0000"))

The idea is to convert the Sales value to text with a fixed number of digits and then concatenate the text value with the text representation of the count of duplicates. Simple adding the two values together arithmetically would distort the value and not bring us any closer to having unique value that will distinguish between duplicate instances. COUNTIF() is counts how many times the value of interest occurs in the specified range, which grows row by row as copy the formula down the column.

Here’s what the formula looks like when you copy it down several rows. Notice that the second argument of the CountIf() function changes the end of the range being counted for each row the formula is copied to. However, the range beginning is anchored to cell E2 because that part of the reference is absolute.

Caution: Using this technique with a very large list may be very slow. For lists of a few hundred items or less should not be affected.

#### The Final Solution

Six formulas drive the solution. Here are the formulas from the second row of the try it worksheet. Once the individual formulas have been created they can be copied down their respective columns (using autofill.)

The formula in F2 converts the sales value in E2 to Text and concatenates that value with a four character representation of the result of counting the frequency with which the value in E2 appears. This gives us a unique index value that ensures we will be able to find each instance of the duplicate value.

In I2 the formula finds the value in column E that is x positions from the largest value. H2 contains the value 1, so the formula in I1 returns the largest value from the list. This result is not quite adequate because it can’t distinguish duplicate values.

J2 finds the row number of the value in I2. When there are duplicate values in column E, the result in this column will always indicate the first row in which the value occurs.

The formulas in K2 and L2 are very similar to those in I2 and J2 EXCEPT the formula in K2 uses values in column F to determine rank. The forumlas in column F ensure that there are no duplicates to be ranked. The method uses has a significant side effect because it yields an higher rank for the first occurrence of a duplicate and progressively lower ranks for each subsequent occurrence.

The video accompanying this article includes a demonstration of the effect of duplicates on ranking.

The final formula in cell M2 uses the row number calcluation in L2 to find the name of the rep having a particular rank.

## Data Validation Using Multi-tiered Lists

I came across this video today which is an excellent tutorial of advanced Excel techniques for data validation. The example uses the Indirect() function as the source for the lists that are used as choices in a ‘drop-down’. I will let the video speak for itself.

One thing that the video doesn’t mention, is that because the lists validation works with are named ranges, they can actually be on a different sheet than the one that the data validation is used on. Ordinarily, data validation lists must be on the same sheet as the cells with the validation rules.

## Conditional Formatting–Everything You Wanted to Know but Were Afraid to Ask

At one time one of my job responsibilities included budget management and financial reporting for the charity I worked for. Among other things, I had to identify, analyze, and report on variances on each account that was $1,000 or more over or under budget. At the time, my employer operated offices in four locations, with three or four departments in each location. So I had a fair number of accounts (as I recall about 600) to review every month just to identify those that needed further attention.

Keep in mind that personal computers spreadsheet software were, if not in their infancy, still in their very early youth. These were the days of DOS and non-graphic spreadsheets. Colour monitors, were as yet virtually unheard of.

So, at the time, the best solution I could come up with was a conditional formula in cells adjacent to the variance column that displayed an asterisk when the actual value was $1000 or more over or under budget.

### Formulas

The formulas in column G use the **ABS()** function to calculate the variance. Printed reports had to show this value for all accounts. The formula first calculates the difference between the actual and budget for the account. ABS calculates the absolute value of the result. So whether the variance is positive or negative, it will be displayed as a positive number.

Column H will display asterisks only if the variance is $1,000 or greater. The formulas use the** IF()** function to test the variance for being equal to or greater than $1,000. If the test is true, the cell displays an asterisk “*”. The third argument “” cause the cell to display a blank. “” is known as the empty string. You can use it anywhere that you need to have text with no actual characters.

### Welcome to the recent past

By today’s standards my solution was rather primitive. With the advent of Graphical interfaces in general and Excel for Windows in particular, *conditional formatting *became available. With conditional formatting, you can control the format of a cell depending on a rule you create. The format of the affected cells changes automatically depending on whether the rule is true or false whenever the worksheet is recalculated.

So now, something like this became available:

You could set up to three rules so you could highlight cells differently according to the rules you set. So you could visually rank cells according to their values.

Here I have three rules. If the cell value is less than $500.00 the fill colour should be green. If the cell value is $500.00 – $999, the fill colour is yellow. For variances of $1,000 and more, the fill colour is red.

It is also possible to control the formatting of a cell depending on the value of a different cell:

Conditional formatting not only makes it easy to flag serious error conditions (reportable variances in this case) but also enables more pro-active flagging. So, in this example a red cell indicates that variance reporting is required; yellow indicates an account that should be reviewed and monitored; green indicates that no action is needed for the line item at this time. By yellow flagging accounts that may be on their way to a serious variance you can be pro-active and start looking for ways and means to prevent the item from sliding out of control.

### Welcome to the 21st Century

Excel 2007 introduced some major enhancements to Conditional Formatting. Now we don’t just have a choice of graphics to convey information about our data even more effectively.

Here the conditional formatting in column G indicates the relative value of each variance by the size of the bar. In column H the traffic light icons indicate whether a report is required for the variance, whether the account should be watched more closely, or whether current results for the account are within satisfactory limits.

The possibilities of Conditional Formatting, while not endless, are extensive. Whether your worksheet is strictly a working document designed to help you manage you job responsibilities more effectively, or whether your worksheet is part of a presentation to your boss or others, Conditional Formatting is well worth exploring.

## Chart Intermittent Dates

## Quote of the day
-Walker Percy |

Natasha wanted to graph photocopier downtimes to show the duration of each downtime in minutes with data points labeled with the date of the down time incident. Her data list included the date of the copier needed service and the number of minutes it remained out of order. This seems simple enough but when she tried to generate a graph for the year, every day of the year appeared in the graph. That wasn’t quite what Natasha was looking for. She wanted to show only dates when there was copier downtime.

Graphing, it seems, assumes that if a series consists of date values then you want a date series showing every date between the first and last dates. Here you see the problem in the graph on the left, and a solution in the graph on the right. In the original data, the first entry is February 12 and the last, December 14. Notice how the graph is display dates from February 12 to December 12.

Because of the size of the graph, the labels display only one date for each of the eleven months.

The graph on the right, on the other hand, has data points only for each of the dates in its data series. (The size of the graph may cause alternate data points to be un-labeled, so the graph might need to be widened somewhat so that each data point has a label. You may also have to format the axis.) Because the X-Axis data is self-explanatory, I have removed the legend from both graphs.

The secret, to paraphrase *Fried Green Tomatoes,* is in the formula. Actually, it is in a function that the formula uses. TEXT() is a very versatile function, well worth adding to your Excel arsenal

In this example, I have used TEXT() to convert the date values in column B to text values that include the name of the weekday on which the date occurs.

The TEXT() function converts numeric values to text . It takes two arguments, a reference to a cell containing a numeric value and a string to indicate how the text should be formatted.

For example, the formula in cell J3 is:

=TEXT(B3, “ddd mmm dd”)

The format string here specifies that the text should begin with the three character abbreviation for the day name followed by a space, the three character abbreviation for the month name, another space, and finally a two digit number for the day.

There are many variations on the format string. Search the Excel help files for *Text Functions* or point your browser to Office On-line. The help article gives full details of how to specify the format string (the second argument of the function) depending on the numeric value you are trying to convert to text.

The point in this example is that by converting the dates in column B to text, and using the data in column J for the series labels, we can fool Excel into displaying date without having it automatically treating the dates a continuous series from the earliest date to the latest date.

Thanks Natasha for raising the question.