
The Power of
Arrays
The Excel
tool that performs multiple functions in a single
step.
by Paul
Goldwater and Timothy Fogarty
|
To help
readers follow the instructions
in this article, we used two
different typefaces: Boldface
type is used to
identify the names of icons,
agendas and URLs. Sans serif type
shows the names of files and the
names of commands and
instructions that users should
type into the computer.
|
|
ne
of the most powerful features of Excel is the
arraya formula designed to act
simultaneously on sets of two or more values in
order to calculate other values. Yet, because
arrays appear to be forbidding, few CPAs use
them. This article is designed to dispel
arrays bad reputation and demonstrate how
they can speed and simplify your work while
making it less prone to errors. So get ready to
overcome your bias against arrays.
Well
begin with the most basic array formula, and as
we movestep by stepto more
complicated ones, youll see how powerful
arrays can get. To make it easier for you to
follow along, download an Excel file from www.aicpa.org/download/pubs/jofa/mar2007/goldwater.xls. The file contains two
versions of each worksheet. One worksheet in each
set has blank cells in which you can practice
entering the arrays and other formulas mentioned
in this article, while the other has all the
cells already completed.
AVERAGE IT
Accountants often need to tightly summarize data.
Exhibit 1 uses a one-dimensional array
formula on payroll information to calculate the
average pay of each employee and the global
average of all employees. (In your downloaded
file, see the Average It worksheet).
Heres how we did it:
To
calculate the average per employee, select the
range G3:G7 and type this formula:
=(B3:B7+C3:C7+D3:D7+E3:E7+F3:F7)/5
Then
press Ctrl+Shift+Enter, which does two things: It
automatically places curly brackets{
}around the formula, labeling it an array
formula, and simultaneously triggers the array
calculation. The array formula now exists in G3
to G7 and cannot be changed except by rewriting
the entire formula.
To
calculate the average pay per month, select the
range B8:G8 and type in this formula:
=(B3:G3+B4:G4+B5:G5+B6:G6+B7:G7)/5
Then
press Ctrl+Shift+Enter. The global average for
all employees is now in cell G8.
RANK IT
Creating two-dimensional arrays is slightly more
challenging. Consider again the payroll data of Exhibit
1. This
time we want to rank the paychecks by size. To do
that, copy the list of names (as shown in the
lower half of Exhibit 2) and type this formula:
=RANK(B3:F7,
B3:F7)
Press
Ctrl+Shift+Enter, and presto, the data are
rankeda task that would be far more
difficult without arrays.
ANALYZE IT
Arrays also are useful when performing analyses
that impose conditions upon mathematical
operations. For example, say you have a
spreadsheet loaded with sales data and you want
to see various subsets of the data based on
ranges of products prices and quantity (see Exhibit 3). For this calculation we will use
a single-cell array. (See the worksheet Single
Cell in your downloaded file.)
If we
want to know the sum of range B3:B7 by range
C3:C7, typically we could create D3:D7 and place
the sum in cell D8. However, an array formula in
D9 would do the job in one pass:{=SUM(B3:B7*C3:C7)}
| Table 1 |
| Accounting
question of interest |
Cell |
Formula |
| Sum
(using arrays) |
D9 |
{=SUM(B3:B7*C3:C7)} |
| Sum (if
greater than the average price) |
D10 |
{=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*C3:C7)} |
| Sum (if
greater than the average quantity) |
D11 |
{=SUM((B3:B7>AVERAGE(B3:B7))*B3:B7*C3:C7)} |
Sum (if
greater than the average price
and greater than theaverage quantity) |
D12 |
{=SUM((C3:C7>AVERAGE(C3:C7))*B3:B7*(B3:B7>
AVERAGE(B3:B7))*C3:C7)} |
Sum (if
greater than the average price
or greater than the average quantity) |
D13 |
{=SUM(((C3:C7>AVERAGE(C3:C7))+(B3:B7>
AVERAGE(B3:B7))>0)*B3:B7*C3:C7)}
|
That simple formula extracts all the
information from the underlying cells without the
usual sum formulas in D3:D8. The grayed area at
the bottom of Exhibit 3 (D9:D13) contains various
array formulas to compute numerous values of
interest to CPAs. Table 1 lists many of the typical
ways CPAs are called upon to manipulate such data
and the array formulas that perform each of those
calculations.
Advisory: Often CPAs need
to know, and perhaps to explain to clients or
executives who are not handy in Excel, how
certain spreadsheet numbers are derived (see
screenshot below). You can display this
information easily by clicking on Tools,
Formula Auditing, Evaluate
Formula. This allows you to step through
the calculations, first seeing cell references
and then the numbers those references represent.

CALCULATE THE CONSTANTS
Arrays often need to import constants, including
useful explanatory information such as dates,
names and numbers. This comes up, for example,
when you need to include transaction fees, such
as sales taxes and shipping charges, in formulas.
Exhibit 4 shows invoices both before the
application of a 4% sales tax and a 1% shipping
fee (column D) and after (column G). See the
downloaded files Single Cell
worksheet.
This kind of array has to be created
and then recalled by way of a drop-down menu,
which is evoked by clicking on Insert,
Name, Define
(see Exhibit 5). Under Names in workbook,
add the designation of a name such as
StandardCharges, and in the Refers to
field, add the specification of the percentage
rates, such as = {0.04,0.01}. Both of these
values could have been stored in separate cells
on the spreadsheet. But often these values
(constants) should be hidden or made inaccessible
so the employees cannot easily change them.
Place the formula in the affected
cells (in this case E3:E7 and F3:F7) that recall
the percentage rates and apply them to the base
amounts. For example, the sales taxes require
this formula to be created as an array:
{=D3:D7*INDEX(StandardCharges,1)}
The
shipping charge array requires selecting the
second value from StandardCharges:
{=D3:D7*INDEX(StandardCharges,2)}
Lets
look at three typical calculations CPAs face. For
each, refer to Exhibit 6 and your downloaded
worksheet Calculate Constant.
Once again, use the Evaluate Formula
tool to observe what Excel is doing for each of
the following examples.
SUMMING BASED ON A CONDITION
When you need to sum values based on one or more
conditions, the array formula (in D9) is:
=SUM(IF((D3:D7>=1000)*(D3:D7<1700),D3:D7))
This
formula returns the sum of all cells in the range
D3:D7 where the value is greater than or equal to
1,000 and less than 1,700. You can include more
conditions in the array formula if necessary.
SUMMING THE n LARGEST VALUES IN A RANGE
Cell D10 contains this formula:
=SUM(LARGE(D3:D7,ROW(INDIRECT("1:2"))))
This
formula returns the sum of the two largest values
in the range D3:D7. LARGE is an
Excel function that will be evaluated twice, each
time with a different second argument (that is,
1, 2). Also, if you wish to sum the n smallest
values in a range, use the Excel SMALL
function instead of the LARGE
function.
COMPUTING AN AVERAGE THAT INCLUDES ZEROS
Often the use of averaging commands is distorted
by the presence of zeros in the data. Zero may
indicate that the transaction is a different
type, and that the calculation is not relevant.
In many situations we wish to ignore zero values
when determining an average. In D11 (Exhibit
6) the
array formula =AVERAGE(IF(D3:D7<>0,D3:D7)) averages all
nonzero values in the range D3:D7. If you had
used the Excel function =AVERAGE(D3:D7), you would have received
the value 1,133.07 rather than 1,416.34.
COUNTING THE NUMBER OF DIFFERENCES
IN TWO RANGES
Arrays are exceptionally helpful in ascertaining
difference conditions that might have
significance. Thats especially true when
performing internal controls, where consistency
and agreement are essential. In Exhibit
7, cell
E13 contains an array formula that compares
corresponding values in two ranges (D3:D11 and
E3:E11) and determines the number of differences
that are greater than a specific value. The
specific value is in I13 ($2.00); think of this
as a materiality threshold on the discrepancy, to
separate out those that are trivial or caused by
predictable events. If there are no differences
between the two ranges, the formula will return
0. When entering the data for this formula, both
ranges must be the same size. (Refer to the
downloaded files Differences
worksheet.)
IDENTIFY THE
CELLS
The previous array formula returned the number of
differences, but we also may want an itemized
listing of those differences. To do that in the
range F15:F23 enter this array formula:
=LARGE($E$3:$E$11-$D$3:$D$11,ROW(INDIRECT("1:"&$E$13)))
The
formula will return the six cells (see cell E13)
that are of interest to us because their hourly
rate increased by more than $2. To list in
ascending order the six employees whose hourly
rate has increased by more than $2 per hour,
enter the following formula in E15 and copy it
into the range E16:E23:
=INDEX($B$3:$B$11,MATCH(F15,($E$3:$E$11)-($D$3:$D$11),0))
RETURNING THE LOCATION OF
THE MAXIMUM VALUE IN A RANGE
To return the row number of the maximum hourly
rate in the range E3:E11, enter this formula in
F26:
=MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),""))
It
would return the value 5. To return the address
of the maximum value, use this array formula in
G26:
=ADDRESS(MIN(IF(E3:E11=MAX(E3:E11),ROW(E3:E11),"")),COLUMN(E3:E11))
The
value $E$5 will be returned. Obviously this task
can be accomplished by eyeballing the data, but
when youre dealing with hundreds of
employees, Excel can do the job more effectively.
DETERMINING WHETHER A RANGE CONTAINS VALID VALUES
Many times we need to know whether the values in
one range are contained in another. This could be
used to uncover fraud based on a
padded payroll, for example. (Refer
to the downloaded files Padded
Payroll worksheet.) We could check for
that by determining whether every employee
receiving a check was also in the master list of
employee names. Exhibit 8 demonstrates four array
formulas that could do the jobthat is,
compare payroll names against a master list.
Start by placing this formula in J2:
=ISNA(MATCH(TRUE,ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),0))
It
will return either TRUE or FALSE. In this case it
returns FALSE because there are names in the
range B2:B12 receiving payroll checks that are
not in the range A2:A10 (the master list of
employee names).
Then,
in J3 enter this formula:
=SUM(1*ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)))
It
returns the value 4the number of names
receiving a payroll check that are not in the
master employee list.
Next,
to discover who these people are, enter this
formula in the range C2:C12:
=SMALL(IF(ISNA(MATCH($B$2:$B$12,$A$2:$A$10,0)),
ROW($B$2:$B$12),""),ROW(INDIRECT("1:"&$J$3)))
The
formula returns the array of row numbers of
concern.
And
then in D2:D12 is the formula:
=INDIRECT(ADDRESS(C2,2))
It
returns the names on the payroll checks.
As
you can see, CPAs can use array formulas in many
ways. The formulas do complex jobs in step-saving
ways and are able to cull material that would
otherwise consume long hours of searching.
Paul
Goldwater, Ph.D., is an
associate professor of accounting at the
University of Central Florida in Orlando, Fla. Timothy
Fogarty, CPA, Ph.D., is an
associate dean and a professor of accounting at
Case Western Reserve University in Cleveland.
Their e-mail addresses are paul.goldwater@bus.ucf.edu and timothy.fogarty@case.edu, respectively.
|