Excel Tutorial Series – Advanced Lookup/Counting Functions

In this tutorial, I'll be showing you some more advanced lookup and counting functions.

But first, a logic lesson!

According to Merriam Webster, truth tables are "a table that shows the truth-value of a compound statement for every truth-value of its component statements." The truth-value in this case is boolean: true or false.

Truth tables show the outcome of an expression, given all possible combinations of its input values.

NOT
A-A
falsetrue
truefalse

As you can see above, for any given value of A, NOT A is the opposite boolean value. Simple enough.

AND
ABA /\ B
falsefalsefalse
falsetruefalse
truefalsefalse
truetruetrue

In the AND truth table, the result is TRUE only if both input values are TRUE.

OR
ABA \/ B
falsefalsefalse
falsetruetrue
truefalsetrue
truetruetrue

In the OR truth table, the result is TRUE if at least one input value is TRUE.

There is a direct corollary here with multiplication and addition. Since Excel treats TRUE as 1 and FALSE as 0, we can see how AND and OR correspond with multiplication and addition, respectively.

Multiplication
*=
falsefalsefalse
falsetruefalse
truefalsefalse
truetruetrue

Addition
+=
falsefalsefalse
falsetruetrue
truefalsetrue
truetruetrue

This is possible because anything times zero equals zero (multiplication), while adding zero to itself is the only way to stay at zero (addition).

When you join two variables with AND (or by multiplying them), the expression is TRUE if and only if BOTH values are TRUE.

When you join two variables with OR (or by adding them), the expression is TRUE if at least one value is TRUE.
This leads us to the conclusion that the AND function is the logical equivalent of multiplication, while the OR function is the logical equivalent of addition.

We can see how this applies when using array formulas in Excel.


Array Formulas

One of my favorite array formulas is the INDEX/MATCH combination. It can be used to create some very complex lookups.

The format is:

=INDEX(lookup_column,MATCH(1,((some_value=one_column)*(another_value=another_column)),0))

Where:

  • lookup_column = a range reference (i.e. array of values), usually a column in a table, that contains the value you want to return
  • some_value = a value that would be found in the corresponding row of lookup_column
  • one_column = a range reference (i.e. array of values), usually a column in a table, that contains some_value
  • another_value = a value that would be found in the corresponding row of lookup_column
  • another_column = a range reference (i.e. array of values), usually a column in a table, that contains another_value

In English, this roughly translates to "Return the exact match in lookup_column where, in the corresponding row, some_value is found in one_column and another_value is found in another_column.

Let's apply this to an actual example. I've created a list of fake names courtesy of my Random Data Generator Excel add-in. I want to look up a certain SSN, but the names are split into two columns. How can I make sure I only return the SSN I need?

array index match lookup

After creating the fake data, I created named ranges by going to Insert » Name » Create and making sure only "Top Row" was selected. This is a quick way of making named ranges from static data; each column will be named with the content of its topmost cell; Column A will be named "First_Name", column B will be "Last_Name", and so on (note the underscores replacing spaces in each range name).

Here is the formula being used:

=INDEX(SSN,(MATCH(1,("von"=First_Name)*("tinsley"=Last_Name),0)))

If I kept the names in separate cells and wanted to use cell references instead:

=INDEX(SSN,(MATCH(1,(H4=First_Name)*(H5=Last_Name),0)))

And if I kept the name in a single cell, and wanted to use that as a reference:

=INDEX(SSN,(MATCH(1,((LEFT(H4,FIND(" ",H4)-1)=First_Name)*((RIGHT(H4,LEN(H4)-FIND(" ",H4)))=Last_Name),0)))

(See Excel Tutorial Series – Text Functions for explanation of how to extract first and last names from a single cell)

Remember these are all array formulas, so they must be committed to the cell by pressing Ctrl+Shift+Enter. For that reason, Bill Jelen calls them CSE Formulas.

It's also important to note that the arrays must be the same size, since their values are being compared to each other on a one-to-one basis.

Our formula is returning the value from the SSN column, where the values "von" and "tinsley" are found in the corresponding row.

Now let's examine exactly what the formula is doing.

The MATCH formula takes three parameters:

  • The value to be matched
  • The lookup column
  • The type of search to perform

For more information on the MATCH function, try the Excel Function Reference

The middle parameter is an array, so this is where our truth table lesson comes back into play. Since we're looking up values in two columns, we'll combine them with an AND condition, because we only want a hit (true) when BOTH values are found.

The middle parameter returns an array of true and false values as ones and zeros. So wherever a value in column First_Name equals "von", that part of the array will be a one, and wherever a value in column Last_Name equals "tinsley", that part of the array will be a one. The formula looks something like

=INDEX(SSN,(MATCH(1,({FALSE;FALSE;FALSE})*(FALSE;FALSE;FALSE}),0)))

which can also be expressed as

=INDEX(SSN,(MATCH(1,({0;0;0})*(0;0;0}),0)))

An AND expression only returns TRUE when both sides are TRUE. In this example, the thirteenth values are one, because in the thirteenth row, the value in First_Name and Last_Name equals "von" and "tinsley", respectively. So the formula ends up looking like

=INDEX(SSN,(MATCH(1,{0;0;0;0;0;0;0;0;0;0;0;0;1;0;0;0;0},0)))

which in turn becomes

=INDEX(SSN,13)

and this returns the value in the thirteenth row of the SSN column, which is Von Tinsley's SSN.


SUMPRODUCT

SUMPRODUCT only returns a number, so if you are working with text, use INDEX/MATCH instead. You can still use the SUMPRODUCT, however, if you wanted to count the number of rows that match your criteria. For example, to count the occurrences of the name "Von Tinsley" in the above worksheet:

=SUMPRODUCT(("Von"=First_Name)*("Tinsley"=Last_Name))

or

=SUMPRODUCT((H4=First_Name)*(H5=Last_Name))

This formula will return the number of rows where Von and Tinsley are found in the same row, using the same technique as above (except SUMPRODUCT isn't a CSE formula).

Using SUMPRODUCT we can do things that would normally need Autofilter, for example a count of the number of females in Texas but not in the city of Keyport:

=SUMPRODUCT(("female"=Gender)*("TX"=State)*("Keyport"<>City))

The named ranges really make these formulas easy to read and construct. If your data isn't static, consider using dynamic named ranges.

Note that SUMPRODUCT cannot be used on whole columns in Excel 2003. I believe that has changed with Excel 2007.

Download sample workbook

(Note: Descriptions of each function were taken from: Excel Function Reference)


COUNTIF

COUNTIF(range,criteria)
Counts or sums the number of cells within a range that meet the given criteria.

Arguments:
Range is the range of cells from which you want to count cells.
Criteria is the criteria in the form of a number, expression, cell reference, or text that defines which cells will be counted. For example, criteria can be expressed as 32, "32", ">32", "apples", or B4.

countif

SUMIF

SUMIF(range,criteria,sum_range)
Adds the cells specified by a given criteria.

Arguments:
Range is the range of cells that you want evaluated by criteria.
Criteria is the criteria in the form of a number, expression, or text that defines which cells will be added. For example, criteria can be expressed as 32, "32", ">32", or "apples".
Sum_range are the actual cells to add if their corresponding cells in range match criteria. If Sum_range is omitted, the cells in Range are both evaluated by Criteria and added if they match Criteria.

sumif

Suppose I have two columns of numbers on a worksheet:

countifsumif

This formula will give us the count of all numbers greater than 34:

=COUNTIF(A2:B101,">34")

And this one will sum those numbers:

=SUMIF(A2:B101,">34")

And this compound formula will sum only the numbers between 37 and 40. It works by summing all of the numbers greater than 36, then subtracting the sum of all the numbers greater than 40. You are left with the sum of numbers inbetween.

=SUMIF(A2:B101,">36")-SUMIF(A2:B101,">40")

Download sample workbook

The next post in the tutorial series will have more about arrays and dynamic named ranges, stay tuned!

About JP

I'm just an average guy who writes VBA code for a living. This is my personal blog. Excel and Outlook are my thing, with a sprinkle of Access and Word here and there. Follow this space to learn more about VBA. Keep Reading »



Share This Article:

Share and bookmark this articledelicious buttonfacebook buttonlinkedin buttonstumbleupon buttontwitter button

comment bubble 7 Comment(s) on Excel Tutorial Series – Advanced Lookup/Counting Functions:

  1. I have a problem….. I have a range of data $B$2:$H$22 and i need to find the row in which a specific value occurs.

    Other than rearranging the data, I am not sure how to do this. Any thoughts?

    • The MATCH function will give you the relative row where your value is found. If you wanted to find the row where the number 5 is found in a range of numbers in B2:H22, you would enter this formula anywhere on the sheet:

      =MATCH(5,B2:H22,0)

      and it would tell you what row (if any) it was found. For example, if it returned the number 1, that means the number 5 is in cell B2. And so on.

      HTH

  2. Really? The issue I have with that, is that subsequent to asking the question I have found that in all the examples that I have found, the array (where you have B2:H22) can only be a single column wide.

    So I have used the following equation to help me with my problem…

    It checks if the value exists in the column, and if it does (by
    returning 'FALSE' ironically) then it returns the row number. If it
    doesn't, then it returns a zero. It checks each column in the range
    returning either a zero or the actual number. By summing these results
    you get the correct row number. It wouldn't work if the values in the
    range were not unique…

    =IF(ISNA(MATCH($a1,Sheet1!A$2:A$22,0))=FALSE,MATCH($a1,Sheet1!A$2:A$22,0),0)
    +IF(ISNA(MATCH($a1,Sheet1!B$2:B$22,0))=FALSE,MATCH($a1,Sheet1!B$2:B$22,0),0)

    for each column in the range

    • You're right. For some reason, I missed the multi-column range when writing about the MATCH formula. The following array formula seems to work:

      {=MAX(ROW(1:21)*(B2:H22=A1))}

      Just enter =MAX(ROW(1:21)*(B2:H22=A1)) and press Ctrl+Shift+Enter to commit the formula.

      (I assumed by your reply that the value you are searching for is in A1)

  3. Dennis J Bujan writes:

    Hi there, i was trying to experiment with Excel and lookup functions, I need to find a value of temperature from a list, the first section of the list goes from low to high, and then high to low, for the porpuses of the analisys, I cannot rearrenge all the data to make a sort.

    From the list I need to find a value lets say 124, but when I use the function Lookup, it gives me from a consecutive list of values the last one. For Example I have the following list of numbers in ascending order…

    A 1
    B 2
    C 34
    D 64
    E 124
    F 124
    G 124

    I want to detect the first value of 124 to get a return value of "E", but instead I get a value of "G"

    Also, its posible to use with the lookup function the operators to define a range?

    Thanks and best regards

  4. Dennis J Bujan writes:

    Im using the VLOOKUP FORMULA, the data from the formula looks something like this…

    A B C
    =VLOOKUP(124,D2:AG226,$AF$903-D903)

    A= its the value I want to search
    B= Is the data I want to analize
    C= its a mathematical formula to decrease the index as I make click and drag to fill other cells.

    Thanks.

This article is closed to any future comments.
excel school learn dashboards