 Excel 2019/365: "Database" functions

Updated 2019-08-06

Objectives

• Learn how to use database functions to calculate DAverage, DCount, DMax, DMin, DStddev, DVar (the "D" functions only apply to records that meet given criteria).
• Sample data file: "Fruit.txt"

Database functions

Excel has functions for calculating simple statistics on ranges of data. Some of the most common are:

• The Count functions count the cells in a range. There are several variations:

·       Count(range)             Counts only the cells that contain numbers (or formulas that evaluate to numbers)

·       CountA(range)           Counts all cells that are not empty (cells with either numbers, formulas, or text)

·       CountBlank(range)     Counts all of the blank cells (cells that are completely empty)

·       CountIf(range, criteria)        Counts all of the cells that match a specified condition

• The Min and Max functions determine the largest value in a range.

·       Max(range)                Returns the maximum value from the range.

·       Min(range)                Returns the minimum value from the range.

• Common statistical values can be calculated:

·       StdDev(range)           Calculates the standard deviation

·       Var(range)                 Calculates the variance

·       Average(range)          Calculates the average (mean)

·       Median(range)           Calculates the median (middle value)

·       Mode(range)              Calculates the mode (most frequent value)

• The Sum functions total the values in a range:

·       Sum(range)               Sums all of the numbers in the range

·       SumIf(range, criteria) Sums all of the numbers in the range that meet the specified criteria

The "D" functions are similar to the functions described above, except that they (1) must be applied to a "database" (a list of records, one per row in Excel), and (2) they only apply the calculation to the values that meet a given condition (a "filter").

Using simple database functions

In addition to the functions Sum, Min, Max, Average, and Count (from the Autosum button on the Home tab), Excel has similar "database" versions of each of these functions: Dsum, Dmin, Dmax, Daverage, and Dcount. Each of these "D" functions performs a calculation over a range, but only on the records that match the criteria in the criteria range. In all of the following examples, the arguments are:

·        Database: refers to a list of records

·        Column: can be either text (the name at the top of the column) or a number indicating the column's position in the database.

·        Criteria range: refers to a range with the same column headings as the database with at least one row below the headings that has the criteria used to determine if a row in the database should be selected or not.

DSum(database, column, criteria range).

Sums all of the values in the given column of the data range that meet the criteria specified in the criteria range.

DMin(database, column, criteria range).

Finds the minimum of the values in the given column of the data range that meet the criteria specified in the criteria range.

DMax(database, column, criteria range).

Finds the maximum of the values in the given column of the data range that meet the criteria specified in the criteria range.

DAverage(database, column, criteria range).

Finds the average of the values in the given column of the data range that meet the criteria specified in the criteria range.

DCount(database, column, criteria range).

Counts the number of cells in the given column of the data range that meet the criteria specified in the criteria range.

There are also some less-common "D" functions (we won't use them; they are listed here just for your information):

DstdDev(database, column, criteria range).

Finds the standard deviation of the values in the given column of the data range that meet the criteria specified in the criteria range.

Dvar(database, column, criteria range).

Finds the variance of the values in the given column of the data range that meet the criteria specified in the criteria range.

Dproduct(database, column, criteria range).

Multiplies all of the values in the given column of the data range that meet the criteria specified in the criteria range.

Examples

Import the file Fruit.txt into Excel. This is a tab-delimited file. To import a text file from the browser, you need to do the following:

·       From the browser, select the entire file (<ctrl>+a).

·       Copy the file to the clipboard (<ctrl>+c).

·       Paste the file into Excel, but you can't do an ordinary paste command.

·       On the Home tab, click on the down arrow on the Paste button.

·       Click on Paste Special. ·       When the Paste Special dialog box appears, click on Text. ·       Click on the OK button. ·       The text should be correctly pasted into Excel.

Assigning names

Assigning names to an entire table.

Assign a name to the entire table (call it TreeTable—note that there is no blank; blanks are forbidden in Excel names). To assign a name to the entire table:

·       Select the table, including the headings.

·       In the Name Box, enter the text TreeTable (no spaces): ·       From now on, instead of typing A5:E11, you can just type TreeTable.

Assign names to columns

To assign names to columns:

·        Select the entire table, including the column headings.

·        Click on the Formulas tab.

·        In the Defined Names section, click on the Create from Selection button.

·        In the Create Names from Selection dialog box, click on the Top Row check box to turn it on (if necessary), and turn all of the others off (if necessary).

·        Click on OK. Each column will be named using the names in the top row (Tree, Height, Age, Yield, and Profit). This is much easier than naming each column separately. You can check it out by selecting one of the columns and checking in the Name Box: Note: When using the "D" functions, if you give the column name rather than the column number, you must give the column name as a character string (text with quotation marks), not as a name (names do not have quotation marks). For example, "Age" (a string) rather than Age (a name).

Note: The "D" functions described below only count cells that have numeric data in them. It only makes sense that you must have numeric data for the min, max, sum, and average. But numeric data is also required for the dCount function. For example, do not try to count the number of names in a list using the "dCount" function. If you want to count all nonblank cells, use the "dCountA" function.

Use database functions to calculate dmin, dmax, dcount, dsum, daverage, dstddev, dvar (the "D" functions only apply to records that meet given criteria).

DCOUNT Example:

To count the records of Apple trees that have a height between 10 and 16 feet and that have a number in the "Age" field, use either one of the following:

DCOUNT(TreeTable, "Height", A1:F2)

DCOUNT(TreeTable, 2, A1:F2)

Use this formula (Note that we are not using row 3 in our criteria yet. Also note that we have added another column (column F) because we have two conditions for the Height column): This will return the value 1. This function looks at the records of apple trees between a height of 10 and 16 and counts how many of the Height fields in those records contain numbers. Note that we could have used any numeric column (Height, Age, Yield, or Profit) instead of the Height column because all of those columns have numbers in all of their rows. We could not use the Tree column because its values are not numeric.

DMAX Example:

Find the maximum profit of apple and pear trees. Use one of these formulas (Note that we are only using column A in our criteria.):

DMAX(TreeTable, "Profit", A1:A3)

DMAX(TreeTable, 5, A1:A3)

This will return the value \$105.00, the maximum profit of apple and pear trees. DMIN Example:

Find the minimum profit on apple trees that are more than 10 feet tall. Use one of these formulas:

DMIN(TreeTable, "Profit", A1:B2)

DMIN(TreeTable, 5, A1:B2)

This will return the value \$75.00, the minimum profit of apple trees that have a height greater than 10. DSUM Example:

Find the total profit on all apple trees. Use one of these formulas:

DSUM(TreeTable, "Profit", A1:A2)

DSUM(TreeTable, 5, A1:A2)

This will return the value \$225.00, the total profit from apple trees. DSUM Example:

Find the total profit on all apple trees whose height is greater than 10 feet. Use one of these formulas:

DSUM(TreeTable, "Profit", A1:B2)

DSUM(TreeTable, 5, A1:B2)

This will return the value \$180.00, the total profit from apple trees with a height greater than 10 feet. DAVERAGE Example:

Find the average yield on all apple trees that are over 10 feet tall. Use one of these formulas:

DAVERAGE(TreeTable, "Yield", A1:B2)

DAVERAGE(TreeTable, 4, A1:B2)

This will return the value 12, the average yield of apple trees over 10 feet in height. DCOUNTA Example:

Count the records of Apple trees that have a height between 10 and 16 feet and that have non-blanks in the "Tree" field. Note that this function is an exception to the second note above. It will count not just cells with numbers, but all non-blank cells. Also note that "between" in this case does not include the endpoints (10 and 16). This is not always the interpretation of "between" in Excel. Use one of the following formulas:

DCOUNTA(TreeTable, "Tree", A1:F2)

DCOUNTA(TreeTable, 1, A1:F2)

This will return the value 1. This function looks at the records of apple trees between a height of 10 and 16 and counts how many of the "Tree" fields in those records are not blank. Note that we could have used any column (numeric or not) instead of the Tree column because all of those columns have non-blank values in all of their rows. The following functions are for information purposes only. You are not responsible for them.

DSTDDEV Example:

Find the standard deviation of the yield of apple and pear trees. Use this formula:

DSTDEV(TreeTable,"Yield",A1:A3)

This will return the value 2.97, the estimated standard deviation in the yield of apple and pear trees if the data in the database is only a sample of the total orchard population. DVAR Example:

Find the variance of the yield for apple and pear trees. Use this formula:

DVAR(TreeTable,"Yield",A1:A3)

This will return the value 8.8, the estimated variance in the yield of apple and pear trees if the data in the database is only a sample of the total orchard population. DPRODUCT Example:

Find the product of the yields of apple trees with a height greater than 10. (Multiplying these numbers together really doesn't make any sense.) Use this formula:

DPRODUCT(TreeTable,"Yield",A1:B2)

This will return the value 140, the product of the yields from apple trees with a height greater than 10. 