Group Functions | Paul's Linux Box

By Paulus, 18 April, 2007

Group functions are also called multiple-row functions. These functions return a result for a group of rows that they process. To add grouping capabilities the SELECT clause will have two new clauses GROUP BY and HAVING.

SELECT * | colname, colname…
FROM tablename
[WHERE condition]
[GROUP BY colname, colname…]
[HAVING group condition];

When using group functions there are two things that need to be kept in mind:

1. Using the DISTINCT keyword will include only unique values. The ALL keyword is defaulted and instructs Oracle to include all values except null.

2. All group functions except the COUNT(*) will ignore NULL values.

SUM( [ Distinct | ALL ] n )

Description:

Takes all the data in the rows and adds them together.

AVG( [ Distinct | ALL ] n )

Description:

Returns the average of all the rows.

COUNT( * | [ Distinct | ALL ] n )

Description:

Counts all rows that have a value in. COUNT can count the records that have a non-NULL or the total records that meet a specific condition which can include NULL values.

MAX( [ DISTINCT | ALL ] n )

Description:

Gets the maximum value of a specified column. This function can be applied with nonnumeric data.

MIN( [ Distinct | ALL ] n )

Description:

Works just like the MAX function but gets the smallest value and returns that...

SUM( [ Distinct | ALL ] n )

Description:

Takes all the data in the rows and adds them together.

STDEV( [ Distinct | ALL ] n )

Description:

This will perform a standard deviation calculation on all the data that’s passed through it to determine how close individual values are to the mean.

VARIANCE( [ Distinct | ALL ] n )

Description:

This function will take all the values that are passed to determine how widely the data is spread in a group.

Grouping Data

Oracle offers a powerful feature where the user can group data. So instead of using multiple SELECT statements and limiting the data, a single query can be issued by using the GROUP BY clause. If a column is selected and a group function is used on another column, Oracle will error out:

SELECT category, TO_CHAR(SUM(cost), '$999.99') “Total”
FROM products;

You will see this error:

ORA-00937: not a single-group group function.

The correct way would be:

SELECT category, TO_CHAR(SUM(cost), '$999.99') “Total”
FROM products
GROUP BY category;

When working with the GROUP BY clause remember these four things:

1. If there is a group function that is being used in the SELECT clause and any individual column is listed, the individual group column must be listed in the GROUP BY clause.

2. The column names that are listed in the GROUP BY clause do not have to be in the SELECT clause. When columns are in the GROUP BY clause they are only there as an identifier of groups.

3. Column aliases cannot be used in the GROUP BY clause.

4. Results returned from a SELECT statement, which included the GROUP BY clause will display the data in ascending order unless the ORDER BY clause is used to specify something different.

Restricting Aggregated Output
The WHERE clause is only for restricting the rows that are selected, in order to restrict groupings the HAVING clause must be used. The HAVING clause works just like the WHERE clause. You may use both WHERE and HAVING in a query. When a query includes all three clauses ( WHERE, GROUP BY, and HAVING ) Oracle will process them in the following order:

WHERE

GROUP BY

HAVING

Enhanced Aggregation for Reporting Oracle extends the GROUP BY clause which allows both aggregation across multiple dimensions or generation of increasing levels of subtotals with a single SELECT statement. One way to achieve aggregated results across multiple dimensions would be to use a series of aggregate queries joined with a UNION operator. The use of multiple queries is not very efficient. To perform the same task, the GROUPING SETS expression is much easier. By using GROUPING SETS this will reduce the overhead that is required by executing multiple queries.

Grouping Sets
The grouping set is the basic building block for the ROLLUP and CUBE which are explained later. The GROUPING SETS are a group of GROUP BY clauses.

SELECT name, catid, COUNT(pn), AVG(cost)
FROM parts JOIN category USING(catid)
GROUP BY GROUPING SETS(name, category_desc, (name, category_desc), ()); The output will display three different groupings. One of the groups will contain both name, and category_desc followed by how many parts there are, and the average cost. The second group will include the category_desc, number of pn (part numbers), and the average cost. The second group will not include the name of the part. The third group will include everything but the category (name, number of part numbers, and average cost). The ‘()’ in the GROUPING SETS tells Oracle that you want to know the total number of part numbers from each category.Cube

The CUBE extension of the GROUP BY clause instructs Oracle to perform aggregations for all possible combinations of the columns indicated. If only a subset of aggregate levels calculated are needed, the GROUPING SETS must be used because CUBE will always perform all aggregation levels. SELECT name, catid, COUNT(pn), AVG(cost)FROM parts JOIN category USING(catid)

GROUP BY GROUP BY CUBE(name, category_desc);

Rollup
ROLLUP is used to create subtotals for each column that is listed. Each column that is listed in the ROLLUP will be subtotaled. SELECT name, catid, COUNT(pn), AVG(cost)FROM parts JOIN category USING(catid)

GROUP BY GROUP BY ROLLUP(name, category_desc);