Subqueries and Merges

By Paulus, 4 May, 2007

HTML Source EditorWord wrap Sub queries come in handy when you want to find a record but need to know what exactly to look for. For example, if you want to find all the items that are more expensive than a specific item. Instead of issuing two SELECT statements, you can use sub queries to find the answer in just one query.  

 

When designing Sub queries, the first query that is designed is the sub query. In order to execute sub queries correctly, you must determine what it is that you are looking for. After successfully testing the sub query, it is this that will be passed as input to the parent query or the outer query.  

 

 

 

Sub query
Description
Single-row sub query Returns to the outer one row of results that consists of one column.
Multiple-row sub query Returns to the out query more than one row of results.
Multiple-column sub query Returns to the outer query more than one column of results.
Correlated sub query References a column in the outer query, and executes the sub query once for every row in the out query.
Uncorrelated sub query Executes the sub query first and passes the value to the outer query.
MERGE statement Conditionally processes a series of DML statements.

Rules of the sub query:  

 

  • The sub query must be a complete query in itself. This means that the sub query must contain at least a SELECT and FROM clause. 

  • An ORDER BY clause cannot be used in a sub query. This is reserved for the parent. 

  • The sub query must be enclosed between parentheses to separate it from the outer query. 

  • When using a sub query in a WHERE or HAVING clause the sub query must be on the right side of the comparison operator. 

Single-Row Sub queries

 

 

A single-row sub query is used when the results of the outer query are based on a single, unknown value. However, a single-row sub query can return a row of data that consists of only column. The single-row sub query which does this is called a single value.  

 

SELECT name, part#, cost
FROM products
WHERE cost >  (SELECT cost
                           FROM products
                            WHERE part# = 9965)
AND parttype = 'Hardware';

The sub query, which is enclosed in parentheses, only returns one value. This value is used in the outer query to return one or more rows. When using a single-row sub query in a WHERE clause the only operators that can be used are: =, >, <, >=, <=, <>, and IN. These single row operators instruct Oracle that only one value is expected and if more than one result is returned then Oracle will error out.

Multiple-Row Sub Queries

Multiple-row sub queries return more than one row of results to the outer query. These types of sub queries are commonly used in the WHERE and HAVING clauses. When using a multiple-row sub query in the HAVING and WHERE clauses, you must use the multiple-row operators. These operators include the IN ALL, and ANY If a single row comparison operator is used on a multiple-row sub query then Oracle will error out. Of all the multiple-row operators, the IN is most commonly used.

SELECT name, part#, parttype
FROM products
WHERE cost IN (    SELECT MAX(cost)
                                FROM products
                                GROUP BY category)
ORDER BY parttype;

The ANY and ALL operators can be combined with other comparison operators to treat the results of a sub query as a set of values.

Operator Description
>ALL More than the highest value returned by the sub query.
Less than the lowest value returned by the sub query.
Less than the highest value returned by the sub query.
>ANY More than the lowest value returned by the sub query.
=ANY Equal to any value returned by the sub query. This is the same as using the IN operator.

Exists Operator

The EXISTS operator is used to determine whether a condition is present in a sub query or not. Take for example the following query:

SELECT part_name
FROM products
WHERE EXISTS (      SELECT part#
                                    FROM orderitems
                                    WHERE products.part# = orderitems.part#
);

The sub query selects all the parts that have been ordered. The outer query selects all the products. The EXISTS takes the two results and compares them. If the product exists in both tables then it's added to the end result. If you wanted to find all the products that that have not been ordered. The above query can be modified by inserting a NOT operator in front of EXISTS.

Multiple-Column Sub queries

Multiple-column sub queries return more than one columns per row. They can be included in the FROM, WHERE, or HAVING clause of a query. If a multiple-column sub query is used in a FROM clause Oracle creates a temporary table, which is called an inline view, to hold the data and can be referenced.

SELECT p.part_name, p.cost, p.parttype, a.parttypeavg
FROM products p, ( SELECT parttype, AVG(cost) parttypeavg
                                 FROM products
                                GROUP BY parttype) a
WHERE p.parttype = a.parttype
AND p.cost > a.parttypeavg;

You could have also done this using a JOIN.

SELECT p.part_name, p.cost, p.parttype, a.parttypeavg
FROM products p JOIN (SELECT parttype, AVG(cost) parttypeavg
                                        FROM products
                                        GROUP BY parttype)
USING(parttype)
WHERE cost > parttypeavg;

In the case of using multiple-column sub queries in the WHERE and HAVING clause of the outer query, the IN operator is used by the outer query to evaluate the results of the sub query. When using a sub query in the WHERE and HAVING clause, remember these rules:

  • Since the WHERE clause contains more than one column name, the column list must be enclosed within parentheses.
  • Column names listed in the WHERE clause must be in the same order as they appear in the SELECT clause of the sub query.

If there is any doubt that a sub query will return a NULL value, the NVL function should be used to prevent the full query from returning no rows. There are two things to keep in mind when using the NVL function in the sub query:

  1. Substitution of the NULL value must occur for the NULL value both in the sub query and in the outer query.
  2. The value substituted for the NULL value must be one that could not exist anywhere else in the column.

Passing a NULL value from a sub query does create problems, searching for NULL values are allowed in a sub query.

Correlated Sub queries

The previous types of sub queries are uncorrelated, where the sub query is executed then the outer query is executed last. In order to implement a correlated sub query, the EXISTS keyword is used. When the EXISTS keyword is used, the query is taken as a whole and each row is processed at a time. The outer query is executed, followed by the sub query which then they are compared.

MERGE Statement

With a MERGE statement, a series of DML actions can occur with a single SQL statement. This comes in handy in Data Warehousing, when another table needs to be updated based on another table. One table may be used for updating, while the other table is used for querying an reporting. By doing this it will reduce the time it takes to query the table.

MERGE INTO table1
USING table2
ON (col1 = col2)
WHEN MATCHED THEN
update query
WHEN NOT MATCHED THEN
insert query

The WHEN MATCHED and WHEN NOT MATCHED are not necessary unless they are needed by the application. WHERE clauses can be used, basically the update and insert queries can be whatever the application calls for. They are full blown queries, so if you wanted to, you could use