How to Use Nested SQL Queries that Return Sets of Rows

By Allen G. Taylor

To illustrate how a nested SQL query returns a set of rows, imagine that you work for a systems integrator of computer equipment. Your company, Zetec Corporation, assembles systems from components that you buy, and then it sells them to companies and government agencies. You keep track of your business with a relational database.

The database consists of many tables, but right now you’re concerned with only three of them: the PRODUCT table, the COMP_USED table, and the COMPONENT table. The PRODUCT table contains a list of all your standard products.

Column Type Constraints
Model CHAR (6) PRIMARY KEY
ProdName CHAR (35)
ProdDesc CHAR (31)
ListPrice NUMERIC (9,2)

The COMPONENT table lists components that go into your products.

Column Type Constraints
CompID CHAR (6) PRIMARY KEY
CompType CHAR (10)
CompDesc CHAR (31)

The COMP_USED table tracks which components go into each product.

Column Type Constraints
Model CHAR (6) FOREIGN KEY (for PRODUCT)
CompID CHAR (6) FOREIGN KEY (for COMPONENT)

A component may be used in multiple products, and a product can contain multiple components (a many-to-many relationship). This situation can cause integrity problems. To circumvent the problems, create the linking table COMP_USED to relate COMPONENT to PRODUCT. A component may appear in many rows in the COMP_USED table, but each of those rows will reference only one component (a one-to-many relationship).

Similarly, a product may appear in many rows in COMP_USED, but each row references only one product (another one-to-many relationship). Adding the linking table transforms a troublesome many-to-many relationship into two relatively simple one-to-many relationships. This process of reducing the complexity of relationships is one example of normalization.

Subqueries introduced by the keyword IN

One form of a nested query compares a single value with the set of values returned by a SELECT statement. It uses the IN predicate with the following syntax:

SELECT column_list
 FROM table
 WHERE expression IN (subquery) ;

The expression in the WHERE clause evaluates to a value. If that value is IN the list returned by the subquery, then the WHERE clause returns a True value. The specified columns from the table row being processed are added to the result table. The subquery may reference the same table referenced by the outer query, or it may reference a different table.

In the following example, the Zetec’s database is used to demonstrate this type of query. Assume that there’s a shortage of computer monitors in the computer industry, so that when you run out of monitors, you can no longer deliver products that include them. You want to know which products are affected. Glancing gratefully at your own monitor, enter the following query:

SELECT Model
 FROM COMP_USED
 WHERE CompID IN
  (SELECT CompID
   FROM COMPONENT
   WHERE CompType = ‘Monitor’) ;

SQL processes the innermost query first, so it processes the COMPONENT table, returning the value of CompID for every row where CompType is ‘Monitor’. The result is a list of the ID numbers of all monitors. The outer query then compares the value of CompID in every row in the COMP_USED table against the list.

If the comparison is successful, the value of the Model column for that row is added to the outer SELECT’s result table. The result is a list of all product models that include a monitor. The following example shows what happens when you run the query:

Model
-----
CX3000
CX3010
CX3020
MB3030
MX3020
MX3030

You now know which products will soon be out of stock. It’s time to go to the sales force and tell them to slow down on promoting these products.

When you use this form of nested query, the subquery must specify a single column, and that column’s data type must match the data type of the argument preceding the IN keyword.

Hopefully, you remember the KISS principle. Keeping things simple is important when you’re dealing with software of any kind, but it is especially important when dealing with database software. Statements that include nested SELECTs can be difficult to get right.

One way to get them working the way they should is to run the inner SELECT all by itself first and then verify that the result you get is the result you expect. When you’re sure the inner SELECT is functioning properly, you can enclose it in the outer part of the statement and have a better chance that the whole thing will work as advertised.

Subqueries introduced by the keyword NOT IN

Just as you can introduce a subquery with the IN keyword, you can do the opposite and introduce it with the NOT IN keywords. In fact, now is a great time for Zetec management to make such a query. Zetec management found out what products not to sell. That is valuable information, but what Zetec management really wants to know is what products to sell.

Management wants to emphasize the sale of products that don’t contain monitors. A nested query featuring a subquery introduced by the NOT IN keywords provides the requested information:

SELECT Model
 FROM COMP_USED
 WHERE CompID NOT IN
  (SELECT CompID
   FROM COMPONENT
   WHERE CompType = 'Monitor')) ;

This query produces the following result:

Model
-----
PX3040
PB3050
PX3040
PB3050

In the example, the number of rows does not create a problem because the result table is short. In the real world, such a result table may have hundreds or thousands of rows. To avoid confusion, it’s best to eliminate duplicates. You can do so easily by adding the DISTINCT keyword to the query. Only rows that are distinct from all previously retrieved rows are added to the result table:

SELECT DISTINCT Model
 FROM COMP_USED
 WHERE CompID NOT IN
  (SELECT CompID
   FROM COMPONENT
   WHERE CompType = 'Monitor')) ;

As expected, the result is as follows:

Model
-----
PX3040
PB3050