Nested Subqueries

SQL provides a mechanism for nesting subqueries. A subquery is a select-from- where expression that is nested within another query. A common use of subqueries is to perform tests for set membership, make set comparisons, and determine set car- dinality. We shall study these uses in subsequent sections.

Set Membership

SQL draws on the relational calculus for operations that allow testing tuples for membership in a relation. The in connective tests for set membership, where the set is a collection of values produced by a select clause. The not in connective tests for the absence of set membership. As an illustration, reconsider the query “Find all customers who have both a loan and an account at the bank.” Earlier, we wrote such a query by intersecting two sets: the set of depositors at the bank, and the set of borrowers from the bank. We can take the alternative approach of ﬁnding all account holders at the bank who are members of the set of borrowers from the bank. Clearly, this formulation generates the same results as the previous one did, but it leads us to write our query using the in connective of SQL. We begin by ﬁnding all account holders, and we write the subquery

(select customer-name

from depositor)

We then need to ﬁnd those customers who are borrowers from the bank and who appear in the list of account holders obtained in the subquery. We do so by nesting the subquery in an outer select. The resulting query is

select distinct customer-name

from borrower

where customer-name in (select customer-name

from depositor)

This example shows that it is possible to write the same query several ways in SQL. This ﬂexibility is beneﬁcial, since it allows a user to think about the query in the way that seems most natural. We shall see that there is a substantial amount of redundancy in SQL.

In the preceding example, we tested membership in a one-attribute relation. It is also possible to test for membership in an arbitrary relation in SQL. We can thus write the query “Find all customers who have both an account and a loan at the Perryridge branch” in yet another way:

We use the not in construct in a similar way. For example, to ﬁnd all customers who do have a loan at the bank, but do not have an account at the bank, we can write

select distinct customer-name

from borrower

where customer-name not in (select customer-name

from depositor)

The in and not in operators can also be used on enumerated sets. The following query selects the names of customers who have a loan at the bank, and whose names are neither Smith nor Jones.

select distinct customer-name

from borrower

where customer-name not in (’Smith’, ’Jones’)

Set Comparison

As an example of the ability of a nested subquery to compare sets, consider the query “Find the names of all branches that have assets greater than those of at least one branch located in Brooklyn.” In Section 4.2.5, we wrote this query as follows:

select distinct T.branch-name

from branch as T, branch as S

where T.assets > S.assets and S.branch-city = ’Brooklyn’

SQL does, however, offer an alternative style for writing the preceding query. The phrase “greater than at least one” is represented in SQL by > some. This construct allows us to rewrite the query in a form that resembles closely our formulation of the query in English.

generates the set of all asset values for all branches in Brooklyn. The > some comparison in the where clause of the outer select is true if the assets value of the tuple is greater than at least one member of the set of all asset values for branches in Brooklyn.

SQL also allows < some, <= some, >= some, = some, and <> some comparisons. As an exercise, verify that = some is identical to in, whereas <> some is not the same as not in. The keyword any is synonymous to some in SQL. Early versions of SQL allowed only any. Later versions added the alternative some to avoid the linguistic ambiguity of the word any in English.

Now we modify our query slightly. Let us ﬁnd the names of all branches that have an asset value greater than that of each branch in Brooklyn. The construct > all corresponds to the phrase “greater than all.” Using this construct, we write the query as follows:

As it does for some, SQL also allows < all, <= all, >= all, = all, and <> all comparisons. As an exercise, verify that <> all is identical to not in.

As another example of set comparisons, consider the query “Find the branch that has the highest average balance.” Aggregate functions cannot be composed in SQL. Thus, we cannot use max (avg (.. .)). Instead, we can follow this strategy: We begin by writing a query to ﬁnd all average balances, and then nest it as a subquery of a larger query that ﬁnds those branches for which the average balance is greater than or equal to all average balances:

Test for Empty Relations

SQL includes a feature for testing whether a subquery has any tuples in its result. The exists construct returns the value true if the argument subquery is nonempty. Using the exists construct, we can write the query “Find all customers who have both an account and a loan at the bank” in still another way:

We can test for the nonexistence of tuples in a subquery by using the not exists construct. We can use the not exists construct to simulate the set containment (that is, superset) operation: We can write “relation A contains relation B” as “not exists (B except A).” (Although it is not part of the SQL-92 and SQL:1999 standards, the contains operator was present in some early relational systems.) To illustrate the not exists operator, consider again the query “Find all customers who have an ac- count at all the branches located in Brooklyn.” For each customer, we need to see whether the set of all branches at which that customer has an account contains the set of all branches in Brooklyn. Using the except construct, we can write the query as follows:

ﬁnds all the branches at which customer S.customer-name has an account. Thus, the outer select takes each customer and tests whether the set of all branches at which that customer has an account contains the set of all branches located in Brooklyn.

In queries that contain subqueries, a scoping rule applies for tuple variables. In a subquery, according to the rule, it is legal to use only tuple variables deﬁned in the subquery itself or in any query that contains the subquery. If a tuple variable is deﬁned both locally in a subquery and globally in a containing query, the local deﬁnition applies. This rule is analogous to the usual scoping rules used for variables in programming languages.

Test for the Absence of Duplicate Tuples

SQL includes a feature for testing whether a subquery has any duplicate tuples in its result. The unique construct returns the value true if the argument subquery contains no duplicate tuples. Using the unique construct, we can write the query “Find all customers who have at most one account at the Perryridge branch” as follows:

Formally, the unique test on a relation is deﬁned to fail if and only if the relation contains two tuples t1 and t2 such that t1 = t2. Since the test t1 = t2 fails if any of the ﬁelds of t1 or t2 are null, it is possible for unique to be true even if there are multiple copies of a tuple, as long as at least one of the attributes of the tuple is null.

Search This Blog

Database Management System course