Nested Subqueries

Nested Subqueries

SQL provides a mechanism for nesting subqueries. A subquery is a select-from- where expression that is nested within another query. A common use of subqueries is to perform tests for set membership, make set comparisons, and determine set car- dinality. We shall study these uses in subsequent sections.

Set Membership

SQL draws on the relational calculus for operations that allow testing tuples for membership in a relation. The in connective tests for set membership, where the set is a collection of values produced by a select clause. The not in connective tests for the absence of set membership. As an illustration, reconsider the query “Find all customers who have both a loan and an account at the bank.” Earlier, we wrote such a query by intersecting two sets: the set of depositors at the bank, and the set of borrowers from the bank. We can take the alternative approach of finding all account holders at the bank who are members of the set of borrowers from the bank. Clearly, this formulation generates the same results as the previous one did, but it leads us to write our query using the in connective of SQL. We begin by finding all account holders, and we write the subquery

(select customer-name

from depositor)

We then need to find those customers who are borrowers from the bank and who appear in the list of account holders obtained in the subquery. We do so by nesting the subquery in an outer select. The resulting query is

select distinct customer-name

from borrower

where customer-name in (select customer-name

from depositor)

This example shows that it is possible to write the same query several ways in SQL. This flexibility is beneficial, since it allows a user to think about the query in the way that seems most natural. We shall see that there is a substantial amount of redundancy in SQL.

In the preceding example, we tested membership in a one-attribute relation. It is also possible to test for membership in an arbitrary relation in SQL. We can thus write the query “Find all customers who have both an account and a loan at the Perryridge branch” in yet another way:

image

We use the not in construct in a similar way. For example, to find all customers who do have a loan at the bank, but do not have an account at the bank, we can write

select distinct customer-name

from borrower

where customer-name not in (select customer-name

from depositor)

The in and not in operators can also be used on enumerated sets. The following query selects the names of customers who have a loan at the bank, and whose names are neither Smith nor Jones.

select distinct customer-name

from borrower

where customer-name not in (’Smith’, ’Jones’)

Set Comparison

As an example of the ability of a nested subquery to compare sets, consider the query “Find the names of all branches that have assets greater than those of at least one branch located in Brooklyn.” In Section 4.2.5, we wrote this query as follows:

select distinct T.branch-name

from branch as T, branch as S

where T.assets > S.assets and S.branch-city = ’Brooklyn’

SQL does, however, offer an alternative style for writing the preceding query. The phrase “greater than at least one” is represented in SQL by > some. This construct allows us to rewrite the query in a form that resembles closely our formulation of the query in English.

image

generates the set of all asset values for all branches in Brooklyn. The > some comparison in the where clause of the outer select is true if the assets value of the tuple is greater than at least one member of the set of all asset values for branches in Brooklyn.

SQL also allows < some, <= some, >= some, = some, and <> some comparisons. As an exercise, verify that = some is identical to in, whereas <> some is not the same as not in. The keyword any is synonymous to some in SQL. Early versions of SQL allowed only any. Later versions added the alternative some to avoid the linguistic ambiguity of the word any in English.

Now we modify our query slightly. Let us find the names of all branches that have an asset value greater than that of each branch in Brooklyn. The construct > all corresponds to the phrase “greater than all.” Using this construct, we write the query as follows:

image

As it does for some, SQL also allows < all, <= all, >= all, = all, and <> all comparisons. As an exercise, verify that <> all is identical to not in.

As another example of set comparisons, consider the query “Find the branch that has the highest average balance.” Aggregate functions cannot be composed in SQL. Thus, we cannot use max (avg (.. .)). Instead, we can follow this strategy: We begin by writing a query to find all average balances, and then nest it as a subquery of a larger query that finds those branches for which the average balance is greater than or equal to all average balances:

image

Test for Empty Relations

SQL includes a feature for testing whether a subquery has any tuples in its result. The exists construct returns the value true if the argument subquery is nonempty. Using the exists construct, we can write the query “Find all customers who have both an account and a loan at the bank” in still another way:

image

We can test for the nonexistence of tuples in a subquery by using the not exists construct. We can use the not exists construct to simulate the set containment (that is, superset) operation: We can write “relation A contains relation B” as “not exists (B except A).” (Although it is not part of the SQL-92 and SQL:1999 standards, the contains operator was present in some early relational systems.) To illustrate the not exists operator, consider again the query “Find all customers who have an ac- count at all the branches located in Brooklyn.” For each customer, we need to see whether the set of all branches at which that customer has an account contains the set of all branches in Brooklyn. Using the except construct, we can write the query as follows:

image

finds all the branches at which customer S.customer-name has an account. Thus, the outer select takes each customer and tests whether the set of all branches at which that customer has an account contains the set of all branches located in Brooklyn.

In queries that contain subqueries, a scoping rule applies for tuple variables. In a subquery, according to the rule, it is legal to use only tuple variables defined in the subquery itself or in any query that contains the subquery. If a tuple variable is defined both locally in a subquery and globally in a containing query, the local definition applies. This rule is analogous to the usual scoping rules used for variables in programming languages.

Test for the Absence of Duplicate Tuples

SQL includes a feature for testing whether a subquery has any duplicate tuples in its result. The unique construct returns the value true if the argument subquery contains no duplicate tuples. Using the unique construct, we can write the query “Find all customers who have at most one account at the Perryridge branch” as follows:

image

Formally, the unique test on a relation is defined to fail if and only if the relation contains two tuples t1 and t2 such that t1 = t2. Since the test t1 = t2 fails if any of the fields of t1 or t2 are null, it is possible for unique to be true even if there are multiple copies of a tuple, as long as at least one of the attributes of the tuple is null.

Comments

Popular posts from this blog

Database System Architectures:Parallel Systems.

DATABASE DESIGN -2 part2

Database System Architectures:Network Types