Relational-Database Design:Decomposition

Decomposition

The bad design of Section 7.2 suggests that we should decompose a relation schema that has many attributes into several schemas with fewer attributes. Careless decomposition, however, may lead to another form of bad design.

Consider an alternative design in which we decompose Lending-schema into the following two schemas:

Branch-customer-schema = (branch-name, branch-city, assets, customer-name) Customer-loan-schema = (customer-name, loan-number, amount)

Using the lending relation of Figure 7.1, we construct our new relations branch-customer (Branch-customer) and customer-loan (Customer-loan-schema):

branch-customer = Πbranch -name, branch -city , assets, customer -name (lending)

customer -loan = Πcustomer -name, loan-number , amount (lending)

Figures 7.9 and 7.10, respectively, show the resulting branch-customer and customer- name relations.

Of course, there are cases in which we need to reconstruct the loan relation. For example, suppose that we wish to find all branches that have loans with amounts less than $1000. No relation in our alternative database contains these data. We need to reconstruct the lending relation. It appears that we can do so by writing

image

image

Figure 7.11 shows the result of computing branch-customer customer -loan. When we compare this relation and the lending relation with which we started (Figure 7.1), we notice a difference: Although every tuple that appears in the lending relation appears in branch-customer customer -loan, there are tuples in branch-customer customer -loan that are not in lending. In our example, branch-customer has the following additional tuples:

image

Consider the query, “Find all bank branches that have made a loan in an amount less than $1000.” If we look back at Figure 7.1, we see that the only branches with loan amounts less than $1000 are Mianus and Round Hill. However, when we apply the expression

image

we obtain three branch names: Mianus, Round Hill, and Downtown.

A closer examination of this example shows why. If a customer happens to have several loans from different branches, we cannot tell which loan belongs to which branch. Thus, when we join branch-customer and customer-loan, we obtain not only the tuples we had originally in lending, but also several additional tuples. Althoughwe have more tuples in branch-customer customer -loan, we actually have less information. We are no longer able, in general, to represent in the database information about which customers are borrowers from which branch. Because of this loss of information, we call the decomposition of Lending-schema into Branch-customer-schema and customer-loan-schema a lossy decomposition, or a lossy-join decomposition. A decomposition that is not a lossy-join decomposition is a lossless-join decomposi-

image

Thus, the only way that we can represent a relationship between, for example, customer-name and assets is through branch-name. The difference between this example and the preceding one is that the assets of a branch are the same, regardless of the customer to which we are referring, whereas the lending branch associated with a certain loan amount does depend on the customer to which we are referring. For a given branch-name, there is exactly one assets value and exactly one branch-city;

whereas a similar statement cannot be made for customer-name. That is, the functional dependency

branch-name assets branch-city

holds, but customer-name does not functionally determine loan-number.

The notion of lossless joins is central to much of relational-database design. There- fore, we restate the preceding examples more concisely and more formally. Let R be a relation schema. A set of relation schemas {R1, R2,... , Rn} is a decomposition of R if

image

Note that the relations in Figures 7.1 and 7.11 are not the same.

To have a lossless-join decomposition, we need to impose constraints on the set of possible relations. We found that the decomposition of Lending-schema into Branch schema and Loan-info-schema is lossless because the functional dependency

branch-name branch-city assets

holds on Branch-schema.

Later in this chapter, we shall introduce constraints other than functional dependencies. We say that a relation is legal if it satisfies all rules, or constraints, that we impose on our database.

Let C represent a set of constraints on the database, and let R be a relation schema. A decomposition {R1, R2,... , Rn} of R is a lossless-join decomposition if, for all relations r on schema R that are legal under C,

image

We shall show how to test whether a decomposition is a lossless-join decomposition in the next few sections. A major part of this chapter deals with the questions of how to specify constraints on the database, and how to obtain lossless-join decompositions that avoid the pitfalls represented by the examples of bad database designs that we have seen in this section.

Comments

Popular posts from this blog

Database System Architectures:Parallel Systems.

DATABASE DESIGN -2 part2

Database System Architectures:Network Types