Relational-Database Design:Pitfalls in Relational-Database Design.
Pitfalls in Relational-Database Design
Before we continue our discussion of normal forms, let us look at what can go wrong in a bad database design. Among the undesirable properties that a bad design may have are:
• Repetition of information
• Inability to represent certain information
We shall discuss these problems with the help of a modified database design for our banking example: In contrast to the relation schema used in Chapters 3 to 6, sup- pose the information concerning loans is kept in one single relation, lending, which is defined over the relation schema
Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)
Figure 7.1 shows an instance of the relation lending (Lending-schema). A tuple t in the lending relation has the following intuitive meaning:
• t[assets] is the asset figure for the branch named t[branch-name].
• t[branch-city] is the city in which the branch named t[branch-name] is located.
• t[loan-number] is the number assigned to a loan made by the branch named t[branch-name] to the customer named t[customer-name].
• t[amount] is the amount of the loan whose number is t[loan-number].
Suppose that we wish to add a new loan to our database. Say that the loan is made by the Perryridge branch to Adams in the amount of $1500. Let the loan-number be L-31. In our design, we need a tuple with values on all the attributes of Lending- schema. Thus, we must repeat the asset and city data for the Perryridge branch, and must add the tuple
to the lending relation. In general, the asset and city data for a branch must appear once for each loan made by that branch.
The repetition of information in our alternative design is undesirable. Repeating information wastes space. Furthermore, it complicates updating the database. Sup- pose, for example, that the assets of the Perryridge branch change from 1700000 to 1900000. Under our original design, one tuple of the branch relation needs to be changed. Under our alternative design, many tuples of the lending relation need to be changed. Thus, updates are more costly under the alternative design than under the original design. When we perform the update in the alternative database, we must ensure that every tuple pertaining to the Perryridge branch is updated, or else our database will show two different asset values for the Perryridge branch.
That observation is central to understanding why the alternative design is bad. We know that a bank branch has a unique value of assets, so given a branch name we can uniquely identify the assets value. On the other hand, we know that a branch may make many loans, so given a branch name, we cannot uniquely determine a loan number. In other words, we say that the functional dependency
branch-name → assets
holds on Lending-schema, but we do not expect the functional dependency branch- name → loan-number to hold. The fact that a branch has a particular value of assets, and the fact that a branch makes a loan are independent, and, as we have seen, these facts are best represented in separate relations. We shall see that we can use functional dependencies to specify formally when a database design is good.
Another problem with the Lending-schema design is that we cannot represent directly the information concerning a branch (branch-name, branch-city, assets) unless there exists at least one loan at the branch. This is because tuples in the lending relation require values for loan-number, amount, and customer-name.
One solution to this problem is to introduce null values, as we did to handle updates through views. Recall, however, that null values are difficult to handle, as we saw in Section 3.3.4. If we are not willing to deal with null values, then we can create the branch information only when the first loan application at that branch is made.
Worse, we would have to delete this information when all the loans have been paid.
Clearly, this situation is undesirable, since, under our original database design, the branch information would be available regardless of whether or not loans are currently maintained in the branch, and without resorting to null values.
Comments
Post a Comment