Object-Relational Databases: Nested Relations
Object-Relational Databases
Persistent programming languages add persistence and other database features to existing programming languages by using an existing object-oriented type system. In contrast, object-relational data models extend the relational data model by providing a richer type system including complex data types and object orientation. Relational query languages, in particular SQL, need to be correspondingly extended to deal with the richer type system. Such extensions attempt to preserve the relational foundations (in particular, the declarative access to data) while extending the modeling power. Object-relational database systems (that is, database systems based on the object-relational model) provide a convenient migration path for users of relational databases who wish to use object-oriented features.
We first present the motivation for the nested relational model, which allows relations that are not in first normal form, and allows direct representation of hierarchical structures. We then show how to extend SQL by adding a variety of object-relational features. Our discussion is based on the SQL:1999 standard.
Finally, we discuss differences between persistent programming languages and object-relational systems, and mention criteria for choosing between them.
Nested Relations
In Chapter 7, we defined first normal form (1NF), which requires that all attributes have atomic domains. Recall that a domain is atomic if elements of the domain are considered to be indivisible units.
The assumption of 1NF is a natural one in the bank examples we have considered. However, not all applications are best modeled by 1NF relations. For example, rather than view a database as a set of records, users of certain applications view it as a set of objects (or entities). These objects may require several records for their representation. We shall see that a simple, easy-to-use interface requires a one-to-one correspondence between the user’s intuitive notion of an object and the database system’s notion of a data item.
The nested relational model is an extension of the relational model in which domains may be either atomic or relation valued. Thus, the value of a tuple on an attribute may be a relation, and relations may be contained within relations. A complex object thus can be represented by a single tuple of a nested relation. If we view a tuple of a nested relation as a data item, we have a one-to-one correspondence between data items and objects in the user’s view of the database.
We illustrate nested relations by an example from a library. Suppose we store for each book the following information:
• Book title
• Set of authors
• Publisher
• Set of keywords
We can see that, if we define a relation for the preceding information, several domains will be nonatomic.
• Authors. A book may have a set of authors. Nevertheless, we may want to find all books of which Jones was one of the authors. Thus, we are interested in a subpart of the domain element “set of authors.”
• Keywords. If we store a set of keywords for a book, we expect to be able to retrieve all books whose keywords include one or more given keywords. Thus, we view the domain of the set of keywords as nonatomic.
• Publisher. Unlike keywords and authors, publisher does not have a set-valued domain. However, we may view publisher as consisting of the subfields name and branch. This view makes the domain of publisher nonatomic.
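The three nonatomic domains above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the class names, field names, helper functions, and the sample books are hypothetical and are not taken from the figures. Each book is a single object whose authors and keywords are sets and whose publisher has the subfields name and branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publisher:
    # Publisher is not set valued, but it has two subfields.
    name: str
    branch: str


@dataclass(frozen=True)
class Book:
    # One object per book: a single "tuple" of the nested relation.
    title: str
    authors: frozenset
    publisher: Publisher
    keywords: frozenset


# Illustrative sample data (an assumption, not the contents of any figure).
books = [
    Book("Compilers", frozenset({"Smith", "Jones"}),
         Publisher("McGraw-Hill", "New York"),
         frozenset({"parsing", "analysis"})),
    Book("Networks", frozenset({"Jones", "Frick"}),
         Publisher("Oxford", "London"),
         frozenset({"Internet", "Web"})),
]


def titles_by_author(books, author):
    """Titles of all books that list `author` among their authors."""
    return sorted(b.title for b in books if author in b.authors)


def titles_with_any_keyword(books, wanted):
    """Titles of all books whose keyword set shares a member with `wanted`."""
    wanted = set(wanted)
    return sorted(b.title for b in books if b.keywords & wanted)
```

Both helper functions look inside a set-valued attribute, which is exactly why these domains cannot be treated as indivisible units.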
Figure 9.1 shows an example relation, books. The books relation can be represented in 1NF, as in Figure 9.2. Since we must have atomic domains in 1NF, yet want access to individual authors and to individual keywords, we need one tuple for each (keyword, author) pair. The publisher attribute is replaced in the 1NF version by two attributes: one for each subfield of publisher.
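The construction of the 1NF version can be sketched as follows. This is a minimal sketch, assuming a dictionary-based encoding of the nested tuples; the key names, the function name flatten, and the sample data are hypothetical and chosen only for illustration. Each nested tuple is expanded into one flat tuple for every (author, keyword) pair, and the publisher is split into two attributes.

```python
from itertools import product

# Illustrative nested tuples (an assumption, not the contents of any figure).
nested_books = [
    {"title": "Compilers",
     "authors": {"Smith", "Jones"},
     "publisher": ("McGraw-Hill", "New York"),
     "keywords": {"parsing", "analysis"}},
    {"title": "Networks",
     "authors": {"Jones", "Frick"},
     "publisher": ("Oxford", "London"),
     "keywords": {"Internet", "Web"}},
]


def flatten(nested):
    """Return a set of flat tuples (title, author, pub_name, pub_branch, keyword)."""
    rows = set()
    for book in nested:
        # The composite publisher becomes two atomic attributes.
        pub_name, pub_branch = book["publisher"]
        # One flat tuple per (author, keyword) pair of the book.
        for author, keyword in product(book["authors"], book["keywords"]):
            rows.add((book["title"], author, pub_name, pub_branch, keyword))
    return rows


flat_books = flatten(nested_books)
```

With two authors and two keywords per book, each book contributes four flat tuples, and the title and publisher values are repeated in every one of them. Note also that in this sketch a book with an empty author set or an empty keyword set would contribute no tuples at all.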
Figure 9.3 shows the projection of the relation flat-books of Figure 9.2 onto the schemas of a 4NF decomposition.
Although our example book database can be adequately expressed without using nested relations, the use of nested relations leads to an easier-to-understand model: The typical user of an information-retrieval system thinks of the database in terms of books having sets of authors, as the non-1NF design models. The 4NF design would require users to include joins in their queries, thereby complicating interaction with the system.
We could define a non-nested relational view (whose contents are identical to flat-books) that eliminates the need for users to write joins in their queries. In such a view, however, we lose the one-to-one correspondence between tuples and books.
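The trade-off can be sketched in Python. The decomposition itself is not reproduced in this section, so the three smaller relations used here (title with author, title with keyword, title with publisher) are an assumption, as are the names authors, keywords, books4, and flat_view and the sample data. The sketch projects a flat relation onto the three schemas and then rebuilds the flat contents with a join, as a view would.

```python
from collections import Counter

# Illustrative flat relation: (title, author, pub_name, pub_branch, keyword).
flat_books = {
    ("Compilers", "Smith", "McGraw-Hill", "New York", "parsing"),
    ("Compilers", "Smith", "McGraw-Hill", "New York", "analysis"),
    ("Compilers", "Jones", "McGraw-Hill", "New York", "parsing"),
    ("Compilers", "Jones", "McGraw-Hill", "New York", "analysis"),
    ("Networks", "Jones", "Oxford", "London", "Internet"),
    ("Networks", "Jones", "Oxford", "London", "Web"),
    ("Networks", "Frick", "Oxford", "London", "Internet"),
    ("Networks", "Frick", "Oxford", "London", "Web"),
}

# Projections onto the assumed decomposition; sets remove duplicate tuples.
authors = {(t, a) for (t, a, _, _, _) in flat_books}
keywords = {(t, k) for (t, _, _, _, k) in flat_books}
books4 = {(t, pn, pb) for (t, _, pn, pb, _) in flat_books}


def flat_view(authors, keywords, books4):
    """Join the three relations on title, as a non-nested view would."""
    return {
        (t, a, pn, pb, k)
        for (t, pn, pb) in books4
        for (t_a, a) in authors if t_a == t
        for (t_k, k) in keywords if t_k == t
    }


view = flat_view(authors, keywords, books4)

# Number of view tuples per book title.
rows_per_title = Counter(row[0] for row in view)
```

In this sketch the view hides the join from the user, but rows_per_title shows several tuples for each title, so a tuple of the view no longer corresponds to exactly one book.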