Storage and File Structure: Data-Dictionary Storage

Data-Dictionary Storage

So far, we have considered only the representation of the relations themselves. A relational-database system needs to maintain data about the relations, such as the schema of the relations. This information is called the data dictionary, or system catalog. Among the types of information that the system must store are these:

• Names of the relations

• Names of the attributes of each relation

• Domains and lengths of attributes

• Names of views defined on the database, and definitions of those views

• Integrity constraints (for example, key constraints)

In addition, many systems keep the following data on users of the system:

• Names of authorized users

• Accounting information about users

• Passwords or other information used to authenticate users

Further, the database may store statistical and descriptive data about the relations, such as:

• Number of tuples in each relation

• Method of storage for each relation (for example, clustered or nonclustered)

The data dictionary may also note the storage organization (sequential, hash or heap) of relations, and the location where each relation is stored:

• If relations are stored in operating-system files, the dictionary would note the name of the file (or files) containing each relation.

• If the database stores all relations in a single file, the dictionary may note the blocks containing records of each relation in a data structure such as a linked list.
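
The single-file case can be sketched as follows. This is a toy model under stated assumptions, not how any particular system lays out its file: the Block class, the database_file list, the first_block mapping, and the sample records are all hypothetical names and data made up for illustration. Each block carries the number of the next block of the same relation, and the dictionary only needs to note the first block.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Block:
    """One block of the single database file (hypothetical layout)."""
    records: List[Tuple]
    next_block: Optional[int] = None  # number of the next block of the same relation


# The single database file, modeled as a list indexed by block number.
# Blocks of different relations are interleaved.
database_file = [
    Block(records=[("Downtown", "Brooklyn")], next_block=2),  # block 0: branch
    Block(records=[("A-101", 500)]),                          # block 1: account
    Block(records=[("Perryridge", "Horseneck")]),             # block 2: branch
]

# Data-dictionary entry: relation name -> first block of its linked list.
first_block = {"branch": 0, "account": 1}


def scan(relation: str) -> Iterator[Tuple]:
    """Yield every record of a relation by following its chain of blocks."""
    block_no = first_block[relation]
    while block_no is not None:
        block = database_file[block_no]
        yield from block.records
        block_no = block.next_block
```

Calling list(scan("branch")) walks blocks 0 and 2 and skips block 1, which belongs to a different relation.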

In Chapter 12, in which we study indices, we shall see a need to store information about each index on each of the relations:

• Name of the index

• Name of the relation being indexed

• Attributes on which the index is defined

• Type of index formed

All this information constitutes, in effect, a miniature database. Some database systems store this information by using special-purpose data structures and code. It is generally preferable to store the data about the database in the database itself. By using the database to store system data, we simplify the overall structure of the system and harness the full power of the database for fast access to system data.
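
SQLite is one system that keeps its catalog in the database itself, and it makes a convenient illustration. The sketch below uses Python's standard sqlite3 module; the branch table, the index, and the view are made-up examples, while sqlite_master and PRAGMA table_info are SQLite's own catalog interfaces.

```python
import sqlite3

# In-memory database, so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets INTEGER)")
conn.execute("CREATE INDEX branch_city_idx ON branch (branch_city)")
conn.execute(
    "CREATE VIEW big_branch AS SELECT branch_name FROM branch WHERE assets > 1000000"
)

# sqlite_master is an ordinary relation holding one row per table, index, or view,
# so the catalog is queried with the same SQL as user data.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info reports each attribute's position, name, and declared type.
attributes = [
    (row[0], row[1], row[2]) for row in conn.execute("PRAGMA table_info(branch)")
]

for entry in catalog:
    print(entry)
for attribute in attributes:
    print(attribute)
```

Because the catalog is itself a relation, no separate access path is needed to inspect it: the usual query machinery applies.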

The exact choice of how to represent system data by relations must be made by the system designers. One possible representation is:

Relation-metadata (relation-name, number-of-attributes, storage-organization, location)

Attribute-metadata (attribute-name, relation-name, domain-type, position, length)

User-metadata (user-name, encrypted-password, group)

Index-metadata (index-name, relation-name, index-type, index-attributes)

View-metadata (view-name, definition)

In this representation, the attribute index-attributes of the relation Index-metadata is assumed to contain a list of one or more attributes, which can be represented by a character string such as “branch-name, branch-city”. The Index-metadata relation is thus not in first normal form; it can be normalized, but the above representation is likely to be more efficient to access, since all the attributes of an index are retrieved with a single tuple. The data dictionary is often stored in a non-normalized form to achieve fast access.
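
As a rough sketch, three of these catalog relations can be declared as ordinary tables and queried like any other data. Several things here are assumptions made for illustration: the use of SQLite through Python's sqlite3 module, underscores in place of hyphens in identifiers, the omission of key declarations, the helper functions attributes_of and index_attributes, and all sample values (including the file name branch.dat).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relation_metadata (
    relation_name TEXT, number_of_attributes INTEGER,
    storage_organization TEXT, location TEXT);
CREATE TABLE attribute_metadata (
    attribute_name TEXT, relation_name TEXT,
    domain_type TEXT, position INTEGER, length INTEGER);
CREATE TABLE index_metadata (
    index_name TEXT, relation_name TEXT,
    index_type TEXT, index_attributes TEXT);
""")

# Hypothetical catalog entries describing a relation named branch.
conn.execute("INSERT INTO relation_metadata VALUES ('branch', 3, 'heap', 'branch.dat')")
conn.executemany(
    "INSERT INTO attribute_metadata VALUES (?, 'branch', ?, ?, ?)",
    [("branch-name", "char", 1, 15), ("branch-city", "char", 2, 30), ("assets", "integer", 3, 4)],
)
conn.execute(
    "INSERT INTO index_metadata VALUES "
    "('branch-index', 'branch', 'B+-tree', 'branch-name, branch-city')"
)


def attributes_of(relation):
    """Return the attribute names of a relation in schema order."""
    rows = conn.execute(
        "SELECT attribute_name FROM attribute_metadata "
        "WHERE relation_name = ? ORDER BY position",
        (relation,),
    )
    return [row[0] for row in rows]


def index_attributes(index_name):
    """Unpack the non-1NF index_attributes string into a list of names."""
    (packed,) = conn.execute(
        "SELECT index_attributes FROM index_metadata WHERE index_name = ?",
        (index_name,),
    ).fetchone()
    return [name.strip() for name in packed.split(",")]
```

Here index_attributes("branch-index") needs only one tuple to recover both attribute names. A normalized design would instead store one tuple per (index, attribute) pair and read several tuples for the same answer.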

The storage organization and location of the Relation-metadata itself must be recorded elsewhere (for example, in the database code itself), since we need this information to find the contents of Relation-metadata.
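
The bootstrapping problem can be illustrated with a toy model. Everything in this sketch is hypothetical: the constant RELATION_METADATA_BLOCK, the blocks dictionary standing in for a database file, the JSON encoding, and the sample rows. The one location that cannot be looked up in the catalog is fixed in the code, and every other location is found through it.

```python
import json

# Fixed in the database code itself: where Relation-metadata lives.
# This cannot come from the catalog, because the catalog is what we are trying to find.
RELATION_METADATA_BLOCK = 0

# Toy database file: block number -> encoded contents.
blocks = {
    0: json.dumps([
        {"relation-name": "Attribute-metadata", "storage-organization": "heap", "location": 1},
        {"relation-name": "branch", "storage-organization": "sequential", "location": 2},
    ]).encode(),
    1: json.dumps([["branch-name", "branch", "char", 1, 15]]).encode(),
    2: json.dumps([["Downtown", "Brooklyn", 9000000]]).encode(),
}


def load_relation_metadata():
    """Read Relation-metadata from its hard-coded block, keyed by relation name."""
    rows = json.loads(blocks[RELATION_METADATA_BLOCK])
    return {row["relation-name"]: row for row in rows}


def read_relation(name):
    """Locate any other relation through the catalog, then read its block."""
    catalog = load_relation_metadata()
    return json.loads(blocks[catalog[name]["location"]])
```

Only load_relation_metadata relies on the constant; read_relation("branch") finds block 2 purely from catalog contents.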
