Storage and File Structure: File Organization

File Organization

A file is organized logically as a sequence of records. These records are mapped onto disk blocks. Files are provided as a basic construct in operating systems, so we shall assume the existence of an underlying file system. We need to consider ways of representing logical data models in terms of files.

Although blocks are of a ﬁxed size determined by the physical properties of the disk and by the operating system, record sizes vary. In a relational database, tuples of distinct relations are generally of different sizes.

One approach to mapping the database to ﬁles is to use several ﬁles, and to store records of only one ﬁxed length in any given ﬁle. An alternative is to structure our ﬁles so that we can accommodate multiple lengths for records; however, ﬁles of ﬁxed-length records are easier to implement than are ﬁles of variable-length records. Many of the techniques used for the former can be applied to the variable-length case. Thus, we begin by considering a ﬁle of ﬁxed-length records.

Fixed-Length Records

As an example, let us consider a ﬁle of account records for our bank database. Each record of this ﬁle is deﬁned as:

If we assume that each character occupies 1 byte and that a real occupies 8 bytes, our account record is 40 bytes long. A simple approach is to use the ﬁrst 40 bytes for the ﬁrst record, the next 40 bytes for the second record, and so on (Figure 11.6). However, there are two problems with this simple approach:

1. It is difﬁcult to delete a record from this structure. The space occupied by the record to be deleted must be ﬁlled with some other record of the ﬁle, or we must have a way of marking deleted records so that they can be ignored.

2. Unless the block size happens to be a multiple of 40 (which is unlikely), some records will cross block boundaries. That is, part of the record will be stored in one block and part in another. It would thus require two block accesses to read or write such a record.
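The simple layout and the second problem can be sketched in Python. The field widths used here (10 bytes for the account number, 22 for the branch name, 8 for the balance) are an assumption chosen only because they total the 40 bytes stated above, and the 4096-byte block size is likewise illustrative.

```python
import struct

# Assumed field widths: 10-byte account number, 22-byte branch name,
# 8-byte real balance. "<" disables padding, so the record is exactly 40 bytes.
RECORD = struct.Struct("<10s22sd")
BLOCK_SIZE = 4096  # illustrative block size; not a multiple of 40


def pack_account(account_number, branch_name, balance):
    """Encode one account as a fixed-length 40-byte record."""
    return RECORD.pack(account_number.encode(), branch_name.encode(), balance)


def unpack_account(data):
    """Decode a 40-byte record, stripping the NUL padding added by struct."""
    number, branch, balance = RECORD.unpack(data)
    return number.rstrip(b"\0").decode(), branch.rstrip(b"\0").decode(), balance


def blocks_touched(i):
    """Blocks occupied by record i when records are stored back to back."""
    first = (i * RECORD.size) // BLOCK_SIZE
    last = (i * RECORD.size + RECORD.size - 1) // BLOCK_SIZE
    return list(range(first, last + 1))
```

With these assumed sizes, record 102 starts at byte 4080 and ends at byte 4119, so `blocks_touched(102)` reports two blocks, which is exactly the boundary-crossing case described in problem 2.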

When a record is deleted, we could move the record that came after it into the space formerly occupied by the deleted record, and so on, until every record following the deleted record has been moved ahead (Figure 11.7). Such an approach requires moving a large number of records. It might be easier simply to move the ﬁnal record of the ﬁle into the space occupied by the deleted record (Figure 11.8).

It is undesirable to move records to occupy the space freed by a deleted record, since doing so requires additional block accesses. Since insertions tend to be more frequent than deletions, it is acceptable to leave open the space occupied by the deleted record, and to wait for a subsequent insertion before reusing the space. A simple marker on a deleted record is not sufﬁcient, since it is hard to ﬁnd this available space when an insertion is being done. Thus, we need to introduce an additional structure.

At the beginning of the file, we allocate a certain number of bytes as a file header. The header will contain a variety of information about the file. For now, all we need to store there is the address of the first record whose contents are deleted. We use this first record to store the address of the second available record, and so on. Intuitively, we can think of these stored addresses as pointers, since they point to the location of a record. The deleted records thus form a linked list, which is often referred to as a free list. Figure 11.9 shows the file of Figure 11.6, with the free list, after records 1, 4, and 6 have been deleted.

On insertion of a new record, we use the record pointed to by the header. We change the header pointer to point to the next available record. If no space is available, we add the new record to the end of the ﬁle.
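The free-list mechanism can be sketched as a small in-memory model. The class and its names are hypothetical, and the policy of pushing each newly deleted slot onto the front of the list is one possible choice; the text does not prescribe the order in which free records are chained.

```python
class FixedLengthFile:
    """In-memory model of a file of fixed-length records with a free list."""

    def __init__(self):
        self.free_head = None  # file header: slot of the first deleted record
        self.slots = []        # each slot is ("rec", data) or ("free", next_free)

    def delete(self, slot):
        # The deleted slot stores the old head of the free list,
        # then becomes the new head itself.
        self.slots[slot] = ("free", self.free_head)
        self.free_head = slot

    def insert(self, data):
        if self.free_head is None:
            # No reusable space: add the record at the end of the file.
            self.slots.append(("rec", data))
            return len(self.slots) - 1
        slot = self.free_head
        _, next_free = self.slots[slot]
        self.free_head = next_free  # header now points to the next free record
        self.slots[slot] = ("rec", data)
        return slot
```

For example, after deleting slots 1, 4, and 6 from a nine-record file, three insertions reuse those slots (in this sketch, most recently deleted first), and a fourth insertion is appended at the end of the file.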

Insertion and deletion for ﬁles of ﬁxed-length records are simple to implement, because the space made available by a deleted record is exactly the space needed to insert a record. If we allow records of variable length in a ﬁle, this match no longer holds. An inserted record may not ﬁt in the space left free by a deleted record, or it may ﬁll only part of that space.

Variable-Length Records

Variable-length records arise in database systems in several ways:

• Storage of multiple record types in a ﬁle

• Record types that allow variable lengths for one or more ﬁelds

• Record types that allow repeating ﬁelds

Different techniques for implementing variable-length records exist. For purposes of illustration, we shall use one example to demonstrate the various implementation techniques. We shall consider a different representation of the account information stored in the ﬁle of Figure 11.6, in which we use one variable-length record for each branch name and for all the account information for that branch. The format of the record is

We deﬁne account-info as an array with an arbitrary number of elements. That is, the type deﬁnition does not limit the number of elements in the array, although any actual record will have a speciﬁc number of elements in its array. There is no limit on how large a record can be (up to, of course, the size of the disk storage!).

Byte-String Representation

A simple method for implementing variable-length records is to attach a special end-of-record (⊥) symbol to the end of each record. We can then store each record as a string of consecutive bytes. Figure 11.10 shows such an organization to represent the file of fixed-length records of Figure 11.6 as variable-length records. An alternative version of the byte-string representation stores the record length at the beginning of each record, instead of using end-of-record symbols.
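The length-prefixed variant can be sketched as follows. The 2-byte little-endian length field and the function names are assumptions made for illustration; the text does not fix the width of the length field.

```python
import struct


def append_record(buf, payload):
    """Append one record as a 2-byte length followed by the payload bytes."""
    buf.extend(struct.pack("<H", len(payload)))
    buf.extend(payload)


def read_records(buf):
    """Walk the byte string, using each length prefix to find the next record."""
    records, pos = [], 0
    while pos < len(buf):
        (length,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        records.append(bytes(buf[pos:pos + length]))
        pos += length
    return records
```

Because the records are packed with no gaps, a reader can only reach record n by stepping over the n records before it, and no record can grow in place.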

The byte-string representation as described in Figure 11.10 has some disadvantages:

• It is not easy to reuse space occupied formerly by a deleted record. Although techniques exist to manage insertion and deletion, they lead to a large number of small fragments of disk storage that are wasted.

• There is no space, in general, for records to grow longer. If a variable-length record becomes longer, it must be moved — movement is costly if pointers to the record are stored elsewhere in the database (e.g., in indices, or in other records), since the pointers must be located and updated.

Thus, the basic byte-string representation described here is not usually used for implementing variable-length records. However, a modified form of the byte-string representation, called the slotted-page structure, is commonly used for organizing records within a single block.

The slotted-page structure appears in Figure 11.11. There is a header at the beginning of each block, containing the following information:

1. The number of record entries in the header

2. The end of free space in the block

3. An array whose entries contain the location and size of each record

The actual records are allocated contiguously in the block, starting from the end of the block. The free space in the block is contiguous, between the ﬁnal entry in the header array, and the ﬁrst record. If a record is inserted, space is allocated for it at the end of free space, and an entry containing its size and location is added to the header.

If a record is deleted, the space that it occupies is freed, and its entry is set to deleted (its size is set to −1, for example). Further, the records in the block before the deleted record are moved, so that the free space created by the deletion gets occupied, and all free space is again between the ﬁnal entry in the header array and the ﬁrst record. The end-of-free-space pointer in the header is appropriately updated as well. Records can be grown or shrunk by similar techniques, as long as there is space in the block. The cost of moving the records is not too high, since the size of a block is limited: A typical value is 4 kilobytes.

The slotted-page structure requires that there be no pointers that point directly to records. Instead, pointers must point to the entry in the header that contains the actual location of the record. This level of indirection allows records to be moved to prevent fragmentation of space inside a block, while supporting indirect pointers to the record.
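A toy version of the slotted-page structure is sketched below. It is a simplification under stated assumptions: the header array is kept as a Python list rather than inside the block's bytes, so the space consumed by the header itself is not accounted for, and all names are hypothetical.

```python
class SlottedPage:
    """Toy slotted page: records are packed at the end of the block."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_end = size  # end of free space = start of the record area
        self.slots = []       # header array of [offset, size]; size -1 = deleted

    def insert(self, record):
        # Simplification: only the record area is checked, not the header space.
        if len(record) > self.free_end:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append([self.free_end, len(record)])
        return len(self.slots) - 1  # the slot number is the stable identifier

    def read(self, slot):
        offset, size = self.slots[slot]
        if size < 0:
            raise KeyError(slot)
        return bytes(self.data[offset:offset + size])

    def delete(self, slot):
        offset, size = self.slots[slot]
        if size < 0:
            raise KeyError(slot)
        # Shift the records stored before the deleted one (lower offsets)
        # toward the end of the block so the free space stays contiguous.
        self.data[self.free_end + size:offset + size] = self.data[self.free_end:offset]
        for entry in self.slots:
            if entry[1] >= 0 and entry[0] < offset:
                entry[0] += size
        self.slots[slot] = [0, -1]
        self.free_end += size
```

Callers hold slot numbers rather than byte offsets, so a record that is moved during `delete` can still be read through its unchanged slot number. This is the level of indirection described above.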

Fixed-Length Representation

Another way to implement variable-length records efﬁciently in a ﬁle system is to use one or more ﬁxed-length records to represent one variable-length record.

There are two ways of doing this:

1. Reserved space. If there is a maximum record length that is never exceeded, we can use fixed-length records of that length. Unused space (for records shorter than the maximum space) is filled with a special null, or end-of-record, symbol.

2. List representation. We can represent variable-length records by lists of fixed-length records, chained together by pointers.

If we choose to apply the reserved-space method to our account example, we need to select a maximum record length. Figure 11.12 shows how the file of Figure 11.10 would be represented if we allowed a maximum of three accounts per branch. A record in this file is of the account-list type, but with the array containing exactly three elements. Those branches with fewer than three accounts (for example, Round Hill) have records with null fields. We use the symbol ⊥ to represent this situation in Figure 11.12. In practice, a particular value that can never represent real data is used (for example, an account number that is blank, or a name beginning with “*”).

The reserved-space method is useful when most records have a length close to the maximum. Otherwise, a significant amount of space may be wasted. In our bank example, some branches may have many more accounts than others. This situation leads us to consider the linked list method. To represent the file by the linked list method, we add a pointer field as we did in Figure 11.9. The resulting structure appears in Figure 11.13.

The ﬁle structures of Figures 11.9 and 11.13 both use pointers; the difference is that, in Figure 11.9, we use pointers to chain together only deleted records, whereas in Figure 11.13, we chain together all records pertaining to the same branch.

A disadvantage to the structure of Figure 11.13 is that we waste space in all records except the ﬁrst in a chain. The ﬁrst record needs to have the branch-name value, but subsequent records do not. Nevertheless, we need to include a ﬁeld for branch-name in all records, lest the records not be of ﬁxed length. This wasted space is signiﬁcant, since we expect, in practice, that each branch has a large number of accounts. To deal with this problem, we allow two kinds of blocks in our ﬁle:

1. Anchor block, which contains the ﬁrst record of a chain

2. Overflow block, which contains records other than those that are the first record of a chain

Thus, all records within a block have the same length, even though not all records in the file have the same length. Figure 11.14 shows this file structure.
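The two record shapes can be sketched as follows. The class names, field names, and the use of list indices as pointers are assumptions for illustration; only the anchor record carries the branch name, which is the space saving the two kinds of blocks are meant to provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnchorRecord:
    """First record of a chain: holds the branch name and the first account."""
    branch_name: str
    account_number: str
    balance: float
    next: Optional[int] = None  # index into the overflow area, or None


@dataclass
class OverflowRecord:
    """Later record of a chain: no branch-name field is needed."""
    account_number: str
    balance: float
    next: Optional[int] = None


def accounts_of(anchor, overflow):
    """Follow the chain from an anchor record through the overflow records."""
    result = [(anchor.account_number, anchor.balance)]
    ptr = anchor.next
    while ptr is not None:
        rec = overflow[ptr]
        result.append((rec.account_number, rec.balance))
        ptr = rec.next
    return result
```

A branch with a single account needs only its anchor record; each additional account costs one shorter overflow record plus a pointer.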

Organization of Records in Files

So far, we have studied how records are represented in a ﬁle structure. An instance of a relation is a set of records. Given a set of records, the next question is how to organize them in a ﬁle. Several of the possible ways of organizing records in ﬁles are:

• Heap file organization. Any record can be placed anywhere in the file where there is space for the record. There is no ordering of records. Typically, there is a single file for each relation.

• Sequential ﬁle organization. Records are stored in sequential order, according to the value of a “search key” of each record. Section 11.7.1 describes this organization.

• Hashing ﬁle organization. A hash function is computed on some attribute of each record. The result of the hash function speciﬁes in which block of the ﬁle the record should be placed. Chapter 12 describes this organization; it is closely related to the indexing structures described in that chapter.

Generally, a separate ﬁle is used to store the records of each relation. However, in a clustering ﬁle organization, records of several different relations are stored in the same ﬁle; further, related records of the different relations are stored on the same block, so that one I/O operation fetches related records from all the relations. For example, records of the two relations can be considered to be related if they would match in a join of the two relations. Section 11.7.2 describes this organization.
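As a rough illustration of the hashing organization listed above, the sketch below maps a search-key value to a block number. The choice of CRC-32 as the hash function and the function names are assumptions; the text does not say which hash function is used.

```python
import zlib


def block_for(search_key, num_blocks):
    """Map a search-key value to a block number with a deterministic hash."""
    return zlib.crc32(search_key.encode("utf-8")) % num_blocks


def place(records, num_blocks):
    """Group (key, payload) records by the block their key hashes to."""
    blocks = [[] for _ in range(num_blocks)]
    for key, payload in records:
        blocks[block_for(key, num_blocks)].append((key, payload))
    return blocks
```

Records with the same key always land in the same block, so a lookup on that key needs to examine only one block (ignoring overflow, which this sketch does not model).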

Sequential File Organization

A sequential ﬁle is designed for efﬁcient processing of records in sorted order based on some search-key. A search key is any attribute or set of attributes; it need not be the primary key, or even a superkey. To permit fast retrieval of records in search-key order, we chain together records by pointers. The pointer in each record points to the next record in search-key order. Furthermore, to minimize the number of block accesses in sequential ﬁle processing, we store records physically in search-key order, or as close to search-key order as possible.

Figure 11.15 shows a sequential file of account records taken from our banking example. In that example, the records are stored in search-key order, using branch-name as the search key.

The sequential ﬁle organization allows records to be read in sorted order; that can be useful for display purposes, as well as for certain query-processing algorithms that we shall study in Chapter 13.

It is difficult, however, to maintain physical sequential order as records are inserted and deleted, since it is costly to move many records as a result of a single insertion or deletion. We can manage deletion by using pointer chains, as we saw previously. For insertion, we apply the following rules:

1. Locate the record in the ﬁle that comes before the record to be inserted in search-key order.

2. If there is a free record (that is, space left after a deletion) within the same block as this record, insert the new record there. Otherwise, insert the new record in an overﬂow block. In either case, adjust the pointers so as to chain together the records in search-key order.
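The two insertion rules can be sketched with a toy in-memory file. All names are hypothetical, and the sketch makes simplifying assumptions: slots are list entries, the first `main_size` slots form the main blocks, anything appended later stands for overflow blocks, and a record that precedes every existing key goes straight to overflow.

```python
class SequentialFile:
    """Toy sequential file: each slot is [key, next_slot], or None when free."""

    def __init__(self, sorted_keys, records_per_block=4):
        self.rpb = records_per_block
        n = len(sorted_keys)
        # Main area: records stored physically in search-key order and chained.
        self.slots = [[k, i + 1 if i + 1 < n else None]
                      for i, k in enumerate(sorted_keys)]
        self.main_size = n
        self.head = 0 if n else None

    def _find(self, stop):
        """Walk the chain until stop(key) is true; return (prev, cur)."""
        prev, cur = None, self.head
        while cur is not None and not stop(self.slots[cur][0]):
            prev, cur = cur, self.slots[cur][1]
        return prev, cur

    def delete(self, key):
        prev, cur = self._find(lambda k: k == key)
        if cur is None:
            raise KeyError(key)
        if prev is None:
            self.head = self.slots[cur][1]
        else:
            self.slots[prev][1] = self.slots[cur][1]
        self.slots[cur] = None  # leave the space free for a later insertion

    def insert(self, key):
        # Rule 1: locate the record that precedes the new one in key order.
        prev, cur = self._find(lambda k: k > key)
        # Rule 2: reuse a free slot in the predecessor's block if there is one.
        target = None
        if prev is not None and prev < self.main_size:
            start = (prev // self.rpb) * self.rpb
            for i in range(start, min(start + self.rpb, self.main_size)):
                if self.slots[i] is None:
                    target = i
                    break
        if target is None:  # otherwise the record goes to an overflow block
            self.slots.append(None)
            target = len(self.slots) - 1
        # In either case, adjust the pointers to keep the chain in key order.
        self.slots[target] = [key, cur]
        if prev is None:
            self.head = target
        else:
            self.slots[prev][1] = target
        return target

    def scan(self):
        """Yield keys in search-key order by following the pointer chain."""
        cur = self.head
        while cur is not None:
            yield self.slots[cur][0]
            cur = self.slots[cur][1]
```

A scan still returns keys in sorted order after insertions, but the slots it visits are no longer physically consecutive once records land in the overflow area.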

Figure 11.16 shows the ﬁle of Figure 11.15 after the insertion of the record (North Town, A-888, 800). The structure in Figure 11.16 allows fast insertion of new records, but forces sequential ﬁle-processing applications to process records in an order that does not match the physical order of the records.

If relatively few records need to be stored in overﬂow blocks, this approach works well. Eventually, however, the correspondence between search-key order and physical order may be totally lost, in which case sequential processing will become much less efﬁcient. At this point, the ﬁle should be reorganized so that it is once again physically in sequential order. Such reorganizations are costly, and must be done during times when the system load is low. The frequency with which reorganizations are needed depends on the frequency of insertion of new records. In the extreme case in which insertions rarely occur, it is possible always to keep the ﬁle in physically sorted order. In such a case, the pointer ﬁeld in Figure 11.15 is not needed.

Clustering File Organization

Many relational-database systems store each relation in a separate ﬁle, so that they can take full advantage of the ﬁle system that the operating system provides. Usually, tuples of a relation can be represented as ﬁxed-length records. Thus, relations can be mapped to a simple ﬁle structure. This simple implementation of a relational database system is well suited to low-cost database implementations as in, for example, embedded systems or portable devices. In such systems, the size of the database is small, so little is gained from a sophisticated ﬁle structure. Furthermore, in such environments, it is essential that the overall size of the object code for the database system be small. A simple ﬁle structure reduces the amount of code needed to implement the system.

This simple approach to relational-database implementation becomes less satisfactory as the size of the database increases. We have seen that there are performance advantages to be gained from careful assignment of records to blocks, and from careful organization of the blocks themselves. Clearly, a more complicated ﬁle structure may be beneﬁcial, even if we retain the strategy of storing each relation in a separate ﬁle.

However, many large-scale database systems do not rely directly on the underlying operating system for ﬁle management. Instead, one large operating-system ﬁle is allocated to the database system. The database system stores all relations in this one ﬁle, and manages the ﬁle itself. To see the advantage of storing many relations in one ﬁle, consider the following SQL query for the bank database:

This query computes a join of the depositor and customer relations. Thus, for each tuple of depositor, the system must locate the customer tuples with the same value for customer-name. Ideally, these records will be located with the help of indices, which we shall discuss in Chapter 12. Regardless of how these records are located, however, they need to be transferred from disk into main memory. In the worst case, each record will reside on a different block, forcing us to do one block read for each record required by the query.

As a concrete example, consider the depositor and customer relations of Figures 11.17 and 11.18, respectively. In Figure 11.19, we show a file structure designed for efficient execution of queries involving depositor ⋈ customer. The depositor tuples for each customer-name are stored near the customer tuple for the corresponding customer name. This structure mixes together tuples of two relations, but allows for efficient processing of the join. When a tuple of the customer relation is read, the entire block containing that tuple is copied from disk into main memory. Since the corresponding depositor tuples are stored on the disk near the customer tuple, the block containing the customer tuple contains tuples of the depositor relation needed to process the query. If a customer has so many accounts that the depositor records do not fit in one block, the remaining records appear on nearby blocks.

A clustering file organization is a file organization, such as that illustrated in Figure 11.19, that stores related records of two or more relations in each block. Such a file organization allows us to read records that would satisfy the join condition by using one block read. Thus, we are able to process this particular query more efficiently.

Our use of clustering has enhanced processing of a particular join (depositor ⋈ customer), but it slows the processing of other types of query. For example,

select *

from customer

requires more block accesses than it did in the scheme under which we stored each relation in a separate ﬁle. Instead of several customer records appearing in one block, each record is located in a distinct block. Indeed, simply ﬁnding all the customer records is not possible without some additional structure. To locate all tuples of the customer relation in the structure of Figure 11.19, we need to chain together all the records of that relation using pointers, as in Figure 11.20.
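The trade-off can be made concrete by counting block reads under the two layouts. The customer names, record contents, and block assignments below are hypothetical and kept deliberately tiny; they are not the data of Figures 11.17 to 11.19.

```python
def blocks_read(layout, wanted):
    """Count the blocks holding at least one record that satisfies `wanted`."""
    return sum(1 for block in layout if any(wanted(rec) for rec in block))


# Records are (relation, customer_name, detail) tuples; all values hypothetical.
c1 = ("customer", "Ann", "Main St")
c2 = ("customer", "Bob", "Oak St")
d1 = ("depositor", "Ann", "A-1")
d2 = ("depositor", "Ann", "A-2")
d3 = ("depositor", "Bob", "A-3")

separate = [[c1, c2], [d1, d2, d3]]   # each relation stored in its own block
clustered = [[c1, d1, d2], [c2, d3]]  # each customer next to its depositor tuples


def join_for_ann(rec):
    """Records needed to join customer and depositor for one customer."""
    return rec[1] == "Ann"


def all_customers(rec):
    """Records needed for: select * from customer."""
    return rec[0] == "customer"
```

In this small example the join for one customer touches two blocks in the separate layout but one in the clustered layout, while `select * from customer` touches one block in the separate layout but two in the clustered layout.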

When clustering is to be used depends on the types of query that the database designer believes to be most frequent. Careful use of clustering can produce signiﬁcant performance gains in query processing.
