Sorting

Sorting

Sorting of data plays an important role in database systems for two reasons. First, SQL queries can specify that the output be sorted. Second, and equally important for query processing, several of the relational operations, such as joins, can be implemented efficiently if the input relations are first sorted. Thus, we discuss sorting here before discussing the join operation in Section 13.5.

We can sort a relation by building an index on the sort key, and then using that index to read the relation in sorted order. However, such a process orders the relation only logically, through an index, rather than physically. Hence, the reading of tuples in the sorted order may lead to a disk access for each record, which can be very expensive, since the number of records can be much larger than the number of blocks. For this reason, it may be desirable to order the records physically.

The problem of sorting has been studied extensively, both for relations that fit entirely in main memory, and for relations that are bigger than memory. In the first case, standard sorting techniques such as quick-sort can be used. Here, we discuss how to handle the second case.

Sorting of relations that do not fit in memory is called external sorting. The most commonly used technique for external sorting is the external sort – merge algorithm.

We describe the external sort – merge algorithm next. Let M denote the number of page frames in the main-memory buffer (the number of disk blocks whose contents can be buffered in main memory).

1. In the first stage, a number of sorted runs are created; each run is sorted, but contains only some of the records of the relation.

image

2. In the second stage, the runs are merged. Suppose, for now, that the total number of runs, N, is less than M, so that we can allocate one page frame to each run and have space left to hold one page of output. The merge stage operates as follows:

read one block of each of the N files Ri into a buffer page in memory;

repeat

choose the first tuple (in sort order) among all buffer pages;

write the tuple to the output, and delete it from the buffer page;

if the buffer page of any run Ri is empty and not end-of-file(Ri)

then read the next block of Ri into the buffer page;

until all buffer pages are empty

The output of the merge stage is the sorted relation. The output file is buffered to reduce the number of disk write operations. The preceding merge operation is a generalization of the two-way merge used by the standard in-memory sort – merge algorithm; it merges N runs, so it is called an N-way merge.

In general, if the relation is much larger than memory, there may be M or more runs generated in the first stage, and it is not possible to allocate a page frame for each run during the merge stage. In this case, the merge operation proceeds in multiple passes. Since there is enough memory for M 1 input buffer pages, each merge can take M 1 runs as input.

The initial pass functions in this way: It merges the first M 1 runs (as described in item 2 above) to get a single run for the next pass. Then, it merges the next M 1 runs similarly, and so on, until it has processed all the initial runs. At this point, the number of runs has been reduced by a factor of M 1. If this reduced number of runs is still greater than or equal to M , another pass is made, with the runs created by the first pass as input. Each pass reduces the number of runs by a factor of M 1. The passes repeat as many times as required, until the number of runs is less than M ; a final pass then generates the sorted output.

Figure 13.3 illustrates the steps of the external sort – merge for an example relation.

For illustration purposes, we assume that only one tuple fits in a block (fr = 1), and we assume that memory holds at most three page frames. During the merge stage, two page frames are used for input and one for output.

image

We compute how many block transfers are required for the external sort merge in this way: Let br denote the number of blocks containing records of relation r. The first stage reads every block of the relation and writes them out again, giving a total of 2br disk accesses. The initial number of runs is pbr /M l. Since the number of runs decreases by a factor of M 1 in each merge pass, the total number of merge passes required is plogM 1(br /M )l. Each of these passes reads every block of the relation once and writes it out once, with two exceptions. First, the final pass can produce the sorted output without writing its result to disk. Second, there may be runs that are not read in or written out during a pass — for example, if there are M runs to be merged in a pass, M 1 are read in and merged, and one run is not accessed during the pass. Ignoring the (relatively small) savings due to the latter effect, the total number of disk accesses for external sorting of the relation is

image

Applying this equation to the example in Figure 13.3, we get a total of 12(4+1) = 60 block transfers, as you can verify from the figure. Note that this value does not include the cost of writing out the final result.

Comments

Popular posts from this blog

Concurrency Control:Shadow Paging

Choice of Evaluation Plans

Entity-Relationship Model part2