The Relational Data Model and Relational Database part2

PROJECT

The PROJECT operation is denoted by ∏ (pi).

If we are interested in only certain attributes of a relation, we use PROJECT.

This operation keeps certain columns (attributes) from a relation and discards the other columns.

PROJECT creates a vertical partitioning

The list of specified columns (attributes) is kept in each tuple. The other attributes in each tuple are discarded.

Example: To list each employee’s first and last name and salary, the following is used:

∏ LNAME, FNAME, SALARY (EMPLOYEE)

Examples of applying SELECT and PROJECT operations


Single expression versus sequence of relational operations

We may want to apply several relational algebra operations one after the other. Either we can write the operations as a single relational algebra expression by nesting the operations,

or

We can apply one operation at a time and create intermediate result relations. In the latter case, we must give names to the relations that hold the intermediate results.

To retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a SELECT and a PROJECT operation.

We can write a single relational algebra expression as follows:

∏FNAME, LNAME, SALARY(σ DNO=5(EMPLOYEE))

OR We can explicitly show the sequence of operations, giving a name to each intermediate relation:

DEP5_EMPS ← σDNO=5(EMPLOYEE)

RESULT ← ∏ FNAME, LNAME, SALARY (DEP5_EMPS)
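The two ways of writing the query can be sketched in Python. This is only an illustration, assuming a relation is modeled as a list of dictionaries; the helper names `select` and `project` and the sample rows are hypothetical and are not part of the original notes.

```python
# Hypothetical sample rows; attribute names follow the EMPLOYEE relation in the text.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Joyce", "LNAME": "English", "SALARY": 25000, "DNO": 5},
]


def select(relation, predicate):
    """SELECT (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT (pi): keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# Single nested expression
nested = project(select(EMPLOYEE, lambda t: t["DNO"] == 5),
                 ["FNAME", "LNAME", "SALARY"])

# Sequence of operations with a named intermediate relation
DEP5_EMPS = select(EMPLOYEE, lambda t: t["DNO"] == 5)
RESULT = project(DEP5_EMPS, ["FNAME", "LNAME", "SALARY"])
```

Both forms produce the same relation; naming the intermediate result only makes the steps easier to read.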

Example of applying multiple operations and RENAME


RENAME

The RENAME operator is denoted by ρ (rho).

In some cases, we may want to rename the attributes of a relation, the relation name, or both. This is useful when a query requires multiple operations, and it is necessary in some cases (see the JOIN operation later).

The general RENAME operation ρ can be expressed in any of the following forms:

ρ S(B1, B2, …, Bn) (R) renames both the relation (to S) and its attributes (to B1, B2, …, Bn)

ρ S (R) renames the relation only (to S)

ρ (B1, B2, …, Bn) (R) renames the attributes only (to B1, B2, …, Bn)

Relational Algebra Operations from Set Theory

• Union

• Intersection

• Minus

• Cartesian Product

UNION

UNION is a binary operation, denoted by ∪.

The result of R ∪ S is a relation that includes all tuples that are either in R or in S or in both R and S.

Duplicate tuples are eliminated.

The two operand relations R and S must be “type compatible” (or UNION compatible):

R and S must have the same number of attributes.

Each pair of corresponding attributes must be type compatible (have the same or compatible domains).

Example:

To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below)

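DEP5_EMPS ← σ DNO=5 (EMPLOYEE)

RESULT1 ← ∏ SSN (DEP5_EMPS)

RESULT2(SSN) ← ∏ SUPERSSN (DEP5_EMPS)

RESULT ← RESULT1 ∪ RESULT2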

Example of the result of a UNION operation


INTERSECTION

INTERSECTION is denoted by ∩

The result of the operation R ∩ S, is a relation that includes all tuples that are in both R and S

The attribute names in the result will be the same as the attribute names in R

The two operand relations R and S must be “type compatible”

SET DIFFERENCE

SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by –

The result of R – S is a relation that includes all tuples that are in R but not in S

The attribute names in the result will be the same as the attribute names in R

The two operand relations R and S must be “type compatible”

Example to illustrate the result of UNION, INTERSECT, and DIFFERENCE


Some properties of UNION, INTERSECT, and DIFFERENCE

Notice that both union and intersection are commutative operations; that is:

R ∪ S = S ∪ R

R ∩ S = S ∩ R
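The set operations and their commutativity can be checked with a short sketch, assuming each relation is modeled as a Python set of tuples. The sample tuples are hypothetical and both relations have the same two attributes, so they are type compatible.

```python
# Two type-compatible relations (first name, last name); sample data is hypothetical.
R = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Johnny", "Kohler")}
S = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

union = R | S          # tuples in R, in S, or in both; duplicates are eliminated
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

# UNION and INTERSECTION are commutative.
union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)

# For these sample relations, R - S and S - R differ.
difference_commutes = (R - S) == (S - R)
```

Here `difference_commutes` is false, which illustrates that set difference is not commutative in general.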

The following query results refer to this database state

Example of applying CARTESIAN PRODUCT

Binary Relational Operations

Division

Join

Division

Interpretation of the division operation A/B:

- Divide the attributes of A into 2 sets: A1 and A2.

- Divide the attributes of B into 2 sets: B2 and B3.

- Where the sets A2 and B2 have the same attributes.

- For each group of rows in A that share the same A1 values:

- Check whether the A2 values of the group (taken together) include every combination of B2 values that appears in B.

- For every group of rows in A that passes this check, pick out its A1 values and put them in the answer.
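The interpretation above can be sketched in Python. This is a minimal illustration under stated assumptions: relations are lists of dictionaries, B contains only the shared attributes, and a group qualifies when its A2 values include every tuple of B (the usual definition of division). The function name `divide` and the sample data are hypothetical.

```python
def divide(a, b, a1, a2):
    """Return the A1 values whose group of A2 values covers every tuple of B.

    a1: attributes kept in the answer; a2: attributes shared with B.
    """
    required = {tuple(row[x] for x in a2) for row in b}
    groups = {}
    for row in a:
        key = tuple(row[x] for x in a1)
        groups.setdefault(key, set()).add(tuple(row[x] for x in a2))
    # Keep a group only if it contains all required A2 combinations.
    return [dict(zip(a1, key)) for key, values in groups.items()
            if required <= values]


# Hypothetical data: which employees (ESSN) work on every project listed in B?
A = [
    {"ESSN": "111", "PNO": 1}, {"ESSN": "111", "PNO": 2},
    {"ESSN": "222", "PNO": 1},
    {"ESSN": "333", "PNO": 1}, {"ESSN": "333", "PNO": 2}, {"ESSN": "333", "PNO": 3},
]
B = [{"PNO": 1}, {"PNO": 2}]

answer = divide(A, B, a1=["ESSN"], a2=["PNO"])
```

Employee 222 is excluded because project 2 is missing from its group; employee 333 qualifies even though it also works on project 3.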


JOIN

JOIN operation (denoted by ⋈)

The sequence of CARTESIAN PRODUCT followed by SELECT is used quite commonly to identify and select related tuples from two relations. JOIN combines this sequence into a single operation.

This operation is very important for any relational database with more than a single relation, because it allows us to combine related tuples from various relations.

The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:

R ⋈<join condition> S

where R and S can be any relations that result from general relational algebra expressions.

Example: Suppose that we want to retrieve the name of the manager of each department.

To get the manager’s name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple.

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE

The following query results refer to this database state

Example of applying the JOIN operation

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE

The general case of JOIN operation is called a Theta-join:

R ⋈theta S

The join condition is called theta

Theta can be any general boolean expression on the attributes of R and S; for example:

R.Ai<S.Bj AND (R.Ak=S.Bl OR R.Ap<S.Bq)

EQUIJOIN

The most common use of join involves join conditions with equality comparisons only. Such a join, where the only comparison operator used is =, is called an EQUIJOIN.

The JOIN seen in the previous example was an EQUIJOIN
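A theta-join can be sketched as a nested loop that tests the join condition on every pair of tuples. The sketch below assumes relations are lists of dictionaries with distinct attribute names; `theta_join` and the sample rows are hypothetical names and data used only for illustration.

```python
def theta_join(r, s, condition):
    """Combine every pair of tuples (one from r, one from s) that satisfies condition."""
    return [{**t_r, **t_s} for t_r in r for t_s in s if condition(t_r, t_s)]


# Hypothetical sample rows using attribute names from the text.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333445555"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987654321"},
]
EMPLOYEE = [
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "123456789"},
]

# EQUIJOIN: the condition uses only the = comparison.
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE,
                      lambda d, e: d["MGRSSN"] == e["SSN"])
```

Because the condition is an arbitrary function of the two tuples, the same helper also covers general theta conditions that mix comparisons with AND and OR.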

NATURAL JOIN

Another variation of JOIN, called NATURAL JOIN, is denoted by *

It was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.

For example: Q ← R(A,B,C,D) * S(C,D,E)

The implicit join condition includes each pair of attributes with the same name, “AND”ed together:

R.C=S.C AND R.D = S.D

Result keeps only one attribute of each such pair: Q(A,B,C,D,E)

Example: To apply a natural join on the DNUMBER attributes of DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:

DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS

The only attribute with the same name in both relations is DNUMBER.

An implicit join condition is created based on this attribute: DEPARTMENT.DNUMBER=DEPT_LOCATIONS.DNUMBER
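The implicit join condition of a natural join can be sketched as follows, assuming relations are non-empty lists of dictionaries whose keys are the attribute names. The function `natural_join` and the sample rows are hypothetical.

```python
def natural_join(r, s):
    """Join on all attributes with the same name, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # attributes shared by both relations
    result = []
    for t_r in r:
        for t_s in s:
            # Implicit condition: equality on every shared attribute, ANDed together.
            if all(t_r[a] == t_s[a] for a in common):
                result.append({**t_r, **t_s})  # shared attributes appear once
    return result


# Hypothetical sample rows; DNUMBER is the only shared attribute.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Headquarters", "DNUMBER": 1},
]
DEPT_LOCATIONS = [
    {"DNUMBER": 1, "DLOCATION": "Houston"},
    {"DNUMBER": 5, "DLOCATION": "Bellaire"},
    {"DNUMBER": 5, "DLOCATION": "Houston"},
]

DEPT_LOCS = natural_join(DEPARTMENT, DEPT_LOCATIONS)
```

Merging the two dictionaries keeps a single DNUMBER entry, which mirrors how the natural join drops the superfluous second attribute.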

Example of NATURAL JOIN operation

Complete Set of Relational Operations

The set of operations {SELECT σ, PROJECT ∏, UNION ∪, DIFFERENCE –, RENAME ρ, CARTESIAN PRODUCT ×} is called a complete set, because any other relational algebra expression can be expressed by a combination of these operations.

For example:

R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))

R ⋈<join condition> S = σ<join condition> (R × S)
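Both identities can be checked on small examples. The sketch below assumes relations are Python sets of tuples; the sample data is hypothetical, and a check on one example illustrates the identities rather than proving them.

```python
from itertools import product

# INTERSECTION expressed with UNION and DIFFERENCE only.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}
via_identity = (R | S) - ((R - S) | (S - R))

# JOIN expressed as SELECT applied to a CARTESIAN PRODUCT.
DEPT = {("Research", 5), ("Administration", 4)}            # (DNAME, DNUMBER)
LOCS = {(5, "Houston"), (4, "Stafford"), (1, "Houston")}   # (DNUMBER, DLOCATION)
cartesian = set(product(DEPT, LOCS))                       # every pair of tuples
joined = {(d, l) for (d, l) in cartesian if d[1] == l[0]}  # keep matching pairs
```

The Cartesian product has 2 × 3 = 6 pairs, and the selection keeps only the pairs whose department numbers match.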

Recap of Relational Algebra Operations


Aggregate Functions and Grouping

A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database.

Examples of such functions include retrieving the average or total salary of all employees or the total number of employee tuples.

Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM.

The COUNT function is used for counting tuples or values.

Use of the Aggregate Functional operation ζ

ζ MAX Salary (EMPLOYEE) retrieves the maximum salary value from the EMPLOYEE relation

ζ MIN Salary (EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation

ζ SUM Salary (EMPLOYEE) retrieves the sum of the Salary from the EMPLOYEE relation

ζ COUNT SSN, AVERAGE Salary (EMPLOYEE) computes the count (number) of employees and their average salary
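The aggregate requests listed above can be sketched in Python, assuming the EMPLOYEE relation is a list of dictionaries. The sample rows are hypothetical, and COUNT is assumed to skip null (None) values.

```python
from statistics import mean

# Hypothetical sample rows for the EMPLOYEE relation.
EMPLOYEE = [
    {"SSN": "123456789", "SALARY": 30000},
    {"SSN": "333445555", "SALARY": 40000},
    {"SSN": "987654321", "SALARY": 26000},
]

salaries = [t["SALARY"] for t in EMPLOYEE]

max_salary = max(salaries)      # MAX Salary
min_salary = min(salaries)      # MIN Salary
total_salary = sum(salaries)    # SUM Salary

# COUNT SSN, AVERAGE Salary: count non-null SSN values and average the salaries.
employee_count = sum(1 for t in EMPLOYEE if t["SSN"] is not None)
average_salary = mean(salaries)
```

Each aggregate collapses a whole collection of values into a single value, which is why it cannot be written with the basic operations alone.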

Additional Relational Operations: Outer Join

The OUTER JOIN Operation

In NATURAL JOIN and EQUIJOIN, tuples without a matching (or related) tuple are eliminated from the join result

Tuples with null in the join attributes are also eliminated. This amounts to loss of information.

A set of operations, called OUTER joins, can be used when we want to keep all the tuples in R, or all those in S, or all those in both relations in the result of the join, regardless of whether or not they have matching tuples in the other relation.

The left outer join operation, denoted by ⟕, keeps every tuple in the first or left relation R in R ⟕ S; if no matching tuple is found in S, then the attributes of S in the join result are filled or “padded” with null values.

A similar operation, right outer join, denoted by ⟖, keeps every tuple in the second or right relation S in the result of R ⟖ S.

A third operation, full outer join, denoted by ⟗, keeps all tuples in both the left and the right relations when no matching tuples are found, padding them with null values as needed.

Left Outer Join

E.g. List all employees and the department they manage, if they manage a department.
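A left outer join for this request can be sketched as below, assuming relations are lists of dictionaries and using None for null. The function `left_outer_join`, its parameters, and the sample rows are hypothetical.

```python
def left_outer_join(r, s, condition, s_attributes):
    """Keep every tuple of r; pad the s attributes with None when nothing matches."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if condition(t_r, t_s)]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            padding = {a: None for a in s_attributes}
            result.append({**t_r, **padding})
    return result


# Hypothetical sample rows using attribute names from the text.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SSN": "123456789"},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
]
DEPARTMENT = [
    {"DNAME": "Research", "MGRSSN": "333445555"},
    {"DNAME": "Administration", "MGRSSN": "987654321"},
]

TEMP = left_outer_join(EMPLOYEE, DEPARTMENT,
                       lambda e, d: e["SSN"] == d["MGRSSN"],
                       s_attributes=["DNAME", "MGRSSN"])
```

John Smith manages no department, so his row is kept with DNAME and MGRSSN set to None. A right outer join could be obtained by swapping the roles of the two relations.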

Outer join

Left outer, right outer, and full outer join

Examples of Queries in Relational Algebra

Q1: Retrieve the name and address of all employees who work for the ‘Research’ department.

RESEARCH_DEPT ← σ DNAME=’Research’ (DEPARTMENT)

RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)

RESULT ← ∏ FNAME, LNAME, ADDRESS (RESEARCH_EMPS)

Q6: Retrieve the names of employees who have no dependents.

ALL_EMPS ← ∏ SSN (EMPLOYEE)

EMPS_WITH_DEPS(SSN) ← ∏ ESSN (DEPENDENT)

EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)

RESULT ← ∏ LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)
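The steps of Q6 map directly onto set operations. The sketch below assumes relations are lists of dictionaries and uses hypothetical sample rows; the variable names mirror the intermediate relations in the query.

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "123456789", "FNAME": "John", "LNAME": "Smith"},
    {"SSN": "333445555", "FNAME": "Franklin", "LNAME": "Wong"},
    {"SSN": "999887777", "FNAME": "Alicia", "LNAME": "Zelaya"},
]
DEPENDENT = [
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "123456789", "DEPENDENT_NAME": "Michael"},
]

ALL_EMPS = {t["SSN"] for t in EMPLOYEE}            # project SSN from EMPLOYEE
EMPS_WITH_DEPS = {t["ESSN"] for t in DEPENDENT}    # project ESSN, treated as SSN
EMPS_WITHOUT_DEPS = ALL_EMPS - EMPS_WITH_DEPS      # set difference

# Natural join back to EMPLOYEE on SSN, then project LNAME and FNAME.
RESULT = [(t["LNAME"], t["FNAME"]) for t in EMPLOYEE
          if t["SSN"] in EMPS_WITHOUT_DEPS]
```

The rename of ESSN to SSN in the query is what makes the two projections type compatible for the difference; in this sketch both sides are simply sets of SSN strings.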
