The Relational Data Model and Relational Database part2
PROJECT
The PROJECT operation is denoted by π (pi), written ∏ in the expressions here.
If we are interested in only certain attributes of a relation, we use PROJECT.
This operation keeps certain columns (attributes) from a relation and discards the other columns.
PROJECT creates a vertical partitioning
The list of specified columns (attributes) is kept in each tuple. The other attributes in each tuple are discarded.
Example: To list each employee’s first and last name and salary, the following is used:
∏ LNAME, FNAME, SALARY (EMPLOYEE)
Examples of applying SELECT and PROJECT operations
Single expression versus sequence of relational operations
We may want to apply several relational algebra operations one after the other. We can either write the operations as a single relational algebra expression by nesting them, or apply one operation at a time and create intermediate result relations. In the latter case, we must give names to the relations that hold the intermediate results.
To retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a SELECT and a PROJECT operation.
We can write a single relational algebra expression as follows:
∏FNAME, LNAME, SALARY(σ DNO=5(EMPLOYEE))
Or we can explicitly show the sequence of operations, giving a name to each intermediate relation:
DEP5_EMPS ← σDNO=5(EMPLOYEE)
RESULT ← ∏ FNAME, LNAME, SALARY (DEP5_EMPS)
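The two styles above can be sketched in Python to show that they give the same result. This is only an illustration: the helper functions `select` and `project` and the sample rows are assumptions made for the sketch, not part of the original notes.

```python
def select(relation, predicate):
    """SELECT (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT (pi): keep only the listed attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


# Hypothetical sample rows, invented for this sketch.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Joyce", "LNAME": "English", "SALARY": 25000, "DNO": 5},
]

# Single nested expression.
NESTED = project(
    select(EMPLOYEE, lambda e: e["DNO"] == 5),
    ["FNAME", "LNAME", "SALARY"],
)

# Sequence of operations with a named intermediate relation.
DEP5_EMPS = select(EMPLOYEE, lambda e: e["DNO"] == 5)
RESULT = project(DEP5_EMPS, ["FNAME", "LNAME", "SALARY"])

print(RESULT == NESTED)
```

Note that `project` removes duplicates, because the result of PROJECT is a relation and therefore a set of tuples.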
Example of applying multiple operations and RENAME
RENAME
The RENAME operator is denoted by ρ (rho)
In some cases, we may want to rename the attributes of a relation, the relation name, or both.
RENAME is useful when a query requires multiple operations, and it is necessary in some cases (see the JOIN operation later).
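The general RENAME operation ρ can be expressed in any of the following forms:
ρ S(B1, B2, ..., Bn)(R) changes both the relation name to S and the attribute names to B1, B2, ..., Bn
ρ S(R) changes the relation name only, to S
ρ (B1, B2, ..., Bn)(R) changes the attribute names only, to B1, B2, ..., Bn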
Relational Algebra Operations from Set Theory
• Union
• Intersection
• Minus
• Cartesian Product
UNION
It is a binary operation, denoted by ∪
The result of R ∪ S is a relation that includes all tuples that are either in R or in S or in both R and S
Duplicate tuples are eliminated
The two operand relations R and S must be “type compatible” (or UNION compatible)
R and S must have same number of attributes
Each pair of corresponding attributes must be type compatible (have same or compatible domains)
Example:
To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below)
Example of the result of a UNION operation
INTERSECTION
INTERSECTION is denoted by ∩
The result of the operation R ∩ S is a relation that includes all tuples that are in both R and S
The attribute names in the result will be the same as the attribute names in R
The two operand relations R and S must be “type compatible”
SET DIFFERENCE
SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by –
The result of R – S is a relation that includes all tuples that are in R but not in S
The attribute names in the result will be the same as the attribute names in R
The two operand relations R and S must be “type compatible”
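The three set operations can be tried out with Python's built-in sets, which also eliminate duplicates automatically. This is a minimal sketch: the relations `STUDENT` and `INSTRUCTOR`, their contents, and the helper `same_degree` are assumptions for illustration, and the compatibility check covers only the number of attributes, not their domains.

```python
# Hypothetical relations; each tuple is (first name, last name).
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Johnny", "Kohler")}
INSTRUCTOR = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}


def same_degree(r, s):
    """Partial type-compatibility check: all tuples have the same number of attributes."""
    return len({len(t) for t in r | s}) <= 1


assert same_degree(STUDENT, INSTRUCTOR)

UNION = STUDENT | INSTRUCTOR          # tuples in either relation, duplicates removed
INTERSECTION = STUDENT & INSTRUCTOR   # tuples in both relations
STUDENT_ONLY = STUDENT - INSTRUCTOR   # tuples in STUDENT but not in INSTRUCTOR
INSTRUCTOR_ONLY = INSTRUCTOR - STUDENT

print(sorted(UNION))
print(sorted(INTERSECTION))
print(sorted(STUDENT_ONLY), sorted(INSTRUCTOR_ONLY))
```

The last two results differ, which shows that set difference depends on the order of its operands.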
Example to illustrate the result of UNION, INTERSECT, and DIFFERENCE
Some properties of UNION, INTERSECT, and DIFFERENCE
Notice that both union and intersection are commutative operations; that is, R ∪ S = S ∪ R and R ∩ S = S ∩ R
The following query results refer to this database state
Example of applying CARTESIAN PRODUCT
Binary Relational Operations
• Division
• Join
Division
Interpretation of the division operation A/B:
- Divide the attributes of A into two sets: A1 and A2.
- Divide the attributes of B into two sets: B2 and B3.
- The sets A2 and B2 have the same attributes.
- Take the set of B2 values that appear in B.
- Search A for the groups of rows (each group having the same A1 values) whose A2 values, taken together, include every one of those B2 values.
- For each group of rows in A that satisfies this search, pick out its A1 values and put them in the answer.
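The interpretation above can be sketched as a small Python function. The function name `divide`, its parameters, and the sample relations `WORKS_ON` and `PROJECT_LIST` are assumptions made for this illustration; the sketch treats B as containing only the shared attributes.

```python
def divide(a, b, a1, shared):
    """Return the A1 values that are paired in A with every tuple of B.

    a1 lists the attributes kept in the answer; shared lists the attributes
    that A and B have in common.
    """
    required = {tuple(row[name] for name in shared) for row in b}
    groups = {}
    for row in a:
        key = tuple(row[name] for name in a1)
        groups.setdefault(key, set()).add(tuple(row[name] for name in shared))
    # Keep a group only if its shared values cover everything required by B.
    return [dict(zip(a1, key)) for key, values in groups.items() if required <= values]


# Hypothetical sample rows, invented for this sketch.
WORKS_ON = [
    {"ESSN": "111", "PNO": 1},
    {"ESSN": "111", "PNO": 2},
    {"ESSN": "222", "PNO": 1},
    {"ESSN": "333", "PNO": 1},
    {"ESSN": "333", "PNO": 2},
    {"ESSN": "333", "PNO": 3},
]
PROJECT_LIST = [{"PNO": 1}, {"PNO": 2}]

# Employees who work on every project in PROJECT_LIST.
ANSWER = divide(WORKS_ON, PROJECT_LIST, ["ESSN"], ["PNO"])
print(ANSWER)
```

Employee 222 is excluded because project 2 is missing from that group, while employee 333 qualifies even though the group also contains an extra project.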
JOIN
The JOIN operation is denoted by ⋈
The sequence of CARTESIAN PRODUCT followed by SELECT is used quite commonly to identify and select related tuples from two relations; JOIN combines this sequence into a single operation
This operation is very important for any relational database with more than a single relation, because it allows us to combine related tuples from various relations
The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈ <join condition> S
where R and S can be any relations that result from general relational algebra expressions.
Example: Suppose that we want to retrieve the name of the manager of each department.
To get the manager’s name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple.
DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
Example of applying the JOIN operation
The general case of JOIN operation is called a Theta-join:
R ⋈ theta S
The join condition is called theta
Theta can be any general boolean expression on the attributes of R and S; for example:
R.Ai<S.Bj AND (R.Ak=S.Bl OR R.Ap<S.Bq)
EQUIJOIN
The most common use of join involves join conditions with equality comparisons only.
Such a join, where the only comparison operator used is =, is called an EQUIJOIN.
The JOIN seen in the previous example was an EQUIJOIN
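A join can be sketched in Python as a Cartesian product followed by a selection. The function `theta_join` and the sample rows are assumptions for illustration, and the sketch assumes the two relations have no attribute names in common.

```python
from itertools import product


def theta_join(r, s, condition):
    """Pair every tuple of r with every tuple of s, then keep the pairs
    that satisfy the join condition (theta)."""
    return [{**x, **y} for x, y in product(r, s) if condition(x, y)]


# Hypothetical sample rows, invented for this sketch.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333445555"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987654321"},
]
EMPLOYEE = [
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321"},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "123456789"},
]

# EQUIJOIN: the condition uses only the = comparison.
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])

for row in DEPT_MGR:
    print(row["DNAME"], row["FNAME"], row["LNAME"])
```

Each result tuple carries both MGRSSN and SSN with identical values, which is the redundancy that an EQUIJOIN always produces.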
NATURAL JOIN
Another variation of JOIN, called NATURAL JOIN, is denoted by *
It was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.
For example: Q ← R(A,B,C,D) * S(C,D,E)
The implicit join condition includes each pair of attributes with the same name, “AND”ed together:
R.C=S.C AND R.D = S.D
Result keeps only one attribute of each such pair: Q(A,B,C,D,E)
Example: To apply a natural join on the DNUMBER attributes of DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:
DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
The only attribute with the same name in both relations is DNUMBER
An implicit join condition is created based on this attribute: DEPARTMENT.DNUMBER=DEPT_LOCATIONS.DNUMBER
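The implicit condition of a natural join can be sketched in Python by comparing every attribute name the two relations share. The function `natural_join` and the sample rows for R and S are assumptions for illustration only.

```python
def natural_join(r, s):
    """Join r and s on all attributes with the same name, keeping one copy of each."""
    if not r or not s:
        return []
    common = [name for name in r[0] if name in s[0]]
    result = []
    for x in r:
        for y in s:
            # Implicit condition: equality on every common attribute, ANDed together.
            if all(x[name] == y[name] for name in common):
                result.append({**x, **y})
    return result


# Hypothetical rows for R(A,B,C,D) and S(C,D,E).
R = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 5, "B": 6, "C": 7, "D": 8},
]
S = [
    {"C": 3, "D": 4, "E": 9},
    {"C": 3, "D": 0, "E": 10},
]

Q = natural_join(R, S)
print(Q)
```

Only the pair that agrees on both C and D survives, and the result has the attributes A, B, C, D, E with a single copy of C and D.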
Example of NATURAL JOIN operation
Complete Set of Relational Operations
The set of operations including SELECT σ, PROJECT ∏, UNION ∪, DIFFERENCE –, RENAME ρ, and CARTESIAN PRODUCT × is called a complete set, because any other relational algebra expression can be expressed by a combination of these operations.
For example:
R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))
R ⋈ <join condition> S = σ <join condition> (R × S)
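The first identity can be checked on a small example with Python sets. The relations `R` and `S` below are hypothetical, and the check illustrates the identity on one instance rather than proving it.

```python
# Hypothetical type-compatible relations; each tuple holds one SSN value.
R = {("111",), ("222",), ("333",)}
S = {("222",), ("333",), ("444",)}

# Intersection built only from union and difference.
VIA_UNION_AND_DIFFERENCE = (R | S) - ((R - S) | (S - R))

print(VIA_UNION_AND_DIFFERENCE == (R & S))
```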
Recap of Relational Algebra Operations
Aggregate Functions and Grouping
A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database.
Examples of such functions include retrieving the average or total salary of all employees or the total number of employee tuples.
Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM.
The COUNT function is used for counting tuples or values.
Use of the Aggregate Functional operation ζ
ζ MAX Salary (EMPLOYEE) retrieves the maximum salary value from the EMPLOYEE relation
ζ MIN Salary (EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation
ζ SUM Salary (EMPLOYEE) retrieves the sum of the Salary from the EMPLOYEE relation
ζ COUNT SSN, AVERAGE Salary (EMPLOYEE) computes the count (number) of employees and their average salary
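The aggregate functions listed above map directly onto Python built-ins applied to a collection of values. The sample `EMPLOYEE` rows are hypothetical and exist only to make the sketch runnable.

```python
# Hypothetical sample rows, invented for this sketch.
EMPLOYEE = [
    {"SSN": "123456789", "Salary": 30000},
    {"SSN": "333445555", "Salary": 40000},
    {"SSN": "987654321", "Salary": 26000},
]

salaries = [e["Salary"] for e in EMPLOYEE]

MAX_SALARY = max(salaries)                       # MAX Salary
MIN_SALARY = min(salaries)                       # MIN Salary
SUM_SALARY = sum(salaries)                       # SUM Salary
COUNT_SSN = sum(1 for e in EMPLOYEE if e["SSN"] is not None)  # COUNT SSN
AVERAGE_SALARY = SUM_SALARY / len(salaries)      # AVERAGE Salary

print(MAX_SALARY, MIN_SALARY, SUM_SALARY, COUNT_SSN, AVERAGE_SALARY)
```

Each aggregate reduces the whole collection to a single value, which is why these requests need an operation beyond the basic relational algebra.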
Additional Relational Operations Outer Join
The OUTER JOIN Operation
In NATURAL JOIN and EQUIJOIN, tuples without a matching (or related) tuple are eliminated from the join result
Tuples with null in the join attributes are also eliminated.
This amounts to loss of information.
A set of operations, called OUTER joins, can be used when we want to keep all the tuples in R, or all those in S, or all those in both relations in the result of the join, regardless of whether or not they have matching tuples in the other relation.
The left outer join operation, denoted by ⟕, keeps every tuple in the first or left relation R in R ⟕ S; if no matching tuple is found in S, then the attributes of S in the join result are filled or “padded” with null values.
A similar operation, right outer join, denoted by ⟖, keeps every tuple in the second or right relation S in the result of R ⟖ S.
A third operation, full outer join, denoted by ⟗, keeps all tuples in both the left and the right relations; when no matching tuples are found, they are padded with null values as needed.
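The three outer joins can be sketched in Python, with `None` standing in for null. The function names, their parameters, and the sample rows are assumptions made for this illustration; the attribute lists are passed in explicitly so that padding works even when a relation is empty.

```python
def left_outer_join(r, s, condition, s_attributes):
    """Keep every tuple of r; pad the attributes of s with None when nothing matches."""
    padding = {name: None for name in s_attributes}
    result = []
    for x in r:
        matches = [{**x, **y} for y in s if condition(x, y)]
        result.extend(matches if matches else [{**x, **padding}])
    return result


def right_outer_join(r, s, condition, r_attributes):
    """Keep every tuple of s; this is a left outer join with the operands swapped."""
    return left_outer_join(s, r, lambda y, x: condition(x, y), r_attributes)


def full_outer_join(r, s, condition, r_attributes, s_attributes):
    """Keep every tuple of both relations, padding whichever side has no match."""
    result = left_outer_join(r, s, condition, s_attributes)
    padding = {name: None for name in r_attributes}
    for y in s:
        if not any(condition(x, y) for x in r):
            result.append({**padding, **y})
    return result


# Hypothetical sample rows, invented for this sketch.
EMPLOYEE = [{"SSN": "1", "LNAME": "Wong"}, {"SSN": "2", "LNAME": "Smith"}]
DEPARTMENT = [
    {"DNAME": "Research", "MGRSSN": "1"},
    {"DNAME": "Headquarters", "MGRSSN": None},
]


def manages(e, d):
    return e["SSN"] == d["MGRSSN"]


LEFT = left_outer_join(EMPLOYEE, DEPARTMENT, manages, ["DNAME", "MGRSSN"])
RIGHT = right_outer_join(EMPLOYEE, DEPARTMENT, manages, ["SSN", "LNAME"])
FULL = full_outer_join(EMPLOYEE, DEPARTMENT, manages, ["SSN", "LNAME"], ["DNAME", "MGRSSN"])

print(LEFT)
print(RIGHT)
print(FULL)
```

In `LEFT`, Smith appears with a padded department; in `RIGHT`, Headquarters appears with a padded employee; `FULL` contains both padded tuples.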
Left Outer Join
E.g. List all employees and the department they manage, if they manage a department.
Outer join
Left outer, right outer, and full outer join
Examples of Queries in Relational Algebra
• Q1: Retrieve the name and address of all employees who work for the ‘Research’ department.
RESEARCH_DEPT ← σ DNAME=’Research’ (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)
RESULT ← ∏ FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
• Q6: Retrieve the names of employees who have no dependents.
ALL_EMPS ← ∏ SSN (EMPLOYEE)
EMPS_WITH_DEPS(SSN) ← ∏ ESSN (DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS - EMPS_WITH_DEPS)
RESULT ← ∏ LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)
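Query Q6 can be traced step by step in Python, using a set difference followed by a lookup that plays the role of the natural join on SSN. The sample `EMPLOYEE` and `DEPENDENT` rows are hypothetical and were invented for this sketch.

```python
# Hypothetical sample rows, invented for this sketch.
EMPLOYEE = [
    {"SSN": "123456789", "FNAME": "John", "LNAME": "Smith"},
    {"SSN": "333445555", "FNAME": "Franklin", "LNAME": "Wong"},
    {"SSN": "999887777", "FNAME": "Alicia", "LNAME": "Zelaya"},
]
DEPENDENT = [
    {"ESSN": "123456789", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "333445555", "DEPENDENT_NAME": "Theodore"},
    {"ESSN": "333445555", "DEPENDENT_NAME": "Joy"},
]

# Projections onto SSN and ESSN; building sets removes duplicates.
ALL_EMPS = {e["SSN"] for e in EMPLOYEE}
EMPS_WITH_DEPS = {d["ESSN"] for d in DEPENDENT}  # ESSN renamed to SSN

# Set difference: employees whose SSN never appears in DEPENDENT.
EMPS_WITHOUT_DEPS = ALL_EMPS - EMPS_WITH_DEPS

# Natural join on SSN with EMPLOYEE, then projection onto LNAME and FNAME.
RESULT = {(e["LNAME"], e["FNAME"]) for e in EMPLOYEE if e["SSN"] in EMPS_WITHOUT_DEPS}

print(RESULT)
```

The difference is taken on SSN values rather than on names, so two employees with the same name would still be handled correctly before the final projection.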