relational-algebra

Definition

Relational Algebra

Relational algebra is a theory that uses algebraic structure for modeling data and defining queries on it with well founded semantics. It was introduced by Edgar Frank Codd.

Domain

A domain is a set of possible values for an attribute.

Relation

Definition

Relation

Let be domains. A relation is a tuple based on the cartesian product of multiple, possibly identical, domains.

Example:

Link to original

Relation Schema

Definition

Schema (Relational Algebra)

A schema of a relation describes the structure of data in a relation and is denoted as or as .

The notation is defined as follows:

where is an attribute of with domain .

Example:

Link to original

Arity

Arity

The number of attributes of a relation is called arity.

Attribute

Operations

Selection

The unary selection operator selects tuples of relation based on the given conditions and maps them to a relation with the same schema but with only the selected tuples.

where are conditions of which tuples (rows) are selected.

Projection

The unary project operator takes all attributes of that are specified in the projection’s index, meaning .

Note that the projection operator allows simple arithmetic operations on attributes, e.g.:

Rename

Renames the relation to .

To select an attribute name of renamed relation S, we denote:

Grouping

where are the attributes to group by and are the aggregation functions to apply per group.

Example: Given a relation :

To retrieve the number of students per semester, we denote:

Union

Union Compatibility

Two relations and are compatible if they have the same number of attributes with the same same domains, whereby the attribute names are not considered.

Intersection

Join

Joins are a way to associate two relations and based on certain conditions. See joins.

Division

Joins

Commutativity: due to different output schemas.

Theta Join

The theta join links two relations and based on the given condition .

In theta joins, only tuples of both relations that can be joined are kept.

Notation:

Equi Join: A theta join is called equi join if the theta condition only checks for equality.

Natural Join

The natural join links two relations and based on equality of equally named attributes.

Notation:

In natural joins, only tuples of both relations that can be joined are kept.

Outer Joins

Outer joins are about how to handle tuples that cannot be joined.

Left Outer Join

In left outer joins , the left tuples of relation are kept even if they cannot be joined with the right tuples of relation .

Notation:

Right Outer Join

In right outer joins , the right tuples of relation are kept even if they cannot be joined with the left tuples of relation .

Notation:

Full Outer Join

In full outer joins , the tuples of relation and are kept even if they cannot be joined.

Notation:

Semi Joins

Left Semi Join

The left semi join maps all tuples of that could be joined with at least a tuple in .

The schema of .

Right Semi Join

The right semi join maps all tuples of that could be joined with at least a tuple in .

The schema of .

Aggregation Functions

Count

Counts the number of elements per group.

Sum

Sums up the values of an attribute per group.

Min

Finds the minimum value of an attribute per group.

Max

Finds the maximum value of an attribute per group.

Avg

Calculates the average value of an attribute per group.