relational-algebra

Definition

Relational Algebra

Relational algebra is a theory that uses algebraic structure for modeling data and defining queries on it with well founded semantics. It was introduced by Edgar Frank Codd.

Domain

A domain is a set of possible values for an attribute.

Relation

Definition

Relation (Relational Algebra)

Let be domains. A relation is a set of tuples based on the Cartesian product of multiple, possibly identical, domains.

Example:

Link to original

Relation Schema

Definition

Schema (Relational Algebra)

A schema or of a relation describes the structure of data in a relation and is denoted as or as .

The notation is defined as follows:

where is an attribute of with domain .

Example:

Link to original

Arity

Arity

The number of attributes of a relation is called arity.

Attribute

Operations

Definition

Selection (Relational Algebra)

The selection operation performs a filter on a relation and maps it to a new filtered relation , i.e.

Link to original

Projection

Definition

Projection (Relational Algebra)

The projection operation selects a subset of attribute from a relation and maps it to a new relation containing only those attributes, i.e.

Where is a new tuple containing only the values from the attribute specified in the list for a given tuple . Since the resulting relation is a set, any duplicate tuple that arise from this operation are automatically eliminated.

Note that there’s also the possibility for extended projection, meaning specifying expressions instead of attributes, i.e.:

Link to original

Rename

Definition

Rename (Relational Algebra)

The rename operation, denoted by , modifies the schema of a relation without altering the data within its tuples. It produces a new relation with either a new name for the relation itself or new names for its attributes. It is used in two primary forms:

  1. Renaming a Relation
    The expression renames the relation to . The new relation has the identical schema and contains the exact same set of tuples as .

  2. Renaming Attributes
    The expression produces a new relation where the attribute from is renamed to . This can be extended to rename multiple attributes at once, e.g., .

The values and the number of tuples remain unchanged, only the attribute names in the schema are modified.

Link to original

Grouping and Aggregation

Definition

Grouping and Aggregation (Relational Algebra)

The grouping and aggregation operation, denoted by , is an operator that first partitions tuples of a relation into groups and then computes a single or multiple summary value for each group using an aggregate function.

This process involves two main steps:

  1. Grouping: Tuples that have the same values for all attributes in a specified grouping list are placed into the same group.
  2. Aggregation: An aggregate function (like SUM, COUNT, AVG, MIN, MAX) is applied to an attribute of the tuples within each group, producing a single value per group.

where:

  • is the list of attributes to group by
  • is the aggregate function to be applied to each group. This also defines the new attribute for the aggregate value in the result schema (e.g., )
  • is the input relation

The resulting relation consists of the grouping attributes in and the new attribute(s) generated by the aggregate function(s) in .

Link to original

Union

Definition

Union (Relational Algebra)

The union operation between two union compatible relations is the set of all tuples contained by or , i.e.

Link to original

Intersection

Definition

Intersection (Relational Algebra)

The intersection operation between two union compatible relations is the set of all tuples contained by both and , i.e.

Link to original

Difference

Definition

Difference (Relational Algebra)

The difference operation between two union compatible relations is the set of all tuples that are in but not in , i.e.

Link to original

Join

Definition

Join (Relational Algebra)

Joins are a way to associate two relations and based on certain conditions.

Link to original

Division

Definition

Relational Division (Relational Algebra)

The relational division operation, denoted by , is typically used to answer queries that involve the phrase “for all”. It finds all tuples in one relation that are associated with every single tuple from another relation .

The division operation can be expressed using projection, difference, and the Cartesian Product:


Link to original

Joins

Commutativity: due to different output schemas.

Theta Join

Definition

Theta Join (Relational Algebra)

The theta join combines two relations via a joining predicate by combining only tuples for which holds.

Link to original

Natural Join

Definition

Natural Join (Relational Algebra)

The natural join combines two relations via common attributes (same names and domains) by combining only tuples with the same values for common attributes.

Given two relations and , the natural join can be expressed using projection, selection and the Cartesian Product:

Link to original

Outer Joins

Definition

Outer Join

Outer joins deal with unmatched tuples without matching join partners in the other relation (“dangling tuples”).

Link to original

Left Outer Join

Definition

Left Outer Join

In left outer joins , the left tuples of relation are kept even if they cannot be joined with the right tuples of relation .

Link to original

Right Outer Join

Definition

Right Outer Join (Relational Algebra)

In right outer joins , the right tuples of relation are kept even if they cannot be joined with the left tuples of relation .

Link to original

Full Outer Join

Definition

Full Outer Join (Relational Algebra)

In full outer joins , the tuples of relation and are kept even if they cannot be joined.

Link to original

Semi Joins

Definition

Semi Join (Relational Algebra)

A semi join finds all tuples in a relation for which there are matching tuples in the other relation.

Link to original

Left Semi Join

Definition

Left Semi Join (Relational Algebra)

The left semi join maps all tuples of that could be joined with at least a tuple in .

The schema of .

Link to original

Right Semi Join

Definition

Right Semi Join (Relational Algebra)

The right semi join maps all tuples of that could be joined with at least a tuple in .

The schema of .

Link to original