Relational Algebra Expressions

Relational Algebra

Basic:

$R$ $\sigma_{C}(R)$
$\Pi_{(a_1\ldots,a_k )}(R)$
$R_1\times R_2$
$R_1\cup R_2$
$R_1-R_2$
$\rho_{S(a_1,\ldots,a_n )}(R)$

Derived:

$R_1\bowtie R_2$
$R_1 \bowtie_\theta R_2$
$R_1\cap R_2$

Additional:

$\gamma_{(a,f(b))}(R)$
$\delta(R)$

Building Complex Expressions

Algebras allow us to express sequences of operations in a natural way.

Example:

  • in arithmetic algebra: (x + 4)*(y – 3)
  • in stack “algebra”: T.push(S.pop())

Relational algebra allows the same.

Three notations, just as in arithmetic:

  1. Sequences of assignment statements.
  2. Expressions with several operators.
  3. Expression trees.

Sequences of Assignments

Create temporary relation names. Renaming can be implied by giving relations a list of attributes.

Example: $R_3 = R_1 \bowtie_\theta R_2$ can be written:
$R_4=R_1\times R_2$
$R_3 =\sigma_\theta (R_4)$

Find Salespersons who have a commission less than or equal to 11 and are managed by manager with id of 44.

$R=\sigma_{(\textrm{“comm”}<11\wedge\textrm{“manager_id”}=44)} (\textrm{Salesperson})$

Find Names of Customers who bought an item on February 12th.

$R = \textrm{Sale}\bowtie\textrm{Customer}$
$S= \sigma_{(\textrm{date=02/12})} (R)$
$T =\Pi_{(\textrm{cust_name})} (S)$

Expressions with several operators

Example: the theta-join $R_3 = R_1 \bowtie_\theta R_2$ can be written $R_3 =\sigma_\theta (R_1\times R_2)$

Precedence of relational operators:

  1. Unary operators — select, project, rename — have highest precedence, bind first.
  2. Then come products and joins.
  3. Then intersection.
  4. Finally, union and set difference bind last.

But you can always insert parentheses to force the order you desire.

Find Products made in Peru

$R=\Pi_{(\textrm{prod_desc})} (\textrm{Product}\bowtie(\sigma_{(\textrm{country=’Peru’})} \textrm{Manufacturer}))$

Expression Trees

  • Leaves are operands — either variables standing for relations or particular, constant relations.
  • Interior nodes are operators, applied to their child or children.

Find Salespersons who sold something to Customers with less than 50000 in the bank.

$\Pi_{(\textrm{salpers_name})} (\textrm{Salesperson}\bowtie(\textrm{Sale}\bowtie\sigma_{(\textrm{curbal}<50000)} \textrm{Customers}))$

Example: Find all manufacturer ids that sell a product for less than $25 made in Peru.

Find Customers or Salespersons who live in Tokyo

$\rho_{\textrm{name}} (\Pi_{\textrm{cust_name}} (\sigma_{(\textrm{city =’Tokyo’})}(\textrm{Customer}))) \cup \rho_\textrm{name} (\Pi_{\textrm{salpers_name}} (\sigma_{(\textrm{office=’Tokyo’})}(\textrm{Salesperson})))$

Operations on Bags

(and why we care)

Relational Algebra uses set-semantics, which is why $\Pi$ eliminates duplicate tuples.

Databases actually use bag-semantics:

Union: $\{a,b,b,c\}\cup\{a,b,b,b,e,f,f\} = \{a,a,b,b,b,b,b,c,e,f,f\}$
(add the number of occurrences)

Difference: $\{a,b,b,b,c,c\}-\{b,c,c,c,d\} = \{a,b,b\}$
(subtract the number of occurrences)

Intersection: $\{a,b,b,b,c,c\}\cap\{b,b,c,c,c,c,d\} = \{b,b,c,c\}$
(minimum of the two numbers of occurrences)

Selection: preserve the number of occurrences

Projection: preserve the number of occurrences (no duplicate elimination in real databases)

Cartesian product, join: no duplicate elimination

Summary of Relational Algebra

Why bother? Can write any RA expression directly in C++/Java, seems easy.

Two reasons:

  1. Each operator admits sophisticated implementations (think of $\bowtie$,$\sigma_theta$)
  2. Expressions in relational algebra can be rewritten: optimized