Entity Relationship Diagrams

GOAL – How do we model our universe?

Three types of elements:

  • Entity sets
  • Attributes
  • Relationships

Suppose you are working on a project

Step 0: pick an application domain
  • we will talk about this later
Step 1: conceptual design
  • discuss with your team mates what to model in the application domain
  • need a modeling language to express what you want
  • ER model is the most popular such language
  • output: an ER diagram of the app. Domain
Step 2: pick a type of DBMS
  • relational DBMS is most popular and is our focus
Step 3: translate ER design to a relational schema
  • use a set of rules to translate from ER to rel. schema
  • use a set of schema refinement rules to transform the above rel. schema into a good rel. schema

At this point you have a good relational schema on paper

Subsequent steps include
  • implement your relational DBMS using a “database programming language” called SQL
  • ordinary users cannot interact with the database directly
  • and the database also cannot do everything you want
  • hence write your application program in C++, Java, Perl, etc to handle the interaction and take care of things that the database cannot do

So, the first thing we should start with is to learn ER model

How do we specify: what information the DB must hold? And what are the relationships among components of that information?

What we will cover

  • basic things
  • constraints
  • weak entity sets
  • design principles

Entities

  • real-world objects distinguishable from other objects
  • described using a set of attributes

Attributes

  • each has an atomic domain: string, integers, reals, etc.

Entity set

  • a collection of similar entities

Relationships

A mathematical definition:

if $A$, $B$ are sets then relation $R$ is a subset of $A \times B$

$A=\{1,2,3\}, B=\{a,b,c,d\}$, then $R = \{(1,a),(1,c),(3,b)\}$

Makes is a subset of Product $\times$ Company:

Relationship cardinality

Important Note: “one” means at most one, could be zero

What are some examples of each case:

  • One-to-one
  • Many-to-one/one-to-many
  • Many-to-many

Multiway relationship

There can be attributes on a relationship

Decision: Product — as an entity set or an attribute?

Can we model this as a mathematical set? How?

Convert to Binary:

Roles in Relationships

What if we need an entity set twice in one relationship?

Arrows in multi-way relationships?

What does the arrow mean?

Only one salesperson can be on a purchase. But many buyers can participate in a purchase.

Constraints

A constraint = an assertion about the database that must be true at all times

Very important part of the database schema

Finding constraints is part of the modeling process. Commonly used constraints:

  • Keys: social security number uniquely identifies a person.
  • Single-value constraints:  a person can have only one mother.
  • Referential integrity constraints: if you work for a company, it must exist in the database.
  • Domain constraints:  peoples’ ages are between 0 and 150.
  • General constraints: all others (at most 50 students enroll in a class)

Constraints give more semantics to the data – help us better understand it

Allow us to refer to entities (e.g., using keys)

Enable efficient storage, data lookup, etc.

Keys

A key is the thing that uniquely identifies an entity within an entityset.

Underline to refer to keys:

What are the keys?

Every entity set must have a key

Why?

A key can consist of more than one attribute, we call these multi attribute keys.

There can be more than one key for an entity set, called candidate keys: one key will be designated as primary key

Referential Integrity

Exactly one value exists in a given role

The Referential Integrity Constraint on relationships explicitly requires a reference to exist e.g., no “dangling pointers” like in C++

Many to one, but one can also be zero.

Referential integrity makes sure that a product is made by one and only one company, eg.

Draw ER diagram for the entities: Movies, Studios, CEOs

Assumptions: Every CEO runs a Studio, if a studio ceases to exist, then the CEO ceases to be CEO. If a CEO is fired or resigns, then the Studio still exists.