When designing a database, the first question is simple but profound:
How do we model our universe?
We want to represent what exists, what properties those things have, and how they are related. ER modeling provides a formal, systematic way to do this.
Three Key Components
All ER models revolve around three types of elements:
- Entity Sets – collections of real-world objects or concepts
- Attributes – properties that describe entities or relationships
- Relationships – associations between entities
Steps in Database Design
When working on a project, we typically follow a multi-step process:
Step 0: Pick an application domain
- What is the world we are modeling? A library, a university, a movie studio, etc.
Step 1: Conceptual design
- Discuss with your team what to model in this domain
- Use a modeling language to express what information the database must hold
- ER Model is the most popular language for conceptual design
- Output: an ER diagram representing your domain
Step 2: Pick a DBMS type
- Most common: Relational DBMS, which will be the focus of this course
Step 3: Translate ER design to a relational schema
- Use standard rules to convert entities, relationships, and attributes into tables
- Refine the schema to reduce redundancy and avoid anomalies
Subsequent steps:
- Implement the schema in a database using SQL
- Write an application program (C++, Java, Python, etc.) to handle user interaction and logic that the database cannot enforce
The ER Model in Detail
Entities
- A real-world object that is distinguishable from other objects
- Represented as a rectangle in an ER diagram
- Described using a set of attributes
Examples:
Student
,Course
,Instructor
,Movie
,Studio
Attributes

- Each attribute has an atomic domain: integer, string, real, date, etc.
- Attributes describe entities or relationships
- Keys: an attribute (or combination) that uniquely identifies an entity
Examples of attributes for Movie
:
- Title, ReleaseYear, Genre, Rating
Entity Sets
- A collection of similar entities
- Example: the set of all
Students
or allMovies
Relationships
- Define associations between entities
- Mathematically: if $A$ and $B$ are sets, a relationship $R \subseteq A \times B$
- Example:
- $A = {1,2,3}$ (students), $B = {a,b,c,d}$ (courses)
- $R = {(1,a),(1,c),(3,b)}$
- Relationships can have attributes: e.g.,
Enrollment
hasGrade
andTerm

Makes is a subset of Product $\times$ Company:
Relationship Cardinality
- Cardinality expresses the number of instances of one entity that can be associated with another
Types:

Important Note: “one” means at most one, could be zero
What are some examples of each case:
- One-to-one
- Many-to-one/one-to-many
- Many-to-many
Multiway relationship

There can be attributes on a relationship

Decision: Product — as an entity set or an attribute?

Can we model this as a mathematical set? How?
Convert to Binary:

Roles in Relationships
What if we need an entity set twice in one relationship?

Arrows in multi-way relationships?
What does the arrow mean?
Only one salesperson can be on a purchase. But many buyers can participate in a purchase.
Constraints
Constraints restrict the data in the database and give meaning to the schema. They are essential for correct design.
Common constraints include:
- Keys: uniquely identify entities
- E.g.,
SSN
uniquely identifies a person
- E.g.,
- Single-value constraints: an entity attribute can have only one value
- E.g., a person has only one mother
- Referential integrity constraints: ensure referenced entities exist
- E.g., a Product must be made by an existing Company
- Domain constraints: restrict values of attributes
- E.g., Age $\in [0, 150]$
- General constraints: other rules
- E.g., a course may have at most 50 students
Constraints improve:
- Data integrity
- Query efficiency
- Clarity of the domain
Keys
- Primary key: uniquely identifies an entity
- Candidate key: multiple attributes that could serve as the key
- Multi-attribute keys: keys composed of more than one attribute
Underline to refer to keys:

What are the keys?

Every entity set must have a key
Why?
A key can consist of more than one attribute, we call these multi attribute keys.
There can be more than one key for an entity set, called candidate keys: one key will be designated as primary key
Referential Integrity
- Ensures references in relationships are valid
- Example: Each
Product
is produced by exactly one Company - Prevents “dangling pointers” like in traditional programming

Many to one, but one can also be zero.
Referential integrity makes sure that a product is made by one and only one company, eg.

Draw ER diagram for the entities: Movies, Studios, CEOs

Assumptions: Every CEO runs a Studio, if a studio ceases to exist, then the CEO ceases to be CEO. If a CEO is fired or resigns, then the Studio still exists.