Welcome!
Please look over the Syllabus. We will be using Canvas and Gradescope to manage the class. Also homework assignments will be posted to this site.
What is a Database?
System that stores data
What databases can you name?
Smart people think about how data is organized.
There is rigor and principles behind this. We will talk about those principles in this class
Let’s consider a simple table:
Student Name | SID | Major | Course | CRN | Term | Instructor |
---|---|---|---|---|---|---|
Alice | 9019958 | CS | CSE30246 | 12028 | F24 | Weninger |
Robert | 9025547 | CPEG | CSE40497 | 27364 | S23 | Weninger |
Felicity | 9015863 | CS | CSE30246 | 12028 | F22 | Weninger |
James | 9018654 | EE | CSE60246 | 12029 | F24 | Weninger |
What happens if James takes another course?
Student Name | SID | Major | Course | CRN | Term | Instructor |
---|---|---|---|---|---|---|
Alice | 9019958 | CS | CSE30246 | 12028 | F24 | Weninger |
Robert | 9025547 | CPEG | CSE40497 | 27364 | S23 | Weninger |
Felicity | 9015863 | CS | CSE30246 | 12028 | F22 | Weninger |
James | 9018654 | EE | CSE60246 | 12029 | F24 | Weninger |
James | 9018654 | EE | CSE30332 | 11013 | F24 | Morrison |
Lots of duplicate information inserted. CRN magically appears, Prof. McMillan apparently gets hired.
What happens if James switches from EE to CPEG?
Student Name | SID | Major | Course | CRN | Term | Instructor |
---|---|---|---|---|---|---|
Alice | 9019958 | CS | CSE30246 | 12028 | F24 | Weninger |
Robert | 9025547 | CPEG | CSE40497 | 27364 | S23 | Weninger |
Felicity | 9015863 | CS | CSE30246 | 12028 | F22 | Weninger |
James | 9018654 | CPEG | CSE60246 | 12029 | F24 | Weninger |
James | 9018654 | CPEG | CSE30332 | 11013 | F24 | Morrison |
We have to go down the whole list of majors and update all instances of when James was an EE.
Do we update F20 and S21? If not then now apparently EE no longer exists.
What happens if James drops CSE30332?
Student Name | SID | Major | Course | CRN | Term | Instructor |
---|---|---|---|---|---|---|
Alice | 9019958 | CS | CSE30246 | 12028 | F24 | Weninger |
Robert | 9025547 | CPEG | CSE40497 | 27364 | S23 | Weninger |
Felicity | 9015863 | CS | CSE30246 | 12028 | F22 | Weninger |
James | 9018654 | CPEG | CSE60246 | 12029 | F24 | Weninger |
Does Prof McMillan get fired? He has two small children!
These are called Data Anomalies
Data Anomalies
There are three types of Data Anomalies: Update Anomalies, Insertion Anomalies, and Deletion Anomalies.
Update Anomalies happen when some records get changed and some do not. For example, when James updated his major from EE to CPEG – what data elements do we change? What if we forget some?
If the data is stored redundantly in the same table, and the person misses any of them, then there will be multiple majors associated with the student. Things can quickly spiral out of control.
Insertion Anomalies happen when inserting data into the database is not possible because other data is not already there. For example, when James enrolled in CSE30332 we had to invent a course and a professor to teach it.
Another example is if your system is designed to require that a customer be on file before a sale can be made to that customer, but you cannot add a customer until they have bought something, then you have an insert anomaly. It is the classic “catch-22” situation.
Deletion Anomalies happen when the deletion of unwanted information causes desired information to be deleted as well. For example, when James drops CSE30332, we apparently have to delete Prof McMillan from the system as well.
How do we prevent anomalies in a database system?
There exist principles for how to conceptually manage and store data to make life easier for yourself and users of database systems. We will explore those in this class.