Homework 3

I’ve created a database of college football games and plays from 2005 to 2013. I got this data from Kaggle. You will use this database to complete this assignment and submit this homework to Sakai by the due date as a file named netid_cse30246_hw3.sql. We will test run your queries on a modified version of the database.

Log in to db.cse.nd.edu, log in to mysql and use the cse30246 database. The class-lecture tables are in this database as are the college football tables, prefixed by cfb_

These are the tables:

mysql> use cse30246;
Database changed
mysql> show tables;
+--------------------+
| Tables_in_cse30246 |
+--------------------+
| cfb_conference     |
| cfb_drive          |
| cfb_game           |
| cfb_game_stats     |
| cfb_kickoff        |
| cfb_kickoff_return |
| cfb_pass           |
| cfb_play           |
| cfb_player         |
| cfb_punt           |
| cfb_punt_return    |
| cfb_reception      |
| cfb_rush           |
| cfb_stadium        |
| cfb_team           |
+--------------------+

We’re going to use these tables to answer interesting questions and hone our SQL skills.

For example, if I want to know the size of Notre Dame Stadium I can ask the following query:

mysql> select capacity from cfb_stadium where name = 'Notre Dame Stadium';
+----------+
| capacity |
+----------+
|    80795 |
+----------+

For each question, write a single query to return the answer to the following questions.

1. 05pts How many total spectators have seen a game in this dataset?

+------------------+
| total_attendance |
+------------------+
| 332382121        |
+------------------+

2. 05pts On what down do most penalties occur?

+------+
| down |
+------+
| 1    |
+------+

3. 10pts The Red zone is when the ball is within the 20 yards of the goal line. A red zone attempt is when the team moves inside the red zone, and a successful red zone attempt is when the team scores a touchdown. Which 5 teams with over 10 red zone attempts have the best red zone success rate (%) and what was their success rate?

+---------------+--------+
| name          | rate   |
+---------------+--------+
| Towson        | 0.7647 |
| South Dakota  | 0.7500 |
| New Hampshire | 0.7407 |
| Indiana State | 0.7143 |
| Liberty       | 0.7143 |
+---------------+--------+

4. 10pts Which conference has the shortest games and how long are they on average?

+-+---------+----------+
| name      | duration |
+-----------+----------+

<No Hint>

5. 10pts Which player had the most yards on 1st down in December.

+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Case       | Keenum    |
+------------+-----------+

6. 10pts From which state does Notre Dame receive most of its players? How many does that state send, and what percentage of ND players come from that state?

+-----------+-----+-----------+
| homestate | cnt | perc      |
+-----------+-----+-----------+

<No Hint>

7. 5pts Players typically have only 4 years of eligibility to play in college. How many players have played for 3 or more teams?

+----------+
| count(*) |
+----------+
|       10 |
+----------+

8. 10pts Which stadium has, on average, the longest punt returns?

+---------------+---------+
| name          | yds     |
+---------------+---------+

<No Hint>

9. 10pts Which stadium has witnessed their home team lose the most times?

+-------------------------------------------+-------------+
| name                                      | home_losses |
+-------------------------------------------+-------------+

<No Hint>

10. 15pts Which player(s) had the longest reception in the 4th quarter. What team where they on and what year was it? (breaking non-trivial ties is 5 of the 15 points, (i.e 10 points with LIMIT, 15 points without LIMIT)

+------------+-----------+------------------+------+
| first_name | last_name | name             | year |
+------------+-----------+------------------+------+
| Amari      | Cooper    | Alabama          | 2013 |
| Johnny     | Forzani   | Washington State | 2009 |
| Jamison    | Crowder   | Duke             | 2012 |
+------------+-----------+------------------+------+

11. 10pts What are the top 5 teams for total home-attendance?

+-----------+------+--------+
| name      | y    | att    |
+-----------+------+--------+
| Michigan  | 2011 | 897431 |
| Michigan  | 2007 | 882115 |
| Michigan  | 2009 | 871464 |
| Tennessee | 2009 | 867537 |
| Michigan  | 2006 | 864035 |
+-----------+------+--------+

This is an individual assignment. You may discuss strategies in groups, but you may not work together when writing the solutions. SQL-Plagiarism is easy to detect.

Submissions must follow the naming convention netid_cse30246_hw3.sql and be submitted to GradeScope by midnight on the due date

We will test your solution on reduced data using the command:
mysql -p -Dcse30246 < netid_cse30246_hw3.sql