I’ve created a database of college football games and plays from 2005 to 2013. I got this data from Kaggle. You will use this database to complete this assignment and submit this homework to Canvas by the due date as a file named netid_cse30246_hw3.sql. We will test run your queries on a modified version of the database.
Log in to db8.cse.nd.edu, log in to mysql and use the cse30246 database. The class-lecture tables are in this database as are the college football tables, prefixed by cfb_
These are the tables:
mysql> use cse30246; Database changed mysql> show tables; +--------------------+ | Tables_in_cse30246 | +--------------------+ | cfb_conference | | cfb_drive | | cfb_game | | cfb_game_stats | | cfb_kickoff | | cfb_kickoff_return | | cfb_pass | | cfb_play | | cfb_player | | cfb_punt | | cfb_punt_return | | cfb_reception | | cfb_rush | | cfb_stadium | | cfb_team | +--------------------+
We’re going to use these tables to answer interesting questions and hone our SQL skills.
For example, if I want to know the size of Notre Dame Stadium I can ask the following query:
mysql> select capacity from cfb_stadium where name = 'Notre Dame Stadium'; +----------+ | capacity | +----------+ | 80795 | +----------+
For each question, write a single query to return the answer to the following questions.
1. 5pts How many total spectators have seen a game in this dataset?
+------------------+ | total_attendance | +------------------+ | 332382121 | +------------------+
2. 15pts On which down do most penalties occur?
+---------+
| down |
+---------+
| no hint |
+---------+
3. 15pts The Red zone is when the ball is within the 20 yards of the goal line. A red zone attempt is when the team moves inside the red zone, and a successful red zone attempt is when a touchdown is scored. Which 5 schools with over 10 red zone attempts have the best red zone success rate (%) and what was their success rate?
+---------------+--------+
| name | rate |
+---------------+--------+
| Towson | 0.7647 |
| South Dakota | 0.7500 |
| New Hampshire | 0.7407 |
| Indiana State | 0.7143 |
| Liberty | 0.7143 |
+---------------+--------+
4. 20pts Which conference has the shortest average game duration?
+-----------------+----------+
| conference_name | duration |
+-----------------+----------+
| no hint | no hint |
+-----------------+----------+
5. 15pts Which player has gained the most yards on 3rd down in October?
+------------+-----------+-------------+
| first_name | last_name | total_yards |
+------------+-----------+-------------+
| Colt | McCoy | 1328 |
+------------+-----------+-------------+
6. 15pts Find the top 3 states from which Notre Dame receives the most players? How many does that state send, and what percentage of ND players come from that state?
+-----------+-----+------------+ | homestate | cnt | proportion | +-----------+-----+------------+ | CA | 31 | 0.1047 | | IL | 30 | 0.1014 | | FL | 28 | 0.0946 | +-----------+-----+------------+
7. 15pts How many players have played for three or more different teams?
+----------------+
| num_players |
+----------------+
| <no hint> |
+----------------+
8. 20pts Which stadium has sees the longest punt returns on average, and how long is that average?
+---------------+---------+
| name | yds |
+---------------+---------+
| Wrigley Field | 19.3333 |
+---------------+---------+
9. 20pts Which stadium has witnessed the home team lose the most times?
+-------------------------------------------+-------------+ | name | home_losses | +-------------------------------------------+-------------+
Hint: #2 is Aggie Stadium, Texas A&M New Mexico with 39 losses, which is #1?
10. 20pts Which player(s) had the longest reception in the 4th quarter. What team where they on and what year was it? (breaking non-trivial ties is 5 of the 15 points, (i.e 10 points with LIMIT, 20 points without LIMIT)
+------------+-----------+------------------+------+ | first_name | last_name | name | year | +------------+-----------+------------------+------+ | Amari | Cooper | Alabama | 2013 | | Johnny | Forzani | Washington State | 2009 | | Jamison | Crowder | Duke | 2012 | +------------+-----------+------------------+------+ <HINT: Order is not important>
11. 20pts Which 5 teams (and seasons) had the most attendance at their games?
+-----------+------+--------+
| name | y | att |
+-----------+------+--------+
| Michigan | 2011 | 897431 |
| Michigan | 2007 | 882115 |
| Michigan | 2009 | 871464 |
| Tennessee | 2009 | 867537 |
| Michigan | 2006 | 864035 |
+-----------+------+--------+
This is an individual assignment. You may discuss strategies in groups, but you may not work together when writing the solutions. SQL-Plagiarism is easy to detect.
Submissions must follow the naming convention netid_cse30246_hw3.sql and be submitted to Canvas by midnight on the due date
We will test your solution on reduced data using the command:
mysql -p -Dcse30246 < netid_cse30246_hw3.sql