I’ve created a database of college football games and plays from 2005 to 2013. I got this data from Kaggle. You will use this database to complete this assignment and submit this homework to Canvas by the due date as a file named netid_cse30246_hw3.sql. We will test run your queries on a modified version of the database.
Log in to db8.cse.nd.edu, log in to mysql and use the cse30246 database. The class-lecture tables are in this database as are the college football tables, prefixed by cfb_
These are the tables:
mysql> use cse30246; Database changed mysql> show tables; +--------------------+ | Tables_in_cse30246 | +--------------------+ | cfb_conference | | cfb_drive | | cfb_game | | cfb_game_stats | | cfb_kickoff | | cfb_kickoff_return | | cfb_pass | | cfb_play | | cfb_player | | cfb_punt | | cfb_punt_return | | cfb_reception | | cfb_rush | | cfb_stadium | | cfb_team | +--------------------+
We’re going to use these tables to answer interesting questions and hone our SQL skills.
For example, if I want to know the size of Notre Dame Stadium I can ask the following query:
mysql> select capacity from cfb_stadium where name = 'Notre Dame Stadium'; +----------+ | capacity | +----------+ | 80795 | +----------+
For each question, write a single query to return the answer to the following questions.
1. 5pts How many total spectators have seen a game in this dataset?
+------------------+ | total_attendance | +------------------+ | 332382121 | +------------------+
2. 15pts Teams usually PUNT on fourth down, but not always. How many times did a PUNT occur on a non-fourth down play?
+-------------+ | total_punts | +-------------+
<No hint>
3. 15pts The Red zone is when the ball is within the 20 yards of the goal line. A red zone attempt is when the team moves inside the red zone, and a successful red zone attempt is when a touchdown is scored. Which 5 schools with over 20 red zone attempts have the best red zone success rate (%) and what was their success rate?
+---------------+--------+ | name | rate | +---------------+--------+ | New Hampshire | 0.7407 | | Indiana State | 0.7143 | | Wisconsin | 0.7048 | | Texas Tech | 0.6945 | | Boise State | 0.6749 | +---------------+--------+
4. 20pts Which school has the most punts; which school has the highest percentage of plays that are puns; and are these the same team?
+----------------+------------+-----------------------+------------+------------+
| most_punt_team | num_punts | highest_pct_punt_team | pct_punts | same_team? |
+----------------+------------+-----------------------+------------+------------+
| | | | | False |
+----------------+------------+-----------------------+------------+------------+
<HINT: 1 or 0 in the same_team column is fine>
5. 15pts Which Notre Dame player had the most yards on 1st down.
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Jimmy | Clausen |
+------------+-----------+
6. 15pts From which state does Notre Dame receive most of its players? How many does that state send, and what percentage of ND players come from that state?
+-----------+-----+-----------+ | homestate | cnt | perc | +-----------+-----+-----------+
<No Hint>
7. 15pts What school has the highest average passing yards per game attendant?
+----------------+------------+
| name | ypa |
+----------------+------------+
| <no hint> | 0.03636000 |
+----------------+------------+
<Hint – I used the micro-average here, i.e., the passing yards attendant for each game, averaged over all the team’s games. You can also do the macro-average, which sums up all the attendants and pass yards per game then takes a global average per team you can get 0.027ish>
8. 20pts Which stadium has sees the longest punt returns on average, and how long is that average?
+---------------+---------+ | name | yds | +---------------+---------+ | Wrigley Field | 19.3333 | +---------------+---------+
9. 20pts Which stadium has witnessed the home team lose the most times?
+-------------------------------------------+-------------+ | name | home_losses | +-------------------------------------------+-------------+
<No Hint>
10. 20pts Which player(s) had the longest reception in the 4th quarter. What team where they on and what year was it? (breaking non-trivial ties is 5 of the 15 points, (i.e 10 points with LIMIT, 20 points without LIMIT)
+------------+-----------+------------------+------+ | first_name | last_name | name | year | +------------+-----------+------------------+------+ | Amari | Cooper | Alabama | 2013 | | Johnny | Forzani | Washington State | 2009 | | Jamison | Crowder | Duke | 2012 | +------------+-----------+------------------+------+ <HINT: Order is not important>
11. 20pts Of all the schools that are not in a conference, i.e., “Ind”, Notre Dame is certainly the heaviest. But how much heavier, exactly? Compare the average weight of all the players on Notre Dame against the average weight of players on other teams that are also not part of a conference.
+----------+ | wgt_diff | +----------+ | 7.2212 | +----------+
Hint: 232.1913 – 224.9701 = 7.2212
(My initial solution used “Ind” but there are both “Ind” and “Independent” in the cfb_conference table. If you use both you get 232.1913 – 225.1023 = 7.089, which is more correct. I’ll accept either)
This is an individual assignment. You may discuss strategies in groups, but you may not work together when writing the solutions. SQL-Plagiarism is easy to detect.
Submissions must follow the naming convention netid_cse30246_hw3.sql and be submitted to Canvas by midnight on the due date
We will test your solution on reduced data using the command:
mysql -p -Dcse30246 < netid_cse30246_hw3.sql