CS377: Database Design - SQL Join Programming with the Lahman Baseball Database (100 Points)

Assignment Goals

The goals of this assignment are:
  1. To implement join statements to query an existing nontrivial database

Background Reading and References

Please refer to the following readings and examples offering templates to help get you started:

The Assignment

In this assignment, you will query an existing large database to answer meaningful questions.

Write a program that uses the sqlite3 database library to connect to this database and to execute your queries.

About the Dataset

The Baseball Archive database is maintained by Sean Lahman [1], and contains a number of statistics about players and teams up to and including the current baseball season. We will use the database for the 2019 season. It is not important that you are familiar with the game of baseball to query this data; rather, that you use the links shown in the database schema to connect relevant tables together. This is often helpful when we are called upon to answer questions about data with queries in application domains about which we are less familiar! The schema for this database is provided below:

Lahman Baseball Database Schema

What to Do

Answer the following questions using the dataset (feel free to write these queries for any year if you’d like to explore!):

  1. How many wins did the Philadelphia Phillies have in 2019? You will use the teams table to find this.
  2. Which team had the most hits in 2019? The teams table has this as well.
  3. Who was the highest paid player in 2016? To find this, you will connect the salaries table with the people table . Recall that MAX is an aggregating function that can be used in a SELECT statement.
  4. What players played in the all-star game in 2019? You will connect the allstarfull table with the people table to find this.
  5. For each player in the major leagues in 2019 that also played in college, what is the name of the school(s) they attended? You will connect the people, appearances, collegeplaying, and schools table to find this.
  6. Make up any question you’d like and answer it. If you’re not a baseball fan, you can keep this relatively simple, but a little data exploration can be fun and we will compare our questions and queries (anonymously if you prefer!) in class.

Exporting your Project for Submission

When you’re done, write a README for your project, and save all your files, before exporting your project to ZIP. In your README, answer any bolded questions presented on this page.

  1. The database is copyright 1996-2021 by Sean Lahman and licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. For details see: http://creativecommons.org/licenses/by-sa/3.0/ 

Submission

In your submission, please include answers to any questions asked on the assignment page in your README file. If you wrote code as part of this assignment, please describe your design, approach, and implementation in your README file as well. Finally, include answers to the following questions:
  • If collaboration with a buddy was permitted, did you work with a buddy on this assignment? If so, who? If not, do you certify that this submission represents your own original work?
  • Please identify any and all portions of your submission that were not originally written by you (for example, code originally written by your buddy, or anything taken or adapted from a non-classroom resource). It is always OK to use your textbook and instructor notes; however, you are certifying that any portions not designated as coming from an outside person or source are your own original work.
  • Approximately how many hours it took you to finish this assignment (I will not judge you for this at all...I am simply using it to gauge if the assignments are too easy or hard)?
  • Your overall impression of the assignment. Did you love it, hate it, or were you neutral? One word answers are fine, but if you have any suggestions for the future let me know.
  • Any other concerns that you have. For instance, if you have a bug that you were unable to solve but you made progress, write that here. The more you articulate the problem the more partial credit you will receive (it is fine to leave this blank).

Assignment Rubric

Description Pre-Emerging (< 50%) Beginning (50%) Progressing (85%) Proficient (100%)
Algorithm Implementation (60%) The algorithm fails on the test inputs due to major issues, or the program fails to compile and/or run The algorithm fails on the test inputs due to one or more minor issues The algorithm is implemented to solve the problem correctly according to given test inputs, but would fail if executed in a general case due to a minor issue or omission in the algorithm design or implementation A reasonable algorithm is implemented to solve the problem which correctly solves the problem according to the given test inputs, and would be reasonably expected to solve the problem in the general case
Code Quality and Documentation (30%) Code commenting and structure are absent, or code structure departs significantly from best practice, and/or the code departs significantly from the style guide Code commenting and structure is limited in ways that reduce the readability of the program, and/or there are minor departures from the style guide Code documentation is present that re-states the explicit code definitions, and/or code is written that mostly adheres to the style guide Code is documented at non-trivial points in a manner that enhances the readability of the program, and code is written according to the style guide
Writeup and Submission (10%) An incomplete submission is provided The program is submitted, but not according to the directions in one or more ways (for example, because it is lacking a readme writeup or missing answers to written questions) The program is submitted according to the directions with a minor omission or correction needed, including a readme writeup describing the solution and answering nearly all questions posed in the instructions The program is submitted according to the directions, including a readme writeup describing the solution and answering all questions posed in the instructions

Please refer to the Style Guide for code quality examples and guidelines.