CS377: Database Design - Aggregation in Relational Databases


Activity Goals

The goals of this activity are:
  1. To explain the use of primary and foreign keys in database systems
  2. To aggregate records across multiple tables with referential keys

Supplemental Reading

Feel free to visit these resources for supplemental background reading material.

The Activity

Directions

Consider the activity models and answer the questions provided. First reflect on these questions on your own briefly, before discussing and comparing your thoughts with your group. Appoint one member of your group to discuss your findings with the class, and the rest of the group should help that member prepare their response. Answer each question individually from the activity on the Class Activity Questions discussion board. After class, think about the questions in the reflective prompt and respond to those individually in your notebook. Report out on areas of disagreement or items for which you and your group identified alternative approaches. Write down and report out questions you encountered along the way for group discussion.

Model 1: Tables

FirstNameLastNameAge
AlexSmith20
LeeJones21
BrianMcMullen18
SamanthaJohnson22
LeeJones24

CourseNumCourseName
CS377Database Design
CS173Intro to Computer Science
CS174Object Oriented Programming
CS275Software Engineering

Questions

  1. How might you compare database tables and records, with software classes and objects?
  2. How can you differentiate between the two students named Lee?
  3. Not all cultures use "first" and "last" names; what might be a better choice for these column names?

Model 2: Primary and Foreign Keys

FirstNameLastNameAgeCourse
AlexSmith20CS377 Database Design
LeeJones21CS377 Database Design
BrianMcMullen18CS173 Intro to Computer Science
SamanthaJohnson22CS174 Object Oriented Programming
LeeJones24CS174 Object Oriented Programming
LeeJones21CS275 Software Engineering

Questions

  1. What is unfortunate about the structure of these tables? What could we do to improve upon it?
  2. Modify the tables above to add a "serial number" or a primary key to each of the tables.
  3. Can you think of a way to identify courses without having to number them? What might a more "natural key" be?
  4. What disadvantage could arise from using the data itself (like "CS377") as a key, particularly if that data changes over time?
  5. Eliminate the redundant data in these tables by replacing any redundancies with a new column and row specifying the appropriate key value that can be looked up in the corresponding table. This is called a foreign key.

Model 3: Aggregation

FirstNameLastNameAgeCourse
AlexSmith20CS377 Database Design
LeeJones21CS377 Database Design
BrianMcMullen18CS173 Intro to Computer Science
SamanthaJohnson22CS174 Object Oriented Programming
LeeJones24CS174 Object Oriented Programming
LeeJones21CS275 Software Engineering

Questions

  1. Using your new table structure, describe an algorithm to determine the average age of students enrolled in CS377.
  2. What other types of aggregation functions might be useful across these tables?
  3. Extend the course database to show enrollments in a particular semester; once the class agrees on a design, draw out the tables and data including the key values.

Submission

Submit your answers to the questions using the Class Activity Questions discussion board. You may also respond to questions or comments made by others, or ask follow-up questions there. Answer any reflective prompt questions in the Reflective Journal section of your OneNote Classroom personal section.