Creating a Database: A Few Examples
Adapted from Chuck Cusack’s Notes
Problem 1: Grade book
Problem Statement
A professor wants to implement an online grade book to keep track of and display student grades
for several courses. Courses can generally be differentiated by department, course number,
semester, and year, and knowing the course title can be useful. For each class, the grade is
determined by grades on various categories, including tests, quizzes, homework, labs, projects,
etc. The percentages of the categories should add to 100%. The number of assignments from
each category is unspecified, and can change at any time.
Example: A course may be graded by the distribution: 50% tests, 20% quizzes, and 30%
homework. If there are 5 tests, each test is worth 50/5=10% of the grade.
Assignment scores are stored on a point basis, and assignments with a different number of points
in the same category are worth equal weight.
Example: If quizzes are worth 24%, and there are three quizzes worth 20, 50, and 100 points,
each quiz is still worth 24/3 = 8% of the final grade. If a fourth quiz is added, each quiz is now
worth 24/4 = 6% of the grade.
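The arithmetic in the two examples above can be sketched in a few lines of Python. The function names here are hypothetical and are not part of the grade book design; the sketch only assumes that every assignment in a category carries equal weight, as the problem statement says.

```python
def assignment_weight(category_percent, num_assignments):
    """Share of the final grade carried by each assignment in one category."""
    if num_assignments <= 0:
        raise ValueError("a category needs at least one assignment")
    return category_percent / num_assignments


def category_contribution(category_percent, scores):
    """Percentage points earned in one category.

    scores is a list of (points_earned, points_possible) pairs.
    Each assignment counts equally, regardless of its point total.
    """
    weight = assignment_weight(category_percent, len(scores))
    return sum(weight * earned / possible for earned, possible in scores)


# Three quizzes worth 20, 50, and 100 points in a category worth 24%.
quiz_scores = [(20, 20), (25, 50), (100, 100)]
weight_with_three = assignment_weight(24, 3)
weight_with_four = assignment_weight(24, 4)
quiz_points = category_contribution(24, quiz_scores)
```

With three quizzes each one is worth 8% of the grade, and adding a fourth drops that to 6%, matching the example.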
The professor wants to be able to list rosters for each class, including the name, student number,
and major of each student. He needs to change the distribution, add and delete assignments, and
update grades.
He would also like the system to compute the average grade for each assignment and the current
percentage score of each student, and to sort the students by name, total
percentage, or score on any individual assignment.
A few important things to remember:
• A student might be in more than one course.
• Since there will be many scores listed for students, it is important to not duplicate
  information.
• Computed values, including average scores, final percentages, etc. should not be stored in
  the database.
We want to design a database to store the required data for this project.
Solution
The keys to remember when designing the database are that each table should store one specific
kind of information, and information should not be duplicated.
From the description, it is clear we need tables for students and courses. But how do we keep
track of student scores, enrollment, assignments, and course distribution? Do we need tables for
each of these? Here is what we know so far:
• We need a table for Students that stores FirstName, LastName, StudentID, and Major.
• We need a table for Courses that stores Department, CourseNumber, CourseName,
  Semester, and Year.
• We should have a table to store the Distribution for each course, including the Category
  and Percent.
• We should have a table to store the details of each Assignment, including PointsPossible
  and Instance (so we know whether it is quiz 1 or quiz 2, for instance).
• We know that the following relationships exist:
  o A course can have 0 or more students, and a student can be enrolled in 0 or more
    courses.
  o A course will have 1 or more entries from the distribution table.
  o An assignment must correspond to an entry from the distribution table, and each
    entry from the distribution table can have 0 or more assignments associated with
    it.
  o For each assignment, there are scores for 0 or more students, and each student can
    have scores recorded for 0 or more assignments.
Given these facts, we can construct the first draft of our entity-relationship (ER) model. Recall
that the Ns and Ms in the diagram indicate the cardinality of the relationships. For instance,
Students and Courses have a many-to-many relationship, whereas Courses and Distribution
have a one-to-many relationship.
Figure 1: Grade Book ER Model Step 1
Next, we need to have a primary key for each table. For Students, the StudentID will suffice,
since it is unique. Although a combination of Department, CourseNumber, Semester and Year
might uniquely identify a course, this is too complicated.
We will add a CourseID field as our primary key in Courses. Clearly nothing will uniquely
identify an entry in the Distribution, since two courses might have the same Category and
Percent. Therefore, we add DistributionID as the primary key. Similarly, we add AssignmentID
to Assignments. Version 2 of our ER-Model, with primary keys underlined, is given below.
Figure 2: Grade Book ER Model Step 2
Now we need to identify which of the entities (tables) are weak, that is, which entities depend on
other entities. For example, an entry in Distribution has to correspond to an entry in Courses,
so Distribution is weak. Similarly, an entry in Assignments depends on an entry in
Distribution, so Assignments is weak.
Our final ER model shows the weak entities and their dependencies with double lines.
Figure 3: Grade Book ER Model Step 3
Now we are ready to convert the ER model into tables. We start by converting the non-weak
entities by simply listing all of the attributes. The non-weak entities here are Students and
Courses.
Figure 4: Grade Book Tables Step 1
Now we add the weak entities by including the attributes, and adding as a foreign key the
primary key from the dependency. In our case, we have Distribution, which depends on
Courses, and Assignments, which depends on Distribution. The arrows in the diagram indicate
where the foreign keys are drawn from.
Figure 5: Grade Book Tables Step 2
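To make the idea of a foreign key concrete, here is a minimal sketch of the Courses and Distribution dependency. It is written under several assumptions that are not part of these notes: it uses Python's built-in sqlite3 module in place of whatever DBMS you choose, it trims Courses down to two columns, and it declares the foreign key with a REFERENCES clause so that the DBMS itself rejects a Distribution row whose course does not exist.

```python
import sqlite3

# Assumption: SQLite stands in for the DBMS, and Courses is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.executescript("""
CREATE TABLE Courses (
    CourseID   INTEGER PRIMARY KEY,
    CourseName TEXT NOT NULL
);
CREATE TABLE Distribution (
    DistributionID INTEGER PRIMARY KEY,
    CourseID       INTEGER NOT NULL REFERENCES Courses (CourseID),
    Category       TEXT NOT NULL,
    Percent        INTEGER NOT NULL
);
""")

# A Distribution row that points at an existing course is accepted.
conn.execute("INSERT INTO Courses (CourseID, CourseName) VALUES (1, 'Databases')")
conn.execute(
    "INSERT INTO Distribution (CourseID, Category, Percent) VALUES (1, 'Tests', 50)"
)

# A Distribution row for a course that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO Distribution (CourseID, Category, Percent) VALUES (99, 'Quizzes', 20)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

distribution_rows = conn.execute("SELECT COUNT(*) FROM Distribution").fetchone()[0]
```

This is what it means for Distribution to be weak: a row in it makes no sense without a matching row in Courses.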
The next step involves dealing with one-to-one and one-to-many relationships. We have none
other than those dealt with just above. If we did, we would simply add the primary key of the
other table to either table (in the one-to-one case) or to the table on the "many" side (in the
one-to-many case).
Lastly, we need to deal with many-to-many relationships. Each of these involves two tables. For
each many-to-many relationship, we add a new table with a name relating to both tables that are
related, include as foreign keys the primary keys of the two related tables, and make that pair of
foreign keys the primary key of the new table. We also add any attributes that are part of the
relationship, and an attribute to indicate order, if necessary.
We have two many-to-many relationships—one between Students and Courses, and one
between Students and Assignments. The former requires no additional attributes, but the latter
requires an additional attribute to store the Points. The final list of tables is given below.
Figure 6: Grade Book Tables Step 3
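The many-to-many rule can be tried out on the Enrollment table. The following is only a sketch: it assumes SQLite (through Python's sqlite3 module) rather than any particular DBMS, and the student and course numbers are made-up sample values. It shows that the composite primary key lets one student appear in several courses and one course hold several students, while refusing to record the same enrollment twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL,
    CourseID  INTEGER NOT NULL,
    PRIMARY KEY (StudentID, CourseID)
);
""")

# Hypothetical sample data: student 1 takes courses 10 and 20, student 2 takes course 10.
conn.executemany("INSERT INTO Enrollment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

# The composite primary key rejects a repeated (StudentID, CourseID) pair.
try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

courses_for_student_1 = [
    row[0]
    for row in conn.execute(
        "SELECT CourseID FROM Enrollment WHERE StudentID = 1 ORDER BY CourseID"
    )
]
students_in_course_10 = [
    row[0]
    for row in conn.execute(
        "SELECT StudentID FROM Enrollment WHERE CourseID = 10 ORDER BY StudentID"
    )
]
```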
The final step is to convert the tables into SQL statements so we can create the tables in some
DBMS. Given the tables in Figure 6, this is fairly straightforward. Essentially the only choices
we still need to make are what data types to use for each attribute, and which attributes should
not be allowed to be NULL. If you think about it for a few minutes, you will realize that in our
tables, only the attribute Major is in any sense optional, so all other attributes will be required to
be not NULL. The SQL statements are given below.
CREATE TABLE Students (
    StudentID int NOT NULL,
    FirstName varchar(30) NOT NULL,
    LastName varchar(30) NOT NULL,
    Major varchar(20),
    PRIMARY KEY (StudentID)
);

CREATE TABLE Courses (
    CourseID int NOT NULL auto_increment,
    Department varchar(20) NOT NULL,
    CourseNumber int NOT NULL,
    CourseName varchar(50) NOT NULL,
    Semester varchar(10) NOT NULL,
    Year int NOT NULL,
    PRIMARY KEY (CourseID)
);

CREATE TABLE Enrollment (
    StudentID int NOT NULL,
    CourseID int NOT NULL,
    PRIMARY KEY (StudentID, CourseID)
);

CREATE TABLE Distribution (
    DistributionID int NOT NULL auto_increment,
    CourseID int NOT NULL,
    Category varchar(20) NOT NULL,
    Percent int NOT NULL,
    PRIMARY KEY (DistributionID)
);

CREATE TABLE Assignments (
    AssignmentID int NOT NULL auto_increment,
    DistributionID int NOT NULL,
    Instance int NOT NULL,
    PointsPossible int DEFAULT 0 NOT NULL,
    PRIMARY KEY (AssignmentID)
);

CREATE TABLE StudentScores (
    StudentID int NOT NULL,
    AssignmentID int NOT NULL,
    Points int DEFAULT 0 NOT NULL,
    PRIMARY KEY (StudentID, AssignmentID)
);
And now we have the required statements to create the database. The only thing left to do is
insert the data into the database, and it is ready to be used by whatever type of application the
programmer wishes to write.
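As an illustration of that last step, here is one way inserting the data might look. This is a sketch under stated assumptions, not the only way to load the database: it uses Python's sqlite3 module, it translates the MySQL types and auto_increment into their SQLite equivalents, and the course values are invented sample data. It inserts a course, reads back the generated CourseID, adds the distribution from the earlier example, and checks that the percentages add to 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite equivalents of the Courses and Distribution tables.
conn.executescript("""
CREATE TABLE Courses (
    CourseID     INTEGER PRIMARY KEY AUTOINCREMENT,
    Department   TEXT NOT NULL,
    CourseNumber INTEGER NOT NULL,
    CourseName   TEXT NOT NULL,
    Semester     TEXT NOT NULL,
    Year         INTEGER NOT NULL
);
CREATE TABLE Distribution (
    DistributionID INTEGER PRIMARY KEY AUTOINCREMENT,
    CourseID       INTEGER NOT NULL,
    Category       TEXT NOT NULL,
    Percent        INTEGER NOT NULL
);
""")

# Hypothetical sample course; the DBMS assigns the CourseID.
cur = conn.execute(
    "INSERT INTO Courses (Department, CourseNumber, CourseName, Semester, Year) "
    "VALUES (?, ?, ?, ?, ?)",
    ("CS", 101, "Intro to Databases", "Fall", 2024),
)
course_id = cur.lastrowid  # key generated for the row just inserted

# The 50/20/30 distribution from the problem statement's first example.
categories = [("Tests", 50), ("Quizzes", 20), ("Homework", 30)]
conn.executemany(
    "INSERT INTO Distribution (CourseID, Category, Percent) VALUES (?, ?, ?)",
    [(course_id, category, percent) for category, percent in categories],
)
conn.commit()

# The percentages of the categories should add to 100.
total_percent = conn.execute(
    "SELECT SUM(Percent) FROM Distribution WHERE CourseID = ?", (course_id,)
).fetchone()[0]
```

Note that the tables themselves do not enforce the 100% rule, so a check like the last query has to be run by the application.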
Exercises
The following are the types of things you might want to do with the database. Think about how
you would do each of these. Can you use queries to get the information or perform the action, or
will you need to process the data after (or before) you extract it from the database?
1. Compute the average (or high or low) score on an assignment.
2. Compute the percentage grade for a student.
3. List all of the students in a given course.
4. List all of the students in a course and all of their scores on every assignment.
5. Add an assignment to a course.
6. Change the percentages of the categories for a course.
7. Add 2 points (or 10%) to the score of each student on an assignment. Or just to those
students whose last name contains a ‘Q’.
8. Compute the percentage grade for a student, where the lowest score for a given category
should be dropped.
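As a starting point for exercises 1 and 2, here is one possible approach, offered as a sketch rather than as the intended answer. It assumes SQLite through Python's sqlite3 module, made-up sample data, hypothetical function names, and that every assignment in the course already has a score recorded for the student. Both results come straight from queries, so no computed values are stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite versions of three grade book tables, with invented sample rows.
conn.executescript("""
CREATE TABLE Distribution (
    DistributionID INTEGER PRIMARY KEY,
    CourseID       INTEGER NOT NULL,
    Category       TEXT NOT NULL,
    Percent        INTEGER NOT NULL
);
CREATE TABLE Assignments (
    AssignmentID   INTEGER PRIMARY KEY,
    DistributionID INTEGER NOT NULL,
    Instance       INTEGER NOT NULL,
    PointsPossible INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE StudentScores (
    StudentID    INTEGER NOT NULL,
    AssignmentID INTEGER NOT NULL,
    Points       INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (StudentID, AssignmentID)
);
INSERT INTO Distribution VALUES (1, 1, 'Tests', 60), (2, 1, 'Quizzes', 40);
INSERT INTO Assignments VALUES (1, 1, 1, 100), (2, 1, 2, 50), (3, 2, 1, 20), (4, 2, 2, 10);
INSERT INTO StudentScores VALUES
    (1, 1, 75), (1, 2, 25), (1, 3, 10), (1, 4, 10),
    (2, 1, 85), (2, 2, 50), (2, 3, 20), (2, 4, 5);
""")


def average_score(conn, assignment_id):
    """Exercise 1: average points earned on one assignment."""
    row = conn.execute(
        "SELECT AVG(Points) FROM StudentScores WHERE AssignmentID = ?",
        (assignment_id,),
    ).fetchone()
    return row[0]


def percentage_grade(conn, student_id, course_id):
    """Exercise 2: sum over categories of Percent times the mean score fraction."""
    row = conn.execute(
        """
        SELECT SUM(category_points) FROM (
            SELECT d.Percent * AVG(1.0 * s.Points / a.PointsPossible) AS category_points
            FROM StudentScores AS s
            JOIN Assignments AS a ON a.AssignmentID = s.AssignmentID
            JOIN Distribution AS d ON d.DistributionID = a.DistributionID
            WHERE s.StudentID = ? AND d.CourseID = ?
            GROUP BY d.DistributionID, d.Percent
        ) AS per_category
        """,
        (student_id, course_id),
    ).fetchone()
    return row[0]


average_on_test_1 = average_score(conn, 1)
grade_for_student_1 = percentage_grade(conn, 1, 1)
```

Averaging the score fractions within a category is what gives assignments with different point totals equal weight. Dropping the lowest score (exercise 8) is left as described in the exercise.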
Problem 2: Album Collection
Problem Statement
A friend is interested in keeping track of information about his album collection. He is not
concerned about whether or not the albums are CDs, tapes, LPs, etc. Also, assume that he does
not have any compilation albums—that is, each album has songs from a single band. For each
album, he wants to store which band recorded the album, the title, the year, and the chronology
(e.g. this is the 4th album for that band). He also wants to store the songs, including title, length,
track number, and writer(s). Of course, if two bands record the same song, they might have
different track numbers and lengths. For each band (group or individual), he also wants to store
the names of all of the band members. For each band member, he needs their first and last
names, and country of origin. Consider both band members and songwriters as musicians.
Construct a database to store the required information so he can do things like list all of the
information for an album, list all songs written by a musician, etc.
Solution
We will give a much briefer solution to this problem. Here is what we know:
• The obvious entities are Musicians, Bands, Albums, and Songs.
• There is a one-to-many relationship between Bands and Albums (each band can record many
  albums, but each album belongs to a single band), which will require adding BandID to Albums.
• For each Musician, we need to store MusicianFirstName, MusicianLastName, and
  MusicianCountry. Since these may not be unique, we add MusicianID as a primary key.
• For each Band, we need to store the BandName, and BandID for a primary key.
• For each Album, we need to store AlbumTitle, AlbumYear, AlbumNumber (for
  chronology), BandID from Bands, and AlbumID for a primary key.
• For each Song, we need SongTitle and SongID as primary key.
• There are many-to-many relationships between Musicians and Bands, between Musicians
  and Songs, and between Albums and Songs.
• We need a BandMusicians table with BandID and MusicianID.
• We need a SongWriters table with SongID and MusicianID.
• We need an AlbumSongs table with SongID, AlbumID, TrackLength, and TrackNumber.
  The TrackLength needs to be stored here rather than in the Songs table since different
  recordings could have different lengths.
From this, we can construct the following tables:
Figure 7: Album Database Tables
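The tables described above could be created along the following lines. This is a sketch with several assumptions that the notes do not specify: it uses SQLite through Python's sqlite3 module, the data types and NOT NULL choices are guesses, TrackLength is taken to be a number of seconds, and the musician and songs are invented sample data. The closing query shows one of the requested tasks, listing all songs written by a musician.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Musicians (
    MusicianID        INTEGER PRIMARY KEY,
    MusicianFirstName TEXT NOT NULL,
    MusicianLastName  TEXT NOT NULL,
    MusicianCountry   TEXT NOT NULL
);
CREATE TABLE Bands (
    BandID   INTEGER PRIMARY KEY,
    BandName TEXT NOT NULL
);
CREATE TABLE Albums (
    AlbumID     INTEGER PRIMARY KEY,
    BandID      INTEGER NOT NULL,
    AlbumTitle  TEXT NOT NULL,
    AlbumYear   INTEGER NOT NULL,
    AlbumNumber INTEGER NOT NULL
);
CREATE TABLE Songs (
    SongID    INTEGER PRIMARY KEY,
    SongTitle TEXT NOT NULL
);
CREATE TABLE BandMusicians (
    BandID     INTEGER NOT NULL,
    MusicianID INTEGER NOT NULL,
    PRIMARY KEY (BandID, MusicianID)
);
CREATE TABLE SongWriters (
    SongID     INTEGER NOT NULL,
    MusicianID INTEGER NOT NULL,
    PRIMARY KEY (SongID, MusicianID)
);
CREATE TABLE AlbumSongs (
    AlbumID     INTEGER NOT NULL,
    SongID      INTEGER NOT NULL,
    TrackNumber INTEGER NOT NULL,
    TrackLength INTEGER NOT NULL,  -- seconds (assumed unit)
    PRIMARY KEY (AlbumID, SongID)
);

-- Hypothetical sample data: one musician who wrote two songs.
INSERT INTO Musicians VALUES (1, 'Ada', 'Example', 'Canada');
INSERT INTO Songs VALUES (1, 'First Song'), (2, 'Second Song');
INSERT INTO SongWriters VALUES (1, 1), (2, 1);
""")

table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# List all songs written by musician 1.
songs_by_musician_1 = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.SongTitle
        FROM Songs AS s
        JOIN SongWriters AS w ON w.SongID = s.SongID
        WHERE w.MusicianID = ?
        ORDER BY s.SongTitle
        """,
        (1,),
    )
]
```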
Exercises
Can you answer these questions by querying the database directly, or do you need to extract the
information first, and then manipulate it further? What queries would you use to do each?
1. How many songs are on album X?
2. How many albums did artist Y record?
3. How many bands is musician Z a member of? What are they? Which albums is he on?
4. How many different bands recorded song X?
5. How many albums do I have? How many bands are represented in my collection? How
many musicians are represented in my collection?
6. Are there any bands whose members are all from different countries?
7. How many albums contain at least 9 songs?
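Questions 1 and 7 can be answered by the database directly. The following sketch shows one possible pair of queries, assuming SQLite through Python's sqlite3 module, trimmed-down versions of the Albums and AlbumSongs tables, hypothetical function names, and made-up albums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: only the columns these two queries need are included.
conn.executescript("""
CREATE TABLE Albums (
    AlbumID    INTEGER PRIMARY KEY,
    AlbumTitle TEXT NOT NULL
);
CREATE TABLE AlbumSongs (
    AlbumID     INTEGER NOT NULL,
    SongID      INTEGER NOT NULL,
    TrackNumber INTEGER NOT NULL,
    PRIMARY KEY (AlbumID, SongID)
);
INSERT INTO Albums VALUES (1, 'Short Album'), (2, 'Long Album');
""")

# Hypothetical data: album 1 has 3 tracks, album 2 has 9 tracks.
tracks = [(1, n, n) for n in range(1, 4)] + [(2, n, n) for n in range(1, 10)]
conn.executemany("INSERT INTO AlbumSongs VALUES (?, ?, ?)", tracks)


def song_count(conn, album_id):
    """Question 1: how many songs are on one album."""
    return conn.execute(
        "SELECT COUNT(*) FROM AlbumSongs WHERE AlbumID = ?", (album_id,)
    ).fetchone()[0]


def albums_with_at_least(conn, minimum):
    """Question 7: titles of albums holding at least `minimum` songs."""
    return [
        row[0]
        for row in conn.execute(
            """
            SELECT a.AlbumTitle
            FROM Albums AS a
            JOIN AlbumSongs AS t ON t.AlbumID = a.AlbumID
            GROUP BY a.AlbumID, a.AlbumTitle
            HAVING COUNT(*) >= ?
            ORDER BY a.AlbumTitle
            """,
            (minimum,),
        )
    ]


songs_on_album_1 = song_count(conn, 1)
big_albums = albums_with_at_least(conn, 9)
```

The length of the list returned by albums_with_at_least gives the count that question 7 asks for.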