Using Databases to Solve Problems

CHAPTER 9
QUERYING A DATABASE IN MS ACCESS

Thus far we have discussed the structure of a relational database and how to plan and create one. Once the database is created and data has been placed in the appropriate tables, we need to be able to extract that data. The data in a database can be retrieved by examining its tables; however, a table frequently contains more information than a database user needs. In addition, database users often want to view information that spans related tables, or to view data in a different order. Queries provide a way for users to obtain more meaningful information from a database than simply viewing all the data contained in a table. Queries can be used to:

- Filter data by selecting specific records according to criteria.
- Sort records.
- Group records together (aggregate) to perform calculations such as Sum, Average, Count, Min, and Max.
- Perform arithmetic and Boolean calculations.
- Extract information from more than one table by joining related tables.

This chapter covers the first four items; the last item, joining tables, is discussed in subsequent chapters.

WHAT IS A QUERY

Remember that a table is a database object that stores records of related data according to a prescribed field order. A query is also an Access database object; a query defines how data will be extracted from existing data tables or from the results of previous queries, and how the extracted data will be organized. In English, a query is a question. In a database, a query is a set of instructions the computer follows to answer a question about the data in the database. The answer itself is dynamic: it changes whenever the data in the database changes, and thus it is not part of the query but rather the result of running the query. The results of a query are often referred to as a dynaset.
Specifying and performing a query involves the use of relational algebra operations such as select, project, and join. Several languages exist that can be used to create database tables, relationships, and queries. One of the oldest is SQL, which stands for Structured Query Language (pronounced "S-Q-L" or "sequel"). It was originally developed by IBM for use on its mainframe computers. Fortunately, with modern database software we need to understand neither how to write instructions in SQL nor the process by which the computer actually obtains the dynaset. A user of Microsoft Access can specify a query using a tool called Query by Example (QBE). This approach makes use of the computer's graphical user interface (GUI) to allow the user to select the elements required for a query. The software then translates the QBE grid into SQL for processing by the computer.

Before we can begin to use the query tool, we or someone else must have already completed the following tasks:

- Created tables with defined fields, field properties, and primary keys.
- Entered data into the tables.
- Set up relationships between the tables using foreign key(s).

The next section describes a sample database that meets these requirements and is used to demonstrate the query tool throughout the chapter.

A SAMPLE DATABASE

To illustrate some of the query operations we will use the sample database shown in Figure 1.
Members

ID#  Name                Phone         Address           Active  Join Date
635  Acquila, Jose       614-927-8345  Pickerington, OH  Yes     1/5/97
534  Adler, Lawrence     614-746-2385  Pataskala, OH     Yes     3/4/99
345  Amico, Donald       419-345-8472  Toledo, OH        Yes     1/5/97
546  Anderson, William   513-345-9384  Cincinnati, OH    No      4/10/00
903  Applegate, Bernice  513-485-3833  Cleveland, OH     Yes     3/17/01
002  Aston, Martin       614-485-3888  Columbus, OH      No      11/7/98
047  Attwater, Lester    614-292-1442  Columbus, OH      Yes     9/1/00

Finances

ID#  Dues-Paid  Donation
635  100        0
534  50         0
345  100        50
546  0          250
903  0          0
534  50         20

Figure 1: Club Database, Database View of Tables

This database contains two tables: Members and Finances. The table structures are as follows:

Members(ID#, Name, Phone, Address, Active, Join Date)

This table is a membership list of all active and retired members. The ID# field is a Number data type and has been identified as the primary key for this table. The Active field is a Yes/No (Boolean) data type field; a Yes indicates an active member. The Join Date field is a date data type field. All other fields in the Members table are Text.

Note that when a Boolean field is entered into an Access table, the column is usually displayed as a check box. In this case, a check indicates an active member; no check indicates a past or retired member. The actual values stored are the Boolean values. If an Access table with a Boolean field is copied into an Excel spreadsheet, the Boolean values are translated to True/False cell entries.

Finances(ID#, Dues-Paid, Donation)

This table contains records of member transactions. The amount of dues paid (full or partial payment) is entered in the Dues-Paid field. Dues for the year are $100 for active members; retired members are not charged dues. Additional contributions are recorded in the Donation field. The ID#, Dues-Paid, and Donation fields are all type Number. A member may make any number of transactions.
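The sample database can also be rebuilt outside Access for experimentation. The following is a minimal sketch only: it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Access, and the lowercase table and column names (id, dues_paid, join_date, and so on) are illustrative substitutes for ID#, Dues-Paid, and Join Date. Yes/No values are stored as 1/0 and dates as ISO text, which is also an assumption of the sketch rather than how Access stores them.

```python
import sqlite3

# Assumption: SQLite stands in for Access; names are simplified stand-ins
# for the field names in Figure 1 (ID# -> id, Dues-Paid -> dues_paid, ...).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id        INTEGER PRIMARY KEY,  -- ID#, the primary key
        name      TEXT,
        phone     TEXT,
        address   TEXT,
        active    INTEGER,              -- Yes/No stored as 1/0
        join_date TEXT                  -- dates stored as YYYY-MM-DD text
    );
    CREATE TABLE finances (
        id        INTEGER REFERENCES members(id),  -- foreign key, may repeat
        dues_paid INTEGER,
        donation  INTEGER
    );
""")

members = [
    (635, "Acquila, Jose", "614-927-8345", "Pickerington, OH", 1, "1997-01-05"),
    (534, "Adler, Lawrence", "614-746-2385", "Pataskala, OH", 1, "1999-03-04"),
    (345, "Amico, Donald", "419-345-8472", "Toledo, OH", 1, "1997-01-05"),
    (546, "Anderson, William", "513-345-9384", "Cincinnati, OH", 0, "2000-04-10"),
    (903, "Applegate, Bernice", "513-485-3833", "Cleveland, OH", 1, "2001-03-17"),
    (2, "Aston, Martin", "614-485-3888", "Columbus, OH", 0, "1998-11-07"),
    (47, "Attwater, Lester", "614-292-1442", "Columbus, OH", 1, "2000-09-01"),
]
finances = [
    (635, 100, 0),
    (534, 50, 0),
    (345, 100, 50),
    (546, 0, 250),
    (903, 0, 0),
    (534, 50, 20),
]
conn.executemany("INSERT INTO members VALUES (?, ?, ?, ?, ?, ?)", members)
conn.executemany("INSERT INTO finances VALUES (?, ?, ?)", finances)

member_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
transaction_count = conn.execute("SELECT COUNT(*) FROM finances").fetchone()[0]
print(member_count, "members,", transaction_count, "transactions")
```

Because ID# is a Number field, the values shown as 002 and 047 in Figure 1 are stored here as 2 and 47.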
Both tables contain an ID# field representing the same information, and both fields are defined as type Number. ID# is the primary key of the Members table. Thus all three requirements for setting up a relationship are met: same information, same data type, and primary key of at least one table. The ID# field in Finances acts as a foreign key relating the two tables, as diagrammed in Figure 2. We have now completed setting up the database relationships and are ready to ask questions about the database.

Members (primary key: ID#)  ---- ID# ----  Finances (primary key: none)

Figure 2

USING THE QUERY DESIGN TOOL

In Access, the query object contains the instructions the computer will carry out to answer a specific question about the data in an Access database. These instructions tell the computer what set of data needs to be examined and what filters and/or sorts to apply. Executing an Access query results in a dynaset, a set of data based on the most recent information in the database.

THE QUERY BY DESIGN TOOL STRUCTURE

There are two different ways to use the query tool in Access: one is to follow a set of steps in the Query Wizard, and the other is to use the Query Design tool. Both options can be launched from the Create ribbon in the Other group. This chapter deals exclusively with the latter option, the Query Design tool. Figure 3 displays the Query Design tool window, including the Show section and the QBE (query by example) grid. Figure 4 displays the context-sensitive Design ribbon containing the design commands available.

[Figure 3: Query Design window. Callouts: Transaction table added to QBE grid; Account# and Transaction$ fields selected.]

[Figure 4: Design ribbon.]

In the upper portion of the Design window, the data objects needed for the query (either tables or previously created queries) are displayed. In the lower portion, on the QBE grid, the user specifies the fields, criteria, sorting, and other information needed to carry out the query.
This section is divided into a series of rows as follows:

- FIELD: Lists all the fields involved in the query.
- TABLE: For each field entered, the table it came from must be listed. Two tables may have a field with the same name, so the table is necessary to distinguish the fields.
- TOTAL: This line is normally not shown. It can be toggled using the toolbar button. It is used in queries that require grouping of data.
- SORT: This line is used when sorting. One or more fields can be organized in Ascending or Descending order.
- SHOW: This box is checked by default and specifies whether or not the field will be shown in the resulting dynaset.
- CRITERIA: This specifies criteria that a field must meet in order for a record to be included in the resulting dynaset, e.g., Donation >= 200.
- OR: This specifies any alternate (Boolean Or) criteria.

The mechanics of selecting the data objects and the appropriate fields are covered in the course textbook. Here, the focus is on which design elements are needed to solve specific problems.

FILLING OUT THE QUERY GRID

The design grid in Figure 5 is a query that lists the Name and Address fields from the Members table. The Show line indicates that both of these fields will be displayed in the resulting dynaset. The other fields in the Members table are not listed on the grid and therefore play no role in either the dynaset display or the selection of records. The resulting dynaset for this query will display these two fields for all records in the Members table, as no criteria are specified to filter the data.

Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x        x
Criteria:
Or:

Figure 5

How can a question be translated onto the QBE grid? As with any problem solving activity, it helps to have a methodology containing a set of logical steps to follow. The following is a simplified method for taking a question and translating it into a QBE grid.
This procedure will be extended in later chapters to allow for the use of multiple tables in a query.

Step 1: Plan the QBE grid. When filling out the QBE grid it is always wise to plan ahead, since the next task will be to select the table to be included in the query. To know which table (or previous query) is required, the following must be known:

- What fields are to be shown in the resulting dynaset, and in which table (or previously written query) can these fields be found?
- What field(s) will be filtered, and therefore what criteria should be specified for each such field? For example, if only a list of active members is desired, the criterion Yes must be specified in the Active field.

Step 2: Add the table. Using the Show Table feature, add the table(s) or saved query required for the query. The Show Table dialog box can be launched using the Show Table button in the Query Setup group of the Design ribbon. The Show Table dialog box can be seen in Figure 6. Select the desired table (or query) object in the dialog box by double-clicking the object name, or by clicking the object and then the Add button.

[Figure 6: Show Table dialog box.]

Step 3: Fill out the grid.

1. List each of the fields to be shown in the resulting dynaset. Select the fields from the table(s) shown. The detailed mechanics are described in the course text.
2. List any fields that are not shown but that will contain criteria. Make sure to uncheck the Show box for such fields.
3. Write the necessary criteria in the appropriate fields using the Criteria and Or lines.
4. Use the Sort feature, Group by feature, and Expression Builder as needed.

Step 4: Save and name the query.

SELECT QUERIES

This section discusses writing queries that select specific records based on a set of criteria. This includes a single criterion in a single field, multiple criteria in a single field, and multiple criteria in multiple fields.
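The planning steps distinguish fields that are shown from fields that only carry a criterion. A rough sketch of that distinction follows, assuming SQLite through Python's sqlite3 module rather than Access, with illustrative lowercase column names and Yes/No stored as 1/0. In SQL terms, the shown fields appear in the SELECT list, while a field that is used only for filtering appears only in the WHERE clause.

```python
import sqlite3

# Assumption: SQLite stands in for Access; only the three fields needed
# for this question are loaded, with Active stored as 1 (Yes) or 0 (No).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, address TEXT, active INTEGER)")
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", [
    ("Acquila, Jose", "Pickerington, OH", 1),
    ("Adler, Lawrence", "Pataskala, OH", 1),
    ("Amico, Donald", "Toledo, OH", 1),
    ("Anderson, William", "Cincinnati, OH", 0),
    ("Applegate, Bernice", "Cleveland, OH", 1),
    ("Aston, Martin", "Columbus, OH", 0),
    ("Attwater, Lester", "Columbus, OH", 1),
])

# Shown fields: name, address. Criterion-only field: active (Show unchecked).
rows = conn.execute(
    "SELECT name, address FROM members WHERE active = 1"
).fetchall()

for name, address in rows:
    print(name, "|", address)
```

The active column filters the records but never appears in the output, which mirrors unchecking the Show box for a criteria-only field.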
QUERIES WITH A SINGLE CRITERION

First, the methodology described in the previous section will be used to create a query containing a single condition on a single field. A condition can be specified by using the Criteria line of the QBE grid. Consider the following example:

Write a query to select all active members, listing their names and addresses. The results should be similar to the structure and data listed in Figure 7.

Name                Address
Acquila, Jose       Pickerington, OH
Adler, Lawrence     Pataskala, OH
Amico, Donald       Toledo, OH
Applegate, Bernice  Cleveland, OH
Attwater, Lester    Columbus, OH

Figure 7

Step 1: Plan the QBE grid. The question specifies that the Name and Address fields are to be shown in the resulting dynaset. Both are found in the Members table. In addition, only active members should be included on this list, which requires a condition on the Active field (i.e., that it is equal to Yes). This field is also in the Members table.

Step 2: Add the Members table to the QBE design view in Access. All information needed to answer the question is found in this table.

Step 3: Select the appropriate fields (Name, Address, and Active) and place them in the grid. Since Name and Address are to be shown in the dynaset, be sure the Show checkbox for each is checked. Uncheck the Show box for Active, as this field requires a criterion but is not asked to be shown in the final result. Then fill in the selection criterion on the Criteria line of the QBE grid, in this case in the Active field column. Since this is a Boolean field, the criterion can be either Yes or No. Since only active members are requested, the criterion is Yes. Quotes should not be used around Yes, since it is a value in a Boolean field and not a text label.

Step 4: Save the QBE grid by naming the query Query1. The final version of the QBE grid can be seen below (Figure 8).
Table: Members

Field:     Name     Address  Active
Table:     Members  Members  Members
Sort:
Show:      x        x
Criteria:                    Yes

Figure 8: Query 1

Logically, what steps are required to obtain this dynaset? To obtain this result, one would systematically go through each record of the Members table and test whether the criteria are met. If the criteria are met (in this example, the member is active), then the appropriate fields for that record are displayed. Notice that the records for Aston and Anderson do not appear, since they are not active members.

Another example of how a QBE grid is designed to answer a question with a single criterion is as follows:

Write a query to list the names and addresses of all members who live in Bexley, Ohio.

Using the methodology outlined, the QBE grid can be written as seen in Figure 9. Notice that the criterion "Bexley, OH" is a criterion on a text field and therefore requires quotation marks. If you run this query you will discover that the dynaset contains no records, since none of the members live in Bexley.

Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x        x
Criteria:           "Bexley, OH"
Or:

Figure 9: Query 2A

Is it a good idea to have a field in a database that contains both the city and state combined, as is the case in the example database? This is probably not the best way to organize data in a table, as the city and state are not an inseparable unit. At times only the city or only the state may be needed. A better design for the original table would have been to make these two pieces of information separate fields.

Write a query to list all club members and their addresses.

Such a query contains no criteria. The two fields containing the required information should be placed on the QBE grid and the Criteria row should be left blank, as seen in Query 2B.
Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x        x
Criteria:
Or:

Figure 10: Query 2B

USING RELATIONAL OPERATORS IN QUERY CRITERIA

It is also possible to use relational operators to specify criteria on a QBE grid. For example:

Write a query to list by ID# all transactions with donations of greater than $0. Show the ID# and donation amount.

- What fields will be shown? ID# and Donation, both from the Finances table.
- What fields will have criteria that are not shown? No additional fields.
- What criteria are needed? The Donation field will have the criterion >0. This should be inserted on the grid in the column containing the Donation field, on the row labeled Criteria.
- What table(s) are needed? All fields used are from the Finances table.

The query grid and resulting dynaset are shown in Figure 11.

Table: Finances

Field:     ID#       Donation
Table:     Finances  Finances
Sort:
Show:      x         x
Criteria:            >0
Or:

id#  donation
345  50
546  250
534  20

Figure 11: Query 3A

To specify a greater-than criterion, use the greater-than operator followed by the value. The syntax does not include quotes when the field is a numerical type. The following relational operators can be used in an Access query:

Equal to:                  =
Not equal to:              <>
Greater than:              >
Greater than or equal to:  >=
Less than:                 <
Less than or equal to:     <=

Relational operators work equally well on text fields. A criterion of < "m" will select all records where the value in that field begins with the letters A through L. This is illustrated on the Query 3B QBE grid in Figure 12.

Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x        x
Criteria:  < "m"
Or:

Figure 12: Query 3B

MS Access query criteria are not case sensitive; therefore M and m are treated identically. This is not necessarily the case for other DBMS software packages. Also notice that the criterion value, but not the relational operator, is enclosed in quotes.
Criteria for text fields, unlike numerical fields, require quotation marks around the text.

WILD CARDS IN CRITERIA FOR TEXT FIELDS

Another tool that can be used when writing a criterion on a text field is the wildcard. The * wildcard stands for any number of characters (including none). The ? wildcard stands for exactly one character. Whenever a wildcard is included in a criterion, the keyword Like must precede the expression.

- Like "J*": matches anything that starts with a J, no matter how many characters follow or what they are. Jay, Justice, and Jo would all meet the criterion.
- Like "t*p": matches anything that starts with t and ends with p. top, tip, and tekdp would all meet the criterion.
- Like "*4*": matches any entry that contains the character 4.
- Like "?d": matches anything that ends in d and has exactly one character in front of it. Id, bd, and od would all meet the criterion. Tad would not.
- Like "s??n": matches anything that is four characters long, begins with s, and ends with n. sean, soon, and sxxn would all meet the criterion.

Write a query to list all members who have phone numbers that begin with 614.

The QBE grid (Query 4, in Figure 13) and resulting dynaset are as follows. Notice that since 614 is part of a text field, the numeral characters are enclosed in quotes.

Table: Members

Field:     Name     Phone
Table:     Members  Members
Sort:
Show:      x
Criteria:           Like "614*"
Or:

name
Acquila, Jose
Adler, Lawrence
Aston, Martin
Attwater, Lester

Figure 13: Query 4

QUERIES USING MULTIPLE CRITERIA IN A SINGLE FIELD:

The examples shown thus far have been limited to a single criterion: one condition in one field. Often multiple criteria are required to obtain the desired record set.
An example of multiple criteria in one specific field is as follows:

List the names of all members who live either in Pickerington or in Columbus.

The QBE grid to list the names of members living either in Pickerington, OH or in Columbus, OH is seen in Figure 14.

Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x
Criteria:           "Pickerington, OH" OR "Columbus, OH"
Or:

name
Acquila, Jose
Aston, Martin
Attwater, Lester

Figure 14: Query 5A

Notice the use of the Boolean operator OR within the Criteria box of the QBE grid. The OR operator in Access has the same meaning as the one in Excel, but the syntax is different. Another way to write the same criteria is to place the second item on a separate Or line. Any item listed on an Or line is assumed to have an "or" relationship to the criteria above it. An example of this alternate QBE design is seen in Query 5B in Figure 15. The resulting dynaset will be identical to the one generated by Query 5A.

Table: Members

Field:     Name     Address
Table:     Members  Members
Sort:
Show:      x
Criteria:           "Pickerington, OH"
Or:                 "Columbus, OH"

Figure 15: Query 5B

Another example:

List the ID#s and transaction amounts for all donations between $50 and $100.

The criteria here can be interpreted as >=$50 and <=$100. The QBE grid would be as follows:

Table: Finances

Field:     ID#       Donation
Table:     Finances  Finances
Sort:
Show:      x         x
Criteria:            >=50 AND <=100
Or:

id#  donation
345  50

Figure 16: Query 6

Notice that the Criteria box for the Donation field in Figure 16 contains a familiar Boolean operator, AND, again with a different syntax than seen in previous applications. The resulting dynaset contains only one record: only member 345 made a donation between 50 and 100 dollars. Another valid syntax to express >=$50 and <=$100 is Between 50 And 100.

Can the AND operator be eliminated and the second criterion (<=100) placed on the next line of the grid? No, the Or lines represent "or" operations only.
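The difference between combining two conditions on one field with AND and splitting them across Or lines can be checked with a small experiment. This is a sketch under stated assumptions: SQLite through Python's sqlite3 module stands in for Access, and the lowercase column names are illustrative. It runs the Query 6 condition three ways against the Finances data from Figure 1.

```python
import sqlite3

# Assumption: SQLite stands in for Access; data copied from Figure 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE finances (id INTEGER, dues_paid INTEGER, donation INTEGER)")
conn.executemany("INSERT INTO finances VALUES (?, ?, ?)", [
    (635, 100, 0),
    (534, 50, 0),
    (345, 100, 50),
    (546, 0, 250),
    (903, 0, 0),
    (534, 50, 20),
])

# Both conditions must hold for the same record (the AND in Figure 16).
and_rows = conn.execute(
    "SELECT id, donation FROM finances WHERE donation >= 50 AND donation <= 100"
).fetchall()

# The Between form expresses the same inclusive range.
between_rows = conn.execute(
    "SELECT id, donation FROM finances WHERE donation BETWEEN 50 AND 100"
).fetchall()

# Splitting the conditions across Or lines turns the AND into an OR,
# and every donation is either >= 50 or <= 100.
or_rows = conn.execute(
    "SELECT id, donation FROM finances WHERE donation >= 50 OR donation <= 100"
).fetchall()

print("AND:    ", and_rows)
print("BETWEEN:", between_rows)
print("OR:     ", len(or_rows), "records")
```

The AND and Between forms both return only the record for member 345, while the OR form returns all six transactions, which is why the second condition cannot simply be moved to an Or line.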
What if the criteria in the Donation field look like this: $50 AND $100? Logically, no single record can have both a $50 value and a $100 value in the Donation field simultaneously. Thus no records will be selected.

QUERIES USING MULTIPLE CRITERIA IN MULTIPLE FIELDS:

The next step in learning how to set up QBE grids is representing multiple criteria in multiple fields. Consider the following example:

List the names of all active members in the Columbus area.

Here there are criteria on both the Address and Active fields simultaneously. Both of these criteria must be met for a record to be selected, which indicates that an AND operation is needed. To represent an AND relationship between two or more fields on a QBE grid, place the criteria on a single line. The resulting grid and dynaset for this example are shown in Figure 17.

Table: Members

Field:     Name     Address         Active
Table:     Members  Members         Members
Sort:
Show:      x
Criteria:           "Columbus, OH"  Yes
Or:

name
Attwater, Lester

Figure 17: Query 7

Now consider the case of listing all active members from Columbus and all inactive members from Cincinnati. Include each member's name, address, and active status. Here we have the following logical operations: EITHER Address is Columbus AND Active is Yes, OR Address is Cincinnati AND Active is No. To express the AND operation, place the criteria on the same line. To express the OR operation, place one set of criteria on one line and the second set on the next. Thus the resulting QBE grid and dynaset will look like the following:

Table: Members

Field:     Name     Address           Active
Table:     Members  Members           Members
Sort:
Show:      x        x                 x
Criteria:           "Columbus, OH"    Yes
Or:                 "Cincinnati, OH"  No

name               address         active
Anderson, William  Cincinnati, OH  No
Attwater, Lester   Columbus, OH    Yes

Figure 18: Query 8

QUERIES WITH SORTING:

The use of the Sort feature to temporarily organize records in tables was discussed in Chapter 2.1.
A similar but more powerful feature is available in the query tool. To sort on a field in a query, select the field to be sorted and indicate the sort order (Ascending or Descending) on the Sort line of the QBE grid. Sorting can be combined with any other set of criteria.

SORTING ON A SINGLE FIELD

Consider the following problem:

Write a query that will list the names of all members, their addresses, and join dates. List them in order of join date, with the most recent first.

Table: Members

Field:     Name     Address  Join Date
Table:     Members  Members  Members
Sort:                        Descending
Show:      x        x        x
Criteria:
Or:

Name                Address           Join Date
Applegate, Bernice  Cleveland, OH     3/17/01
Attwater, Lester    Columbus, OH      9/1/00
Anderson, William   Cincinnati, OH    4/10/00
Adler, Lawrence     Pataskala, OH     3/4/99
Aston, Martin       Columbus, OH      11/7/98
Acquila, Jose       Pickerington, OH  1/5/97
Amico, Donald       Toledo, OH        1/5/97

Figure 19: Query 9

To accomplish this sort, the word Descending is added to the grid on the Sort row in the corresponding field column. Dates, much like in Excel, are represented by values, where the lower the value the earlier the date. Thus to sort the Join Date field by most recent date first, Descending is used. Ascending is used to sort from smallest to largest value (earliest date first). When entering a sort, click on the Sort row in the desired column and choose either Ascending or Descending from the drop-down list.

SORTING ON MULTIPLE FIELDS

Again consider the same query, but this time sort by address and then by join date. This requires sorting on multiple fields. The question is which field is sorted first: which field is the major sort and which field is the minor sort? Should the list have (1) all like addresses together, organized by join date, or (2) all like join dates together, organized by address (Figure 20)?
(1) Sorted first by City, then Date    (2) Sorted first by Date, then City
Cincinnati, OH    4/10/00              Pickerington, OH  1/5/97
Cleveland, OH     3/17/01              Toledo, OH        1/5/97
Columbus, OH      11/7/98              Columbus, OH      11/7/98
Columbus, OH      9/1/00               Pataskala, OH     3/4/99
Pataskala, OH     3/4/99               Cincinnati, OH    4/10/00
Pickerington, OH  1/5/97               Columbus, OH      9/1/00
Toledo, OH        1/5/97               Cleveland, OH     3/17/01

Figure 20

Notice that the order of the records in each dynaset differs. In case (1) the location field is the major sort and the date field is the minor sort. In case (2) the date field is the major sort and the location field is the minor sort.

How is the sort order specified on the QBE grid? Thus far the order of the selected fields on a QBE grid has not been important, as it has affected only the order of the fields displayed and not which records are selected. However, when sorting records into a specific order with multiple sort fields, the order of the fields listed on the QBE grid is critical to obtaining the desired outcome. The placement of these fields on the grid determines which field is the major sort and which fields are the minor sorts. In other words, the order of the fields in the QBE grid determines the order of the records in the resulting dynaset.

When executing a query, the computer first selects the requested records based on the given criteria. Then it sorts the records. It looks through the Sort row from left to right, sorting on the first field it finds in the QBE grid with sorting instructions. Then it continues to the next column with sorting instructions, sorting as instructed but this time performing the minor sort within the major sort groupings. The next column is then considered for a further minor sort, and so on. Sorted fields need not be the first fields in the grid nor adjacent to each other, but their left-to-right order must be major sort first, then minor sort 1, minor sort 2, etc.
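The major and minor sort rule can be tried outside Access as well. The sketch below assumes SQLite through Python's sqlite3 module as a stand-in, with illustrative lowercase column names and dates stored as ISO text so that text order matches date order. In SQL, the left-to-right order of the ORDER BY list plays the role that the left-to-right order of sorted columns plays on the QBE grid.

```python
import sqlite3

# Assumption: SQLite stands in for Access; data copied from Figure 1,
# with join dates rewritten as YYYY-MM-DD text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, address TEXT, join_date TEXT)")
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", [
    ("Acquila, Jose", "Pickerington, OH", "1997-01-05"),
    ("Adler, Lawrence", "Pataskala, OH", "1999-03-04"),
    ("Amico, Donald", "Toledo, OH", "1997-01-05"),
    ("Anderson, William", "Cincinnati, OH", "2000-04-10"),
    ("Applegate, Bernice", "Cleveland, OH", "2001-03-17"),
    ("Aston, Martin", "Columbus, OH", "1998-11-07"),
    ("Attwater, Lester", "Columbus, OH", "2000-09-01"),
])

# Address is the major sort; join date (most recent first) is the minor sort.
by_address_then_date = [row[0] for row in conn.execute(
    "SELECT name FROM members ORDER BY address ASC, join_date DESC"
)]

# Join date (most recent first) is the major sort; address is the minor sort.
by_date_then_address = [row[0] for row in conn.execute(
    "SELECT name FROM members ORDER BY join_date DESC, address ASC"
)]

print(by_address_then_date)
print(by_date_then_address)
```

In the first ordering the two Columbus members stay together and the date only decides their order within that group. In the second, the address only breaks the tie between the two members who joined on the same day.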
The QBE grid for (1) is Figure 21 and for (2) is Figure 22:

Write a query that will list the names of all members, their addresses, and join dates. Sort the list by address and then by join date (most recent first).

Table: Members

Field:     Name     Address    Join Date
Table:     Members  Members    Members
Sort:               Ascending  Descending
Show:      x        x          x
Criteria:
Or:

Name                Address           Join Date
Anderson, William   Cincinnati, OH    4/10/00
Applegate, Bernice  Cleveland, OH     3/17/01
Attwater, Lester    Columbus, OH      9/1/00
Aston, Martin       Columbus, OH      11/7/98
Adler, Lawrence     Pataskala, OH     3/4/99
Acquila, Jose       Pickerington, OH  1/5/97
Amico, Donald       Toledo, OH        1/5/97

Figure 21: Query 10, Sort (1)

Write a query that will list the names of all members, their addresses, and join dates. Sort the list by join date (most recent first) and then by address.

Table: Members

Field:     Name     Join Date   Address
Table:     Members  Members     Members
Sort:               Descending  Ascending
Show:      x        x           x
Criteria:
Or:

Name                Join Date  Address
Applegate, Bernice  3/17/2001  Cleveland, OH
Attwater, Lester    9/1/2000   Columbus, OH
Anderson, William   4/10/2000  Cincinnati, OH
Adler, Lawrence     3/4/1999   Pataskala, OH
Aston, Martin       11/7/1998  Columbus, OH
Acquila, Jose       1/5/1997   Pickerington, OH
Amico, Donald       1/5/1997   Toledo, OH

Figure 22: Query 11, Sort (2)

Rewrite Query 11 to display only records for members who joined before January 1, 2000.

Use the exact same QBE grid, but include a criterion in the Join Date field. Note that when typing in dates, # signs are used before and after the date. Dates can be expressed as month/day/year, #1/1/2000#, or they can be written out as #January 1, 2000#. The # signs allow Access to determine that #1/1/2000# is January 1, 2000 and not the number 1 divided by 1 divided by 2000. Figure 23 is the QBE grid and resulting dynaset for this query.
Table: Members

Field:     Name     Join Date    Address
Table:     Members  Members      Members
Sort:               Descending   Ascending
Show:      x        x            x
Criteria:           <#1/1/2000#
Or:

Name             Join Date  Address
Adler, Lawrence  3/4/99     Pataskala, OH
Aston, Martin    11/7/98    Columbus, OH
Acquila, Jose    1/5/97     Pickerington, OH
Amico, Donald    1/5/97     Toledo, OH

Figure 23: Query 12

AGGREGATING FIELDS USING QUERIES:

How much has each member donated in total? To answer this question manually requires going through each transaction record and adding up the donations for each specific member number. For example, member 534 has two transactions listed: one with a donation amount of $0 and the other with a donation amount of $20. Thus the total donation amount for member 534 is $20. This process needs to be repeated for each member ID to find the total for each member.

Access queries can accomplish this same task using a feature known as aggregate functions. This feature can perform tasks such as counting all the members living in Columbus, OH, totaling the dues or donations by ID#, or determining the average donation amount. Microsoft Access has the following built-in aggregate functions: Sum, Avg, Count, Min, Max, StDev, Var, First, Last. This aggregation of data may be done on all selected records or in specific groupings, using the Group By specification in the desired field.

Look at the QBE grid below, which counts all members by location. The resulting dynaset should have a record for each location and a "count of ..." field containing the number of records for that location.

Table: Members

Field:     Address   ID#
Table:     Members   Members
Total:     Group By  Count
Sort:
Show:      x         x
Criteria:
Or:

address           CountOfid#
Cincinnati, OH    1
Cleveland, OH     1
Columbus, OH      2
Pataskala, OH     1
Pickerington, OH  1
Toledo, OH        1

Figure 24: Query 13

Aggregating is specified on the Total line of the QBE grid. This line does not automatically appear on the grid.
To display the Total line, click the Totals button on the Design ribbon. If an aggregate function is specified for any field in the grid, the Total line must be filled in for every other field in the grid. Each Total box in the row must be filled in with one of the following: Group By, a function (Count, Sum, Avg, etc.), Where (in fields with criteria), or Expression (for calculated data).

The grid in Figure 24 groups the Address field so that all like entries in this field (Columbus entries, Toledo entries, etc.) are put together into a group. Then, within each group, the records are counted using the ID# field. Notice that in the resulting dynaset, the field name for this counted field is CountOfID# instead of ID#.

In the example above, the records in the ID# field of the Members table were counted. This query would work just as well if the Phone field or any other field were used to count the records in each group. The new field name in the dynaset would then correspond to the field used (CountOfPhone, etc.). However, when summing a field (or using Min, Max, Avg, or StDev), only the field being summed may be used, as demonstrated in a later example.

Also consider what happens if the Name field were included in a query to count the number of members by location. Each city grouping may include many different members and thus many different member names. If the query is grouped by both Name and Address, the computer will automatically group by the smallest level of detail, in this case Name. Hence, the subtotal calculated will no longer be for each city as desired, but for each name. Thus, it is necessary to plan the groupings to correctly reflect the level of aggregation.

How many active members reside in each city?

Modify the QBE grid from Query 13 to include the field Active with the criterion Yes.
Then, in the Total row, include a Where to tell Access that only records meeting this criterion should be included in the count. Fields with a Where (criteria) should not be shown. Will the results differ from those of Query 13?

Tables: Members

Field      Address     ID#        Active
Table      Members     Members    Members
Total      Group by    Count      Where
Sort
Show       x           x
Criteria                          Yes
OR

address             CountOfid#
Cleveland, OH       1
Columbus, OH        1
Pataskala, OH       1
Pickerington, OH    1
Toledo, OH          1

Figure 25 – Query 14

By ID#, create a summary listing the total dues-paid and the average donation made.

Assume each member can have multiple dues and donation payment transactions. The QBE grid and resulting dynaset are illustrated in Figure 26.

Tables: Finances

Field      ID#         Dues-Paid    Donation
Table      Finances    Finances     Finances
Total      Group by    Sum          Avg
Sort
Show       x           x            x
Criteria

id#    SumOfdues-paid    AvgOfdonation
345    100               50
534    100               10
546    0                 250
635    100               0
903    0                 0

Figure 26 – Query 15

Recall that member 534 had two transactions. One transaction was a $50 dues payment; the other was a $50 dues payment and a $20 donation. Notice that the aggregated values displayed for member 534 are $100 in total dues payments and $10 in average donations per transaction. As previously mentioned, unlike the Count function, the other aggregate functions such as Sum, Avg, Min, and Max must be applied only to the field being summed, averaged, etc.

How can the total dues paid by all members and the overall average donation made by members be obtained?

The query would look almost the same as Query 15, but without the ID# field grouping (Group by). Query 16 details the results; notice that only a single record containing these values is returned.
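For readers who want to see the same grouping logic outside the QBE grid, here is a minimal sketch in Python using the standard sqlite3 module rather than Access. The table and column names (finances, member_id, dues_paid, donation) are illustrative stand-ins, not Access names, and the six rows are the transactions implied by the chapter's figures. GROUP BY plays the role of Group by on the Total line; leaving it out corresponds to removing the ID# grouping as in Query 16.

```python
import sqlite3

# Transactions implied by the chapter's figures: (member id, dues paid, donation).
# Table and column names are illustrative; Access itself is not involved.
ROWS = [
    (635, 100, 0),
    (534, 50, 0),
    (345, 100, 50),
    (546, 0, 250),
    (903, 0, 0),
    (534, 50, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE finances (member_id INTEGER, dues_paid REAL, donation REAL)"
)
conn.executemany("INSERT INTO finances VALUES (?, ?, ?)", ROWS)

# Like Query 15: one output record per member id (the Group by field).
per_member = conn.execute(
    "SELECT member_id, SUM(dues_paid), AVG(donation) "
    "FROM finances GROUP BY member_id ORDER BY member_id"
).fetchall()

# Like Query 16: no grouping field, so a single summary record comes back.
overall = conn.execute(
    "SELECT SUM(dues_paid), AVG(donation) FROM finances"
).fetchone()

for row in per_member:
    print(row)
print(overall)
```

Under these assumptions, member 534's two transactions collapse into one record, and the ungrouped query returns exactly one record for the whole table.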
Tables: Finances

Field      Dues-Paid    Donation
Table      Finances     Finances
Total      Sum          Avg
Sort
Show       x            x
Criteria

SumOfdues-paid    AvgOfdonation
300               53.33

Figure 27 – Query 16

USING CALCULATED FIELDS IN QUERIES:

One of the cardinal rules of database design is never to repeat information or store information that can be easily generated from existing data. Consequently, when information such as the total value of a transaction (dues plus donation) is needed, a query can be written to obtain it instead of storing it permanently in a table.

The Access Expression Builder tool, found on the Design ribbon in the Query Setup group, can be used to write expressions in queries. As with any software tool, there is a specific syntax which must be used, and a library of available functions.

Figure 28 – Expression Builder

The Expression Builder dialog box, as seen in Figure 28, allows the user to select the desired object (table, saved query, function) from the hierarchical list in the bottom left section of the Builder. Once the object is selected, the fields available from that object are listed in the bottom center section. Double-clicking on a field adds it to the expression being built in the top box of the Builder.

USING THE EXPRESSION BUILDER TOOL – A SIMPLE EXAMPLE:

Consider the following problem: Write a query that adds Dues-Paid to Donation for each transaction.

To create a query that calculates these values from the existing data fields, follow the steps below:

- Start a new query and show the appropriate table, Finances.
- List the ID# field on the QBE grid.
- From a blank QBE grid box (Field row), select the expression building button to launch the Expression Builder tool.
- To select the first operand, Dues-Paid, first select the Finances table object from the hierarchical list of available objects (tables, queries, etc.) in the bottom left box.
  Once the object is selected, the available fields for that object will appear in the center box of the Expression Builder. Click on the field required, in this case Dues-Paid.
- Click on the + operator just below the Expression section, or simply type a plus sign in the Expression box next to Dues-Paid.
- Add the second operand, Donation, using this same methodology.
- Click the OK button when the expression is completed.

The resulting expression should look something like this:

Exp1: [Finances]![Dues-Paid] + [Finances]![Donation]

Notice the expression syntax that is generated: each object (table, field) appears inside brackets [ ], and table names are followed by an exclamation point (!). Use this syntax when writing expressions. Also notice that this expression is named Exp1, followed by a colon. Expressions can be renamed by simply typing over Exp1 on the QBE grid.

To use any object already listed (or previously calculated) in the current query, save the query before writing the expression. Doing so makes these objects available in the Expression Builder. When referring to fields already selected on the QBE grid, the table names are not required in the formula unless the names are ambiguous (i.e., several fields have the same name, which can occur when using multiple tables).

Placing this expression into the appropriate query, we get the following:

Tables: Finances

Field      ID#         total: [Finances]![dues-paid] + [Finances]![donation]
Table      Finances
Sort
Show       x           x
Criteria

id#    total
635    100
534    50
345    150
546    250
903    0
534    70

Figure 29 – Query 17

WRITING QUERIES WITH AGGREGATIONS AND EXPRESSIONS

Queries may become more and more complex as the many available tools are combined. Thus far, queries have been used to select (filter) records using one or more sets of criteria, sort records, aggregate data based on field groupings, and now perform calculations.
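The newest of these tools, the calculated field from Query 17, can also be sketched outside Access. The following minimal Python example uses the standard sqlite3 module as a stand-in; the table and column names (finances, member_id, dues_paid, donation) are hypothetical, and the rows are the six transactions shown in Figure 29. The alias after AS plays roughly the role of the name typed before the colon in an Access expression such as total:.

```python
import sqlite3

# The six transactions from Figure 29: (member id, dues paid, donation).
# Names are illustrative stand-ins for the Finances table in Access.
ROWS = [
    (635, 100, 0),
    (534, 50, 0),
    (345, 100, 50),
    (546, 0, 250),
    (903, 0, 0),
    (534, 50, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE finances (member_id INTEGER, dues_paid INTEGER, donation INTEGER)"
)
conn.executemany("INSERT INTO finances VALUES (?, ?, ?)", ROWS)

# The total is computed for each record when the query runs; nothing new is
# stored in the table. ORDER BY rowid keeps the rows in entry order.
totals = conn.execute(
    "SELECT member_id, dues_paid + donation AS total "
    "FROM finances ORDER BY rowid"
).fetchall()

for member_id, total in totals:
    print(member_id, total)
```

Because the total is derived at query time, it stays consistent with the stored dues and donation values, which is the point of the rule against storing easily generated data.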
In the following query, both the aggregation of data and the use of expressions will be required.

Summarize for each member id the dues-paid, donations made, and total value of their transactions.

Remember that an individual member can make one or more financial transactions. Thus, to total these transactions, the Aggregate Function tools will be required. Once these aggregated values are saved, the Expression Builder can be used to add SumOfDues-Paid to SumOfDonation. This may be done in a single query or by building on a previously written query, such as Query 17, which has already calculated the total dues plus donations and just requires aggregation. The example below uses the one-step approach, first aggregating the data and then applying the expression.

Please note that not all questions can be answered in a single query. The query tool first aggregates data and then performs the calculation. In some instances these two operations must happen in the opposite order: first a calculation is required, and then the data needs to be aggregated. In such cases the user will need to create two queries.

Tables: Finances

Field      ID#         Dues-Paid    Donation    total: [sumofdues-paid] + [sumofdonation]
Table      Finances    Finances     Finances
Total      Group by    Sum          Sum         Expression
Sort
Show       x           x            x           x
Criteria

id#    SumOfdues-paid    SumOfdonation    total
345    100               50               150
534    100               20               120
546    0                 250              250
635    100               0                100
903    0                 0                0

Figure 30 – Query 18

To execute this query, it is best to create the query without the expression and then save it. Only after it is saved will these values (SumOfDues-Paid, SumOfDonation) be available for use in the Expression Builder. Notice that the Total row contains the word Expression in the field containing the formula. Whenever the Total row is used (that is, whenever there are aggregations), fields containing calculations must include Expression.

Determine the amount still owed, assuming members are required to pay a total of $100 in dues.
To accomplish this, the information generated in the previous query will be needed, specifically the sum of the dues paid by each member. Thus a new query based on the results of Query 18 will be constructed. The calculation required is $100 minus the SumOfDues-Paid.

Tables: Query 18 (the object used is a previous query)

Field      ID#        dues owed: 100 - [Query18]![Sumofdues-paid]
Table      Query18    Query18
Sort
Show       x          x
Criteria

id#    owed
345    0
534    0
546    100
635    0
903    100

Figure 31 – Query 19

Notice that in Query 19, Query 18 was used instead of a table. It is permissible to use any combination of tables and query objects to create new query objects. Note the syntax of the expression in the second query. The expression is named dues owed. The formula subtracts the previously calculated amount from a constant. Here the object identifying the data source is [Query18]; as with a table name, it is followed by an exclamation point. The field name is [Sumofdues-paid] (since the dues-paid field was summed in Query 18, the resulting field in the Query 18 dynaset is named Sumofdues-paid).

NULL VALUES:

Consider the original Finances table. When a member makes a dues payment without a donation, a zero has been entered in the donation field. But what if that field is left blank, as seen in the modified table in Figure 33? These blank fields in a record are known as null values.

Original Table:

ID     Dues-Paid    Donation
635    100          0
534    50           0
345    100          50
546    0            250
903    0            0

Figure 32

Modified Table with Null Values:

ID     Dues-Paid    Donation
635    100
534    50
345    100          50
546                 250
903    0

Figure 33

In Excel, blank cells are treated as 0 when they are added. For example, the result of the expression A1+B1, where cell A1 is blank (null) and cell B1 contains 10, is the value 10. Excel's approach to a blank cell in addition is to ignore it. In Access, however, these null values are treated differently.
In a calculated field of a query with the formula [sumofdues-paid] + [sumofdonation], if the donation amount is null for a specific record, the value returned will be null for that record. This propagation of null values can often lead to incorrect results.

Let's consider the effect this modified table would have on the QBE grid from Query 17, which is designed to calculate the total transaction value (dues + donation). Observe that the total value for the first record is null. The computer performs the calculation 100 + null, which results in a null value. So even though member 635 made a $100 payment, the total value displayed is null.

Query 17-modified - Tables: Modified Finances

Field      ID#         total: [Finances]![dues-paid] + [Finances]![donation]
Table      Finances
Sort
Show       x           x
Criteria

id#    total
635
534
345    150
546
903
534    70

Figure 34

Obviously this poses the problem of how to obtain an accurate subtotal for each financial transaction. One of the easiest ways to work with fields that may contain null values is to use the Nz function. The Nz function works like an IF statement: it takes the value in a field and, if the value is null, substitutes an alternate value. If the value is not null, it simply returns the original value. The substitute value can be defined by the user; in the absence of a user definition it defaults to zero in a numeric context.

Nz(variant, [valueifnull])

Here is the modified query using the Nz function in the calculated field expression:

Field      ID#         total: Nz([Finances]![dues-paid]) + Nz([Finances]![donation])
Table      Finances
Sort
Show       x           x
Criteria

id#    total
635    100
534    50
345    150
546    250
903    0
534    70

Figure 35

The expression used, total: Nz([Finances]![dues-paid]) + Nz([Finances]![donation]), wraps each operand in the Nz function. Since a valueifnull argument is not given, a field containing a null value will result in the value zero.
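Null propagation and the effect of substituting zero can be tried outside Access as well. The sketch below uses Python's standard sqlite3 module as a stand-in, under two stated assumptions: SQLite has no Nz function, so its built-in IFNULL(value, 0) is used as a rough analogue with the substitute written explicitly, and the table and column names are hypothetical. Only three rows are used: member 635 with a blank donation, as described in the text, and two rows with no blanks.

```python
import sqlite3

# None is stored as NULL. Member 635 has a blank donation, as in the text;
# the other two rows have no blanks. Names are illustrative, not Access names.
ROWS = [
    (635, 100, None),
    (345, 100, 50),
    (534, 50, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE finances (member_id INTEGER, dues_paid INTEGER, donation INTEGER)"
)
conn.executemany("INSERT INTO finances VALUES (?, ?, ?)", ROWS)

# Plain addition: a NULL operand makes the whole sum NULL (None in Python).
plain = conn.execute(
    "SELECT member_id, dues_paid + donation FROM finances ORDER BY rowid"
).fetchall()

# IFNULL(x, 0) stands in for Nz(x, 0): a NULL operand is replaced by 0
# before the addition, so every record gets a numeric total.
fixed = conn.execute(
    "SELECT member_id, IFNULL(dues_paid, 0) + IFNULL(donation, 0) "
    "FROM finances ORDER BY rowid"
).fetchall()

print(plain)
print(fixed)
```

In this sketch the first record's total is missing in the plain version and equals the dues payment alone once the blank donation is replaced by zero, mirroring the difference between Figure 34 and Figure 35.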
For the first record (id 635) of the Finances table, the computer evaluates the Nz expression as follows: Nz(100) + Nz(null) = 100 + 0 = 100.

EXERCISE 9.1 EMPLOYEE DATABASE

The following Access tables represent staff utilization for an ad agency. Each record in the Job1 and Job2 tables represents a task performed for that job. Answer the questions in this problem based on the tables below. You will be asked to list all tables required to complete each query.

Table: Job1

taskid#    Employee#    Minutes    Date
1          1            23         7/3/2006
2          3            13         7/3/2006
3          4            17         7/8/2006
4          3            5          7/9/2006
5          3            15         7/10/2006
6          8            8          7/10/2006
7          7            5          7/10/2006
8          1            12         7/12/2006
9          7            6          7/13/2006

Table: Employees

Employee#    Name                    Job Description    Job Rate
1            Jorenson, Mike          Artist II          $42.50
2            Howsworth, Elizabeth    Editor             $32.75
3            Rancine, Michael        Photographer       $55.00
4            Elkins, Naomi           Artist II          $44.50
5            Waters, Ann Marie       Artist I           $27.50
6            Devons, Robert          Senior Editor      $52.80
7            James, Miranda          Artist II          $42.50
8            Ballack, Stephanie      Editor             $37.20
9            Moreno, Mandy           Ad Exec            $125.00

Table: Job2

taskid#    Employee#    Minutes    Date
1          3            20         7/1/2006
2          4            10         7/3/2006
3          1            15         7/3/2006
4          3            5          7/5/2006
5          3            15         7/9/2006
6          9            12         7/9/2006
7          7            22         7/10/2006
8          7            15         7/11/2006

1. (a) Draw the relationship diagram between these 3 tables. Label each primary key and the foreign keys on your relationship diagram.

Table Name: Job1         Primary Key:
Table Name: Employees    Primary Key:
Table Name: Job2         Primary Key:

(b) Why can't Employee# be the primary key on the Job1 table?

2. Design a query to list for each Job2 task the following information: the taskid#, the employee's id, and the minutes they worked on this task.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

3. Design a query to list the taskid# and employee's id for each Job1 task which lasted at least 15 minutes.
Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

4. Design a query to list the employee id number for employees who worked on Job2 on either July 1, 2006 or July 10, 2006.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

5. Design a query that lists the employee's id, the date, and the minutes spent for the Job1 tasks that were between 15 and 20 minutes long (inclusive).

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

6. Using the Access Query design view below, design a query that results in a list of employees (Employee#) who worked on Job1 and who either:
Worked for more than 10 minutes on a task on or after July 10, 2006, or
Worked at least 5 minutes on a task before July 10, 2006.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

7. Using the Access Query design view below, design a query that will result in a list of all employees whose last names begin with the letter M or N.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

8. Using the Access Query design view below, design a query to list the names of each employee in alphabetical order.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

9. Using the Access Query design view below, design a query to list the employees who earn more than $35 per hour (Job Rate). List each employee's Name, Employee#, and Job Rate. Sort your list by Job Rate (most expensive first) and then alphabetically by name.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

10. Design a query summarizing the Job2 tasks by employee. The list should include Employee# and the total minutes worked by that employee.
Tables __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

11. Design a query summarizing the Job1 tasks by Employee#. The list should include Employee#, the number of tasks they worked, and the average time per task that they worked.

Tables __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

12. An overhead fee of $5 per task is assessed for each task regardless of duration. Using the result from the previous query (which you named query11), write a query in the design view below which will return the Employee# and the total amount of overhead fees for this employee for Job1. If an employee has not worked, their name should not appear on this list.

Tables __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

Additional room for calculated expression if needed:

13. Currently clients are billed the cost of the tasks completed (hourly rate times hours) plus the overhead fee assessment. An alternative method of billing overhead has been suggested by management as follows: hourly overhead costs will be billed at 25% of the hourly wage plus $3 per hour. So an Ad Exec with an hourly wage of $125/hour would have an hourly overhead charge of $125*.25 + 3. Write a query to calculate the hourly overhead charge for each of the different employees listed on the Employees table. If the Job Rate field is blank, the overhead charge should simply be $3.

Table __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

Additional room for calculated expression if needed:

EXERCISE 9.2 CHAPTER REVIEW – LIBRARY DATABASE

The tables below represent the database system for the Fergie Library. The Books table lists the names of all the books and periodicals owned by the library. The Cardholders table lists the names of all authorized library users.
The Circulation table is a running list of books that are borrowed. When books are borrowed, they are entered onto this table, and when they are returned, the original entry is noted as returned.

Books

Book#     Title                              Author      Type    Price
23456     1812                               Vidal       4       $39
834592    Gone With the Wind                 Mitchell    4       $19
4539      Webster's Unabridged Dictionary    Webster     3       $86
33982     Hunt for Red October               Clancy      1       $22
33983     Hunt for Red October               Clancy      1       $22
43928     Joy of Cooking – 5th Edition       Smith       2       $45

Circulation

Borrow#    Book#    Date Borrowed    Returned
A83765     4539     7/1/98           Yes
A83765     33982    7/1/98           No
C52938     4539     7/4/98           No
C77328     43928    7/6/98           Yes

Cardholders

Borrower#    LastName    FirstName    Address            City        State    Zipcode
A23457       Smith       Michael      222 East 7th ST    Bexley      OH       43209
C52938       Jones       Jennifer     18 Main Street     Dublin      OH       43218
A83765       Walters     John         55 Elm Street      Columbus    OH       43213

1. Database Relationships. Set up the relationships of this database. Using the boxes below, fill in the primary key (if any) of each table and draw relationship lines between tables. Label each relationship with the name of the foreign key(s).

Table name: Books          Primary Key:
Table name: Circulation    Primary Key:
Table name: Cardholders    Primary Key:

2. What field type (Text, Number, Currency, Date/Time, Yes/No, AutoNumber) is best suited for each of the following fields:

Borrow# (Circulation Table)          _______________________
Date Borrowed (Circulation Table)    _______________________
Returned (Circulation Table)         _______________________
Address (Cardholders Table)          _______________________

3. Using the query design view below, construct a query to list the book titles and authors of all the type 4 books in alphabetical order by author, then by book.

Tables Used _____________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

4. List the dynaset created from the query in question #3 based solely on the data shown.

5. List the title of all books that begin with the letters G or J.

Tables Used _____________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

6. List the book# and borrow# of all books borrowed between June 1, 1998, and July 4, 1998. Sort the list by the borrower's number.

Tables Used ____________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

7. List the title of all books that are either type 1 and cost less than $25 or type 2 and cost less than $50.

Tables Used __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

8. Write a query that will summarize the books by author. The query should list the author's name, the total number of books in the collection by that author, and the average price of that author's books.

Tables Used __________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

9. The library would like to calculate the accrued fines for lost books. Assume that all books borrowed before January 1, 2001 and not returned are lost. The following fine is charged for all lost books: $.05 per day from the date borrowed plus a $10 flat handling fee. For each lost book, list the book number, the borrower number, and the fine.

Tables Used ___________________________________
Blank QBE grid rows: Field, Table, Total, Sort, Show, Criteria, OR, OR

Additional Expressions: