Data Manipulation Language: A Primer
Copyright Konana 2000

Data Manipulation Language (DML): A Primer
Copyright Konana, 1999

Prabhudev Konana, Ph.D.
Assistant Professor
The Graduate School of Business
The University of Texas at Austin
Austin, TX 78712

Data Manipulation Language (DML)

Data manipulation can be performed either by typing SQL statements or by using a graphical interface, typically called Query-By-Example (QBE). ACCESS allows you to use both. Which one to use is an individual's choice: some argue that it is enough to know QBE, but many professionals believe that knowing SQL helps significantly, especially for complex queries. This document describes how to use both.

The examples chosen here convey the concepts and are representative of the queries you encounter in real life. The meaning and syntax of each construct are discussed alongside the examples. With this knowledge one can construct quite complicated queries. Knowing how to use SQL or QBE is like swimming: once you understand it you never forget it, and you get better with every subsequent use.

The description of the concepts and examples is unique and uses a cognitive framework that has worked very well. While this cognitive framework will help formulate most queries, it is not a panacea for all types of queries. Only experience can make you better.

The main DML constructs are SELECT, INSERT, DELETE and UPDATE. The SELECT statement retrieves data from one or more tables, views or snapshots (we will ignore snapshots). The other statements, namely INSERT, DELETE and UPDATE, change the database by inserting, deleting and updating data, respectively.

SELECT

SELECT allows users to retrieve data from one or more tables with various conditions. The SELECT statement has a well-structured set of clauses.
The basic clauses are ([ ] implies optional constructs):

    SELECT [DISTINCT] attribute(s) or derived (computed) attributes
    FROM table, [table2, …], [result of another query]
    [WHERE condition on attributes and join conditions for tables]
    [GROUP BY attribute(s)]
    [HAVING condition]
    [ORDER BY attribute(s)];

In order to query a database, one has to conceptualize the query in a clear, logical way (the cognitive framework). This is, in fact, the majority of the work in querying. This cognitive model must then be translated into SQL/QBE execution syntax. To develop the cognitive framework, a 1-6-step model (read One-Six-step model) is proposed. For each query, apply this 1-6-step model until the querying strategy comes naturally to you (remember, this is like learning to swim; until you learn, everything seems difficult).

In this 1-6-step model, the One refers to reading the question carefully to see whether the query involves sub-queries or nested queries (a query within a query). Very often a query requires the result(s) of another query. For example:

a. Find all customers whose purchase value was more than twice the average purchase value of all customers.
b. Find consumers who charge more than 50% of their total annual purchases using some credit card.

If you read query (a) carefully, we need to know the average purchase value of all customers before we can find the customers whose purchases exceed twice that average. Therefore, this query has two queries: (1) an inner query that finds the average purchase value of all customers; and (2) an outer query that uses the result of the first query. Similarly, query (b) requires you to first compute the total annual purchases of each consumer before finding the consumers who charge more than 50% of that total to a credit card. The One in the 1-6-step model requires parsing a query to identify the inner queries (nested or sub-queries) and the outer queries.
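As a rough illustration of the inner/outer split in query (a), the sketch below uses Python's built-in sqlite3 module rather than ACCESS. The purchase table, its columns and the four sample rows are hypothetical and exist only for this illustration; they are not part of the tables used in the rest of this primer.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for ACCESS.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per customer with that customer's purchase value.
conn.execute("CREATE TABLE purchase (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?)",
    [("Ann", 100.0), ("Bob", 50.0), ("Cho", 400.0), ("Dee", 50.0)],
)

# Inner query: the average purchase value of all customers.
inner_sql = "SELECT AVG(amount) FROM purchase"
average = conn.execute(inner_sql).fetchone()[0]

# Outer query: uses the inner query's result inside its WHERE condition.
outer_sql = (
    "SELECT customer, amount FROM purchase "
    f"WHERE amount > 2 * ({inner_sql}) "
    "ORDER BY customer"
)
big_spenders = conn.execute(outer_sql).fetchall()

print("average purchase:", average)
print("more than twice the average:", big_spenders)
```

The inner query is written and run on its own first, and only then embedded in parentheses in the outer query, which mirrors the order of work suggested by the One of the 1-6-step model.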
It is possible to have many inner queries for an outer query, and an inner query may in turn contain inner queries of its own. Then for each query (first inner and then outer), apply the Six questions of the 1-6-step model as discussed below. These six questions translate into the six clauses of the SELECT statement:

a. What do you want to display? -> SELECT clause.
   • In SELECT, list all the attributes and any derived attributes. Derived attributes may be aggregated results such as SUM, AVG and COUNT, or values derived by adding, subtracting, multiplying or dividing values of other attributes (e.g., price * quantity or Sales / # of employees).
b. Where is the data coming from (tables)? -> FROM clause.
   • List the tables that are required to display any information in the SELECT clause or to specify any conditions in the WHERE clause. The FROM clause can also include the result of another query. Any table used in the FROM clause can be renamed (e.g., FROM course newcoursename, where course is the name of the table and newcoursename is the new name).
c. Are there any conditions on attributes and/or any join conditions? -> WHERE clause.
   • Any conditions on the attributes must be listed in the WHERE clause. Also, whenever there is more than one table in the FROM clause, a join condition may be required (there are different types of joins that will be covered later). That is, we need to specify how two tables must be linked. Otherwise, the system pairs every record of one table with every record of the other (i.e., if table A has 1000 records and B has 1000 records, a join without any join condition will create 1,000,000 records!). A condition may use the results of another query (nested query).
d. Do you need to group by any attribute? -> GROUP BY.
   • Whenever an aggregated value (e.g., SUM, AVG, COUNT, MIN, MAX) is used in the SELECT clause, a GROUP BY clause is required. The only exception is when the aggregation involves the whole table as one group.
e. Are there any conditions on the aggregate functions? -> HAVING.
   • When there are conditions on the aggregated values listed in the SELECT clause, those conditions are specified in the HAVING clause, not in the WHERE clause. (In general, the WHERE clause can have only simple non-aggregated conditions.) HAVING can also use the results of another query.
f. Do you need to order the results by some attribute? -> ORDER BY (default ascending order).

(There is much more to the SELECT statement and we will address it gradually with examples.)

The best way to learn DML is by practicing queries. The rest of this chapter uses the (now infamous) COURSE-FACULTY-SECTION example for describing the SELECT statement. Both SQL and QBE solutions are provided. The SQL statements can be used with any DBMS, apart from the ACCESS-specific syntax pointed out in the text. The following tables and data are used in the examples:

COURSE Table:

    Course_ID    Course_Name                    Credit_hours
    MIS373.1     Advanced Databases             3
    MIS373.2     Advanced Data Communications   3
    MIS374       Systems Analysis and Design    3
    MIS380N.1    ITM                            3
    MIS382N.1    Data Communications            3
    MIS382N.2    Database                       3
    MIS382N.3    High-Tech Strategies           3
    MIS382N.4    Systems Analysis and Design    3

FACULTY Table:

    Faculty_SSN    Faculty_Name   Title
    111-11-1111    Konana         Assistant Professor
    222-22-2222    Barua          Associate Professor
    333-33-3333    Ruefli         Professor
    444-44-4444    Leibrock       CTO
    555-55-5555    Jordan         Professor
    666-66-6666    Unknown        Assistant Professor
    777-77-7777    Karem          Assistant Professor

SECTION Table:

    Course_ID    Section_Num   Semester_code   Faculty_SSN    Enrollment
    MIS373.1     1             981             111-11-1111    45
    MIS373.1     2             981             111-11-1111    35
    MIS382N.2    1             991             111-11-1111    58
    MIS382N.2    1             992             111-11-1111    17
    MIS382N.2    1             971             111-11-1111    65
    MIS380N.1    1             984             111-11-1111    60
    MIS380N.1    2             984             111-11-1111    55
    MIS373.1     1             991             111-11-1111    45
    MIS373.2     1             991             222-22-2222    40
    MIS382N.2    2             991             222-22-2222    45
    MIS373.2     1             981             222-22-2222    60
    MIS373.2     2             981             222-22-2222    35
    MIS382N.3    1             991             333-33-3333    32
    MIS380N.1    3             984             333-33-3333    60
    MIS382N.3    2             991             333-33-3333    35
    MIS382N.3    1             981             333-33-3333    42
    MIS382N.3    2             981             333-33-3333    36
    MIS380N.1    4             984             444-44-4444    45
    MIS380N.1    5             984             444-44-4444    50
    MIS382N.4    1             994             666-66-6666    18
    MIS382N.4    2             971             666-66-6666    7
    MIS382N.4    1             971             666-66-6666    18
    MIS382N.4    1             991             666-66-6666    22
    MIS382N.4    2             991             666-66-6666    28
    MIS382N.4    1             981             666-66-6666    31
    MIS382N.4    2             981             666-66-6666    17

SEMESTER Table:

    Semester_code   Year   Semester
    971             1997   Spring
    974             1997   Fall
    981             1998   Spring
    984             1998   Fall
    991             1999   Spring
    992             1999   Summer I
    993             1999   Summer II
    994             1999   Fall

Queries:

Query 1
Find the faculty names starting with the letter "K".

Let's use the 1-6-step cognitive model discussed earlier. Ask yourself the questions and write down the answers:

One of 1-6-step: There are no sub-queries in this query.

Six of 1-6-step:
a. We want to display faculty names (and maybe the title).
b. This information comes from the faculty table.
c. The condition is that the faculty name must begin with the letter "K".
d. There is no need to group since we are not aggregating data based on any group.
e. There is no condition on any aggregation.
f. There is no sorting specification either.

SQL statement:
The following SQL statement is written for ACCESS (other databases use slightly different quoting and wildcard characters, as explained after the statement). In order to type in an SQL statement in ACCESS, please follow the instructions given below:

Step 1: Click on Queries objects.
Step 2: Double click on Create query in Design View.
Step 3: Click on the Close button.

Type in the following SQL statement in the SQL View workspace:

    SELECT faculty.faculty_name, faculty.title
    FROM faculty
    WHERE faculty.faculty_name LIKE "K*";

The SELECT clause lists all the attributes that are needed (question a). FROM lists the tables from which the data are retrieved. WHERE lists any conditions on retrieving data.
The LIKE operator is used whenever we need to search for text strings (or patterns). Any text to be matched must be within quotes (single quotes when you use Oracle and other database products). The * next to K is a wildcard: it implies that any number of characters may follow "K". The ANSI standard uses % instead of *.

Please note that it is always preferable to prefix attributes with their table names. Therefore, faculty_name is specified as faculty.faculty_name and title as faculty.title.

After typing in the SQL query, click on the "!" mark on the menu bar to run the query.

Result:

    Faculty_Name   Title
    Konana         Assistant Professor
    Karem          Assistant Professor

QBE
This is best described with figures. Please follow the instructions given with the figure:

Step 1: Choose Queries Object.
Step 2: Click on Create query in Design View.
Step 3: Select the Faculty table by clicking on it.
Step 4: Click on Add to add Faculty to the workspace.
Step 5: Close the Show Table window by clicking on the Close button. (If you select the wrong table, just click on that table in the Query space and then press the Delete key!)
Step 6: Double click on Faculty_Name. This will display Faculty_Name in the query grid. Similarly, double click on the Title attribute.
Step 7: Type in the Criteria LIKE "K*".
Step 8: Click on the ! mark to run the query.

Once you complete Step 8, you will see the results. You can save this query as Query 1 (or any other name you want).

Query 2
Find all the faculty names with the second letter "o". There can be any number of characters after "o" and only ONE character before "o".

In Step 7 of the Query 1 QBE, please change LIKE "K*" to LIKE "?o*". The corresponding SQL statement changes in the same way in its WHERE clause.
Here, "?" implies that there is exactly one character in the first place, followed by "o" and then an arbitrary number of characters. The answer to this query is:

    Faculty_Name   Title
    Konana         Assistant Professor
    Jordan         Professor

The complete list of options for various types of string matches can be obtained by searching for the LIKE operator in ACCESS Help. The following is cut and pasted from the help:

--- From ACCESS Help ---
The following example returns data that begins with the letter P followed by any letter between A and F and three digits:

    Like "P[A-F]###"

The following table shows how you can use Like to test expressions for different patterns.

    Kind of match         Pattern     Match (returns True)   No match (returns False)
    Multiple characters   a*a         aa, aBa, aBBBa         aBC
                          *ab*        abc, AABB, Xab         aZb, bac
    Special character     a[*]a       a*a                    aaa
    Multiple characters   ab*         abcdefg, abc           cab, aab
    Single character      a?a         aaa, a3a, aBa          aBBBa
    Single digit          a#a         a0a, a1a, a2a          aaa, a10a
    Range of characters   [a-z]       f, p, j                2, &
    Outside a range       [!a-z]      9, &, %                b, a
    Not a digit           [!0-9]      A, a, &, ~             0, 1, 9
    Combined              a[!b-m]#    An9, az0, a99          abc, aj0
--- End ACCESS Help ---

WARNING! The above text search syntax is specific to ACCESS. If you want to use Oracle, the syntax changes slightly. For example, the multi-character wildcard is % instead of *, and the single-character match is _ instead of ?.

Query 3
Find the faculty names along with the details of the classes taught in Spring 1999 (semester code: 991).

One of 1-6-Step: There is no sub-query involved.

Six of 1-6-Step: Ask yourself the questions and write down the answers:
a. We want to display course ID, section number, faculty name and enrollment.
b. This information comes from the Faculty table (names are in the Faculty table only) and the Section table.
c. The condition is that the semester code should be 991.
   Since there are two tables, we also need a join condition to match faculty_SSN in the Section table with faculty_SSN in the Faculty table.
d. There is no need to group since we are not aggregating data based on any group.
e. There is no condition on any aggregation.
f. There is no sorting specification (we will sort by course ID anyway).

SQL

    SELECT section.course_ID, section.section_num, section.semester_code,
           faculty.faculty_name, section.enrollment
    FROM section, faculty
    WHERE section.semester_code = "991"
      AND section.faculty_SSN = faculty.faculty_SSN
    ORDER BY section.course_ID;

Explanation: The SELECT clause lists all the relevant attributes that need to be displayed. These attributes come from two different tables, section and faculty (shown in the FROM clause). Since we are interested in Spring 1999, the semester_code is set to "991". Please note that semester_code is defined with the text data type; therefore, 991 is written within quotes. Since we have two tables, we must have a join condition to match the records of the section table with those of the faculty table. Otherwise, the result will pair every section record with every faculty record (try executing the query without the join condition to see this). Don't forget the semicolon at the end of the SQL statement!

Query Result:

    course_ID    section_num   semester_code   faculty_name   Enrollment
    MIS373.1     1             991             Konana         45
    MIS373.2     1             991             Barua          40
    MIS382N.2    2             991             Barua          45
    MIS382N.2    1             991             Konana         58
    MIS382N.3    2             991             Ruefli         35
    MIS382N.3    1             991             Ruefli         32
    MIS382N.4    2             991             Unknown        28
    MIS382N.4    1             991             Unknown        22

QBE
Please follow the instructions given below:

Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Click on the Faculty table and then press the Add button (or double click on Faculty). Click on Section and then press the Add button (or just double click on Section).
Step 3: Close the window. You should see the Faculty and Section tables in the Select Query workspace. The relationship that was built earlier is immediately displayed. This basically stipulates the join condition for the tables.
Step 4: Double click on Course_ID, Section_Num, Semester_Code and Enrollment in the Section table, and on Faculty_name in the Faculty table. (If you happen to choose a wrong attribute, you can highlight the column and press Delete, or go to Edit in the menu bar and select Delete Column.)
Step 5: Under the Semester_code column, type = "991" in the Criteria cell.
Step 6: Set the sort order to Ascending for Course_ID.
Step 7: Click on the "!" mark to run the query.

Query 4
Find all the sections with an enrollment of less than 30 students and display them in ascending order of enrollment.

One of 1-6-Step: There is no sub-query within the query.

Six of 1-6-Step: Ask yourself the questions and write down the answers:
a. We want to display all attributes from the Section table.
b. This information comes from the Section table only.
c. The condition is that the enrollment must be less than 30.
d. There is no need to group since we are not aggregating data based on any group.
e. There is no condition on any aggregation.
f. Data must be sorted in ascending order of enrollment.

SQL

    SELECT section.*
    FROM section
    WHERE section.enrollment < 30
    ORDER BY section.enrollment;

Explanation: section.* implies that all attributes of the section table must be displayed.

QBE
Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Choose the Section table by double-clicking (or using Add) and close the Show Table window.
Step 3: Double-click on * (i.e., select all attributes).
Step 4: Choose Enrollment from the Section table again.
Step 5: Remove the check mark from Show so that Enrollment is not displayed twice. (See what happens if the check mark is not removed.)
Step 6: Specify that Enrollment should be < 30.
Step 7: Set the Sort direction for Enrollment to Ascending (picture not shown) and click on the ! mark to run the query.

Query result:

    Course_ID    Section_Num   Semester_code   Faculty_SSN    Enrollment
    MIS382N.4    2             971             666-66-6666    7
    MIS382N.4    2             981             666-66-6666    17
    MIS382N.2    1             992             111-11-1111    17
    MIS382N.4    1             971             666-66-6666    18
    MIS382N.4    1             994             666-66-6666    18
    MIS382N.4    1             991             666-66-6666    22
    MIS382N.4    2             991             666-66-6666    28

Query 5
Find all faculty members (names) who taught a section with an enrollment greater than 35 in Spring 1999 (semester_code = 991). Display the course ID, course name, section number and the enrollment as well.

One of 1-6-Step: There are no sub-queries.

Six of 1-6-Step: Ask yourself the questions and write down the answers:
a. We want to display the faculty name, course_ID, course_name, Section_Num and Enrollment.
b. This information comes from the Course table (for the course name), the Faculty table (for the faculty name) and the Section table (for the section_num and enrollment).
c. There are multiple conditions: (1) the enrollment must be greater than 35; (2) the semester_code must be 991; and (3) join conditions for the Course, Section and Faculty tables. The Course and Section tables are matched by the Course_ID attribute, while the Section and Faculty tables are matched by the faculty_SSN attribute.
d. There is no need to group since we are not aggregating data based on any group.
e. There is no condition on any aggregation.
f. There is no ordering condition.
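To see why the join conditions listed under (c) matter, the following sketch counts the rows produced with and without them. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of ACCESS, single-quoted strings instead of double-quoted ones, only the columns needed here, and only the Spring 1999 (991) rows of the SECTION table. The helper count_rows and the constant names are introduced for this sketch.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for ACCESS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty (faculty_SSN TEXT, faculty_name TEXT);
CREATE TABLE course  (course_ID TEXT, course_name TEXT);
CREATE TABLE section (course_ID TEXT, section_num INTEGER, semester_code TEXT,
                      faculty_SSN TEXT, enrollment INTEGER);
""")
conn.executemany("INSERT INTO faculty VALUES (?, ?)", [
    ("111-11-1111", "Konana"), ("222-22-2222", "Barua"),
    ("333-33-3333", "Ruefli"), ("444-44-4444", "Leibrock"),
    ("555-55-5555", "Jordan"), ("666-66-6666", "Unknown"),
    ("777-77-7777", "Karem"),
])
conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("MIS373.1", "Advanced Databases"),
    ("MIS373.2", "Advanced Data Communications"),
    ("MIS374", "Systems Analysis and Design"),
    ("MIS380N.1", "ITM"),
    ("MIS382N.1", "Data Communications"),
    ("MIS382N.2", "Database"),
    ("MIS382N.3", "High-Tech Strategies"),
    ("MIS382N.4", "Systems Analysis and Design"),
])
# Subset of SECTION: only the Spring 1999 (semester_code 991) rows.
conn.executemany("INSERT INTO section VALUES (?, ?, ?, ?, ?)", [
    ("MIS382N.2", 1, "991", "111-11-1111", 58),
    ("MIS373.1", 1, "991", "111-11-1111", 45),
    ("MIS373.2", 1, "991", "222-22-2222", 40),
    ("MIS382N.2", 2, "991", "222-22-2222", 45),
    ("MIS382N.3", 1, "991", "333-33-3333", 32),
    ("MIS382N.3", 2, "991", "333-33-3333", 35),
    ("MIS382N.4", 1, "991", "666-66-6666", 22),
    ("MIS382N.4", 2, "991", "666-66-6666", 28),
])

BASE = "SELECT COUNT(*) FROM faculty, section, course"
JOINS = (" WHERE faculty.faculty_SSN = section.faculty_SSN"
         " AND section.course_ID = course.course_ID")
FILTERS = " AND section.enrollment > 35 AND section.semester_code = '991'"


def count_rows(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


no_join = count_rows(BASE)                     # every combination of rows
joined = count_rows(BASE + JOINS)              # two join conditions applied
filtered = count_rows(BASE + JOINS + FILTERS)  # all conditions from (c)
print(no_join, joined, filtered)
```

With 7 faculty rows, 8 course rows and 8 section rows, the unjoined FROM clause yields 7 x 8 x 8 = 448 combinations. The two join conditions reduce this to one row per section (8), and the remaining conditions leave 4 rows.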
SQL

    SELECT faculty.faculty_name, section.course_ID, course.course_name,
           section.section_num, section.enrollment
    FROM faculty, section, course
    WHERE section.enrollment > 35
      AND section.semester_code = "991"
      AND faculty.faculty_SSN = section.faculty_SSN
      AND section.course_ID = course.course_ID;

Explanation: Since we have three tables, we must have two join conditions. Please run the query without the join conditions and see what happens.

Query Result:

    faculty_name   course_ID    course_name                    section_num   enrollment
    Konana         MIS382N.2    Database                       1             58
    Konana         MIS373.1     Advanced Databases             1             45
    Barua          MIS373.2     Advanced Data Communications   1             40
    Barua          MIS382N.2    Database                       2             45

QBE
Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Select the course, faculty and section tables.
Step 3: Double-click the attributes in the order shown.
Step 4 and Step 6: These steps are labeled only in the figure. Presumably they enter the two criteria from the SQL statement, > 35 under Enrollment and = "991" under Semester_code.
Step 5: Remove the check mark so that Semester_code is not displayed.
Step 7: Click on the ! icon and run the query!

Query 6
Find the number of sections taught, total enrollment and average class size for each faculty member for Spring 1998 (display the name of the faculty member as well).

One of 1-6-Step: There are no sub-queries.

Six of 1-6-Step: Ask yourself the questions and write down the answers:
a. We want to display the faculty names, number of sections taught, total enrollment in their sections, and the average size of the classes.
b. This information comes from the Faculty (for faculty name) and Section tables.
c. There are two conditions: (1) we want data only for Spring 1998 (semester_code = 981); and (2) a join condition for the section and faculty tables.
d. We need aggregation to count the number of sections, to compute the total enrollment in sections for each faculty member, and to compute the average class size for each faculty member.
e. There are no conditions on the aggregated values.
f. There is no ordering condition.

SQL

    SELECT faculty.faculty_Name,
           COUNT(section.course_ID) AS "Number Sections",
           SUM(section.enrollment) AS "Total Enrollment",
           AVG(section.enrollment) AS "Average Enrollment"
    FROM faculty, section
    WHERE faculty.faculty_SSN = section.faculty_SSN
      AND section.semester_code = "981"
    GROUP BY faculty.faculty_Name;

Explanation: This query involves computing aggregated values for the number of sections taught (COUNT), total enrollment (SUM) and average class size (AVG) for each faculty member. This implies that we need to GROUP the data by faculty member. It is easy to compute aggregated values by using group operators such as COUNT, SUM, AVG, MAX (maximum) and MIN (minimum). ACCESS and most other database vendors support other group operators that are not ANSI standard. These include STDEV (standard deviation of the sample), STDEVP (standard deviation of the population), VAR (variance of the sample), VARP (variance of the population), FIRST and LAST. You MUST use GROUP BY to aggregate unless you are aggregating over the whole table as a single group. We can provide a new name for the aggregated result (or any attribute, for that matter) using AS. For instance, COUNT(..) is renamed as "Number Sections".

QBE
Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Choose the tables Section and Faculty by double clicking on them.
Step 3: Close the window.
Step 4: Double click on Faculty_name, Course_ID, Semester_Code and Enrollment.
Step 5: Remove the check from Show for Semester_Code and specify Criteria = "981".
Step 6: Click on the Summation icon. This will open up a new row, Total, in the grid.
Step 7: In the Total row for Course_ID, select Count from the drop-down menu.
Step 8: In the Total row for Enrollment, select Sum from the drop-down menu.
Step 9: Double-click on Enrollment in the Section table again and select Avg in the Total row.
Step 10: Click on the ! mark to run the query.

You will see the results as follows:

    Faculty_Name   CountOfCourse_ID   SumOfEnrollment   AvgOfEnrollment
    Barua          2                  95                47.5
    Konana         2                  80                40
    Ruefli         2                  78                39
    Unknown        2                  48                24

The column headings automatically reflect the COUNT, SUM and AVG. If you want to change a heading, do the following:

Step 11: Type Number of Sections: (with the colon) before Course_ID.
Step 12: Similarly, type Total Enrollment: and Average Enrollment: for the total and the average, respectively. Note the colon here.

When you run the query now, the result will look like:

    Faculty_Name   Number of Sections   Total Enrollment   Average Enrollment
    Barua          2                    95                 47.5
    Konana         2                    80                 40
    Ruefli         2                    78                 39
    Unknown        2                    48                 24

(If you cannot see the column headers completely, drag the column lines to the right.)

Query 7
List the total credit hours for each course for all the years. (Display the course ID, semester, total enrollment and total credit hours, i.e., enrollment times the credit hours.)

One of 1-6-Step: There are no sub-queries.

Six of 1-6-Step: Ask yourself the questions and write down the answers:
a. We want to display the course ID, semester, total enrollment in each course, and the total credit hours for each course.
b. This information comes from the Course (for credit hours) and Section tables.
c. There is only one condition: the join condition for the section and course tables.
d. We need aggregation to compute the total enrollment for each course and semester, and the total credit hours, which is the sum of enrollment * credit hours. Therefore, we need to GROUP BY course ID and then by semester code, so that each course and semester combination forms one group.
e. There are no conditions on the aggregated values.
f. There is no ordering condition.

SQL

    SELECT course.course_ID, section.semester_code,
           SUM(Enrollment) AS "Total Enrollment",
           SUM(Credit_hours * Enrollment) AS "Total Credit hours"
    FROM course, section
    WHERE course.course_ID = section.course_ID
    GROUP BY course.course_ID, section.semester_code;

Explanation: The expression SUM(credit_hours * enrollment) gives the total credit hours for each course and semester combination. The GROUP BY clause lists the attributes whose combinations define the groups.

Query Results:

    Course_ID    Semester_code   "Total Enrollment"   "Total Credit hours"
    MIS373.1     981             80                   240
    MIS373.1     991             45                   135
    MIS373.2     981             95                   285
    MIS373.2     991             40                   120
    MIS380N.1    984             270                  810
    MIS382N.2    971             65                   195
    MIS382N.2    991             103                  309
    MIS382N.2    992             17                   51
    MIS382N.3    981             78                   234
    MIS382N.3    991             67                   201
    MIS382N.4    971             25                   75
    MIS382N.4    981             48                   144
    MIS382N.4    991             50                   150
    MIS382N.4    994             18                   54

QBE
Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Select the tables Section and Course into the workspace and close the Show Table window.
Step 3: Double click on Course_ID, semester_code and Enrollment.
Step 4: Click on the summation icon to see the Total row.
Step 5: Click on a new column in the Field row. Next, click your right mouse button and you will see a menu pop up.
Step 6: Choose the Build option.

After Step 6 you will see the Expression Builder. The Expression Builder can be used to build complex formulas for aggregation and other computation.

Step 7: Double click on the + sign for Tables. This will list all the tables.
Step 8: Double click on the Course table. This will display all its attributes.
Step 9: Double click on Credit_hours. It will appear in the expression box.
Step 10: Click on *, which is simply multiplication.
Step 11: Double click on the Section table and then on the attribute Enrollment.
Step 12: Press the OK button. When you press OK, the formula will appear in the new column of the query grid.
Step 13: Since we need to SUM the product of credit_hours and enrollment, choose Sum in the Total row.
Step 14: Press the ! icon and run your query!

The Expression Builder lets you create complex computations. Please take a few minutes to check out the various options in the Expression Builder (you don't have to save anything :-)).

Query 8
List all the sections for Spring 1999 (semester_code = 991), with the faculty name, for which the enrollment is greater than the average number of students of all classes in Spring 1999.

This is a somewhat more complex query. Please read it carefully.

One of 1-6-Step: It may appear to you that there are two pieces to this query. You are absolutely correct! We need to first find the average number of students per class for Spring 1999 and then use this information to pull all sections with an enrollment greater than that average. For each query you apply the 6-step cognitive model. The only question is how to connect the two queries. This is an art and will come with practice, so do not expect miracles in the learning process :-).

We will apply the 6-step cognitive model to the inner query first:
a. We want to display the average enrollment of all sections.
b. This information comes from the Section table.
c. There is only one condition: the data are only for Spring 1999 (semester_code = 991).
d. We need aggregation to compute the average. However, since we are not grouping by any attribute, we don't need a GROUP BY clause (in the SELECT clause there is no attribute to group by).
e. There are no conditions on the aggregated values.
f. There is no ordering condition.

SQL
There are a number of ways to execute this query.
Two different ways are shown below.

We can use a nested query approach, where the result of the inner query (that is, finding the average enrollment) is fed into the outer query. Let's get the inner query in SQL form.

    SELECT AVG(section.enrollment)
    FROM section
    WHERE section.semester_code = "991";

The above inner query is straightforward. Now we need to pass the result of that query into the outer query. Let's apply the 6-step cognitive model again:
a. We want to display the faculty names, Course_ID, Section_num and enrollment.
b. This information comes from the Faculty (for faculty name) and Section tables.
c. There are THREE conditions: (1) we want data only for Spring 1999 (semester_code = 991); (2) a join condition for the section and faculty tables; and (3) the enrollment must be greater than the average enrollment for Spring 1999 (i.e., the result of the inner query).
d. There is no aggregation in the outer query.
e. There are no conditions on the aggregated values.
f. There is no ordering condition.

    SELECT Faculty.faculty_name, section.course_ID, section.section_num,
           section.enrollment
    FROM Faculty, Section
    WHERE Faculty.faculty_SSN = Section.faculty_SSN
      AND section.semester_code = "991"
      AND section.enrollment >
          (SELECT AVG(section.enrollment)
           FROM section
           WHERE section.semester_code = "991");

Explanation: We still applied the 6-step cognitive model even though we have a nested query (sub-query). We first solved the inner query and then included it in the outer query. The inner query (the parenthesized SELECT) is used in the WHERE clause as part of the condition.

We can solve the above problem in a slightly different way. We can use the inner query as a sub-query within the FROM clause and use the result as another table with only ONE record and one attribute. The result can be renamed dynamically by giving it a name.
    SELECT Faculty.faculty_name, section.course_ID, section.section_num,
           section.enrollment
    FROM Faculty, Section,
         (SELECT AVG(section.enrollment) AS avgenrollment
          FROM section
          WHERE section.semester_code = "991") A
    WHERE Faculty.faculty_SSN = Section.faculty_SSN
      AND section.semester_code = "991"
      AND section.enrollment > A.avgenrollment;

Explanation: The inner query (the parenthesized SELECT) is used in the FROM clause. The result of this inner query is renamed as A (you can give it any name you want). This renaming is required in order to refer to the results. The AVG(section.enrollment) is also renamed, as avgenrollment. "A" is a table just like any other table, and any attribute in this table can be addressed as A.attributename. We do that in the above example (section.enrollment > A.avgenrollment).

Query Result:

    faculty_name   course_ID    section_num   enrollment
    Konana         MIS382N.2    1             58
    Konana         MIS373.1     1             45
    Barua          MIS373.2     1             40
    Barua          MIS382N.2    2             45

If you want the average enrollment to be displayed as well, include A.avgenrollment in the SELECT clause. This is left as an exercise.

Moral: any complex query can be broken down into smaller queries that are worked on separately. You need to think of a strategy to link all of them together!

QBE
It is not very easy to build a sub-query with QBE unless you choose to execute the query in the form of two or more queries. The best way to do this is to use the inner query's SQL within QBE. As discussed early on, one cannot easily create complex queries with QBE alone. You may be better off learning SQL and using it along with QBE. In this example, we will include the inner (nested) query within QBE.

Step 1: Click on Queries Objects -> Double click on Create query in Design View. You should see the Show Table window.
Step 2: Select the tables Section and Faculty into the workspace and close the Show Table window.
Step 3: Double click on Faculty_name, course_ID, section_Num, Semester_code and Enrollment.
Step 4: Remove the check from Show for Semester_Code and specify Criteria = "991".
Step 5: Click on the Criteria for Enrollment and press the right mouse button. It opens up the option menu. Here select Zoom so that it opens up a window to type your inner SQL query.
Step 6: In the Zoom window, type in the criteria that enrollment > average enrollment, using the inner SQL query.
Step 7: Press OK.
Step 8: Run the query by clicking on the ! icon. Done!

Alternative Query Strategy for QBE

Step 1: Create a dummy query that computes the average enrollment of all classes in Spring 1999 and save the query (the example below calls it query8). You will need to include Total and find AVG of Enrollment (routine steps are ignored from now on). Specify criteria = "991" and remove the check for Semester_code.
Step 2: Use this dummy query8 for your main query. Choose Faculty, Section and Query8. To pick query8, click on Queries in the Show Table window; you should see query8 that you created earlier. Choose the attributes shown in the picture.

Now, click on the Criteria cell in the Enrollment column and press your right mouse button. Then choose the expression builder that we used in the previous query.

Double click on Queries. This will list all the queries. Double click on Query8 and it will display AvgOfEnrollment. Click on > and then double click on AvgOfEnrollment and press OK. Now, run the query by clicking on !. You are DONE!!

This alternative is equivalent to SQL's second alternative of including the inner query in the FROM clause.

Query 9

List all the sections of each faculty member for which the enrollment is greater than the average enrollment of the classes taught by that particular faculty member (don't worry about the semester).
One of 1-6-Step: This is the hardest problem so far (although it appears similar to the previous one). The difference between the previous question and this question is that we want to pull out only those sections for faculty member X for which the enrollment is greater than the average enrollment of classes taught by X. Therefore, there are two queries: an inner query and an outer query. The inner query computes the average enrollment for each faculty member, and the outer query extracts all the sections with an enrollment greater than the average enrollment for that faculty member. The difficulty is that we need to match the faculty member between the inner query and the outer query.

Let's use the 6-step cognitive model for the inner query:

a. We want to display the faculty SSN and average enrollment.
b. This information comes from the Section table.
c. There is no condition.
d. We need aggregation to compute the average enrollment by each faculty SSN.
e. There are no conditions on the aggregated values.
f. There is no ordering condition.

SQL

    SELECT section.faculty_SSN, AVG(section.enrollment)
    FROM section
    GROUP BY section.faculty_SSN;

Now, we can take the result of this query to execute the outer query. Let's apply the 6-step cognitive model again:

a. We want to display the faculty name, course ID, section number, semester code and the enrollment.
b. This information comes from the Section table, the Faculty table and the result of the first query.
c. There are three conditions: (1) the join condition for the Section and Faculty tables; (2) the join condition for the query result of the first part and the Section table; and (3) enrollment must be greater than the average enrollment for that faculty member.
d. There is no aggregation.
e. There are no conditions on the aggregated values.
f. There is no ordering condition.
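Before writing the outer query, it can help to run the inner query on its own and look at the one-row-per-faculty result that the outer query will join against. The sketch below does this with Python's built-in sqlite3 module rather than ACCESS. The table layout follows the SECTION attributes used in this primer, but the rows and SSN values are made-up sample data, and the ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

# In-memory database with a simplified SECTION table (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " course_ID TEXT, section_num INTEGER, semester_code TEXT,"
    " faculty_SSN TEXT, enrollment INTEGER)"
)
sample_rows = [
    ("MIS382N.2", 1, "991", "111-11-1111", 58),
    ("MIS373.1", 1, "991", "111-11-1111", 42),
    ("MIS373.2", 1, "981", "222-22-2222", 60),
    ("MIS382N.2", 2, "991", "222-22-2222", 30),
]
conn.executemany("INSERT INTO section VALUES (?, ?, ?, ?, ?)", sample_rows)

# Inner query: one row per faculty SSN with that faculty member's average.
inner_sql = (
    "SELECT faculty_SSN, AVG(enrollment) AS avgenroll "
    "FROM section GROUP BY faculty_SSN ORDER BY faculty_SSN"
)
avg_by_faculty = conn.execute(inner_sql).fetchall()
for ssn, avg in avg_by_faculty:
    print(ssn, avg)
```

Each faculty SSN appears exactly once in this result, which is what allows the outer query to match it to the Section table on faculty_SSN.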
    SELECT faculty.faculty_name, section.course_ID, section.section_num,
           section.semester_code, section.enrollment, A.avgenroll
    FROM section, faculty,
         (SELECT section.faculty_SSN, AVG(section.enrollment) AS Avgenroll
          FROM section
          GROUP BY section.faculty_SSN) A
    WHERE section.faculty_SSN = faculty.faculty_SSN
      AND A.faculty_SSN = section.faculty_SSN
      AND section.enrollment > A.avgenroll;

(The A after the closing parenthesis renames the result of the subquery to A.)

Query Results:

    faculty_name   course_ID   section_num   semester_code   enrollment   avgenroll
    Konana         MIS382N.2   1             991             58           47.5
    Konana         MIS382N.2   1             971             65           47.5
    Konana         MIS380N.1   1             984             60           47.5
    Konana         MIS380N.1   2             984             55           47.5
    Ruefli         MIS380N.1   3             984             60           41
    Leibrock       MIS380N.1   5             984             50           47.5
    Ruefli         MIS382N.3   1             981             42           41
    Unknown        MIS382N.4   1             991             22           20.1428571428571
    Unknown        MIS382N.4   2             991             28           20.1428571428571
    Unknown        MIS382N.4   1             981             31           20.1428571428571
    Barua          MIS373.2    1             981             60           45

Alternative SQL: This query can be executed with a nested query as below:

    SELECT faculty.faculty_name, B.course_ID, B.section_num, B.semester_code, B.enrollment
    FROM section B, faculty
    WHERE B.faculty_SSN = faculty.faculty_SSN
      AND B.enrollment > (SELECT AVG(C.enrollment)
                          FROM section C
                          WHERE C.faculty_SSN = B.faculty_SSN);

The inner query feeds the average enrollment to the outer query. The inner query takes the faculty SSN from the outer query and matches it in the inner query to compute the average. In order to do that, we rename the Section table in the outer query to B and the one in the inner query to C. Note the WHERE clause in the inner query: it matches the faculty SSN in the outer query to the faculty SSN in the inner query.

QBE

One needs to really plan a strategy on how to do this in QBE.
A simple way is to create two queries: the first one to compute average class sizes for each faculty member, and the second to use the result of the first query along with other data and conditions.

First query:
- Use Total to compute the average.
- Rename the average as avgenrollment.
- Save the query as Query9 (or any other name that you wish).

Second query:
- Use Tables to select the Section and Faculty tables.
- Click on Queries to select query9 created earlier.
- Select the attributes as shown by double clicking on them.
- Click on the Criteria cell in the Enrollment column, click on the right mouse button and choose Build.
- First, click on >.
- Second, double click on Queries and double click on query9.
- Third, click on avgenrollment and press OK.
- Then press the ! icon to run the query.

Query 10

Let's change Query 6 slightly. Display the faculty names who have taught at least one section with a total enrollment less than 50 in Spring 1998 (display the number of sections taught and the total enrollment).

The 1-6-Step details are left as an exercise. The SQL and QBE solutions are given below. In this example, there are conditions on the aggregated information: COUNT of sections and SUM of enrollment.

SQL

    SELECT faculty.faculty_name, COUNT(section.course_ID) AS NumOfSections,
           SUM(section.enrollment) AS TotalEnrollment
    FROM faculty, section
    WHERE faculty.faculty_SSN = section.faculty_SSN
      AND section.semester_code = "981"
    GROUP BY faculty.faculty_Name
    HAVING COUNT(section.course_ID) > 1
       AND SUM(section.enrollment) < 50;

Explanation: Since there are conditions on the aggregated values (COUNT and SUM), these have to be specified in the HAVING clause. A common mistake is to specify such conditions in the WHERE clause. The WHERE clause should not be used for specifying such conditions, since the aggregation (the group operators) must be performed first.
This aggregation works only after specifying the GROUP BY clause. Therefore, any conditions on group operators must come after GROUP BY, in the HAVING clause. It is also imperative that HAVING come after GROUP BY.

QBE (explanation not included since the figure is self-explanatory)

You can choose to leave it as Group By or use the Where option.

INSERT

The INSERT statement is used to insert new records into a table (we will ignore other possibilities). In ACCESS, insert is fairly straightforward and is therefore ignored. The basic structure of the INSERT statement for other databases is:

    INSERT INTO table-name (attribute1, attribute2, ...)
    VALUES (attribute1value, attribute2value, ...);

For example, if we want to insert a new course into the COURSE table:

    INSERT INTO course (course_ID, course_name, credit_hours)
    VALUES ("MIS325", "Introduction to Databases", 3);

The attribute list and the values must match. If one is inserting values corresponding to all attributes of the table, then the attribute list can be omitted. That is, in the above example, we can omit the list of attributes since the order and the number of attributes match the COURSE table structure and the values correspond to that order.

    INSERT INTO course
    VALUES ("MIS325", "Introduction to Databases", 3);

Sometimes we may need to enter only part of the attribute list. In that case, we can specify the attribute names and the values corresponding to them. For example, if we do not know the enrollment figure in the SECTION table, we can leave it out by specifying as follows:

    INSERT INTO section (course_ID, section_num, semester_code, faculty_SSN)
    VALUES ("MIS325", 1, "994", "666-66-6666");

We can also INSERT a query result (one or more records) into a table. For example, assume there is a table called BACKUP_SECTION and you want to insert all the records for Spring 1999.
    INSERT INTO backup_section
    SELECT * FROM section
    WHERE semester_code = "991";

If the backup is required only for Course_ID, Section_Num, Semester_code, and Enrollment (i.e., faculty_SSN is ignored), then the statement will be:

    INSERT INTO backup_section (course_ID, section_num, semester_code, enrollment)
    SELECT course_ID, section_num, semester_code, enrollment
    FROM section
    WHERE semester_code = "991";

UPDATE

The UPDATE statement is used to make changes to the data in the database. This assumes that you have the privilege to update the tables. The basic construct of UPDATE is:

    UPDATE tablename
    SET attribute = expression (or a result of a query)
    WHERE condition;

Example: Update the credit hours of all MIS graduate level courses to 6 (assume all MIS graduate courses have a number MIS380 and above).

    UPDATE course
    SET credit_hours = 6
    WHERE course_ID >= "MIS380";

Below are the instructions for UPDATE using ACCESS SQL and QBE:

SQL

Step 1: Click on Queries Objects → Double click on Create query in Design View. You should see the Show Table window.
Step 2: Press Close → Go to Query on the menu bar → choose Update Query.
[Figure: screenshots labeled Step 2 and Step 3]
Step 4: Type in the SQL statement in the workspace and click on ! to run the query. You will be warned about the number of records being updated.

QBE

Step 1: Click on Queries Objects → Double click on Create query in Design View. You should see the Show Table window.
Step 2: Add the Course table into the workspace, and follow the steps shown below.
Step 3: Choose Update Query from the Query menu.
Step 4: Select Credit_hours to update.
Step 5: Specify course_ID >= "MIS380".
Step 6: Press ! to run.
Step 7: You will be warned that some records will be updated. Click OK. Update done!

Multiple attributes of a table can be updated simultaneously with one UPDATE query.
The structure is similar:

    UPDATE tablename
    SET Attribute1 = expression or query,
        Attribute2 = expression or query,
        ...
    WHERE condition;

A comma separates one attribute from another.

DELETE

As the name suggests, the DELETE statement is used to delete records from tables, with or without conditions. It is assumed that the user has the privileges to delete items from the database. The basic structure of DELETE is:

    DELETE FROM tablename (or a result of a query)
    WHERE condition;

Example: Delete all the section records taught by the faculty member "Unknown" during Spring 1999.

SQL

    DELETE FROM section
    WHERE semester_code = "991"
      AND faculty_SSN = (SELECT faculty_SSN
                         FROM faculty
                         WHERE faculty_name = "Unknown");

Step 1: Click on Queries Objects → Double click on Create query in Design View. You should see the Show Table window.
Step 2: Press Close → Go to Query on the menu bar → choose Delete Query.
[Figure: Query menu with Delete Query selected, and SQL View selected]

Type in the SQL statement and run the query. You will see a warning that some records will be deleted.

The QBE part is left as an exercise.
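The DELETE example can be tried safely on a throwaway database before running it against real data. What follows is a minimal sketch, assuming Python's built-in sqlite3 module in place of ACCESS and simplified FACULTY and SECTION tables. The SSN values, enrollments and rows are made-up sample data, and SQLite expects single-quoted string literals where this primer uses double quotes.

```python
import sqlite3

# Throwaway in-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (faculty_SSN TEXT PRIMARY KEY, faculty_name TEXT);
    CREATE TABLE section (course_ID TEXT, section_num INTEGER,
                          semester_code TEXT, faculty_SSN TEXT,
                          enrollment INTEGER);
    INSERT INTO faculty VALUES ('000-00-0000', 'Unknown');
    INSERT INTO faculty VALUES ('111-11-1111', 'Konana');
    INSERT INTO section VALUES ('MIS382N.4', 1, '991', '000-00-0000', 22);
    INSERT INTO section VALUES ('MIS382N.4', 1, '981', '000-00-0000', 31);
    INSERT INTO section VALUES ('MIS382N.2', 1, '991', '111-11-1111', 58);
""")

# Same shape as the primer's DELETE: the inner query looks up the SSN,
# and the outer DELETE removes only that faculty member's Spring 1999 rows.
delete_sql = """
    DELETE FROM section
    WHERE semester_code = '991'
      AND faculty_SSN = (SELECT faculty_SSN FROM faculty
                         WHERE faculty_name = 'Unknown')
"""
cursor = conn.execute(delete_sql)
deleted = cursor.rowcount  # number of rows removed by the DELETE

remaining = conn.execute(
    "SELECT course_ID, semester_code, faculty_SSN FROM section "
    "ORDER BY semester_code"
).fetchall()
print(deleted, remaining)
```

Only the row that satisfies both conditions is removed; the 981 section for the same faculty member and the 991 section for another faculty member remain. If more than one faculty record could share the name, using IN instead of = in the outer condition is the usual way to keep the comparison valid.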