Spreadsheet-Based Decision Support Systems Chapter 21: Working with Large Data Re-Visited Prof. Name Position University Name name@email.com (123) 456-7890 Overview 21.1 Introduction 21.2 Creating Pivot Tables with VBA 21.3 Using External Data 21.4 Exporting Data 21.5 Applications 21.6 Summary 2 Introduction Creating pivot tables using VBA Importing data from text files or webpages using VBA Importing data from databases Creating basic queries using the SQL programming language Exporting data using VBA An application which allows a user to query a database from Excel 3 Creating Pivot Tables with VBA In Chapter 6, we learned how to create pivot tables and pivot charts using the Pivot Table Wizard in the Excel Data menu. We will now learn the properties and methods in VBA that will allow us to perform these tasks dynamically. 4 Creating Pivot Tables with VBA (cont) The main pivot table object is PivotTables. Pivot tables are used similar to the Chart and Shape objects in that we must use the ActiveSheet object before specifying a PivotTables object. ActiveSheet.PivotTables(“PivotTable1”) 5 Creating Pivot Tables with VBA (cont) From this PivotTables object, we can further specify PivotFields, and from PivotFields we can specify PivotItems. ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Days to Arrive”) ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Days to Arrive”).PivotItems(“4”) 6 Creating Pivot Tables with VBA (cont) The ActiveSheet object’s main method is the PivotTableWizard method. The PivotTableWizard method has several arguments: – SourceType argument specifies if our pivot table data is from the spreadsheet (xlDatabase), an external source (xlExternal), multiple ranges (xlConsolidation), or another pivot table (xlPivotTable). – SourceData argument is used to specify the specific data from this source type. – TableDestination argument specifies a range where the table should be placed. – TableName argument can be used to give a name to this table, such as “PivotTable1”. ActiveSheet.PivotTableWizard SourceType:=xlDatabase, SourceData:=Worksheets(“Data-Shipping”).Range(“B3:E27”), TableDestination:=Worksheets(“Pivot-Shipping”).Range(“A3”), TableName:=“PivotTable1” 7 Creating Pivot Tables with VBA (cont) The AddFields method is used to specify the pivot table layout. – This method is used to set row fields, column fields, and page fields of the pivot table. – Each field is an optional argument of the method: RowFields, ColumnFields, and PageFields. – If there is more than one field for a particular argument, then use the Array function to specify them. – The AddToTable argument takes True/False values to determine if these new fields should be appended to the current table or replace existing fields, respectively. ActiveSheet.PivotTables(“PivotTable1”).AddFields RowFields:= Array(“Max Weight, lbs”, “Days to Arrive”), ColumnFields:=”Shipping Company” 8 Creating Pivot Tables with VBA (cont) For the PivotTables object, there are several other properties and methods to discuss. The RowGrand and ColumnGrand properties specify whether or not grand totals should be calculated for row or column fields, respectively. – The possible values for these properties are True or False. ActiveSheet.PivotTables(“PivotTable1”).RowGrand = True 9 Creating Pivot Tables with VBA (cont) Another property that may be used often is the Format property. – This can be used to apply a pre-defined report or table format to your pivot table. – The values for this property can be an xlReport value or xlTable value. ActiveSheet.PivotTables(“PivotTable1”).Format xlReport10 ActiveSheet.PivotTables(“PivotTable1”).Format xlTable2 10 Creating Pivot Tables with VBA (cont) A useful method of the PivotTables object is the RefreshData method. – This method is equivalent to pressing the exclamation point icon on the Pivot Table Toolbar. – If any changes are made to the data from which the pivot table was created, refreshing the data will update the pivot table data. ActiveSheet.PivotTables(“PivotTable1”).RefreshData 11 Creating Pivot Tables with VBA (cont) One last useful method of the PivotTables object is GetPivotData. – This method has the same functionality as the GETPIVOTDATA function defined in Chapter 6. – For a specific item in a given row or column field, this method will find the corresponding value from the data field. ActiveSheet.PivotTables(“PivotTable1”).GetPivotData(“DataFieldName”, “RoworColumnFieldName”, “ItemName”) 12 Creating Pivot Tables with VBA (cont) There are some other useful properties of the PivotFields object. The Orientation property sets the data fields of a pivot table. – This property takes the values xlDataField, xlRowField, xlColumnField, and xlPageField or the respective fields. ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Cost”).Orientation = xlDataField There is another possible value for the Orientation property which is xlHidden. – This will hide all of the values of the specified field. – This property can be useful not only to set the data field, but also to change any previously set fields to be different field types or to be removed from the pivot table all together. 13 Creating Pivot Tables with VBA (cont) There are two properties which can be used to make calculations (sum, average, min, max, etc). – These are the Function property and SubTotals property. The Function property is used for data fields. – To use this property simply specify the type of calculation you want to be made on the named field. ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Cost”).Function = xlMin 14 Creating Pivot Tables with VBA (cont) The SubTotals property is used for non-data fields. – With this property you must specify an index number, or numbers, which represent the type of sub totals you want to show for the given field. – These index values are 2 = sum 3 = count 4 = average 5 = max 6 = min others ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Max Weight, lbs”).SubTotals(6) 15 Creating Pivot Tables with VBA (cont) There is one main property that is used often with the PivotItems object which is the Visible property. – Using this property is similar to clicking on the drop-down list of values for a field in a pivot table and checking or un-checking the values which you want to be displayed. – The values for this property are True and False, much like we have seen in uses of the Visible property with other objects. ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Days To Arrive”).PivotItems(“1”).Visible = True ActiveSheet.PivotTables(“PivotTable1”).PivotFields(“Days To Arrive”).PivotItems(“8”).Visible = True 16 Creating Pivot Tables with VBA (cont) One last useful property is the ShowPivotTableFieldList property which is used with a Workbook object. – This property has True or False values which can be set to show or hide the pivot table field list of the pivot tables in the workbook. ActiveWorkbook.ShowPivotTableFieldList = True 17 Figure 21.3 One complete pivot table code example 18 Using External Data Importing Data – Text Files and Webpages – Databases Performing Queries with SQL 19 Importing Data We will first describe how to import data from text files and web addresses in VBA. We will use an object called QueryTables. This object is referred to using a Worksheet object. ActiveSheet.QueryTables 20 Importing Data (cont) To import data, we will simply add a QueryTable object using the Add method. The Add method has two arguments: – Connection requires the type of data being imported and the actual location of the data. – Destination argument is the location on the spreadsheet where you would like to place the imported data. 21 Importing Data (cont) The Connection argument enables us to clarify if we are importing data from a text file or a webpage. If we are importing data from a text file, we would define the Connection argument as follows. Connection:= “TEXT; path” Here, the path is the actual location of the text file on your computer given by some string value. 22 Importing Data (cont) The path value can also be given dynamically by prompting the user for the path value and storing the path name in a string variable. This path value would have to be concatenated with the TEXT specification. Dim UserPath As String UserPath = InputBox(“Enter path of text file.”) Connection:= “TEXT; “ & UserPath & “ 23 Importing Data (cont) In creating dynamic imports, you may prefer to let the user browse for a file rather than enter the path. To display an explorer browse window, we use the GetOpenFilename method associated with the Application object. – This method presents the user with a browse window and allows them to select a file. – The name of the file is returned as a string value. Application.GetOpenFilename(FileFilter, FilterIndex, Title, ButtonText, MultiSelect) 24 Importing Data (cont) The FileFilter argument gives you the option of limiting the type of file the user can select. – “Text Files (*.txt), *.txt” The Title argument allows you to give a title to the browse window that will appear. The MultiSelect has the values True or False to determine if a user can select more than one or only one value, respectively. Dim UserPath As String UserPath = Application.GetOpenFilename("Text Files (*.txt), *.txt", , "Select a file to import.", , False) Connection:= “TEXT; “ & UserPath & “ 25 Importing Data (cont) If we are importing data from a webpage, we would define the Connection argument as follows: Connection:= “URL; actual URL” Here, the actual URL is the URL of the website. Again, this value could be taken from the user dynamically. 26 Importing Data (cont) The Destination argument value is simply a range. Columns and rows will be created for the data appropriately. The output range for the entire table of data will begin in the Destination range. Destination:=Range(“A1”) 27 Importing Text Code The necessary properties for importing a text file basically describe how the text is organized in the file so that the values are imported correctly. With ActiveSheet.QueryTables.Add (Connection:=“TEXT;C:\MyDocuments\textfile.txt", Destination:=Range("A1")) .Name = "ImportTextFile" .FieldNames = True .RowNumbers = False .TextFileStartRow = 1 .TextFileParseType = xlDelimited .TextFileTextQualifier = xlTextQualifierDoubleQuote .TextFileCommaDelimiter = True .Refresh BackgroundQuery:=False End With 28 Figures 21.4, 21.5, and 21.6 An example text file imported to Excel using VBA 29 Importing Webpage Code To import a webpage, there are a few new properties needed. With ActiveSheet.QueryTables.Add (Connection:= "URL;http://www.webpage.com", Destination:=Range("C1")) .Name = “WebpageQuery1" .FieldNames = True .RowNumbers = False .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "1" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .Refresh BackgroundQuery:=False End With 30 Figures 21.7, 21.8, and 21.9 An example webpage imported to Excel using VBA 31 Importing Databases There are two main objects used to import data – Connection – Recordset The Connection object establishes the communication to a particular database. There are two main methods used with this object – Open method uses a ConnectionString argument to define the path to the database. – Close method does not have any arguments. A Connection should be opened and closed every time a query or import is made from the database. 32 Importing Databases (cont) To define a Connection object variable, we use a data type called ADODB.Connection. We declare the variable as an ADODB.Connection data type and then use the Set statement to define the connection value of our variable. We define our connections to be new connections using the New statement. Dim cntMyConnection As ADODB.Connection Set cntMyConnection = New ADODB.Connection 33 Importing Databases (cont) Now, we need to define the data provider, or database type, and data source, or filename, of this connection. – These values will be given to the ConnectionString argument of the Open method. – The data provider we will usually use can be defined as: “Microsoft.Jet.OLEDB.4.0”. – The data source should be the filename of the database plus the path of the file. Dim dbMyDatabase As String dbMyDatabase = ThisWorkbook.Path & “\MyDatabase.mdb” 34 Importing Databases (cont) Now we have the data provider and data source; we can either assign these values directly to the ConnectionString argument or we can use a String variable. The ConnectionString argument value has two sub arguments named Provider and Data Source for the data provider and data source, respectively. Dim CnctSource As String CnctSource = “Provider=Microsoft.Jet.OLEDB.4.0; Data Source=” & dbMyDatabase & “;” 35 Importing Databases (cont) The complete code to open a connection is: Dim cntMyConnection As ADODB.Connection, dbMyDatabase As String, CnctSource Set cntMyConnection = New ADODB.Connection dbMyDatabase = ThisWorkbook.Path & “\MyDatabase.mdb” CnctSource = “Provider=Microsoft.Jet.OLEDB.4.0; Data Source=” & dbMyDatabase & “;” cntMyConnection.Open ConnectionString:=CnctSource 36 Importing Databases (cont) After closing a Connection, we clear the Connection value by setting it to Nothing. The complete code to close a connection is: cntMyConnection.Close Set cntMyConnection = Nothing 37 Importing Databases (cont) The Recordset object is used to define a particular selection of data from the database that we are importing or manipulating. – We will again use a variable to represent this object throughout the code; to define Recordset object variables, we use the ADODB.Recordset data type. – We again use the Set statement to assign the value to this variable as a New Recordset. Dim rstFirstRecordset As ADODB.Recordset Set rstFirstRecordset = New ADODB.Recordset 38 Importing Databases (cont) The arguments for the Open method of the Recordset object are – Source – ActiveConnection The Source argument defines the data that should be imported. – The Source value is a string which contains some SQL commands. – Similar to the data source value and ConnectionString value discussed above, we can use a String variable to define these SQL commands to use as the value of the Source argument Dim Src As String Src = “SELECT * FROM tblTable1” 39 Importing Databases (cont) The ActiveConnection argument value is the name of the open Connection object you have previously defined. rstFirstRecordset.Open Source:=Src; ActiveConnection:=cntMyConnection To copy this data to the Excel spreadsheet, we use the Range object and a new method: CopyFromRecordset. – This method only needs to be followed by the name of the Recordset variable you have just opened. Range(“A1”).CopyFromRecordset rstFirstRecordset 40 Importing Databases (cont) In each procedure where we are importing or manipulating data from a database, we type the following. Dim rstFirstRecordset As ADODB.Recordset, Src As String Set rstFirstRecordset = New ADODB.Recordset Src = “SELECT * FROM tblTable1” rstFirstRecordset.Open Source:=Src; ActiveConnection:=cntMyConnection Range(“A1”).CopyFromRecordset rstFirstRecordset 41 Importing Databases (cont) When we are done using this Recordset, we should clear its values; we do this using the Set statement with the value Nothing. Set rstFirstRecordset = Nothing 42 Importing Databases (cont) In applications where you plan to make multiple queries to a database, we recommend creating a function procedure which can be called for each query. Function QueryData(Src, OutputRange) dbUnivInfo = ThisWorkbook.Path & "\UniversityInformationSystem.mdb" Set cntStudConnection = New ADODB.Connection CnctSource = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & dbUnivInfo & ";“ cntStudConnection.Open ConnectionString:=CnctSource Set rstNewQuery = New ADODB.Recordset rstNewQuery.Open Source:=Src, ActiveConnection:=cntStudConnection Range(OutputRange).CopyFromRecordset rstNewQuery Set rstNewQuery = Nothing cntStudConnection.Close Set cntStudConnection = Nothing End Function 43 Performing Queries with SQL Structured Query Language (SQL) is the code used to perform queries, or filter the data which is imported. SQL commands are used to define the Source argument of the Open method with the Recordset object. You can define the Source to be all values in a particular database table or pre-defined query or you can create a query as the value of the Source argument. 44 SQL (cont) The basic structure of SQL commands is 1. A statement which specifies an action to perform; 2. A statement which specifies the location of the data on which to perform the action; 3. A statement which specifies the criteria the data must meet in order for the action to be performed. Some basic action statements are – SELECT – CREATE – INSERT 45 Figure 21.10 Consider a table from a University System database. This table, called tblStudents, contains student names, IDs, and GPAs. 46 SQL (cont) The SELECT statement selects a specific group of data items from a table or query in the database. The phrase appearing immediately after the SELECT statement is the name or names of the fields which should be selected. SELECT StudentName FROM tblStudents 47 SQL (cont) To select everything in a table, that is all fields, use the asterisks mark (*) after the SELECT statement. We must also specify the location of this field, that is the table or query title from the database. We do this using the FROM statement. 48 SQL (cont) We can also include a criteria filtering in the query. The most common criteria statement is WHERE. The WHERE statement can use sub statements such as – <, >, = for value evaluations. – BETWEEN, LIKE, AND, OR, and NOT for other comparisons. SELECT StudentName FROM tblStudents WHERE GPA > 3.5 49 SQL (cont) Other criteria statements include – GROUP BY – ORDER BY ORDER BY can be used with the WHERE statement to sort the selected data; this data can be sorted in ascending or descending order using the statements ASC or DESC respectively. SELECT StudentName, GPA FROM tblStudents WHERE GPA > 3.0 ORDER BY GPA DESC 50 SQL (cont) In a SELECT statement, we can also perform simple aggregate functions. Simply type the name of the function after the SELECT statement and list the field names which apply to the function statement in parenthesis. One common function statement is COUNT. – Using SELECT COUNT will return the number of items (matching any given criteria) instead of the items themselves. SELECT COUNT (StudentName) FROM tblStudents WHERE GPA > 3.5 51 SQL (cont) Other functions include – MIN – MAX – AVG SELECT AVG (GPA) FROM tblStudents 52 SQL (cont) In VBA, SQL statements always appear as a string; that is, they are enclosed by quotation marks. If your criteria checks for a particular string value, you must use single quotation marks to state that value. =”SELECT GPA FROM tblStudents WHERE StudentName = ‘O. Peterson” 53 SQL (cont) Now suppose instead of specifying our own criteria, we want the user to determine which name to search for. – We can use an Input Box and a variable, in this example called StudName, to prompt the user for this value. – Then we can include this variable in place of the criteria value in the SQL statement. =”SELECT SSN FROM tblStudents WHERE StudentName = ‘” & StudName &” ’” Note that we have to include the single quotation marks around the criteria value; therefore, we have concatenated the variable name followed by the ending single quotation mark. 54 SQL (cont) Now let us incorporate these SQL statements into our database query code. – As mentioned in the previous section, we will use a string variable to assign the value of the SQL commands. – We will then use this variable in the Source argument of the Open method of the Recordset object. Dim StudName As String StudName = InputBox(“Please enter name of student whose GPA you want.” Src = “SELECT GPA FROM tblStudents WHERE StudentName = ‘” & StudName & “’” rstFirstRecordset.Open Source:=Src; ActiveConnection:=cntMyConnection Range(“A1”).CopyFromRecordset rstFirstRecordset 55 Exporting Data We can also use SQL to export data. We can place data into a previously created Access database using the CREATE and INSERT SQL commands. The CREATE statement can be used to create a new table in the database. – The corresponding location statement for the CREATE command is TABLE. – The name of the new table is given after the TABLE statement. – The name of the table is followed by the name of the fields for the new table; these are listed in parenthesis with a description of the data type the field should hold. – You must also include a CONSTRAINT command to specify the primary key of the table. – You would give a name to this key, specify that it is the PRIMARY KEY, and then list the selected field. =”CREATE TABLE tblCourses (CourseName TEXT, CourseNumber NUMBER, FacultyAssigned TEXT) CONSTRAINT CourseID PRIMARY KEY (CourseNumber)” 56 Exporting (cont) Once you have created a table, you can use the INSERT statement to enter values for each field. The INSERT statement is always followed by the INTO location statement. – The name of the table into which you are entering values is listed after the INTO statement. – The field names for which you are entering values should then be listed in parenthesis; that is, you may not want to enter values for all fields. – Then the values are listed after a VALUES statement in the same order in which the corresponding fields were listed. =”INSERT INTO tblCourses (CourseName, CourseNumber, FacultyAssigned) VALUES (‘DSS’, 234, ‘J. Smith’)” 57 Exporting (cont) You can also use the UPDATE statement to change values in a previously created table. The UPDATE statement uses the SET location statement and the same criteria statements used with the SELECT command. =”UPDATE tblStudents SET GPA = 3.9 WHERE StudentName = ‘Y. Zaals’” 58 Applications Transcript Query – We will develop an application which performs dynamic database queries using a pre-developed Access database. 59 Description This database contains information on students, faculty, courses, sections, and grades; there are six tables and one query. In this application, we will allow the user to query the database to retrieve transcript data for a particular student. This transcript data will include every course the student has taken with the details of the course and section as well as the grade they earned. We will then evaluate all grades to calculate the selected student’s overall GPA. 60 Figure 21.11 The tables and queries from MS Access 61 Figure 21.12 The spreadsheet 62 Figure 21.13 The query function procedure 63 Figure 21.14 The initial procedures 64 Figure 21.15 The “tblStudent” table from Access. 65 Figures 21.16 and 21.18 The student choice form 66 Figure 21.17 The form code 67 Figure 21.19 The “qryCourseID” query from Access 68 Figure 21.20 The “tblCourse” table from Access 69 Figure 21.21 The “tblSection” table from Access 70 Figure 21.22 The transcript query code 71 Application Conclusion The application is now complete. Transcript queries can be made for any student selected from the form. 72 Summary The main pivot table object is PivotTables. We must use the ActiveSheet object before specifying a PivotTables object. To create a pivot chart in VBA simply use the Chart object. There are two main systems used in VBA for communicating with external data sources: DAO and ADO. (We use ADO in this chapter.) There are two main ADO objects used to import data: Connection and Recordset. Structured Query Language (SQL) is the code used to perform queries or filter the data which is imported. Variables can be used to make queries dynamic with Input Boxes, User Forms, or by simply taking values the user has entered in a spreadsheet. 73 Additional Links (place links here) 74