Advanced Excel / Excel-based analysis and modelling
LiSiz 2022 @Stats UZ

Introduction

Excel is commonly known for creating and manipulating tables of data and for presenting data in charts. Excel can do much more, including:
- Automating the manipulation of tables by modifying existing routines and creating new applications
- Data analysis, modelling and simulation
Advanced Excel skills are widely applicable in industry, finance and engineering.

Some basics (refresher)

- enter data
- modify entered data
- format cell entries
- format cell size
- merge cells
- add comments to cell(s)
- formulae
- autofill function
- cell references (relative, mixed and absolute)
  By default Excel treats a reference as relative, which means that the cell reference changes when you copy the cell. A column or a row can be “fixed” by adding a “$” symbol. There are four possibilities:
  = A3      changeable column and row (relative reference)
  = A$3     changeable column, fixed row (mixed reference)
  = $A3     fixed column, changeable row (mixed reference)
  = $A$3    fixed column and row (absolute reference)
- naming cells: names can be used as variables in a formula instead of lengthy references
- built-in Excel functions: the syntax is =function_name(argument(s)). Arguments are separated by commas if the function requires more than one argument. A function processes input and returns a value. When you select a function from the list, a dialog box opens in which you supply the argument values. You can type the arguments in directly, or use the toggle buttons to switch to the worksheet, select cell ranges by pointing (left mouse click), and return to the dialog box.

Common Excel error return values

#DIV/0!   results from dividing by 0
#NAME?    a formula contains an undefined variable or function name, or the syntax is not valid
#N/A      a value is not available: the formula refers to cells which do not contain the appropriate data, e.g. a lookup finds no match
#NULL!    the formula refers to the intersection of two ranges that do not intersect
#NUM!     invalid numeric value, e.g. the square root of a negative number
#VALUE!   invalid argument type
#REF!     invalid cell reference
circular reference   a formula contains a reference to its own location

Conditional functions

IF (up to 64 nested levels in Excel 2007 and later; 7 in earlier versions)
Implicit IF functions: COUNTIF, SUMIF
Syntax: =SUMIF(range, condition, sum_range)

Lookup functions

=VLOOKUP(lookup_value, table_array, column_index, match)
=HLOOKUP(lookup_value, table_array, row_index, match)
- lookup_value: the value to search for in the first column (row) of the table.
- table_array: the range reference or name of the lookup table.
- column(row)_index: the number of the column (row) of the table from which the value is to be returned.
- match: a logical value (TRUE or FALSE) which specifies whether you want an approximate match (TRUE) or an exact match (FALSE). It is optional, with default value TRUE. An approximate match requires the first column (row) of the table to be sorted in ascending order.

Worksheet/Workbook Protection

It is possible to protect a workbook, a worksheet or parts of a worksheet (under the Review tab). You can lock or hide the formulas in part of a worksheet by highlighting the cell range, right-clicking, choosing Format Cells and then the Protection tab. The lock or hide setting only takes effect once you protect the worksheet (Review tab, Protect Sheet).

Excel statistical utilities

Presentation of quantitative data
Data classification, types of charts and graphs, pivot tables

Analysis of quantitative data
The universal goal of data analysis is to answer one very important question: what does the data reveal about the underlying system or process from which it was collected? Excel provides numerous internal tools designed explicitly for data analysis. Users can also build their own analytical procedures from Excel's basic mathematical functions. This is often how an add-in is born: an individual creates a clever analytical application and makes it available to others. An add-in is a program designed to work within the framework of Excel.
Add-ins use the basic capabilities of Excel (for example, the Visual Basic for Applications (VBA) or Visual Basic (VB) programming languages) to perform internal Excel tasks. These programming tools are used to automate tasks and to extend Excel's reach into areas that are not readily available. There are many free and commercially available statistical, business and engineering add-ins that provide this capability in user-friendly formats.
(1) Excel provides resident add-in utilities that are extremely useful in basic statistical analysis. The Data ribbon contains an Analysis group with almost 20 statistical Data Analysis tools.
(2) Excel provides dozens of statistical functions through the function utility (fx Insert Function) in the Formulas ribbon. Select the Statistical category in the Function Library group, select the function you want, and insert it in a cell. The Statistical category contains almost 100 functions that relate to important theoretical data distributions and statistical analysis tools.
(3) There are numerous commercially available add-ins: functional programs that can be loaded into Excel and that permit many forms of sophisticated analysis.

Presentation of qualitative data: data visualization and analysis
Sorting and filtering, pivot tables

Inferential statistics
Chi-square tests, confidence intervals, ANOVA, experimental designs

Visual Basic for Applications (VBA)¹

Excel has a powerful programming language which you can use to write your own programs and functions, thereby automating tasks. This offers a more efficient way of handling repetitive operations. VBA code is entered in the Visual Basic Editor (VBE).
VBA is installed by default, but the Developer tab, which gives access to the VBE, does not appear in the Ribbon automatically. To activate it, click File, then Options, then Customize Ribbon, and tick Developer. The Developer tab is added to the Ribbon, and from it you can open the VBE.
Alternatively, you can press ALT+F11 to open the VBE.

Creating a user-defined function

First activate the Visual Basic Editor (VBE). The VBE is like most other applications. It has a menu and a toolbar at the top of the window and several subwindows:
- The Project Explorer displays the hierarchical structure of projects.
- The Properties Window displays the properties of the selected object.
- The Module Window contains the VBA code of your project.
- The Immediate Window displays debugging output (e.g. from Debug.Print) and lets you execute single statements.
- The Module Window might not be visible when you open the VBE. To insert a module use VBE menu bar: Insert > Module (LC) (LC = left click on the mouse).
- The Immediate Window is made visible by VBE menu bar: View > Immediate Window (LC).

Steps for writing a program

Writing any kind of computer program consists of three principal steps:
i) Design an algorithm which will perform the task you want.
ii) Translate the algorithm into a computer language (code) with a certain syntax, e.g. VBA in our case.
iii) Test (debug) your program thoroughly.

¹ Adapted from Andreas Fring's handout.

Program structure types

(i) sequential structures (executed line by line)
    1 ...................
    2 ...................
    3 ...................
(ii) control structures
    · branching or decision structures
(iii) looping (repetition structures)
(iv) controlled GOTO

NOTE: It is useful to draw flow charts in order to keep track of the logic of the program structure. You do not need to write out every step in detail; general statements in words are sufficient.
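The structure types listed above can be combined in a single function. The following is a minimal sketch, not taken from the handout: the function name SumToN and its argument n are hypothetical and chosen only for illustration. It uses a branch (If ... End If), a loop (For ... Next) and ordinary sequential statements; the Function ... End Function syntax itself is described in the next section.

```vba
' Hypothetical example (not part of the original handout):
' returns 1 + 2 + ... + n, or 0 when n is not positive.
Function SumToN(n As Long) As Long
    Dim i As Long
    Dim total As Long

    ' Branching: handle non-positive input and leave the function early
    If n <= 0 Then
        SumToN = 0
        Exit Function
    End If

    ' Sequential statement: initialise the running total
    total = 0

    ' Looping: add 1, 2, ..., n to the running total
    For i = 1 To n
        total = total + i
    Next i

    ' The value assigned to the function name is returned
    SumToN = total
End Function
```

Entered in a module, “=SumToN(4)” on a worksheet should return 10 (1 + 2 + 3 + 4).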
General structure of a sequential user-defined function

    Function name [(arguments) [As type]] [As type]
        [statements]
        [name = expression]
        [Exit Function]
        [statements]
        [name = expression]
    End Function

- name: the name of the function
- arguments: a list of input values (just like for built-in functions)
- type: the data type which will be returned by the function
- statements: valid VBA commands
- expression: an arithmetic expression assigned to the function name, which will be returned

· The keywords (Function, As, Exit Function, End Function) have to be typed exactly as above.
· Everything in square brackets [...] is optional.
· Each statement has to begin on a new line.
· If a statement is longer than the line, you can split it by typing “ _” (i.e. space and underscore).
· A program (function) is read from top to bottom, that is, each line is executed after the previous one. There may be branches, loops etc. which you can design.
· When End Function or Exit Function is reached, the calculation terminates and the value last assigned to the function's name is returned.
· An assignment is written as an equation which has to be read from right to left, i.e. the value on the right-hand side of the equation is assigned to the name on the left-hand side.
· The arguments are the input and the function name contains the output.

Examples

(i)
    Function F(x)
        F = 3 * x + 10
    End Function
You can now use this function on an Excel worksheet in the same way as a built-in function, e.g. “=F(5)” returns 25.

(ii)
    Function FF(x)
        b = 2 * x
        FF = b + 5
    End Function
The variable b only exists temporarily inside the function FF.

(iii)
    Function G(x, y, z)
        G = y * x + z
    End Function
As with built-in functions, you can have more than one input variable (argument).

(iv)
    Function Q(a, b, c, x)   ' quadratic equation
        Q = a * x ^ 2 + b * x + c
    End Function
You can add comments to enhance readability. VBA does not execute text following a single quote.
“=Q(2,4,5,3)” returns 35 (2*3^2 + 4*3 + 5).

(v)
    Function S(x, y, z)
        S = 2 * Application.WorksheetFunction.FunctionName(x, y, z)
    End Function
You can use Excel built-in functions inside user-defined functions by replacing FunctionName with the name of the built-in function, e.g. FunctionName = Sum. With Sum, “=S(1,2,3)” returns 12.

Naming your user-defined functions

- The first character in the name has to be a letter.
- The names are not case sensitive.
- Names are not allowed to contain spaces, @, $, #, ... or to be identical to VBA commands.

Errors and debugging

- Inevitably you will make some mistakes, either typos or structural ones, and you need a strategy to eliminate them.
- Some mistakes block the entire worksheet (WS), e.g. suppose you type:
    Function Err(x)
        Err = 2 * Sqr
    End Function
  (Here Sqr is called without its bracketed argument.)
- Call this function on the WS (recalculation of the WS is F9) and an error message will be displayed. Click OK (LC) and the mistake will be highlighted. Then unlock the WS with “Reset” in the VBE.

Declaring variable types

- Recall: Function name [(arguments) [As type]] [As type]
- The first type refers to the variable type of the arguments and the second type to the variable type of the function.
- You can also declare variables used inside the program:
  · Syntax: Dim variable_name As type
  · Declaring the type prevents different types of data from getting mixed up, and it lets you trace mistakes systematically in long programs.
- When you do not declare the type, it will be Variant by default.
  · The Variant type takes more space than properly declared variables. Your program will run faster when you declare the types.

Variable types

· Integer: 16-bit (2-byte) whole numbers 0, ±1, ±2, ±3, ..., from -32768 to 32767, e.g.
      Dim a As Integer
      a = 32768    ' gives an error (overflow)
      a = 11.3     ' a becomes 11
· Boolean: 16-bit (2-byte) number which is “true” or “false”
· String: can contain up to about 2 billion (2^31) characters
· Single: 32-bit (4-byte) floating point number between -3.402823E38 and -1.401298E-45, and between 1.401298E-45 and 3.402823E38
· Double: 64-bit (8-byte) floating point number between -1.79769313486231E308 and -4.94065645841247E-324, and between 4.94065645841247E-324 and 1.79769313486232E308
· Variant: 16 bytes when holding a numerical value (here you see the disadvantage: it consumes more memory)
· Date: 64-bit (8-byte) number representing dates from 1 January 100 to 31 December 9999 and times from 0:00:00 to 23:59:59

Working with dates and times

Excel handles dates as serial numbers, where
    1 January 1900 = 1
    2 January 1900 = 2
    ........
    25 October 2005 = 38650
VBA uses the same serial numbers for all dates from 1 March 1900 onwards.

- Some date- and time-related VBA functions:
  · Month(date): a number between 1 and 12 representing the month
  · Weekday(date): a number between 1 and 7 representing the day of the week (by default 1 = Sunday)
  · Year(date): a number between 100 and 9999 for the year
  · Hour(date): a number between 0 and 23 for the hour
  · Minute(date): a number between 0 and 59 for the minute
  · Second(date): a number between 0 and 59 for the second

Examples:

a) Write a user-defined function which computes the weekday for a date.
    Function DD(da As Date)
        DD = Weekday(da)
    End Function
· Format cell A1 as a date and enter 25/10/2005.
· “=DD(A1)” returns 3 (a Tuesday).

b) Write a UDF which calculates the age in years given the birthdate.
    Function age(birthdate As Date)
        age = Int((Now() - birthdate) / 365)
    End Function
· (Now() - birthdate): the age in days
· Int(x): extracts the integer part of x
· age: the age in whole years (approximate, since dividing by 365 ignores leap years)
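Because example b) divides by 365, leap days are ignored and the result can be off by one close to a birthday. The following is a minimal sketch of one possible alternative that compares calendar dates instead; it is not the handout's method. The name AgeExact and the optional argument asOf are hypothetical and introduced only for illustration. It relies only on the built-in VBA functions Year, Month, Day, Date, CDate, IsMissing and DateSerial.

```vba
' Hypothetical alternative to example b) (not part of the original handout):
' age in completed years on a reference date (today if asOf is omitted).
Function AgeExact(birthdate As Date, Optional asOf As Variant) As Long
    Dim refDate As Date

    ' Use the supplied reference date, or today's date when none is given
    If IsMissing(asOf) Then
        refDate = Date
    Else
        refDate = CDate(asOf)
    End If

    ' Difference between the calendar years
    AgeExact = Year(refDate) - Year(birthdate)

    ' Subtract one year if the birthday has not yet occurred in the reference year
    If DateSerial(Year(refDate), Month(birthdate), Day(birthdate)) > refDate Then
        AgeExact = AgeExact - 1
    End If
End Function
```

With 25/10/2005 in cell A1, “=AgeExact(A1, DATE(2022,10,24))” should return 16 and “=AgeExact(A1, DATE(2022,10,25))” should return 17. For a birthdate of 29 February, DateSerial rolls over to 1 March in non-leap years, so the birthday is counted from 1 March in those years.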