Creating DataExpress Procedures TUTORIAL May 2, 2002 This presentation is designed to walk you through the basic steps of creating a DataExpress procedure. Creating a DataExpress Procedure TERMINOLOGY Database Data Set Data Element Data Byte A byte is a one-character piece AA data data element set consists a group of related ofof A database is aiscollection of information. It(fields) is the related data elements bytes that identify that a data organized for rapid smallest unitcontents of information unit define of information, the such of aand as in processing, searching, astudent database. record, name. forAexample, Other examples all the retrieval. database consists ofdata elements elements that are identify STUData bytes are ofdata related dataextracted sets. Anusinga STREET, student's address. AnSTU-ST, DataExpress by writing ais SM exampleSTU-CITY, of a database and example STU-ZIP, of awhich data set areisaSTUfew computational expression. (Student Management). ofDthe (Student data elements Detail) in in the the SM click to continue STU-D database. data set in the SM database. click to continue click to continue Creating a DataExpress Procedure BASIC STEPS 1. 2. 3. 4. 5. 6. 7. 8. Identify the data sources (databases and data sets) you want to access. Select the desired data fields from the data sets to create a working set. • A working set consists of all fields required for sorting, selecting, computations, and ouput. Create and define any computational fields. Define the output file. • Select the data fields in the order they should appear in the report. • Identify the data fields to use for sorting. • Identify the data fields to use for summary functions and control breaks. • Identify the data fields to use for selection criteria. Specify the linkages between data sets. Identify the output format (ex: terminal display, report listing) Run the procedure. Save the procedure. click to continue Creating a Basic DataExpress Procedure DATA SOURCES Before you go into DataExpress, determine which databases, data sets, and data fields you need to produce the report you want. In this case you have determined that the data you need will be extracted from: Database: SM Data sets: CLASS-D, STU-CLASS-D, STU-YRQXREF-D, STU-D Data fields: Creating a Basic DataExpress Procedure SORTS, SELECTIONS, CALCULATIONS, REPORT OUTPUT The following presentation walks through the screen by screen steps for creating a basic DataExpress procedure. This sample procedure produces a report that depicts the following data: •Classes in the year quarters A123 and A124. •Item numbers in the 1000 number range. •The student cumulative GPA’s are represented showing average, low and high GPA’s by item number and for the entire report output. •A count of students by class and by report is provided. click to continue From the DataExpress Main Menu, choose option 3 CREATE a new procedure. Once you create a procedure, you use it to produce (or run) a report. NOTE: Option #1 displays a list of existing DataExpress procedures in a specified catalog. Option #2 runs an existing DataExpress procedure. Option #4 allows modifications to existing DataExpress procedures. Option #8 exits DataExpress. click to continue From the CREATE a new procedure menu, choose option 1 (Use HP DICTIONARY definitions) to create most of your DataExpress procedures. All fields in FMS, PPMS, SMS, and FAS SAFERS databases are defined in the HP Dictionary. NOTE: Option #2 is used primarily for FAS NEEDnn databases. Option #3 is no longer used. Option #4 is used for accessing MPE, KSAM or self-describing files. Option #7 returns to the Main Menu. click to continue From the Use HP DICTIONARY definitions menu, choose option 2 (Use HP IMAGE data base definitions) to extract data from the SBCTC-IT application databases (FMS, PPMS, SMS and FAS SAFERS). NOTE: Option #3 is used to extract data contained in files created during job scheduling or batch processing. Not all files listed are permanent files. Some files may be working files for the job process and will contain no data. click to continue From the Use HP IMAGE data base definitions menu, choose option 2 (Select a specific IMAGE data base). Note: While it is possible to link multiple databases and files, you can select only one database at a time. Others may be added later in the create process. Option #1 displays a list of the available IMAGE databases. Option #3 allows the selection of a single data set. click to continue At the IMAGE data base name prompt, type the name of the application database to be accessed (SM – based on your preparation before you launched DataExpress). NOTE: Additional databases or files will be added at the ADD MORE FIELDS OR FILES (N/y)? prompt later in the process. click to continue At the Enter Data Set Selection prompt, type the menu numbers (separated by commas - no spaces) of the datasets to be used in creating the procedure. NOTE: If possible, enter the data set numbers in the order that they are to be linked. This is not REQUIRED, but is helpful when selecting data elements that may appear in multiple files. Linking is also made less confusing. click to continue Many databases will have multiple pages of listed data sets. To go to the next page without selecting anything on the current page, at the Enter Data Set Selection prompt, simply press Return. Note: When viewing DataExpress screens, read the information above the dotted line. Here you will find information about which menu is being viewed and which page of a multi-page selection menu is being displayed. click to continue Again, you don’t need any fields on this page so press Return to go to the next screen. click to continue To select the data fields you want from each data set, type the field menu numbers (separated by commas – no spaces) at the Enter Data Field Selection prompt. From the CLASS-D data set, you want fields 14 (CLASS-ITMNUM), 23 (COURSE-ID), and 28 (COURSE-TITLE). Note: It is not necessary to select the data fields in any particular order. The fields you select are building the Working Set (fields you want in your report output as well as fields used for sorting, selecting, and computations). click to continue At the Enter Data Set Selection prompt, continue selecting the fields to build the Working Set: 4 (INSTR-NAME) and 45 (YRQ). Note: Read the information above the dotted line. This tells you from which data set you are selecting fields and which page you are on. click to continue On page 3 of the CLASS-D data set, press Return (no more fields are needed from this data set). click to continue At the Enter Data Field Selection prompt, type the menu number of the field to be added to the Working Set – 7 (DROP-DATE). Note: Reading the information above the dotted line, notice that DataExpress is now displaying the fields in the second dataset (STU-CLASS-D) in the selection. click to continue At the Enter Data Field Selection prompt, type the menu number of the field to be added to the Working Set – 1 (SID). Note: DataExpress is displaying the data sets in the order they were selected from the SM database selection menu. This screen displays the fields in STU-YRQ-XREF-D. click to continue At the Enter Data Field Selection prompt, type the menu number of the field to be added to the Working Set – 17 (CUM-GPA). Note: DataExpress is now displaying page 1 of 2 of the pages in the STU-D data set. click to continue At the Enter Data Field Selection prompt, type the menu number of the field to be added to the Working Set – 14 (STU-NAME). Note: Page 2 of 2 of the fields in the STU-D data set are now displayed. click to continue Once you have finished selecting all the data fields (also called data elements) that you need for your procedure, at the Add More Fields Or Files prompt, press Return to continue building the procedure specifications. Note: The fields listed between the dotted lines, under the Define Output Specifications heading are referred to as the Working Set. A Working Set consists of all fields selected for output, sorting, selecting, and computations. If more fields or files were needed for the Working Set, the response to the Add More Fields Or Files prompt would be Y (yes). click to continue At the Enter Field Definitions prompt, type Y (yes) to define a date subtype for the DROP-DATE field. Note: All date fields should have date subtypes defined for them. For more information about what date subtypes are and how they function, click here : click to continue By the At typing Enter Data “Y” Type a at Field the (J+)previous Number prompt,prompt, prompt, type an the type ampersand the menu (&). prompts number The ampersand that below corresponds are is adisplayed, symbol with that allowing theprompts date field the DataExpress date to be that At the Date Subtype prompt, type the date subtype subtype modified to allow entry. a(6 date forsubtype DROP-DATE). entry. in the click to continue represents how data is stored field (for example, click to continue click to continue YYMMDD for 021231, December 31, 2002). Note: The majority of the date-related fields in the SBCTC-IT applications are 6 character YYMMDD formatted fields. When adding date subtypes, you are defining how the data is stored, not how you want it to display. click to continue Once you have defined all date subtypes for daterelated fields, at the Enter Field Number prompt, press Return to continue to the next option. click to continue At the Add Any Computational Fields prompt, press Return to accept the default of N (no). Note: A Y (yes) reply at this prompt, allows the creation of computed values. Most typically, computational fields are comprised of either arithmetic expressions (ex: one field plus another field), or conditional statements (ex: return a particular value if a certain condition or value is found in the field record). click to continue At the Include In Output prompt, type the field menu numbers (separated by commas - no spaces) in the order that you want the fields to appear on the report in columnar format. Note: It is not necessary to include all fields from the Working Set in the report output. Often fields are selected for the Working Set to be used as sort fields, selection fields (for filtering specific records), or as fields to be used in computations. Helpful Hint: It is recommended that sort fields are the leftmost fields in the report. This creates a control break where summaries can be obtained. click to continue At the Sort By prompt, type the menu numbers (separated by commas - no spaces) of the fields that you want the report to be sorted by. In this case, you want the data first sorted by YRQ, then by CLASS-ITMNUM, and so forth. Note: In this report,by selecting all fields related to the class as sort fields, a blank will appear in each row for those fields until a change in value occurs. This gives a group appearance that is easier to read than listing the repeating values. click to continue At the Summarize By prompt, type 0 to obtain report totals, followed by 1 to obtain totals for each CLASSITM-NUM. Note: Fields specified as SUMMARIZE BY fields must also be SORT fields that produce a control break. A control break is the point at which the value in a sort field changes. At each control break, you can specify a summary function. click to continue Under Enter Summary Field Selection For, at the report prompt, type 7C, 8A, 8H, 8L to obtain report summaries as follows: Count SIDs, calculate the Average, High, and Low CUM-GPAs. Note: The report prompt relates to the SUMMARIZE BY entry of 0. When prompted for summary field selections, you can enter fields contained in the Working Set followed by operators of C (count), A (average), H (high), L (low), E (end or last value), and * (subtotal a numeric field). click to continue Under Enter Summary Field Selection For at the CLASSITM-NUM prompt, type 7C, 8A, 8H, 8L to obtain summaries by item number for: Count SID’s, calculate the Average, High, and Low CUM-GPA. click to continue At the Suppress Detail Detail Records press Return to accept the default of N (no). Note: When responding with a Y to this prompt, automatic summarization of numeric fields will occur, giving a summary line report. This will only happen where nonunique values in non-numeric fields appear. To see an example of a report displaying suppressed detail, click here: click to continue At the Select By prompt, type the menu numbers (separated by commas - no spaces) for the fields that will be used to enter run-time values that will limit or filter the record output of the report. Note: Try to order the Select By fields in the order that they are found in the file linkage. In other words, select by those fields that come from the first data set first. If multiple fields from a single data set are used as selection fields, enter the key fields first. click to continue At the Are The Specifications Correct prompt, press Return to accept the default of Y (yes). Note: A N (no) reply will return to previous prompts for data re-entry. click to continue At the File Access And Linkage Order prompt, type the data set numbers (separated by commas - no spaces) in the order they are to be accessed. 1 (CLASS-D), 2 (STU-CLASS-D), 3 (STU-YRQXREF-D), 4 (STU-D). Note: When files (data sets) are separated by commas, it is assumed that the link field is in the file immediately to the left of the file number. Above, the linkage reads as follows: file 1 links to file 2, file 2 links to file 3, file 3 links to file 4. For other examples of linking methods, click here: click to continue At the Is There A Defined Path Between Data Sets prompt, press Return to accept the Y (yes) default. DataExpress will then search for the link between CLASS-D and STU-CLASS-D as noted at the top of the screen. Note: When unsure of whether or not there is a defined path, press return. If there is not a defined path, DataExpress will return a warning message and allow the manual selection of the linking fields. click to continue At the Is There A Defined Path Between Data Sets prompt, press Return to accept the Y (yes) default. DataExpress will then search for the link between STU-CLASS-D and STU-YRQ-XREF-D as noted at the top of the screen. click to continue At the Is There A Defined Path Between Data Sets prompt, press Return to accept the Y (yes) default. DataExpress will then search for the link between STU-YRQ-XREF-D and STU-D as noted at the top of the screen. click to continue At the Alternate Names For Selection Fields prompt, type Y (yes) and press Return to customize the run-time prompt for the YRQ selection field. Note: If fields are specified at the Select By prompt, you will receive this prompt. This allows the option of creating a customized prompt for the field which will be seen when the procedure is executed. This can be helpful if the field name is cryptic or the value to be entered is complex. click to continue Press At thereturn YRQ at prompt, the CLASS-ITM-NUM, type a customized COURSE-ID, prompt for the and YRQ DROP-DATE field. This message prompts.will be seen when the procedure is executed (instead of the YRQ prompt). Press return at the CLASS-ITM-NUM, COURSE-ID, and DROP-DATE prompts. click to continue Note: It is particularly helpful to customize the prompt when selecting by date related fields. For example, instead of seeing STU-STRT-DATE, the prompt can be changed to read TYPE STUDENT START DATE (YYMMDD) indicating the appropriate format of the selection entry. click to continue At the Edit Names/Locations Of Data Bases And Files prompt, press Return to accept the N (no) default. Note: This prompt will always be displayed. Most of the time, the response will be the default of N (no). Only respond with a Y (yes) if group or account location of the databases or data sets needs to be requalified. This is sometimes necessary when sharing procedures from college to college which contain non-standard files (not defined by Dictionary 3000). click to continue At the Include Database Password(s) In Procedure prompt, press Return to accept the N (no) default. Note: If the database passwords are entered here, they will be embedded in the procedure and the password will not be required at run-time. This will eliminate the security that the database passwords provide. It is not recommended that passwords be entered here. click to continue At the Enter An Option Number prompt, press Return to provide the run-time option of choosing the output format. Note: Unless the procedure is producing a file to be saved for use in a multi-pass procedure, it is recommended that you press Return to allow the run-time output option. click to continue At the Run The Procedure prompt, press Return to accept the Y (yes) default. Note: Typing N (no) here will provide the option to save the procedure so that it can be executed at a later date. The procedure may also be saved after it is run. click to continue At the Enter An Option Number prompt, press Return to accept the default of a Report Listing output. A Report Listing is either sent to a printer or viewed on the screen. Note: This is a repeat of the screen seen earlier, but the procedure is now in run mode. DataExpress is offering the run-time option for output. Helpful Hint: Other common run-time options (for non-multi-pass procedures) are 27 (for Excel), 29 (for Access). click to continue At the PASSWORD for SM prompt, type the database password for the SM database at your college. Be sure to type in all uppercase lettering. Note: As you type the password, the field will remain blank for security purposes. If you do not know the database password, see your system administrator, IT staff, systems security manager, or computing services staff. click to continue Once At At Atthe the thethe TYPE and and and database CLASS-ITM-NUM COURSE-ID DROP-DATE VALID password YEAR prompt, prompt, has QUARTER/S prompt, been press type entered, Return 000000 typeprompt, 1@ prompts to(six will type (using extract zeros) be A123,A124 the to displayed allextract @ course wildcard) to only for id’s extract those (ex: those to limit fields only ENGL101) records the those that extraction where were records within student’s entered to whose YRQ only at the YRQ those A123,A124 drop Select date records valueBy inmatch with the whose prompt. class athe CLASS-ITM-NUM class specified isBy null. item entering Note: number criteria. values A single falls range forwithin 0each inwill thethe field, click 1000 1000’s. workto item the also. continue output number records range. clickare to continue limited. click to continue click to continue Note: When entering values for the selection fields, valid operators include: LT (less than), LE (less than or equal to), GT (greater than), GE (greater than or equal to), NE (not equal to), EQ (assumed - equal to). The connectors of AND, OR, and TO may also be used to designate ranges. The wildcard symbol @ may be used to specify any characters. The wildard symbol ? designates any single character. click to continue Based on the selection criteria entered on the previous screen, the records will be read and extracted. At the List Report On Printer prompt, press Return to accept the N (no) default. The output will appear on the screen. Note: When specifying Y (yes) at the List Report On Printer prompt, you will be given the option to send the output to a system or terminal printer. click to continue At the Page Parts, Multi-Line or Truncate prompt, press Return to accept the default of P (page parts). This will display the report horizontally across the screen in multiple page parts when the report is wider than a single page. Note: The Multi-Line option prints the output to the screen on a single page, with multiple lines per record. This is difficult to read. The Truncate option crops the report to a single line per record on a single page horizontally. For a wide report, some fields will be lost from view. Page Parts is the most commonly used view. click to continue The output will display on the screen. To navigate up and down use the F1 (Forward) and F2 (Backward) function keys. To navigate horizontally in the report, press the F4 (Left) and F5 (Right) functions keys. Note: SID’s have been blacked out for privacy Note: The fields for which summary totals were requested will have a <*> under them. Note: The total will appear under the field being summarized. Here, a COUNT of 10 SID’s for the item number. click to continue By pressing the F5 function key, the screen displays page 1B of 181. Note: Average, High, Low CUMGPA’s for this item number. click to continue To view the report totals, at the Page Forward/Exit/Page-Back prompt, type the last report page number (which can be found in the upper right corner of the screen). click to continue Press the F4 function key to move to page 181A and view the report totals for the SID count. Note: Report totals will appear as the last records on the report - after the last item number summaries. click to continue Press the F8 EXIT key to exit the report listing view. Note: Report total SID count appears at the end of the report under the SID heading. click to continue At the Reuse Extract File prompt, press Return to accept the N (no) default. Note: A Y (yes) response allows you to reuse the data extract and either view it again, send it to a printer, or choose another output file format option. click to continue At the Repeat Procedure Run prompt, press Return to accept default of N (no). Note: A Y (yes) response allows you to rerun the procedure and enter different selection values if desired. click to continue At the Save Current Procedure prompt, press Return to accept the default of Y (yes). Note: Is not necessary to save every created procedure. If saved, the procedure is stored in a procedure catalog so that it can be accessed and rerun at a later date. click to continue At the Save Name prompt, type a 10 character procedure name. Note: The procedure name is user defined and can be anything you desire. Helpful Hint: Include something in the name that makes the procedure easy to find or wildcard search when perusing the catalog. click to continue At the Description prompt, type a 50 character description that includes details about the kind of procedure written. Note: This description appears in the procedure catalog. It is helpful to be as descriptive as possible. Use key words and try to include the select by fields. click to continue At the Entry Access By User prompt, type the @ symbol. Note: The @ symbol opens access to the procedure to all users who have MPE access to this catalog. Associated database passwords are still required. click to continue At the Entry Password prompt, press Return, leaving the field blank. Note: The Entry Password allows the option of securing a particular procedure with a user defined password. This is only recommended for more extreme security issues. In most cases the database password will suffice. click to continue At the Save Entry Into Local Catalog prompt, press Return to accept the default of Y (yes). Note: The local catalog location is tied to the log-on locale. The most common local catalog is the GEN.Pnnn (where nnn is the college number) catalog. Most procedure are saved here. click to continue At the Press <Enter> key to continue prompt, press the Return key to go back to the DataExpress Main Menu. click to continue WHOOO HOOO! By following this presentation, you created a DataExpress procedure that extracts class rosters for specified year quarters and item numbers. The report also counts SID’s and reflects average, high, low GPA’s for each item number and for the report. CONGRATULATIONS! DATE SUBTYPES •Define how date-related data is stored – not how you wish data to appear •Enable correct sorting •Enable valid selection ranges •Ensure valid computations Date Examples 020614 (June 14, 2002) 981024 (October 24, 1998) •Current date fields are defined as text or numeric strings. •A date subtype applies a special data type to identify date formats such as YYMMDD. Click for more information DATE SUBTYPES Century Delimiter •Delimits years to be treated as the twenty-first century versus the twentieth century. SET AT 10 •Looks at the two-digit YY part of the date and says: •If the YY is 10-99, then assume 1910-1999 •If the YY is 00-09, then assume 2000-2009 19981024 (October 24, 1998) 20020614 (June 14, 2002) •Dates now sort, select, calculate properly. Click to return to presentation DETAIL NOT SUPPRESSED Shows every line record for the student. STU-NAME SID CLS-TITLE LARRY 111-22-3333 BIOLOGY LARRY 111-22-3333 SOCIOLOGY LARRY 111-22-3333 PSYCHOLOGY TOTAL CR 5.0 3.0 4.0 12.0 DETAIL SUPPRESSED – AUTOMATIC SUMMARIZATION Remove unique detail information to obtain summary line total STU-NAME SID LARRY 111-22-3333 CR 12.0 Click to return to presentation OTHER LINKING EXAMPLES • Example: 1,2,1:3 – Use a field from file 1 to link to file 2. – Use a field from file 1 to link to file 3. – Read file 1 sequentially; for each record in file 1, read file 2 and extract corresponding records – For each record in file 1, read file 3 and extract corresponding records. • Example: 1,2,:3 – Use a field from file 1 to link to file 2. – Use a computed field to link from file 2 to file 3. The field is computed as soon as required data is read from the file. – Read file; for each record in file 1, read file 2 and extract corresponding records – For each record in file 2, read file 3 and extract corresponding records. Click for more information SPECIAL CASES OF LINKING • You can modify linking formulas by using the following symbols: – R (reverse) Example: 1R,2,3 Example: 1R,2,3 Read file 1 backwards – + (plus sign) Example: 1,+2 Selects all records in file 1 even if they do not have matching records in file 2.. – - (minus sign) Example, 1,-2 Selects only thos records in file 1 when there are no corresponding records in file 2. Click for more information SUMMARY OF LINKING • Link from any field (data element) in the from (or source) file. • Link only to a key field in the to (or target file). • When all fields in the from file re displayed, choose one as the linking field. • When all fields in the to file are displayed, choose the appropriate key field. MAXIMIZING PERFORMANCE • List files containing selection criteria data first in the linkage order. • If using MPE file, list them first in the linking sequence. Click to return to the presentation