DR Report Development Strategies using Transact-SQL
By Acmeware, Inc.
Glen R. D’Abate - President

Introduction
- Acmeware
- Course Objectives
- Review Participants' Experience
- Course Summary

T-SQL Overview
- T-SQL Basics
  - SELECT Statement
  - TABLE JOIN Strategies
  - WHERE Filters & Using T-SQL Functions
  - GROUP BY & HAVING
  - ORDER BY & UNION
- Advanced T-SQL Topics
  - Sub-Queries, Temp Tables, & Embedded Queries
  - Parameters
  - Cursors
  - Indexes

Lots of Examples
- Examples are derived from real problems requested by our clients
- Most statements have been drastically simplified to demonstrate the topic of discussion
- We will provide example code via email if you are interested in receiving it

Why use T-SQL?
- Server-side processing
- Modular Solutions Construction
- T-SQL Extends ANSI SQL
- Access to SQL Server Function Set
- Access to SQL Server System Stored Procedures & Extended Stored Procedures

What is Created with T-SQL
- Query Analyzer Ad-hoc Queries
- SQL Server Stored Procedures
- SQL Server Views
- Application-specific pass-through queries (e.g., Access, SRS, etc.)

Where to use T-SQL Code
- Query Analyzer
- Microsoft Access
- Visual Basic / C# and .ASP
- Crystal Reports / Crystal Enterprise
- Cognos / Actuate / Business Objects

T-SQL Basics
- SELECT, DISTINCT, Aggregate Functions
- FROM, JOIN, ON
- WHERE
- GROUP BY
- HAVING
- ORDER BY
- UNION

SELECT / DISTINCT
- The SELECT statement identifies the fields or columns of data that will be returned in the Result Set
- Column values may be:
  - An explicit value from a data source
  - An aggregate of a data value
  - A derived value from a T-SQL expression
  - A literal value

Result Set
- A set of Records (or Rows) returned as a result of processing the T-SQL code.
- Every record has a value or a NULL for each column or field defined in the T-SQL statement.

Example Result Set
- Result Columns
- Result Records
- NULL values: NULL implies “unknown”

SELECT Statement
- About as simple as it gets.
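As an illustration of the simplest SELECT, here is a minimal sketch. The table variable `@BarVisits` and its columns are hypothetical stand-ins for a DR table, declared inline so the batch can be run in any scratch database; the multi-row `VALUES` syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical stand-in for a DR table; the column names are assumptions.
DECLARE @BarVisits TABLE (
    SourceID      varchar(3)  NOT NULL,
    VisitID       varchar(30) NOT NULL,
    AccountNumber varchar(20) NULL,
    Name          varchar(60) NULL,
    PRIMARY KEY (SourceID, VisitID)
);

INSERT INTO @BarVisits (SourceID, VisitID, AccountNumber, Name)
VALUES ('AB', 'V0001', 'A100', 'DOE,JANE'),
       ('AB', 'V0002', NULL,   'ROE,RICHARD');   -- AccountNumber is unknown (NULL)

-- Return every column of every row.
SELECT *
FROM @BarVisits;

-- Return only the named columns; the second row shows NULL for AccountNumber.
SELECT VisitID, AccountNumber
FROM @BarVisits;
```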
- An asterisk means return all columns

SELECT with Aggregate
- Counts all BillingIDs in the table
- Totals all ChargeTotals in the table
- Totals charge Amounts for a given SourceID & VisitID

SELECT with Expression
- Expression with a string manipulation function
- The IsNull() function is used often with DR data
- Note the missing column name

SELECT with Literal
- Literal expressions are common when the UNION operator is used, as a way to indicate which result set of the UNION a record was derived from.
- Note that literal expressions are omitted from the GROUP BY statement

SELECT Distinct
- DISTINCT means a distinct record in the Recordset, not a distinct column, unless the Recordset has only one column.

SELECT Statement - TOP N
- TOP N limits the result set to the first N records. It is used most often with the ORDER BY clause, or when trying to get a quick snapshot of data.

ANSI SQL Syntax: SELECT, DISTINCT, Aggregate Functions / FROM, JOIN, ON / WHERE / GROUP BY / HAVING / ORDER BY / UNION

DR Table Joining Strategies
- Primary Keys are defined in all tables, and table data is sorted by the MEDITECH-defined Primary Key Clustered Index
- The SourceID is a DR-only construct defined to allow multi-database MT application implementations to exist in the same DR table. Also, multiple MT universe implementations can feed a single DR, using SourceID to delineate data. SourceID is in virtually all DR tables.
- Foreign Keys are identified only by the MEDITECH DR naming convention (i.e., they end in ID).
- No declared Foreign Key relationships exist.
- The hierarchy that existed in the NPR data structures is typically reconstructed through PK-to-PK table links within a module

DR Table Joining Issues
- PatientID defines Patient-level data (i.e., Medical Record Unit Number externally)
- VisitID defines Visit-level data (i.e., AccountNumber)
- VisitID joins provide cross-module linkage and should be used for inter-module table joins
- Most modules have internal IDs for linking within the module (e.g., ABS - AbstractID; B/AR - BillingID; LAB - SpecimenID), which should be used for intra-module table linking

More DR Joining Factors
- SourceID should typically be included in DR Table Joins
- Outer Joins are common where Nulls may exist in a Table (e.g., looking for the “Attending Provider” if he/she exists)
- Unit Number or PatientID Joins can be used to compare a patient’s visits with other visits by the same patient (e.g., revisits within 72 hours)

Five Common DR Joins
1. Intra-Module Join
2. Module - Dictionary Join
3. Inter-Module Join
4. Table Self-Join
5. Outer Join (Left or Right)

Intra-Module Join
- Example: Create a JOIN that includes general Billing / Accounts Receivable (B/AR) information (i.e., from the NPR main segment) and includes the Patient demographic as well as financial information associated with a visit.

B/AR Intra-Module Join
- BarVisitFinancialData is at the same segment level of the NPR structure as BarVisits (i.e., both are BAR.PAT)
- Always use SourceID within a module
- Note the table naming nomenclature. Main segment tables are plural (e.g., BarVisits). Subordinate NPR tables have the same prefix with the addition of the Segment name (e.g., BarVisitFinancialData, BarVisitProviders)

Module Join to a Dictionary
- Example: Create a JOIN that includes B/AR Charge detail information as well as detail about the specific charge Transactions. This requires the use of the MEDITECH B/AR Procedure dictionary, which is named ‘DBarProcedures’ in the DR.
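A minimal sketch of such a module-to-dictionary join follows. The table variables stand in for `BarChargeTransactions` and `DBarProcedures`; every column name (including `TransactionProcedureID`, `ProcedureID` and `Name`) and all sample rows are assumptions made for illustration, not the actual DR schema.

```sql
-- Hypothetical stand-ins for the DR fact table and its dictionary.
DECLARE @BarChargeTransactions TABLE (
    SourceID               varchar(3)  NOT NULL,
    BillingID              varchar(30) NOT NULL,
    TransactionID          int         NOT NULL,
    TransactionProcedureID varchar(15) NULL,     -- FK by naming convention only
    ServiceDateTime        datetime    NULL,
    Amount                 money       NULL,
    PRIMARY KEY (SourceID, BillingID, TransactionID)
);

DECLARE @DBarProcedures TABLE (
    SourceID    varchar(3)  NOT NULL,
    ProcedureID varchar(15) NOT NULL,
    Name        varchar(60) NULL,
    PRIMARY KEY (SourceID, ProcedureID)
);

INSERT INTO @DBarProcedures (SourceID, ProcedureID, Name)
VALUES ('AB', '1001', 'CHEST XRAY 2 VIEWS'),
       ('AB', '2002', 'CBC WITH DIFF');

INSERT INTO @BarChargeTransactions
       (SourceID, BillingID, TransactionID, TransactionProcedureID, ServiceDateTime, Amount)
VALUES ('AB', 'B0001', 1, '1001', '20050301 08:15:00', 250.00),
       ('AB', 'B0001', 2, '2002', '20050301 09:40:00',  45.00);

-- The fact-table FK (TransactionProcedureID) is named more specifically than
-- the dictionary PK (ProcedureID); SourceID is part of the join.
SELECT ct.BillingID,
       ct.TransactionID,
       ct.ServiceDateTime,
       ct.Amount,
       ct.TransactionProcedureID,
       p.Name AS ProcedureName
FROM @BarChargeTransactions AS ct
INNER JOIN @DBarProcedures AS p
        ON  p.SourceID    = ct.SourceID
        AND p.ProcedureID = ct.TransactionProcedureID
ORDER BY ct.BillingID, ct.TransactionID;
```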
B/AR Module to B/AR Dictionary Join
- Often the fact table FK will be named more specifically than the corresponding PK in the dictionary table (e.g., TransactionProcedureID, OrderingProviderID, etc.)
- Dictionaries in the DR all begin with ‘D’ followed by the module (e.g., DMisProviders, DBarProcedures, DLabTests, etc.)

B/AR Module to MIS Dictionary Join
- When linking to MIS dictionaries, the SourceID may or may not be appropriate. This depends on whether you are using a multi-database module (e.g., B/AR) and multiple databases exist. We have also seen the LAB module set up with different SourceIDs

Inter-Module Join
- Example: Create a JOIN that includes Visit Demographic information with Laboratory Test results.

Inter-Module with Lab data
- SourceID should be used if it can be used
- VisitID is key for cross-module joins

Inter-Module with MRI Demo-Recall data
- PatientID is used for MR queries.

Table Self-Join
- Example: This type of query can be used to look at activity within a visit in the AdmVisitEvents table. Create a JOIN that finds patients transferred from a location to a second location, then back to the original location within 4 hours of being transferred out of that unit.
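One way to sketch this self-join is to alias the events table three times: once for the original location, once for the second location, and once for the return. The table variable, its columns (`EventDateTime`, `LocationID`) and the sample rows are hypothetical. The sketch does not require the three events to be consecutive, so a transfer can be reported more than once when several events fall inside the window.

```sql
-- Hypothetical stand-in for AdmVisitEvents: one row per location change.
DECLARE @AdmVisitEvents TABLE (
    SourceID      varchar(3)  NOT NULL,
    VisitID       varchar(30) NOT NULL,
    EventDateTime datetime    NOT NULL,
    LocationID    varchar(10) NOT NULL,
    PRIMARY KEY (SourceID, VisitID, EventDateTime)
);

INSERT INTO @AdmVisitEvents (SourceID, VisitID, EventDateTime, LocationID)
VALUES ('AB', 'V0001', '20050301 08:00:00', 'ICU'),
       ('AB', 'V0001', '20050301 10:00:00', 'MED'),
       ('AB', 'V0001', '20050301 12:30:00', 'ICU'),   -- back within 4 hours
       ('AB', 'V0002', '20050301 08:00:00', 'ICU'),
       ('AB', 'V0002', '20050301 09:00:00', 'MED'),
       ('AB', 'V0002', '20050301 15:00:00', 'ICU');   -- back after 6 hours

SELECT e1.VisitID,
       e1.LocationID    AS OriginalLocation,
       e2.LocationID    AS SecondLocation,
       e2.EventDateTime AS TransferredOut,
       e3.EventDateTime AS TransferredBack
FROM @AdmVisitEvents AS e1
INNER JOIN @AdmVisitEvents AS e2                       -- move to a second location
        ON  e2.SourceID      = e1.SourceID
        AND e2.VisitID       = e1.VisitID
        AND e2.EventDateTime > e1.EventDateTime
        AND e2.LocationID   <> e1.LocationID
INNER JOIN @AdmVisitEvents AS e3                       -- return to the original location
        ON  e3.SourceID      = e1.SourceID
        AND e3.VisitID       = e1.VisitID
        AND e3.EventDateTime > e2.EventDateTime
        AND e3.LocationID    = e1.LocationID
WHERE DATEDIFF(minute, e2.EventDateTime, e3.EventDateTime) <= 240;
```

With these sample rows only visit V0001 qualifies, because V0002 returns to the ICU six hours after leaving it.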
Self-Join using Visit Events
- This will identify the same transfer multiple times if multiple transfers occur within a 4-hour window

Outer Join
- Example: Create a JOIN that identifies a patient’s top two ICD-9 procedures and top two diagnoses, regardless of whether or not they exist

Outer Join with ICD-9s & Diagnoses
- Alias the ICD-9 and Diagnosis tables twice in order to identify primary and secondary occurrences, if they exist

Outer Join (Example 2)
- Example: Create a JOIN that identifies all visits (by discharge or service date) that occurred more than two months ago and do not have a Bill generated

Outer Join Finding Missing Bills
- Filter for missing BarBills entries

ANSI SQL Syntax: SELECT, DISTINCT, Aggregate Functions / FROM, JOIN, ON / WHERE / GROUP BY / HAVING / ORDER BY / UNION

The WHERE Clause
- Specifies a filter condition to restrict the rows returned to the record set
- The WHERE expression simply evaluates to a Boolean (True or False) value that determines whether the record (as identified in the JOIN process) is to be returned
- May use operators, functions, and subqueries in the expression

WHERE Clause: Find all current inpatients and list the patient’s admission date and current location (defined as room location, room, and bed)
- Patients who have been admitted but not yet discharged are considered current inpatients

WHERE Clause: Identify the Reason employees have been terminated in less than one year for specific job codes
- This example uses the DateDiff() function as well as the IN operator.
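A small sketch of a filter that combines DateDiff() with IN is shown here. The source does not name the personnel table, so `@Employees`, its columns and the job codes are all hypothetical.

```sql
-- Hypothetical employee table; the names and job codes are assumptions.
DECLARE @Employees TABLE (
    EmployeeID        varchar(10) NOT NULL PRIMARY KEY,
    JobCode           varchar(10) NOT NULL,
    HireDate          datetime    NOT NULL,
    TerminationDate   datetime    NULL,
    TerminationReason varchar(40) NULL
);

INSERT INTO @Employees (EmployeeID, JobCode, HireDate, TerminationDate, TerminationReason)
VALUES ('E001', 'RN',    '20040105', '20040820', 'RELOCATED'),
       ('E002', 'LPN',   '20030601', '20050115', 'RETIRED'),
       ('E003', 'CLERK', '20040301', '20040601', 'OTHER JOB'),
       ('E004', 'RN',    '20040901', NULL,       NULL);

-- DATEDIFF counts datepart boundaries crossed, so DATEDIFF(year, ...) would
-- report 1 for 31 Dec to 1 Jan; counting days gives a safer "under one year" test.
SELECT EmployeeID,
       JobCode,
       TerminationReason,
       DATEDIFF(day, HireDate, TerminationDate) AS DaysEmployed
FROM @Employees
WHERE TerminationDate IS NOT NULL
  AND DATEDIFF(day, HireDate, TerminationDate) < 365
  AND JobCode IN ('RN', 'LPN');
```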
WHERE Clause: Find patients with first name beginning with ‘J’, specific Zip Codes and Diagnoses
- The IsNumeric() function can be used to verify that a value is numeric before performing a mathematical function. This is VERY useful when working with DR data
- The LIKE operator allows for pattern matching in the WHERE filter

WHERE Clause: String manipulation functions to clean up a Name conversion problem
- The Left(), Right(), and SubString() functions allow for string extraction from within a string
- Identifying a string within a string can be performed using the Charindex() function

ANSI SQL Syntax: SELECT, DISTINCT, Aggregate Functions / FROM, JOIN, ON / WHERE / GROUP BY / HAVING / ORDER BY / UNION

Aggregate Functions
- T-SQL Aggregate functions are used to summarize data values:
  - Count(), Count(Distinct)
  - Avg(), Avg(Distinct)
  - Sum(), Sum(Distinct)
  - Max() & Min()
- The HAVING statement allows filtering of aggregate values

Aggregate Example: Census by Location and Date
- Count() applied to a column counts all non-NULL entries (Count(*) counts every row). If Nulls are encountered, a warning is displayed.
- The AdmNursingCensus table captures the census at the midnight run. Each inpatient is assigned to the location in which they currently reside.

Aggregate Example: Create a T-SQL statement that identifies, by Final DRG, the Average Length-of-Stay (as defined by Medicare) and the number of Medications a patient received. Only consider patient visits in 2004 and only include DRG results if more than 2 visits occurred with the DRG.
- Medicare defines LOS as 1 for a single-day stay, and as the number of midnights for a multiple-day stay
- The statement counts the unique DrugIDs that were given at any time during a stay.
- The HAVING filter removes records from the result set AFTER the aggregation has been performed. We only consider DRGs when more than 2 visits with the DRG have occurred.
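The aggregation described above might be sketched as follows. The two table variables, their columns and the sample rows are hypothetical, SourceID is left out for brevity, and "visits in 2004" is assumed to mean discharges in 2004. Distinct drugs are counted per visit in a derived table first, so that joining to the medication rows does not distort the length-of-stay average; this is one possible design, not necessarily the one used in the original statement.

```sql
-- Hypothetical visit and medication tables; all names are assumptions.
DECLARE @Visits TABLE (
    VisitID           varchar(30) NOT NULL PRIMARY KEY,
    FinalDRG          varchar(4)  NOT NULL,
    AdmitDateTime     datetime    NOT NULL,
    DischargeDateTime datetime    NOT NULL
);
DECLARE @Medications TABLE (
    VisitID varchar(30) NOT NULL,
    DrugID  varchar(15) NOT NULL
);

INSERT INTO @Visits (VisitID, FinalDRG, AdmitDateTime, DischargeDateTime)
VALUES ('V1', '127', '20040110 08:00', '20040110 20:00'),  -- same-day stay
       ('V2', '127', '20040201 14:00', '20040204 10:00'),
       ('V3', '127', '20040305 09:00', '20040307 11:00'),
       ('V4', '089', '20040401 09:00', '20040403 09:00'),
       ('V5', '089', '20040501 09:00', '20040502 09:00');

INSERT INTO @Medications (VisitID, DrugID)
VALUES ('V1', 'D1'), ('V1', 'D1'), ('V1', 'D2'), ('V2', 'D1');

SELECT v.FinalDRG,
       COUNT(*) AS Visits,
       -- Medicare LOS: 1 for a same-day stay, otherwise the midnights crossed.
       AVG(CAST(CASE WHEN DATEDIFF(day, v.AdmitDateTime, v.DischargeDateTime) = 0
                     THEN 1
                     ELSE DATEDIFF(day, v.AdmitDateTime, v.DischargeDateTime)
                END AS decimal(9,2)))                  AS AvgLOS,
       AVG(CAST(ISNULL(m.DrugCount, 0) AS decimal(9,2))) AS AvgDistinctDrugs
FROM @Visits AS v
LEFT JOIN (SELECT VisitID, COUNT(DISTINCT DrugID) AS DrugCount
           FROM @Medications
           GROUP BY VisitID) AS m
       ON m.VisitID = v.VisitID
WHERE v.DischargeDateTime >= '20040101'
  AND v.DischargeDateTime <  '20050101'
GROUP BY v.FinalDRG
HAVING COUNT(*) > 2;      -- applied after aggregation; drops DRG 089 (2 visits)
```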
Aggregate Example: Identify the Average Charges associated with a patient visit by Attending Provider for a given Major Diagnosis Group, and also show the number of cases that were used to compute this average.
- Diagnosis codes are sometimes entered with a leading character. In this case, remove the leading character.
- Because a single visit may have multiple Diagnoses, we must count DISTINCT VisitIDs
- Charge Total is available from BarVisitFinancialData. It could also be derived from BarChargeTransactions
- Each visit should have only one ‘Attending’ Provider. You may need to verify this.
- Only Final Bill (FB) accounts are considered, and only those with a diagnosis that contains a “.”
- The GROUP BY statement must match the non-Aggregate columns in the SELECT statement

ANSI SQL Syntax: SELECT, DISTINCT, Aggregate Functions / FROM, JOIN, ON / WHERE / GROUP BY / HAVING / ORDER BY / UNION

ORDER BY / UNION
- ORDER BY allows sorting of the result set
- The result set sorting can be specified by column name, column alias, an expression, or position in the result set
- UNION allows two result sets to be concatenated and returned as a single result set.
- Both result sets must have the same number of columns, and corresponding columns must have compatible datatypes, for UNION to function
- By default, UNION is UNION Distinct, unless UNION ALL is specified

ORDER BY Example: Create a T-SQL Statement that lists all Inpatient Admissions that occurred in March 2005 and sorts the output by Admitting Priority, then by Admitting date and time.
- Sorts first by Admitting Priority, then by AdmitDateTime
- The default sort is ASC (ascending); optionally, sort by DESC (descending)
- Note that an item need not be in the SELECT statement to be used in the ORDER BY. However, if an aggregate exists in the SELECT, then the item being sorted must be in the SELECT or GROUP BY statement

ORDER BY Example: Create a T-SQL statement that counts the number of Laboratory specimens per hour-of-day that were completed in 2005.
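A minimal sketch of sorting by an expression follows. The table variable, the `Status` value 'COMP' and the `ReceivedDateTime` column are hypothetical names chosen for the illustration.

```sql
-- Hypothetical specimen table; names and status codes are assumptions.
DECLARE @LabSpecimens TABLE (
    SpecimenID       varchar(30) NOT NULL PRIMARY KEY,
    Status           varchar(10) NOT NULL,
    ReceivedDateTime datetime    NOT NULL
);

INSERT INTO @LabSpecimens (SpecimenID, Status, ReceivedDateTime)
VALUES ('S1', 'COMP', '20050110 06:15'),
       ('S2', 'COMP', '20050110 06:50'),
       ('S3', 'COMP', '20050211 14:05'),
       ('S4', 'CAN',  '20050211 14:30'),   -- not completed, filtered out
       ('S5', 'COMP', '20041230 06:10');   -- prior year, filtered out

-- The hour of day is computed once per row, grouped on, and then used
-- again as the sort expression.
SELECT DATEPART(hour, ReceivedDateTime) AS HourOfDay,
       COUNT(*)                         AS Specimens
FROM @LabSpecimens
WHERE Status = 'COMP'
  AND ReceivedDateTime >= '20050101'
  AND ReceivedDateTime <  '20060101'
GROUP BY DATEPART(hour, ReceivedDateTime)
ORDER BY DATEPART(hour, ReceivedDateTime);
```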
- Only completed specimens that occurred this year
- This example sorts based on an expression (i.e., the computation of the hour of the day a specimen was received)
- This data can be used to schedule phlebotomists more accurately

ORDER BY Example: Create a T-SQL Statement that provides a count of outpatient visits after 1/1/2004 by calendar month and sorts the output by month & year properly.

Solution 1:
- Use the Year() and Month() functions, then sort the output using a concatenated string value representing the month and year. For this strategy to function properly, the sort value must be in the SELECT or GROUP BY statement.
- This solution works well when parameters are passed (e.g., from an Access Form or Crystal Parameter) and returns integer values for FY and FM.

Solution 2:
- This returns the first day of the month to represent all admissions in a given month.
- This returns a datetime datatype in the Month column
- The Transaction ServiceDateTime is converted to the first day of the month using Convert, first to a string value and then back to a datetime value.

UNION Example: Create a T-SQL Query that returns all Charge, Receipt, Adjustment, and Refund Transactions in a single result set.
- Charges
- Receipts, Adjustments, Refunds, etc.
- BarChargeTransactions and BarCollectionTransactions are similar tables, with the former holding charge data and the latter holding receipt, adjustment, refund, and other data. When bringing this data together into a result set, a UNION operator is often used.
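The shape of such a UNION can be sketched as below. The table variables stand in for `BarChargeTransactions` and `BarCollectionTransactions`; the columns, the `Type` values and the sample rows are assumptions. A literal in the first SELECT marks which branch each record came from.

```sql
-- Hypothetical stand-ins for the two B/AR transaction tables.
DECLARE @BarChargeTransactions TABLE (
    TransactionID   int         NOT NULL PRIMARY KEY,
    VisitID         varchar(30) NOT NULL,
    ServiceDateTime datetime    NOT NULL,
    Amount          money       NOT NULL
);
DECLARE @BarCollectionTransactions TABLE (
    TransactionID       int         NOT NULL PRIMARY KEY,
    VisitID             varchar(30) NOT NULL,
    Type                varchar(12) NOT NULL,   -- 'Receipt', 'Adjustment', 'Refund'
    TransactionDateTime datetime    NOT NULL,
    Amount              money       NOT NULL
);

INSERT INTO @BarChargeTransactions (TransactionID, VisitID, ServiceDateTime, Amount)
VALUES (1, 'V0001', '20050301 08:15', 250.00),
       (2, 'V0001', '20050301 09:40',  45.00);

INSERT INTO @BarCollectionTransactions (TransactionID, VisitID, Type, TransactionDateTime, Amount)
VALUES (3, 'V0001', 'Receipt',    '20050402 00:00', -200.00),
       (4, 'V0001', 'Adjustment', '20050402 00:00',  -50.00);

-- Column names come from the first SELECT; each branch supplies the same
-- number of columns with compatible datatypes.
SELECT 'Charge'           AS TransactionType,
       VisitID,
       TransactionID,
       ServiceDateTime    AS TransactionDateTime,
       Amount
FROM @BarChargeTransactions
UNION ALL
SELECT Type,
       VisitID,
       TransactionID,
       TransactionDateTime,
       Amount
FROM @BarCollectionTransactions
ORDER BY VisitID, TransactionDateTime, TransactionID;
```

UNION ALL is used here on the assumption that TransactionID is unique across both tables, so no duplicate removal is needed.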
- Note that it is possible to achieve the same output using a FULL OUTER JOIN, recognizing that the TransactionID is unique across the two tables, but we find this approach to be significantly less efficient

Advanced Concepts: SubQueries, #Temp Tables, Embedded Queries, Cursors, Parameters & Indexes

Non-Correlated Sub-Queries
- Non-Correlated Sub-Queries in the WHERE clause typically can also be written using a JOIN
- Non-Correlated Sub-Queries can also be in the SELECT statement

Non-Correlated Sub-Query
- Example: Create a T-SQL query that identifies all patients who have ever been in a room where any patient has ever been identified with the Microbiology Organism MRSA.
- Finds the home phone of the patient who has been in the location of interest
- The sub-query essentially creates a list of all locations where an MRSA infection has been isolated.
- Sub-Queries are enclosed in parentheses

Correlated Sub-Queries
- Correlated Sub-Queries use values in the “outer query” to derive the values in the “inner sub-query.”
- Example: Create a T-SQL Query that shows Charge Transaction procedures and the appropriate Relative Value Unit (RVU) associated with each charge transaction.
- Wherever an EffectiveDateTime is in use, a correlated sub-query of this kind is generally required to tie the date of the activity to the correct EffectiveDateTime.
- When an aggregate is used in a sub-query, it is generally not possible to emulate the same logic with a simple JOIN clause.
- Note that the effective date is always prior to the ServiceDateTime

Using Temporary Tables
- Temporary tables are extremely useful when a problem cannot be solved using a single query
- Temporary tables are efficient
- Temporary tables can be session specific (#T), or can persist while in use by multiple sessions (##T)
- Temporary tables can be used to break down complex queries into simpler, modular components
- Note, this is the same T-SQL statement we saw earlier in the UNION section.
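A compact sketch of the temp-table approach: a UNION loads a session-specific `#Transactions` table, which is then aggregated with CASE so each transaction type lands in its own column. The source table variables, the column names and the `Type` values are hypothetical.

```sql
-- Hypothetical, trimmed-down stand-ins for the two B/AR transaction tables.
DECLARE @Charges TABLE (
    AccountNumber varchar(20) NOT NULL,
    Amount        money       NOT NULL
);
DECLARE @Collections TABLE (
    AccountNumber varchar(20) NOT NULL,
    Type          varchar(12) NOT NULL,   -- 'Receipt', 'Adjustment', 'Refund'
    Amount        money       NOT NULL
);

INSERT INTO @Charges (AccountNumber, Amount)
VALUES ('A100', 250.00), ('A100', 45.00), ('A200', 80.00);

INSERT INTO @Collections (AccountNumber, Type, Amount)
VALUES ('A100', 'Receipt', -200.00), ('A100', 'Adjustment', -50.00),
       ('A200', 'Receipt',  -90.00), ('A200', 'Refund',      10.00);

-- Make the script re-runnable in the same session.
IF OBJECT_ID('tempdb..#Transactions') IS NOT NULL
    DROP TABLE #Transactions;

-- Step 1: load the combined transactions into a session-specific temp table.
SELECT 'Charge' AS TransactionType, AccountNumber, Amount
INTO #Transactions
FROM @Charges
UNION ALL
SELECT Type, AccountNumber, Amount
FROM @Collections;

-- Step 2: aggregate by account, pivoting each type into its own column.
SELECT AccountNumber,
       SUM(CASE WHEN TransactionType = 'Charge'     THEN Amount ELSE 0 END) AS Charges,
       SUM(CASE WHEN TransactionType = 'Receipt'    THEN Amount ELSE 0 END) AS Receipts,
       SUM(CASE WHEN TransactionType = 'Adjustment' THEN Amount ELSE 0 END) AS Adjustments,
       SUM(CASE WHEN TransactionType = 'Refund'     THEN Amount ELSE 0 END) AS Refunds,
       SUM(Amount)                                                          AS Balance
FROM #Transactions
GROUP BY AccountNumber
ORDER BY AccountNumber;

DROP TABLE #Transactions;
```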
- It brings charges together with receipts, adjustments, refunds, etc., but now loads this into a #temp table.
- The data is then aggregated by AccountNumber (visit) using the temp table, and displayed in separate result set columns using the CASE statement

Embedded Query
- Because a T-SQL statement returns a result set, and a result set is structured essentially the same as a table, the T-SQL statement itself can be used in the FROM clause of another query.
- We do not often use this method, as it is often more difficult to support than using #Temp tables to accomplish the same result.
- The embedded sub-query returns receipts and expected reimbursement by FC, insurance, and insurance order.
- This T-SQL statement, which you may have already developed, can easily be used as an embedded sub-query to solve the problem of identifying total outstanding balances by Financial Class.

T-SQL Cursors
- A SELECT statement returns a complete result set containing all the rows that meet the qualifications in the SELECT statement.
- Solutions that need to process a result set one row (or block of rows) at a time can utilize Cursors.
- A cursor typically loops through a result set and performs an operation (e.g., Update, Insert, etc.) on a table based on values in the records being processed by the cursor
- Cursors tend to be overused (especially by programmers and NPR report writers) as an easy solution
- Cursors are very inefficient, and we avoid using them unless absolutely necessary
- Example: We need to create a T-SQL statement that identifies the TOP 5 DRGs utilized by Admitting Provider (using the AbstractProvider table). That is, for any given provider, we want to list that provider's top 5 DRGs. This requires that the SELECT TOP 5 clause be repeated for each Admitting Provider. One way to solve this is through the use of a cursor.
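The cursor pattern for this problem can be sketched as follows. The table variable `@Abstracts` is a hypothetical, flattened stand-in for the abstracting data (one row per visit with its admitting provider and DRG); all column names and rows are assumptions.

```sql
-- Hypothetical source data: one row per visit.
DECLARE @Abstracts TABLE (
    VisitID         varchar(30) NOT NULL PRIMARY KEY,
    AdmitProviderID varchar(10) NOT NULL,
    FinalDRG        varchar(4)  NOT NULL
);

INSERT INTO @Abstracts (VisitID, AdmitProviderID, FinalDRG)
VALUES ('V1', 'SMITH', '127'), ('V2', 'SMITH', '127'), ('V3', 'SMITH', '089'),
       ('V4', 'JONES', '089'), ('V5', 'JONES', '143'), ('V6', 'JONES', '089');

-- Temporary table that receives the per-provider results.
IF OBJECT_ID('tempdb..#TopDRGs') IS NOT NULL
    DROP TABLE #TopDRGs;
CREATE TABLE #TopDRGs (
    ProviderID varchar(10) NOT NULL,
    DRG        varchar(4)  NOT NULL,
    Visits     int         NOT NULL
);

DECLARE @Provider varchar(10);

-- The cursor query identifies every admitting provider once.
DECLARE ProviderCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT AdmitProviderID
    FROM @Abstracts;

OPEN ProviderCursor;
FETCH NEXT FROM ProviderCursor INTO @Provider;

WHILE @@FETCH_STATUS = 0          -- 0 means the last fetch returned a row
BEGIN
    -- Repeat the TOP 5 query for the provider currently held by the cursor.
    INSERT INTO #TopDRGs (ProviderID, DRG, Visits)
    SELECT TOP 5 @Provider, FinalDRG, COUNT(*)
    FROM @Abstracts
    WHERE AdmitProviderID = @Provider
    GROUP BY FinalDRG
    ORDER BY COUNT(*) DESC;

    FETCH NEXT FROM ProviderCursor INTO @Provider;
END

CLOSE ProviderCursor;
DEALLOCATE ProviderCursor;

-- Return the accumulated results.
SELECT ProviderID, DRG, Visits
FROM #TopDRGs
ORDER BY ProviderID, Visits DESC;

DROP TABLE #TopDRGs;
```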
- Create a temporary table to receive the results
- The Cursor Query identifies all Admitting Providers
- @@FETCH_STATUS determines when the cursor has completed a pass through the result set
- Results are returned by SELECTing from the temp table
- The Cursor value (@Provider) is used in the filter to get the TOP 5 DRGs by Provider
- FETCH NEXT retrieves the next record in the result set

Stored Procedure Parameters
- The vast majority of T-SQL statements we construct are entered within Stored Procedures and contain parameters.
- Parameters allow the user running an Access, Crystal, or SRS report to configure the output based on input constraints.
- Highlighting code allows you to execute specific code in QA

Stored Procedure Parameters
- Syntax for modifying an SP. CREATE is used to create an SP
- Parameters must have a datatype defined
- Parameters can be used in any part of the T-SQL statement where the datatype of the parameter allows

Using Table Indexes
- Table Indexes can decrease T-SQL statement processing times
- Indexes trade off space and table update efficiency against query performance
- Build indexes on Joined and/or Filtered columns
- Covered Indexes take more space, but can ensure fast response time
- A rule of thumb is that the SQL Optimizer will filter results using an index if the result set is reduced by a factor of 6 to 20
- Query Analyzer can be used to check whether an index is put to use

- Manage Index options (from the Table context menu in Enterprise Manager) allow you to add, modify, or remove indexes
- IX_ColumnName(s) is a common nomenclature for naming indexes
- In general, when creating DR-based indexes, you will not check any of these options.

- Note that the ServiceDateTime index is NOT used, even though a filter on this column is specified
- Index SCANs require each row to be examined. Index SEEKs allow the optimizer to go directly to the row of interest
- The Estimated Execution Plan in Query Analyzer allows you to see how the SQL Optimizer will arrive at a result set solution.
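For readers who prefer script to the Enterprise Manager dialog, an index of this kind can also be created in T-SQL. The sketch below uses a hypothetical `#Charges` temp table (names and rows are assumptions) so it can be tried without touching DR tables. Whether the optimizer actually seeks on the index depends on its own cost estimate, so the execution plan should still be checked; with a handful of sample rows it may well choose a scan.

```sql
-- Hypothetical table with a clustered primary key, as DR tables have.
IF OBJECT_ID('tempdb..#Charges') IS NOT NULL
    DROP TABLE #Charges;

CREATE TABLE #Charges (
    TransactionID   int         NOT NULL PRIMARY KEY CLUSTERED,
    AccountNumber   varchar(20) NOT NULL,
    Name            varchar(60) NULL,
    ServiceDateTime datetime    NOT NULL,
    Amount          money       NOT NULL
);

INSERT INTO #Charges (TransactionID, AccountNumber, Name, ServiceDateTime, Amount)
VALUES (1, 'A100', 'DOE,JANE',    '20050301 08:15', 250.00),
       (2, 'A100', 'DOE,JANE',    '20050301 09:40',  45.00),
       (3, 'A200', 'ROE,RICHARD', '20050415 13:00',  80.00);

-- Nonclustered index on the filtered column, named IX_<ColumnName>.
CREATE NONCLUSTERED INDEX IX_ServiceDateTime
    ON #Charges (ServiceDateTime);

-- A narrow date-range filter on the indexed column. AccountNumber and Name
-- are not in the index, so using it also requires a lookup into the
-- clustered index to fetch them.
SELECT AccountNumber, Name
FROM #Charges
WHERE ServiceDateTime >= '20050301'
  AND ServiceDateTime <  '20050302';

DROP TABLE #Charges;
```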
- MEDITECH supplies a clustered index on every table, which determines the default sort order of data in the table

- Now the ServiceDateTime index IS used, with virtually the same Query
- An Index SEEK is very efficient
- IX_ServiceDateTime is now used by the optimizer; however, a Bookmark Lookup is also required to find the Account # and Name

System SPs & Extended SPs
- This SP uses XP_SendMail to notify individuals that an employee was terminated and asks them to perform certain tasks based on this information
- The email address list is derived from another T-SQL statement