Introduction to Analytical Modeling

Gregory V. Caliri
BMC Software, Inc.
Waltham, MA, USA

ABSTRACT

Analytical models are constructed and used by capacity planners to predict computing resource requirements related to workload behavior, content, and volume changes, and to measure the effects of hardware and software changes. Developing the analytical model provides the capacity planner with an opportunity to study and understand the behavior patterns of the work and hardware that currently exist. Certain factors must be taken into consideration to avoid common errors in model construction, analysis, and predictions.

Definition of Analytical Modeling

What is an analytical model? By pure definition, and in terms of being applied to computer systems, it is a set of equations describing the performance of a computer system (1). In practical terms, it describes a collection of measured and calculated behaviors of the different elements within the computer system over a finite period of time -- workloads, hardware, software, and the CPU itself -- and can even include the actions and behaviors of its users and support personnel.

In most instances, the capacity planner constructs the model using activity measurement information generated and collected during one or more time intervals. It is critical that an interval or series of intervals be used that contains significant volumes of business-critical activity. Units of work are then characterized by type and grouped into workloads. The capacity analyst can then translate future business requirements into measurable units of computing resource consumption, and calculate capacity and performance projections for workloads.

Purposes for building Analytical models

Some users will construct analytical models merely to gain an understanding of the current activity on the system and to measure performance and analyze the behavior of the workloads and hardware within it. Others will use them as a basis for predicting the behavior of certain elements of work within a system by inputting changes to different components of the system; these might include changes to faster or slower hardware, configuration changes, or increased, decreased, or altered workload volumes and arrival pattern changes. Some will even carry the use of an analytical model beyond entering changes to the current system or set of systems, and use it as input to a second model so as to measure the effects of combining two existing systems.

For most sites, the projection of capacity requirements and future performance are the objectives behind the capacity planning effort. In these "what-if" analysis situations, the capacity planner follows a process consisting of the following steps:

- Receives projections for future business computing requirements
- Translates those business requirements into data processing resource requirements, based on the information contained in the model and, if the model does not contain sufficient workloads with characteristics meeting those requirements, on other sources
- Calculates the status of the system after the new workload requirements have been input
- Reports results to management, listing any available options
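As a toy illustration of these steps, here is a hypothetical sketch of such a what-if pass. The workload names, consumption figures, and the 70% planning threshold are invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch of a "what-if" pass: a business projection is applied
# to one workload in a greatly simplified model and total CPU demand is
# recomputed. Workload names, figures, and the 70% planning threshold are
# illustrative assumptions only.
baseline_cpu_seconds_per_hour = {     # per-workload CPU demand from the model
    "online_orders": 1200.0,
    "batch_billing": 700.0,
    "other": 400.0,
}
cpu_capacity_seconds_per_hour = 3600.0   # one processor busy for a full hour

def what_if(growth):
    """growth maps workload name -> volume multiplier, e.g. {'online_orders': 1.30}."""
    projected = {w: secs * growth.get(w, 1.0)
                 for w, secs in baseline_cpu_seconds_per_hour.items()}
    utilization = sum(projected.values()) / cpu_capacity_seconds_per_hour
    return projected, utilization

projected, util = what_if({"online_orders": 1.30})    # 30% growth in order entry
print(f"projected CPU busy: {util:.0%}")
if util > 0.70:                                       # assumed planning threshold
    print("above the planning threshold -- report available options to management")
```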
Starting off

If you've never engaged in capacity planning, here is how to go about implementing the process in your enterprise. Very simply:

- Define and identify the purpose(s) for your modeling study
- Ensure that sufficient collection mechanisms and analytical tools are available to support model construction and analysis
- Characterize workloads according to a set of rules and definitions
- Identify the intervals of time and critical workloads for study
- Accept input and business requirements from your user community
- Establish a standard method to report results back to management

Let's review each of these steps.

Define the purpose for the modeling study

It is important to define exactly what the purpose is for building models and what their specific uses will be. Most will use the model to execute a series of "what-if" changes to the environment by making alterations to the analytical model -- workload volume increases or decreases, hardware changes, or the addition of new users and transactions. Performance results are then measured. As a parallel function, an analytical model can be used to model changes to the existing environment that will allow the analyst to tune the system for improved performance.

Refining the objective for the use of the model can also serve to streamline the process. For instance, are we only concerned about CPU capacity? Must we control the response time of certain mission-critical workloads? Is detailed modeling of database changes required? Will you be analyzing and tuning for typically heavy use periods, or only for peak periods? Each scenario listed would entail different levels of data collection and varying complexities in workload characterization. Of course, if an analytical model has reduced detail and very coarse granularity in its components, it will be less flexible and cannot be used to return highly specific results. Establishing the purpose(s) for the modeling study will affect the total approach taken to model construction, characterization of workloads, and the series of analytical iterations to be performed with the model. Definition of the modeling goal will also lead to increased confidence in the results of the study.

Data collection and retention; model construction

A data collection routine must be designed and implemented, and the data collected must be robust enough that appropriate records are available to identify all components and all pertinent workload activity. On an OS/390 (2) system, this would include all SMF (3) job- and task-related records (type 30s), all pertinent RMF (4) records for configuration, hardware activity, and workload activity (types 70 - 75, and type 78 records if collected), and all appropriate database and online activity monitor records (IMS DC Monitor (5), DB2 (6) related SMF, IMF (7), etc.). Some sites will find it impossible to generate, collect, and archive data with extreme granularity for extended periods of time. In these instances, it is recommended that prime intervals for modeling be identified early in the process and that the data from these periods be kept, even if certain monitoring instrumentation mechanisms have to be deployed.

Similar, but less detailed, data collection mechanisms exist on UNIX systems. Often the analyst must execute a series of UNIX commands, collect the output from those commands, and later generate reports and modeling input from that output. There are also several commercially available measurement and capacity planning tools; these packages provide their own collectors to generate measurement data that will permit creation of an analytical model.
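As one illustration of the UNIX command approach described above, the following minimal sketch samples CPU utilization with vmstat and keeps per-interval figures for later model construction. It assumes a Linux-style vmstat whose header row contains "us", "sy", and "id" columns; other UNIX variants (or sar-based collection) would need adjustments.

```python
# Minimal sketch: sample CPU utilization with vmstat for later model input.
# Assumes a Linux-style vmstat; column positions are located by header name.
import subprocess

def sample_cpu(interval_sec=60, samples=5):
    """Return one (user%, system%, idle%) tuple per measurement interval."""
    out = subprocess.run(
        ["vmstat", str(interval_sec), str(samples + 1)],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    header = next(line.split() for line in out
                  if "us" in line.split() and "id" in line.split())
    iu, isy, iid = header.index("us"), header.index("sy"), header.index("id")
    rows = [line.split() for line in out if line.split() and line.split()[0].isdigit()]
    return [(int(r[iu]), int(r[isy]), int(r[iid])) for r in rows[1:]]  # row 0 is since-boot

if __name__ == "__main__":
    for user, system, idle in sample_cpu(interval_sec=5, samples=3):
        print(f"user={user}%  system={system}%  idle={idle}%")
```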
Organize and characterize workloads according to a set of rules and definitions

This is probably the most difficult task because it is highly subjective. As with the other steps, errors made here can be carried forward through the process and cause improper results. To begin workload characterization, you must study all units of work in the enterprise at an extremely granular, low level. This will give the capacity planner an understanding of system activity and behavior patterns. If data collection was set up properly, this should be possible. Classify work according to its type -- batch, online, TSO, query transactions, long / short transactions, utilities, long and short processes. As part of the previous step, you should have already made computing activity and resource consumption trackable and identifiable. The mission-critical workload definitions should already be roughly established.

From this point, begin to classify work and build workloads by the type of work that it is, and do so from a system activity standpoint. In an OS/390 system, batch should be classified as short, long, and "hot", and further grouped as to the service it provides. For instance, production batch serving the business might be placed in one set of workloads, while internal work of some type would be placed in others; online database transactions should be identified and grouped not only by their production or test role but also by their function. The most important rule to follow in this area is to ensure that your model contains a robust workload mix, with the most critical workloads executing at a significant and typical activity level. A baseline analytical model should come reasonably close to representing a realistic situation.

Some will attempt to perform the capacity planning process by classifying work by user communities or account codes. This approach is only valid if the work within each user group or accounting code group is also classified as to the type of work and placed into its own workloads. Erroneous projections are often produced when user counts are employed: this approach assumes that additional users will exhibit exactly the same behavior and execute work with the same distributions and resource consumption compositions as the existing user community.

Identify the intervals of time and critical workloads for study

When selecting an appropriate interval of time to measure and input into the construction of analytical models, observe the following:

1) Attempt to select a period of high, but not completely saturated, system utilization.
2) Keep in mind your objectives for modeling, and ensure that the model contains all of the critical workloads to be measured and observed.
3) Do not use intervals of time that contain anomalies of activity, such as looping processes, crashed regions, application outages, and other factors that are likely to cause unrealistic measurements.
4) The mix of workloads and their activity will change from one time of day to another. In OS/390 mainframe systems, this is rather common; often there will be a high volume of online, real-time transactional processing during the standard business day and a concentration of batch work during the evening hours. Situations containing the same variances can exist on other platforms as well. In such instances, select two or more intervals for modeling that have different mixes of work and build separate models.
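As a trivial illustration of the interval-selection guidance above, the hypothetical sketch below scans hourly CPU-busy figures and flags candidate modeling intervals that are busy but not saturated; both the thresholds and the figures are assumptions for illustration only.

```python
# Hypothetical sketch: flag hours that are busy but not saturated as
# candidate modeling intervals. Thresholds and figures are illustrative.
def candidate_intervals(cpu_busy_by_hour, low=60.0, high=90.0):
    """Return hours whose CPU-busy percentage is high but below saturation."""
    return [hour for hour, busy in cpu_busy_by_hour.items() if low <= busy < high]

observed = {
    "09:00": 72.5,   # online peak
    "10:00": 88.1,
    "11:00": 95.4,   # near-saturated: avoid
    "14:00": 64.0,
    "22:00": 81.3,   # evening batch window
}
print(candidate_intervals(observed))
# -> ['09:00', '10:00', '14:00', '22:00']
```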
Accept input and business requirements from your user community

Obviously, there are many methods of internal communication, and sometimes these are dictated by corporate culture. One suggested method for receiving input from users is to hold a monthly meeting with a representative from each of your user communities. This meeting can be used by the capacity planner to receive input from, and deliver feedback to, user groups and to explain the current state of the enterprise in plain language. There is also a side benefit to this meeting: different user groups can communicate with each other on upcoming projects. Often duplication of effort is eliminated because two or more groups determine that they are doing the same work and, with a cooperative effort, save system development time and use fewer computing resources. It is also an excellent opportunity to release, distribute, and explain the monthly or quarterly performance and capacity plan to users.

Establish a standard method to report results back to management

Capacity planners often issue a monthly or quarterly report to management. The report should be straightforward and offer brief explanations of the performance results of critical workloads. There should also be a report on the state of the enterprise's capacity, with capacity and performance expectations based on growth projections. Revisions to the capacity plan and the reasons for them should also be included.

One mistake often made is the inclusion of too much irrelevant information in reports or presentations. In most cases, upper management personnel have neither the time nor the interest to wade through technical jargon and attempt its translation. Use of visuals can cut through the technological language barrier. Often the capacity planner gets into a quandary: he or she has to provide a high-level report for management and executives, but may also be challenged by technical personnel to explain the report in technical terms. In such instances, you must have the technical detail on hand and make it available to those who wish to see it. You will be asked for it at some point, and it may be advisable to distribute the high-level report and extend an invitation to your audience to read the extended technical report. If actions must be taken, executives often wish to have a variety of viable options, and the benefits and consequences of each, put before them. Avoid presenting management with only one possible solution to a problem, and refrain from presenting options that are not practical to implement.

Queuing theory and its role in Analytical Modeling

The mathematical basis for many analytical modeling studies is the application of queuing theory. In plain English, it is a mechanism for reflecting the length of time that a task waits to receive service; queue lengths and wait times are calculated from the speed at which a unit providing service (a "server", not to be confused with a file server) can handle requests and the number of requests to be processed. If one thinks of a single device -- for instance, a disk, a channel, or a CPU -- as a "server", the following formula can be applied to determine the average response time for a transaction to be handled at that one point of service. This formula follows from basic queuing theory and is commonly derived using Little's Law.
Rt = response time, or the time from when the transaction enters the queue until the request is satisfied
S = service time, or the amount of time that the server itself spends handling the request
Tx = the transaction arrival rate, or the number of transactions arriving for service per unit of time

The formula:

Rt = S / (1 - (Tx * S))

If Tx * S is equal to or greater than one, the server is considered to be saturated, as transactions are arriving at the server at a greater rate than the server can handle.

To demonstrate this formula, let's assume that a serving CPU can service a request in 50 milliseconds, or .05 second. We can then take transactions per hour and divide by 3600 to obtain transactions per second. Using the formula, we can input transaction counts and determine where the response time will degrade noticeably, and where the server will saturate.

With lower arrival rates, the response time hovers very close to the service time; there is very little queuing taking place up to 20,000 transactions per hour. However, when the total is doubled to 40,000 per hour, the queuing time grows to roughly 63 milliseconds, and transactions are spending more time queued for service than they are actually receiving service. The queuing time rises at an ever more rapid rate as more transactions are input to the server. The rightmost column lists the response time calculations if the service time were improved and reduced to 30 ms; with the shorter service time, the queues would be shorter, and the degradation in response time at 72,000 transactions per hour would not be noticeable as it is with the 50 ms service time.

Analysis of a single server's response time by arrival rate; service time is constant at 50 ms (rightmost column: service time reduced to 30 ms).

Trans/Hr   Trans/Sec   Pct. Server Busy   Response Time (S = .05s)   Queue Time   Response Time (S = .03s)
 10000      2.778       13.89              0.058                      0.008        0.033
 11000      3.056       15.28              0.059                      0.009        0.033
 12000      3.333       16.67              0.060                      0.010        0.033
 13000      3.611       18.06              0.061                      0.011        0.034
 14000      3.889       19.44              0.062                      0.012        0.034
 15000      4.167       20.83              0.063                      0.013        0.034
 16000      4.444       22.22              0.064                      0.014        0.035
 17000      4.722       23.61              0.065                      0.015        0.035
 18000      5.000       25.00              0.067                      0.017        0.035
 19000      5.278       26.39              0.068                      0.018        0.036
 20000      5.556       27.78              0.069                      0.019        0.036
 40000     11.111       55.56              0.113                      0.063        0.045
 60000     16.667       83.33              0.300                      0.250        0.060
 70000     19.444       97.22              1.800                      1.750        0.072
 71000     19.722       98.61              3.600                      3.550        0.073
 71500     19.861       99.31              7.200                      7.150        0.074
 72000     20.000      100.00              Saturated                  Saturated    0.075

What should be evident is that there is a definite point where the response time begins to curve markedly upward. Now, one must consider that a process traveling through various points of service in a computing system will have to undergo some type of queuing at each point. The time spent in all of these queues, plus the service time spent at each point of service, comprises the response time for a single process or transaction. Mathematical formulae exist for the explanation and calculation of a process in a multi-point system, but they are beyond the scope of an introductory paper. The extended explanation of the above formula and its practical application with multiple points of service can be found at its source: a paper by Dr. Jeffrey Buzen, "A Simple Model of Transaction Processing", contained in the 1984 CMG Proceedings (1).
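For readers who want to reproduce the arithmetic behind the table, here is a minimal sketch of the single-server calculation using the formula given above; the arrival rates and the 50 ms / 30 ms service times mirror the example.

```python
# Minimal sketch of the single-server formula Rt = S / (1 - X*S), where S is
# the service time in seconds and X the arrival rate in transactions/second.
def response_time(service_s, trans_per_hour):
    """Return response time in seconds, or None once the server saturates."""
    arrival_rate = trans_per_hour / 3600.0      # transactions per second
    utilization = arrival_rate * service_s      # fraction of the server busy
    if utilization >= 1.0:
        return None                             # saturated: queue grows without bound
    return service_s / (1.0 - utilization)

for tph in (10_000, 20_000, 40_000, 60_000, 70_000, 72_000):
    r50 = response_time(0.05, tph)
    r30 = response_time(0.03, tph)
    r50_txt = "saturated" if r50 is None else f"{r50:.3f}s"
    print(f"{tph:>6} trans/hr: R(50ms)={r50_txt}  R(30ms)={r30:.3f}s")
```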
Modeling methodologies other than analytical

Two other modeling methods are often used to determine the current status of computing systems and to model changes to them. The first is the use of experimental models. Experimentation is performed by measuring existing situations, or even by creating new situations and measuring the resulting performance and the percentage of capacity used. Benchmark workloads are run on new hardware and/or software environments and true performance measurements are collected.

When commercial computing environments were much smaller than they are today, it was relatively easy to simulate an actual business environment. Indeed, the "stress test" was a commonplace occurrence within the MIS world: a number of individuals were handed a scripted series of instructions to follow at a certain time, and performance results were measured at the conclusion of the test. It is still the most accurate method of computer performance prediction. However, several problems arise in running experiments. In today's world of MIS, the volumes of transactions processed are so high that it is impossible in many cases to obtain a true experimental reading of what might happen after changes are executed. How many individuals would be needed to enter 50,000 online terminal transactions in an hour, or to generate hits on a web server from varied locations, to duplicate the effort? It is possible to simulate the operations of several hundred terminals using scripted keystroke files; however, it may not be practical to perform such simulations with thousands of terminals. Furthermore, with today's 24/7/365 expectations, an enterprise may not have the machine, time, and personnel resources with which practical experimentation can be performed.

One of the research documents that the author encountered discussed the prospect of experimentation and, with a touch of humor, conveyed that some experiments are unfeasible for safety reasons. He cited two prime examples. One was the scenario of a jetliner, carrying a full load of passengers, attempting a landing with one engine shut off. The other was the possibility of driving a nuclear reactor to the point of critical mass so that researchers could definitively prove where that point actually occurred! (7) It is conceded that stress testing or overloading a computer system to determine its points of performance degradation would not carry the possibility of disaster that these experiments would, but one would wish to avoid it nonetheless.

Another modeling technique in use today, and gaining popularity in some areas of computing, is simulation. The following quotation provides a down-to-earth explanation of how simulation models work:

"The simulation model describes the operation of the system in terms of individual events of the individual elements in the system. The interrelationships among the elements are also built into the model. Then the model allows the computing device to capture the effect of the elements' action on each other as a dynamic process." (Kobayashi, 1981) (8)

In short, this states that a simulation model describes the workloads, and the different components of each, as well as the results of the continuous interaction of the different components of the computer system as time proceeds.
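To make the quotation concrete, the following minimal sketch simulates a single server as a sequence of individual arrival and service-completion events. The exponential arrival and service distributions are assumptions chosen so the result can be compared with the analytical figure computed earlier; they are not prescribed by the paper.

```python
# Minimal event-driven simulation sketch: one server, transactions handled
# first-come first-served. Exponential arrivals/service are assumptions.
import random

def simulate(arrival_rate, mean_service, n_transactions, seed=1):
    """Return the mean response time over n_transactions."""
    random.seed(seed)
    clock = server_free_at = total_response = 0.0
    for _ in range(n_transactions):
        clock += random.expovariate(arrival_rate)                         # arrival event
        start = max(clock, server_free_at)                                # queue if the server is busy
        server_free_at = start + random.expovariate(1.0 / mean_service)   # service-completion event
        total_response += server_free_at - clock
    return total_response / n_transactions

# About 40,000 transactions/hour (11.1/sec) against a 50 ms server; the result
# should land near the 0.113 s predicted analytically in the previous section.
print(f"mean response ~ {simulate(11.1, 0.05, 200_000):.3f}s")
```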
Several factors make it difficult to use simulation models for larger systems that contain multiple workloads and devices. The most notable reasons are that there are too many variances in the behavior of the different devices and units of work that compose the workloads over a period of time, and that the arrival of the transactions used as input to the simulation will probably have an uneven distribution. This leads us to consideration of the less complex, but highly effective, analytical modeling technique.

It must also be noted that for several years simulation modeling has been a proven, effective methodology in less complex analyses such as network traffic modeling and prediction. Because network traffic generally has fewer variances in its composition, and packets generally do not interact with each other, simulation techniques can be applied there in a practical fashion. There have also been other applications and hybrid techniques developed through the years; one is called "Simalytic Modeling", and it combines both analytical and simulation techniques in the same model. An excellent paper (Norton) on this hybrid methodology is noted in the recommended reading section below.

Future of Analytical Modeling

Analytical models, and the products that use analytical analysis and queuing theory to create and analyze them and to report and predict performance, will continue to enjoy widespread use. In large-scale computer systems running applications that contain a high degree of variation in activity, analytical modeling will remain a highly practical method of analysis because of its relative simplicity and practicality. New technologies have emerged that will force changes in methods of model construction and "what-if" exercises. Internal and operational architecture changes in the mainframe arena will lead to a complete revision of the modeling paradigm, but analytical modeling should continue to serve the mainframe realm.

Actual stress test experiments will likely not be as prevalent as they were in the past for larger interactive applications, simply because of the large-scale efforts required to plan for them and the human and system resources required to execute them. However, stress testing will certainly still be used to benchmark hardware and vendor software, and even for batch cycle testing.

Simulation has been used for many years to provide a more detailed analysis of systems with workload components that contain limited variability. Simulation has come into play, and will continue to do so, in the world of the Internet. With the rise of E-commerce, simulation modeling appears to be a viable method for modeling web-based, multi-platform, and network applications.

Conclusion

This paper has touched upon several high-level areas of capacity planning and, specifically, the use of an analytical model as a primary tool. While analytical modeling is but one method in use today, different platforms and applications may require other approaches for effective capacity planning. The availability of statistical data, the platforms used for processing, and the objectives and complexities of studies can dictate the methodology to be used for capacity planning.

References

(1) Buzen, Dr. Jeffrey P., "A Simple Model of Transaction Processing", CMG Proceedings, 1984.
(2), (3), (4), (5) OS/390, RMF, SMF, and IMS DC Monitor are trademarks of IBM Corporation, White Plains, NY.
(6) IMF is a trademark of BMC Software, Inc., Houston, TX.
(7) Extracted from an Internet WWW home page (http://staff.um.edu.mt/jskl/simul.html), which is an extract from the text of "Simulation", by J. Skelnar, University of Malta, 1995.
(8) Kobayashi, Hisashi, "Modeling and Analysis: An Introduction to System Performance Evaluation Methodology", The Systems Programming Series, Reading, MA: Addison-Wesley Publishing Company, 1984. (quote attributed by Norton, below)
Recommended reading - in addition to the above

"Using Analytical Modeling to Ensure Client/Server Application Performance", Leganza, Gene, Cayenne Systems, CMG Proceedings, 1996 (and other works by Leganza in CMG Proceedings that deal with stress testing).

"Simalytic Enterprise Modeling - The Best of Both Worlds", Norton, Tim R., Doctoral Candidate, Colorado Technical University, CMG Proceedings, 1996 (and many other works by Norton found in CMG Proceedings through the years).