CARNEGIE MELLON Department of Electrical and Computer Engineering

Database Performance Evaluation Methodology for Design Automation Systems: A Case Study of MICON

Jason C. Y. Lee
Advisor: Daniel P. Siewiorek
Additional Committee Member: Zary Segall
Master's Thesis, August 1992

Abstract

Design automation systems such as MICON rely heavily on database systems for design information retrieval. In our current configuration, over 90% of the system synthesis time is spent on accessing the database. Here we address the significance of the database system by considering three database issues: first, a way to compare the performance of different database systems while embedded in a computer-aided design (CAD) system; second, a way to predict database performance when synthesizing different designs; and finally, a way to identify key areas for improvement resulting in improved performance of the CAD system. The methodology was applied to two different relational databases coupled to the MICON system. The result was over a factor of three speedup in total database processing time, yielding over a factor of two reduction in MICON run time.

1.0 Introduction

Database systems are found in a variety of computer applications. Not surprisingly, Design Automation (DA) tools are among those applications which require the ability to store and rapidly access large amounts of data. One such design automation system, MICON [Birmingham, Gupta, Siewiorek 1992], synthesizes designs at the system level. The set of MICON tools allows for the capture of hardware design knowledge and the synthesis of new systems based on its accumulated knowledge. The synthesis tool, M1, is a knowledge-based system which designs by the method of synthesis-by-composition. Figure 1 shows a sample M1 input and output.
Synthesis-by-composition involves traversing a hierarchy of objects and putting together objects that fit the required criteria. The selection of objects relies heavily on retrieval of information from the database. Section 1 of this report describes the relationship of the database to the MICON synthesis system, while Section 2 presents measurements on the time spent in various phases of the design of a personal computer and a process control computer. Section 3 presents a performance model for databases inspired by the instruction-frequency performance models commonly used to describe computer workstations. Section 4 demonstrates the use of the performance model in predicting database performance. Section 5 illustrates how simple modifications can be made to remove performance bottlenecks, yielding over a factor of two improvement in system performance. Finally, Section 6 draws some conclusions about the research.

1.1 Database Organization

The database which MICON uses is built upon the table-based relational model [Korth, Silberschatz 1991]. Each row in a table represents a relationship among the values in the row. Expressions which identify sets of information are formulated in relational algebra and relational calculus. These expressions are converted into database queries (requests for information). SQL (Structured Query Language) is an ANSI standard that is based on both relational algebra and relational calculus.
Figure 1 Sample M1 Input (top) and Schematic for M1's Output Netlist (bottom). The input is a specification form listing total board area (square inches), board cost (nearest dollar), allowed power dissipation (milliwatts), whether a coprocessor, SIO, timer, or keyboard controller is needed, the processor family, and the processor name.

MICON currently uses the Informix Relational Database System and interacts with the database through SQL queries. Thus MICON's information is organized into tables. For example, MICON has a part table, since its basic object is a part. In MICON, parts can be abstract or physical: a two-input NAND gate would be an abstract part, while a 74F00 would be a physical part. The part table contains information about the name of the chip, an associated id, the package type, etc. Other tables contain the characteristics of the parts, the function of the parts, the pin information, and so on.

One way to represent the logical structure of the information is through an entity-relationship diagram (E-R diagram) illustrating entities and the relationships between them [Korth, Silberschatz 1991]. An entity is an object that exists and is distinguishable from other objects [Korth, Silberschatz 1991]. In an E-R diagram, entities are drawn in boxes, relationships in diamonds, and attributes in ovals. A directed line from relationship A to entity B means that there is a one-to-one or many-to-one relationship from A to B. An undirected line means that there is a many-to-many or one-to-many relationship. Figure 2 is an E-R diagram for part of the MICON database. The diagram shows that the entities are parts, characteristics, functions, specifications, logical pins, and physical pins. Each entity has associated attributes; some of the key attributes have been included here. For example, the part entity has a unique id, a chip name, an id for shared characteristics, and an id to map a physical pin to a logical pin. The relationships between the entities are shown by the diamonds. For example, the function entity has a relationship with the part entity but not with the characteristic entity.
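As a minimal sketch of this table-based organization, the following uses SQLite to model a simplified part table. The column set is illustrative only, following the attributes named above; the actual MICON schema is richer and uses Informix, not SQLite.

```python
import sqlite3

# Build an in-memory relational store with a simplified "part" table,
# following attributes mentioned in the text (id, chip name, package type).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE part (
        id          INTEGER PRIMARY KEY,
        chip_name   TEXT,        -- e.g. 'NAND2' (abstract) or '74F00' (physical)
        is_abstract INTEGER,     -- 1 for abstract parts, 0 for physical parts
        pack_type   TEXT
    )
""")
conn.execute("INSERT INTO part VALUES (1, 'NAND2', 1, NULL)")
conn.execute("INSERT INTO part VALUES (2, '74F00', 0, 'DIP14')")

# A query of the kind M1 issues: fetch a part's id by chip name.
row = conn.execute(
    "SELECT id, chip_name FROM part WHERE chip_name = ?", ("74F00",)
).fetchone()
print(row)  # (2, '74F00')
```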
The MICON tools interact with the database through a client-server model built on IPC sockets (Figure 3). None of the tools makes direct calls to the database. A set of client routines is provided to allow for simplified interaction with the database. The client routines package the requests into buffers which are sent through IPC packets. This interface provides a way to specify the type of information (part, characteristics, pins, etc.), the specific attributes desired (chip name, id, type, etc.), and the qualifications that are desired (part function is a NAND, etc.).

Figure 2 E-R Diagram for Part of the MICON Database

Figure 3 Client-Server Model

The actual database access is through a server process that interprets the client's requests and transforms them into database queries. The results are then repackaged into a buffer and sent back to the client. Thus, besides spending time on its own processing, a client program must wait for the network, wait for the queries to be built and the results buffered, and wait for the database to be accessed. Section 2 provides data on the relative time spent during design in each of these phases.

2.0 Motivation

The synthesis tool, M1, makes extensive use of the database in order to select parts during design. It was speculated that database access time was a significant part of M1's execution time, but this had not been completely verified. By measuring the time for the different parts of the system, the impact of the database on the overall performance could be clearly recorded. The measurements were taken from M1 designing a CGA card¹ (Figure 4). This design contained 45 physical parts,
23 abstract parts, and 80 nets. 91.5% of M1's overall execution time was spent waiting for the database, with 6.4% in other processing, 1.3% in the network, and 0.8% in buffering and query building. Of the total 665 seconds, 9 seconds were spent on the network and 5 seconds were spent building the queries and buffering the results. Only 43 seconds were actually used by M1.

Figure 4 M1 Execution Time (CGA Card Design)

1. All measurements were taken from a Sun 3 running SunOS.

Thus the overall synthesis time is a direct function of database performance. By way of comparison, a significantly different commercial database system, the Sybase Relational Database System, proved to be faster than Informix (Figure 5). With Sybase, the overall time was reduced 63%, to 246 seconds. The database time was 190 seconds, while the time for the other components stayed constant. Yet the database time was still the largest component of the execution time: 77.3%, against 16.9% other, 3.9% network, and 1.9% buffering and query building (Figure 6). Section 3 provides a model for estimating the time spent in the database as a function of query types and frequencies.

Figure 5 M1 Execution Time Comparison

Figure 6 M1 Execution Time with Sybase (CGA Card Design)

3.0 Database Evaluation Model

In the comparison of Informix and Sybase, Sybase clearly required less execution time for the CGA card design. Why is one database better than another? Without a better way to compare database systems, whether one database is always better than another is unclear.
3.1 Number of Queries Per Second (QPS)

The simplest model involves the average time per query for each database system. A queries-per-second (QPS) rating can be obtained by dividing the total number of queries by the time it took to do them. This would be similar to a MIPS (millions of instructions per second) rating, which is used in computer processor evaluation [Hennessy, Patterson 1990]. But, of course, like MIPS this type of number does not give a true indication of performance: QPS depends on the query mix, the types of queries, and the size of the tables being accessed. For the CGA card design, 605 seconds were spent accessing the database with Informix and 195 seconds with Sybase. During this time 517 queries were made, resulting in a QPS rating of 0.855 for Informix and 2.651 for Sybase.

3.2 An Alternative Comparison Method

A more comprehensive way of evaluating database performance is to generate a histogram of the query types, calculating the average time for each query. The average time per query, TPQ, is analogous to the cycles per instruction used in computer performance evaluation. Next, measurements are taken of the frequency of each type of query. This query mix is analogous to an instruction mix for a processor. By multiplying each query count by its corresponding TPQ, one can compare how different database systems would perform for each query type. The query types which are executed the most should have the smallest TPQs.

3.3 M1 Results

We applied this comparison method to our system with Informix and with Sybase. First we found the breakdown of query types. M1 uses fourteen types of queries (Figure 7).
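The two comparison methods above can be sketched as follows. The QPS figures use the measurements reported in Section 3.1; the per-type counts and TPQs in the second half are illustrative placeholders, not the measured values.

```python
# QPS (Section 3.1): total queries divided by total database time.
# CGA card design figures: 517 queries; 605 s on Informix, 195 s on Sybase.
def qps(num_queries, total_seconds):
    return num_queries / total_seconds

print(round(qps(517, 605), 3))  # 0.855 (Informix)
print(round(qps(517, 195), 3))  # 2.651 (Sybase)

# TPQ-weighted comparison (Section 3.2): multiply each query type's count
# by its average time per query, then sum over all types.
def total_db_time(query_counts, tpq):
    return sum(query_counts[t] * tpq[t] for t in query_counts)

counts = {7: 150, 9: 40, 13: 30}       # hypothetical query mix
tpq_a  = {7: 3.15, 9: 0.10, 13: 0.50}  # hypothetical TPQs, system A
tpq_b  = {7: 0.75, 9: 0.50, 13: 0.30}  # hypothetical TPQs, system B
print(total_db_time(counts, tpq_a) > total_db_time(counts, tpq_b))  # True
```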
Figure 7 Example of Each M1 Query Type. The fourteen types are SELECT statements over the part, characteristic, function, pins_logic, and unified_char tables, retrieving attributes such as id, chip_name, manu_name, cid, and sval, with WHERE clauses that may contain nested subselects.

These types of queries reflect the need for the specific kinds of information necessary for part selection. The average time for each query type was measured for both database systems (Figure 8). The measurements were taken by starting a timer before a query was sent to the database system and stopping it after the results were stored in buffers; the average time was then taken from these measurements. Notice that performance varied from query type to query type and from database to database. For example, although the queries are ranked by decreasing time on Informix, query type 9 required five times longer on Sybase than on Informix.
Figure 8 Average Query Time per Type for the CGA Card Design

Now consider the mix of queries for the CGA card design (Figure 9). Query type 7 is executed the most. Also notice that there are distinct ranges in which the query counts tend to cluster. This is because, when M1 is looking for a part, there are certain attributes that are always needed; thus, for any particular M1 request for information, a certain number of queries must be done every time. Finally, by taking the query count of each type of query and multiplying by the TPQ for that query, the system performance under query load can be observed (Figure 10). Then, by summing over all query types in Figure 10 for each database system, the total database times can be compared. Section 4 illustrates how to predict the time spent in a database system.

Figure 9 Query Mix for the CGA Card Design

Figure 10 Total Time per Query Type for the CGA Card Design

4.0 Execution Time Prediction

It is often useful to predict the execution time of a program given a set of inputs. For example, if the execution time for M1's design of the CGA card using Informix and Sybase is known, how can the time spent in the database when designing a 68008-based single board computer be predicted?
4.1 Prediction Method

By using the TPQ and the query histogram, the database execution time for other sets of inputs can be predicted. The total execution time for each query type is simply the TPQ multiplied by the query count. The total database execution time is then:

    Execution Time = Σi (TPQi × Qi)

where Qi is the number of times query type i is executed. This type of analysis can be especially useful when performance on another database system must be predicted before porting the whole application over to it. The query counts can be collected from the current system, and a simple driver program can be written to collect the TPQ for each query type on the other system. The execution time formula can then be used to predict the total database execution time on the other system.

4.2 M1 Results

Using this method, the database execution time for M1 to synthesize a 68008-processor-based single board computer system was predicted. This design contained 33 physical parts, 104 abstract parts, and 92 nets. Figure 11 is the query mix for the design, obtained by running M1 with the Informix database system. Figures 12 and 13 compare the predicted and the actual database execution times for each database system. The predicted times were calculated by multiplying the TPQs obtained from the CGA card design by the 68008-based computer design query mix.
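The prediction procedure of Section 4.1 can be sketched as follows. The per-type TPQs and query counts here are illustrative placeholders only, not the measured CGA or 68008 figures.

```python
# Predicted database time = sum over query types of TPQ_i * Q_i (Section 4.1).
# TPQs come from a reference design on the target database; the query mix
# for the new design is collected on the current system.
def predict_db_time(tpq, query_mix):
    return sum(tpq[t] * q for t, q in query_mix.items())

# Hypothetical per-type values, for illustration only.
tpq_informix = {7: 3.15, 10: 0.9, 13: 0.6}  # seconds per query, per type
mix_68008    = {7: 250, 10: 60, 13: 45}     # query counts for the new design

predicted = predict_db_time(tpq_informix, mix_68008)
print(round(predicted, 1))  # 868.5
```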
Figure 11 Query Mix for the 68008-based Computer Design

Figure 12 Actual and Predicted Performance for the 68008 Design (Informix)

Figure 13 Actual and Predicted Performance for the 68008 Design (Sybase)

Using the execution time formula given in Section 4.1, an overall execution time of 1068 seconds for Informix and 351 seconds for Sybase was predicted. The actual time for Informix was 1059 seconds, which means that the prediction was less than 1% higher than the actual time. The actual time for Sybase was 347 seconds, so the prediction was 1.2% higher. Figure 14 compares the query times of Informix and Sybase for the 68008-based computer design. Now that the relative time spent in each query type is known, optimizations can be sought to improve overall performance.
Figure 14 Total Time per Query Type for the 68008-based Computer Design

5.0 Performance Improvement

Database system evaluation and database performance prediction are both very useful. In addition, database performance can be improved if the performance bottleneck can be identified.

5.1 Identifying Areas for Improvement

Looking at database performance in terms of TPQ alone does not help identify the areas for improvement, because it does not identify the common case. Amdahl's Law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used [Hennessy, Patterson 1990]. Thus the queries where most of the time is spent should be the places with the most potential benefit. By multiplying TPQ by the query mix, the total time per query type can be examined and the queries which consume the most time identified. By focusing attention on the time-consuming queries, the greatest performance improvement can be found. The speedup due to the performance improvement is:

    Speedup = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

Fraction_enhanced is the fraction of execution time in the current system that can be improved. Speedup_enhanced is the current execution time of the portion to be improved divided by its improved execution time.

5.2 M1 Results

From Figures 10 and 14, it is obvious that query type 7 was the best candidate for improvement. Query 7 is a query whose WHERE clause looks for either of two possible values, one of which is itself another query.
One possibility is to break up the compound statement into several different queries. The result of making this change was more than a 26% decrease in TPQ for query type 7 in Sybase: the TPQ went from 0.75 seconds to 0.55 seconds. An even more significant result was a 95% decrease in TPQ for query type 7 in Informix: the TPQ went from 3.15 seconds to 0.15 seconds (Figure 15). For the CGA card design, the overall design time with Informix went from 10.92 minutes down to 3.95 minutes (Figure 16), a 63.7% decrease. For Sybase, the overall design time went from 4.15 minutes down to 3.69 minutes (Figure 17), an 11.1% decrease. For the 68008-based computer design, the overall design time with Informix went from 19.5 minutes down to 7.4 minutes, a 62.1% decrease (Figure 18). For Sybase, the overall design time went from 7.7 minutes down to 6.9 minutes, a 10.4% decrease (Figure 19).
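The speedup calculation of Section 5.1 can be sketched as follows, using the fraction-enhanced and speedup-enhanced values reported for the CGA card design in Section 5.2.

```python
# Amdahl's-Law speedup (Section 5.1): overall speedup given the fraction of
# execution time that can be improved and the local speedup of that fraction.
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# CGA card design: with Informix, query type 7 was 73.7% of the query time
# and became 20.8x faster after the compound query was split; with Sybase,
# it was 55.2% of the query time and became 1.36x faster.
print(round(amdahl_speedup(0.737, 20.8), 2))  # 3.35 (Informix)
print(round(amdahl_speedup(0.552, 1.36), 2))  # 1.17 (Sybase)
```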
Figure 15 Improved Query 7 Average Time (current vs. improved, Sybase and Informix)

Figure 16 CGA Card Design Improvement Comparison (Informix)

Figure 17 CGA Card Design Improvement Comparison (Sybase)

Figure 18 68008-based Design Improvement Comparison (Informix)

Figure 19 68008-based Design Improvement Comparison (Sybase)

Using the speedup equation in Section 5.1, the speedup from the improvement in the CGA card design with Informix was 3.35 times:

    Speedup_CGA(Informix) = 1 / ((1 - 0.737) + 0.737 / 20.8) = 3.35

The fraction enhanced of 0.737 was calculated by dividing the total time for query type 7 in the current version by the total query time for all queries. The speedup enhanced of 20.8 was calculated by dividing the current average time for query type 7 by the average time for the improved query type 7. The speedup with Sybase was 1.17 times:

    Speedup_CGA(Sybase) = 1 / ((1 - 0.552) + 0.552 / 1.36) = 1.17

The speedup from the improvement in the 68008-based computer design with Informix was 3.19 times:

    Speedup_68008(Informix) = 1 / ((1 - 0.721) + 0.721 / 20.8) = 3.19

The speedup with Sybase was 1.16 times:
    Speedup_68008(Sybase) = 1 / ((1 - 0.522) + 0.522 / 1.36) = 1.16

6.0 Conclusions

Analyzing the frequency of individual query types provides insight into use of the database, as well as enabling performance prediction and use of the system with a new type of database. The TPQ and query count for each query type identified the places on which to focus improvement efforts. Even for a stable system such as M1, the resulting improvement in execution time proved to be a major event in its evolution. The performance enhancement for a mature tool like M1 indicates potential gains in other database-intensive tools.

Substantial performance improvement resulted from splitting a compound query. Future research should analyze the cause of the performance gain and what other types of gains might be possible. Work in this area may give a better understanding of what kinds of queries should be used in designing CAD systems.

7.0 Bibliography

[Birmingham, Gupta, Siewiorek 1992] W.P. Birmingham, A.P. Gupta, and D.P. Siewiorek, "Automating the Design of Computer Systems: The MICON Project", Jones and Bartlett Publishers, Inc., Boston, MA, 1992.

[Hennessy, Patterson 1990] J.L. Hennessy and D.A. Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1990.

[Korth, Silberschatz 1991] H.F. Korth and A. Silberschatz, "Database System Concepts", McGraw-Hill, Inc., New York, NY, 1991.