LDBC SOCIAL NETWORK BENCHMARK (SNB) INTERACTIVE WORKLOAD

# BENCHMARK OVERVIEW

## Introduction

<introduction>

# Definitions

ACID: the transactional properties of Atomicity, Consistency, Isolation and Durability.

Application: The term Application or Application Program refers to code that is not part of the commercially available components of the SUT, but is used specifically to implement the Workload (see Section 3.3) of this specification. For example, stored procedures, triggers, and referential integrity constraints are considered part of the Application Program when used to implement any portion of the Workload, but are not considered part of the Application Program when solely used to enforce integrity rules (see Section XX) or transparency requirements (see Section XX) independently of any Transaction.

Application Recovery: the process of recovering the business application after a Single Point of Failure and reaching a point where the business meets certain operational criteria.

Application Recovery Time: the elapsed time between the start of Application Recovery and the end of Application Recovery (see Section XX).

Arbitrary Transaction: a Database Transaction that executes arbitrary operations against the database at a minimum isolation level of L0 (see Section XX).

Commit: a control operation that:

- is initiated by a unit of work (a Transaction);
- is implemented by the DBMS;
- signifies that the unit of work has completed successfully and that all tentatively modified data are to persist (until modified by some other operation or unit of work).

Upon successful completion of this control operation, both the Transaction and the data are said to be Committed.

Database Management System (DBMS): a collection of programs that enable storing, modifying, and extracting information from a database.

Database Metadata: information managed by the DBMS and stored in the database to define, manage and use the database objects, e.g. tables, views, synonyms, value ranges, indexes, users, etc.

Database Recovery: the process of recovering the database after a Single Point of Failure.

Database Recovery Time: the duration from the start of Database Recovery to the point when the database files complete recovery.

Database Session: To work with a database instance, to make queries or to manage the database instance, a Database Session must be opened. This can happen as follows:

- The user logs on to the database with a user name and password, thus opening a Database Session. Later, the Database Session is terminated explicitly by the user or closed implicitly when the timeout value is exceeded.
- A database tool implicitly opens a Database Session and then closes it again.

Database Transaction: an ACID unit of work.

Data Growth: the space needed in the DBMS data files to accommodate the increase in the Growing Tables resulting from executing the Transaction Mix at the Reported Throughput during the period of required Sustainable performance.

DATE: represents the data type of date with a granularity of a day and must be able to support the range of January 1, 1800 to December 31, 2199, inclusive. DATE must be implemented using a Native Data Type.

DATETIME: represents the data type for a date value that includes a time component. The date component must meet all requirements of the DATE data type. The time component must be capable of representing the range of time values from 00:00:00 to 23:59:59.
Fractional seconds may be implemented, but are not required. DATETIME must be implemented using a Native Data Type.

Driver: To measure the performance of the OLTP system, a simple Driver generates Transactions and their inputs, submits them to the System Under Test, and measures the rate of completed Transactions being returned. To simplify the benchmark and focus on the core transactional performance, all application functions related to user interface and display functions have been excluded from the benchmark. The System Under Test is focused on portraying the components found on the server side of a transaction monitor or application server.

Durability: See Durable.

Durable / Durability: In general, state that persists across failures is said to be Durable, and an implementation that ensures state persists across failures is said to provide Durability. In the context of the benchmark, Durability is more tightly defined as the SUT's ability to ensure that all Committed data persist across any Single Point of Failure.

Durable Medium: a data storage medium that is inherently non-volatile, such as a magnetic disk or tape. Durable Media is the plural of Durable Medium.

Executive Summary Statement: The term Executive Summary Statement refers to the Adobe Acrobat PDF file in the ExecutiveSummaryStatement folder in the FDR. The contents of the Executive Summary Statement are defined in Clause 9.

FDR: The FDR is a zip file of a directory structure containing the following:

- a Report in Adobe Acrobat PDF format;
- an Executive Summary Statement in Adobe Acrobat PDF format;
- an XML document ("ES.xml") with approximately the same information as in the Executive Summary Statement;
- the Supporting Files consisting of various source files, scripts, and listing files.

Requirements for the FDR file directory structure are described below. Comment: The purpose of the FDR is to document how a benchmark Result was implemented and executed in sufficient detail so that the Result can be reproduced given the appropriate hardware and software products.

Foreign Key: A Foreign Key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables. A link is created between two tables by adding the column or columns that hold one table's Primary Key values to the other table. This column becomes a Foreign Key in the second table.

Full disclosure report: See FDR.

Initial Database Size: Initial Database Size is measured after the database is initially loaded with the data generated by the SNDG. Initial Database Size is any space allocated to the test database which is used to store a database entity (e.g. a row, an index, Database Metadata), or used as formatting overhead by the data manager.

May: The word "may" in the specification means that an item is truly optional.

Measured Throughput: The Measured Throughput is computed as the total number of Valid Transactions within the Measurement Interval divided by the duration of the Measurement Interval in seconds.

Measurement Interval: the period of time during Steady State chosen by the Test Sponsor to compute the Reported Throughput.

Must: The word "must", or the terms "required", "requires", "requirement" or "shall", in the specification means that compliance is mandatory.

Must not: The phrase "must not" or the term "shall not" in the specification means that this is an absolute prohibition of the specification.
Native Data Type: A Native Data Type is a built-in data type of the DBMS whose documented purpose is to store data of a particular type described in the specification. For example, DATETIME must be implemented with a built-in data type of the DBMS designed to store date-time information.

Performance Metric: the LDBC-SNB Reported Throughput, expressed in tps. This is known as the Performance Metric.

Price/Performance Metric: the LDBC-SNB total 3-year pricing divided by the Reported Throughput (price per tps). This is also known as the Price/Performance Metric.

Primary Key: A Primary Key is a single column or combination of columns that uniquely identifies a row. None of the columns that are part of the Primary Key may be nullable. A table must have no more than one Primary Key.

Referential Integrity: Referential Integrity preserves the relationship of data between tables, by restricting actions performed on Primary Keys and Foreign Keys in a table.

Report: The term Report refers to the Adobe Acrobat PDF file in the Report folder in the FDR. The contents of the Report are defined in Section XX.

Response Time:

Results: LDBC-SNB Results are the Performance Metric and the Price/Performance Metric.

RT: See Response Time.

Scale Factor:

Should: The word "should", or the adjective "recommended", means that there might exist valid reasons in particular circumstances to ignore a particular item, but the full implication must be understood and weighed before choosing a different course.

Should not: The phrase "should not", or the phrase "not recommended", means that there might exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

Social Network Dataset Generator: See SNDG.

SNDG: an application responsible for providing the data sets used by the SNB. This data set generator is designed to produce directed labeled graphs that mimic the characteristics of graphs of real data.

SUT: See System Under Test.

System Under Test: The System Under Test (SUT) is defined to be the database system where the benchmark is executed.

Test Sponsor: The Test Sponsor is the company officially submitting the Result with the FDR and will be charged the filing fee. Although multiple companies may sponsor a Result together, for the purposes of the LDBC processes the Test Sponsor must be a single company. A Test Sponsor need not be an LDBC member. The Test Sponsor is responsible for maintaining the FDR with any necessary updates or corrections. The Test Sponsor is also the name used to identify the Result.

Test Run: the entire period of time during which Drivers submit and the SUT completes Transactions other than Trade-Cleanup.

Transaction Mix:

Valid Transaction: The term Valid Transaction refers to any Transaction for which input data has been sent in full by the Driver, whose processing has been successfully completed on the SUT and whose correct output data has been received in full by the Driver.

Workload:

# BENCHMARK SCHEMA AND DATA MODEL

## Introduction

<description of the data schema>

## Data schema implementation

SNB may be implemented with different data models, e.g. relational, RDF and different graph data models. The reference schema is provided as RDFS and SQL. The data generator produces TTL syntax for RDF and comma-separated values for other data models.
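To make the mapping concrete, the sketch below shows one possible relational declaration of a person table and a post table. It is an illustration only, under assumed names: the tables, columns and types are hypothetical and are not the normative reference schema; they merely show native SQL data types that can satisfy the attribute requirements listed next.

```sql
-- Illustrative sketch only: hypothetical table and column names, not the
-- normative LDBC SNB reference schema.
CREATE TABLE person (
    p_id        BIGINT PRIMARY KEY,   -- identifier: BIGINT comfortably covers 2^50 distinct values
    p_firstname VARCHAR(40) NOT NULL, -- name string with a declared maximum length
    p_birthday  DATE                  -- native date type
);

CREATE TABLE post (
    ps_id           BIGINT PRIMARY KEY,                        -- identifier
    ps_creatorid    BIGINT NOT NULL REFERENCES person (p_id),  -- foreign key to the creating person
    ps_creationdate TIMESTAMP NOT NULL,                        -- native datetime, at least one-second resolution
    ps_content      CLOB                                       -- long string: often short in the data, but up to 1 MB must fit
);
```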
A single attribute has a single data type, as follows:

- Identifier – an integer value foreign key, or a URI in RDF. If this is an integer column, the implementation data type should support at least 2^50 distinct values.
- Datetime – a datetime should support a date range from 0000 to 9999 in the year field, with a resolution of no less than one second.
- String – the string column for names may have a variable length and may have a declared maximum length, e.g. 40 characters.
- Long string – for example, a post content may be a long string that is often short in the data but may not declare a maximum length and must support data sizes of up to 1 MB.

A single attribute in the reference schema may not be divided into multiple attributes in the target schema.

A schema on the DBMS is optional. An RDF implementation, for example, may work without one. An RDF implementation is allowed to load the RDF reference schema and to take advantage of the data type and cardinality statements therein.

A relational or graph schema may specify system-specific options affecting storage layout. These may for example specify vertical partitioning. Vertical partitioning means anything from a column store layout with per-column allocated storage space to the use of explicit column groups. Any mix of row-wise or column-wise storage structures is allowed as long as this is declaratively specified, data structure by data structure. Data structure here means, for example, a table or an index.

Covering indices and clustered indices are allowed. If these are defined, then all replications of data implied by them must be maintained statement by statement, i.e. each auxiliary data structure must be consistent with any other data structures of the table after each data manipulation operation.

A covering index is an index which materializes a specific order of a specific subset, or possibly all, of the columns of a table. A clustered index is an index which materializes all columns of a table in a specific order, which order may or may not be that of the primary key of the table. A clustered or covering index may be the primary or only representation of a table. Any subset of the columns of a covering or clustered index may be used for ordering the data. A hash-based index, or a combination of hash-based and tree-based indices, is allowed, in row-wise, column-wise or hybrid forms.

# BENCHMARK WORKLOAD

## Introduction

<description of the data workloads>

## Workload implementation and data access transparency

The base unit of work is the insertion of one post. For each post inserted, the following number of other operations must be completed: <add operations>

For a run to qualify, the number of successfully executed operations must not deviate from the above frequencies by more than 1%.

The queries and updates may be implemented in a declarative query language or as procedural code using an API. If a declarative query language is used, e.g. SPARQL or SQL, then explicit query plans are prohibited in all the read-only queries. The update transactions may still consist of multiple statements, effectively amounting to explicit plans.

Explicit query plans include but are not limited to:

- Directives or hints specifying a join order or join type
- Directives or hints specifying an access path, e.g. which index to use
- Directives or hints specifying an expected cardinality, selectivity, fanout or any other information that pertains to the expected number of results or cost of all or part of the query.
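As an illustration of the distinction, the two read-only queries below retrieve the same result from the hypothetical post table of the earlier sketch. The first, plain declarative form would be allowed; the second adds an access-path directive and is the kind of explicit plan that is prohibited. The hint uses Oracle-style /*+ ... */ comment syntax purely for illustration; hint syntax varies by DBMS, and the index name is made up.

```sql
-- Allowed: plain declarative read-only query; the optimizer is free to choose the plan.
SELECT ps_id, ps_creationdate
FROM post
WHERE ps_creatorid = :person_id          -- :person_id is a placeholder bind parameter
ORDER BY ps_creationdate DESC
FETCH FIRST 20 ROWS ONLY;

-- Prohibited in read-only queries: the same query with an explicit access-path directive.
SELECT /*+ INDEX(post post_creator_idx) */ ps_id, ps_creationdate
FROM post
WHERE ps_creatorid = :person_id
ORDER BY ps_creationdate DESC
FETCH FIRST 20 ROWS ONLY;
```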
## Auxiliary data structures and pre-computation

Q2 retrieves for a person the latest n posts or comments posted by a contact of the person. An implementation may choose to pre-compute this 'top of the wall' query. If doing so, inserting any new post or comment will add the item in question to the materialized top-k post views of each of the contacts of the person. If, after insertion, this list is longer than 20 items, the transaction deletes the oldest item. If this pre-computation is applied, the update of the 'top of the wall' materialization of the users concerned must be implemented as a single transaction. This pre-computation may be implemented as client-side logic in the test driver, as stored procedures or as triggers. In all cases the operations, whether one or many, must constitute a single transaction. A SPARQL protocol operation consisting of multiple statements may be a valid implementation if the SUT executes the statements as a single transaction. Other pre-computation of query results is explicitly prohibited.

## ACID Compliance

The interactive workload requires full ACID support from the SUT.

Atomicity. All the updates in a transaction must either all take place or all be cancelled.

Consistency. If a database object, e.g. a table, has auxiliary data structures, e.g. indices, the content of these must be consistent after the commit or rollback of a transaction. If multiple client application threads share one transaction context, these may transiently see inconsistent states, e.g. there may be a time when an insert of a row is reflected in one index of a table but not in another.

Isolation. If a transaction reads the database with intent to update, the DBMS must guarantee that repeating the same read within the same transaction will return the same data. This also means that no more and no fewer rows may be returned. In other words, this corresponds to snapshot or to serializable isolation. This level of isolation is applied for the operations where the transaction mix so specifies. If the database is accessed without a transaction context or without intent to update, then the DBMS should provide read committed semantics, e.g. repeating the same read may produce different results, but these results may never include the effects of pending uncommitted transactions.

Durability. The effects of a transaction must be made durable against instantaneous failure before the SUT confirms the successful commit of a transaction to the application. For systems using a transaction log, this implies syncing the durable media of the transaction log before confirming success to the application. This will typically entail group commit, where transactions that fall in the same short window are logged together, and the logging device will typically be an SSD or battery-backed RAM on a storage controller. For systems using replication for durability, this will entail receipt of a confirmation message from the replicating party before confirming successful commit to the application.

# TEST DRIVER

## Introduction

A qualifying run must use the SNB test driver provided with the data generator. The test driver may be modified by the test sponsor for purposes of interfacing to the SUT. The parameter generation, result recording and workload scheduling parts of the test driver should not be changed. The auditor needs access to the test driver source code used to produce the driver used in the reported run.
The test driver produces the following artifacts as a by-product of the run: <add products>

The start and end time of each execution are recorded in real time with microsecond precision, including the identifier of the operation and any substitution parameters. If the operation is an update taken from the simulation timeline, then the timestamp in simulation time is also recorded.

The test driver is scale-out capable. Many instances of the test driver may be used in a test run. The number and configuration of the test drivers must be disclosed, along with hardware details of the platform running the driver(s), together with details of the network interface connecting the drivers to the SUT. The SUT hardware may also be used for hosting the driver(s), at the discretion of the test sponsor.

A separate test summary tool provided with the test driver analyzes the test driver log(s) after a measurement window is completed. The tool produces for each of the distinct queries and transactions the following summary:

- Count of executions
- Minimum/average/90th percentile/maximum execution time
- Start and end date of the window in real time and in simulation time
- Metric in operations per second at scale (ops) (throughput rating)
- Number of test drivers
- Number of database sessions (threads) per test driver
- Parallelism settings

The test driver(s) produce(s) the following information concerning the test driver configuration:

- Number of test drivers.
- For each test driver:
  - Test driver name and version.
  - IP address of the SUT. If not applicable, e.g. an in-process API, this specifies in-process; for a localhost connection it specifies localhost:port.
  - Maximum number of concurrently pending operations on the SUT from this test driver, e.g. connection pool size, worker thread pool size.
  - Scheduling mode: for the SNB test driver, the scheduling mode, i.e. synchronous, window of n seconds, etc.
  - For the SNB test driver, the maximum length of stall, i.e. the amount of real time during which no progress was made in the update stream. This is a configuration setting; the actual stall times show up in the metrics.
  - For the SNB driver, the measured rate of simulation time to real time, e.g. 30 min real for 300 min simulated. This is calculated based on the earliest and latest operations, in both simulation time and real time, started and completed during the measurement window.
  - Update throughput setting: for the SPB driver, the update operation target throughput in editorial operations per second for this test driver process. May be fractional.

# EXECUTION RULES AND METRICS

## Introduction

A benchmark execution is divided into the following steps:

- Data Preparation – This includes running the data generator, placing the generated files in a staging area, configuring storage, setting up the SUT configuration and preparing any data partitions in the SUT. This may include pre-allocating database space but may not include loading any data or defining any schema having to do with the benchmark.
- Bulk Load – This includes defining the database schema, if any, loading the initial database population, making this durably stored, and gathering any optimizer statistics. The bulk load time is reported and is equal to the amount of elapsed wall clock time between starting the schema definition and receiving the confirmation message of the end of statistics gathering.
- Benchmark Run – The run begins after the bulk load or after another benchmark run. If the run does not directly follow the bulk load, it must start at a point in the update stream that has not previously been played into the database.
  In other words, a run may only include update events whose timestamp is later than the latest post creation date in the database prior to the start of the run. The run starts when the first of the test drivers sends its first message to the SUT. If the SUT is in-process with the driver, the window starts when the driver starts.
- Measurement Window – The measurement window is the timed portion of the benchmark run. It may begin at any time during the run. The activity during the measurement window must meet the criteria set forth in Query Mix and must include enough updates to satisfy the criteria in Minimum Measurement Window. The measurement window is terminated at the discretion of the test sponsor at any time when the Minimum Measurement Window criteria are met. All the processes constituting the SUT are to be killed at the end of the window, or alternatively all the hardware components of the SUT are to be powered off.
- Recovery Test – The SUT is to be restarted after the measurement window and the auditor will verify that the SUT contains the entirety of the last update recorded by the test driver(s) as successfully committed.

## Scaling and database population rules

The scale factor of an SNB dataset is the number of simulated users. All dataset scales contain data for three years of social network activity. The validation scale is 10,000 users. Official SNB results may be published at this scale, at the 30,000 user scale, or at a power of 10 multiple of either, for example 100,000, 300,000, 1,000,000, 3,000,000 and so forth.

The dataset is divided into a bulk-loadable initial database population and an update stream. These are generated by the SNB data generator. The data generator has options for splitting the dataset into any number of files. The update stream contains the latest 10% of the events in the simulated social network. These events form a single serializable sequence in time. Some events will depend on preceding events, for example a post must exist before a reply to the post is posted. The data generator guarantees that such dependent events are separated by at least 2 minutes of simulation time. The update stream may be broken into arbitrarily many sub-streams. Valid partitioning criteria are, for example, the user performing the action, if any, or a round-robin division where each successive event goes into a different sub-stream. When two events occur in the same sub-stream, their order must be the same as in the original update stream.

Rationale: The authors are aware that the prevalent practice for online benchmarks is to tie the reported throughput to the scale, e.g. max 12.5 tpmC per warehouse in TPC-C. The authors depart from this practice here because with throughput tied to scale, test systems with interesting throughputs rapidly become very expensive, raising the entry barrier for publishing a result. It is thought that scaling in buckets lowers the barrier of entry and reduces the incentive to use hardware configurations that would be unusual in a production environment.

## Performance metrics and response time

### Response time

The Response Time (RT) is defined by RTn = eTn - sTn, where:

- sTn and eTn are measured at the Driver;
- sTn = time measured before the first byte of input data of the Transaction is sent by the Driver to the SUT; and
- eTn = time measured after the last byte of output data from the Transaction is received by the Driver from the SUT.

The resolution of the time stamps used for measuring Response Time must be at least 0.01 seconds.
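As a purely illustrative example with made-up timestamps: if the Driver sends the first byte of a Transaction's input at sTn = 125.73 seconds and receives the last byte of its output at eTn = 125.91 seconds, then RTn = 125.91 - 125.73 = 0.18 seconds, a value that a timer with the minimum required 0.01 second resolution can represent exactly.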
### Throughput

Operations per second at scale, e.g. opsSI@300G. The metric is calculated for a run that satisfies the minimum length, per-query minimum execution count and query mix criteria. Each completed operation counts as one operation. The metric is the count of successful operations divided by the elapsed time in seconds.

### SNB Interactive Metric

The metric consists of a power score, a throughput score and a composite metric which is the geometric mean of the two. The power score is the geometric mean of the execution times of the update set and queries, scaled to operations per hour. The throughput test length is the length of elapsed time between the start of the first query stream of the throughput test and the end of the last stream or update set of the throughput test, whichever is later. The throughput score is the number of queries, excluding update sets, completed during the throughput test, scaled to queries per hour. The BI metrics are multiplied by the scale factor. The rationale is that a run with 10x more data is likely to take approximately 10x longer on the same platform. The multiplication is used so as not to have sharply decreasing metrics when scale increases. Note that this is not done with the interactive metric, as we expect a sub-linear increase in query time as scale increases.

## Workload execution

The SNB BI workload is measured similarly to TPC-H, consisting of a power run and a throughput run. The workload is divided into query streams and update sets. A query stream has one execution of each query, as specified in the workload definition. The power run consists of one update set followed by one query stream. The throughput run consists of a test-sponsor-dependent number of query streams. For each query stream, one update set will be processed. The scheduling of query streams and update sets is as specified in the workload definition, similar to TPC-DS.

For both power and throughput runs, the following is recorded:

- Name of run, e.g. power/throughput
- Sequence number of run, e.g. 0 if the first run of its type
- Start time
- End time

The start time is before the submission of the first request to the SUT and the end time is after complete receipt of the last result pertaining to the run from the SUT. The last result is the last result of the last query or update set belonging to the run, whichever is later.

For each query stream and update set, the following is recorded:

- Type of operation, e.g. update set or query stream
- Number of the query stream or update set
- Start time
- End time

For each operation in each query stream/update set, the following is recorded:

- Number of the stream/update set
- Sequence number in the query stream. Note that different streams run different permutations of the query set, hence the sequence number does not follow from the query.
- Name of the query, e.g. bi/query2.txt if queries are read from template files, else the name of the file containing the query implementation as code, one file per query
- Value of each substitution parameter
- Executable text of the query (if instantiated from a template)
- Start time
- End time
- First 100 rows of the result set. This is for documentation and may be text, CSV or another convenient human-readable format.

## Minimum Measurement Window

The measurement window starts at the point in simulation time given by the timestamp of the post with the highest creation date. The update stream must contain at least 35 days' worth of events in simulation time that are in the future from the date-time of the latest pre-run post in the database.
The minimal measurement window corresponds to 30 days' worth of events in simulation time, as generated by the data generator. Thus the number of events in the minimal window is linear in the scale factor and grows the database by approximately 3%.

When the test sponsor decides to start the measurement window, the time of the latest new post event successfully completed by any of the test drivers is taken to be the start of the measurement window in simulation time. The window may be terminated by the test sponsor at any time when the timestamp of the latest new post event in the log of any of the test drivers is more than 30 days of simulation time later than the starting timestamp. The test summary tool (see below) may be used for reading the logs being written by a test driver.

## Checkpoints

A checkpoint is defined as the operation which causes data persisted in a transaction log to become durable outside of the transaction log. Specifically, this means that a SUT restart after instantaneous failure following the completion of the checkpoint may not have recourse to transaction log entries written before the end of the checkpoint. A checkpoint typically involves a synchronization barrier at which all data committed prior to that moment is required to be in durable storage that does not depend on the transaction log.

Not all DBMSs use a checkpointing mechanism for durability. For example, a system may rely on redundant storage of data for durability guarantees against instantaneous failure of a single server.

The measurement window may contain a checkpoint. If the measurement window does not contain one, then the restart test will involve redoing all the updates in the window as part of the recovery test.

# PRICING

# FULL DISCLOSURE

## Introduction

<introduction>

All of the information included in a Full Disclosure Report is gathered into a single file per test execution. The test execution is an SNB Interactive, an SNB BI or an SPB execution. Other types will be added as appropriate. The different types of executions share certain common items. Note that since a BI run is specified to start after a bulk load, a BI run must precede the interactive run if both are made from the same dataset and share a bulk load. The implication for data generation is that update events may be produced in a BI update set format first and then in a streaming format for the interactive mix.

## Full Disclosure Report

The full disclosure report consists of the following:

- Name and contact information of the test sponsor
- Description of the DBMS
- Description of the SUT hardware
- Description of additional resources used for driving the test, if separate from the SUT hardware
- 3-year total cost of ownership for the SUT, including hardware and software
- Metric and numerical quantities summary produced by the test driver summary function
- Attestation letter by the auditor
- Supporting Files

The Supporting Files is a file archive (e.g. tar or zip) with the following:

- Complete configuration files for the DBMS
- Configuration files for the test driver(s)
- Any operating system tuning parameters with non-default values in effect at the time of the run, for example sysctl.conf on Linux
- Copy of the test driver log(s) for the run
- Complete modifications of the test driver source code, if any

## Dataset description

The following is recorded for the dataset description:

- Scale factor
- Data format, e.g. CSV, TTL, etc.
- Data generator name and version
- Number of interactive events generated for the interactive update run
- Number of update sets generated for the BI run
- Datetime of the start of data generation

This information is produced by the data generator and put in a log file. It is included in the overall run report by the script assembling the parts.

## Bulk load description

- Start of bulk load
- End of bulk load
- Count of persons, knows, posts, comments, and tags of posts and comments after completion of the bulk load

## DBMS description

A test implementation extracts the following information about the DBMS:

- Vendor name
- DBMS name
- Version number

These may be extracted with the appropriate information calls of ODBC/JDBC, a SPARQL endpoint description or other DBMS-dependent means.

## Platform description

The following lists the automatically extracted features of a platform when running an LDBC benchmark. The measures are in this example taken from a Linux system, but the corresponding metrics may be extracted from most other systems.

- Operating system, e.g. the result of uname -a
- CPU type, e.g. Xeon E52550@2.00GHz – the heading of the CPU description in /proc/cpuinfo
- Number of threads: the count of CPU description entries in /proc/cpuinfo
- Number of cores: the siblings count in a CPU description in /proc/cpuinfo
- Memory: the total amount of RAM as given by free (in MB)
- Number of disks: the number of distinct /dev/sd?* devices, counting distinct letters in ?
- Total disk capacity: the sum of total space from df, excluding NFS-mounted file systems and RAM-based file systems, e.g. tmpfs
- System configuration: /etc/sysconf file contents for Linux

In the case of a cluster, this description is repeated for each of the cluster nodes. These will typically be identical but may not always be so.

## Interactive workload description

- Start of test run as datetime
- Start of measurement window as datetime
- Time of instantaneous failure stopping the measurement window as datetime
- Time of DBMS restart following SUT cold boot
- Time of test driver restart
- Time of first successful connection from test driver to SUT
- End time of first successfully executed workload operation
- Time of reaching steady state. Steady state is considered reached at the end of the first continuous two-minute window for which the throughput is over 90% of the throughput reported for the first measurement period.

## Metrics reported description

The following lists the metrics reported per operation for the operations started and successfully completed during the measurement window. For each distinct type of query/update operation:

- Name of the operation, e.g. int/query1.txt
- Template text – if the query is instantiated from a template, this is the template text without parameter substitution
- Count of successful executions
- Shortest/average/90th percentile/longest execution time in microseconds

If the query is read from a template in the file system, the name of the operation references the template; otherwise it references the file name of the file containing the query implementation.

### Throughput Over Time

For each two-minute window during the whole execution of each test driver instance, the following is recorded:

- Start time of the window
- Flag indicating whether the window is in the measurement interval
- Throughput measured inside the window as operations per second, counting operations started and successfully completed in the window. Operations that start in one window and end in another are not counted here but are counted in the complete run.
The test driver emits this information at the end of each window. This allows the test sponsor to determine whether the system is in steady state. The test sponsor may signal the start of the measurement window to the test driver; this may be done, for example, by copying a file to the directory of the test driver.

A graphic documenting the sustained throughput of the SUT during the interactive workload can be prepared from this data. This shows the warm-up behavior before the measurement window as well as during the restart following the cold boot of the SUT. This graphic is included in the FDR, similarly to TPC-C.

## Machine readable FDR report

A benchmark run is described by a machine-readable data structure, e.g. RDF in a TTL file. This document specifies what information this data structure should contain. Different sections of a benchmark run contribute content to this description. Most of this information is produced by the benchmark tools themselves; some is produced by test-sponsor-supplied code run as part of the benchmark run.

# AUDIT