LDBC SOCIAL NETWORK BENCHMARK (SNB) INTERACTIVE WORKLOAD
# BENCHMARK OVERVIEW
## Introduction
<introduction>
## Definitions
ACID : the transactional properties of Atomicity, Consistency, Isolation and Durability.
Application : The term Application or Application Program refers to code that is not part of
the commercially available components of the SUT, but used specifically to implement the
Workload (see Section 3.3) of this specification. For example, stored procedures, triggers,
and referential integrity constraints are considered part of the Application Program when
used to implement any portion of the Workload, but are not considered part of the
Application Program when solely used to enforce integrity rules (see Section XX) or
transparency requirements (see Section XX) independently of any Transaction.
Application Recovery: the process of recovering the business application after a Single
Point of Failure and reaching a point where the business meets certain operational criteria.
Application Recovery Time: The elapsed time between the start of Application Recovery
and the end of Application Recovery (see Section XX).
Arbitrary Transaction: An Arbitrary Transaction is a Database Transaction that executes
arbitrary operations against the database at a minimum isolation level of L0 (see Section
XX).
Commit: a control operation that:
- is initiated by a unit of work (a Transaction);
- is implemented by the DBMS;
- signifies that the unit of work has completed successfully and that all tentatively modified data are to persist (until modified by some other operation or unit of work).
Upon successful completion of this control operation both the Transaction and the data are said to be Committed.
Database Management System (DBMS): A DBMS is a collection of programs that enable you
to store, modify, and extract information from a database.
Database Metadata: information managed by the DBMS and stored in the database to
define, manage and use the database objects, e.g. tables, views, synonyms, value ranges,
indexes, users, etc.
Database Recovery: the process of recovering the database from a Single Point of Failure
system failure.
Database Recovery Time: the duration from the start of Database Recovery to the point
when database files complete recovery.
Database Session: To work with a database instance, to make queries or to manage the
database instance, you have to open a Database Session. This can happen as follows: The
user logs on to the database with a user name and password, thus opening a Database
Session. Later, the Database Session is terminated explicitly by the user or closed implicitly
when the timeout value is exceeded. A database tool implicitly opens a Database Session
and then closes it again.
Database Transaction: A Database Transaction is an ACID unit of work.
Data Growth: the space needed in the DBMS data files to accommodate the increase in the
Growing Tables resulting from executing the Transaction Mix at the Reported Throughput
during the period of required Sustainable performance.
DATE: represents the data type of date with a granularity of a day and must be able to
support the range of January 1, 1800 to December 31, 2199, inclusive. DATE must be
implemented using a Native Data Type.
DATETIME: represents the data type for a date value that includes a time component. The
date component must meet all requirements of the DATE data type. The time component
must be capable of representing the range of time values from 00:00:00 to 23:59:59.
Fractional seconds may be implemented, but are not required. DATETIME must be
implemented using a Native Data Type.
Driver: To measure the performance of the OLTP system, a simple Driver generates
Transactions and their inputs, submits them to the System Under Test, and measures the
rate of completed Transactions being returned. To simplify the benchmark and focus on the
core transactional performance, all application functions related to user interface and
display functions have been excluded from the benchmark. The System Under Test is
focused on portraying the components found on the server side of a transaction monitor or
application server.
Durability: See Durable.
Durable / Durability: In general, state that persists across failures is said to be Durable and
an implementation that ensures state persists across failures is said to provide Durability.
In the context of the benchmark, Durability is more tightly defined as the SUT's ability to
ensure all Committed data persist across any Single Point of Failure.
Durable Medium: a data storage medium that is inherently non-volatile such as a magnetic
disk or tape. Durable Media is the plural of Durable Medium.
Executive Summary Statement: The term Executive Summary Statement refers to the
Adobe Acrobat PDF file in the ExecutiveSummaryStatement folder in the FDR. The contents
of the Executive Summary Statement are defined in Clause 9.
FDR: The FDR is a zip file of a directory structure containing the following:
- A Report in Adobe Acrobat PDF format,
- An Executive Summary Statement in Adobe Acrobat PDF format,
- An XML document ("ES.xml") with approximately the same information as in the Executive Summary Statement,
- The Supporting Files consisting of various source files, scripts, and listing files.
Requirements for the FDR file directory structure are described below.
Comment: The purpose of the FDR is to document how a benchmark Result was
implemented and executed in sufficient detail so that the Result can be reproduced
given the appropriate hardware and software products.
Foreign Key: A Foreign Key (FK) is a column or combination of columns used to establish
and enforce a link between the data in two tables. A link is created between two tables by
adding the column or columns that hold one table's Primary Key values to the other table.
This column becomes a Foreign Key in the second table.
Full disclosure report: See FDR.
Initial Database Size: Initial Database Size is measured after the database is initially loaded
with the data generated by SNDG. Initial Database Size is any space allocated to the test
database which is used to store a database entity (e.g. a row, an index, Database Metadata),
or used as formatting overhead by the data manager.
May: The word "may" in the specification means that an item is truly optional.
Measured Throughput: The Measured Throughput is computed as the total number of Valid Transactions completed within the Measurement Interval divided by the duration of the Measurement Interval in seconds.
Measurement Interval: the period of time during Steady State chosen by the Test Sponsor
to compute the Reported Throughput.
Must: The word "must" or the terms "required", "requires", "requirement" or "shall" in the
specification, means that compliance is mandatory.
Must not: The phrase "must not" or the term "shall not" in the specification, means that this
is an absolute prohibition of the specification.
Native Data Type: A Native Data Type is a built-in data type of the DBMS whose
documented purpose is to store data of a particular type described in the specification. For
example, DATETIME must be implemented with a built-in data type of the DBMS designed
to store date-time information.
Performance Metric: The LDBC-SNB Reported Throughput as expressed in tps. This is
known as the Performance Metric.
Price/Performance Metric: The LDBC-SNB total 3-year pricing divided by the Reported Throughput. This is also known as the Price/Performance Metric.
Primary Key: A Primary Key is a single column or combination of columns that uniquely
identifies a row. None of the columns that are part of the Primary Key may be nullable. A
table must have no more than one Primary Key.
Referential Integrity: Referential Integrity preserves the relationship of data between
tables, by restricting actions performed on Primary Keys and Foreign Keys in a table.
Report: The term Report refers to the Adobe Acrobat PDF file in the Report folder in the
FDR. The contents of the Report are defined in Section XX.
Response Time:
Results: LDBC-SNB Results are the Performance Metric and the Price/Performance Metric.
RT: See Response Time.
Scale Factor:
Should: The word "should" or the adjective "recommended", mean that there might exist
valid reasons in particular circumstances to ignore a particular item, but the full
implication must be understood and weighed before choosing a different course.
Should not: The phrase "should not", or the phrase "not recommended", means that there
might exist valid reasons in particular circumstances when the particular behavior is
acceptable or even useful, but the full implications should be understood and the case
carefully weighed before implementing any behavior described with this label.
Social Network Dataset Generator: See SNDG.
SNDG: an application responsible for providing the data sets used by the SNB. This data generator is designed to produce directed labeled graphs that mimic the characteristics of graphs of real data.
SUT: See System Under Test.
System Under Test: The System Under Test (SUT) is defined to be the database system
where the benchmark is executed.
Test Sponsor: The Test Sponsor is the company officially submitting the Result with the
FDR and will be charged the filing fee. Although multiple companies may sponsor a Result
together, for the purposes of the LDBC processes the Test Sponsor must be a single
company. A Test Sponsor need not be a LDBC member. The Test Sponsor is responsible for
maintaining the FDR with any necessary updates or corrections. The Test Sponsor is also
the name used to identify the Result.
Test Run: the entire period of time during which Drivers submit and the SUT completes Transactions.
Transaction Mix:
Valid Transaction: The term Valid Transaction refers to any Transaction for which input
data has been sent in full by the Driver, whose processing has been successfully completed
on the SUT and whose correct output data has been received in full by the Driver.
Workload:
# BENCHMARK SCHEMA AND DATA MODEL
## Introduction
<description of the data schema>
## Data schema implementation
SNB may be implemented with different data models, e.g. relational, RDF and different graph data
models.
The reference schema is provided as RDFS and SQL. The data generator produces TTL syntax for
RDF and comma separated values for other data models.
A single attribute has a single data type, as follows:
- Identifier – an integer value foreign key, or a URI in RDF. If this is an integer column, the implementation data type should support at least 2^50 distinct values.
- Datetime – should support a date range from 0000 to 9999 in the year field, with a resolution of no less than one second.
- String – the string column for names may have a variable length and may have a declared maximum length, e.g. 40 characters.
- Long string – for example, a post content may be a long string that is often short in the data but may not declare a maximum length and must support data sizes of up to 1 MB.
A single attribute in the reference schema may not be divided into multiple attributes in the target
schema.
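To make the type rules above concrete, the following is a minimal sketch, assuming a Python test harness, of how a loader might validate generated attribute values against the reference data types. The function name and the specific limits other than those stated above (e.g. the 40-character name length used as an example) are illustrative, not normative.

```python
from datetime import datetime

# Limits taken from the data type rules above; names are illustrative.
MAX_IDENTIFIER = 2 ** 50           # at least 2^50 distinct identifier values
MAX_NAME_LENGTH = 40               # example declared maximum for name strings
MAX_LONG_STRING = 1 * 1024 * 1024  # long strings (e.g. post content) up to 1 MB

def check_attribute(kind: str, value) -> bool:
    """Return True if the value fits the reference data type `kind`."""
    if kind == "identifier":
        return isinstance(value, int) and 0 <= value < MAX_IDENTIFIER
    if kind == "datetime":
        # Year range 0000-9999 with at least one-second resolution.
        return isinstance(value, datetime) and value.year <= 9999
    if kind == "name":
        return isinstance(value, str) and len(value) <= MAX_NAME_LENGTH
    if kind == "long_string":
        return isinstance(value, str) and len(value.encode("utf-8")) <= MAX_LONG_STRING
    raise ValueError(f"unknown attribute kind: {kind}")

# Example usage:
assert check_attribute("identifier", 123456789)
assert check_attribute("name", "Alice")
```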
A schema on the DBMS is optional. An RDF implementation for example may work without one. An
RDF implementation is allowed to load the RDF reference schema and to take advantage of the data
type and cardinality statements therein.
A relational or graph schema may specify system specific options affecting storage layout. These
may for example specify vertical partitioning. Vertical partitioning means anything from a column
store layout with per-column allocated storage space to use of explicit column groups. Any mix of
row or column-wise storage structures is allowed as long as this is declaratively specified data
structure by data structure. Data structure here means for example table or index.
Covering indices and clustered indices are allowed. If these are defined, then all replications of data
implied by these must be maintained statement by statement, i.e. each auxiliary data structure must
be consistent with any other data structures of the table after each data manipulation operation.
A covering index is an index which materializes a specific order of a specific subset or possibly all
columns of a table. A clustered index is an index which materializes all columns of a table in a
specific order, which order may or may not be that of the primary key of the table. A clustered or
covering index may be the primary or only representation of a table.
Any subset of the columns on a covering or clustered index may be used for ordering the data. A
hash based index or a combination of a hash based and tree based index are all allowed, in row or
column-wise or hybrid forms.
# BENCHMARK WORKLOAD
## Introduction
<description of the data workloads>
## Workload implementation and data access transparency
The base unit of work is the insertion of one post.
For each post inserted, the following number of other operations must be completed:
<add operations>
For a run to qualify, the number of successfully executed operations must not deviate from the above frequencies by more than 1%.
The queries and updates may be implemented in a declarative query language or as procedural
code using an API.
If a declarative query language is used, e.g. SPARQL or SQL, then explicit query plans are prohibited in all the read-only queries. The update transactions may still consist of multiple statements, effectively amounting to explicit plans.
Explicit query plans include, but are not limited to:
- Directives or hints specifying a join order or join type
- Directives or hints specifying an access path, e.g. which index to use
- Directives or hints specifying an expected cardinality, selectivity, fanout or any other information that pertains to the expected number of results or cost of all or part of the query.
## Auxiliary data structures and pre-computation
Q2 retrieves for a person the latest n posts or comments posted by a contact of the person.
An implementation may choose to pre-compute this 'top of the wall' query.
If doing so, inserting any new post or comment will add the item in question to the materialized top k post views of each of the contacts of the person. If, after insertion, this list were to be longer than 20 items, the transaction will delete the oldest item.
If this pre-computation is applied, the update of the 'top of the wall' materialization of the users concerned must be implemented as a single transaction.
This pre-computation may be implemented as client side logic in the test driver, as stored procedures or as triggers. In all cases the operations, whether one or many, must constitute a single transaction. A SPARQL protocol operation consisting of multiple statements may be a valid implementation if the SUT executes the statements as a single transaction.
Other pre-computation of query results is explicitly prohibited.
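Purely as an illustration of the optional pre-computation described above, the following sketch maintains a materialized top-20 'wall' as client-side logic in one transaction. It assumes a generic DB-API 2.0 connection and hypothetical tables `post`, `knows` and `wall_cache`; the schema, parameter style and statements are not prescribed by this specification.

```python
# Sketch of the optional 'top of the wall' pre-computation as client-side logic.
# Assumes a DB-API 2.0 connection `conn` and hypothetical tables:
#   post(id, creator_id, creation_date, content)
#   knows(person_id, friend_id)
#   wall_cache(person_id, post_id, creation_date)  -- materialized top-20 per person
WALL_SIZE = 20

def insert_post_with_wall_update(conn, post_id, creator_id, creation_date, content):
    cur = conn.cursor()
    try:
        cur.execute(
            "INSERT INTO post (id, creator_id, creation_date, content) VALUES (?, ?, ?, ?)",
            (post_id, creator_id, creation_date, content))
        # Push the new post onto the wall of every contact of the creator.
        cur.execute("SELECT friend_id FROM knows WHERE person_id = ?", (creator_id,))
        for (friend_id,) in cur.fetchall():
            cur.execute(
                "INSERT INTO wall_cache (person_id, post_id, creation_date) VALUES (?, ?, ?)",
                (friend_id, post_id, creation_date))
            # Trim the materialized list back to the newest WALL_SIZE items.
            cur.execute(
                "DELETE FROM wall_cache WHERE person_id = ? AND post_id NOT IN "
                "(SELECT post_id FROM wall_cache WHERE person_id = ? "
                " ORDER BY creation_date DESC LIMIT ?)",
                (friend_id, friend_id, WALL_SIZE))
        conn.commit()   # the post insert and all wall updates form a single transaction
    except Exception:
        conn.rollback()
        raise
```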
## ACID Compliance
The interactive workload requires full ACID support from the SUT.
- Atomicity. All the updates in a transaction must either take place or be all cancelled.
- Consistency. If a database object, e.g. a table, has auxiliary data structures, e.g. indices, the content of these must be consistent after the commit or rollback of a transaction. If multiple client application threads share one transaction context, these may transiently see inconsistent states, e.g. there may be a time when an insert of a row is reflected in one index of a table but not in another.
- Isolation. If a transaction reads the database with intent to update, the DBMS must guarantee that repeating the same read within the same transaction will return the same data. This also means that no more and no less data rows must be returned. In other words, this corresponds to snapshot or to serializable isolation. This level of isolation is applied for the operations where the transaction mix so specifies. If the database is accessed without transaction context or without intent to update, then the DBMS should provide read committed semantics, e.g. repeating the same read may produce different results, but these results may never include effects of pending uncommitted transactions.
- Durability. The effects of a transaction must be made durable against instantaneous failure before the SUT confirms the successful commit of a transaction to the application. For systems using a transaction log, this implies syncing the durable media of the transaction log before confirming success to the application. This will typically entail group commit, where transactions that fall in the same short window are logged together; the logging device will typically be an SSD or battery-backed RAM on a storage controller. For systems using replication for durability, this will entail receipt of a confirmation message from the replicating party before confirming successful commit to the application.
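The group-commit idea referenced above can be illustrated with the following toy sketch, assuming a single append-only log file synced with os.fsync and a fixed 1 ms grouping window; real DBMS implementations differ substantially and nothing here is prescribed.

```python
import os
import threading
import time

class GroupCommitLog:
    """Toy group commit: commits arriving within one short flush cycle share a single fsync."""

    def __init__(self, path: str):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        self._lock = threading.Lock()
        self._pending = []                      # (record, done_event) pairs awaiting durability
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def commit(self, record: bytes) -> None:
        """Block until the record is durably written; only then may success be confirmed."""
        done = threading.Event()
        with self._lock:
            self._pending.append((record, done))
        done.wait()

    def _flush_loop(self) -> None:
        while True:
            time.sleep(0.001)                   # the short grouping window
            with self._lock:
                batch, self._pending = self._pending, []
            if not batch:
                continue
            os.write(self._fd, b"".join(r + b"\n" for r, _ in batch))
            os.fsync(self._fd)                  # one sync covers every commit in the batch
            for _, done in batch:
                done.set()
```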
# TEST DRIVER
## Introduction
A qualifying run must use the SNB test driver provided with the data generator. The test driver
may be modified by the test sponsor for purposes of interfacing to the SUT. The parameter
generation, result recording and workload scheduling parts of the test driver should not be
changed. The auditor needs to have access to the test driver source code used for producing the
driver used in the reported run.
The test driver produces the following artifacts as a by-product of the run:
<add products>
Start and end time of each execution in real time, recorded with microsecond precision, including the identifier of the operation and any substitution parameters. If the operation is an update taken from the simulation timeline, then the timestamp in simulation time is also recorded.
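As an informal illustration of such a log record, the following Python sketch shows one way a driver might capture these fields with microsecond precision. The field names, the example operation name and the parameter encoding are assumptions, not part of the specification.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class OperationLogRecord:
    operation_id: str                # identifier of the executed query/update operation
    parameters: str                  # substitution parameters, serialized for the log
    start_us: int                    # real-time start, microseconds since the epoch
    end_us: int                      # real-time end, microseconds since the epoch
    simulation_time_ms: Optional[int] = None   # simulation timestamp for update operations

def now_us() -> int:
    """Current wall-clock time with microsecond precision."""
    return time.time_ns() // 1_000

# Example: record one operation (names and values are illustrative).
start = now_us()
# ... submit the operation to the SUT and wait for the complete result ...
end = now_us()
record = OperationLogRecord("interactive/query2", "person_id=1234", start, end)
```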
The test driver is scale-out capable. Many instances of the test driver may be used in a test run.
The number and configuration of the test drivers must be disclosed, along with hardware details of
the platform running the driver(s), together with details of the network interface connecting the drivers to the SUT. The SUT hardware may also be used for hosting the driver(s), at the discretion of the test sponsor.
A separate test summary tool provided with the test driver analyzes the test driver log(s) after a
measurement window is completed.
The tool produces, for each of the distinct queries and transactions, the following summary (a computation sketch follows this list):
- Count of executions
- Minimum/average/90th percentile/maximum execution time
- Start and end date of the window in real time and in simulation time
- Metric in operations per second at scale (ops) (throughput rating)
- Number of test drivers
- Number of database sessions (threads) per test driver
- Parallelism settings
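A minimal sketch of how the per-operation portion of this summary could be computed from driver log records, assuming each record carries an operation name and start/end times in microseconds; the percentile method and field names are illustrative only.

```python
from collections import defaultdict

def summarize(records):
    """records: iterable of (operation_name, start_us, end_us) tuples."""
    times = defaultdict(list)
    for name, start_us, end_us in records:
        times[name].append(end_us - start_us)
    summary = {}
    for name, durations in times.items():
        durations.sort()
        # Approximate 90th percentile: value at the 90% position of the sorted list.
        p90 = durations[min(len(durations) - 1, int(0.90 * len(durations)))]
        summary[name] = {
            "count": len(durations),
            "min_us": durations[0],
            "avg_us": sum(durations) / len(durations),
            "p90_us": p90,
            "max_us": durations[-1],
        }
    return summary
```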
The test driver(s) produce(s) the following information concerning the test driver configuration (an illustrative example follows this list):
- Number of test drivers.
For each test driver:
- Test driver name and version
- IP address of SUT – if not applicable, e.g. an in-process API, specifies in-process. If a localhost connection, specifies localhost:port.
- Maximum number of concurrently pending operations on the SUT from this test driver, e.g. connection pool size, worker thread pool size.
- Scheduling mode: for the SNB test driver, the scheduling mode, i.e. synchronous, window of n seconds, etc.
- For the SNB test driver, maximum length of stall, i.e. the amount of real time during which no progress was made in updates. This is a configuration setting; the actual stall times show up in the metrics.
- For the SNB driver, measured rate of simulation time to real time, e.g. 30 m real for 300 m simulated. This is calculated based on the earliest and latest operations, in both simulation time and real time, started and completed during the measurement window.
- Update throughput setting – for the SPB driver, the update operation target throughput in editorial operations per second for this test driver process. May be fractional.
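Purely as an illustration of the fields disclosed above, the configuration report for one driver instance might look like the following; every name and value here is hypothetical and not mandated by this specification.

```python
# Hypothetical configuration disclosure for one test driver instance.
driver_config = {
    "driver_name": "snb-interactive-driver",   # illustrative
    "driver_version": "1.0",                   # illustrative
    "sut_address": "10.0.0.5:8080",            # or "in-process" / "localhost:port"
    "max_pending_operations": 64,              # e.g. connection pool / worker pool size
    "scheduling_mode": "window",               # e.g. synchronous or window of n seconds
    "scheduling_window_seconds": 1,
    "max_update_stall_seconds": 30,            # configured limit; actual stalls are in the metrics
    "time_compression_ratio": 10.0,            # simulation time elapsing per unit of real time
    "update_target_throughput": 12.5,          # may be fractional
}
```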
# EXECUTION RULES AND METRICS
## Introduction
A benchmark execution is divided into the following steps:
- Data Preparation – This includes running the data generator, placing the generated files in a staging area, configuring storage, setting up the SUT configuration and preparing any data partitions in the SUT. This may include pre-allocating database space but may not include loading any data or defining any schema having to do with the benchmark.
- Bulk Load – This includes defining the database schema, if any, loading the initial database population, making this durably stored, and gathering any optimizer statistics. The bulk load time is reported and is equal to the amount of elapsed wall clock time between starting the schema definition and receiving the confirmation message of the end of statistics gathering.
- Benchmark Run – The run begins after the bulk load or after another benchmark run. If the run does not directly follow the bulk load, it must start at a point in the update stream that has not previously been played into the database. In other words, a run may only include update events whose timestamp is later than the latest post creation date in the database prior to the start of the run. The run starts when the first of the test drivers sends its first message to the SUT. If the SUT is in-process with the driver, the window starts when the driver starts.
- Measurement Window – The measurement window is the timed portion of the benchmark run. It may begin at any time during the run. The activity during the measurement window must meet the criteria set forth in Query Mix and must include enough updates to satisfy the criteria in Minimum Measurement Window. The measurement window is terminated at the discretion of the test sponsor at any time when the Minimum Measurement Window criteria are met. All the processes constituting the SUT are to be killed at the end of the window or, alternatively, all the hardware components of the SUT are to be powered off.
- Recovery Test – The SUT is to be restarted after the measurement window and the auditor will verify that the SUT contains the entirety of the last update recorded by the test driver(s) as successfully committed.
## Scaling and database population rules
The scale factor of a SNB dataset is the number of simulated users. All dataset scales contain data
for three years of social network activity.
The validation scale is 10,000 users. Official SNB results may be published at this scale, at the 30,000 user scale, or at a power of 10 multiple of either, for example 100,000, 300,000, 1,000,000, 3,000,000 and so forth.
The dataset is divided in a bulk loadable initial database population and an update stream. These
are generated by the SNB data generator. The data generator has options for splitting the dataset
into any number of files.
The update stream contains the latest 10% of the events in the simulated social network. These
events form a single serializable sequence in time. Some events will depend on preceding events,
for example a post must exist before a reply to the post is posted. The data generator guarantees
that these are separated by at least 2 minutes of simulation time.
The update stream may be broken into arbitrarily many sub-streams. Valid partitioning criteria are
for example the user performing the action, if any, or a round-robin division where each successive
event goes into a different sub-stream. When two events occur in the same sub-stream, their order
must be the same as in the original update stream.
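A minimal sketch of the round-robin partitioning option, assuming the update stream is read as an ordered sequence of events; per-user partitioning would work analogously, keyed on the acting user. The function name is illustrative.

```python
def partition_round_robin(events, n_substreams):
    """Distribute an ordered update stream over n sub-streams.

    Events are appended in stream order, so any two events that land in the
    same sub-stream keep their original relative order, as required above.
    """
    substreams = [[] for _ in range(n_substreams)]
    for i, event in enumerate(events):
        substreams[i % n_substreams].append(event)
    return substreams
```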
Rationale: The authors are aware that the prevalent practice for online benchmarks is to tie the reported throughput to the scale, e.g. max 12.5 tpmC per warehouse in TPC-C. The authors depart from this practice here because, with throughput tied to scale, test systems with interesting throughputs rapidly become very expensive, raising the entry barrier for publishing a result. It is thought that scaling in buckets lowers the barrier of entry and reduces the incentive to use hardware configurations that would be unusual in a production environment.
## Performance metrics and response time
### Response time
The Response Time (RT) is defined by RTn = eTn - sTn where:
- sTn and eTn are measured at the Driver;
- sTn = time measured before the first byte of input data of the Transaction is sent by the Driver to the SUT; and
- eTn = time measured after the last byte of output data from the Transaction is received by the Driver from the SUT.
The resolution of the time stamps used for measuring Response Time must be at least 0.01 seconds.
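A sketch of this measurement as seen from the driver, assuming the call to the SUT is wrapped in a function `execute_on_sut` (a placeholder, not part of the specification); Python's perf_counter comfortably exceeds the required 0.01-second resolution.

```python
import time

def timed_execution(execute_on_sut, operation, parameters):
    """Measure RTn = eTn - sTn around a single operation, as seen by the Driver."""
    sTn = time.perf_counter()          # before the first byte of input is sent
    result = execute_on_sut(operation, parameters)
    eTn = time.perf_counter()          # after the last byte of output is received
    return result, eTn - sTn           # response time in seconds
```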
### Throughput
Operations per second at scale, e.g. opsSI@300G
The metric is calculated for a run that satisfies the minimum length and per query minimum
execution count and query mix criteria. Each completed operation counts as one operation. The
metric is the count of successful operations divided by elapsed time in seconds.
### SNB Interactive Metric.
The metric consists of a power score and a throughput score and of a composite metric which is the
geometric mean of the two.
The power score is the geometric mean of the execution times of the update set and queries scaled
to operations per hour.
The throughput test length is the length of elapsed time between the start of the first query stream
of the throughput test and the end of the last stream or update set of the throughput test,
whichever is later. The throughput score is the number of queries, excluding update sets,
completed during the throughput test, scaled to queries per hour.
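Read literally, the two scores and the composite metric admit the following formalization. This is one possible reading, with execution times t_1 ... t_N in seconds for the N timed operations of the power run, and Q queries completed in a throughput test of length T seconds:

Power = 3600 / (t_1 * t_2 * ... * t_N)^(1/N)
Throughput = 3600 * Q / T
Composite = sqrt(Power * Throughput)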
The BI metrics are multiplied by the scale factor. The rationale is that a run with 10x more data is likely to take approximately 10x as long or longer on the same platform. The multiplication is used so as not to have sharply decreasing metrics when scale increases.
Note that this is not done with the interactive metric, as we expect a sub-linear increase in query time as scale increases.
## Workload execution
The SNB BI workload is measured similarly to TPC-H, consisting of a power run and a throughput
run.
The workload is divided into query streams and update sets. A query stream has one execution of
each query, as specified in the workload definition.
The power run consists of one update set followed by one query stream.
The throughput run consists of a test sponsor dependent number of query streams. For each query
stream one update set will be processed. The scheduling of query streams and update sets is as
specified in the workload definition, similar to TPC-DS.
For both power and throughput runs, the following is recorded:
- Name of run, e.g. power/throughput
- Sequence number of run, e.g. 0 if first run of its type
- Start time
- End time
The start time is before the submission of the first request to the SUT and the end time is after
complete receipt of the last result pertaining to the run from the SUT. The last result is the last
result of the last query or update set belonging to the run, whichever is later.
For each query stream and update set, the following is recorded:
- Type of operation, e.g. update set or query stream
- Number of query stream or update set
- Start time
- End time
For each operation in each query stream/update set the following is recorded:
- Number of the stream/update set
- Sequence number in the query stream. Note that different streams run different permutations of the query set, hence the sequence number does not follow from the query.
- Name of query, e.g. bi/query2.txt if queries are read from template files, else the name of the file containing the query implementation as code, one file per query.
- Value of each substitution parameter
- Executable text of the query (if instantiated from a template)
- Start time
- End time
- First 100 rows of the result set. This is for documentation and may be as text, CSV or other convenient human readable format.
## Minimum Measurement Window
The measurement window starts at the point in simulation time given by the post with the highest creation timestamp in the database. The update stream must contain at least 35 days' worth of events in simulation time that are in the future of the date-time of the latest pre-run post in the database.
The minimal measurement window corresponds to 30 days' worth of events in simulation time, as generated by the data generator. Thus the number of events in the minimal window is linear in the scale factor and grows the database by approximately 3%.
When the test sponsor decides to start the measurement window, the time of the latest new post
event successfully completed by any of the test drivers is taken to be the start of the measurement
window in simulation time. The window may be terminated by the test sponsor at any time when
the timestamp of the latest new post event in the log of any of the test drivers is more than 30 days
of simulation time in the future of the starting timestamp.
The test summary tool (see below) may be used for reading the logs being written by a test driver.
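As an aid to understanding, a sketch of the window-termination check described above, assuming the driver logs expose the simulation timestamp of each successfully completed new-post event; the function and variable names are illustrative.

```python
from datetime import timedelta

MIN_WINDOW = timedelta(days=30)   # minimal measurement window in simulation time

def may_terminate_window(window_start_sim_time, latest_new_post_sim_time):
    """True once any driver has logged a new post event 30+ simulation days
    past the simulation-time start of the measurement window."""
    return latest_new_post_sim_time - window_start_sim_time >= MIN_WINDOW
```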
## Checkpoints
A checkpoint is defined as the operation which causes data persisted in a transaction log to become durable outside of the transaction log. Specifically, this means that a SUT restart after an instantaneous failure following the completion of the checkpoint may not have recourse to transaction log entries written before the end of the checkpoint.
A checkpoint typically involves a synchronization barrier at which all data committed prior to that moment is required to be in durable storage that does not depend on the transaction log.
Not all DBMSs use a checkpointing mechanism for durability. For example, a system may rely on redundant storage of data for durability guarantees against instantaneous failure of a single server.
The measurement window may contain a checkpoint. If the measurement window does not contain
one, then the restart test will involve redoing all the updates in the window as part of the recovery
test.
# PRICING
# FULL DISCLOSURE
## Introduction
<introduction>
All of the information included in a Full Disclosure Report is gathered into a single file per test execution. The test execution is an SNB Interactive, an SNB BI or an SPB execution. Other types will be added as appropriate. The different types of executions share certain common items.
Note that since a BI run is specified to start after a bulk load, a BI run must precede the interactive
run if both are made from the same dataset and share a bulk load. The implication for data
generation is that update events may be produced in a BI update set format first and then in a
streaming format for the interactive mix.
## Full Disclosure Report
The full disclosure report consists of the following:
- Name and contact information of the test sponsor
- Description of the DBMS
- Description of the SUT hardware
- Description of additional resources used for driving the test, if separate from the SUT hardware
- 3-year total cost of ownership for the SUT, including hardware and software
- Metric and numerical quantities summary produced by the test driver summary function
- Attestation letter by the auditor
- Supporting Files
The Supporting Files is a file archive (e.g. tar or zip) with the following:
- Complete configuration files for the DBMS
- Configuration files for the test driver(s)
- Any operating system tuning parameters with non-default values in effect at the time of the run, for example sysctl.conf on Linux
- Copy of the test driver log(s) for the run
- Complete modifications of the test driver source code, if any
## Dataset description
The following is recorded for the dataset description:
- Scale factor
- Data format, e.g. CSV, TTL, etc.
- Data generator name and version
- Number of interactive events generated for the interactive update run
- Number of update sets generated for the BI run
- Datetime of start of data generation
This information is produced by the data generator and put in a log file. This is included into the overall run report by the script assembling the parts.
## Bulk load description
- Start of bulk load
- End of bulk load
- Count of persons, knows, posts, comments, tags of posts and comments after completion of bulk load
## DBMS description
A test implementation extracts the following information from the DBMS:
- Vendor name
- DBMS name
- Version number
These may be extracted with the appropriate information calls of ODBC/JDBC, a SPARQL endpoint description or other DBMS-dependent means.
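For example, over ODBC this information might be extracted roughly as follows, assuming the pyodbc module and a pre-built connection string; a JDBC client, SPARQL endpoint description or vendor-specific API would look different, and the vendor name may require a separate, DBMS-dependent lookup.

```python
import pyodbc

def describe_dbms(connection_string: str) -> dict:
    """Extract DBMS name and version via standard ODBC information calls."""
    conn = pyodbc.connect(connection_string)
    try:
        return {
            "dbms_name": conn.getinfo(pyodbc.SQL_DBMS_NAME),
            "dbms_version": conn.getinfo(pyodbc.SQL_DBMS_VER),
        }
    finally:
        conn.close()
```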
## Platform description
The following lists the automatically extracted features of a platform when running an LDBC benchmark. The measures are in this example taken from a Linux system, but corresponding metrics may be extracted from most other systems.
- Operating system, e.g. result of uname -a
- CPU type, e.g. Xeon E52550@2.00GHz – heading of the CPU description in /proc/cpuinfo
- Number of threads: the count of CPU description entries in /proc/cpuinfo
- Number of cores: the siblings count in a CPU description in /proc/cpuinfo
- Memory: the total amount of RAM as given by free (in MB)
- Number of disks: the number of distinct /dev/sd? devices (count of distinct letters)
- Total disk capacity: sum of total space from df, excluding NFS-mounted file systems and RAM-based file systems, e.g. tmpfs
- System configuration, /etc/sysconf file contents for Linux
In the case of a cluster, this description is repeated for each of the cluster nodes. These will typically be identical but may not always be so.
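A best-effort sketch of collecting these platform facts on a Linux system; the parsing rules and the returned field names are illustrative, and other operating systems would need different commands.

```python
import subprocess

def describe_platform() -> dict:
    """Collect the platform facts listed above on a Linux system (best effort)."""
    with open("/proc/cpuinfo") as f:
        cpuinfo = f.read().splitlines()
    model_lines = [l.split(":", 1)[1].strip() for l in cpuinfo if l.startswith("model name")]
    siblings = [l.split(":", 1)[1].strip() for l in cpuinfo if l.startswith("siblings")]
    return {
        "operating_system": subprocess.check_output(["uname", "-a"], text=True).strip(),
        "cpu_type": model_lines[0] if model_lines else "unknown",
        "thread_count": len(model_lines),          # count of CPU description entries
        "siblings_per_cpu": siblings[0] if siblings else "unknown",
        "memory": subprocess.check_output(["free", "-m"], text=True),
        "disk_capacity": subprocess.check_output(["df", "-h"], text=True),
    }
```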
## Interactive workload description
- Start of test run as datetime
- Start of measurement window as datetime
- Time of instantaneous failure stopping the measurement window as datetime
- Time of DBMS restart following SUT cold boot
- Time of test driver restart
- Time of first successful connection from test driver to SUT
- End time of first successfully executed workload operation
- Time of reaching steady state
Steady state is considered reached at the end of the first continuous two-minute window for which the throughput is over 90% of the throughput reported for the first measurement period.
## Metrics reported description
The following lists the metrics reported per operation for the operations started and successfully completed during the measurement window.
For each distinct type of query/update operation:
- Name of operation, e.g. int/query1.txt
- Template text – if the query is instantiated from a template, this is the template text without parameter substitution
- Count of successful executions
- Shortest/average/90th percentile/longest execution time in microseconds
If the query is read from a template in the file system, the name references the template; otherwise it references the file name of the file containing the query implementation.
### Throughput Over Time
For each two-minute window during the whole execution of each test driver instance, the following is recorded:
- Start time of window
- Flag indicating whether the window is in the measurement interval
- Throughput measured inside the window as operations per second, counting operations started and successfully completed in the window. Operations that start in one window and end in another are not counted here but are counted in the complete run.
The test driver emits this information at the end of each window. This allows the test sponsor to determine whether the system is in steady state. The test sponsor may signal the start of the measurement window to the test driver. This may be done by copying a file to the directory of the test driver, for example.
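A sketch of the per-window computation, assuming each successfully completed operation is logged with real-time start and end timestamps in seconds; per the rule above, an operation is credited to a window only if both its start and end fall inside it.

```python
WINDOW_SECONDS = 120   # two-minute reporting window

def window_throughput(records, window_start):
    """records: iterable of (start_s, end_s) for successfully completed operations."""
    window_end = window_start + WINDOW_SECONDS
    completed = sum(1 for start_s, end_s in records
                    if window_start <= start_s and end_s <= window_end)
    return completed / WINDOW_SECONDS   # operations per second within the window
```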
A graphic documenting the sustained throughput of the SUT during the interactive workload can be
prepared from this data. This shows the warm-up behavior before the measurement window as
well as during the restart following cold boot of the SUT. This graphic is included in the FDR,
similarly to TPC-C.
## Machine readable FDR report
A benchmark run is described by a machine readable data structure, e.g. RDF in a TTL file. This document specifies what information this data structure should contain.
Different sections of a benchmark run contribute content to this description. Most of this
information is produced by the benchmark tools themselves, some is produced by test-sponsor
supplied code run as part of the benchmark run.
# AUDIT