PostgreSQL Performance Presentation 9.5devel Edition Mark Wong Consultant, 2ndQuadrant & Contributor, PostgreSQL 10 March 2015 c 2ndQuadrant, Ltd. SF PGDAY 2015 1 / 50 Agenda • • • • Background Patches to test The test suite Compare results between difference versions of PostgreSQL c 2ndQuadrant, Ltd. SF PGDAY 2015 2 / 50 Background This presentation made possible by the AXLE1 project: • Advanced Analytics for Extremely Large European Databases • “The objectives of the AXLE project are to greatly improve the speed and quality of decision making on real-world data sets. AXLE aims to make these improvements generally available through high quality open source implementations via the PostgreSQL and Orange products.” 1 http://axleproject.eu/ c 2ndQuadrant, Ltd. SF PGDAY 2015 3 / 50 A few AXLE details • Aspects of the project include: − − − − • Scalability engineering—autopartitioning, compression Security, privacy, and audit techniques Visual analytics Advanced architectures for hardware and software Partners − − − − − c 2ndQuadrant, Ltd. 2ndQuadrant Barcelona Supercomputing Center PortaVita University of Manchester University of Ljubljana SF PGDAY 2015 4 / 50 The Patches c 2ndQuadrant, Ltd. SF PGDAY 2015 5 / 50 Changes Evaluated Patches tested in this presentation are intended to help the performance of large database systems: • New feature already committed: − • BRIN indexes Scalability related improvements still under review: − − Concurrent hash table for buffer lookups Wait free LWLocks • • c 2ndQuadrant, Ltd. Light Weight Locks Currently provides mutually exclusive access to shared memory by using spin locks or semaphores SF PGDAY 2015 6 / 50 BRIN • • • • • Block Range indexes are a new-to-PostgreSQL type of index Authored by Álvaro Herrera @ 2ndQuadrant Originally proposed as MinMax indexes2 Intended for Big Data systems Similar to Oracle Exadata ”storage indexes” 2 http://goo.gl/yUwav9 c 2ndQuadrant, Ltd. SF PGDAY 2015 7 / 50 Why use BRIN? • • Much smaller and less costly to maintain compared to B-tree indexes Allowing use of an index on a large table that would previously been impractical using B-tree indexes without horizontal partitioning c 2ndQuadrant, Ltd. SF PGDAY 2015 8 / 50 When to use BRIN? Best case scenario: • On a column that has a linear sort order • Example: − − c 2ndQuadrant, Ltd. A table recording sales orders Use BRIN on the DATE column representing when an order was placed SF PGDAY 2015 9 / 50 Index Structure Comparison B-tree BRIN c 2ndQuadrant, Ltd. SF PGDAY 2015 10 / 50 Concurrent Hash Table • • Authored by Robert Haas @ EnterpriseDB 3 Improve concurrency accessing the buffer table between simultaneously executing queries − − 3 Lookups performed using memory barriers instead of locks Inserts and deletes (to the buffer table) use atomic operations instead of locks http://goo.gl/9naq5c c 2ndQuadrant, Ltd. SF PGDAY 2015 11 / 50 Series of changes around LWLocks • • • • • Related to wait free LW SHARED acquisition Authored by Andres Freund @ 2ndQuadrant 4 Use atomic compare-and-exchange to acquire LWLock Lockless clock sweep buffer eviction Reduce false sharing by increase padding between LWLocks 4 http://goo.gl/8bf3CN c 2ndQuadrant, Ltd. SF PGDAY 2015 12 / 50 False sharing visual from Intel c 2ndQuadrant, Ltd. SF PGDAY 2015 13 / 50 The Hardware c 2ndQuadrant, Ltd. SF PGDAY 2015 14 / 50 Environment: Barcelona Barcelona Supercomputing Center • OS: SUSE Linux Enterprise Server 11 SP3 • Hardware − − − c 2ndQuadrant, Ltd. 2 x Intel Xeon E5-2670, Sandy Bridge DP, 2.6Ghz 8 Core/16 Threads, 3.3Ghz Turbo, 8GT/s QPI, 20MB cache, 115W 256 GB RAM 120TB storage connected via 10GigE SF PGDAY 2015 15 / 50 Environment: Manchester University of Manchester • OS: Ubuntu 12.04.4 LTS • Hardware − − − c 2ndQuadrant, Ltd. 4 x Intel Xeon E54620 Eight Core 2.2GHz Processors, 8 Core/32 Threads 256 GB RAM 10TB storage, 17 x 600GB SSD RAID-0 SF PGDAY 2015 16 / 50 The Test c 2ndQuadrant, Ltd. SF PGDAY 2015 17 / 50 TPC BenchmarkTMH • • • http://www.tpc.org/tpch/ A suite of business oriented ad-hoc queries against a parts supplier data warehouse Composed of three tests: − − − • Load Test Power Test Throughput Test Use smaller parts of the TPC-H to perform microbenchmarks against BRIN indexes c 2ndQuadrant, Ltd. SF PGDAY 2015 18 / 50 TPC-H Load Test Measure the time to load data into 8 tables and build all indexes c 2ndQuadrant, Ltd. SF PGDAY 2015 19 / 50 TPC-H Load Test notes • • • Load of each table starts simultaneously All indexes created simultaneously as soon as table is loaded Expectation is that the combination of BRIN indexes and the scalability patches will reduce the index creation portion of the test c 2ndQuadrant, Ltd. SF PGDAY 2015 20 / 50 TPC-H Power Test Raw query execution for a single user: • 22 different queries executed consecutively • Order of queries randomized per test • Scored by taking the geometric mean of the execution time of each query • Expectation is that the use of BRIN indexes will be comparable to the use of B-tree indexes c 2ndQuadrant, Ltd. SF PGDAY 2015 21 / 50 A sample of Power Test queries c 2ndQuadrant, Ltd. SF PGDAY 2015 22 / 50 TPC-H Query 14 This query monitors the market response to a promotion such as TV advertisements or a special campaign. select 100.00 ∗ sum( case when p t y p e l i k e ’PROMO% ’ then l e x t e n d e d p r i c e ∗ ( 1 − l d i s c o u n t ) else 0 end ) / sum( l e x t e n d e d p r i c e ∗ ( 1 − l d i s c o u n t ) ) as promo revenue from lineitem , part where l partkey = p partkey and l s h i p d a t e >= date ’1994−10−01 ’ and l s h i p d a t e < cast ( date ’1994−10−01 ’ + i n t e r v a l ’ 1 month ’ as date ) ; c 2ndQuadrant, Ltd. SF PGDAY 2015 23 / 50 TPC-H Query 12 This query determines whether selecting less expensive modes of shipping is negatively affecting the critical-priority orders by causing more parts to be received by customers after the committed date. s e l e c t l shipmode , sum( case when o o r d e r p r i o r i t y = ’1−URGENT ’ or o o r d e r p r i o r i t y = ’2−HIGH ’ then 1 else 0 end ) as h i g h l i n e c o u n t , sum( case when o o r d e r p r i o r i t y <> ’1−URGENT ’ and o o r d e r p r i o r i t y <> ’2−HIGH ’ then 1 else 0 end ) as l o w l i n e c o u n t from orders , l i n e i t e m where o o r d e r k e y = l o r d e r k e y and l shipmode i n ( ’FOB ’ , ’TRUCK ’ ) and l c o m m i t d a t e < l r e c e i p t d a t e and l s h i p d a t e < l c o m m i t d a t e and l r e c e i p t d a t e >= date ’1994−01−01 ’ and l r e c e i p t d a t e < date ’1994−01−01 ’ + i n t e r v a l ’ 1 year ’ group by l shipmode order by l shipmode ; c 2ndQuadrant, Ltd. SF PGDAY 2015 24 / 50 TPC-H Throughput Test Measure the system’s ability to execute queries from multiple users simultaneously: • Execute multiple Power Tests at the same time • Each instance of a Power Test referred to as a stream • Scored as the ratio of the total number of queries executed over the length of the test • Expectation is that we can run more streams and execute queries faster c 2ndQuadrant, Ltd. SF PGDAY 2015 25 / 50 Initial BRIN Tests But first design microbenchmarks around BRIN in case their use is masked in the TPC-H workload: • Loading a table • Creating an index • Maintenance on growing a table • Measure against the lineitem table, 6 million rows per scale factor c 2ndQuadrant, Ltd. SF PGDAY 2015 26 / 50 Load tables with BRIN 1. Load data into the table when there are no indexes 2. Load data into the table when there is only a single B-tree index on a DATE column 3. Repeat after replaces the B-tree index with a BRIN index c 2ndQuadrant, Ltd. SF PGDAY 2015 27 / 50 Create a BRIN index 1. Load data into the table when there are no indexes 2. Create a B-tree index on the same DATE column 3. Start over and repeat with a BRIN index c 2ndQuadrant, Ltd. SF PGDAY 2015 28 / 50 BRIN maintenance overhead 1. Create table without any indexes 2. Time the load of 8GB of data repeatedly until table size reaches 1TB 3. Start over and repeat with a B-tree index on the same DATE column 4. Start over and repeat with a BRIN index on the same DATE column c 2ndQuadrant, Ltd. SF PGDAY 2015 29 / 50 The Results c 2ndQuadrant, Ltd. SF PGDAY 2015 30 / 50 BRIN microbenchmarks c 2ndQuadrant, Ltd. SF PGDAY 2015 31 / 50 BRIN index increases load time by 10% vs 84% B-tree index c 2ndQuadrant, Ltd. SF PGDAY 2015 32 / 50 Creating a BRIN index is 81% faster than creating a B-tree index c 2ndQuadrant, Ltd. SF PGDAY 2015 33 / 50 BRIN overhead a constant 10% vs B-tree overhead growing slightly as the table grows c 2ndQuadrant, Ltd. SF PGDAY 2015 34 / 50 Impact on TPC-H Tests c 2ndQuadrant, Ltd. SF PGDAY 2015 35 / 50 Load Test at 30GB • B-tree − − • 28 B-tree indexes across 8 tables Total build time 46 minutes BRIN − 29 total indexes across 8 tables • • − • 21 B-tree 8 BRIN Total build time 34 minutes Use of BRIN indexes reduces load time by 26% c 2ndQuadrant, Ltd. SF PGDAY 2015 36 / 50 Power Test notes at 30GB • • 15 of 22 queries to use BRIN index Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q10, Q11, Q14, Q15, Q16, Q19, Q20, Q21 c 2ndQuadrant, Ltd. SF PGDAY 2015 37 / 50 Power Test results at 30GB c 2ndQuadrant, Ltd. SF PGDAY 2015 38 / 50 Power Test results at 100GB c 2ndQuadrant, Ltd. SF PGDAY 2015 39 / 50 Summary of Power Test • At 30GB − − • Overall score 2145 (B-tree) vs 1806 (BRIN) Use of BRIN indexes within 15% of the B-tree score At 100Gb − − c 2ndQuadrant, Ltd. Overall score 1418 (B-tree) vs 1815 (BRIN) Use of BRIN indexes improves score by 28% SF PGDAY 2015 40 / 50 Index Comparison at 100GB5 5 statement timeout set to 3000 seconds for no indexes c 2ndQuadrant, Ltd. SF PGDAY 2015 41 / 50 Summary of Power Test with BRIN vs No indexes • At 100Gb − − 6 Overall score 10616 (no indexes) vs 1815 BRIN Use of BRIN indexes improves score by over 71% Should be lower without statement timeout set c 2ndQuadrant, Ltd. SF PGDAY 2015 42 / 50 BRIN objectives met • • • BRIN index size is 612 times smaller: 1.7MB vs 1GB B-tree index on 8 GB of data BRIN overhead, maintenance, size all improved over B-tree Performance of BRIN a little slower than B-tree performance c 2ndQuadrant, Ltd. SF PGDAY 2015 43 / 50 Scalability results c 2ndQuadrant, Ltd. SF PGDAY 2015 44 / 50 Scalability effect on Load Test Additional patches 20% improved over 9.2, 14% over development branch: c 2ndQuadrant, Ltd. SF PGDAY 2015 45 / 50 Scaling on a 2-socket system c 2ndQuadrant, Ltd. SF PGDAY 2015 46 / 50 Scaling on a 4-socket system c 2ndQuadrant, Ltd. SF PGDAY 2015 47 / 50 Power Test History c 2ndQuadrant, Ltd. SF PGDAY 2015 48 / 50 Thank you! c 2ndQuadrant, Ltd. SF PGDAY 2015 49 / 50 The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n◦ 318633 c 2ndQuadrant, Ltd. SF PGDAY 2015 50 / 50