A Peer-to-Peer Database Server based on BitTorrent
John Colquhoun
Paul Watson
www.neresc.ac.uk

Introduction
• If a database server receives queries faster than it can process them, performance becomes unacceptable
• Similar problems have been addressed in the domain of file-sharing by the use of Peer-to-Peer (P2P) technologies
• Can we utilise the combined processing power, disk space and memory of individual clients to reduce the load on the server?
• We examine how P2P techniques could be applied within a database environment and introduce the Wigan P2P database, derived from the BitTorrent file-sharing protocol
• Potential applications in e-Science and e-Commerce

System Architecture
[Figure: architecture diagram showing the Tracker and the following Advertise and Query messages]
Advertise: SELECT * FROM t
Query: SELECT id, value FROM t WHERE t.id < 100
Advertise: SELECT id, value FROM t WHERE t.id < 100
Query: SELECT id, value FROM t WHERE t.id < 10
Advertise: SELECT id, value FROM t WHERE t.id < 10

Implementation
• A simulator of Wigan
• The TPC-H benchmark database was used to evaluate the design
• Identified cases where Wigan offered a performance advantage over a Client-Server database and those areas where it did not
• Experiment results: a busy system where peers submitted one of a choice of five queries; however, some submitted an entirely random query over a table of 10,000 tuples

Simulator Results
[Figure: bar chart of average response time (s) by database type (P2P, Client-Server), with series for all peers, random queries and repeating queries]

Current work
• Live Wigan system
• Currently under development, using algorithms developed for the simulator
• Written in Java and uses OGSA-DAI
• Also uses the TPC-H benchmark database
• Experiments are ongoing; in the future the live version will be used to investigate extensions to the Wigan system

Live System Results (1)
• Initial experiments with the live system involved comparing Wigan against accessing data directly from SQL Server via JDBC
• One of the TPC-H tables is large (approx. 6 million tuples) and hence SQL Server takes some time to evaluate queries on this table, regardless of the result set size
• However, in Wigan, connecting to a peer that already has the query results reduces the response time
• In this experiment, all peers submitted the same query

Live System Results (2)
[Figure: bar chart of average response time (s) for Wigan and SQL Server]

Summary
• We designed, implemented & evaluated the Wigan Peer-to-Peer Database System
• Derived from the popular BitTorrent file-sharing protocol
• The first database server that uses P2P to scale over multiple peers
• Simulator shows Wigan can outperform a client-server database when:
   • There are enough peers available to reduce the load on the seed
   • There is sufficient overlap between the queries
   • The system is sufficiently busy so a traditional database server would become overloaded
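
Example: matching queries to advertisements (illustrative sketch)
The System Architecture slide shows peers advertising the queries whose results they hold and sending new queries to the Tracker. The slides do not say how the Tracker matches the two, so the Python sketch below is only one plausible reading, not the Wigan implementation. It assumes every query has the form SELECT columns FROM t WHERE t.id < N, that the peer advertising SELECT * FROM t is the seed, and that a peer can answer any query whose columns and rows are contained in what it advertised. The names Advert, Tracker, advertise and find_peers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Advert:
    """One advertisement held by the tracker (hypothetical structure).

    columns=None stands for SELECT *; bound=None stands for no WHERE clause.
    """
    peer: str
    columns: Optional[frozenset]
    bound: Optional[int]

    def covers(self, columns: frozenset, bound: Optional[int]) -> bool:
        # Column check: the advert must hold every requested column.
        cols_ok = self.columns is None or columns <= self.columns
        # Row check: "t.id < bound" must be a subset of the advertised rows.
        if self.bound is None:
            rows_ok = True
        else:
            rows_ok = bound is not None and bound <= self.bound
        return cols_ok and rows_ok


class Tracker:
    """Keeps adverts and returns the peers able to answer a query."""

    def __init__(self):
        self.adverts = []

    def advertise(self, peer, columns=None, bound=None):
        cols = None if columns is None else frozenset(columns)
        self.adverts.append(Advert(peer, cols, bound))

    def find_peers(self, columns, bound=None):
        wanted = frozenset(columns)
        return [a.peer for a in self.adverts if a.covers(wanted, bound)]


tracker = Tracker()
tracker.advertise("seed")                          # SELECT * FROM t
tracker.advertise("peer_a", ["id", "value"], 100)  # ... WHERE t.id < 100

# SELECT id, value FROM t WHERE t.id < 10 can be served by either peer.
print(tracker.find_peers(["id", "value"], 10))   # ['seed', 'peer_a']
# A wider range than peer_a holds falls back to the seed.
print(tracker.find_peers(["id", "value"], 500))  # ['seed']
```

Under these assumptions, the more peers advertise overlapping results, the more queries can be diverted away from the seed, which is in line with the conditions listed in the Summary.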