Intelligent Environments
Computer Science and Engineering
University of Texas at Arlington

Databases for Intelligent Environments
  - Requirements
  - Technologies
  - Evaluation
  - Architecture

Intelligent Environments: Database Requirements

Database Requirements

Data Storage Requirements
  Sensor data (number of sensors @ rate per sensor)
  - Temperature (15 @ 8 Kbps)
  - Humidity (15 @ 8 Kbps)
  - Gas (15 @ 8 Kbps)
  - Light (15 @ 8 Kbps)
  - Motion (15 @ 8 Kbps)
  - Pressure (100 @ 8 Kbps)
  - Microphone (15 @ 500 Kbps)
  - Camera (15 @ 10 Mbps)

Data Storage Requirements
  User data
  - Multimedia
    - Phone messages/conversations (500 Kbps – 10 Mbps)
    - Music (500 Kbps)
    - TV/Radio broadcasts (500 Kbps – 10 Mbps)
    - Home movies (10 Mbps)
  - Images
  - Computer programs
  - Data files
  - Operating systems

Data Storage Requirements
  Issues
  - Query frequency and type
    - Sampling/recording rates: 205 sensors (158,900 Kbps)
    - Multimedia recordings
    - Simultaneous playback
    - Analysis, prediction, decision-making queries
  - Transaction granularity
  - Historical data, decay
  - Security and privacy
  - Centralized vs. distributed

Intelligent Environments: Database Technologies

Database Technologies
  Commercial
  - DB2
  - Empress
  - Informix
  - Oracle
  - MS Access
  - MS SQL
  - Sybase
  Free
  - Berkeley DB
  - PostgreSQL
  - MySQL

DB2
  - Vendor: IBM
  - Availability: Commercial ($300)
  - www.ibm.com/software/data/db2
  - Features: Comprehensive

Empress
  - Vendor: Empress
  - Availability: Commercial ($ call)
  - www.empress.com
  - Features: Designed for embedded, real-time applications

Informix
  - Vendor: IBM (acquired from Informix)
  - Availability: Commercial ($ call)
  - www.ibm.com/software/data/informix
  - Features: Parallel databases; object relational

Oracle
  - Vendor: Oracle
  - Availability: Commercial ($300)
  - www.oracle.com
  - Features: Comprehensive

MS Access
  - Vendor: Microsoft
  - Availability: Commercial ($329 with Office Professional)
  - www.microsoft.com/office/access
  - Features: General purpose; designed for individual users

MS SQL
  - Vendor: Microsoft
  - Availability: Commercial ($5,000)
  - www.microsoft.com/sql
  - Features: General purpose; designed for enterprise users

Sybase
  - Vendor: Sybase
  - Availability: Commercial ($1,000)
  - www.sybase.com
  - Features: General purpose

Berkeley DB
  - Vendor: UC Berkeley
  - Availability: Free
  - www.sleepycat.com
  - Features: Designed for embedded systems applications

MySQL
  - Vendor: MySQL
  - Availability: Free
  - www.mysql.com
  - Features: General purpose

PostgreSQL
  - Vendor: Open source effort
  - Availability: Free
  - www.postgresql.org
  - Features: General purpose

Intelligent Environments: Database Evaluation

Database Benchmarking
  - Transaction Processing Performance Council (TPC)
    - www.tpc.org
    - Rigorously defined benchmarks
    - Independent regulatory body
  - TPC benchmarks: TPC-C, TPC-H, TPC-R, TPC-W
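As a check on the "205 sensors (158,900 Kbps)" figure quoted under Data Storage Requirements, the per-sensor counts and rates from the Sensor data slide can be summed directly. The sketch below assumes 1 Mbps = 1,000 Kbps. The storage-per-day helper is an illustration that is not in the slides: it assumes continuous, uncompressed recording and decimal gigabytes.

```python
# (number of sensors, rate per sensor in Kbps), copied from the Sensor data slide.
SENSORS = {
    "temperature": (15, 8),
    "humidity": (15, 8),
    "gas": (15, 8),
    "light": (15, 8),
    "motion": (15, 8),
    "pressure": (100, 8),
    "microphone": (15, 500),
    "camera": (15, 10_000),  # 10 Mbps, assuming 1 Mbps = 1,000 Kbps
}


def totals(sensors):
    """Return (total sensor count, aggregate rate in Kbps)."""
    count = sum(n for n, _ in sensors.values())
    kbps = sum(n * rate for n, rate in sensors.values())
    return count, kbps


def gigabytes_per_day(kbps):
    """Raw storage per day in decimal GB.

    Assumption (not from the slides): continuous recording,
    no compression, 8 bits per byte, 1 GB = 1e9 bytes.
    """
    bytes_per_second = kbps * 1000 / 8
    return bytes_per_second * 86_400 / 1e9


if __name__ == "__main__":
    count, kbps = totals(SENSORS)
    print(f"{count} sensors, {kbps:,} Kbps aggregate")
    print(f"about {gigabytes_per_day(kbps):,.0f} GB per day if stored raw")
```

The totals match the slide (205 sensors, 158,900 Kbps). The cameras account for 150,000 Kbps of that, which is why multimedia recording dominates the storage question.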
TPC-C Benchmark
  - Simulates complete computing environment
  - Multiple users executing transactions against a database
  - Order-entry scenario
    - Entering and delivering orders
    - Recording payments
    - Checking order status
    - Inventory monitoring
  - Metrics
    - Transactions per minute (tpmC)
    - Price per transaction ($/tpmC)

TPC-H Benchmark
  - Decision support benchmark
    - Examine large volumes of data
    - Answers to critical business questions
    - Complex queries
    - Data modifications
  - Metrics
    - Composite Query-per-Hour Performance Metric (QphH@Size, $/QphH@Size)
      - Size of database
      - Single-stream query processing power
      - Concurrent query throughput

TPC-R Benchmark
  - Decision support benchmark
    - Similar to TPC-H
    - Advanced knowledge of queries
    - Allows optimization
  - Metrics
    - Composite Query-per-Hour Performance Metric (QphR@Size, $/QphR@Size)

TPC-W Benchmark
  - Web transactions benchmark
    - E-commerce scenario
    - Multiple browser sessions
    - Dynamic page generation with database access and update
    - Simultaneous transaction execution
    - Heterogeneous database tables (sizes, attributes, relationships)
  - Metrics
    - Web interactions processed per second (WIPS, $/WIPS)

TPC Results (Best)
  - TPC-C
    - 709,220 tpmC (MS SQL)
  - TPC-H
    - 100GB: 5578 QphH (Oracle)
    - 300GB: 5976 QphH (Oracle)
    - 1000GB: 25,805 QphH (Oracle)
    - 3000GB: 79,528 QphH (Teradata)
    - 10,000GB: 81,501 QphH (Teradata)

TPC Results (Best)
  - TPC-R
    - 100GB: 4442 QphR (Oracle)
  - TPC-W
    - 10,000 items: 21,139 WIPS (MS SQL)
    - 100,000 items: 10,439 WIPS (MS SQL)
  - More results at www.tpc.org

Other Benchmarks
  - Wisconsin
    - Relational queries
  - AS3AP
    - ANSI SQL Scalable and Portable benchmark
    - Mix of transactions, relational queries, and utility functions
  - Open Source Database Benchmark (OSDB)
    - Based on AS3AP

Analysis
  - High-end database transaction processing power
    - 600,000 tpm = 10,000 tps
  - Sensor recording transactions
    - 15 each of temp/hum/gas/light/motion, 100 pres: 175 tps
    - 15 cameras (30 fps) / 15 microphones (64 Kbps): 465 tps, or 120,450 tps (one-byte mic transactions)
  - Multimedia recording transactions
  - Prediction and decision-making queries
  - System information

Intelligent Environments: Database Architecture

Database Architecture
  Issues (again)
  - Query frequency and type
    - Sensors
    - Multimedia recording and playback
    - Analysis, prediction, decision-making queries
    - User data
    - System information
  - Transaction granularity
  - Historical data, decay
  - Security and privacy
  - Centralized vs. distributed

Sensor Database Systems
  - COUGAR project
    - www.cs.cornell.edu/database/cougar
    - Query processing over ad-hoc sensor networks
    - Small database component (QueryProxy) at each sensor
    - Sensor clusters provide local aggregations (e.g., min, max, mean)
    - Assumes centralized index of all data sources

Siemens Netabase
  - “The network is the database.”
    - Navas and Wynblatt, ACM SIGMOD 2001
  - Sensor networks
    - Large number of data sources (10^5)
    - Volatile data and data organization
    - “Thin” data servers on scaled-down hardware
  - Netabase approach
    - Query decomposition
    - Characteristic routing (à la IP routing)
    - Local joins
    - Query evaluation

Siemens Netabase
  - www.netabasesoftware.com

SmartHome Database Architecture

SmartHome Database Architecture
  - Centralized vs. distributed?
    - Answer: Both
    - Central storage of high demand, persistent data
    - Distributed storage of low demand, dynamic data
  - Distributed queries
    - Push processing toward sensors
    - Adaptive, hierarchical organization
    - End-effector autonomy (“smart sensor”)

UTA MavHome Smart Home
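The transaction-rate figures on the Analysis slide (175 tps, 465 tps, 120,450 tps, and the 10,000 tps benchmark ceiling) can be reproduced with a few lines of arithmetic. This is a sketch under assumptions inferred from those numbers, not stated in the slides: each simple sensor writes one sample per second, each camera frame is one transaction, each microphone writes either once per second or once per byte, and 1 Kbps = 1,000 bits per second.

```python
# Counts and rates from the Analysis slide.
SIMPLE_SENSORS = 5 * 15 + 100   # temp/hum/gas/light/motion (15 each) plus 100 pressure
CAMERAS, FPS = 15, 30
MICS, MIC_KBPS = 15, 64

# High-end TPC-C class throughput: 600,000 transactions/minute.
BENCHMARK_TPS = 600_000 // 60


def simple_tps(samples_per_second=1):
    """Transactions/second for the simple sensors (assumed 1 sample/s each)."""
    return SIMPLE_SENSORS * samples_per_second


def av_tps(one_byte_mic=False):
    """Transactions/second for cameras plus microphones.

    Cameras: one transaction per frame (assumption).
    Microphones: one transaction per second each, or one per byte
    of the 64 Kbps stream when one_byte_mic is True.
    """
    camera = CAMERAS * FPS
    mic = MICS * MIC_KBPS * 1000 // 8 if one_byte_mic else MICS
    return camera + mic


if __name__ == "__main__":
    for fine in (False, True):
        total = simple_tps() + av_tps(one_byte_mic=fine)
        label = "one-byte mic" if fine else "per-second mic"
        print(f"{label}: {total:,} tps = {total / BENCHMARK_TPS:.2f}x benchmark")
```

With coarse microphone transactions the combined sensor load (640 tps) is a small fraction of the 10,000 tps benchmark figure. With one-byte microphone transactions it exceeds that figure roughly twelvefold, which illustrates why transaction granularity appears on the issues list.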