Transactional Information Systems
Winter Term 2015/2016
Prof. Dr.-Ing. Wolfgang Lehner

Who for Whom
- Prof. Dr.-Ing. Wolfgang Lehner (Lecture): wolfgang.lehner@tu-dresden.de
- Tomas Karnagel (Exercise): tomas.karnagel@tu-dresden.de
- Dr.-Ing. Martin Hahmann (Lecture & Exercise): martin.hahmann@tu-dresden.de
- Kai Herrmann (Exercise): kai.herrmann@tu-dresden.de

Who for Whom (2)
CLASSES AND EXERCISES
- For students of Distributed Systems Engineering, Computational Logic & Engineering
- Part of the module "Ubiquitous Information Systems"
- Class: every Monday, 13:00 to 14:30 (room INF/E023)
- Exercise
  - Starting this week
  - Every Tuesday, 13:00 to 14:30
  - APB E001
LECTURE NOTES
- https://wwwdb.inf.tu-dresden.de/lectures/ws-20152016/transactional-information-systems/
- News via Facebook: Database Technology Group Dresden (become a fan!)

Databases - What are they good for?

Why Databases?
"Relational databases are the foundation of the western civilization."
Bruce Lindsay, IBM Fellow, IBM Almaden Research Center

Challenges – Petabyte Age
INFORMATION AGE
- 100 million FedEx transactions per day
- 150 million VISA credit card transactions per day
- 300 million long-distance calls in the AT&T network per day
- 50 billion emails per day
- 100 million monthly unique U.S. visitors in June 2011 for each of the "Big Five" web companies: Google, Facebook, YouTube, Yahoo, and Twitter
EVOLUTION
- 2002: 22 exabytes (1 EB = 10^18 bytes) of generated and stored digital data
- 2006: 161 exabytes of generated and stored digital data
- Estimate for 2010
  - Yearly produced digital information will be ~988 exabytes (~1 zettabyte)
  - 70% of the generated data comes from individuals (~2 GB per person per year)
  - 95% of the information is (almost) unstructured
  - 25% of the digital information consists of images

Challenges – Petabyte Age (2)
THE WEB IS A HUGE SOURCE OF INFORMATION: SEARCH ENGINES (GOOGLE, YAHOO!)
COLLECT AND STORE BILLIONS OF DOCUMENTS
- 20 PB processed every day at Google (2008)
- Google Search covers about 50 billion web pages
- 40 billion photos are hosted by Facebook
- eBay has 6.5 PB of user data + 50 TB/day (2009)
- Structured data, text, images, video
  - 15 hours of video uploaded to YouTube every minute
- World of Warcraft utilizes 1.3 PB of storage space
- Valve Steam delivers 20 PB of content monthly

Challenges – Petabyte Age (3)
NEW REALITIES
- Cheap digital sensors for everything
- Cheap digital storage (TB disks < $100)
- Everything is data
- Rise of data-driven culture
  - CERN's LHC generates 15 PB a year
  - Sloan Digital Sky Survey (200 GB/night)
  - ...
"The quest for knowledge used to begin with grand theories. Now it begins with massive amounts of data. Welcome to the Petabyte Age."

Attempt at a Definition
WHAT ARE DATABASE SYSTEMS?
- Elmasri/Navathe: "A collection of related data" with the following properties:
  - A database represents a specific part of the real world
  - A database is logically consistent and has a specific meaning
  - A database is designed, built, and filled with data to serve a specific purpose
APPLICATION SCENARIOS OF DATABASES
- Universal approach: efficient storage and processing of data
  - Managing bank accounts
  - Managing customer data in an insurance company
  - Multimedia data, e.g. texts, articles, magazines, and journals of a publisher
  - Estate management
  - ...

Goals of Database Systems
STORAGE
- Efficient management of large and persistent (long-term stored) data
- Convenient and efficient data organization (data independence, layered architecture, data dictionary)
ACCESS
- Easy data access (query language)
- Concurrent user access (transactions)
- Data security (protection against unauthorized access)
ADDITIONAL
- Data integrity (ensuring data consistency)
  - Semantic constraints that data objects must satisfy (e.g.
    employees must be older than ...)
- Fault tolerance and availability (protection against data loss)

Overview

What is in the Lecture?
1. DATABASE USAGE
   - Query
   - Programming
   - Design
2. DATABASE ARCHITECTURE
   - Indexes
   - Transactions
   - Query Processing
3. DATABASE SCALING
   - Scale-out
   - Scale-in

Literature and Copyright
BASIC LITERATURE
- Elmasri, R.; Navathe, S.: "Fundamentals of Database Systems" (5th Edition)
- Ramakrishnan, R.; Gehrke, J.: "Database Management Systems". McGraw-Hill, 2000
- Weikum, G.; Vossen, G.: "Transactional Information Systems". Morgan Kaufmann, 2001
- Hoffer, J.; Prescott, M.; Topi, H.: "Modern Database Management" (9th Edition)
COPYRIGHT NOTE
- The material may contain examples and figures from additional sources, which remain subject to their own copyright constraints
- (Commercial) use of the lecture notes by businesses not related to TU Dresden is explicitly not allowed
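
Two of the goals listed under "Goals of Database Systems", data integrity through semantic constraints and transactional access, can be illustrated with a small sketch. It is not taken from the lecture material: it assumes Python's built-in sqlite3 module, a hypothetical employee table, and a hypothetical minimum age of 18 (the age in the slide's own example is cut off in the source). The helper names create_schema, hire_all, and count_employees are likewise made up for illustration.

```python
import sqlite3

# Hypothetical threshold: the slide's example is truncated, so 18 is an assumption.
MIN_AGE = 18


def create_schema(conn):
    # The CHECK clause is a semantic constraint enforced by the database itself.
    conn.execute(
        "CREATE TABLE employee ("
        "  id INTEGER PRIMARY KEY,"
        "  name TEXT NOT NULL,"
        f"  age INTEGER NOT NULL CHECK (age >= {MIN_AGE}))"
    )


def hire_all(conn, people):
    """Insert all people as one transaction; True on commit, False on rollback."""
    try:
        # Used as a context manager, the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.executemany(
                "INSERT INTO employee (name, age) VALUES (?, ?)", people
            )
        return True
    except sqlite3.IntegrityError:
        # A violated CHECK constraint aborts the whole batch.
        return False


def count_employees(conn):
    return conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]


conn = sqlite3.connect(":memory:")
create_schema(conn)
ok_first = hire_all(conn, [("Ada", 36), ("Edgar", 45)])
ok_second = hire_all(conn, [("Grace", 41), ("Tim", 12)])  # second row violates CHECK
total = count_employees(conn)
print(ok_first, ok_second, total)
```

The second batch is rejected as a whole: the valid row for "Grace" is rolled back together with the invalid one, so the table only ever contains data that satisfies the constraint.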