Course Project Overview
- Upcoming deliverables
- Overview of Project 1
  - Example project
- Overview of Project 2
  - Example project
- Lessons learned
- Project status

Upcoming Deliverables
- The project counts for 40% of your overall grade
- (35%) Phase 1: design and implementation report
  - Due: February 26 in class
- (65%) Phase 2: demo, source code, and final project report
  - Due: by April 22 at the latest; we will schedule demo dates and times a little later in the semester

Project Option 1
- Phase 1: Design and implementation of the Storage Manager
- Phase 2: Query Processor

Project Option 2
- Phase 1: Overview of tinySQL and design of new features
- Phase 2: Implementation of extensions

Project Option 1: Phase 1
- Implement a Storage Manager to handle:
  - Creation of relations and indexes
  - Inserting records into a file
  - Inserting key values and addresses into the index
  - Fetching a page
- Maintain a system catalog with:
  - Name of a relation
  - Number of tuples in the relation
  - Indexed columns, etc.

Project Option 1: Phase 1 Required Items
- Set of relations and indexes created
- Set of functions provided by the Storage Manager
- List of simplifications

Project Option 1: Phase 2
- Implement the Query Engine components:
  - Scanner/parser/validator
  - Query optimizer
  - Query code generator
  - Runtime database processor

Example Project Option 1
- Support a limited set of SQL
  - Only three relations per select, etc.
  - Only batch inserts
- Simplified use of indexes
  - Only for select statements
  - Not for joins or projects
  - Only one index used per select
- Maintain intermediate results for debugging
  - Parsing, logical query plan, optimized query plans, etc.
  - Output helpful info after each stage

Project Option 2: Phase 1
- Study an open-source DB software package such as tinySQL
- Understand implementation details, such as how it handles:
  - Creation of relations and indexes, if any
  - Inserting records into a file
  - Inserting key values and addresses into the index
  - Fetching a page
  - Basic query processing steps
  - Query optimization techniques used, if any

Project Option 2: Phase 1 Required Items
- Identify the open-source product you have chosen
- Identify the basic DB functions that it supports
- Identify the new DB features you plan to add
- For each feature, identify the list of basic components that you need to write and incorporate into the chosen open-source package
- Identify potential benefits and difficulties of adding the new features

Project Option 2: Phase 2
- Implement your new features, like:
  - An improved query processor
    - Join, aggregate, and nested queries
  - Support for larger data files
  - New query optimization methods
  - Support for online aggregation/approximate queries
  - Security features
  - …

Example Project Option 2
- Aggregates (MIN, MAX, AVG, COUNT)
  - Necessary to modify the underlying data structures; requires modifying the grammar and the parser
  - Simplify to only allow either all or no aggregates in a select
- DISTINCT
  - Requires modifying the grammar and the parser
  - Insert a new method after a normal select has run to filter out duplicates
- OR functionality
- Web GUI
  - Write a wrapper class to handle web-based queries and send results to nicely formatted HTML pages

Lessons learned from past years
- The earlier you start, the better
- Get basic code working as early as possible, then add improvements
- Don’t neglect the report. Be sure to write clearly, with:
  - A mixture of high-level design description and low-level implementation details
  - Technical details explaining your algorithms, assumptions, and simplifications

As you move on to Phase 2, remember:
- Use good coding style!
  - Documentation, comments, naming conventions
- Design for debugging and testing
  - Generate intermediate results (e.g., for Option 1, print the generated logical query plan)
  - Send help and debugging info to standard output/GUI
- Prepare well-designed test cases for the demo walkthroughs
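
The advice to generate intermediate results and send debugging info to standard output can be sketched as a small staged pipeline. This is only an illustration under loud assumptions: the function names (`parse`, `to_logical_plan`, `optimize`, `run_stages`), the tuple-based plan format, and the toy SQL subset (a single-table SELECT with an optional WHERE) are all hypothetical and do not come from the course materials or from tinySQL.

```python
# Hypothetical sketch: every name and the plan format below are assumptions.

def parse(sql):
    """Toy parser for 'SELECT cols FROM table [WHERE predicate]'."""
    text = sql.strip().rstrip(";")
    upper = text.upper()
    from_at = upper.index(" FROM ")
    where_at = upper.find(" WHERE ")
    columns = [c.strip() for c in text[len("SELECT "):from_at].split(",")]
    if where_at == -1:
        table, predicate = text[from_at + 6:].strip(), None
    else:
        table = text[from_at + 6:where_at].strip()
        predicate = text[where_at + 7:].strip()
    return {"columns": columns, "table": table, "predicate": predicate}


def to_logical_plan(parsed):
    """Build a naive plan: project(select(scan(table)))."""
    scan = ("scan", parsed["table"])
    select = ("select", parsed["predicate"], scan)
    return ("project", parsed["columns"], select)


def optimize(plan):
    """One toy rewrite: drop a select node that has no predicate."""
    op, columns, child = plan
    if child[0] == "select" and child[1] is None:
        child = child[2]
    return (op, columns, child)


def run_stages(sql, log=print):
    """Run each stage in order, logging and keeping every intermediate result."""
    stages = [("parse", parse),
              ("logical plan", to_logical_plan),
              ("optimized plan", optimize)]
    results = {}
    value = sql
    for name, stage in stages:
        value = stage(value)
        results[name] = value
        log(f"[{name}] {value}")
    return results


if __name__ == "__main__":
    run_stages("SELECT name FROM student WHERE gpa > 3")
```

Passing a different `log` callable (for example, one that appends to a GUI text area) redirects the same stage-by-stage output without touching the stages themselves, and the returned dictionary can be reused in test cases for the demo.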
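
The Storage Manager functions listed for Project Option 1, Phase 1 (creating relations and indexes, inserting records, inserting key values and addresses into an index, fetching a page, and maintaining a system catalog) might be organized roughly as follows. This is a minimal in-memory sketch, not a reference design: the class name, method names, the `PAGE_CAPACITY` value, and the `(page number, slot number)` address format are all assumptions, and a real submission would read and write pages on disk.

```python
# Hypothetical in-memory sketch; names and layout are assumptions, not course requirements.

PAGE_CAPACITY = 4  # records per page, chosen arbitrarily for the example


class StorageManager:
    def __init__(self):
        # System catalog: relation name -> columns, tuple count, indexed columns.
        self.catalog = {}
        # "Files": relation name -> list of pages; each page is a list of records.
        self.files = {}
        # Indexes: (relation, column) -> {key value: [(page_no, slot_no), ...]}.
        self.indexes = {}

    def create_relation(self, name, columns):
        self.catalog[name] = {"columns": list(columns),
                              "num_tuples": 0,
                              "indexed_columns": []}
        self.files[name] = []

    def create_index(self, name, column):
        # Index any records already stored, then register the index in the catalog.
        index = {}
        for page_no, page in enumerate(self.files[name]):
            for slot_no, record in enumerate(page):
                index.setdefault(record[column], []).append((page_no, slot_no))
        self.indexes[(name, column)] = index
        self.catalog[name]["indexed_columns"].append(column)

    def insert_record(self, name, record):
        pages = self.files[name]
        if not pages or len(pages[-1]) >= PAGE_CAPACITY:
            pages.append([])  # last page is full (or missing): start a new page
        pages[-1].append(dict(record))
        address = (len(pages) - 1, len(pages[-1]) - 1)
        self.catalog[name]["num_tuples"] += 1
        # Insert the key value and record address into every index on this relation.
        for column in self.catalog[name]["indexed_columns"]:
            self.indexes[(name, column)].setdefault(record[column], []).append(address)
        return address

    def fetch_page(self, name, page_no):
        return self.files[name][page_no]

    def lookup(self, name, column, key):
        # Follow index addresses back to the stored records.
        addresses = self.indexes[(name, column)].get(key, [])
        return [self.files[name][page_no][slot_no] for page_no, slot_no in addresses]
```

A short usage example: after `sm = StorageManager()`, `sm.create_relation("student", ["id", "name"])`, and `sm.create_index("student", "id")`, each call to `sm.insert_record("student", {...})` returns the record's address and keeps both the index and the catalog's tuple count up to date.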
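
Two ideas from the Option 2 example, filtering out duplicates in a separate step after a normal select has run and allowing either all or no aggregates in a select, can be sketched as below. The row format (a list of dictionaries), the `(function name or None, column)` representation of select items, and all function names are assumptions made for illustration; tinySQL's actual data structures are not shown here.

```python
# Hypothetical sketch; the data representation and names are assumptions.

AGGREGATES = {
    "MIN": min,
    "MAX": max,
    "COUNT": len,
    "AVG": lambda values: sum(values) / len(values),
}


def uses_aggregates(select_items):
    """Return True if every item is an aggregate, False if none is.

    Raises ValueError for a mix, matching the all-or-none simplification.
    """
    flags = [func is not None for func, _ in select_items]
    if any(flags) and not all(flags):
        raise ValueError("select must contain either all aggregates or none")
    return any(flags)


def evaluate(select_items, rows):
    """Evaluate a select list over rows (a list of dicts); returns a list of tuples."""
    if uses_aggregates(select_items):
        return [tuple(AGGREGATES[func]([row[col] for row in rows])
                      for func, col in select_items)]
    return [tuple(row[col] for _, col in select_items) for row in rows]


def distinct(result_rows):
    """Post-processing step: drop duplicate result rows, keeping first occurrences."""
    seen = set()
    unique = []
    for row in result_rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique
```

Keeping `distinct` as a separate pass over the finished result means the normal select path stays unchanged; the trade-off is that the full result is materialized before duplicates are removed.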