COURSE OUTLINE

Course Name: Databases & Big Data Technologies
Course Code: TBD
Credit Hours: 3
Course Type: Core
Program: BS (IT&S)
Course Start: Spr 2024
Course End: Spr 2024
Course Prereq. Name: Nil
Course Prereq. Code: Nil

Course Short Description
Unlock the power of data with our core course. Covering fundamental and advanced concepts, this course provides a comprehensive exploration of databases and SQL, a cornerstone of effective data analysis. From relational database design to NoSQL databases, students will master SQL fundamentals, data cleaning, and transformation techniques. Dive into advanced topics such as query optimization, data warehousing, and geospatial analytics using platforms like Snowflake. With a focus on real-world applications, this course equips students with the essential skills to navigate and derive insights from diverse datasets, offering a strategic advantage in the dynamic field of business analytics.

CLOs - Course Learning Outcomes
Upon having completed this course, the students will be able to:
1. Demonstrate proficiency in SQL fundamentals and advanced techniques.
2. Gain proficiency in diverse data tools, illustrating the ability to choose and implement the most suitable data storage solution for different analytical requirements.
3. Apply relational database design principles to create well-structured databases, demonstrating the capability to design and implement data models that support efficient data storage and retrieval.
4. Investigate practical uses of big data architectures, with an emphasis on understanding Pandas DataFrames and scalable data processing through Dask.

Teaching & Learning Methodology
Any or all of the following (or a combination) may be used:
1. Class presentation and demonstrations.
2. Hands-on sessions.
3. Guest lectures from industry experts.
4. Individual / collaborative group projects.
5. Online learning resources.

Textbook(s)
The Practical SQL Handbook (Judith Bowman), 2001

Reference Book(s) & Reading Material
1. Joe Celko's SQL Programming Style (Joe Celko), 2005
2. Joe Celko's SQL for Smarties: Advanced SQL Programming, 5e (Joe Celko), 2014
3. Joe Celko's Analytics and OLAP in SQL (Joe Celko), 2006

Grading Policy
Assessment Instrument                           Percentage
Take home (Assignments, projects)               20%
In class (Activities, Quiz, participation)      30%
Mid Term Exam                                   20%
Final Exam                                      30%

Weekly Course Schedule (Tentative)

Week 1: Introduction to Databases
● Overview of databases and their role in business analytics
● Types of databases: relational, NoSQL, NewSQL
● Database management systems (DBMS) and their functionalities
● Introduction to data pipelines

Week 2: Relational Database Design
● Entity-Relationship (ER) modeling
● Normalization and data integrity
● SQL fundamentals: create database, querying, CRUD operations

Weeks 3-4: SQL Fundamentals
● Physical database design considerations
● Implementation of database schemas
● Indexing strategies and performance optimization

Weeks 5-6: Advanced SQL Queries
● Joins, subqueries, and set operations
● Aggregation functions and group by clause
● Views, indexes, and stored procedures

Week 7: Introduction to DataFrames
● Introduction to Python Pandas DataFrame: Essential for data analysis.
● Practical Data Manipulation: Hands-on exercises to reinforce learning.
● Advanced Analysis and Optimization: Enhancing skills for efficient data handling.
Study Week
Midterm Assessments

Week 8: Reviewing Big Data
● A discussion from the perspective of storage, use case, and business
● Characteristics of Big Data
● Volume, Variety, Velocity, Veracity, Valence
● Getting Value out of Big Data

Weeks 9-10: NoSQL Databases
● Introduction to NoSQL databases: document-based, key-value, columnar, graph-based
● Use cases and advantages of NoSQL databases in analytics

Week 11: Data warehouses, lake houses, and lakes
● Introduction to data warehouses
● ETL and ELT
● OLTP and OLAP
● Data marts and cubes
● Schemas
● Lake houses
● Data lakes

Week 12: Introduction to Snowflake (Data Warehouse)
● Motivation for using Snowflake
● Components within the Snowflake ecosystem
● Loading data and querying it
● Working with semi-structured data, views, and joins within Snowflake

Week 13: Generating insights with Snowflake and Power BI
● Create stages, databases, tables, views, and virtual warehouses
● Load structured and semi-structured data
● Perform analytical queries on data in Snowflake, including joins between tables

Week 14: Introduction to scalable analytics
● Introduction: Overview of parallel computing; Introduction to Dask in Python.
● Dask DataFrame Basics: Fundamentals of Dask DataFrame; Comparison with Pandas.
● Scalable Analysis: Scaling tasks with Dask; Deployment on clusters.

Study Week
Final Exam