Uploaded by Bilal Hayat Butt

COURSE OUTLINE
Course Name: Databases & Big Data Technologies
Course Code: TBD
Credit Hours: 3
Course Start: Spr 2024
Course End: Spr 2024
Course Prereq. Name: Nil
Course Prereq. Code: Nil
Course Type: Core
Program: BS (IT&S)
‭Course Short Description‬
‭Unlock the power of data with our core course. Covering fundamental and advanced‬
‭concepts, this course provides a comprehensive exploration of databases and SQL — a‬
‭cornerstone for effective data analysis. From relational database design to NoSQL‬
‭databases, students will master SQL fundamentals, data cleaning, and transformation‬
‭techniques. Dive into advanced topics such as query optimization, data warehousing,‬
‭and geospatial analytics using platforms like Snowflake. With a focus on real-world‬
‭applications, this course equips students with the essential skills to navigate and derive‬
‭insights from diverse datasets, offering a strategic advantage in the dynamic field of‬
‭business analytics.‬
‭CLOs - Course Learning Outcomes‬
Upon completing this course, students will be able to:
1. Demonstrate proficiency in SQL fundamentals and advanced techniques.
2. Gain proficiency in diverse data tools, illustrating the ability to choose and implement
the most suitable data storage solution for different analytical requirements.
3. Apply relational database design principles to create well-structured databases,
demonstrating the capability to design and implement data models that support
efficient data storage and retrieval.
4. Investigate practical uses of big data architectures, with an emphasis on
understanding Pandas DataFrames and scalable data processing through Dask.
Teaching & Learning Methodology
‭Any or all of the following (or a combination) may be used:‬
‭1.‬ ‭Class presentation and demonstrations.‬
‭2.‬ ‭Hands-on sessions.‬
‭3.‬ ‭Guest lectures from industry experts.‬
‭4.‬ ‭Individual / collaborative group projects.‬
‭5.‬ ‭Online learning resources.‬
‭Textbook(s)‬
‭1.‬ ‭The Practical SQL Handbook (Judith Bowman) – 2001‬
Reference Book(s) & Reading Material
1. Joe Celko's SQL Programming Style (Joe Celko) – 2005
2. Joe Celko's SQL for Smarties: Advanced SQL Programming, 5e (Joe Celko) – 2014
3. Joe Celko's Analytics and OLAP in SQL (Joe Celko) – 2006
Grading Policy
Assessment Instrument: Percentage
Take home (Assignments, projects): 20%
In class (Activities, Quiz, participation): 30%
Mid Term Exam: 20%
Final Exam: 30%
Weekly Course Schedule (Tentative)
Week 1: Introduction to Databases
● Overview of databases and their role in business analytics
● Types of databases: relational, NoSQL, NewSQL
● Database management systems (DBMS) and their functionalities
● Introduction to data pipelines
Week 2: Relational Database Design
● Entity-Relationship (ER) modeling
● Normalization and data integrity
● SQL fundamentals: create database, querying, CRUD operations
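As a rough illustration of the create, query, and CRUD bullet above, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The `students` table, its columns, and the sample rows are hypothetical and are not taken from the course material.

```python
import sqlite3

# In-memory database, so nothing is written to disk (illustrative choice).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a hypothetical table and insert a few rows.
cur.execute(
    "CREATE TABLE students ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " program TEXT NOT NULL)"
)
cur.executemany(
    "INSERT INTO students (name, program) VALUES (?, ?)",
    [("Ayesha", "BS (IT&S)"), ("Omar", "BS (IT&S)"), ("Sara", "BBA")],
)

# Read: query with a parameterized filter.
cur.execute(
    "SELECT name FROM students WHERE program = ? ORDER BY name",
    ("BS (IT&S)",),
)
its_students = [row[0] for row in cur.fetchall()]

# Update: change the program of one student.
cur.execute("UPDATE students SET program = ? WHERE name = ?", ("BS (IT&S)", "Sara"))

# Delete: remove one student.
cur.execute("DELETE FROM students WHERE name = ?", ("Omar",))
conn.commit()

remaining = cur.execute("SELECT name, program FROM students ORDER BY id").fetchall()
print(its_students)
print(remaining)
```

The `?` placeholders let the driver handle quoting, which is the usual way to pass values into SQL from application code.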
Weeks 3-4: SQL Fundamentals
● Physical database design considerations
● Implementation of database schemas
● Indexing strategies and performance optimization
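To make the indexing bullet concrete, here is a minimal sketch, again with the built-in sqlite3 module, that inspects SQLite's query plan before and after an index is created. The `orders` table and the `idx_orders_customer` index name are hypothetical. The exact plan wording differs between SQLite versions, so the code only checks whether the plan mentions the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
# 1000 hypothetical orders spread over 100 customers.
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"


def plan_text(connection, sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(row[-1]) for row in rows)


# Without an index the planner has to scan the whole table.
plan_before = plan_text(conn, query)

# With an index on the filter column the planner can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_text(conn, query)

uses_index_before = "idx_orders_customer" in plan_before
uses_index_after = "idx_orders_customer" in plan_after
total = conn.execute(query).fetchone()[0]

print(plan_before)
print(plan_after)
```

The query returns the same total either way; only the access path changes, which is what index-based optimization refers to.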
Weeks 5-6: Advanced SQL Queries
● Joins, subqueries, and set operations
● Aggregation functions and group by clause
● Views, indexes, and stored procedures
Week 7: Introduction to DataFrames
● Introduction to Python Pandas DataFrame: Essential for data analysis.
● Practical Data Manipulation: Hands-on exercises to reinforce learning.
● Advanced Analysis and Optimization: Enhancing skills for efficient data handling.
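The DataFrame bullets above can be sketched as follows, assuming the pandas package is installed. The `sales` and `regions` frames and their column names are made up for illustration. The example filters rows, aggregates with `groupby`, and joins two frames with `merge`, roughly mirroring SQL `WHERE`, `GROUP BY`, and `JOIN`.

```python
import pandas as pd

# Hypothetical sample data, not taken from the course material.
sales = pd.DataFrame(
    {
        "region_id": [1, 1, 2, 2, 3],
        "product": ["A", "B", "A", "B", "A"],
        "amount": [100.0, 50.0, 70.0, 30.0, 20.0],
    }
)
regions = pd.DataFrame(
    {"region_id": [1, 2, 3], "region": ["North", "South", "East"]}
)

# Filter: keep rows with amount >= 50 (comparable to SQL WHERE).
large = sales[sales["amount"] >= 50.0]

# Aggregate: total amount per region (comparable to SQL GROUP BY).
totals = sales.groupby("region_id", as_index=False)["amount"].sum()

# Join: attach region names to the totals (comparable to SQL INNER JOIN).
report = totals.merge(regions, on="region_id", how="inner")

print(large)
print(report)
```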
Study Week
Midterm Assessments
Week 8: Reviewing the Big Data
● A discussion from the perspective of storage, use-case, and business
● Characteristics of Big Data
● Volume, Variety, Velocity, Veracity, Valence
● Getting Value out of Big Data
Weeks 9-10: NoSQL Databases
● Introduction to NoSQL databases: document-based, key-value, columnar, graph-based
● Use cases and advantages of NoSQL databases in analytics
Week 11: Data warehouses, lake houses, and lakes
● Introduction to data warehouses
● ETL and ELT
● OLTP and OLAP
● Data marts and cubes
● Schemas
● Lake houses
● Data lakes
Week 12: Introduction to Snowflake (Data Warehouse)
● Motivation for using Snowflake
● Components within the Snowflake ecosystem
● Loading data and querying it
● Working with semi-structured data, views, and joins within Snowflake
Week 13: Generating insights with Snowflake and Power BI
● Create stages, databases, tables, views, and virtual warehouses
● Load structured and semi-structured data
● Perform analytical queries on data in Snowflake, including joins between tables
Week 14: Introduction to scalable analytics
● Introduction: Overview of parallel computing; Introduction to Dask in Python.
● Dask DataFrame Basics: Fundamentals of Dask DataFrame; Comparison with Pandas.
● Scalable Analysis: Scaling tasks with Dask; Deployment on clusters.
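The idea behind the scalable-analysis bullets can be sketched with the standard library alone. A Dask DataFrame is split into partitions, the same operation runs on each partition, and the partial results are combined. The sketch below imitates that pattern with `concurrent.futures` and plain Python lists. It does not use Dask itself, and the records, partition size, worker count, and helper names (`make_partitions`, `partial_sums`, `combine`) are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (key, value) records standing in for a large table.
records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6)]


def make_partitions(rows, size):
    """Split the rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_sums(partition):
    """Sum the values per key within a single partition."""
    sums = Counter()
    for key, value in partition:
        sums[key] += value
    return sums


def combine(partials):
    """Merge the per-partition sums into one overall result."""
    total = Counter()
    for part in partials:
        total.update(part)  # Counter.update adds values for matching keys
    return dict(total)


partitions = make_partitions(records, size=2)

# Threads keep the sketch simple; CPU-bound work would more typically
# run in separate processes or on a cluster.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(partial_sums, partitions))

result = combine(partials)
print(result)
```

With Dask installed, the same grouped sum would more typically be expressed on a `dask.dataframe` object, for example one built with `dd.from_pandas(df, npartitions=3)`, and evaluated by calling `.compute()` on the result.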
Study Week
Final Exam