Data Structures via C++
What is a data structure?

Professor: Shaker El-Sappagh
Galala University
Sh.elsappagh@gmail.com
Shaker El-Sappagh, Data structures, 10/20/2021

Outline
• Course expectations
• Data structures definitions

Course expectations

1. Covered topics
• Definition of data structures
• Quick review of C++ and object-oriented programming
• Algorithm complexity analysis
• Sorting algorithms and searching algorithms
• Recursion
• Lists (array and linked list)
• Stack using array and linked list
• Queue using array and linked list
• Hashing using array and linked list
• Tree using linked lists
• Graph using linked lists

2. The Book
The official book for this course is: "Data Structures Via C++: Objects by Evolution", A. Michael Berman, Oxford Univ. Press (2007).
However, I will also draw on several other materials, and exercises and lab problems will be collected from several books.

3. C++ IDE
C++ development environment: Since this is a programming course, you will need a development environment to compile and run your programs. Students are assumed to be able to install one of the following IDEs. The list is ordered from the highest priority to the lowest.

Name                  Link
Eclipse               https://www.eclipse.org/ide/
C++Builder            https://www.embarcadero.com/products/
Visual Studio Code    https://code.visualstudio.com/
Netbeans              https://netbeans.org/downloads/8.0.1/
Kite                  https://www.kite.com/get-kite/
Codelite              https://codelite.org/
Atom                  https://atom.io/
CLion                 https://www.jetbrains.com/clion/
Codeblocks            http://www.codeblocks.org/

4. Grading

Category        %      Description
Midterm exam    25%    Midterm exam
Final exam      60%    Final exam
Collection      15%    Assignments, practical exam, attendance, class participation, quizzes, research, and seminar.

5. Other points
1. Homework
2. Office hours
3. Communication
4. Collaboration and academic honesty
5. Teaching assistant
6. Exams

Data structures definitions

1. Data structure definitions
Definition 1: A data structure is the organization of information, usually in memory, for better algorithm efficiency (storage and speed): it facilitates operations while keeping the storage the data occupies small.
Definition 2: A data structure consists of a base storage method (e.g., an array) and one or more algorithms that are used to access or modify that data.
A data structure specifies:
(1) Organization of data,
(2) Accessing methods,
(3) Degree of associativity,
(4) Processing alternatives for information.
A data structure is a way of storing data so that it can be used efficiently. An appropriate data structure allows a variety of important operations to be performed while using resources (memory space and execution time) efficiently.
Types of data structures: built-in and user-defined.

1. Data structures definition (Cont’d)
Types of data structure:
• Linear data structures: values are arranged in a linear sequence (e.g., arrays, linked lists, stacks, queues).
• Non-linear data structures: values are not arranged in a linear sequence (e.g., trees, graphs).
Operations performed on data structures: traversing, insertion, deletion, merging, sorting, searching.

1. Data structures definition (Cont’d)
Algorithm + Data Structure = Program
• Algorithm: the step-by-step procedure to solve a problem. It is a well-organized, pre-arranged, and defined computational module that receives a value or set of values as input and produces a value or set of values as output. These well-defined computational steps are arranged in sequence and transform the given input into the output.
• The efficiency of an algorithm depends on its time complexity and space complexity.
• Complexity is a function of input size.
• If the space used by an algorithm is fixed, only the running time is considered when determining its complexity. Three cases are distinguished:
  1. Best case
  2. Worst case
  3. Average case

1. Data structures definition (Cont’d)
• Efficiency of an algorithm: can be determined by measuring the time, space, and other resources it uses when the program executes.
• The amount of time taken by an algorithm can be estimated by counting the number of steps the algorithm executes.
• The space refers to the number of memory units the algorithm requires for storage.

2. Memory: where does everything live?
A C++ application uses two kinds of memory:
1. Static memory, whose size does not grow while the program is executing: compile-time allocation (OS-dependent).
   a. Code instructions,
   b. Static and global variables,
   c. Stack for local variables and function calls (frames are pushed and popped at run time, but the maximum stack size is fixed).
2. Dynamic memory, which can grow while the program is executing: execution-time allocation.
   a. Heap memory: where dynamically allocated objects live. C++ has no garbage collector; memory obtained with new must be released with delete.
Memory layout (diagram): Heap | Stack | Static and global variables | Code

2. Memory (Cont’d)
Example:
#include <iostream>
using std::cout;

int total;                      // global variable: static and global area

int Square(int x)
{
    return x*x;
}

int SquareOfSum(int x, int y)
{
    int z = Square(x+y);        // z is local: it lives in this call's stack frame
    return z;
}

int main()
{
    int a = 4, b = 8;           // local variables: stack
    total = SquareOfSum(a, b);
    cout << "output = " << total;
}
Memory layout (diagram): Stack (e.g., 1 MB from the OS) with one frame per active call, shown as D(), C(), B(), A() | Global | Code

2. Memory (Cont’d)
Example:
#include <iostream>

int main()
{
    int a;                  // local variable: stack
    int *p1, *p2;           // the pointers themselves live on the stack
    p1 = new int;           // one int on the heap
    *p1 = 5;
    p2 = new int[10];       // array of 10 ints on the heap
    *p2 = 100;              // same as p2[0] = 100
    *(p2+1) = 200;          // same as p2[1] = 200
    p2 = new int[5];        // the 10-int array is now unreachable (memory leak)
    delete p1;
    delete[] p2;            // releases only the 5-int array
}
Memory layout (diagram): Stack (e.g., 1 MB from the OS) | Heap | Global | Code

2. Memory (Cont’d)
Limitations of the stack memory:
• The stack can overflow: its size cannot be changed at run time, and its allocation depends on the OS and the available resources.
• For a static array, the size must be known at compile time. You either allocate a very large space or risk overflow.

2. Memory (Cont’d)
Limitations of the heap memory (dynamic memory allocation):
• The heap is a free pool of memory, but…
• Heap memory can grow (dynamic memory allocation) during program execution.
• The heap can run out of memory if system memory is low.
• It is the programmer's responsibility to release memory that is no longer needed.

3. Software development lifecycle: Waterfall methodology
Goal of a software project: develop high-quality software, on time and within budget.
• Initiation. Who performs: manager. Deliverables: request for development. Steps: feasibility study, examination of alternatives, cost-benefit analysis, make-or-buy study.
• Analysis. Who performs: system analyst. Deliverables: functional specification as a part of the legal agreement.
• Design. Who performs: system designer. Deliverables: design, technical, or coding specification.
• Implementation. Who performs: programmer. Deliverables: documented code and documentation.
• Testing. Who performs: system tester. Deliverables: test plan and report.
• Maintenance. Who performs: maintenance programmer. For: bugs, adaptations, enhancements.

3. Software development lifecycle (Cont’d)
Characteristics of a good software system
• Usable: easy to use.
• Reliable: does not crash and produces correct results.
• Maintainable: software problems can be solved easily, and the software can be updated easily.
• Reusable: software is built with reusable components to reduce cost and improve reliability, and it should generate new reusable components for future use.
Good design helps to achieve these goals.

4. Design methodologies
• Top-down design (TDD): decomposes a large problem into smaller ones, each of which can be solved separately. Key concepts: coupling and cohesion.
• Object-oriented design (OOD): the system is broken down into objects. Key concepts: class, object, encapsulation, polymorphism, overloading, messaging, inheritance, interface, hiding, etc.
• Which methodology will we use?

Specification: Video Rental System
• Build a software system to support the operation of a video rental store. The system should automate the process of renting tapes and receiving returned tapes, including calculating and printing patron bills, which may or may not be done at the same time the tape is returned. The system must also give the clerk access to information about tapes, such as the number of copies on the shelf of any given video owned by the store. The system must be able to add new customers and tapes to and remove them from the database. Each patron and each copy of each tape are to be associated with a unique bar-coded label.
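
Sketch: a first top-down step for this specification
A minimal sketch, assuming the specification is first split into three subproblems (transactions, queries, modifications) that are then solved separately. The function names, the one-letter command codes, and the callLog variable are hypothetical and appear nowhere in the specification. Each subproblem is only a stub that records that it was reached, so the top level can be checked before the parts exist.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names: the specification does not prescribe any functions.
// Records which subproblem was reached, so the top level can be checked.
std::vector<std::string> callLog;

// Stubs for the three subproblems; each would be decomposed further later.
void processTransactions()  { callLog.push_back("transactions"); }   // rent, return, bill
void processQueries()       { callLog.push_back("queries"); }        // tape information
void processModifications() { callLog.push_back("modifications"); }  // add/remove patrons, tapes

// Top level: dispatches one clerk command (assumed one-letter codes).
// Returns false if the command is not recognized.
bool videoRentalSystem(char command)
{
    switch (command) {
        case 't': processTransactions();  return true;
        case 'q': processQueries();       return true;
        case 'm': processModifications(); return true;
        default:  return false;
    }
}

// Small driver: exercises every branch of the top level once.
void driver()
{
    callLog.clear();
    assert(videoRentalSystem('t'));
    assert(videoRentalSystem('q'));
    assert(videoRentalSystem('m'));
    assert(!videoRentalSystem('x'));   // an unknown command is rejected
    assert(callLog.size() == 3);       // each stub was reached exactly once
}
```

Calling driver() from main runs the checks. Each stub can then be replaced by a real function, one at a time, without changing the top level.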

Using TDD method: Structure chart for initial decomposition of video rental system
Video Rental System
    Process Transactions
    Process Queries
    Process Modifications

Using TDD method: Structure chart for “process transactions”
Process Transactions
    Rent Tape
        Query Tape
        Validate Patron
        Update Tape
        Update Patron
    Return Tape
        Query Tape
        Update Tape
        Update Patron
    Pay Bill
        Validate Patron
        Compute Bill
        Update Patron

Using TDD method: Final decomposition of video rental system
Video Rental System
    Process Transactions
        Rent Tape
            Query Tape
            Validate Patron
            Update Tape
            Update Patron
        Return Tape
            Query Tape
            Update Tape
            Update Patron
        Pay Bill
            Validate Patron
            Compute Bill
            Update Patron
    Process Queries
        Tape Queries
        Patron Queries
    Process Modifications
        Add Patron
        Delete Patron
        Add Tape
        Delete Tape

Using OOD method: Responsibility-driven design process
• Find the classes.
• Determine the responsibilities of each class.
• Determine who collaborates with each class.
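
Sketch: the three steps applied to two classes
A minimal sketch, assuming two classes taken from nouns in the video rental specification (tape and patron). Responsibilities become member functions, and a collaborator shows up as a type used in another class's interface. All member names (barCode, onShelf, checkOut, checkIn, rent, rentedCount) are hypothetical choices for illustration, not a prescribed design.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Step 1: a class found in the specification.
// Step 2: its responsibilities, written as member functions.
class Tape {
public:
    explicit Tape(std::string barCode) : barCode_(std::move(barCode)) {}

    const std::string& barCode() const { return barCode_; }  // identification data
    bool onShelf() const { return onShelf_; }                 // answers a location query

    void checkOut() { assert(onShelf_);  onShelf_ = false; }  // check self out
    void checkIn()  { assert(!onShelf_); onShelf_ = true;  }  // check self in

private:
    std::string barCode_;
    bool onShelf_ = true;   // assumption: a new tape starts on the shelf
};

// Step 3: Patron collaborates with Tape, because renting needs a Tape.
class Patron {
public:
    explicit Patron(std::string barCode) : barCode_(std::move(barCode)) {}

    const std::string& barCode() const { return barCode_; }

    // Updates the list of rented tapes and asks the tape to check itself out.
    void rent(Tape& tape)
    {
        tape.checkOut();
        rented_.push_back(tape.barCode());
    }

    std::size_t rentedCount() const { return rented_.size(); }

private:
    std::string barCode_;
    std::vector<std::string> rented_;   // bar codes of tapes currently rented
};
```

The design choice to note: Patron never changes a tape's state directly. It sends a message (checkOut) and the Tape object updates itself, which is the encapsulation that OOD relies on.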

Using OOD method: Classes for video rental system
• Tape
• Patron
• Console
• Scanner
• Printer

Using OOD method: Tape class
Class: Tape
Responsibilities:
• Keep track of tape identification data
• Check self out
• Check self in
• Answer queries about location of self

Using OOD method: Patron class
Class: Patron
Responsibilities:
• Keep track of patron identification data
• Update patron identification data
• Update list of rented tapes
• Update billing information

Using OOD method: Console class
Class: Console
Responsibilities:
• Check tapes in and out
• Find the location of tapes
• Find out about patrons
• Add and remove tapes
• Add, update, and remove patrons
• Update patron billing information

Using OOD method: Scanner class
Class: Scanner
Responsibilities:
• Read a bar code and return either a tape or a patron

Using OOD method: Printer class
Class: Printer
Responsibilities:
• Print bills
• Print receipts

Using OOD method: Video rental system: Classes, responsibilities, and collaborators
Class: Tape
Responsibilities: Keep track of tape identification data; Check self in; Check self out; Answer queries about location of self
Collaborators: (none)

Class: Patron
Responsibilities: Keep track of patron identification data; Update patron identification data; Update list of rented tapes; Update billing information
Collaborators: Tape

Class: Console
Responsibilities: Check tapes in and out; Find the location of tapes; Find out about patrons; Add and remove tapes; Add, update, and remove patrons; Update patron billing information
Collaborators: Patron, Tape, Scanner, Printer

Class: Scanner
Responsibilities: Read a bar code and return either a tape or a patron. (This suggests that a database that links bar codes to tapes and patrons is maintained by the Scanner class.)
Collaborators: Tape, Patron

Class: Printer
Responsibilities: Print bills; Print receipts
Collaborators: Tape, Patron

4. Design methodologies (Cont’d)
• Abstract data type (ADT): a well-specified collection of data and a group of operations that can be performed upon the data.
• An ADT’s specification describes what data can be stored (the characteristics of the ADT) and how it can be used (the operations), but not how it is implemented.
• An ADT can be defined formally using a programming language or described informally using English.

5. Software reliability
• Importance of data and the risks (costs) of faulty software.
• Importance of software testing. Testing is the search for software errors.
• Taxonomy of errors:
  - Syntax errors: violations of the formal rules of the programming language.
  - Validity (logical) errors: the program runs but gives wrong answers. They result from misunderstandings between phases.
  - Verification errors: errors in the requirements analysis caused by overlooking important factors (the moon example).
  - Run-time errors: failure to implement the design. They cause program termination or a crash, e.g., misuse of memory or division by zero.
  - Maintenance errors: any of the above errors introduced during software maintenance.

Exercise
• The following function contains a validity error of a sort that’s quite common.
int sum(int a[], int n)
{
    // precondition: a is an array subscripted
    // from 0 to n-1
    int i, total(0);
    for (i = 0; i <= n; i++)
        total += a[i];
    return total;
}
• Find the bug and fix it.

6. Testing approaches
• Black-box testing: test the correspondence between the system's outputs and the expected ones.
  Diagram: an input set chosen according to the system’s requirements, mapped to the corresponding output set according to the system’s requirements.
• Glass-box testing: test as many of the possible paths through the code as feasible.
  Diagram: an input set chosen according to the system’s requirements, mapped to the corresponding output set according to the system’s requirements.

Guidelines for Creating Test Plans
1. Test typical cases
2. Test extreme cases
3. Test invalid inputs

6. Testing approaches (cont’d)
• Unit testing: a unit may be a function, a group of functions (module), or a class, depending on the design methodology used.
• System testing: testing the component interfaces and the system's functional requirements, plus beta testing.
• How to select the test samples? Test typical cases, test extreme cases, test invalid inputs.

6. Testing approaches (cont’d)
Unit testing techniques: unit testing is required because a single system has many developers.
• Drivers: used when you write a unit that is independent of other units. A driver is a separate program whose only purpose is to test a unit (e.g., the max function).
• Stubs: used when a unit contains calls to one or more functions that are not yet available. A stub is a placeholder for a function that has not yet been written. It can print an “I am here” message or return a value.

Function max
int max (int a[], int n)
{
    // (Precondition) assertion 1: a is an array with subscripts ranging from 0 to n-1
    int max_val(a[0]), i;
    for (i = 1; i < n; i++)
        // (Loop invariant) assertion 2:
        // (max_val >= a[k] for 0 <= k < i) and (max_val == a[j] for some j, 0 <= j < i)
        if (max_val < a[i]) max_val = a[i];
    // (Postcondition) assertion 3:
    // (max_val >= a[k] for 0 <= k < n) and (max_val == a[j] for some j, 0 <= j < n),
    // i.e., max_val is equal to the value of the largest int in array a
    return max_val;
}

Code Example 3-1: Test driver for max function
#include <iostream>
using std::cin;
using std::cout;

int max(int a[], int n);

int main()
{
    int a[100], i;

    cout << "Max driver\n";
    cout << "Enter each input to max terminated by -9999\n";
    cout << "Length of input must be <= 100\n";
    for (i = 0; i < 100; i++) {
        int val;
        cin >> val;
        if (val == -9999)   // termination sentinel
            break;
        else
            a[i] = val;
    }
    // i now holds the number of values read; max requires at least one
    cout << "\nMax is " << max(a, i) << '\n';
    cout << "\n\n";
    return 0;
}

Assignment
- Write a function int min(int a[], int n) that finds the smallest item in array a.
- Write a function int maxpos(int a[], int n) that returns the position of the largest item in the array.

Thank you
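
Sketch: a test plan for max written as a non-interactive driver
A minimal sketch, assuming the test-plan guidelines (typical cases, extreme cases, invalid inputs) are applied to the max function from this lecture. The name runMaxTestPlan and the sample arrays are hypothetical. The invalid case n == 0 is deliberately not executed, because it violates the function's precondition and would read a[0] from an empty array.

```cpp
#include <cassert>

// max as given in the lecture (precondition: a has subscripts 0 to n-1, so n >= 1).
int max(int a[], int n)
{
    int max_val(a[0]), i;
    for (i = 1; i < n; i++)
        if (max_val < a[i]) max_val = a[i];
    return max_val;
}

// Hypothetical driver: runs the test plan and returns the number of checks made.
int runMaxTestPlan()
{
    int checks = 0;

    // Typical case: largest value in the middle.
    int typical[] = {3, 9, 4};
    assert(max(typical, 3) == 9);     ++checks;

    // Extreme cases: largest value first, largest value last,
    // a single element, and all values negative.
    int first[] = {10, 1, 2};
    assert(max(first, 3) == 10);      ++checks;
    int last[] = {1, 2, 10};
    assert(max(last, 3) == 10);       ++checks;
    int single[] = {7};
    assert(max(single, 1) == 7);      ++checks;
    int negatives[] = {-5, -2, -8};
    assert(max(negatives, 3) == -2);  ++checks;

    // Invalid input (n == 0) is outside the precondition and is not called here.
    return checks;
}
```

Unlike the interactive driver, this one needs no typing and can be rerun after every change. The same pattern could be reused to check the assignment functions once they are written.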