1.1 Describe the role of an Operating System (OS).
A. An Operating System (OS) is a software program that controls the hardware and software resources of a computer system and provides a platform for other software to run on. The role of an OS can be broken down into several key functions:
B. Resource management: An OS is responsible for managing and allocating resources such as memory, CPU time, and storage to the applications and processes running on the system. This ensures that each application gets the resources it needs to function correctly, while preventing any one application from monopolizing the system.
C. Memory management: An OS is responsible for managing the memory of the computer, including allocating memory to processes, managing virtual memory, and handling issues such as fragmentation.
D. Process management: An OS is responsible for creating, scheduling, and terminating processes, as well as managing inter-process communication and synchronizing access to shared resources.
E. Security: An OS is responsible for ensuring the security of the system, protecting against unauthorized access and malicious software.
F. Hardware management: An OS is responsible for managing and communicating with the computer's hardware, such as the CPU, memory, storage, and input/output devices.
G. File management: An OS is responsible for managing the file system and providing a consistent and organized way for applications to access files and directories on the system.
H. Networking: Many modern operating systems include networking capabilities, allowing the computer to connect to other devices and networks and to manage the flow of data between them.
I. In summary, an Operating System is foundational software that controls the resources of a computer and provides an environment for other software to run on. It plays a crucial role in resource, memory, process, hardware, and file management, as well as security and networking.
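The process-management and inter-process communication roles described in 1.1 are what user programs see through system calls. A minimal Python sketch (the child program and its message are invented for illustration) of the OS creating a child process and mediating communication through an OS-managed pipe:

```python
import subprocess
import sys

# Ask the OS to create a child process running a new Python interpreter.
# The OS allocates the child memory and CPU time, and wires up a pipe
# (a simple IPC channel) between parent and child for us.
child = subprocess.run(
    [sys.executable, "-c", "print('hello from the child process')"],
    capture_output=True,
    text=True,
)

# The parent reads the child's output through the OS-managed pipe,
# and the OS reports the child's exit status back to the parent.
print(child.stdout.strip())
print("child exit status:", child.returncode)
```

Every step here — process creation, scheduling of the child, the pipe, and the exit status — is a service provided by the operating system, not by the Python program itself.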
1.2 Firm real-time Operating Systems.
1.2.1 What is a firm real-time operating system? Cite two examples of firm real-time systems and justify your answer.
A. A firm real-time operating system (RTOS) is one that schedules tasks against strict deadlines, where a result delivered after its deadline has no value and is discarded, but an occasional missed deadline degrades service rather than causing catastrophic failure (which distinguishes firm from hard real-time). Such operating systems are designed for applications that require a high degree of predictability, such as control systems, embedded systems, and other time-sensitive applications.
B. Two examples of firm real-time systems are VxWorks and QNX.
a. VxWorks: VxWorks is a popular RTOS that is widely used in applications such as aerospace, defence, industrial automation, and medical devices. It has a reputation for high reliability and real-time performance, and it provides features such as preemptive multitasking, memory protection, and inter-process communication.
b. QNX: QNX is another popular RTOS, widely used in automotive systems, industrial control systems, and medical devices. It is known for its high reliability, real-time performance, and ability to handle many concurrent processes. QNX also provides a microkernel architecture, real-time process scheduling, and inter-process communication.
C. Both VxWorks and QNX can guarantee that specific tasks will be scheduled to complete within a specified period of time, which makes them fit the definition of a firm real-time operating system. They are widely used in industries that require real-time performance and high reliability, such as aerospace, defence, industrial automation, and medical devices.
1.2.2 What sort of decisions would you take if you were in charge of designing Scheduling and Virtual Memory for a firm real-time OS?
A. If I were in charge of designing scheduling and virtual memory for a firm real-time OS, I would take the following decisions:
a. Scheduling: I would choose a scheduling algorithm tailored for real-time systems, such as Earliest Deadline First (EDF) or Rate Monotonic (RM) scheduling. These algorithms are designed to meet real-time deadlines and guarantee that the most critical tasks complete on time.
b. Priority inversion: I would also take into account priority inversion, the situation in which a lower-priority task holds a resource that a higher-priority task needs. To avoid this, I would implement priority inheritance or priority ceiling protocols.
c. Virtual memory: I would design a virtual memory management system that can meet real-time requirements such as predictable response times and low overhead. One approach is to use a fixed page size and preallocate pages to specific tasks to minimize page faults.
d. Memory allocation: I would also design a memory allocation strategy suitable for real-time systems, for instance fixed-size blocks preallocated to specific tasks, to minimize fragmentation and improve predictability.
e. Memory protection: I would ensure that the virtual memory management system includes memory protection mechanisms so that tasks cannot interfere with each other's memory.
f. Memory footprint: I would also consider techniques such as memory compression or memory prioritization to reduce memory usage, as memory is a critical resource in real-time systems.
B. In summary, designing scheduling and virtual memory for a firm real-time OS requires a thorough understanding of real-time requirements and constraints. I would take decisions that prioritize predictable response times, low overhead, and efficient use of resources.
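The EDF policy mentioned in point (a) can be sketched with a toy, non-preemptive simulation in Python; the task names, runtimes, and deadlines below are invented for illustration:

```python
import heapq

# Each task: (absolute deadline, name, runtime in time units).
# EDF always runs the ready task whose deadline is closest.
tasks = [(10, "telemetry", 2), (4, "sensor_read", 1), (7, "actuator", 3)]

ready = list(tasks)
heapq.heapify(ready)  # min-heap ordered by deadline

clock = 0
schedule = []
missed = []
while ready:
    deadline, name, runtime = heapq.heappop(ready)
    clock += runtime
    schedule.append(name)
    if clock > deadline:
        # in a firm real-time system this late result would be discarded
        missed.append(name)

print("order:", schedule)
print("missed:", missed)
```

With these parameters the earliest-deadline task runs first and every deadline is met; raising a runtime or tightening a deadline shows tasks landing in the missed list instead.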
The scheduling algorithm should be tailored for real-time systems, ensuring that the most critical tasks complete on time, and the virtual memory management system should include features such as a fixed page size, preallocated pages, memory protection, and memory compression or prioritization.
1.3 Draw a diagram of a processor showing its basic components, including (named) registers, with also a representation of the attached memory.
A. A basic processor and its attached memory can be sketched as follows:

    +------------------------------------+
    |                CPU                 |
    |  +--------------+   +---------+    |
    |  | Control Unit |   |   ALU   |    |
    |  +--------------+   +---------+    |
    |  Registers:                        |
    |    PC  (Program Counter)           |
    |    IR  (Instruction Register)      |
    |    MAR (Memory Address Register)   |
    |    MDR (Memory Data Register)      |
    |    SP  (Stack Pointer), FLAGS      |
    |    R0..Rn (general-purpose)        |
    +------------------+-----------------+
                       | address, data and control buses
    +------------------+-----------------+
    |               Memory               |
    |   (instructions and data stored    |
    |    in addressable cells)           |
    +------------------------------------+

1.4 Give the reason(s) why processors have several levels of caches of different sizes instead of including a single large cache.
A. Processors have several levels of caches of different sizes, instead of a single large cache, for several reasons:
a. Cost: A larger cache requires more die area and power, which increases the cost of the processor. Multiple levels of cache allow a balance between cost and performance.
b. Latency: A single large cache would have a larger access latency than smaller caches. With multiple levels, the processor can quickly access data in the smaller, faster levels, reducing how often it has to reach the larger, slower levels or main memory.
c. Bandwidth: A single large cache would also consume a large amount of the processor's bandwidth, potentially slowing down other operations. Multiple levels of cache distribute the workload and balance the usage of resources.
d. Power consumption: A large cache consumes more power than smaller caches, which is a concern for portable or low-power devices. Multiple cache levels help reduce power consumption.
e. Flexibility: Multiple levels of cache allow better flexibility in cache management. The processor can use different organizations for different levels, for example a direct-mapped or low-associativity L1 cache and a highly associative L2 cache.
B.
In summary, processors have several levels of caches of different sizes instead of a single large cache because this balances cost and performance, reduces latency, distributes the workload, reduces power consumption, and provides flexibility in cache management.
1.5 CPUs can raise an exception when malicious programs try to write data or execute code in memory areas that are respectively setup as read-only or non-executable. Briefly describe how CPUs can quickly detect that these areas are not accessed in an appropriate way.
A. CPUs can quickly detect that memory areas designated as read-only or non-executable are being accessed inappropriately by using hardware protection mechanisms that check permissions on every access.
B. Memory Management Unit (MMU): On processors with virtual memory, every page-table entry carries permission bits (read, write, execute). The MMU checks these bits in hardware on each memory access; because the bits are cached in the TLB alongside the address translation, the check adds no extra memory traffic. A write to a read-only page, or an instruction fetch from a non-executable page, immediately raises a page-fault exception.
C. Memory Protection Unit (MPU): On simpler processors without an MMU, a Memory Protection Unit enforces memory protection by checking every access against a small set of configured regions. Regions can be marked read-only or non-executable, and the MPU raises an exception if a program tries to write to or execute code in those regions.
D. Data Execution Prevention (DEP): Data Execution Prevention is the operating system's use of the hardware no-execute (NX) bit to mark regions such as the stack and heap as non-executable. This prevents malicious programs from executing code in those areas even if they manage to write data there; an attempted instruction fetch from such a region raises an exception.
E. Address Space Layout Randomization (ASLR): ASLR is a complementary, software-level mitigation rather than a detection mechanism. It randomizes the location of system libraries, stack, heap, and other key data structures in memory, making it difficult for an attacker to predict where to aim buffer-overflow and other memory-based attacks.
F. In summary, CPUs detect inappropriate accesses to read-only or non-executable areas through permission checks performed in hardware on every access, by the MMU's page-table permission bits or by an MPU, with DEP/NX marking data regions non-executable and ASLR complementing these checks by making attacks harder to aim.
1.6 Imagine a disk (or a piece of memory) that is partitioned into blocks of fixed size (e.g. 4KBs). The allocated blocks are written into a table (i.e., an array) t containing n entries, which indicates the start address of each block in use. For some reason, these blocks need to be compacted as they are randomly allocated and therefore, memory is fragmented. Detail an algorithm that will compact these blocks more efficiently. The algorithm itself needs to be efficient and can be described using pseudo-code, but you will need to make sure the calculation of the addresses is correct.
A. Here's an algorithm that compacts these blocks efficiently. It assumes the in-use entries of t are kept in increasing order of start address, so that sliding a block towards the start of memory never overwrites a live block:
B. Initialize two indices, current and free, both pointing to the first entry of the table t.
C. If t[current] is in use, slide its block down to address current * block_size, record that new start address in t[current], and advance current.
D. If t[current] is unallocated, advance free from current + 1 until an in-use entry is found.
E. Move the block found at free to address current * block_size, store that address in t[current], mark t[free] as unallocated, and advance current.
F. Repeat steps C to E until free runs past the end of the table.
G. Set the remaining entries of t to unallocated and return t.
H.
Pseudocode for the algorithm:

    function compact_blocks(table t, int n, int block_size):
        current = 0
        while current < n:
            if t[current] != UNALLOCATED:
                if t[current] != current * block_size:
                    # slide an in-use block down to its compacted position
                    copy block_size bytes from address t[current] to address current * block_size
                    t[current] = current * block_size
                current = current + 1
                continue
            # find the next in-use entry after current
            free = current + 1
            while free < n and t[free] == UNALLOCATED:
                free = free + 1
            if free >= n:
                break
            copy block_size bytes from address t[free] to address current * block_size
            t[current] = current * block_size
            t[free] = UNALLOCATED
            current = current + 1
        for i from current to n - 1:
            t[i] = UNALLOCATED
        return t

I. This algorithm uses two indices, current and free, to walk through the table t. current tracks the next compacted slot, and free finds the next in-use entry whenever current lands on an unallocated one. Each live block is copied to address current * block_size, so after compaction the k-th block in use starts exactly at k * block_size and all live blocks are contiguous from the start of memory, eliminating fragmentation. Because blocks are processed in increasing address order, the destination of every copy lies at or below its source and never overlaps a block that has not yet been moved. The remaining entries of t are then set to unallocated and the table is returned.
J. The table is scanned only once, since current and free each advance monotonically, so the bookkeeping costs O(n); on top of that, each live block is copied at most once, which is unavoidable for any compaction scheme. This makes it an efficient algorithm for compacting the blocks.
1.7 A system consists of four resources of the same type, being shared by three processes, each of which needs at most two resources. Show that the system is deadlock-free.
A. A deadlock can occur only when processes hold some resources while waiting for others, forming a circular wait. In this case the system has four resources of the same type, shared by three processes, each of which needs at most two resources.
B. Consider the worst case: every process holds one resource and waits for a second. That accounts for only 3 of the 4 resources, so at least one resource is still free. It can be granted to one of the waiting processes, which then holds its maximum of two resources, runs to completion, and releases both. The released resources let the remaining processes finish in turn, so a circular wait can never persist.
C. More formally, the system is deadlock-free because the sum of the maximum demands, 2 + 2 + 2 = 6, is strictly less than the number of processes plus the number of resources, 3 + 4 = 7: with only three processes, at most three resources can be held while waiting, which always leaves one resource free to unblock some process.
D. In summary, with four resources, three processes, and a maximum demand of two each, some process can always obtain its full allocation and complete, so the system is deadlock-free.
2. SIMD computing can accelerate calculations tremendously. X86 processors usually support various sets of SIMD instructions including MMX, SSE, AVX2 and AVX3. Answer the following questions.
2.1.1 Let’s imagine having 8 single-precision floating values to be processed with the same operation many times. Which of the three previously mentioned SIMD sets should we use? Explain.
A. If we have 8 single-precision floating-point values to be processed with the same operation many times, the best choice among the mentioned sets is AVX2 (Advanced Vector Extensions 2).
B. AVX2 provides 256-bit wide registers, which hold exactly 8 single-precision values (8 x 32 bits = 256 bits), so one instruction processes all 8 values at once. MMX (64-bit) and SSE (128-bit, i.e. 4 single-precision values) are not wide enough to handle the 8 values simultaneously, while AVX3 (AVX-512) would leave half of each 512-bit register unused for this workload.
C. Using AVX2 allows us to perform the same operation on 8 single-precision values at once, which can yield significant performance improvements compared to operating on each value individually. Additionally, AVX2 provides many more instructions than MMX and SSE, which allows a wider range of operations on the values.
D.
It is worth noting that AVX2 and AVX-512 (the latter sometimes called AVX3) extend the original AVX with more instructions and, in the case of AVX-512, wider and more numerous registers, which could further improve performance if the processor supports them.
2.1.2 How many 16-bit integer operations a single AVX4 instruction would be able to hypothetically process? Detail your answer.
A. AVX4 is not a standard instruction set extension for x86 processors, but if it followed the historical doubling of SIMD register width (SSE: 128 bits, AVX/AVX2: 256 bits, AVX-512: 512 bits), a hypothetical AVX4 would provide 1024-bit wide registers. Each 1024-bit register can hold 1024 / 16 = 64 16-bit integers, so a single AVX4 instruction would be able to process 64 16-bit integer operations at once.
B. For comparison, AVX-512, the widest extension actually available, provides 512-bit registers that hold 512 / 16 = 32 16-bit integers per instruction.
2.1.3 Let’s imagine we have a 2048 bit-wide SIMD architecture able to carry out Fused-Multiply-And-Add instructions on single precision floating point values. The processor supporting these instructions has 8 cores running at 2GHz and has an average IPC of two. What would be the theoretical processing power of such an architecture (in TFlops)?
A. The theoretical processing power of such an architecture can be calculated as follows:
a. Instructions per second: clock rate (2 GHz) x cores (8) x IPC (2) = 32 x 10^9 instructions/s.
b. Single-precision lanes per instruction: 2048 bits / 32 bits per single-precision value = 64.
c. Floating-point operations per lane: a Fused-Multiply-And-Add performs a multiply and an add, i.e. 2 floating-point operations.
d. Total floating-point operations per second: 32 x 10^9 x 64 x 2 = 4096 x 10^9 = 4.096 x 10^12 FLOP/s.
e. Theoretical processing power: 4.096 x 10^12 FLOP/s / 10^12 = about 4.1 TFlops (2.048 TFlops if an FMA is counted as a single operation).
B. So, the theoretical processing power of this architecture is roughly 4.1 TFlops. It is important to note that this is only a theoretical peak; in practice the actual performance will depend on factors such as memory bandwidth, memory access patterns, and instruction-level parallelism.
2.1.4 As SIMD instructions work on a lane basis, it is often difficult to implement algorithms that do not work blockwise. Sorting algorithms are such an example, and we would like to design an architecture that accelerates sorting integers using the quicksort algorithm. For this purpose, you are in charge of designing a few novel instructions to accelerate the pivoting of elements. How many new instructions will you need? Detail the input parameters and the output of these new functions. Make sure the proposed instructions are realistic, for instance, have a limited number of operands and will not create an excessive transistor footprint on the processor. Figures are also welcome if supported with text.
A. To accelerate the pivoting of elements in the quicksort algorithm using SIMD instructions, I would propose the following new instructions:
a. "SIMD swap": This instruction would swap two lanes of a vector register. Inputs: one vector register and the two lane indices (encoded as immediates or held in a general-purpose register). Output: the vector register with the two lanes exchanged. With at most three operands, the instruction format stays realistic.
b. "SIMD compare and swap": This instruction would compare two lanes of a vector register and swap them if they are out of order. Inputs: one vector register and the two lane indices. Output: the vector register with the two lanes in the correct order.
c. "SIMD partition": This instruction would partition the elements in a SIMD register based on a pivot element.
It would take two input parameters: a vector register holding the elements and the pivot value (or the lane index of the pivot). The output would be a vector register with the elements partitioned into two groups, elements less than the pivot packed into the low lanes and elements greater than or equal to the pivot in the high lanes, together with the count of the first group (written, for example, to a general-purpose register) so the algorithm knows where the boundary lies.
B. With these new instructions, the quicksort algorithm can be implemented more efficiently: SIMD swap and SIMD compare and swap sort elements within a register, while SIMD partition splits the elements around the pivot in a single instruction.
C. Three new instructions are required. Each has at most three operands and operates only on register contents, so they would not create an excessive transistor footprint on the processor.
D. It is important to note that these proposed instructions are simplified; on their own they may not fully optimize quicksort. Other techniques, such as multithreading, parallel merging, or sorting algorithms better suited to SIMD architectures, like radix sort or sorting networks, would be needed to obtain the full benefit.
2.2.1 Developing a SIMD lane shuffle operator that is easier to use for shuffling data. By some extraordinary phenomenon, you are sent to the past in 1963 to help design an operating system for sending Neil, Buzz and Michael to the moon. While you cannot change the primitive hardware being used at that time, you have the opportunity to incorporate your 21st-century knowledge of operating systems into the design. We assume that the system used has all the characteristics of a modern computer and that information can be displayed on a screen. What kind of (operating) systems are we looking for when working on aeronautical systems?
A.
When working on aeronautical systems, such as the Apollo program that sent Neil, Buzz, and Michael to the moon, it is important to have an operating system that is highly reliable, robust, and able to handle a wide range of tasks. The following features would be key in such an operating system:
a. Real-time capabilities: The system needs to respond to real-time events and critical situations in a timely manner, such as detecting and reacting to potential hazards during flight.
b. Error detection and recovery: The system must be able to detect and recover from errors, such as hardware or software failures, in order to ensure the safety of the mission.
c. Resource management: The system must manage and allocate resources, such as memory and processing power, effectively in order to ensure smooth operation of all tasks.
d. Security: The system must protect against unauthorized access and maintain the confidentiality and integrity of data.
e. Communication: The system must communicate effectively with other systems and devices, such as ground stations and the other spacecraft modules, in order to exchange data and commands.
f. Data management: The system must store and retrieve data, such as telemetry and sensor readings, in a reliable and efficient manner.
g. Flexibility: The system should adapt to changing requirements and handle a wide range of tasks, such as navigation, guidance, and control.
h. Scalability: The system should handle a large number of tasks and be scalable up or down as needed.
i. Data shuffling: Tying back to 2.2.1, an operator that efficiently moves and rearranges data between the lanes of a register would make it easier to work with large data sets, although on 1963 hardware this would have to be a modest software routine rather than a SIMD instruction.
B. It is worth noting that the technology and hardware available in 1963 could barely handle the complexity and demands of an aeronautical system like the Apollo program, but incorporating knowledge of modern operating systems could have improved the system's performance and safety.
2.2.2 How would you design such a system? Which elements from a modern operating system will you keep and which elements would you not use? Any answer is welcome but, unlike the question, make sure your answers are realistic.
A. There are a number of elements from modern operating systems that would not be practical to use in the context of designing an operating system for the Apollo program in 1963:
a. Virtual memory: Virtual memory increases the amount of usable memory by using secondary storage as an extension of physical memory. This technique was not practical in 1963 and main memory would have been very limited, so a simple, static memory layout would be used instead.
b. Preemptive multitasking: Full preemptive multitasking as found in modern systems would be too heavy for the hardware of the era. Notably, the real Apollo Guidance Computer's Executive did implement a simple priority-based multiprogramming scheme, so cooperative, priority-driven task switching is realistic, but nothing like a modern preemptive scheduler.
c. Graphical User Interface (GUI): A GUI lets users interact with the operating system through graphical elements such as icons and windows. The technology to support a GUI was not available in 1963, so the operating system would be operated through a simple numeric or command-line style interface.
d. Networking: Modern networking stacks would not be usable; in 1963 communication would be limited to dedicated radio links to ground stations rather than the general-purpose networks available today.
e. Cloud computing: Cloud computing delivers computing services, including storage, processing, and software, over the internet. This did not exist in 1963, so the system would have to rely on local storage and processing.
f. Advanced scheduling algorithms: Schedulers such as CFS or a full EDF implementation, as used in modern operating systems, would not be practical on 1963 hardware; a simple fixed-priority scheduler would be used instead.
g. Advanced security features: Firewalls, intrusion detection, and prevention systems would not be available in 1963, as both the technology and the understanding of computer security were far less advanced than they are today.
2.3 You are a software developer that works for a supermarket chain called Tasdrose. Tasdrose wants to increase customer satisfaction with a new system guiding people to the tills so that their overall waiting time is reduced. We suppose that the customers have to choose a waiting lane as soon as they arrive at the till area in the case of multiple lanes and tills. However, a single cost-effective lane system is also being investigated.
2.3.1 Which topic(s) studied would be relevant to treat such a problem?
A. Several topics studied in computer science and operations research are relevant to reducing customer waiting time at the tills of a supermarket such as Tasdrose. These include:
a. Queueing theory: A branch of mathematics that deals with the analysis of waiting lines or queues. It would be useful for understanding the behavior of customers in the waiting lanes and for determining the optimal number of lanes and tills to minimize overall waiting time.
b. Simulation: A technique used to model and analyze complex systems. It would be useful for predicting the performance of the new system and for evaluating different lane configurations and strategies.
c. Optimization: Optimization is a method used to find the best solution to a problem.
It would be useful in determining the optimal number of lanes and tills to minimize overall waiting time and in choosing the best lane assignment strategy for customers.
d. Scheduling: A method used to assign tasks or resources to agents. It would be useful in deciding the order in which customers are served at the tills to minimize overall waiting time.
e. Game theory: The study of mathematical models of strategic interaction between rational decision-makers. It would be useful in analyzing how customers decide between waiting lanes and in designing a lane assignment strategy that accounts for that behavior.
f. Machine learning: A type of artificial intelligence that allows systems to learn and improve from experience. It would be useful for analyzing customer behavior, understanding customer preferences, predicting the customer flow in the store, and allocating resources accordingly.
g. Control systems: A branch of engineering that deals with the design of controllers for dynamic systems. It would be useful for regulating the customer flow in the store and allocating resources according to that flow.
B. By using these topics, it would be possible to develop a system that effectively guides customers to the tills, reduces overall waiting time, and increases customer satisfaction.
2.3.2 How could the overall problem be solved, knowing that the customers are using intelligent trolleys that know exactly what has been bought? Give an overview of such a system. In particular, detail the data structure you will use and how the concurrency aspects are tackled. Make sure your answer takes into account the two separate cases of having either a single or multiple lanes.
A. One approach to reducing customer waiting time at the tills of a supermarket such as Tasdrose is a system that uses the data gathered from the customers' intelligent trolleys. The system would consist of the following components:
a. Data collection: Each intelligent trolley would be equipped with a barcode scanner and RFID reader that automatically record the items placed in it. This data would be transmitted to a central server in real time over a wireless connection.
b. Data processing: The central server would process the data received from the trolleys to determine the number of items and their total value, and from this predict how long each customer's checkout will take.
c. Lane allocation (multiple lanes): Based on the predicted checkout time, the system would direct each arriving customer to the lane with the smallest total predicted service time ahead of them, balancing the load across tills; customers with very few items could be steered to dedicated express lanes.
d. Concurrency control: With multiple lanes, several trolleys and tills update shared state at once, so the system would use locks, semaphores, and monitors to ensure that shared data structures, such as the per-lane load counters, are updated atomically and that each customer is assigned to exactly one lane.
e. Lane assignment (single lane): In the single-lane case, the system would keep the waiting customers in a priority queue keyed on the predicted checkout time and serve customers with the shortest predicted checkout first; shortest-job-first ordering minimizes the average waiting time. A fairness bound (such as a maximum wait) would prevent customers with large trolleys from being starved.
f. Monitoring and feedback: The system would continuously monitor the performance of the tills and the waiting times of the customers, providing real-time feedback to management, which could then adjust the number of open tills and the lane assignment strategy as required.
g. Machine learning: The system would use techniques such as decision trees, random forests, and gradient boosting to predict customer behavior, the number of customers at different times of day, the number of items they are likely to purchase, and so on. These predictions would be used to staff the tills ahead of demand and make the shopping experience more efficient for the customers.
B. Overall, by using this system, Tasdrose would be able to reduce customer waiting times at the tills, increase customer satisfaction, and improve the efficiency of its operations.
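A single-lane queue keyed on the predicted checkout time, as discussed in 2.3.2, can be sketched as follows. The sketch serves the shortest predicted checkout first, since shortest-job-first minimizes average waiting time; the customer names, item counts, and per-item service-time estimate are invented for illustration, and a lock guards the shared queue as the concurrency-control point:

```python
import heapq
import threading

# Shared priority queue of waiting customers, ordered by predicted
# checkout time. The lock is the concurrency-control point when many
# trolleys report their contents at the same moment.
queue_lock = threading.Lock()
waiting = []  # heap of (predicted_checkout_seconds, customer_name)

SECONDS_PER_ITEM = 3  # invented service-time estimate

def trolley_reports(customer, item_count):
    """Called when an intelligent trolley reaches the till area."""
    predicted = item_count * SECONDS_PER_ITEM
    with queue_lock:
        heapq.heappush(waiting, (predicted, customer))

def next_customer():
    """The till asks for its next customer: shortest predicted job first."""
    with queue_lock:
        return heapq.heappop(waiting)[1] if waiting else None

trolley_reports("alice", 40)
trolley_reports("bob", 5)
trolley_reports("carol", 12)

# Shortest-job-first: bob (15s), then carol (36s), then alice (120s).
order = [next_customer(), next_customer(), next_customer()]
print(order)
```

In the multiple-lane case the same idea generalizes to one load counter per lane, with arriving customers routed to the lane whose total predicted service time is smallest, still under the lock.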