II. Cloud Computing Study Guide

1. Definition of Cloud Computing: Cloud computing uses remote internet servers, rather than local servers or personal computers, to store, manage, and process data. It provides on-demand access to resources such as storage, applications, and services without requiring direct physical control or ownership of the underlying infrastructure.

2. Virtualization Technologies: Virtualization creates a transparent view of computing resources by mapping real resources to virtual ones. It enables the creation of multiple VMs on a single physical machine, or of a single VM that uses resources from multiple machines, and it facilitates the provisioning of processing, storage, networks, and other essential computing resources in an abstracted form. Purpose: abstraction (simplifying resource usage), replication (easier management and allocation of multiple instances), and isolation (separating the resource use of different clients).

Types of virtualization:
A. Full Virtualization (e.g., VMware, Microsoft): simulates complete hardware so that an unmodified guest OS can run; efficient; the (virtual) hardware is exposed to the OS.
B. Paravirtualization (e.g., Xen): the guest operating system is aware of the virtualization and is modified to run in this environment; simpler architectural interface, but loses portability; the guest participates in memory management and interrupt handling.
C. OS-Level Virtualization (e.g., OpenVZ and Linux-VServer): creates multiple isolated user-space instances on a single kernel. Containers on top of the OS isolate processes, file systems, network resources, environment variables, and the system-call interface; good for sandboxing.
D. Application-Layer Virtualization (e.g., Mono and the Java Virtual Machine): focuses on running specific applications on a virtual runtime.

Virtualization classification (by interface):
A. Instruction Set Architecture (ISA): the division between hardware and software; the system/machine interface.
B. Application Binary Interface (ABI): the process/machine interface; gives a program access to hardware resources and services (a set of user instructions plus system calls); this is where virtualizing software typically sits.
C. Application Programming Interface (API): abstracts away the details of service implementations; uses a high-level language and supports portability.

Virtualization classification (by level):
A. Instruction Set Architecture: emulate the ISA in software; inefficient, since the abstraction is done entirely in software, mitigated by caching and code reorganization; useful for teaching, debugging, and running multiple OSes. Cons: self-modifying code and efficient exception handling are hard to support.
B. Hardware Abstraction Layer (HAL): maps between the real machine and the emulated one and handles non-virtualizable architectures; used because it is fast and usable and supports virtual hardware, migration, and consolidation. A standalone VMM runs directly over the hardware, while a hosted VMM runs on a host OS; issues include multilevel privilege domains and caches that can fail silently.
C. OS Level: virtualizes the system-call interface; easy to manipulate, but may not abstract all devices.
D. Library (user-level API) Level: presents a different subsystem API to the application; complex to implement; enables user-level device drivers.
E. Application (Programming-Language) Level (e.g., Mono and the Java Virtual Machine): runs specific applications on a virtual architecture; platform independent, but offers less control.

Bare-metal or hosted: a bare-metal VMM has direct hardware control and does not compete with a host OS; a hosted VMM runs native processes alongside VMs, offers easy management, avoids code duplication, and provides a familiar environment. Combination: KVM (Kernel-based Virtual Machine) combines both approaches by turning the Linux kernel into a hypervisor. Pros: strong performance and hardware support. Cons: more manual configuration compared to some hypervisors (a minimal libvirt sketch follows below).
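To make the KVM/full-virtualization notes above concrete, here is a minimal sketch of defining and booting a guest through the libvirt Python bindings. The domain XML, disk path, and resource sizes are illustrative assumptions rather than values from the course material, and the guest will only boot if the referenced disk image actually exists.

```python
# Minimal sketch (assumption-heavy): define and boot a KVM guest through the libvirt
# Python bindings. The domain XML, disk path, and sizes are illustrative; dom.create()
# will only succeed if the referenced disk image exists and KVM is available.
import libvirt

domain_xml = """
<domain type='kvm'>
  <name>demo-guest</name>
  <memory unit='MiB'>1024</memory>
  <vcpu>1</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/demo-guest.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='network'><source network='default'/></interface>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")   # connect to the local KVM/QEMU hypervisor
dom = conn.defineXML(domain_xml)        # register the guest definition
dom.create()                            # boot the VM
print(dom.name(), dom.isActive())
conn.close()
```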
I/O Virtualization:
A. Emulation: the hypervisor implements a NIC in software. Pro: unmodified guest. Cons: slow, because every NIC register access traps to the hypervisor, and the hypervisor must emulate complex hardware.
B. Paravirtualized: fast, no need for full emulation. Cons: requires a driver in the guest, i.e., the guest OS is modified by adding one additional driver.
C. Direct Access / Direct Assignment: as fast as possible, but a physical NIC is needed per guest (plus the host), and the hypervisor cannot monitor or encapsulate guest packets.
D. SR-IOV: a physical function controlled by the host creates virtual functions that are assigned to guests; guests access their NIC registers directly and the device handles the multiplexing. Pros: fast, only one NIC needed. Cons: an emerging technology, expensive, not widely supported.

Virtual Machines: virtualizing software (the virtual machine monitor, VMM) sits between the hardware and conventional software, performs ISA translation, and can be built on top of an OS. VMM types:
A. System VMMs: ABI interface, efficient execution, OS-independent services, same ISA (e.g., VMware ESXi Server, User-mode Linux).
B. Process VMMs: API interface, easier installation, leverage OS services, but incur execution overhead.
C. Hosted VMMs: same ISA; a system VMM whose code executes on the hardware but relies on a host OS to improve performance and I/O device support; lacks isolation (e.g., VMware Workstation).
D. Whole-System VMMs: use a different ISA, requiring full emulation of the guest OS and applications (e.g., Virtual PC).

3. Parallel Computing: Parallel computing is the use of multiple computing resources simultaneously to solve a computational problem. It involves dividing a problem into smaller tasks that can be executed concurrently on different processors or nodes. This allows for faster problem-solving, the ability to handle larger problems, cost savings, and concurrency. Aspects of parallel programs: partitioning of the data set and the overall work; communication, either synchronous (a task waits for its communication to complete before moving on) or asynchronous (data is transferred while tasks proceed independently); and synchronization (types: barrier, lock/semaphore, synchronous communication operations). Pros: saves time and cost, provides concurrency, solves large problems. Types of parallelism: data parallelism and task parallelism. Flynn's Classical Taxonomy distinguishes multi-processor architectures by instruction stream and data stream (S = single, M = multiple, I = instruction, D = data): in SISD a single processing unit executes one instruction stream on one data stream; in SIMD all processing units execute the same instruction in any given cycle but operate on different data elements; in MIMD processing units can execute different instructions on different data elements (the most common parallel architecture). Parallel computer memory architectures: (1) shared memory, where processors access all memory as a single global address space, making data sharing fast but limiting scalability; and (2) distributed memory, where each processor has its own memory, which scales well but leaves inter-processor communication to the programmer. Programming models: the shared-memory model (uses architecture 1; locks are used to control access to shared memory), the message-passing model, and the data-parallel model (a sketch of a simple data-parallel program follows below). Problems: data dependency and load balancing.
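As referenced above, a minimal sketch of the data-parallel model: the same function is applied to different partitions of the data set by separate worker processes, and the partial results are then combined. The process count and data are illustrative.

```python
# Minimal sketch of the data-parallel model: the same function runs on different
# partitions of the data set in separate worker processes (SPMD style).
from multiprocessing import Pool

def partial_sum(chunk):
    # every worker executes the same code, but on its own data partition
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]       # partition the data set
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)  # concurrent "map" over partitions
    print(sum(partials))                          # combine the partial results
```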
Forms of Parallel Computing:
A. Distributed Systems and Distributed Computing: A distributed system is a collection of individual computers that appears to its users as a single coherent system. It includes various types of networks, such as the Internet, LANs, WANs, ATM networks, and mobile cellular systems. The main characteristics of distributed systems are scalability, heterogeneity that is hidden from users and system components, continuous availability, openness, and transparency. A distributed system connects users to remote resources and shares them in a controlled way, and it comprises multiple CPUs. Transparency hides information such as access/data representation, resource location, resource migration, resource relocation and allocation, replication, concurrency, and failure. Openness supports heterogeneity, interoperability, and portability. Distributed computing is the use of multiple computing resources simultaneously to solve a computational problem: tasks are divided among different computers or processors whose actions are coordinated toward a common goal. It can be used to save time, solve larger problems, save costs, and provide concurrency.
Hardware aspect: Multi-processor systems come in two types in terms of memory (shared vs. non-shared; non-shared memory is slower, harder to build, complicates interprocess communication, is incoherent, and relies on message passing, but can connect more processors) and two types in terms of the interconnection between memory and processors (bus-based vs. switch-based; switch-based comes in two forms, crossbar switch and omega switch). Multi-computer systems come in two types (homogeneous, i.e., the same type of computers each with its own memory; and heterogeneous, i.e., different types of computers, each with its own memory and fast communication), and both can use bus-based or switch-based interconnection. Example of a homogeneous multicomputer system: a system area network, based on a fast bus-based interconnect connected through a multi-access network, with limited scalability; or a cluster of workstations using mesh or hypercube structures. Heterogeneous multicomputer systems: bus-based systems have limited scalability and are connected through a shared multi-access network (e.g., Fast Ethernet); switch-based systems use mesh and hypercube topologies.
B. Cluster Computing: Cluster computing refers to the use of interconnected whole computers, known as nodes, that work together as a single computing resource. The nodes are typically connected through fast local area networks. Cluster computing is often used for applications that require high processing power, such as parallel computation, high-powered gaming, 3D visualization, database applications, and e-commerce. Pros: reduces processing-power costs and increases performance, improves availability by eliminating single points of failure, and scales by adding cluster nodes. Cluster categories: High Availability (avoids single points of failure; two or more nodes with redundancy), Load Balancing (a large number of nodes share the load; suited to servers with a large client base), and High Performance (for compute-intensive workloads).

4. Datacenter (DC): A data center is a facility that houses computer systems and associated components, such as servers, storage devices, network equipment, and power and cooling systems. It is designed to store, manage, process, and distribute large amounts of data. Data centers are crucial for organizations that rely on technology for their operations, as they provide a secure and controlled environment for the efficient operation of computer systems and the storage of data. A DC system includes software (distributed systems and file systems, parallel computation through MapReduce and Hadoop) and virtualized infrastructure (computing: VMs and hypervisors; storage: virtualized/distributed; network: network virtualization). The network has three layers: an access layer with Top-of-Rack switches, an aggregation layer, and a core layer. Common DC issues: single points of failure and oversubscription of links higher up in the topology (a rough capacity sketch follows below).
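As referenced above, a small back-of-the-envelope sketch of link oversubscription and of the fat-tree design discussed in the next section; the port counts and link speeds are assumptions chosen for illustration.

```python
# Back-of-the-envelope sketch: host count of a k-ary fat-tree and the link
# oversubscription ratio of a switch; port counts and speeds are illustrative.
def fat_tree_hosts(k: int) -> int:
    # k pods, each with k/2 edge switches connecting k/2 servers each => k^3 / 4 hosts
    return (k ** 3) // 4

def oversubscription(downlink_gbps: float, uplink_gbps: float) -> float:
    # ratio of server-facing bandwidth to upstream bandwidth; 1.0 means non-blocking
    return downlink_gbps / uplink_gbps

print(fat_tree_hosts(48))                  # 27648 hosts with 48-port switches
print(oversubscription(48 * 10, 4 * 40))   # 48 x 10G down vs 4 x 40G up => 3.0
```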
Solution properties: backward compatible with existing infrastructure; no changes to applications; support for layer 2 (Ethernet), since certain monitoring applications require servers with the same role to be on the same VLAN; the same IP usable on dual-homed servers; room for server-farm growth; cost effective; fast communication. Layer 2 relies on a spanning tree for the entire network to prevent looping (addressed by fat-tree designs); layer 3 provides shortest-path routing between source and destination with best-effort delivery.

5. OpenStack: OpenStack is an open-source cloud computing platform that manages computing, storage, and networking resources in a datacenter. It was launched in July 2010 by Rackspace Hosting and NASA, and it is managed by the OpenStack Foundation. The platform is written in Python and has a modular architecture with components such as Nova (Compute), Swift (Object Storage), Cinder (Block Storage), Glance (Image Service), Keystone (Identity), Neutron (Network), and Horizon (Dashboard). OpenStack allows users to create and manage virtual machines, store and retrieve files, manage authentication and authorization, and provide network connectivity as a service. It is compatible with multiple hypervisors and supports various programming languages. OpenStack is widely used in industry and has a large community of contributors and supporters.

6. Introduction to SDN: SDN (Software-Defined Networking) is a programmable network architecture that involves a new network operating system and generalized network virtualization. It aims to simplify network systems that have become complex due to new control requirements such as VLANs, traffic engineering, and deep packet inspection. SDN works by separating the control plane from the data plane in network devices. The control plane is responsible for making traffic-control decisions, while the data plane forwards packets based on those decisions. In SDN, the control plane is centralized and programmable, allowing more flexibility and agility in managing the network. One of the key implementations of SDN is OpenFlow, a communication protocol for managing layer-2 packet forwarding. OpenFlow allows remote administration of the packet-forwarding tables of network switches; it operates over TCP and can use TLS for secure communication between the controller and the switches. In summary, SDN is a network architecture that separates the control plane from the data plane, allowing centralized control and programmability, and OpenFlow is one implementation of SDN that enables remote administration of network switches.

7. Open vSwitch and Cloud: Open vSwitch is a multilayer software switch that operates within a hypervisor and provides connectivity between virtual machines and physical interfaces. It exposes three interfaces: configuration, forwarding path, and connectivity management. Open vSwitch is relevant to cloud computing because it enables the creation of a single logical switch image across multiple physical servers, allowing centralized management and control of virtual networks. It supports features such as virtual private networks (VPNs) for connecting VMs over a private network, mobility between IP subnets, and integration with virtualization platforms like XenServer. Open vSwitch enhances the flexibility, scalability, and efficiency of network infrastructure in cloud computing environments; a small configuration sketch follows below.
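As referenced at the end of the Open vSwitch section, here is a minimal configuration sketch that builds an OVS bridge for VM traffic and points it at an external OpenFlow controller. It simply shells out to ovs-vsctl; the interface names and controller address are illustrative assumptions.

```python
# Minimal Open vSwitch sketch: create an integration bridge for VM traffic and point
# it at an external OpenFlow controller. The port names and controller address
# (192.0.2.10) are illustrative assumptions. Requires root and an installed OVS.
import subprocess

def sh(*args):
    subprocess.run(args, check=True)

sh("ovs-vsctl", "add-br", "br-int")                # bridge that VM vNICs attach to
sh("ovs-vsctl", "add-port", "br-int", "vnet0")     # a VM's tap interface
sh("ovs-vsctl", "add-port", "br-int", "eth1")      # uplink to the physical network
sh("ovs-vsctl", "set-controller", "br-int",
   "tcp:192.0.2.10:6653")                          # hand forwarding decisions to the SDN controller
sh("ovs-vsctl", "show")                            # print the resulting configuration
```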
8. VPN Technology:
a. PPTP (Point-to-Point Tunneling Protocol): simple tunneling of PPP; encrypts packets and then encapsulates them.
b. L2F (Layer 2 Forwarding Protocol): similar to L2TP, but provides no encryption or confidentiality by itself; it relies on the protocol being tunneled to provide privacy. Designed to tunnel the PPP protocol.
c. L2TP (Layer 2 Tunneling Protocol): allows PPP frames to be sent over non-IP networks, unlike PPTP; allows multiple QoS tunnels between the same endpoints, with better compression and support for flow control. Step 1: establish a control connection for a tunnel. Step 2: establish a session, triggered by an inbound/outbound call request; the tunnel and its corresponding control connection MUST be established before an inbound/outbound call. The L2TP tunnel must be established before PPP frames can be tunneled; multiple sessions can exist within a single tunnel, and multiple tunnels can exist between the same LAC and LNS.
d. PPP (Point-to-Point Protocol): communication involves direct interaction between two nodes. Pros: simplicity and efficiency. Cons: limited scalability and central management control.
e. VLAN (Virtual Local Area Network): a network segmentation technique. Pros: improved network efficiency and security. Cons: increased network complexity and potential issues if not properly configured.
f. VXLAN (Virtual Extensible LAN): a network virtualization technology that enables the creation of overlay networks. Pros: scalability and flexibility. Cons: potential complexity and overhead.
g. MPLS (Multiprotocol Label Switching): a protocol for efficient packet forwarding. Pros: improved performance and traffic engineering. Cons: complexity and the need for specialized hardware.
h. GRE (Generic Routing Encapsulation): a tunneling protocol used to encapsulate a wide variety of network-layer protocols. Pros: flexibility and simplicity (see the sketch after this list). Cons: no inherent security features.
i. SSL (Secure Sockets Layer): a cryptographic protocol for secure communication over the internet. Pros: data encryption and authentication. Cons: vulnerable to certain attacks and largely superseded by its successor, TLS (Transport Layer Security).
j. IPsec (IP Security): a suite of protocols for securing Internet Protocol (IP) communications. Pros: encryption and authentication. Cons: complexity in configuration and management.
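As referenced under GRE above, a minimal sketch of bringing up one GRE tunnel endpoint by shelling out to the standard iproute2 commands; the addresses are illustrative documentation ranges, the commands require root, and the peer runs the mirror-image configuration.

```python
# Minimal GRE endpoint sketch using standard iproute2 commands; addresses are
# illustrative and the remote peer configures the mirror image. Requires root.
import subprocess

def sh(*args):
    subprocess.run(args, check=True)

sh("ip", "tunnel", "add", "gre1", "mode", "gre",
   "local", "203.0.113.1", "remote", "203.0.113.2", "ttl", "255")  # create the tunnel
sh("ip", "link", "set", "gre1", "up")                               # bring it up
sh("ip", "addr", "add", "10.0.0.1/30", "dev", "gre1")               # inner (tunneled) address
# Traffic sent to 10.0.0.2 is now encapsulated in GRE toward 203.0.113.2.
```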
9. Introduction to MapReduce
The differences between GFS and HDFS:
GFS (Google File System)
● Master-slave architecture with a single master server for metadata management and multiple chunk servers for storing file data.
● Designed to handle a modest number of large files (multi-GB).
● Uses replication for data reliability, where each chunk is stored on multiple chunk servers.
● Uses a relaxed consistency model: file regions are consistent but might not reflect the most recent changes.
● Includes mechanisms for fast recovery and data re-replication to handle component failures.
● Optimized for high aggregate throughput for a large number of clients, with a focus on high bandwidth rather than low latency.
HDFS (Hadoop Distributed File System)
● HDFS also follows a master-slave architecture, with a NameNode (optionally several, via federation or high-availability standbys) for metadata management and multiple DataNodes for storing file data.
● Like GFS, HDFS targets large files; handling a very large number of small files is inefficient, since all metadata is kept in NameNode memory.
● HDFS also uses replication for data reliability, and it provides flexibility in choosing the replication factor for each file.
● HDFS provides a stricter consistency model, so that all clients see the same view of the file system at any given time.
● HDFS has built-in fault-tolerance mechanisms, such as automatic failover of NameNodes and data replication across DataNodes.
● HDFS is designed to provide high throughput with acceptable latency, making it suitable for a wide range of applications.

Key Features of Ceph
● Scalability: can scale horizontally by adding storage nodes to accommodate growing data needs.
● Distributed architecture: data is distributed across multiple storage nodes for high availability and fault tolerance.
● Object storage: allows users to store and retrieve data as objects, which is well suited to large amounts of unstructured data.
● Block storage: enables the creation of virtual block devices that can be used by applications and operating systems.
● File system: provides the distributed file system CephFS, allowing users to mount Ceph storage as a file system and access it through standard file-system interfaces.
● Data replication and erasure coding: data can be replicated or erasure-coded across multiple nodes for redundancy and protection.
● Self-healing: if a storage node fails or becomes unavailable, Ceph's built-in mechanisms automatically re-replicate or reconstruct its data on other available nodes.
● Open source.

What is MapReduce? MapReduce is a programming model and software framework used for processing and generating large datasets in a distributed computing environment. It was developed by Google to handle its massive data-processing needs. The MapReduce model involves two main operations, Map and Reduce.
● In the Map phase, the input data is divided into smaller chunks and processed in parallel by multiple nodes in a cluster. Each node applies a map function to its input and generates intermediate key-value pairs.
● In the Reduce phase, the intermediate key-value pairs are shuffled and sorted by key. The reduce function is then applied to the sorted pairs, combining the values associated with each key and producing the final output.
● MapReduce is designed to handle large-scale data-processing tasks by distributing the workload across multiple nodes, allowing parallel processing and efficient use of resources. It provides fault tolerance, scalability, and high throughput, making it suitable for processing big data.

What are the key components of MapReduce? (A minimal word-count sketch follows this list.)
● Map function: takes input data and transforms it into a set of key-value pairs.
● Reduce function: takes the map output and performs a summary operation on the data to produce the final result.
● Input split: the input data is divided into smaller chunks called input splits, each processed by an individual map task.
● Map task: each map task processes one input split and applies the map function to generate intermediate key-value pairs.
● Shuffle and sort: the intermediate pairs are sorted and grouped by key to prepare for the reduce phase.
● Partitioning: the pairs are partitioned by key and distributed to reduce tasks.
● Reduce task: each reduce task receives a subset of the key-value pairs and applies the reduce function to create the final output.
● Output: the final output of the reduce tasks is collected and stored in the desired format.
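As referenced above, a minimal local word-count sketch of the map, partition/shuffle, and reduce steps in plain Python (no Hadoop involved); the function names and sample documents are illustrative.

```python
# Minimal local word-count sketch of map, partition/shuffle, and reduce
# (plain Python, no Hadoop); names and sample documents are illustrative.
from collections import defaultdict

def map_fn(line):
    for word in line.split():
        yield (word.lower(), 1)              # emit intermediate key-value pairs

def partition(key, num_reducers):
    return hash(key) % num_reducers          # choose which reduce task gets this key

def reduce_fn(word, counts):
    return (word, sum(counts))               # summarize all values for one key

documents = ["the cloud stores data", "the cloud processes data"]
num_reducers = 2
buckets = [defaultdict(list) for _ in range(num_reducers)]

for doc in documents:                        # map phase
    for key, value in map_fn(doc):
        buckets[partition(key, num_reducers)][key].append(value)   # shuffle

results = sorted(reduce_fn(k, v) for b in buckets for k, v in b.items())  # reduce
print(results)  # [('cloud', 2), ('data', 2), ('processes', 1), ('stores', 1), ('the', 2)]
```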
What are the main parallelization challenges? How does MapReduce deal with them? MapReduce addresses the main parallelization challenges by providing a programming model and framework that simplifies parallel computing.
● Data partitioning: MapReduce automatically partitions the input data into smaller chunks and distributes them across multiple nodes in a cluster, allowing the data to be processed in parallel.
● Task distribution: MapReduce assigns tasks to different nodes in the cluster, ensuring that each node performs a specific computation on its assigned data partition. This enables efficient utilization of resources and load balancing.
● Fault tolerance: MapReduce handles node failures by automatically reassigning failed tasks to other available nodes. It also maintains checkpoints to track the progress of computations, allowing recovery in case of failures.
● Data locality: MapReduce minimizes data movement across the network by scheduling tasks on nodes where the data is already present. This reduces network congestion and improves overall performance.
● Task coordination: MapReduce provides synchronization mechanisms, such as barriers and locks, to coordinate the execution of tasks. This ensures that the output of one task is available as input to subsequent tasks.

What are the limitations of MapReduce?
● Lack of real-time processing: MapReduce is designed for batch processing and is not suitable for real-time or interactive applications that require low latency.
● Inefficient for small datasets: MapReduce is optimized for processing large amounts of data. For small datasets, the overhead of setting up and managing the MapReduce framework can be significant, resulting in inefficient processing.
● Limited support for iterative algorithms: MapReduce is not well suited to iterative algorithms, because it must read and write data to disk between iterations, which is time-consuming.
● Difficulty in expressing complex computations: MapReduce is based on a simple programming model of map and reduce functions, which can make it challenging to express complex computations that require multiple stages or dependencies between tasks.
● Data movement and shuffling overhead: MapReduce involves a significant amount of data movement and shuffling between the map and reduce phases, which can introduce additional latency and network overhead.
● Limited fault tolerance: while MapReduce provides fault-tolerance mechanisms such as task re-execution and data replication for worker failures, it does not tolerate failure of the master; if the master fails, the entire job may need to be restarted.
● Limited support for real-time updates: MapReduce is not designed for handling real-time updates to data; it is more suitable for batch processing of static datasets.

III. Cloud Security
10. What are the causes of problems associated with Cloud Computing? Illustrate each reason in detail.
a. Loss of control in the cloud:
i. Data: cloud users relinquish control over their data, relying on providers for security and privacy.
ii. Applications: users lose control over application maintenance and security, depending instead on cloud providers.
iii. Identity management: cloud consumers depend on providers for identity management, raising privacy and security concerns.
b. Lack of trust in the cloud:
i. Third-party management: cloud users entrust third-party providers with data and application security, raising trust and risk concerns.
ii. Security measures: users may doubt the effectiveness of cloud-provider security measures in safeguarding data and ensuring privacy.
iii. Compliance and regulations: concerns arise about whether cloud providers meet industry regulations and data-protection laws.
c. Multi-tenancy issues:
i. Resource sharing: the cloud involves multiple tenants sharing physical resources, prompting concerns about data isolation and mutual impact on performance or security.
ii. Separation of data: adequate separation between tenants is crucial to prevent breaches.
iii. Trust among tenants: a lack of it can hinder cloud adoption.

11. What can malicious insiders do?
a. Malicious insiders in cloud computing can compromise system security through actions such as unauthorized access, data theft, manipulation, service disruption, unauthorized sharing, sabotage, and unauthorized privilege escalation. Organizations must implement robust security measures, access controls, and monitoring systems to detect and prevent these insider threats in cloud environments.

12. What can outside attackers do?
a. Outside attackers can compromise cloud system security through actions such as exploiting vulnerabilities, launching DDoS attacks, conducting man-in-the-middle attacks, attempting data breaches, using social engineering, injecting malware, and posing insider-style threats. Cloud service providers and users should implement robust security measures, including strong authentication, encryption, regular updates, and employee training, to mitigate these risks and protect the system from outside attacks.

13. What are the big issues in security and privacy in Cloud Computing?
a. Loss of control: cloud consumers may relinquish control over data, applications, and identity management, relying on cloud providers for security and privacy.
b. Lack of trust: trusting third-party management in cloud computing poses challenges, requiring confidence in cloud providers with sensitive data and operations.
c. Multi-tenancy issues: resource sharing among cloud tenants can lead to conflicts and challenges in ensuring separation and preventing negative impacts.
d. Confidentiality: concerns arise regarding the confidentiality of stored and transmitted data, with potential exposure through unauthorized access or breaches.
e. Integrity: ensuring data integrity is crucial, as unauthorized modification or tampering can compromise the reliability and trustworthiness of the data.
f. Availability: cloud services must maintain constant availability; downtime or disruptions can result in productivity and revenue loss.
g. Data security and storage: it is essential to protect data in transit, at rest, and during processing to prevent unauthorized access, data loss, or leakage.
h. Identity and access management: critical for managing user identities and controlling access to cloud resources, preventing unauthorized access, and ensuring proper authentication and authorization.
i. Compliance and legal issues: cloud computing raises concerns about compliance with regulations and laws regarding data protection, privacy, and cross-border data transfers.
j. Auditability and forensics: the ability to audit and investigate security incidents or breaches in the cloud is crucial for accountability and legal purposes.

14. What are infrastructure security issues? Give examples for each type.
a. Network security: ensure the security of the network infrastructure, including firewalls, routers, and switches, to protect against unauthorized access and data breaches.
b. Host security: secure the physical and virtual servers of the cloud environment, implement strong access controls, apply regular patches and updates, and monitor for suspicious activity.
c. Application security: protect cloud applications from vulnerabilities and attacks such as XSS and SQL injection through secure coding practices, regular security testing, and web application firewalls.
d. Data security: ensure the CIA triad for data stored in the cloud through encryption of sensitive data, access controls, and regular backups (see the encryption sketch after question 16).
e. Availability: ensure the availability of services and resources by implementing redundancy, load balancing, and disaster-recovery measures to mitigate the impact of hardware and software failures.
f. Compliance and regulatory issues: address compliance requirements and industry regulations to protect sensitive data and ensure the legal and ethical use of services.
g. Physical security: protect the physical infrastructure of the data centers where cloud services are hosted, including access controls, surveillance systems, and environmental controls, to prevent unauthorized access and physical damage.

15. What is the data life cycle?
a. Collection: gather data from sensors, databases, or user input.
b. Storage: store collected data in databases, data warehouses, or cloud storage.
c. Processing: transform and clean data for usability, involving tasks like filtering, aggregation, or analysis.
d. Analysis: extract insights and patterns using statistical analysis, data mining, or machine-learning techniques.
e. Visualization: present analyzed data visually (charts, graphs, dashboards) for better understanding and decision-making.
f. Dissemination: share analyzed data with stakeholders through reports, presentations, or interactive platforms.
g. Archiving: archive data for long-term storage or compliance, ensuring secure retention for future access if needed.

16. What are the key privacy concerns about cloud-stored data?
a. Data storage: when data is stored in the cloud, there is a risk of unauthorized access or data breaches, which can compromise the privacy of sensitive information.
b. Data retention: cloud service providers may retain data for extended periods, raising concerns about the long-term storage and control of personal or sensitive data.
c. Data destruction: ensuring secure and complete data destruction when data is no longer needed is crucial to protecting privacy; in a cloud environment, however, it can be difficult to verify that data has been permanently deleted.
d. Auditing and monitoring: lack of visibility and control over data in the cloud can make it difficult to monitor and audit access to sensitive information, potentially leading to privacy breaches.
e. Handling of privacy breaches: in the event of a privacy breach, it is important to have clear protocols and procedures in place for notifying affected parties, mitigating the impact, and addressing any legal or regulatory obligations.
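As referenced in item 14.d above, a minimal sketch of client-side encryption before data is handed to cloud storage, using the third-party cryptography package; the payload is illustrative, and key management (who holds the key, how it is rotated) is the real policy question and is deliberately left out.

```python
# Minimal sketch of client-side encryption before data leaves for cloud storage.
# Uses the third-party 'cryptography' package; key management is out of scope.
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # keep this with the data owner, not the cloud provider
f = Fernet(key)

ciphertext = f.encrypt(b"student records batch 2024")   # store only the ciphertext in the cloud
plaintext = f.decrypt(ciphertext)                        # decrypt after retrieval
assert plaintext == b"student records batch 2024"
```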
17. What are new vulnerabilities & attacks in cloud computing?
a. From the "Research in Cloud Security and Privacy II" slides (slide 6):
i. Threats arise from other consumers.
ii. They are due to the subtleties of how physical resources can be transparently shared between VMs.
iii. Such attacks are based on placement and extraction.
iv. A customer VM and its adversary can be assigned to the same physical server.
v. The adversary can then penetrate the VM and violate customer confidentiality.
b. Idea of a taxonomy:
i. Confidentiality: fear of loss of control over data (think DLP).
ii. Integrity: trusting that the cloud provider (a third party) is not tampering with your data; privacy issues.
iii. Availability: fear of systems going down for a client; will the provider have the computation and resources to keep the client's systems running (e.g., under DoS)?
iv. Massive data mining; e.g., accounts stolen so that attackers can spend clients' money to run instances.
v. Cloud provider employees can be phished.
vi. Auditability and forensics: it is difficult to audit data that is outside of your organization (data in the cloud).
vii. Legal quagmire and transitive trust issues: who is responsible for complying with regulations? Examples: SOX (Sarbanes-Oxley Act), HIPAA (Health Insurance Portability and Accountability Act), GLBA (Gramm-Leach-Bliley Act). Solution: follow compliance standards.

18. What are possible solutions to minimize lack of trust? Illustrate them.
a. Policy language:
i. Consumers have specific security needs but do not have a hand in how things are handled (e.g., beyond what an SLA, Service Level Agreement, captures).
ii. Solution: create a policy language that is (1) machine-understandable, (2) easy to combine/merge and compare, and (3) backed by a validation tool that checks that the policies created reflect the policy creator's intentions.
b. Certification: some form of reputable, independent, comparable assessment and description of security features and assurance.
i. Examples: Sarbanes-Oxley, DIACAP, DITSCAP.
ii. Risk assessments performed by certified third parties that ALSO provide assurance.

19. What are possible solutions to minimize loss of control? Illustrate them.
a. Monitoring: application-specific runtime monitoring and management tools.
i. Mechanisms that help the provider and the client monitor, collect, and analyze logs and system health.
ii. RAdAC (Risk-Adaptable Access Control): a mechanism that enables the consumer to react to attacks by moving the user's application to another cloud.
iii. Examples: SIEM, packet sniffers, IPS/IDS, WAF, vulnerability management/scanners, etc.
b. Utilizing different clouds:
i. Reduces the spread of risk, increases redundancy, and increases the chance of mission completion.
ii. Similar idea to isolation/segmentation.
c. Access control management:
i. Access to the cloud, servers, services, databases, and accounts.
ii. Federated identity management: the access-control management burden still lies with the provider, which requires the client/user to largely trust the provider for the security, management, and maintenance of access-control policies; this can be difficult when there is a vast number of users from different organizations with different control policies.
d. Identity management:
i. Goals proposed for user-centric IDM:
1. Authenticate without disclosing identifying information.
2. Ability to securely use a service while on an untrusted host (a VM in the cloud).
3. Minimal disclosure, and minimized risk of disclosure, during communication between user and service provider (man-in-the-middle, side-channel, and correlation attacks).
4. Independence from a trusted third party.
5. Approach 1:
a. IDM wallet: uses the AB (Active Bundle) scheme to protect PII from untrusted hosts.
b. Anonymous identification: authenticating a user without revealing their identity.
i. ID data: e.g., SSN, date of birth.
ii. Disclosure policy: a set of rules for choosing ID data from the set of identities in the IDM wallet.
iii. Disclosure history: logging and auditing.
iv. Negotiation policy; virtual machine.
v. Example case, credit cards: the IdP provides encrypted ID info to the user and the SP > the user and the SP interact > both run the IdP's public function on certain bits of the encrypted data (cf. blockchain) > finally both exchange their results and check whether they match.
6. Approach 2:
a. Active Bundle (AB) scheme to protect PII from untrusted hosts: an encapsulating mechanism that protects the data carried within.
1. Includes data and metadata (used for confidentiality).
2. Operations: self-integrity check (e.g., a hash function; a sketch follows below); evaporation/filtering, which self-destroys the sensitive portion of the AB's data when it is threatened with disclosure; and apoptosis, which self-destructs the bundle completely.
b. Predicates over encrypted data, to authenticate without disclosing any unencrypted ID data.
c. Multi-party computing, to be independent from a third party.
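As referenced in the Active Bundle operations above, a minimal sketch of a hash-based self-integrity check; the bundle layout here is an illustrative assumption, not the scheme's actual format.

```python
# Minimal sketch of the Active Bundle self-integrity check: store a digest of the
# sensitive payload and verify it before use. The bundle layout is an assumption.
import hashlib

def make_bundle(payload: bytes, metadata: dict) -> dict:
    return {
        "payload": payload.hex(),
        "metadata": metadata,
        "digest": hashlib.sha256(payload).hexdigest(),   # integrity value stored with the bundle
    }

def self_integrity_check(bundle: dict) -> bool:
    current = hashlib.sha256(bytes.fromhex(bundle["payload"])).hexdigest()
    return current == bundle["digest"]                   # mismatch would trigger evaporation/apoptosis

bundle = make_bundle(b'{"ssn": "xxx-xx-xxxx"}', {"owner": "user42"})
print(self_integrity_check(bundle))                      # True
bundle["payload"] = b'{"ssn": "tampered"}'.hex()
print(self_integrity_check(bundle))                      # False -> bundle should self-destruct
```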
20. Questions about the topics and approaches in the research papers about cloud computing security and data privacy.

IV. Research Papers and Open Questions (around 15%)
1. Intel_Cloud_Builder_Guide_to_Cloud_Design_and_Deployment.pdf
2. Gfs-sosp2003.pdf
3. Mapreduce-osdi04.pdf
4. Cloud Computing Security - A Survey.pdf
5. 13_Hatman-Intra-cloud Trust Management for Hadoop_paper.pdf

Sample Questions
1. What is MapReduce? What does the MapReduce 'runtime' handle? (4 points)
a. MapReduce is a parallel computing framework for cloud computing: a programming model for expressing distributed computations at a massive scale, together with an execution framework for organizing and performing such computations.
b. The MapReduce 'runtime' handles scheduling, data distribution, synchronization, errors, and faults.
2. If CSUF is going to make their own cloud, what do they need? Establishing a robust cloud infrastructure necessitates foundational elements like physical servers, networking gear, and storage. Virtualization technology optimizes hardware usage through virtual machines (VMs), while orchestration tools automate resource deployment and scaling. Robust security measures, including access controls and encryption, ensure data protection and regulatory compliance. Monitoring tools track resource usage and performance, and scalability features adapt to varying workloads. Reliable data storage solutions with backup mechanisms maintain data integrity. A well-designed networking infrastructure ensures seamless communication, and a disaster recovery plan mitigates unforeseen events. User interfaces and APIs enhance user interaction and resource management within the cloud environment. These elements collectively contribute to a resilient, secure, and user-friendly cloud infrastructure.
3. Suppose you want to establish a private cloud at CSUF and provide IaaS for students and teachers. You have a limited budget and can purchase only a few servers and switches (e.g., 3 high-performance servers, each with 64 GB of memory and 1 TB of storage, and two high-speed switches). Your design goal is to provide remote desktop access for users (supporting 100 Linux VMs and/or 30 Windows VMs). The existing OpenStack design is too clumsy in terms of resource consumption and too complicated for setting up your services, so you do NOT want to use OpenStack and instead want to design the system from scratch.
1) Describe your design requirements to enable the desired system. (Hint: the design requirements will lead you to the physical and software component design, so provide detailed design requirements covering networking, storage, computation, access control, monitoring, etc.)
Design Requirements:
Networking: high-speed, reliable networking infrastructure; segmentation to isolate student and teacher environments; adequate bandwidth to support remote desktop access for 100 Linux VMs and 30 Windows VMs.
Storage: sufficient storage capacity for VM images and user data; a fault-tolerant storage architecture to ensure data integrity.
Computation: three high-performance servers with 64 GB of memory each to accommodate the desired number of VMs; efficient virtualization technology to support both Linux and Windows VMs.
Access Control: robust access-control mechanisms to differentiate between students and teachers; secure authentication for remote desktop access.
Monitoring: a comprehensive monitoring system for resource utilization, performance, and security; alerts for potential issues and anomalies.
2) Describe your physical system setup (draw the system topology and explain why you set it up that way, i.e., pros and cons. Hint: answer according to the design requirements you provided.)
Three High-Performance Servers (Servers 1, 2, and 3):
● Purpose: these servers serve as the computational backbone of the private cloud, hosting the virtual machines (VMs) for both students and teachers.
● Reasoning: Load distribution: distributing the VMs across three servers spreads the load, preventing a single point of failure and improving overall system performance. Redundancy: if one server fails, the others can still handle the computational load, ensuring high availability.
Two High-Speed Switches (Switches A and B):
● Purpose: these switches provide the networking infrastructure for the private cloud, facilitating communication between servers and with external connections.
● Reasoning: Redundancy: having two switches improves network redundancy; if one switch fails, the other can continue to handle network traffic, minimizing downtime. High throughput: high-speed switches ensure efficient communication and reduce network latency, which is critical for remote desktop access.
Network Segmentation:
● Purpose: segregating the network into student and teacher environments enhances security and control.
● Reasoning: Security: isolating student and teacher environments reduces the risk of unauthorized access and potential security breaches. Control: segmentation allows tailored access-control policies for students and teachers.
3) Describe what software functions you want to establish on each physical machine and their interrelations. (Describe the system components carefully and how they provide the desired features. Hint: answer according to the design requirements you provided.)
Servers 1, 2, and 3:
● Virtualization software: implement efficient virtualization software (e.g., KVM) to support Linux and Windows VMs.
● Operating system: install a lightweight and secure operating system.
● Monitoring agents: deploy monitoring agents for resource tracking and security.
● Access control software: implement access-control mechanisms to differentiate between students and teachers.
● Remote desktop services: install remote desktop services for user access.
Interrelations:
● Virtualization software <-> Operating system: ensure compatibility and optimal performance.
● Operating system <-> Monitoring agents: facilitate monitoring of system resources.
● Access control software <-> Remote desktop services: ensure secure and differentiated access for students and teachers.
● Remote desktop services <-> Virtualization software: enable seamless remote desktop access to VMs.
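As a rough sanity check of the memory budget behind the VM counts in question 3, the following sketch uses assumed per-VM memory sizes and an assumed host reservation; only the three 64 GB servers are actually given in the question.

```python
# Rough sanity check of the memory budget behind question 3's VM counts.
# Only "3 servers with 64 GB each" comes from the question; per-VM sizes and the
# host reservation below are assumptions for illustration.
SERVERS = 3
RAM_PER_SERVER_GB = 64
HOST_RESERVE_GB = 8            # assumed RAM reserved for the host OS/hypervisor
LINUX_VM_GB = 1.5              # assumed RAM per Linux desktop VM
WINDOWS_VM_GB = 4              # assumed RAM per Windows desktop VM

usable = SERVERS * (RAM_PER_SERVER_GB - HOST_RESERVE_GB)
print("usable RAM:", usable, "GB")                        # 168 GB
print("max Linux VMs:", int(usable // LINUX_VM_GB))       # ~112, so 100 Linux VMs fit
print("max Windows VMs:", int(usable // WINDOWS_VM_GB))   # 42, so 30 Windows VMs fit
```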