MUGASA HATWIB

The Glasgow Raspberry Pi Cloud: A Scale Model for Cloud Computing Infrastructures

Cloud Data Center Development/Testbed Environments

A typical Cloud data center has tens of thousands of servers, which makes it very expensive to reproduce for educational and research purposes. Most Cloud data center research therefore uses testbed/development environments with a handful of machines. Even a typical development environment consisting of a couple of servers is still very expensive for most research and educational institutions once data center space, power, cooling, etc. are taken into account.

Cloud Data Center Simulators

Cloud data center simulators have been developed to provide a low-cost alternative to development environments. Simulators have been used successfully in the past to model the state of target systems; however, there are essential Cloud computing properties that they fail to capture. Traffic patterns in operational Cloud data center networks change constantly and are generally unpredictable in the long term, which cannot be fully modelled in a simulated environment. Simulation also does not model the cross-layer correlation between application, network and virtualization. An example is iCanCloud, a simulator of Cloud infrastructure that aims to simulate the instance types provided by Amazon without considering the underlying network behavior.

Furthermore, Cloud simulation tools such as iCanCloud, CloudSim and GreenCloud are still in their infancy, and physical Cloud development environments running on x86 processors are very expensive for low-budget research projects. A novel alternative is to use a "scale model" of a Cloud system to create a reliable and affordable development environment.

PiCloud as an alternative scaled Cloud DC model

The low-power, low-cost Raspberry Pi provides an affordable option for constructing a miniature Cloud data center.
It also allows actual traffic patterns to be reproduced with realistic Cloud applications. The applications and techniques developed in this research can be readily adapted to production Cloud environments.

This report gives an insight into the Glasgow PiCloud project, which uses Raspberry Pi devices in a cluster to build a scale model of a Cloud data center. It covers the system design, research directions, a discussion of the Raspberry Pi device and related work.

The System Design

The prototype scale model of the PiCloud cluster was built using 56 Raspberry Pi Model B devices housed in racks constructed from Lego bricks. To reflect a typical data center network architecture, the Raspberry Pi devices are interconnected through a canonical multi-root tree topology. Machines in the same rack are connected to the same Top-of-Rack (ToR) switch, which in turn connects to the rest of the topology through an OpenFlow-enabled aggregation switch. This makes the network fully programmable and compatible with leading-edge Software Defined Networking (SDN) research.

The accessibility and design of the prototype mean that the PiCloud cluster can easily be re-cabled to form a fat-tree topology. In a fat-tree topology, the links become "fatter" as traffic moves up the tree towards the root. By choosing the fatness of the links, the network can be tailored to make efficient use of whatever bandwidth the packaging and communications technology makes available.

Figure 1 below shows the physical Raspberry Pi devices that were used in this research and Figure 2 shows the system architecture.

Figure 1.

Figure 2. System Architecture

The software stack for an individual Pi is shown in the figure below.

The Raspberry Pi devices used in this research were run using Linux from a 16GB SanDisk SD card.
The Linux operating system used is Raspbian, a Debian flavor of Linux that has been optimized to run on the Raspberry Pi hardware through contributions from the Raspberry Pi community. Raspbian comes with over 35,000 pre-compiled software packages built with support for the hardware floating-point capabilities of the Raspberry Pi. Each device also has the Linux Containers (LXC) suite of programs installed to provide operating-system-level virtualization, playing a role similar to that of the hypervisor in x86 virtualization technologies. The authors were able to comfortably support three containers concurrently on a single Raspberry Pi. There is also an API daemon on each Pi that provides a RESTful management interface for virtual host management and for interacting with a head node, the pimaster.

System Virtualization

The most popular virtualization tools on the market, such as Xen and VMware, are too memory intensive to run on a device with 256MB of RAM, which rules them out for the PiCloud project. In addition, most of the architectural support for virtualization, which efficiently isolates guest operating systems from the host, is provided for x86 CPUs. There is an ongoing effort by Xen to support the ARM platform; however, this was still under development when the PiCloud project was being run. Linux Containers were used for virtualization as an alternative, although they do not provide the level of isolation that full virtualization achieves. They only provide a virtual environment that has its own process and network space.

Management API

The LXC suite of packages includes a set of tools that can be used as a management layer to administer the PiCloud; however, they are not fully functional on the Pi platform. A web server on the pimaster was therefore used to provide a web-based control panel through which users and administrators manage the PiCloud, as shown in Figure 4 below.
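
To make the role of the per-Pi API daemon more concrete, the following is a minimal sketch of how such a RESTful interface might route requests. It is not the PiCloud project's actual API: the `PiNode` class, the `/containers` resource path and the choice of HTTP status codes are all hypothetical, and container state is held in memory rather than driving LXC. The only figure taken from the text is the limit of three concurrent containers per Pi.

```python
import json

MAX_CONTAINERS = 3  # the text reports three concurrent containers per Pi


class PiNode:
    """In-memory stand-in for one Pi's container state (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.containers = {}  # container name -> state string

    def handle(self, method, path):
        """Route one REST-style request; return (HTTP status, JSON body)."""
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["containers"] or len(parts) > 2:
            return 404, json.dumps({"error": "unknown resource"})
        if method == "GET" and len(parts) == 1:
            # List the containers on this node.
            return 200, json.dumps(sorted(self.containers))
        if method == "PUT" and len(parts) == 2:
            cname = parts[1]
            full = len(self.containers) >= MAX_CONTAINERS
            if cname not in self.containers and full:
                return 409, json.dumps({"error": "node is full"})
            self.containers[cname] = "running"
            return 201, json.dumps({"name": cname, "state": "running"})
        if method == "DELETE" and len(parts) == 2:
            if self.containers.pop(parts[1], None) is None:
                return 404, json.dumps({"error": "no such container"})
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


# A head node could drive the daemon like this; the fourth start is refused.
node = PiNode("pi-01")
for cname in ("web", "db", "cache", "extra"):
    status, _ = node.handle("PUT", "/containers/" + cname)
    print(cname, status)
print(node.handle("GET", "/containers"))
```

Keeping the routing in a plain method makes the behavior easy to exercise without a network; a real daemon would wrap the same logic in an HTTP server and call the container tools instead of updating a dictionary.
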
Through interaction with the daemons running on the individual Pi devices and on the pimaster, the website is used to control workloads using RESTful interfaces.

Figure 4.

Research directions

The PiCloud can be used to experiment with new algorithms for Virtual Machine (VM) management and to observe the resulting behavior directly on all layers of the Cloud architecture, something that cannot be achieved under the over-simplified assumptions of simulators. By operating an actual physical infrastructure, improvements to file management and migration techniques can be adequately evaluated. The interaction and effects of specific optimizations on the Cloud configuration can be observed at every layer.

The PiCloud project can also be used to investigate ways of reducing network congestion through improved resource allocation, as well as to examine novel network architectures and technologies that require significant changes to the infrastructure. Because the Cloud DC is built from physical devices, crucial administration and management requirements such as API design and UI design are not overlooked or underestimated, as they might be if the Cloud DC were built using simulators.

A future development in Cloud computing is the adjustment or removal of virtualization techniques. As in the PiCloud project, this could be achieved by switching the virtualization tools to lighter mechanisms such as Linux Containers. Alternatively, virtualization could be removed completely and physical nodes rented out instead.

Conclusion

The PiCloud project showed that, by scaling down the hardware capacity, a Cloud DC can be hosted without the extensive physical footprint in terms of expensive power and cooling management, which account for 33% of total power consumption. However, the software capacity does not scale down in the same way, which limits the services that can be run on the PiCloud to a few lightweight applications such as an HTTP server, Hadoop, etc.
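
The totals in the cost comparison for this section can be sanity-checked with a little arithmetic. The sketch below divides the reported totals by the 56 devices of the prototype and computes the ratio between the two environments. Two things are assumptions rather than statements from the source: that the development environment also consists of 56 servers, and that the power column is a power draw in watts. The variable and function names are illustrative only.

```python
NODES = 56  # Raspberry Pi devices in the prototype (from the text)

# Totals as reported in the comparison table.
testbed = {"cost_usd": 112_000, "power_w": 10_080}
picloud = {"cost_usd": 1_960, "power_w": 196}


def per_node(totals, nodes=NODES):
    """Split each total evenly across the nodes."""
    return {key: value / nodes for key, value in totals.items()}


def ratio(a, b):
    """How many times larger each figure in a is than in b."""
    return {key: a[key] / b[key] for key in a}


print(per_node(picloud))   # {'cost_usd': 35.0, 'power_w': 3.5}
print(per_node(testbed))   # assumes 56 servers: {'cost_usd': 2000.0, 'power_w': 180.0}
print({k: round(v, 1) for k, v in ratio(testbed, picloud).items()})
```

Under these assumptions each Pi costs about $35 and draws about 3.5 W, and the development environment is roughly 57 times more expensive and draws roughly 51 times more power than the PiCloud.
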
The table below shows the cost breakdown of a development environment compared with the PiCloud.

                          Server Cost   Power      Cooling
Development Environment   $112,000      10,080 W   Yes
PiCloud                   $1,960        196 W      No

If the trend of falling hardware prices and rising capacity continues, Cloud computing is expected to become more affordable for research and education deployments.

References:

[1] Fung Po Tso, David R. White, Simon Jouet, Jeremy Singer, and Dimitrios P. Pezaros, “The Glasgow Raspberry Pi Cloud: A Scale Model for Cloud Computing Infrastructures,” 2013 IEEE 33rd International Conference on Distributed Computing Systems Workshops (ICDCSW), pp. 108-112, 2013.
[2] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The cost of a cloud: Research problems in data center networks,” ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, January 2009.
[3] “Raspbian.” [Online]. Available: http://www.raspbian.org/
[4] “Raspberry Pi.” [Online]. Available: http://www.raspberrypi.org