Chapter 1
INTRODUCTION

For years, research has been carried out on ways to maximize the computing performance of processors, as the need for faster and higher quality applications continues to grow. High performance computing systems now exist, and they have played an important role in developments and innovations in communications, hardware, network protocols, and operating systems, as well as in solving important scientific, engineering, and business problems.

Conventional supercomputers are so expensive that only large corporations, governments, and large educational institutions can afford them; for the average researcher or developer, these systems are beyond reach. A lower cost option, however, is available through distributed computing. Distributed computing is the use of multiple network-connected computers to solve a problem or to process information. This approach has substantially lowered the price and complexity of implementing high performance computing. Given this new affordability, new experiments are being carried out in which such systems are installed to develop processing-intensive applications.

One application in computer graphics that requires intensive CPU processing is 3D rendering. 3D rendering is the process of generating an image from the abstract description of a 3D scene. Despite the development of new techniques and algorithms, this process remains computationally intensive and time consuming, especially when the source scene is complex or when photo-realistic images are required. A rendering process can take a long time to complete because of the large number of calculations involved, particularly in movie and video animation applications.

This study focuses on improving and speeding up the computation process in 3D rendering through the use of distributed computing. The capabilities of an open source 3D graphics application will be extended so that its processes can be distributed to different processors, achieving high performance computing at minimal cost.

As faster processors become available, so do applications and algorithms that consume their full capabilities; this is the irony of computer graphics. As a result, animation tasks such as rendering remain slow. With the growing body of facts, studies, and applications concerning 3D rendering, intensive CPU processing, and distributed computing, several questions arise. Is there a way to obtain high speed rendering performance using available PCs? Can this be done without cost being an obstacle? Can it be achieved through distributed computing? The researchers will attempt to answer these questions in this senior project.

This senior project is important to different groups of people. To the Ateneo de Naga University, it could provide faster rendering capability at minimal cost, allowing the university to save money compared with purchasing costly, high-specification computers. To animators, this project will help speed up the rendering of their animations using the resources already available in their offices or laboratories. To students and researchers, this project can serve as a reference for future research. It will also broaden their understanding of distributed computing and make them aware of the benefits that distributed computing, as a form of high performance computing, provides.
Hence, this will encourage them to engage in the development of applications that require CPU-intensive processing, especially in solving important scientific and engineering problems that can help in the development of our country.

This study attempts to develop a 3D graphics application that uses distributed systems. Specifically, this senior project aims to achieve the following: (1) obtain a high performance computing system at a low cost; (2) utilize the available computers for the implementation of the program; (3) achieve high speed rendering of Computer-Aided Designs (CAD) through the use of distributed computing; and (4) demonstrate that distributed computing can achieve the same speed as supercomputers or high-specification PCs.

Although this project aims to solve three broad problems, it is limited to the following: (1) twelve computers are to be connected in the network to implement the project; (2) the project's only task shall be rendering Computer-Aided Designs (CAD) using rendering and queuing software; (3) the computers to be used in the implementation must be the computers available in the Ateneo laboratories (CISCO); (4) the project will run only in a Linux environment; and (5) the focus of this study is limited to obtaining high speed, low cost rendering through distributed computing.

Chapter 2
REVIEW OF RELATED LITERATURE

This study aims to develop a low cost distributed system that will render 3D Computer-Aided Designs. The following research and literature are relevant to this study.

High Performance Computing

High performance computing (HPC) is the use of parallel processing to run advanced application programs efficiently, reliably, and quickly. HPC uses supercomputers or computer clusters to solve advanced computation problems and is commonly applied in research and development in many areas of science, engineering, and business (Dowd, 1998).

In a computer, how quickly calculations can be set up and input to the processor, and how quickly new jobs and their data can be moved in, completed, and their results moved out, determines how much of the processor's speed can actually be used. Computers are often compared on the basis of processing speed. The convention used for measuring computer performance is FLOPS (floating point operations per second). Floating point operations involve fractional numbers; they take much longer to compute than integer operations and occur often in some applications, so FLOPS measures the speed of the arithmetic processor. For example, a machine sustaining one trillion (10^12) floating point operations per second performs at one teraflop. Currently, the largest high performance computers have processing speeds ranging up to petaflops (Eadline, 2009).

High performance computers have played an important role in contributing to wealth creation and improving the quality of life, both by enabling the development of new products and processes with greater efficacy, efficiency, or reduced harmful side effects, and by contributing to our ability to understand and describe the world around us. Over the past decade, high performance "supercomputers" have become tools of international competition, and they play an important role in scientific research. Many of the national and international problems we face involve complex computations that only high-performance computers can solve.
Examples include the automotive, weather forecasting, popular entertainment, aerospace, electronics, and pharmaceutical industries, which are becoming more reliant on high-performance computers in the analysis, engineering, design, and manufacture of high-technology products (Graham et al., 2004).

HPC originally referred only to supercomputers for scientific research, but high-performance computing has since migrated to the business world, where the capabilities for developing, manufacturing, and applying high-performance computing are also crucial in a rapidly changing global economy. To keep ahead of increasing international competition, some of the largest enterprises require the power of a high performance computer (Sterling et al., 1995).

The term HPC is sometimes used as a synonym for supercomputing, although technically a supercomputer is a computer that performs at or near the currently highest operational rate for computers. A supercomputer is one of the fastest kinds of computer: a machine that performs parallel processing through the use of hundreds or thousands of processing elements, and one that is generally more difficult to program. It is very expensive and is commonly used for special applications that require intense calculation, usually in research and development in many areas of science and engineering, as well as in national security and defence. Because supercomputers are very costly, only big companies, large academic institutions, and government military agencies are able to afford and use these systems (Graham et al., 2004).

Supercomputer-like performance can also be achieved through high performance computing clusters. "Cluster" is an ambiguous term in the computer industry. Given the ambiguity in the usage of the term and the blurred boundaries between various technologies, we define a cluster as a set of computers that are connected to each other, and physically located close to each other, in order to solve problems more efficiently. It is a type of parallel or distributed system that consists of a collection of interconnected computers used as a single, unified computing resource (Kant, 2005). A cluster of networked computers is usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than a single computer of comparable speed or availability. The computers in a cluster work together so that in many respects they form a single computer (Bookman, 2003). Clusters are used to run parallel programs for time-intensive computations; they commonly run simulations and other CPU-intensive programs that would take an excessive amount of time on regular hardware.

A popular related term in discussions of computer clusters is "Beowulf," and some computer clusters are referred to as Beowulf clusters. Although not technically accurate for all types of HPC, a Beowulf cluster refers to a computer cluster built primarily from commodity components and running an open source operating system (Narayan, 2005).

The High Performance Computing Group of the Ateneo de Manila University has developed AGILA HPCS, which stands for the Ateneo Gigaflops-Range Performance, Linux OS, and Athlon Processors High Performance Computing System. It is an interdisciplinary project aimed at supporting computational science and engineering research at ADMU.
AGILA used eight (8) compute nodes connected by 100 Mbps Fast Ethernet and supports parallel programming using message passing software such as LAM/MPI and PVM to achieve a high performance computing system (Saldaña et al., 2001).

With the increasing availability of cheaper and faster computers, there is growing interest in the technological benefits of such systems. With the introduction of computer clusters, new applications are being developed to assist research in areas where a supercomputer is not practical because it is not affordable. Many applications can benefit from clusters of computers, and clusters are being used in specific areas such as rendering. In a cluster built for rendering, each node can run a rendering algorithm, and an animation can be rendered in parallel, cutting down on production time. As an example, the rendering of certain scenes in the movie Titanic was performed on a cluster of Linux-based machines, so it is clear that some in the entertainment industry see the potential benefits of utilizing clusters to create movie scenes artificially (Hope and Lam, 2008).

Chao-Tung Yang and Yao-Chung Chang, in their study entitled "Apply cluster and grid computing on parallel 3D rendering," applied cluster and grid computing to a 3D rendering application. In their paper, they used a PC cluster consisting of one master node and nine diskless slave nodes built for the purpose of parallel rendering. They used two heterogeneous PC clusters, placed the clusters on different subnets, and then used grid middleware to connect the two clusters into a grid computing environment spanning multiple Linux PC clusters. They also installed software to manage and monitor incoming and outgoing computing jobs and to schedule them, in order to achieve high performance computing and high CPU utilization (Yang and Chang, 2004).

In their study, Yang and Chang used grid computing on their 3D rendering computer cluster. Grid computing is a type of distributed computing using loosely coupled systems. Grid technologies enable large scale aggregation and sharing of computational data and other resources across the Internet. A successful, well-known project in grid computing is SETI@home, the Search for Extraterrestrial Intelligence program, which used the idle CPU cycles of a million home PCs, via screen savers, to analyze radio telescope data. Over 630,000 years of computational time has been accumulated by the project, a great reflection of the power of distributed computing and the Internet. These computing technologies promise to change the way we tackle complex problems (Narayan, 2005; Bookman, 2003).

Distributed Systems

One way to achieve high performance computing is through distributed computing systems. A distributed computing system is a group of multiple autonomous processors interconnected by a communication subnet that interact in a cooperative way to achieve an overall goal (Ananda, 1991). The participating processors are often rather different in size and power, architecture, organizational membership, and so on (Peleg, 2000). These processors do not share a common memory; information is exchanged by passing messages between them. Software supporting distributed computing must run on each computer so that the computer can take part in executing the process.
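To make the message-passing idea concrete, the following minimal sketch shows two processes that share no memory cooperating by exchanging messages over a TCP socket. It is written in Python purely for illustration; the host address, the port number, and the trivial "square a number" task are assumptions made for this example and are not part of the project's design.

    # Minimal illustration of message passing without shared memory: a worker
    # listens on a TCP socket, receives a small JSON task message, does the
    # work, and sends the result back to the coordinator.
    import json
    import socket
    import threading
    import time

    HOST, PORT = "127.0.0.1", 5050        # placeholder address for the sketch

    def worker():
        """Accept one task message, do the work, and reply with the result."""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen(1)
            conn, _ = srv.accept()
            with conn:
                task = json.loads(conn.recv(4096).decode())   # e.g. {"value": 12}
                result = {"value": task["value"], "square": task["value"] ** 2}
                conn.sendall(json.dumps(result).encode())

    def coordinator(value):
        """Send one task message to the worker and wait for its reply."""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect((HOST, PORT))
            cli.sendall(json.dumps({"value": value}).encode())
            return json.loads(cli.recv(4096).decode())

    if __name__ == "__main__":
        threading.Thread(target=worker, daemon=True).start()
        time.sleep(0.5)                   # give the worker time to start listening
        print(coordinator(12))            # {'value': 12, 'square': 144}

In a real distributed system the worker would run on a separate machine and handle many task messages, but the essential pattern is the same: send a task, do the work, and send back the result.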
The distributed computing server takes distributed computing requests and divides the processes into smaller tasks that can run on the individual PCs. It sends applications, along with some client management software, to the client machines that request them, and it monitors the status of the jobs being executed. After the machines run the programs, the server assembles the results sent back by the clients and combines them to solve the larger task. If the server does not hear from a processing client for a certain period of time, it may send the same work to another system. The server also manages any security policy or other management functions as necessary, including handling dial-up users whose connections and IP addresses are inconsistent.

The complexity of a distributed computing system depends on the size and type of environment. A larger system requires complex resource identification, policy management, authentication, encryption, and so on. Resource identification is necessary to define the level of processing power, memory, and storage each system can contribute. Policy management is used to define which jobs and users are allowed to access a system, as well as their priorities based on the importance of each project. Authentication and encryption are necessary to prevent unauthorized access to systems and data within the distributed system (Erlanger, 2002).

Distributed systems offer a number of benefits, among them increased performance, increased reliability and availability, flexibility and ease of extensibility, modularity, local control, and reduced cost (Ananda and Srinivasan, 1991). These benefits are easily seen when distributed systems are compared with supercomputers. Distributed systems do not require the costly electrical power, environmental controls, and extra infrastructure that a supercomputer requires. Also, unlike supercomputers, whose programs are written in specialized languages, applications for distributed computing can be written in common languages such as C and C++.

Distributed computing also improves the speed of processes. In a case study that Intel did of a commercial and retail banking organization running Data Synapse's LiveCluster platform, the computation time for a series of complex interest rate swap modeling tasks was reduced from 15 hours on a dedicated cluster of four workstations to 30 minutes on a grid of around 100 desktop computers. Processing 200 trades took 44 minutes on a dedicated system, but only 33 seconds on a grid of 100 PCs (Erlanger, 2002).

One process that can be sped up by distributed computing is rendering. A study conducted at Nanyang Technological University in Singapore, entitled "Grid Based Computer Animation Rendering," used render farms, or clusters of interconnected computers, to boost rendering speed. Rendering a single frame of a 3D model in professional animation normally takes several hours, but in this study, by employing two different clusters with a total of 60 remote computing nodes, the rendering time was greatly reduced: a 3D animation of 100 frames was rendered in only 2 minutes and 50 seconds (Chong, Sourin, and Levinski, 2003).
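The divide-and-reassemble pattern of a distributed computing server described above, and the per-frame parallelism exploited by render farms, can be sketched in a few lines. The following illustration, written in Python with placeholder functions only, partitions an animation's frame range into chunks, hands the chunks to a pool of workers standing in for render nodes, and reassembles the finished frames in order. The chunk size, node count, and the render_chunk stand-in are assumptions made for this example; a real server would dispatch each chunk over the network and watch for failed or silent nodes.

    # Illustration only: partition a frame range, "render" each chunk in
    # parallel, and reassemble the results in frame order.
    from concurrent.futures import ThreadPoolExecutor

    def split_frames(first, last, chunk_size):
        """Yield (start, end) chunks covering frames first..last inclusive."""
        start = first
        while start <= last:
            end = min(start + chunk_size - 1, last)
            yield (start, end)
            start = end + 1

    def render_chunk(chunk):
        """Placeholder for sending a chunk to a node; returns the frames it 'rendered'."""
        start, end = chunk
        return list(range(start, end + 1))

    def render_animation(first, last, node_count, chunk_size=10):
        chunks = list(split_frames(first, last, chunk_size))
        with ThreadPoolExecutor(max_workers=node_count) as pool:
            results = pool.map(render_chunk, chunks)   # one chunk per free "node"
        return [frame for chunk_frames in results for frame in chunk_frames]

    if __name__ == "__main__":
        frames = render_animation(1, 100, node_count=12)
        print(len(frames), frames[0], frames[-1])      # 100 1 100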
Another study that implemented distributed computing is "On Utilization of the Grid Computing Technology for Video Conversion and 3D Rendering," conducted at different sites in Taiwan. The study implemented rendering over grid computing by interconnecting four different sites with different PC specifications. The researchers used a special program, MPI-Povray, to distribute processes across the network. The study showed a speedup in task processing when eight processors were used to render an image, which created eight tasks on the grid platform (Yang et al., 2005).

Rendering

Rendering is the process by means of which a 2D (two-dimensional) image is obtained from the abstract definition of a 3D (three-dimensional) scene (Morcillo et al., 2008). It depicts the three-dimensional scene as a picture taken from a specified location and perspective, and it can add the simulation of realistic lighting, shadows, atmosphere, color, texture, and optical effects. Figure 2.1 illustrates the process. The result of rendering can also be deliberately non-realistic, appearing, for example, as a painting or an abstract image. It is, in short, a computer graphics transformation process that converts 3D models into 2D images (Birn, 2002).

Figure 2.1 Rendering Process

Producing this 2D image requires several phases, such as modelling, setting materials and textures, placing the virtual light sources, and the actual rendering itself (Morcillo et al., 2008). Rendering algorithms take a definition of geometry, materials, textures, light sources, and a virtual camera as input and produce an image, or a sequence of images in the case of animation, as output. Despite the development of new techniques and algorithms, rendering remains an intensive process that requires considerable time to transform a source scene, whether complex or not, into a photo-realistic image. High-quality photorealistic rendering of complex scenes is one of the key goals of computer graphics and has been adopted by institutions and professionals across many fields of expertise.

Over time, rendering has served as a useful tool despite its time requirements and inherently intensive processing. It has become useful to companies and professionals in different ways, but generally in helping them carry out their objectives in a pleasing and detailed manner. One specific example is architects, who use rendering to present designs to their clients (3DStormstudio, Inc., 2009). Most clients cannot read elevations or floor plans and are simply not trained to visualize 3D forms from a simple or complicated 2D drawing, so architects use rendering to communicate their designs clearly, leaving no room for confusion. The media industry is another avenue for rendering, since it demands high fidelity images for its 3D synthesis projects, as manifested in the advertisements, campaigns, and programs it produces. For animators, rendering is used to produce quality animation that is visually pleasing and aesthetic.
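To summarise the inputs described above before turning to the cost of rendering, the following schematic sketch (Python, illustration only) shows the kind of scene description a rendering algorithm consumes and the image it produces. The class and field names are invented for this example; real scene formats used by production renderers are far richer.

    # Schematic only: the inputs a rendering algorithm takes (geometry,
    # materials, textures, lights, camera) and the image it returns.
    from dataclasses import dataclass, field
    from typing import List, Dict, Optional, Tuple

    @dataclass
    class Camera:
        position: Tuple[float, float, float]     # viewpoint
        look_at: Tuple[float, float, float]      # point the camera faces
        fov_degrees: float                       # field of view

    @dataclass
    class Light:
        position: Tuple[float, float, float]
        intensity: float

    @dataclass
    class Scene:
        geometry: List[object] = field(default_factory=list)      # meshes / 3D models
        materials: Dict[str, dict] = field(default_factory=dict)  # surface properties
        textures: Dict[str, bytes] = field(default_factory=dict)  # image data
        lights: List[Light] = field(default_factory=list)
        camera: Optional[Camera] = None

    def render(scene: Scene, width: int, height: int):
        """Stand-in for a renderer: returns a width x height placeholder image."""
        return [[(0, 0, 0)] * width for _ in range(height)]

The distributed approach pursued in this project does not change these inputs; it only changes where, and in how many pieces, the rendering work is carried out.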
Depending on the rendering method and the scene characteristics, the generation of a single high quality image may take several hours or even days (Morcillo et al., 2008). In some instances, rendering may take from seconds to days for a single image or frame. Because of the huge amount of time required, the rendering phase is often considered the delaying step in photorealistic projects, in which one image may need several hours of rendering on a modern workstation. Rendering can take a long time even on very fast, highly advanced computers, because the software is essentially "photographing" each pixel of the image, and calculating the color of just one pixel can involve a great deal of computation, tracing rays of light as they bounce around the 3D scene. Rendering all the frames of an entire animated movie can involve hundreds of computers working continuously for months or years (Birn, 2009).

Given the nature of rendering and its huge computational requirements, obtaining results in a reasonable time on a single computer is practically impossible. For that reason, several approaches based on different technologies have been developed to assure affordability, efficiency, and quality in the rendering process.

One way to lessen rendering time is through a supercomputer. The supercomputer called Eka, the Sanskrit term for "the one," is the fastest supercomputer in Asia today and a main component of 3D animation work in Pune, India. Eka was last used in the rendering of the film Roadside Rodeo, India's first international-quality animated film, and accomplished the rendering job in five months. The supercomputer has been pivotal in reducing rendering times for animation frames, computer generated imagery (CGI), visual effects (VFX), and compositing in the domains of high end 3D modelling, 2D and 3D animation, and game asset development (Mediafreaks 3D Animation Studio, 2009).

A study at Purdue University entitled "Using 3D Computer Animation Tools to Render Complex Simulations" created several simulations and visualizations of the terrorist attack on the Pentagon building that happened on September 11, 2001. The process took advantage of both animation and finite element analysis (FEA) simulation techniques for visualization. During the main production of the project, supercomputers were used to obtain useful results; however, the project was halted due to a lack of computing resources. The researchers are considering the possibility of distributing the solutions to many computers (distributed computing), which could hold the key to scaling the simulations (Meador and Chourasia, 2003).

In spite of the great speed that supercomputers offer, only big companies can afford their high price. Hence, other systems, such as distributed computing systems, are being considered as alternatives. Today, since technologies are replaced almost monthly by new ones and stacks of old processors keep growing, old computers can be recycled to create a distributed computing system. Through this, we can have the performance of a supercomputer at an affordable price.

The studies presented above all offered high performance computing. The making of the film Roadside Rodeo and the animation of the terrorist attack on the Pentagon both used supercomputers for rendering, while the studies on grid computing provided an alternative by implementing grid computing for rendering.
These studies may provide great speed, but not low cost, because they used expensive processors and even supercomputers. The study of Chao-Tung Yang and Yao-Chung Chang, the "Grid Based Computer Animation Rendering" study in Singapore, and the other studies mentioned earlier rendered their tasks using grid computing: they interconnected computers in one area and at different sites over the Internet. However, this kind of setup is difficult to maintain because of the different locations of the clients. Our study aims to use the unused PCs of the engineering department, which should give us performance comparable to the previous studies at an affordable price. We also aim to achieve high performance 3D rendering through tightly coupled, physically connected systems. With the processors based in one location, maintenance and troubleshooting will be easier.

Chapter 3
METHODOLOGY

Process Construction

Figure 3.1 Process Construction

The first step in this study is to find and gather as many available PCs as are suitable for the research purposes. The number of CPUs used affects the rendering speed: the more PCs attached to the distributed system, and the better their specifications, the faster the rendering. For this study, we will be using the unused PCs from the Engineering Laboratories of the Ateneo de Naga University as well as PCs from the CISCO Laboratory. We will be using 12 old PCs and 12 new PCs from the CISCO Laboratory.

Once the PCs have been selected, the operating system to be used will be installed. The next step is to configure a server. After configuration, the server and the PCs will be joined together in a communications network so that information can be exchanged between the different devices. A domain will then be created, and the PCs will be joined to this domain. Through this, user management is consolidated on one central server.

User management is an essential distributed system requirement, for it controls the actions of every member of the system. It assures security by controlling members' access to the rendering queue: reprioritizing jobs, stopping them, restarting specific frames, changing the frames to be rendered, and so on. A rendering queue is a sequence of render jobs waiting to be processed. User management grants permissions as well as restrictions to users; logging in and changing credentials on the computers will require permissions that can be managed on the server rather than by monitoring individual PCs in the system. Having a server in the system also handles an important network consideration: network storage. Each PC in the system will need access to the same location, from which it will read its 3D instructions and in which, after executing them, it will store its results. Since the server is already configured, centralizing these resources can easily be done by sharing access to a common location.

The next step is to install the programs that the system needs to perform rendering. The major programs to be used are rendering software and queuing software. This project will use open source distributed queuing software as the middleware for distributed computing. This queuing software will have the function of dividing a rendering job into multiple parts and deciding which PC in the system executes which part, and when. We will also be using open source 3D rendering software. Free queue managers and 3D rendering programs are available. A simplified sketch of how such a queue manager might split one job into per-node frame ranges and turn them into render commands is shown below.
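The sketch below is an illustration only, written in Python; it is not the queuing software this project will use. The renderer command name, its --scene/--start/--end options, and the node names are placeholders, since the actual open source programs are not named here. It shows the core idea: one rendering job is split into contiguous frame ranges, and each range becomes a command destined for one PC in the system.

    # Illustration only: assign contiguous frame ranges to nodes and build one
    # render command per node for a hypothetical command-line renderer.
    def build_commands(scene_file, first, last, nodes):
        total = last - first + 1
        per_node = -(-total // len(nodes))             # ceiling division
        commands = []
        for i, node in enumerate(nodes):
            start = first + i * per_node
            if start > last:
                break                                  # more nodes than chunks
            end = min(start + per_node - 1, last)
            commands.append((node, f"renderer --scene {scene_file} --start {start} --end {end}"))
        return commands

    if __name__ == "__main__":
        nodes = [f"pc{i:02d}" for i in range(1, 13)]   # twelve placeholder node names
        for node, cmd in build_commands("design.cad", 1, 100, nodes):
            print(f"{node}: {cmd}")                    # a real queue would dispatch these

In the real system, the queue manager would dispatch such commands to the nodes, watch for nodes that fail to report back, and collect the rendered frames into the shared network storage described above.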
However, these rendering programs are not designed for distributed systems, and the queue manager knows nothing about rendering jobs. Our task is to make these two pieces of software work as a whole. Since we will be using an open source queue manager and open source 3D rendering software, the source code can, and will, be edited so that the programs work together.

The final step in the process is testing and analysis. The system will be tested to determine whether the objectives of this study are met. If the objectives are met, the study is successful. If they are not met, additional PCs will be added to the system to improve rendering speed; these additional devices will be joined to the domain, configured to share the common storage location with the system, and tested again.

Physical Setup

In this project, we will join old PCs and a server in a network so that the devices can communicate and exchange information with one another. The physical setup of the devices is shown in Figure 3.2.

Figure 3.2 Physical Setup

As shown in the figure, the devices to be used are old PCs, a server, a gigabit switch, and network cables. The rendering will be done on one of the PCs. The old PCs will be installed with an operating system and the programs necessary to support the distribution of rendering processes. The server will be responsible for user management and resource sharing in the system; it is also responsible for dividing the rendering processes and distributing them to the PCs. The gigabit switch is the intermediary device that joins the end devices to the network, with the end devices connected to the switch through the network cables.

Test and Evaluation

Testing will be done by comparing the rendering time of 3D files on a distributed system of low-specification PCs (old computers) and on a distributed system of higher specification PCs (CISCO computers). 3D rendering files will be acquired and rendered on the two distributed systems. The results should show that rendering with the higher specification PCs is faster than with the lower specification ones. After the trials with 12 PCs in each network, we will decrease the number of PCs in each system, and we will also add PCs with higher specifications to each system. The results should show that decreasing the number of PCs slows rendering, while adding high-specification PCs increases the rendering performance of each system. Results will be recorded in a table and graphed once all the needed data have been gathered.

Using the data gathered, an analysis will be made and the rendering speeds will be compared. This should support the claim that the greater the number of PCs and the higher their specifications, the faster the rendering. A positive result will support the idea that a distributed computing system with a certain number of PCs can achieve the same speed as supercomputers or high-specification PCs. If the rendering results are not successful, troubleshooting will be done by adding and configuring more computers in the system and updating the queue manager to achieve a faster rendering speed.

Cost Estimate

Item                        Cost
16-port Gigabit Switch      9,000 PHP
20 RJ45 connectors            120 PHP
10 meters UTP cable           150 PHP

Table 2 Cost Estimate

References:

Sterling, T.L., et al. (1995). Enabling Technologies for Petaflops Computing. The MIT Press, US.

Graham, S.L., et al. (2004). Getting Up to Speed: The Future of Supercomputing. The National Academies Press, Washington, DC.
Baker, M., Buyya, R. (1999). Cluster Computing: A High-Performance Contender. Technical Activities Forum, pp. 79-83.

Hope, L., Lam, E. A Review of Applications of Cluster Computing.

Yang, L.T., Guo, M. (2006). High-Performance Computing: Paradigm and Infrastructure. John Wiley & Sons, Inc., Hoboken, NJ.

Ananda, A.L., Srinivasan, B. (1991). Distributed Computing Systems: Concepts & Structures. IEEE Computer Society Press, Los Alamitos, CA.

Yang, C., Lai, C., Li, K., Hsu, C., Chu, W. (2005). On Utilization of the Grid Computing Technology for Video Conversion and 3D Rendering. Parallel and Distributed Processing and Applications, Third International Symposium, ISPA 2005, pp. 442-453.

Chong, A., Sourin, A., Levinski, K. (2003). Grid Based Computer Animation Rendering. GRAPHITE 2006, 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia, ACM, New York, NY, USA.

Ziff Davis Publishing House Inc. (2001). ExtremeTech (April 2002). Retrieved from http://www.extremetech.com/article2/0,2845,1154100,00.asp

IBM (2005). High-performance Linux clustering, Part 1: Clustering fundamentals. Retrieved September 2009, from http://www.ibm.com/developerworks/linux/library/l-cluster1/

Bookman, C. (2003). Linux Clustering: Building and Maintaining Linux Clusters. New Riders Publishing, USA.

Morcillo, C.G., et al. (2008). 3D Distributed Rendering and Optimization Using Free Software. European Journal of the Informatics Professional, Volume III, pp. 45-53.

Meador, W.S., Chourasia, A. (2003). Using 3D Computer Animation Tools to Render Complex Simulations. American Society for Engineering Education Annual Conference & Exposition.

Birn, J. (2002). 3D Rendering. Retrieved August 2009, from http://www.3drender.com/glossary/3drendering.htm

3DStormstudio, Inc. (2009). 3D Architectural Renderings & Walkthrough Animations. Retrieved August 2009, from http://www.3dstormstudio.com/

Mediafreaks 3D Animation Studio (2009). Eka, Asia's Fastest Supercomputer, Keys India 3D Animation. Retrieved September 2009, from http://blog.media-freaks.com/eka-asias-fastest-supercomputer-keys-india-3d-animation/

Bell, G., Gray, J. (2002). What's Next in High-Performance Computing? Communications of the ACM, Volume 45, Issue 2, pp. 91-95.

Silberschatz, A., et al. (2005). Operating System Concepts, 7th Edition. John Wiley and Sons, Inc., US.

Saldaña, R., et al. (2001). Development of a Beowulf-Class High Performance Computing System for Computational Science Applications. Science Diliman (A Journal of Pure and Applied Sciences), Vol. 13, No. 2, pp. 97-99.