Department of Electrical and Computer Engineering
California State University, Los Angeles
5151 State University Drive
Los Angeles, CA 90032 USA
Abstract ─ This paper focuses on the development of a high-performance information server for web-based education. An innovative model of software architecture is provided to effectively utilize the computational power of a parallel server platform for efficient, on-demand astronomical image browsing through the Internet.
Our previous research revealed the demand for astronomical image browsing raised by various communities engaged in educational and research activities. Additionally, we have characterized network performance under different levels of activity and identified techniques for efficient image transmission over the Internet. Based on our findings, we have developed a parallel server which will handle an arbitrary (typically large) number of simultaneous requests for astronomical image files from distributed clients. The server is capable of disseminating data at different transmission rates to accommodate the various network bandwidth restrictions, real-time display requirements, and/or image resolution requirements of different communities.
In our design, a tuple space programming paradigm is used to enable parallel processing of the image browsing requests. This architectural model supports automatic load balancing to fully utilize the computational power provided by the parallel server. Additionally, a hashing algorithm is used for fast look-up of astronomical image files in the database. Since different image resolutions and transmission rates may be required, multiple worker processes (known as threads) are employed to perform progressive, on-demand image decompression and transmission using a wavelet-based transformation algorithm. This approach facilitates efficient use of system processing and communication resources while providing the flexibility to serve a diverse clientele. The various service parameters can be explicitly defined by the client or implicitly analyzed by a controller thread on the server side, thus providing the “best effort delivery” given realistic constraints.
Keywords ─ Aerospace, high performance server, tuple space, hashing, wavelet compression, parallel processing, web-based education, image transmission
I. INTRODUCTION
The National Aeronautics and Space Administration (NASA) in 1994 provided funding to establish the Structures, Pointing, and Control Engineering Laboratory (SPACE Lab) at California State University, Los Angeles (CSULA). The goal of this laboratory is to design and fabricate platforms that resemble the complex dynamic behavior of a segmented space telescope, the James Webb Space Telescope (JWST) [1], and its components. Since its inception, the laboratory has made efforts to use the most current computer technologies to develop a prototype information server for the purpose of disseminating multimedia files related to the project. The target audience includes communities ranging from professional and amateur space scientists to students, educators, and the general public. Such efforts meet NASA’s mission to encourage space exploration and research through education. To promote the awareness of NASA’s missions, current digital technologies can be used to facilitate the establishment of networks not only for scientists and engineers of today, but for generations to come.
This paper focuses on the development and implementation of an Aerospace Information Server to support efficient, on-demand information dissemination.
The SPACE Lab has undertaken the development of a parallel tuple space server (Aerospace Information Server, AIS) for web-based education. Specifically, the design is focused on, but not restricted to, astronomical image browsing. Connection to the server will be Internet-based. As such, the server must be able to cope with multiple simultaneous image browsing requests. Additionally, the AIS must be able to support various data transfer rates due to network bandwidth restrictions, real-time display requirements, and different image resolutions as dictated by the diverse communities accessing the server. In order to meet these requirements and maintain the server’s speed and efficiency, several technologies have been incorporated into the server, as listed below:
1. Tuple space programming paradigm for parallel processing and automatic load balancing [7].
2. Search algorithms utilizing a hash table for expedited access to database files [5].
3. Wavelet-based image transformation algorithms for progressive image compression/decompression, and image transmission [14], [15].
The paper is organized as follows: Section II introduces the hardware of the server. Section III describes the software architecture of the server system. Section IV details the technologies employed. Section V describes the implementation issues. Section VI concludes the paper.
II. SYSTEM DESCRIPTION OF THE AEROSPACE INFORMATION SERVER
In order to implement the key technologies stated above and maintain real-time performance, a state-of-the-art computer system must be utilized. The Dell PowerEdge 1855 Blade Server, as shown in Fig. 1, was selected as the foundation for the AIS. The modular nature of this system facilitates scalability while minimizing power consumption and physical space required. Multiple server blades are housed in a chassis that contains power supplies, communication modules, and cooling fans shared by the entire system. Each server blade contains two dual-core 64-bit Xeon processors with up to 16 GB of DDR2 shared memory. The two Xeon processors are interconnected by a dual front-side bus running at 667 MHz (see Fig. 2). Each of the cores is outfitted with its own L1 cache, while the two cores on each chip share an L2 cache (2 MB). All four cores have shared access to main memory. The Xeon processor supports Hyper-Threading technology, and hence, two software threads can be established simultaneously in each core of the processor [10].
III. SOFTWARE ARCHITECTURE
The Dell PowerEdge 1855 Blade Server offers many architectural features that optimize system performance, making it a well-suited platform for the AIS. Dual-Core and Hyper-Threading technologies are used to implement parallelism and share memory spaces within the server. The software architecture of the AIS was developed to exploit these features and optimize performance. Figure 3 displays a flowchart of the AIS. Each dual-core Xeon processor contains four virtual software threads. Each thread is assigned individual tasks depending on its role. Three of these threads are designated as Worker threads and one as a Controller thread. The Controller thread is responsible for initially connecting to and receiving requests from a client. The Controller thread also manages the tuple space, which stores a pool of requests made by clients. The three Worker threads operate in parallel and handle client requests (e.g., searching the server’s database for the image corresponding to a specific image query). Upon completing their assigned tasks, Worker threads reconnect to the client in order to communicate the results. A more detailed view of information flow within the AIS is given in the following sections.
Fig. 1. Dell PowerEdge 1855 Blade Server chassis (left) and single blade server (right)
Fig. 2. Two Intel dual-core Xeon processors on a single blade server
Fig. 3. Flowchart of the Aerospace Information Server
IV. TECHNOLOGIES EMPLOYED
Tuple space is an associative (content-addressable) memory model for parallel and distributed systems. It provides a shared repository in which tuples are deposited and from which they are retrieved [8]. Through this approach, multiple processing units are able to retrieve tasks from a common pool of requests. Whenever a Worker thread becomes available, it accesses the tuple space to retrieve a request and process it accordingly. In this way, automatic load balancing is realized among the threads.
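The retrieval pattern described above can be illustrated with a minimal sketch. It is not the AIS implementation: the class `TupleSpace`, its methods `out` and `take`, the helper `run_workers`, and the file names are all hypothetical, and Python threads stand in for the server's hardware threads. Idle workers pull the next matching tuple from the shared pool, so work is distributed without an explicit scheduler.

```python
import threading


class TupleSpace:
    """Minimal in-memory tuple space (hypothetical, for illustration only)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Deposit a tuple and wake any waiting workers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, pattern, timeout=None):
        """Remove and return the first tuple matching pattern.

        A None field in the pattern matches anything. Returns None if no
        matching tuple appears before the timeout expires.
        """
        def find():
            for i, tup in enumerate(self._tuples):
                if len(tup) == len(pattern) and all(
                    p is None or p == f for p, f in zip(pattern, tup)
                ):
                    return i
            return None

        with self._cond:
            if not self._cond.wait_for(lambda: find() is not None, timeout):
                return None
            return self._tuples.pop(find())


def run_workers(space, n_workers=3):
    """Start workers that drain the space; return {worker_id: [file names]}."""
    handled = {w: [] for w in range(n_workers)}

    def worker(wid):
        while True:
            req = space.take(("request", None), timeout=0.2)
            if req is None:          # pool stayed empty: stop this worker
                return
            handled[wid].append(req[1])

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


space = TupleSpace()
for name in ["m31.img", "m42.img", "m51.img", "m57.img", "m81.img"]:
    space.out(("request", name))
handled = run_workers(space)
print(handled)
```

Which worker handles which request varies from run to run; the only guarantee in this sketch is that each request is withdrawn exactly once.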
A hashing algorithm is implemented within the server for high-performance database access [5]. The technology applies an algorithm to a key (in this case, the file name) to produce a numeric value. This value is then mapped to the hard-drive address of a stored image. The collection of all the key-address pairs comprises a hash table. This functional mapping allows for expedited access to database files by directly associating the name of the requested image with its storage location. Thus, a thread needs only to calculate the hash number of the key and obtain the associated address to access the desired image file, rather than searching the entire database on a file-by-file basis.
Additionally, wavelet-based transformation algorithms are integrated into the system in order to progressively compress or decompress images on demand [15]. Figure 4 illustrates the concept of the wavelet-based data transformation. In this figure, an image is divided into regions using separable filters. The region sizes are inversely proportional to the image resolution desired by the user. When the desired resolution has been reached, a sample of the region is taken and transformed into a packet. The packets may then be transmitted and an image rebuilt by a user, based on the contents of the packets [14]. By offering the client the option of selecting a compression ratio based on individual use, only one high-resolution version of an image needs to be stored in the AIS database. This approach greatly reduces the amount of storage required to house the image library.
Fig. 4. A two-scale discrete wavelet decomposition
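As a rough illustration of a two-scale decomposition with separable filters, the sketch below applies a one-dimensional filter to the rows and then to the columns, and repeats the step on the low-frequency quadrant. The paper does not state which wavelet the AIS uses; the Haar averaging/differencing filter, the function names, and the 4x4 test image are assumptions chosen for brevity.

```python
def haar_1d(values):
    """One Haar analysis step: pairwise averages, then pairwise differences."""
    half = len(values) // 2
    avg = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    diff = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def inverse_haar_1d(coeffs):
    """Undo haar_1d: each (average, difference) pair yields two samples."""
    half = len(coeffs) // 2
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out += [a + d, a - d]
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def haar_2d(img):
    """Separable step: filter rows, then columns. The coarse (LL) quadrant
    ends up in the top-left corner."""
    rows = [haar_1d(r) for r in img]
    return transpose([haar_1d(c) for c in transpose(rows)])


def inverse_haar_2d(coeffs):
    cols = transpose([inverse_haar_1d(c) for c in transpose(coeffs)])
    return [inverse_haar_1d(r) for r in cols]


def two_scale(img):
    """Decompose once, then decompose the LL quadrant a second time."""
    n = len(img)
    c = haar_2d(img)
    ll = haar_2d([row[: n // 2] for row in c[: n // 2]])
    for i in range(n // 2):
        c[i][: n // 2] = ll[i]
    return c


def inverse_two_scale(coeffs):
    n = len(coeffs)
    c = [row[:] for row in coeffs]
    ll = inverse_haar_2d([row[: n // 2] for row in c[: n // 2]])
    for i in range(n // 2):
        c[i][: n // 2] = ll[i]
    return inverse_haar_2d(c)


# Hypothetical 4x4 "image" (square, side a multiple of 4).
image = [[4 * i + j for j in range(4)] for i in range(4)]
coeffs = two_scale(image)
print(coeffs[0][0])                 # coarsest coefficient: mean of the image
print(inverse_two_scale(coeffs))    # full reconstruction
```

With this filter the top-left coefficient is the mean of the whole image, which is why a very small number of coefficients already yields a coarse preview.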
V. IMPLEMENTATION OF THE AEROSPACE INFORMATION SERVER
We have used the Hyper-Threading architecture to allow for the definition of different roles among processor threads. These threads include a Controller and multiple Workers. This delegation of roles allows for efficient communication and task scheduling, as well as automatic processor load balancing.
A. Controller Thread
Initially, a client will explicitly input a request, including the desired name and resolution of the file, to the AIS via a specialized graphical user interface (GUI). Note that such features could otherwise be specified by the Controller thread based on the client’s communication constraints and image resolution requirements [2]. The GUI will also take note of the client’s IP address and assign a port number for the client computer to listen on. A tuple is then generated using this information. Tuples used in the AIS adhere to the generalized structure shown below:
<File name, Resolution, IP address, Port number> (1)
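One possible in-memory representation of the tuple in (1) is sketched here. The field names, the encoding of resolution as a fraction of full resolution, the validation rules, and the example values are assumptions; the paper specifies only the four fields and their order.

```python
from typing import NamedTuple


class RequestTuple(NamedTuple):
    """Request tuple with the four fields of (1), in the same order."""
    file_name: str
    resolution: float   # assumed encoding: fraction of full resolution, 0 < r <= 1
    ip_address: str     # client address used when the server reconnects
    port: int           # port on which the client listens for the reconnection


def make_request(file_name, resolution, ip_address, port):
    """Build a request tuple, rejecting obviously invalid values."""
    if not 0 < resolution <= 1:
        raise ValueError("resolution must be in (0, 1]")
    if not 0 < port < 65536:
        raise ValueError("port must be in 1..65535")
    return RequestTuple(file_name, resolution, ip_address, port)


# Hypothetical request: quarter resolution, documentation-range IP address.
request = make_request("m31.img", 0.25, "192.0.2.10", 5000)
print(request)
```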
Upon transmitting its request, the client will then disconnect from the server. However, the client will remain listening on one of its ports for a reconnection request made by the AIS. This process of connecting and disconnecting allows the AIS to gather more requests within a given timeframe. The server does not need to wait until the client’s request has been fulfilled before it can take another request. Instead, the server is able to disconnect from the client and connect to a different client to obtain a new request.
The request tuple is then deposited to tuple space, which is continually monitored by the Controller thread. The tuple will remain there until a Worker thread is able to accommodate the request.
The tuple space region is located in the server’s shared memory. Therefore, all Worker threads are able to access a common pool of requests at the same time. Without synchronization, a tuple could be accessed by more than one thread at once, causing misread data and incorrect results. Thus, in order to maintain the integrity of stored data, semaphores are used. A semaphore locks the memory location to a specific thread until it has finished its task [13]. Once completed, the thread relinquishes control of the shared memory region.
In our implementation, a semaphore is activated while the Controller thread is writing a request into tuple space. In this way, it is ensured that a Worker thread does not attempt to read the request prematurely. Semaphores are also placed on the Worker side to protect the critical section of the shared memory space. When available, a Worker thread will access a request in tuple space and set a semaphore. This makes certain that the Controller thread is unable to overwrite that memory location. Additionally, the memory location is made unavailable to other Worker threads attempting to access the request.
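The locking discipline described above might look as follows in outline. This is a sketch under simplifying assumptions, not the AIS code: a single binary semaphore guards the whole pool rather than individual memory locations, the class `SharedRequestPool` and its methods are hypothetical, and integers stand in for request tuples. The Controller writes while Workers claim concurrently, and the semaphore ensures that no request is read half-written or claimed twice.

```python
import threading


class SharedRequestPool:
    """Request pool guarded by one binary semaphore (hypothetical sketch)."""

    def __init__(self):
        self._requests = []
        self._sem = threading.Semaphore(1)

    def write(self, request):
        """Controller side: hold the semaphore while the request is written."""
        self._sem.acquire()
        try:
            self._requests.append(request)
        finally:
            self._sem.release()

    def claim(self):
        """Worker side: hold the semaphore while a request is removed.
        Returns None when the pool is currently empty."""
        with self._sem:
            return self._requests.pop(0) if self._requests else None


pool = SharedRequestPool()
done = threading.Event()
claimed = {w: [] for w in range(3)}


def controller():
    for request_id in range(200):
        pool.write(request_id)
    done.set()                      # no more requests will arrive


def worker(wid):
    while True:
        finished = done.is_set()    # sample BEFORE claiming to avoid a race
        request = pool.claim()
        if request is not None:
            claimed[wid].append(request)
        elif finished:
            return                  # pool empty and controller finished


threads = [threading.Thread(target=controller)]
threads += [threading.Thread(target=worker, args=(w,)) for w in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print({w: len(reqs) for w, reqs in claimed.items()})
```

The workers in this sketch poll the pool in a loop for simplicity; a production design would block on a condition or counting semaphore instead of spinning.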
B. Worker Threads
The role of the three Worker threads in the AIS is to search the database for the desired image. To minimize the time required to perform image searches, a hashing function is utilized. Hashing functions apply a reproducible algorithm to convert a data element, called a key, into a numerical representation of the data, called a hash number. The hash number is then mapped in a table to the memory location of the particular file. In the AIS, the file name is used as the key. When an image request arrives in tuple space, a Worker thread will apply the hashing function to the file name. The Worker then looks up the hash table entry for the resulting hash number and returns the corresponding memory address. This methodology allows the AIS to advance directly to the location of the image rather than searching the entire database for the file. As such, the location of the image in physical memory no longer affects the search. By utilizing hashing functions, the search time will remain the same regardless of the image’s location in memory. Thus, overall system efficiency is raised.
A key metric for determining the efficiency of hashing functions is collision rate. Collisions occur when different keys produce the same hash number. In that case, multiple records must be stored at a single location in the hash table. To manage collisions, the hash table is organized into buckets. Each bucket number corresponds to a unique hash number. In the case that a collision occurs, a pointer to the memory address of the next image with that hash number is stored in the table. Using this approach, fast and efficient searches are preserved.
The Cyclic Redundancy Check, 32-bit (CRC-32) algorithm was chosen as the hash function used in the AIS. This function has many characteristics that suit it to this application. CRC-32 produces a 32-bit numerical output for each input, which is easily handled by the 32-bit registers of the AIS. A low collision rate of 0.03% [5] allows for efficient search implementation with large data sets. The function itself requires few calculations, further reducing system overhead and improving search time.
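A minimal sketch of a CRC-32 keyed lookup with chained buckets is given below. It uses Python's standard `zlib.crc32`; the class `ImageIndex`, the reduction of the hash number modulo a fixed bucket count, and the file names and paths are assumptions made for illustration and are not taken from the AIS.

```python
import zlib


class ImageIndex:
    """Hash table mapping file names to storage locations (hypothetical)."""

    def __init__(self, n_buckets=1024):
        # Each bucket holds a chain of (file_name, location) pairs so that
        # colliding keys can share a bucket.
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket_of(self, file_name):
        """CRC-32 of the file name, reduced to a bucket number (assumed scheme)."""
        return zlib.crc32(file_name.encode("utf-8")) % len(self._buckets)

    def add(self, file_name, location):
        chain = self._buckets[self._bucket_of(file_name)]
        for i, (name, _) in enumerate(chain):
            if name == file_name:            # key already present: update it
                chain[i] = (file_name, location)
                return
        chain.append((file_name, location))

    def lookup(self, file_name):
        """Return the stored location, or None if the file is unknown."""
        for name, location in self._buckets[self._bucket_of(file_name)]:
            if name == file_name:
                return location
        return None


index = ImageIndex()
index.add("m31.img", "/images/0001")
index.add("m42.img", "/images/0002")
print(index.lookup("m42.img"))
print(index.lookup("unknown.img"))
```

Only the chain in one bucket is scanned on each lookup, so the cost depends on the chain length rather than on the size of the image library.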
After the Worker thread has found the desired image in the server, it will then reestablish a connection to the client utilizing the IP address of the client and the port number that the client is listening to. Wavelet transformation is then used on a copy of the requested image. The image is broken down into digital components. Each component, or packet, takes a sample of the image portion it corresponds to. Depending on the client’s initial resolution request, AIS will send a specific percentage of the total number of packets. Additionally, packet size may be altered to further fine-tune image decompression rates. Afterwards, the client reconstructs the image utilizing the received packets. The obtained image quality is directly proportional to the number of packets the client receives. For a high image quality, a large number of packets must be received to accurately reassemble the image. For lower quality images, the opposite holds true. Below are images showing various image qualities when differing numbers of packets are sent (Figs. 5 and 6).
Fig. 5. Original image of Linear Comet, Dec. 2001 (64 packets) [12]
Fig. 6. Image reconstruction after A) 4 packets, B) 6 packets, and C) 32 packets received
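The relationship between the number of packets received and the reconstruction quality can be illustrated with a one-dimensional sketch. Several simplifications are assumed here and are not taken from the paper: a Haar transform is used, each coefficient is treated as one "packet", coefficients are sent from coarse to fine, and the 64-sample signal is synthetic. The packet counts 4, 6, 32, and 64 mirror those in Figs. 5 and 6.

```python
def haar_forward(signal):
    """Full multi-level Haar decomposition; output is ordered coarse to fine."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = avg + diff
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding from the coarsest level upward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [a + d, a - d]
        out[:2 * n] = merged
        n *= 2
    return out


def reconstruct_from(coeffs, k):
    """Rebuild using only the first k coefficients; the rest count as zero,
    as if those packets had never been sent."""
    return haar_inverse(coeffs[:k] + [0.0] * (len(coeffs) - k))


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic 64-sample signal (length must be a power of two).
signal = [(i * i) % 17 for i in range(64)]
coeffs = haar_forward(signal)
errors = {k: mse(signal, reconstruct_from(coeffs, k)) for k in (4, 6, 32, 64)}
for k, err in errors.items():
    print(k, "packets -> mean squared error", err)
```

Because the Haar basis functions are orthogonal, adding packets never increases the error in this sketch, and receiving all 64 reproduces the signal exactly.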
VI. CONCLUDING REMARKS
This paper focuses on the development of the software architecture to support web-based astronomical image browsing. The architecture utilizes the unique features of the underlying server platform to optimize performance. Although the design of the Aerospace Information Server is primarily focused on research and education, the parallel server model can be generalized for other applications with similar features. Examples of this include pay-per-view movie multicasting, multimedia information streaming and distribution, and online gaming. Using the progressive compression and transmission paradigm of the AIS, only one high-quality media file would be needed to serve multiple requests with different requirements. The result is a fast, highly efficient parallel server capable of handling large loads effectively. The continued development of the AIS will include performance evaluation of the utilization of memory and hierarchical resources (e.g., cache memory) at the core and thread levels. Additionally, the Graphical User Interface on the client side will be expanded and will include image viewing features such as zooming, conversion, and selection.
ACKNOWLEDGEMENT
This work was supported by NASA under Grant URC NCC 4158. Special thanks go to the faculty and students associated with the SPACE Laboratory.
REFERENCES
[1] I. Dasheysky, V. Balzano, “JWST: Maximizing Efficiency and Minimizing Ground Systems,” Proceedings of the 7th International Symposium on Reducing the Costs of Space Craft Ground Systems and Operations (RCSGSO), Jun 2007.
[2] J. Dong, P. Thienphrapa, H. Boussalis, C. Liu, et al, “Implementation of a Robust Transmission System for Astronomical Images over Error-prone Links,” Proceedings of SPIE, Multimedia Systems and Applications IX, 2006.
[3] A. Dunkels, O. Schmidt, T. Voigt, “Using Protothreads for Sensor Node Programming,” Proceedings of the RealWSN 2005 Workshop on Real-World Wireless Sensor Networks, June 2005.
[4] J. Foster, M. Price. Sockets, Shellcode, Porting & Coding: Reverse Engineering Exploits and Tool Coding for Security Professionals. Rockland, MA: Syngress Publishing, Inc., 2005.
[5] Z. Genova and K. Christensen, “Efficient Summarization of URLs using CRC32 for Implementing URL Switching,” Proceedings of IEEE Conference on Local Computer Networks (LCN), 2002.
[6] S. Harris, J. Ross. Beginning Algorithms. Indianapolis, IN: Wiley Publishing, Inc., 2006.
[7] K. Hawick, H. James, L. Pritchard, “Tuple-Space Based Middleware for Distributed Computing,” Technical Report DHPC-128, 2002.
[8] H. Lin, K. Quach, W. Zhu, Y. Aung, C. Liu, “Implementation of a High-Available Parallel Transaction System for Mobile Multimedia Communication,” Proceedings of 7th World Multiconference Systemics, Cybernetics, and Informatics (SCI), July 2003.
[9] C. Liu, J. Layland, “Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment,” Journal of ACM (JACM), Vol. 20-1, pp. 46-61, January 1973.
[10] D. Marr, F. Binns, D. Hill, G. Hinton, D. Koufaty, J. Miller, M. Upton, “Hyper-Threading Technology Architecture and Microarchitecture,” Intel Technology Journal, Vol. 6-1, pp. 4-15, February 2002.
[11] W. Martins, J. Del Cuvillo, F. Useche, K. Theobald, G. Gao, “A Multithreaded Parallel Implementation of a Dynamic Programming Algorithm for Sequence Comparison,” Proceedings of International Pacific Symposium on Biocomputing, January 2002.
[12] L. Mikkelsen, "AGS High School Astronomy." Snapshots in November-December 2001. Created 19 Nov 2001. EUC Syd and Amtsgymnasiet. 26 Feb 2007 <http://www.amtsgymsdbg.dk/as/Nov2001/>.
[13] A. Santosa, “Fast Mutual Exclusion Algorithms: The MPI Implementation,” unpublished.
[14] J. Shapiro, “Embedded Image Coding Using Zerotrees of Wavelet Coefficients,” IEEE Transactions on Signal Processing, Vol. 41-12, pp. 3445-3462, December 1993.
[15] Y. Zhao, S. Ahalt, and J. Dong, “Content-based Retransmission for Video Streaming System with Error Concealment,” Proceedings of SPIE, 2004.