"Virtual Classroom" ENEE408G Fall 2002 Final Project Final Report Group 1 Madhvi Jain Jonathan Shalvi Yasin Ozer Frank J. Patrum Table Of Contents 1 Scope _____________________________________________________________ 3 1.1 Identification_________________________________________________________ 3 1.2 Overview ____________________________________________________________ 3 1.3 General Description ___________________________________________________ 3 2 Requirements_______________________________________________________ 3 2.1 Hardware ___________________________________________________________ 3 2.2 Software ____________________________________________________________ 4 2.2.1 Server __________________________________________________________________ 5 2.2.2 Client ___________________________________________________________________ 7 3 Scheduling ________________________________________________________ 12 4 Conclusion _____________________________________________________ 14 Table of Tables and Figures Figure 2.1-1: Hardware Block Diagram .................................................................... 4 Figure 2.2-1: Software Block Diagram ...................................................................... 5 Figure 2.2.1-1: Server GUI Screen Shot .................................................................... 7 Figure 2.2.2.3-1: Client GUI Initial Design.............................................................. 10 Figure 2.2.2.3-2: Client GUI Screen Shot Final ...................................................... 11 Figure 2.2.2.3-3: Student and Overhead Child GUI Screenshots for Client ........ 12 Table 3.1 Schedule of Tasks ...................................................................................... 13 1 Scope 1.1 Identification This project report considers the basic plan and goals of ENEE408G group 1 to develop a Virtual Classroom type application for desktop personal computer (PC) based systems. This document will define the basic approach and further define the initial tasks accomplished towards the completion of said project. 1.2 Overview The Virtual Classroom is an application with both a server side and a client side. The product was intended to use pre-existing technologies such as UDP to stream video from a server PC to one or more client PCs on a local area network (LAN). Specifically, this application was meant to stream video of a classroom with a teacher using a white board. While similar applications have been developed in the past, this application has slightly different functionality. These differences will be discussed in detail in section two of this document. 1.3 General Description The basic premise of the Virtual Classroom is to provide an "off-sight" group of students with an interactive classroom experience. This is accomplished through the use of Creative Video Blaster USB PC cameras, and Motion Picture Experts Group (MPEG) encoded video being streamed from a server PC to the client PCs. While the server is of utmost importance to this application, the client is where most of the functionality design is implemented. Section two of this document will break down the various components of the Virtual Classroom and explicitly identify subsystems that will comprise each part. 2 Requirements While no specific requirements have been levied on this project by the instructors, we as a team, developed several tasks that we feel set the Virtual Classroom apart from similar applications. 
This section highlights these features and explains them as concisely as possible. It is split into two major subsections, hardware and software; the software subsection is further split into one part for each of the system components.

2.1 Hardware

The first and least time-intensive component of development for the Virtual Classroom application is the hardware. For purposes of this project, we used the Creative Video Blaster USB PC cameras (henceforth, PC cameras) that were designated for class use. Once the system had been tested using pre-recorded video, we intended to attach multiple PC cameras to the USB ports on the server PC to capture live video for streaming to the client PCs. This never actually came about due to difficulties accomplishing other "core" tasks in the allotted time. Two cameras were supposed to supply the primary video stream, and two cameras were in fact used, though never as a live feed: they captured video to files that were then manipulated as per the "first phase" of the project. We also wanted to designate a third camera as a "student camera" and a fourth as an "overhead camera," but once again, due to time constraints, we did not get to implement this functionality.

Besides the PC cameras, the system requires PCs. For this project, the minimum PC requirements are held equivalent to the PCs maintained in the ECE labs in the A.V. Williams building on the University of Maryland College Park campus: generically, an IBM PC compatible machine with the Microsoft Windows XP operating system and a network interface card (NIC) for connection to the LAN. Below is a block diagram depicting the basic premise of the physical system.

[Diagram: the whiteboard cameras and the overhead camera, each with its own field of view, feed a USB hub on the server PC; the server connects over a TCP/IP LAN to the client PCs in the off-site classroom.]

Figure 2.1-1: Hardware Block Diagram

2.2 Software

The software has two major components: the server side application and the client side application. Each is explained in the following sections and broken down further into "modules" that make the explanation of the tasking more efficient. Below is a figure that depicts the data flow through the system and gives a brief glimpse of the module breakdown.

[Diagram: on the server PC, camera data arrives via USB and is converted to raw video, the raw streams are spliced, the result is converted to MPEG, and the stream is transmitted to the network; on the client PC, the incoming stream is converted from MPEG back to raw video, passed through the basic and advanced DSP functions, and displayed in the GUI, with a second GUI for the overhead camera.]

Figure 2.2-1: Software Block Diagram

2.2.1 Server

The server component was designed to comprise several subsystems: video capture, video splicing/encoding, and video streaming, as well as the graphical user interface (GUI). Each subsystem is discussed in detail below.

2.2.1.1 Video Capture

Before the video can be transmitted to the client PCs, it must of course be recorded. The video capture element of the server was designed to interface with the USB cameras and store the data transferred from them. This task required knowledge of the USB interface standard as well as an understanding of memory allocation.
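This subsystem was never implemented, as discussed below, but its intended job can be illustrated. The following is a minimal sketch, assuming the Video for Windows capture API (vfw.h) that Windows XP provides for devices such as the PC cameras; the window parameters, driver index, and capture-to-file flow are illustrative assumptions, not code from the project.

```cpp
#include <windows.h>
#include <vfw.h>
#pragma comment(lib, "vfw32.lib")

// Capture live video from the first USB PC camera to an AVI file.
BOOL CaptureToFile(HWND hwndParent, LPCSTR szAviFile)
{
    // Create a child capture window and connect it to capture driver 0,
    // which would be the first PC camera enumerated by the system.
    HWND hwndCap = capCreateCaptureWindow("capture", WS_CHILD,
                                          0, 0, 320, 240, hwndParent, 1);
    if (hwndCap == NULL || !capDriverConnect(hwndCap, 0))
        return FALSE;

    // Stream frames to disk; this mirrors how the pre-recorded files
    // used for the first-phase tests could have been produced.
    capFileSetCaptureFile(hwndCap, szAviFile);
    capCaptureSequence(hwndCap);   // blocks until capture is stopped

    capDriverDisconnect(hwndCap);
    DestroyWindow(hwndCap);
    return TRUE;
}
```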
In the end this subsystem was never written; it was intended to be one of the last primary functions implemented, as the schedule in section 3 shows, and the difficulty of interfacing with the USB standard, together with the work of integrating the various parts of the system with the GUI, kept us from reaching it. For initial system tests we used pre-recorded video, as stated previously.

2.2.1.2 Video Splicing/Encoding

Given a raw video signal, the server takes the two streams (from two PC cameras or from two pre-recorded files) and splices them together. This is the first actual DSP task that the Virtual Classroom implements based on new algorithms. After the two streams have been acceptably spliced into one "seamless" image, the video was to be encoded for transmission. The splice code was written and demonstrated successfully at the project demonstration, but the actual video encoding never got implemented. For the sake of speed and size, we intended to use a pre-existing MPEG-4 codec for the stream, but in the end found that we would likely send a simple array of binary data, to be manipulated using the Dali libraries, rather than a full video stream.

2.2.1.3 Video Streaming

The original design was to send an MPEG encoded video stream from the server PC to the clients via the Transmission Control Protocol/Internet Protocol (TCP/IP). We eventually chose to attempt this capability using UDP instead: its less stringent error checking tolerates data loss, which we felt was necessary to keep the video transmission timely. In the code base, the Microsoft socket libraries were used, with the CAsyncSocket class as the primary mode of implementation. Again, due to time constraints, we did not actually get this functional.
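Although the streaming never became functional, the intended send path can be sketched with the class named above. This is a minimal sketch, assuming the "simple array of binary data" payload described in section 2.2.1.2; the port number, chunk size, and class layout are illustrative assumptions, and AfxSocketInit() must be called during application startup.

```cpp
#include <afxsock.h>   // MFC sockets; AfxSocketInit() must run at startup

// Assumed port; the report does not specify one.
const UINT kVideoPort = 5000;

class CVideoSendSocket : public CAsyncSocket
{
public:
    // SOCK_DGRAM selects UDP: no connection and no retransmission,
    // trading reliability for the timeliness the design called for.
    BOOL Init() { return Create(0, SOCK_DGRAM); }

    // Send one frame's worth of raw samples to a client. UDP datagrams
    // are size-limited, so the frame goes out in chunks; a real
    // implementation would add sequence numbers so the client could
    // detect loss and reassemble frames.
    int SendFrame(const BYTE* pFrame, int cb, LPCTSTR szClientAddr)
    {
        const int kChunk = 8192;
        int sent = 0;
        while (sent < cb) {
            int n = (cb - sent < kChunk) ? cb - sent : kChunk;
            if (SendTo(pFrame + sent, n, kVideoPort, szClientAddr) == SOCKET_ERROR)
                return SOCKET_ERROR;
            sent += n;
        }
        return sent;
    }
};
```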
2.2.1.4 Graphical User Interface (GUI)

The final part of the server component is the GUI. While we believed this would be the easiest subsystem to code (using the Visual Studio SDK provided for the class), that impression proved false. Given the crucial nature of this component for a modern project, the difficulties we faced in getting a working GUI caused delays in every other aspect of the project. The following paragraphs contain the basic premise we used in the design of our GUI.

For our application to succeed, it must have an intuitive, easy to use, friendly interface for both the server and the client. Items necessary in this GUI are listed below.

- Window(s) for viewing the source video
- Window for viewing the "spliced" video
- Button to start and stop transmission
- Menu for the video codec to be used (for future expansion of the system)
- Menu for video source selection (different camera options or pre-recorded video)

Other features that may be implemented to make the server more user friendly are listed below.

- Window showing the IP address of the server (to be used by "dialing in" clients)
- Window showing the number of clients connected
- Button to accept or reject clients, individually or in groups (allowing more selective participation and preventing "unwanted" clients from "eavesdropping" on the class)
- Record button allowing the stream to be stored in memory for later use as a "pre-recorded" source

Below is a screen capture of the server GUI as implemented at the time of the final project demonstration and presentation.

Figure 2.2.1-1: Server GUI Screen Shot

2.2.2 Client

The heart of the Virtual Classroom is the client component. This is where the majority of the functionality that sets the Virtual Classroom apart from similar technologies is implemented. It is the client's capability to manipulate the final video stream and see what he or she wants that makes the Virtual Classroom unique. Several subsystems make up the client application: video input and display, video manipulation, and, of course, the GUI.

2.2.2.1 Video Input and Display

We intended for the video from the server to be received as a UDP stream on the client PC. This stream was to be converted to its proper format, using pre-existing code, by stripping off the UDP wrappers. Once that was accomplished, the video was to be displayed on the GUI for the client to view. This required "viewer" software capable of understanding and decoding the applicable video compression algorithm. As stated in section 2.2.1, we never got the video streaming implemented. However, the client application does play pre-recorded video files in both MPEG and AVI format: the video plays using the MCIWnd functions from the Microsoft libraries, although the audio portion of the system has not yet been implemented.

2.2.2.2 Video Manipulation

Once the video is displayed, the user was to be capable of manipulating it in several ways. These manipulations break down into two major categories: "basic" Digital Signal Processing (DSP) functions and "advanced" DSP functions. Each is covered in the subsections that follow.

2.2.2.2.1 Basic DSP

The basic DSP functions in the client application can be accomplished in several different manners: manipulation of a video stream, manipulation of still images pulled from a video stream, or some combination of both. Two tasks were to be implemented under the "basic" DSP module: zoom/pan, and time "reversal" or "jumpback." A stand-alone sketch of the zoom/pan idea appears at the end of this subsection.

The zoom function allows the user to zoom in on a selected area of the video by a factor of two. We intended this to be a pull-down menu allowing a selection between 2X and 5X, but the functionality was hard coded using the Dali libraries as a factor-of-two zoom. The function written by Yasin to accomplish the same zoom was also hard coded at a factor of two but was not implemented. Both sets of code are included in the group project folder on the S: drive as well as on a CD-ROM provided with this report.

The pan function was to let the user focus on one section of the board once the zoom had been activated. The dynamic version of this function was never accomplished; the pan is currently hard coded using the Dali libraries along with the zoom functions. The function does allow further development toward a dynamic pan capability; we simply did not have enough time to implement it.

The time reversal function was meant to allow the user to jump back in time at preset increments such as 10 seconds, 30 seconds, and 1 minute. This functionality was never implemented or coded. We believe it will not be difficult to implement in the future, using either the I frames or specific array locations (assuming we continue passing simple arrays instead of full video streams) to locate the desired time and then "jump" back to that point in the video to replay it.
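Since the Dali-based zoom/pan code lives in the project folder rather than in this report, the following stand-alone sketch merely illustrates the idea behind the hard coded behavior; it is not the Dali code or Yasin's function. It assumes 24-bit RGB frames in row-major order.

```cpp
// Zoom a 24-bit RGB frame by a factor of two: each pixel of the
// half-size source window anchored at (x0, y0) is replicated into a
// 2x2 block of the full-size output frame. (x0, y0) must satisfy
// x0 <= width/2 and y0 <= height/2; changing it re-pans the view.
void Zoom2x(const unsigned char* src, unsigned char* dst,
            int width, int height, int x0, int y0)
{
    for (int y = 0; y < height; ++y) {
        const unsigned char* srcRow = src + (y0 + y / 2) * width * 3;
        unsigned char* dstRow = dst + y * width * 3;
        for (int x = 0; x < width; ++x) {
            const unsigned char* sp = srcRow + (x0 + x / 2) * 3;
            dstRow[x * 3 + 0] = sp[0];   // R
            dstRow[x * 3 + 1] = sp[1];   // G
            dstRow[x * 3 + 2] = sp[2];   // B
        }
    }
}
```

Tying (x0, y0) to a mouse click in the video window would supply the dynamic pan that was left for future work.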
2.2.2.2.2 Advanced DSP

The advanced DSP functions were considered "additional" features that would not be implemented until the primary implementation had been tested. While this did not completely preclude their development, we placed our priority on the basic functionality. That said, the two major functions we wanted to implement under the "advanced DSP" umbrella were the ability to remove obstacles digitally (i.e., the professor) and the ability to include multiple video streams. The multiple video streams were never even attempted. Madhvi produced a basic algorithm that she intended to use as the basis for the professor removal function, but unfortunately, due to its incompleteness, it is not included anywhere in our project.
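Because Madhvi's algorithm is not included, the following is only a generic sketch of one common approach to obstacle removal (per-pixel background substitution), not her method. It assumes grayscale frames, seeds the background as a blank white board, and uses an arbitrary threshold.

```cpp
#include <vector>
#include <cstdlib>

// Per-pixel background substitution: keep an estimate of the unoccluded
// board and paint it over any pixel that deviates sharply (the moving
// professor). The background is seeded white; a real version would seed
// it from a professor-free frame.
class BoardBackground
{
public:
    explicit BoardBackground(int nPixels) : m_bg(nPixels, 255) {}

    void Process(unsigned char* frame, int nPixels, int threshold = 40)
    {
        for (int i = 0; i < nPixels; ++i) {
            if (std::abs(frame[i] - m_bg[i]) < threshold)
                m_bg[i] = frame[i];   // small change: board content evolving
            else
                frame[i] = m_bg[i];   // large change: occluder, show board
        }
    }

private:
    std::vector<unsigned char> m_bg;  // last known unoccluded board image
};
```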
2.2.2.3 GUI

As with the server side of the application, the GUI was expected to be the simplest coding task while still being crucial. On the client especially, the GUI is of the highest priority with regard to design: if users do not like the look and feel of the GUI, we lose customers. We therefore took a lot of time to carefully plan and design the client GUI. Following are the requirements we felt were necessary to make this a successful application:

- Window for viewing the video stream
- Menu item for selection of source (place to input the server IP address)
- Pan control based on mouse click in the video window
- Zoom button with pull-down menu for step size
- Jump button for "replay" with pull-down menu for step size
- Automatic video codec recognition
- Toggle buttons to activate GUIs for the "student camera" and "overhead camera"

While the above are minimum requirements for the GUI, the following are possible options for a more user friendly interface:

- Option to save video for later playback
- Option to "screen capture" the video as a still and print it
- Audio adjustment control (volume slide)

Following are two screen shots of the client GUI. The first is the initial design, and the second is the implementation at the time of the final demonstration. The difference is primarily due to the fact that we could not tie the video playback functions to the video "frame" on the GUI, so the frame was removed.

Figure 2.2.2.3-1: Client GUI Initial Design

Figure 2.2.2.3-2: Client GUI Screen Shot Final

Below are two more screenshots showing the two child windows that were implemented. These windows were designed with the idea that video from multiple streams would play in them while the primary stream played in the main client GUI.

Figure 2.2.2.3-3: Student and Overhead Child GUI Screenshots for Client

3 Scheduling

While this project involved a lot of tasks, the breakdown into subcomponents and "modules" made it easier for us to develop the various parts, as well as being a more efficient technique for managing the work. With that in mind, we as a group decided to break the tasks down into categories of interest for each group member. Each member chose a task or category for which he or she would be the Subject Matter Expert (SME) with primary control. It was the SME's job to ensure that the modules in his or her domain were accomplished according to the schedule set forth below. While the SME had overall authority over a task, he or she did not act as the sole source of manning for it; each SME also chose a "subordinate" who worked in conjunction with the SME to accomplish all the tasks according to the schedule. This allowed each team member to actively pursue the tasks that most interested him or her while still getting a good grasp of the other tasks involved in completing the project as a whole.

Below is a breakdown of the major task categories and their respective SMEs as initially planned. Each member was the SME for one major task and a subordinate on at least one other.

Major Task/Category           SME/Subordinate (alternate)
GUI/Architecture              Jon/Madhvi
Basic DSP                     Yasin/Madhvi
Video/Image Capture           Jon/Frank
Advanced DSP                  Madhvi/Yasin
Streaming/Communications      Frank/Jon
Logistics/File Preparation    Frank/Yasin

Some changes were made to the tasking as the project unfolded. These changes primarily affected Jon and Frank: Frank took over the role of GUI SME, and Jon did more of the architecture work. The communications and streaming issues were handled fairly equally by Jon and Frank throughout the course of the project. While the list above shows the major categories, the tasks were further broken down as detailed in section 2 of this document. Following is the basic schedule set forth at the onset of the project. To achieve these milestones, the tasks were broken down for individuals to work on and report back to the SME, and finally to the group as a whole for acceptance or rejection.

Table 3.1: Schedule of Tasks

Task                                              Assignee      SME      Due Date
GUI (code via SDK)                                Jon           Jon      Mon 03 Nov
Research of USB, TCP/IP (selection of initial
  code base from pre-existing code)               Frank         Frank    Mon 03 Nov
Basic DSP on Client: jump back and scaling        Yasin         Yasin    Wed 13 Nov
Basic DSP on Client: pan                          Madhvi        Yasin    Wed 13 Nov
Code Documentation / Users Guide                  Frank/Yasin   Frank    Fri 22 Nov
Final Report Write Up                             Frank         Frank    Mon 16 Dec
Final Report/Presentation                         Group         Frank    Fri 13 Dec
Server Code Base: Video Splicing                  Madhvi        Madhvi   Fri 22 Nov
Video Capture/Encoding                            Jon/Frank     Jon      Fri 22 Nov
Multiple video streams                            Frank         Frank    TBD
Advanced DSP on Client: Professor Removal         Madhvi        Madhvi   TBD
GUI addition                                      unassigned    Madhvi   TBD

Note: TBD due dates are to be determined and are currently considered beyond the scope of the initial project requirements. Should we have time, we will attempt to implement these functions once we have a working prototype application.

This schedule was drastically affected by the difficulties Frank had implementing the GUI. In the end the documentation and presentation were accomplished as expected; however, the other tasks that were completed were not finalized until the very last day of the project.

4 Conclusion

In conclusion, this project was a very difficult task that took much more work than we as a group anticipated. We expected some difficulties with the DSP portions of the code, as well as with the concept of multiple cameras and streaming the video. We found, however, that the DSP was actually easier for us to design and get functional, because of the stronger background we had from the material presented in class and in previous classes. For us, the GUI programming and the integration of the DSP code with the GUI were the most difficult portions of the project, which was a highly unexpected turn of events. We believe, as a group, that had we had the entire semester with the design already in mind and the ability to work on it, our project would have been in a much more finished state for the demonstration. As it was, we felt that we accomplished most of the key features we wanted, in design if not in integration, and that we learned a lot from this project as a whole.
It gave us a clear idea of project management as well as how to deal with unexpected scheduling issues and difficulties in the design.