Interconference Room Video Communications System
Project Plan

Team
May 03-13

Date of Submission
Tuesday, September 24th, 2002

Client
Senior Design

Faculty Advisor
S.S. Venkata, ECpE Department Chair

Team Members
Noah Korba
Brian Marshall
Nick McInerney
Jalal Saidi
Melissa Weverka

Table of Contents

1 Introductory Materials
  1.1 Abstract
  1.2 Definition of Terms
2 Project Plan
  2.1 Introduction
    2.1.1 General Background
    2.1.2 Technical Problem
    2.1.3 Operating Environment
    2.1.4 Intended Users and Uses
    2.1.5 Assumptions and Limitations
  2.2 Design Requirements
    2.2.1 Design Objectives
    2.2.2 Functional Requirements
    2.2.3 Design Constraints
    2.2.4 Measurable Milestones
  2.3 End-Product Description
  2.4 Approach and Design
    2.4.1 Technical Approaches
    2.4.2 Technical Design
    2.4.3 Testing Description
    2.4.4 Risks/Risk Management
  2.5 Financial Budget
  2.6 Personal Effort Budget
  2.7 Project Schedule
3 Closure Material
  3.1 Project Team Information
  3.2 Summary

List of Figures

Figure 1: A diagram of our system
Figure 2: Project Schedule

List of Tables

Table 1: Financial Budget
Table 2: Personal Effort Budget

1 Introductory Materials

1.1 Abstract
Many large organizations are incorporating telecommunications into their everyday routine. To make this technology available to smaller institutions, a low-cost multi-conference room video-communications system will be created. To accomplish this, the system will be Internet-based and provide two-way audio and video streaming.
This design allows information to be presented around the world.

1.2 Definition of Terms

Asymmetric Key Encryption – AKE is usually known as Public/Private Key Encryption. The most famous AKE algorithm, RSA, is one of the world's strongest overall encryption methods. AKE involves choosing a strength (“bit strength”) and producing a set of keys, which are then owned by the key set owner. The public key is made available to anyone who wishes to send the owner encrypted data. The data is encrypted with the public key and then transmitted to the key set owner. The data is then decrypted using the private key, which is available only to the key set owner. The only way to decrypt this data is to own (or forcefully generate) the private key. For comparison, a 56-bit key for the RC5 cipher, as used in the challenges sponsored by RSA, has about 72 quadrillion (2^56) possible values and takes (using parallel processing) on the order of 100 days or more to crack.

Audio Capture Device – A device attached to the computer, usually plugged into the PCI bus, which converts sound patterns into Raw Audio Data. Usually dubbed a “Sound Card”.

CODEC – See both Compression and Decompression.

Compression – The “CO” in CODEC, compression is the method of representing a piece of data in a smaller size. In the early days of computing, compression was limited to general algorithms designed to shrink the size of raw, arbitrary data. In recent years, compression has become specific, from Audio/Video Compression (such as DiVX ;) or MP3) to Textual-Based Compression. These newer compression methods use a combination of shifting-frame pattern recognition and data removal to shrink the size of data without noticeably sacrificing quality. Most compression utilities can be configured to favor a smaller size over data quality and vice versa.

Decompression – The “DEC” in CODEC, decompression is the method of recreating raw data from a compressed piece of data.
The primary job of the decompression portion is to detect the style of compression used on a file and produce raw data in the best form possible. Many decompression algorithms employ best-guess techniques to predict and reproduce data as close to the original as possible.

Decryption – The process of revealing concealed (“encrypted”) data with the proper authorization.

Demultiplexing – The process of taking a multiplexed stream of data and splitting it back into its original individual parts.

Encryption – The process of concealing data in an unreadable form, so that it can be viewed (“decrypted”) only with proper authorization.

MCP – Acronym for Motion Controlled Platform.

Multiplexing – The process of taking individual data streams and combining them into one single data stream. The result is called a multiplexed stream.

Raw Audio Data – The format of data to which standard audio generation can be applied to produce audible sounds through the user's output device. For example, Compressed Audio Data (such as MP3) cannot be heard unless it is decompressed into Raw Audio Data. Only then can it be successfully heard on the user's audio output device.

Raw Video Data – The format of data to which standard graphical rendering can be applied to produce full-motion video on the user's computer screen. For example, Compressed Video Data (such as DiVX ;), which uses MPEG-4 compression) cannot be displayed unless it is decompressed into Raw Video Data. Only then can it be successfully rendered on the user's video output device.

RPC – Remote Participant Control.

Socket – When someone says socket, they are usually referring to Berkeley Sockets, a programming interface for communications over the Internet. Implementations of Berkeley Sockets are used on virtually all operating systems, from UNIX (IO-Socket libraries) to Windows (WinSock). They provide a seamless way for programmers to create network applications.
Older UNIX implementations contained three socket types – Raw (on top of IP), TCP/IP and UDP/IP. Most implementations now provide access to only UDP/IP and TCP/IP sockets. Sockets allow one to create both clients and servers for network communications.

TCP – TCP stands for Transmission Control Protocol. Often regarded as the core protocol of the Internet, TCP was established in the late 1970s to provide reliable communications over packet-switched networks. Unlike its sibling UDP, TCP uses a three-way handshake to establish a connection between two computers. TCP also detects corrupted data by carrying a checksum in the header of each segment. If a segment is corrupted or lost in transit, TCP's built-in retransmission and flow control ensure that the data arrives intact. If the physical connection prevents packets from being delivered (a break in the line, for instance), TCP will “time out” on the connection and return an error to the user. Most application layer protocols (such as Telnet, FTP, HTTP and SSH) use TCP as their primary transport method.

Stream, Streaming – The word “stream” is used for many different applications. Most notably, socket communications over the Internet are called “streaming” since they move data from one point to another constantly, like a river. “Stream” can also refer to raw unprocessed data, which usually comes directly from a socket buffer.

Symmetric Key Encryption – SKE involves the process of “secret key” encryption and decryption. Unlike AKE, where everyone can know your public key, SKE keys must be held only by those involved in the data exchange. SKE keys are often one-to-one. One of the most basic SKEs is attributed to Julius Caesar. This cipher (known as the Caesar Cipher) is based on the equation E = ((N + K − 1) modulus 26) + 1, where N is the numerical representation of a letter (A=1, B=2, etc.) and K is the modifier.
The modulus 26 ensures that the numerical character range stays between 1 and 26 (A and Z).

Synchronized Multimedia Stream – A synchronized multimedia stream is exactly what it implies: a stream of data in which video is matched up to its corresponding audio. This stream will eventually be split apart by the same means. For example, if you did not convert your multimedia stream to a synchronized multimedia stream, video of you talking might not match up to the corresponding audio. In Hollywood, this is called “bad dubbing”.

UDP – UDP stands for User Datagram Protocol. Sibling to TCP, UDP provides connectionless communications, with no guarantee of data delivery or integrity. UDP has no built-in “timeout”, so it is the sole responsibility of the programmer to determine when enough time has passed to call a connection “dead”. UDP is typically chosen when the overhead of a TCP connection is too much for the connection needed.
UDP is most often used for streaming audio and video applications, where error correction and packet retransmission are detriments to overall performance. A handful of application-layer protocols (DNS, older versions of RealAudio and Shoutcast) use UDP as their transport method. Most programs that require the functionality of UDP for “streaming” implementations are now moving to the Multicast Standard.

USB – Acronym for Universal Serial Bus.

Video Capture Device – A device attached to the computer, usually plugged into a USB port, that converts light patterns into Raw Video Data.

Visual Aid – Any form of graphic that illustrates what a speaker is attempting to communicate, e.g. whiteboard drawings or notes on a piece of paper.

2 Project Plan

2.1 Introduction

2.1.1 General Background
The objective of the Interconference Room Video Communications System is to create a low-cost alternative to current telecommunication technology. To keep this system low cost, some traditional features were modified to make the system more cost effective while maintaining a relatively high level of quality. For example, the Internet will be used as the communication medium, since it is free to use and already established. Also, the rooms will be limited in size to reduce the cost of equipment. Finally, the system will be portable so it can be shared among many people in the institution.

2.1.2 Technical Problem
In order to make the system efficient, the project will be split into two segments: the hardware segment and the software segment.

Hardware
The hardware segment consists mainly of creating, buying and testing the hardware that will interface with the rest of the system. It will also involve creating drivers to interface those hardware devices with the software portion of the project.
Software
The software segment consists of writing a Graphical User Interface, implementing CODECs for audio, video and data compression, creating an encryption and decryption engine for the system, and providing seamless integration with the ever-changing technology existing on the Internet.

2.1.3 Operating Environment
The software will be written for the Microsoft Windows family of operating systems. The hardware will be used indoors, but will need to be reasonably durable, since it is portable.

2.1.4 Intended Users and Uses
The intended users of the multimedia conference system are those who need to conduct meetings with others who are not in the immediate area or cannot attend a meeting in person. It is designed for both the novice and the expert computer user, providing an easy-to-understand interface for the beginner along with the ability to control virtually all aspects of the system for the computer guru.

2.1.5 Assumptions and Limitations
Several assumptions have been made for this project, and the project is also subject to several limitations. Both are defined below.

Assumptions
Conference rooms will be small in size
There will be only two rooms in communication with each other
Each room will contain a maximum of five participants

Limitations
Internet bandwidth
Maximum CPU processing power
Compression strength

2.2 Design Requirements

2.2.1 Design Objectives
The project has been broken down into the objectives listed below.

Display multimedia on both participants' screens
Both conference rooms will be able to see the remote video, as well as hear remote audio.

Compressed and encrypted video and audio streams
The system will use various audio, video and data compression techniques to condense the audio/video streams, to conserve both participants' bandwidth.
The system shall also use an optional medium-strength Asymmetric Key encryption technique to provide data security, so that only the authorized participants are allowed to view the audio/video stream.

Motion Controlled Platform (MCP)
A platform will be created on which a camera can be mounted. This will allow the camera to be moved both up and down and right to left.

Figure 1: A diagram of our system

2.2.2 Functional Requirements
The following defines the functional requirements the end product will meet.

All participants can operate the system
The system will be designed to provide ease of both setup and use, so that participants with a wide range of computer knowledge can operate the system.

Participants can choose to enable encryption and/or compression
The system shall allow the participants to turn both compression and encryption on or off, to increase transmission performance or save CPU cycles.

Local Camera Control
The system shall give each participant local control over their camera, to focus the camera on a specific person or on all of the participants.

Portability
The system can be easily moved and set up in different conference rooms.

2.2.3 Design Constraints
This section defines constraints considered during the design and implementation of the project.

Cost
The system must be low cost while maintaining its functionality.

Local Internet bandwidth
Since video will be streaming over the Internet, there must be sufficient bandwidth to view real-time video at all client computers.

CPU processing power
The server must composite and stream real-time audio and video, as well as process multiple incoming audio streams. This will require significant CPU power.

2.2.4 Measurable Milestones
The following lists the measurable milestones of the project.

Project scope and intended features defined (5%)
Features for the completed system are defined that would create a complete and usable product.
All subsystems' functionalities and interfaces designed (20%)
All subsystems can be implemented using the documented subsystem descriptions and interfaces.

All subsystems function properly under controlled conditions (30%)
Using test inputs, all subsystems pass limited tests and produce the expected output for all features, even if only under controlled inputs.

All subsystems function properly under all conditions (20%)
Using test inputs, all subsystems will work under all foreseeable conditions and produce expected outputs.

Complete system operates under controlled conditions (15%)
Using completed subsystems, the final product can be assembled and produces expected outputs under controlled conditions.

Complete system operates under all conditions (10%)
The completed system will operate in a usable state, with no known errors.

2.3 End-Product Description
The system will connect two small conference rooms with a small number of participants per room. It will broadcast both audio and video from one room to the other using the Internet. The cameras will be mounted on a motion-controlled platform, which allows them to be focused on individual speakers as well as the entire group. The system will be easily portable from one room to another, allowing many people to utilize one system.

2.4 Approach and Design

2.4.1 Technical Approaches
This system will be composed of a software subsystem and a hardware subsystem. The following lists the different approaches that will be considered for the design phase of the project.

Software
The software segment of this system can be approached in several ways. The various software components can be built as self-contained modules that communicate through an external mechanism such as TCP/IP or pipes, or as one module with different components such as DLLs or classes. The network communication can also be approached in one of two ways: as a client-server model or as a peer-to-peer model.
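As a rough illustration of the client-server option above, the following Python sketch shows one room's computer acting as a server and handing a small set of conference settings to the other room over a TCP socket. Python is used only for illustration, since this plan does not name an implementation language. The `SETTINGS` dictionary, its keys, and the function names are hypothetical and not part of the design.

```python
import json
import socket
import threading

# Hypothetical settings a server room might announce; the keys are illustrative only.
SETTINGS = {"compression": "on", "encryption": "off"}


def serve_once(listener: socket.socket) -> None:
    """Accept one client and send it the settings as a single JSON line."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(json.dumps(SETTINGS).encode("utf-8") + b"\n")


def fetch_settings(host: str, port: int) -> dict:
    """Connect as a client and read the settings line sent by the server."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with sock.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


def demo() -> dict:
    """Run server and client on the loopback interface and return what the client got."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # Port 0 asks the operating system for any free port.
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        received = fetch_settings("127.0.0.1", port)
        worker.join()
    return received


if __name__ == "__main__":
    print(demo())
```

In a peer-to-peer arrangement, each room would run both roles, and the two sides would have to agree on the settings in some other way.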
Hardware

Motion Controlled Platform (MCP)
There are several ways in which the MCP could be designed. The MCP can be controlled using an X-Y axis location process with a joystick or similar device. It could also be preprogrammed to move to specific positions in the conference room, focusing on a specific area rather than a panoramic view of the entire room.

Loud Speakers
Self-amplified computer speakers could be used, or stereo speakers driven through a receiver could be implemented.

2.4.2 Technical Design

Software

Server model
The two conference rooms will each have a computer that will run the audio and video feeds. These computers can talk to each other as peers, or one can establish itself as a server to set conference properties such as encryption strength and compression type.

Compression/Encryption
The conference can be encrypted and compressed. There are numerous algorithms that accomplish these tasks. Either one algorithm for each category will be used, or several will be available for the user to choose from. The user will have the option to turn off encryption, since it will produce additional CPU strain.

Video Capture
The video from the cameras will need to be captured into a raw digital video format. Depending on the camera used, this can be done by writing an interface driver for a digital camera or by writing an interface driver for a video capture card.

Audio Capture
The audio for each conference room will be input to the computer through the sound card, so the raw audio will need to be extracted either from the operating system or from the sound card itself.

Hardware

Audio system
The microphones will capture audio signals for transmission from one location to the other. The quality of the microphones needs to be such that the remote participants can understand the conversation. The audio speakers will play the audio signals received from the remote site, amplified so that all local participants can hear the conversation.
Video System
The USB digital camera will record the visual movements of the participants. The MCP will allow the USB digital camera to be moved to various positions. The MCP needs to be easily controlled by all local participants via a remote control device.

2.4.3 Testing Description
The system will be tested in phases. Each subsystem will be tested using a white box testing model by its designer to eliminate any obvious errors and ensure that all desired functionality is present. Any errors found will then be corrected, and the subsystem will be black box tested by a third party to ensure that it will work under all circumstances. Any errors will be corrected, and the subsystem will then be available for incorporation into the complete system.

After the subsystems have been shown to be free from errors, the complete system will be assembled and white box tested, and all features will be verified for proper functionality and usability. Any visible errors will be corrected, and the entire system will be black box tested by a third party. Any remaining errors will be fixed to produce an error-free final product.

2.4.4 Risks/Risk Management

Team members might leave the group
For various reasons, team members may be forced to quit the project. To cope with this problem, we will make sure that every task can be performed by at least two team members.

Cannot get parts on time
There may be a problem obtaining parts in a timely fashion. To minimize this risk, the parts will be ordered as soon as possible from a reputable company.

2.5 Financial Budget
Table 1 contains detailed financial estimates for the project.

Table 1: Financial Budget

Item                    Original Estimated Cost
Poster                  $50
USB Digital Cameras     $150
Microphones             $20
Parts                   $30
Total Estimated Cost    $250

2.6 Personal Effort Budget
Table 2 contains detailed time estimates for the project.
Table 2: Personal Effort Budget

Task                                     Noah    Nick        Brian      Jalal   Melissa   Total Team
                                         Korba   McInerney   Marshall   Saidi   Weverka   Effort
Develop Project Plan                     5       6           5          8       0         24
Develop Project Poster                   15      13          13         13      0         54
Develop Project Design                   29      33          30         27      0         119
Implement Project Design                 61      61          65         50      0         237
Develop Final Report and Presentation    8       6           8          6       0         28
Total Estimated Effort                   110     113         113        98      0         434

2.7 Project Schedule

Figure 2: Project Schedule

3 Closure Material

3.1 Project Team Information

Team Members

Noah Korba
2724 Stange Ave. #3
Ames, IA 50010
(515) 451-6125
nkorba@iastate.edu
CpE

Nick McInerney
425 15th St.
Ames, IA 50010
(515) 232-2131
nmcinern@iastate.edu
EE

Brian Marshall
5521 Friley Nilesfoster
Ames, IA 50012
(515) 572-5090
bmarshal@iastate.edu
CpE

Jalal Saidi
3414 Orien Dr. #140
Ames, IA 50010
(515) 232-9930
saidij@iastate.edu
CpE

Melissa Weverka
1233 Frederiksen Ct.
Ames, IA 50010
(515) 572-7693
mweverka@iastate.edu
EE

3.2 Summary
Upon completion of this project, the cost-efficient video conferencing system met the needs of both the presenter and the actively listening audience. The system was not only less expensive than some of its predecessors, but also had some unique features, such as the remote user's ability to change or modify the presenter's notes, and a message window on the presenter's screen that appears when a concept is questioned by either the remote user or an audience member. The designed video conferencing system met or exceeded its goals in a timely fashion thanks to the knowledge and experience of the participating group.