ComEnTS Project
Design Team May 03-13

Client
Engineering Distance Education
Paul Jewell, Coordinator of Technology

Faculty Advisors
Dr. S.S. Venkata, Electrical and Computer Engineering Department Chair
Dr. James McCalley, Electrical and Computer Engineering Associate Professor

Team Members
Nick McInerney
Melissa Weverka
Noah Korba
Brian Marshall

Date of Submission
Wednesday, December 18th, 2002

Table of Contents

1 Introductory Materials
  1.1 ABSTRACT
  1.2 DEFINITION OF TERMS
2 Project Design
  2.1 INTRODUCTION
    2.1.1 General Background
    2.1.2 Technical Problem
    2.1.3 Operating Environment
    2.1.4 Intended Users and Uses
    2.1.5 Assumptions and Limitations
  2.2 DESIGN REQUIREMENTS
    2.2.1 Design Objectives
    2.2.2 Functional Requirements
    2.2.3 Design Constraints
    2.2.4 Measurable Milestones
  2.3 END-PRODUCT DESCRIPTION
  2.4 APPROACH AND DESIGN
    2.4.1 Technical Approaches
    2.4.2 Technical Design
    2.4.3 Testing Description
    2.4.4 Testing Form
    2.4.5 Risks/Risk Management
    2.4.6 Recommendation for Continued Work
  2.5 FINANCIAL BUDGET
  2.6 PERSONAL EFFORT BUDGET
  2.7 PROJECT SCHEDULE
3 Closure Material
  3.1 PROJECT TEAM INFORMATION
  3.2 SUMMARY

List of Figures

Figure 1: System Overview
Figure 2: Software System Outline
Figure 3: Hardware System Outline
Figure 4: Packet Structure
Figure 5: FT639 Chip
Figure 6: Estimated Project Schedule – Fall 2002
Figure 7: Estimated Project Schedule – Spring 2003

List of Tables

Table 1: Testing Form
Table 2: Financial Budget
Table 3: Personal Effort Budget

1 Introductory Materials

This section contains the abstract and the definition of terms.

1.1 Abstract

Many large organizations are incorporating telecommunications into their everyday routine. To make this technology available to smaller institutions, a low-cost multi-conference-room video communications system will be created. To accomplish this, the system will be Internet-based and provide two-way audio and video streaming. This design allows information to be presented around the world at low cost.

1.2 Definition of Terms

AC – Alternating current. The back-and-forth movement of electrons in a wire.

Asymmetric key encryption (AKE) – AKE is usually known as public/private key encryption. The most famous method, RSA, is one of the world's strongest overall encryption methods. AKE involves choosing a strength ("bit strength") and producing a set of keys, which are then owned by the key set owner. The public key is made available to anyone who wishes to send encrypted data. The data is encrypted with the public key and then transmitted to the key set owner, who decrypts it using the private key, available only to the key set owner. The only way to decrypt this data is to own (or forcefully generate) the private key. For comparison, a 56-bit RC5 key has roughly 72 quadrillion possible combinations, and cracking one by brute force took a distributed parallel-processing effort months of continuous searching.
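The public/private key flow described above can be illustrated with a deliberately tiny "textbook RSA" sketch. The primes and exponent below are illustrative toy values only – real RSA keys are hundreds of digits long – and none of this is the project's actual implementation:

```python
# Toy RSA: small illustrative numbers, far too weak for real use.
p, q = 61, 53
n = p * q                  # public modulus, part of both keys
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

message = 42
ciphertext = pow(message, e, n)    # anyone may encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)  # only the key owner can decrypt with (d, n)
assert recovered == message
```

The asymmetry is visible in the exponents: `(e, n)` can be published freely, while recovering `message` from `ciphertext` requires `d`, which in turn requires factoring `n`.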
Audio capture device – A device attached to the computer, usually plugged into the PCI bus, which converts sound patterns into raw audio data. Usually dubbed a "sound card."

Blowfish – A variable-key-length symmetric encryption cipher of good quality.

CODEC – See both compression and decompression.

Compression – The "CO" in CODEC; compression is the method of representing a piece of data in a smaller size. In the early days of computing, compression was limited to general algorithms designed to shrink the size of raw, random data. In recent years, compression has become domain-specific, from audio/video compression (such as DiVX ;) or MP3) to text-based compression. These newer compression methods use a combination of shifting-frame pattern recognition and data removal to shrink the size of data without greatly sacrificing quality. Most compression utilities can be configured to favor a smaller size over data quality and vice versa.

CPU – Central processing unit, the brains of the computer.

CRC – See cyclical redundancy check.

Cyclical redundancy check (CRC) – An algorithm that verifies that a piece of data has not been altered.

Decompression – The "DEC" in CODEC; decompression is the method of recreating raw data from a compressed piece of data. The primary job of the decompression portion is to detect the style of compression used on a file and reproduce the raw data in the best form possible. Many decompression algorithms employ interpolation and best-guess techniques to predict and reproduce data as close to the original as possible.

Decryption – The process of revealing concealed ("encrypted") data with the proper authorization.

Demultiplexing – The process of taking a multiplexed stream of data and splitting it back into its original individual parts.

DES – Data Encryption Standard. A common cipher that encrypts 64-bit blocks of data with a 56-bit key.

DiVX ;) – A fast, free implementation of the MPEG-4 video CODEC.
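The cyclical redundancy check defined above can be sketched in a few lines. This example uses Python's standard `zlib.crc32` (CRC-32) purely as an illustrative checksum, not as the specific CRC the system will use:

```python
import zlib

def crc_of(data: bytes) -> int:
    """Compute a CRC-32 checksum over a block of data."""
    return zlib.crc32(data) & 0xFFFFFFFF

original = b"compressed audio/video chunk"
checksum = crc_of(original)

# An unaltered copy produces the same checksum...
assert crc_of(b"compressed audio/video chunk") == checksum

# ...while a single changed byte is detected.
corrupted = original[:-1] + b"X"
assert crc_of(corrupted) != checksum
```

In practice the sender appends the checksum to each transmitted block, and the receiver recomputes it and compares; a mismatch means the block was corrupted in transit.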
DNS – Domain Name System. The system that maps human-readable domain names to machine IP addresses.

Encryption – The process of concealing data in an unreadable form unless proper authorization is possessed to view ("decrypt") the data.

FTP – File Transfer Protocol, a standard protocol for transferring files over the Internet.

GUI – Acronym for graphical user interface.

Hash – A one-way function that creates a unique piece of data from another piece of data. Unlike encryption, this cannot be reversed. It can be used to verify the integrity of a piece of data against the original without exposing the original to the public.

HTTP – Hypertext Transfer Protocol, a standard system for transferring files over the Internet, used in web browsing.

MCP – Acronym for motion controlled platform.

MD5 – Message digest function, implementation 5. MD5 is a one-way hash, commonly used in password protection and error detection.

Motion controlled platform (MCP) – A mounting platform that allows controlled movement in order to change the angle of a device automatically.

Multiplexing – The process of taking individual data streams and combining them into one single data stream, also called a multiplexed stream.

PCI – Peripheral Component Interconnect. PCI is the standard component interface bus on a computer's motherboard.

Raw audio data – The format of data to which standard audio generation can be applied to produce audible sounds through the user's output device. For example, compressed audio data (such as MP3) cannot be heard unless it is decompressed into raw audio data; only then can it be played on the user's audio output device.

Raw video data – The format of data to which standard graphical rendering can be applied to produce full-motion video on the user's computer screen. For example, compressed video data (such as DiVX ;) ) cannot be displayed unless it is decompressed into raw video data; only then can it be rendered on the user's video output device.

RC5 – A fast symmetric block cipher designed by Ron Rivest of RSA Security, frequently used in encryption speed and brute-force benchmarks.

RSA – A strong public/private key encryption method.

Socket – When someone says socket, they are usually referring to Berkeley sockets, a programming interface for communications over the Internet. Implementations of Berkeley sockets exist on virtually all operating systems, from UNIX (IO::Socket libraries) to Windows (WinSock). They provide a seamless way for programmers to create network applications. Older UNIX implementations offered three types – raw (directly on top of IP), TCP/IP, and UDP/IP. Most implementations now provide access to only UDP/IP and TCP/IP sockets. Sockets allow one to create both clients and servers for network communications.

SSH – Secure Shell, an encrypted method of accessing a computer remotely using textual commands.

Stream, streaming – The word "stream" is used in many different contexts. Most notably, socket communications over the Internet are called "streaming" since they move data from one point to another constantly, like a river. "Stream" can also refer to raw unprocessed data, which usually comes directly from a socket buffer.

Symmetric key encryption (SKE) – SKE involves "secret key" encryption and decryption. Unlike AKE, where everyone can know the public key, SKE keys must be held only by those involved in the data exchange; an SKE key is often a one-to-one key. The most basic SKE was invented by Julius Caesar. This cipher (known as the Caesar cipher) is based on the equation E = (N + K) mod 26, where N is the numerical representation of a letter (A=0, B=1, …, Z=25) and K is the shift. The modulus 26 ensures the result stays within the 26-letter range (A through Z).
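As a minimal sketch of the Caesar cipher described above (the function and variable names here are ours, not part of the design):

```python
def caesar(text: str, k: int) -> str:
    """Apply E = (N + K) mod 26 to each letter, with A=0 ... Z=25."""
    out = []
    for ch in text.upper():
        if ch.isalpha():
            n = ord(ch) - ord('A')            # letter -> number N
            out.append(chr((n + k) % 26 + ord('A')))
        else:
            out.append(ch)                    # leave spaces/punctuation alone
    return ''.join(out)

ciphertext = caesar("ATTACK AT DAWN", 3)      # encrypt with shift K = 3
plaintext = caesar(ciphertext, -3)            # decrypt with the inverse shift
```

Both parties must share the same K; decryption is the same operation with -K, which illustrates why a symmetric key must be kept secret by everyone who holds it.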
Synchronized multimedia stream – A synchronized multimedia stream is exactly what it implies – a stream of data in which video is matched up to its corresponding audio, and which can later be split apart by the same means. For example, if the multimedia stream were not converted to a synchronized multimedia stream, video of the speaker talking might not match up to the corresponding audio. In Hollywood, this is called "bad dubbing."

TCP – Transmission Control Protocol. Best known as half of the TCP/IP suite, TCP was established in the late 1970s to provide reliable communications over packet-switched networks. Unlike its sibling UDP, TCP performs a three-way handshake to establish a solid connection between two computers. TCP also verifies transmission success with a checksum carried in each segment, used to determine whether the packet was corrupted during its broadcast. If so, TCP has built-in retransmission as well as flow control to ensure the packet arrives safely and unharmed. If the physical connection prevents transmission of packets (a break in the line, for instance), TCP will "time out" on the connection and return an error to the user. Most application-layer protocols (such as Telnet, FTP, HTTP, and SSH) use TCP as their primary transport.

UDP – User Datagram Protocol. Sibling to TCP, UDP provides connectionless communications, with no guarantee of data delivery or ordering. UDP does not support timeouts, so it is solely the programmer's responsibility to decide when enough time has passed to call a connection "dead." UDP is most often used when the overhead of a TCP connection is too great for the application's needs – in particular for streaming audio and video, where error correction and packet retransmission are detriments to overall performance. A handful of application-layer protocols (DNS, plus older versions of RealAudio and SHOUTcast) use UDP as their transport. Many programs that need UDP-style "streaming" are now moving to the multicast standard.

USB – Acronym for Universal Serial Bus.

Video capture device – A device attached to the computer, usually plugged into a USB port or a PCI slot, which converts light patterns into raw video data.

Visual aid – Any graphic that illustrates what a speaker is attempting to communicate, for example whiteboard drawings, notes on a piece of paper, or a PowerPoint presentation.

2 Project Design

This section contains material describing the design of the product.

2.1 Introduction

2.1.1 General Background

The inter-conference-room visual communication system's objective is to create a low-cost alternative to current telecommunication technology. To keep the system low cost, some traditional features will be modified, making the system more cost-effective while maintaining a relatively high level of quality. For example, the Internet will be used as the communication medium, since it is free and already established. Also, the rooms will be limited in size to reduce the need for costly equipment. Finally, the system will be portable so it can be shared among many people in the institution.

2.1.2 Technical Problem

To make the work efficient, the project will be split into two segments – the hardware segment and the software segment.

Hardware
The hardware segment consists mainly of creating, buying, and testing the hardware that will interface with the rest of the system. It also involves creating drivers to interface those hardware devices with the software portion of the project.
Software
The software segment consists of writing a graphical user interface; implementing CODECs for audio, video, and data compression; creating an encryption and decryption engine for the system; and providing seamless integration with the ever-changing technology of the Internet.

2.1.3 Operating Environment

The software will be written for the Microsoft Windows family of operating systems. The hardware will be used indoors, but must be durable, since it is portable.

2.1.4 Intended Users and Uses

The intended users of the multimedia conference system are those who need to conduct meetings with others who are not in the immediate area, or who cannot attend a meeting in person. It is designed for both the novice and the expert computer user, providing an easy-to-understand interface for the beginner along with the ability to control virtually every aspect of the system for the expert.

2.1.5 Assumptions and Limitations

Several assumptions and limitations apply to this project. They are defined below.

Assumptions
- Participants will sit within ten feet of the camera.
- There will be only two rooms in communication with each other.
- Each room will contain a maximum of five participants.

Limitations
- The system must cost less than $300 in total.
- The camera assembly must be easily carried by one person.
- Internet bandwidth will limit the amount of data that can be transmitted.
- Available CPU processing power will limit the compression and encryption that can be done in real time. The CPU must be at least a Pentium III 500 MHz processor or equivalent.
- Compression strength will also limit the quality of video that can be transmitted.

2.2 Design Requirements

2.2.1 Design Objectives

The objectives for the project are listed below. A basic overview of the intended design can be seen in Figure 1.
Display multimedia at both participants' locations
Both conference rooms will be able to see the remote video as well as hear the remote audio.

Compressed and encrypted video and audio streams
The system will use audio and video compression techniques to condense the audio/video streams in order to conserve both participants' bandwidth. The system shall also offer an optional medium-strength encryption technique to provide data security, so that only authorized participants can view the audio/video stream.

Motion controlled platform (MCP)
A platform will be created onto which a camera can be mounted. This will allow the camera to be panned right and left and tilted up and down.

MCP software
Users shall have the ability to move the camera manually via a software interface, as well as create and use positional presets.

Figure 1: System overview

2.2.2 Functional Requirements

The following defines the functional requirements of the end product:

All participants can operate the system
The system will be designed for ease of both setup and use, so that participants with a reasonable knowledge of computers can operate it.

Participants can choose to enable encryption and/or compression
The system shall allow the participants to turn compression and encryption on or off, to increase transmission performance or to save CPU cycles.

Local camera control
The system shall give each conference room local control over its camera, to focus the camera on a specific person via presets or manual control.

Portability
The system can be easily moved and set up in different conference rooms.

2.2.3 Design Constraints

This section defines constraints considered during the design and implementation of the project.

Cost
The system must cost less than $300 while maintaining its functionality.

Local Internet bandwidth
Since video will be streamed over the Internet, there must be sufficient bandwidth to view real-time video at all client computers.

CPU processing power
The server must composite and stream real-time audio and video, as well as process multiple incoming audio streams. This will require significant CPU power.

2.2.4 Measurable Milestones

The following lists the measurable milestones of the project, along with a grading system used to determine the success or failure of each milestone. A grade of 1 is not successful, a grade of 2 is partially successful, and a grade of 3 is fully successful. An average weighted score of less than 3 is not passing.

Project scope and intended features defined (5%)
Features for the completed system are defined that would create a complete and usable product.
Grade requirements:
1…Project scope and features are not defined.
2…Project scope and features are partially defined, with some revisions needed.
3…Project scope and features are fully defined and do not need to be revised.

All modules' functionalities and interfaces designed (20%)
All modules can be implemented using the documented module descriptions and interfaces.
Grade requirements:
1…Modules' functionalities and interfaces are not designed.
2…Modules' functionalities and interfaces are partially designed, with some revisions needed.
3…Modules' functionalities and interfaces are fully designed and do not need to be revised.

All modules function properly under controlled conditions (30%)
Using test inputs, all modules pass limited tests and produce expected output for all features, even if only under controlled inputs.
Grade requirements:
1…No modules function properly under controlled conditions.
2…Some modules function properly under controlled conditions.
3…All modules function properly under controlled conditions.

All modules function properly under all conditions (20%)
Using test inputs, all modules work under all foreseeable conditions and produce expected outputs.
Grade requirements:
1…Modules do not function properly under any conditions.
2…Modules function properly under some conditions.
3…Modules function properly under all conditions.

Complete system operates under controlled conditions (15%)
Using completed modules, the final product can be assembled and produces expected outputs under controlled conditions.
Grade requirements:
1…System does not operate under any controlled conditions.
2…System operates properly under some controlled conditions.
3…System operates properly under all controlled conditions.

Complete system operates under all conditions (10%)
The completed system will operate in a usable state, with no known errors.
Grade requirements:
1…System does not operate under any conditions.
2…System operates properly under some conditions.
3…System operates properly under all conditions.

2.3 End-Product Description

The system will connect two small conference rooms with a small number of participants per room. It will broadcast both audio and video from one room to the other using the Internet. The cameras will be mounted on a motion-controlled platform (MCP), which allows them to be focused on individual speakers. The system will be easily portable from one room to another, allowing many people to utilize one system.

2.4 Approach and Design

2.4.1 Technical Approaches

This system will be composed of a software segment and a hardware segment. The following lists the different approaches that were considered during the design phase of the project. The modules that were chosen are described below in 2.4.2 – Technical Design.

Software

The software modules must be connected so they can exchange the required information with little overhead. Though some of the modules are obvious, there are several different ways that control and audio/video sync can be handled.

Approaches:
1. No time code – The audio and video will be compressed, encrypted, and sent without any time codes. The slower of the two streams will be held back from transmitting for a set time, to allow the audio and video to be approximately synced when displayed by the remote client.
2.
Let receiver sync streams – A time code will be assigned by the sender to each chunk of data before it is compressed or encrypted. This time code is sent with the chunk of data to the receiver, where the chunks are reassembled into seamless audio and video streams.
3. Sync audio and video at source – Send out synced audio and video data simultaneously, and hope there is no difference in transmission times. The video and audio should not be combined into the same packet, because the audio should be sent at a higher frame rate than the video.

Option 2 was chosen to ensure audio and video are synchronized.

Video capture

The video capture module will capture raw video from a video input device. It will deliver the captured raw video to the video compression/decompression module. It will also report its state to the control interface when queried.

Inputs:
- Digitized image information from the computer's USB port or PCI bus
Outputs:
- Raw video data
- Capture status, reported to the user interface module

Approaches:
1. No driver interface – The module captures the raw data using a customized driver written specifically for the camera being used. This ensures the data is formatted efficiently, but requires much more development time, since a custom driver would need to be written for every camera used. It would also allow more options for OS implementation.
2. Windows driver interface – The module interfaces with the drivers Windows has installed for the camera. Windows provides a common driver interface for all multimedia camera devices, so the user would be able to use any camera they wanted. The data would be in a predefined format, so it may require extra processing. This option limits the system to one operating system.

Option 2 is the favorite, since it adds compatibility while reducing development time.

Audio capture

The audio capture module will capture raw audio from an audio input device.
It will deliver the captured raw audio to the audio compression/decompression module. It will also report its state to the control interface when queried.

Inputs:
- Digitized audio information from the computer's sound card
Outputs:
- Raw audio data
- Capture status, reported to the user interface module
- A time code specific to each chunk of data

Approaches:
1. No driver interface – The module captures the raw data using a customized driver written specifically for the sound card being used. This ensures the data is formatted efficiently, but requires much more development time, since a custom driver would need to be written for every sound card used. It would allow more options for OS implementation.
2. Windows driver interface – The module interfaces with the drivers Windows has installed for the sound card. Windows provides a common driver interface for all sound cards, so the user would be able to use any sound card they wanted. The data would be in a predefined format, so it would require extra processing. This option also limits the system to one operating system.

Option 2 is the favorite, since it adds compatibility while reducing development time.

Video compression/decompression

The video compression/decompression module will shrink the overall size of the raw video data on transmission, and restore the compressed data back to raw video data on reception. It will establish the compression status with the remote host through the transmission module when requested by the control interface. It will also report its state to the control interface when queried.

Inputs:
- Raw video from the video capture module
- Control signals from the user interface
Outputs:
- Compressed data
- Control return (e.g., compression status)

Approaches:
1. No compression – The module simply passes the data through without compressing it. This requires almost no CPU resources and no extra design time.
It will, however, cause the total bandwidth needed to increase, or the quality of the video to decrease.
2. Chunked compression – A compression algorithm based on a chunked encoding system will be used. This allows easy transfer and reassembly of fixed-time chunks, and also allows higher-compression-rate algorithms that compress video after it has been created. However, the delay will be proportional to the chunk size.
3. Streamed video compression – The module compresses the data using a pre-manufactured video CODEC made for streaming data. This is not the most efficient compression method, but it induces less delay.

Options 2 and 3 are both viable choices; option 2 was chosen for its quality.

Audio compression/decompression

The audio compression/decompression module will shrink the overall size of the raw audio data on transmission, and restore the compressed data back to raw audio data on reception. It will establish the compression status with the remote host through the transmission module when requested by the control interface. It will also report its state to the control interface when queried.

Inputs:
- Raw audio from the audio capture module
- Control signals from the user interface
Outputs:
- Compressed data
- Control return (e.g., compression status)

Approaches:
1. No compression – The module simply passes the data through without compressing it. This requires almost no CPU resources and no extra design time. It will, however, cause the total bandwidth needed to increase, or the quality of the audio to decrease.
2. MP3 CODEC – The module compresses the data using the MP3 (MPEG-1 Audio Layer III) CODEC, which allows fast, efficient streaming compression and decompression of audio.
This will be the most efficient compression method, and will be fairly easy to implement due to the popularity of this CODEC. This method is mildly CPU-intensive.

Option 2 is the only viable solution for this module.

Security

The security module will secure the data being transmitted over the Internet so that other parties cannot watch the conference taking place. It will establish an encryption session with the remote host through the transmission module when requested by the control interface. It will also report its state to the control interface when queried.

Inputs:
- Video and audio data to be transmitted over the Internet
- Control signals from the user interface
Outputs:
- Encrypted data
- Control data

Approaches:
1. No security – The module simply passes the data through without encrypting it. This requires almost no CPU resources and no extra design time. It will not, however, stop a third party from watching the conference.
2. Symmetric key encryption – The module uses a symmetric key encryption method, such as DES or Blowfish, to secure the data. This is the most efficient secure method and is easier to implement than the hybrid option (see below), but the symmetric key must be exchanged securely by both users outside the system.
3. Asymmetric key encryption – An asymmetric encryption method is used to secure the data. It is as easy to implement as symmetric encryption, eliminates the problem of transferring keys securely, and is very secure. This option is very CPU-intensive, since it relies on complex mathematical operations.
4. Hybrid encryption – This combines the symmetric and asymmetric methods by using asymmetric encryption to transfer the secret key. It is CPU-intensive only during the initialization phase, when CPU usage is not as critical. This is the most complex option to implement. It is more secure than symmetric alone, but less secure than asymmetric.
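As a toy illustration of the symmetric option (approach 2): both sides hold one shared secret key, and the same operation both encrypts and decrypts. The XOR keystream below is a deliberately insecure stand-in for a real cipher such as DES or Blowfish, and every name in it is ours:

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a repeating key.
    NOT secure - a placeholder for a real cipher like DES or Blowfish."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"                         # must reach both rooms securely
chunk = b"audio/video payload"
encrypted = xor_cipher(chunk, shared_key)      # sender side
decrypted = xor_cipher(encrypted, shared_key)  # receiver applies the same key
assert decrypted == chunk and encrypted != chunk
```

Note how `shared_key` has to be delivered to both parties out of band; that key-distribution burden is exactly what the hybrid scheme of approach 4 removes by wrapping the secret key in asymmetric encryption.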
Symmetric key encryption was chosen because it is easy to implement and performs well.

Transmission

The transmission module will transmit and receive data over the Internet. It will establish a connection with the remote user, or wait for a connection, when requested by the user interface. It will report any errors to the control interface, and report its status when queried.

Inputs:
- Data from the encryption module
- Data from the remote user via the system network libraries
- Control signals from the control interface
Outputs:
- Data to the decryption module
- Data to the remote user via the system network libraries
- Networking events and errors to the control interface

Approaches:
1. TCP – Use the TCP transport layer to send data to the remote user. This is both reliable and easy to implement, but causes extra overhead and delays in the transmission of data.
2. UDP – Use the UDP transport protocol to send data to the remote user. This requires extra work to implement, but allows maximum throughput to the remote user with minimum delay. UDP allows packets to be dropped, since it would be unwise to wait for them in a real-time context.

UDP is the textbook solution for this system.

User interface

The user interface will consist of a control interface, a video rendering component, and an audio player. It will send signals to, and receive signals from, the other modules to control the system as a whole. It will allow user input to set up and control a conference session, display live video, and play live audio.

Inputs:
- Video data stream
- Audio data stream
- Status events from other modules
- User input events
Outputs:
- Control signals to modules

Approaches:
1. Text-based (command line) – The interface could receive user input via a text-based, or command-line, interface. This is easy to implement and requires very little overhead, but is not very user friendly. Also, a graphical element would still have to be created for video display.
2.
Graphical-based – A standard GUI (graphical user interface) can be created to receive user input. This is more difficult to implement and carries more overhead, but it is much more user-friendly.

Option two is the most user-friendly option, and is a clear favorite.

Hardware
The hardware system will deal with all hardware elements of the product. The following modules will be connected to each other and to the software system.

Control interface
The control interface will mediate the movements of the MCP. It will take in electrical signals from the device and send them to the MCP motors, causing them to run.

Approaches:
1. Computer interface – A computer could be used to control the MCP movements. Signals can be sent from the computer via USB or RS-232 cabling to a controller chip. The controller chip would then send the signals to the motors, causing them to run.
2. Joystick interface – A joystick could be directly connected to the MCP motors to control their movements without a computer.

Option 1 will provide much more control and precision over camera movement.

Mechanics
The MCP will have the capability to pan (left and right movement) a maximum of 120 degrees and to tilt (up and down movement) a maximum of 60 degrees. There are several ways in which the movement of the platform can be controlled.

Approaches:
1. Automatic positioning – The MCP can be programmed via software to have preset positions to which the camera will move. These positions can be defined prior to a teleconference call and would limit the required user input during a call. The camera could be cued to move via a signal sent by the computer user interface.
2. Manual positioning – The MCP moves only via the signals sent by the computer user interface or an external joystick. The user would have to move the camera to the desired position during a teleconference call.

Using both option 1 and option 2 will give the user greater flexibility when using the MCP.
Circuitry – Motors
Two motors are required for the MCP movements: one motor for pan movements and the other for tilt movements. There are several kinds of motors that can be used to move the platform.

Approaches:
1. Servomotor – Servomotors are generally small, powerful for their size, and easy to control. They are used extensively in robotics and in radio-controlled cars, airplanes, and boats. The servomotor has good accuracy and is relatively inexpensive.
2. Stepper motor – Stepper motors provide precise positioning and ease of use, especially in low-acceleration or static-load applications. The stepper motor is very accurate but expensive.
3. DC motor – DC motors are most commonly used in variable speed and torque applications. The DC motor has minimal accuracy and is inexpensive.

A servomotor is the best choice, since it provides the minimum accuracy required for the project while staying within budget constraints.

Circuitry – Power
The MCP motors and circuitry need an external power source to operate.

Approaches:
1. Battery
2. AC adapter

An AC adapter would provide a stable power source.

2.4.2 Technical Design

General summary
The system will be composed of hardware and software modules that work together to create the end product. These modules must be connected so they can exchange the required information with little overhead. Though some of the modules are obvious, there are several different ways that control and audio/video sync can be handled.

General approach
A time-coded system was chosen because it will result in the best quality audio and video. Since the receiver will sync the video using a time code assigned before the data was split up, it will be aligned. The time code attached to each piece of data also allows the receiver to easily throw out data that has been delayed in transport.
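The receiver side of this time-coded approach can be sketched as a small reordering buffer. The sketch below is illustrative only (the class name, time-code units, and drop rule are assumptions, not part of the design): chunks are delivered in time-code order, and anything stamped earlier than the current playback position is discarded rather than waited for.

```python
import heapq

class TimecodeBuffer:
    """Orders incoming media chunks by time code; discards late arrivals."""

    def __init__(self):
        self._heap = []          # min-heap keyed on time code
        self.playback_time = 0   # time code of the last chunk played

    def push(self, timecode, chunk):
        # A chunk older than the current playback position arrived too
        # late to be useful in a real-time conference, so drop it.
        if timecode <= self.playback_time:
            return False
        heapq.heappush(self._heap, (timecode, chunk))
        return True

    def pop(self):
        # Deliver the earliest pending chunk and advance playback time.
        if not self._heap:
            return None
        timecode, chunk = heapq.heappop(self._heap)
        self.playback_time = timecode
        return chunk

buf = TimecodeBuffer()
buf.push(2, b"video-2")
buf.push(1, b"video-1")
first = buf.pop()            # chunks come out in time-code order
late = buf.push(1, b"late")  # time code 1 already played -> dropped
```

Because the time code is assigned before the audio and video streams are split, the same buffer logic aligns both streams at the receiver.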
General description
The video conferencing system will be composed of several independent hardware components linked together through a personal computer, as depicted in Figure 1. Two personal computers will communicate with each other via the Internet. Each computer will acquire raw video using the video capture module, and audio will be obtained by the audio capture module. The camera will sit atop a motion control platform (MCP) that will be controlled by the computer. The MCP will interface with the computer over RS-232 and will accept two angular direction vectors supplied by the control software. The control software will be divided into self-contained modules, which will be developed and tested independently of each other. These modules are described in detail later in this section.

The flow of data will be as follows: Audio and video are captured from the audio and video hardware as data streams in the capture modules. They are broken into chunks and assigned time codes to sync them at the receiving computer. The audio chunks and video chunks are compressed individually and sent to the encryption module. After being encrypted, they are sent to the transfer module to be transmitted to the remote computer. A single packet structure, depicted in Figure 4, will carry both audio and video.

Figure 2 : Software System Outline
Figure 3 : Hardware System Outline
Figure 4 : Packet Structure

Software

Video capture
Summary
The video capture module will attempt to capture raw video from a video input device. It will deliver the captured raw video to the video compression/decompression module. It will also report back its state to the control interface when queried.

Approach:
The Windows driver interface was chosen since it will facilitate rapid development of this module and will not limit this project to specific equipment.
Inputs:
  Digitized image information from the computer's USB port or PCI bus
Outputs:
  Raw video data – To the video compression module
  Data-specific time code – To the video compression module
  Capture status – To the user interface module

Description
A standard driver will be used, so no design will be needed.

Audio capture
Summary
The audio capture module will attempt to capture raw audio from an audio input device. It will deliver the captured raw audio to the audio compression/decompression module. It will also report back its state to the control interface when queried.

Approach:
The Windows driver interface was chosen since it will facilitate rapid development of this module and will not limit the project to specific equipment.

Inputs:
  Digitized audio information – From the computer's sound card
Outputs:
  Raw audio data – To the audio compression module
  Data-specific time code – To the audio compression module
  Capture status – To the user interface module

Description
A standard driver will be used, so no design will be needed.

Video compression/decompression
Summary
The video compression/decompression module will attempt to shrink the overall size of the raw video data on transmission, and restore the compressed data back to raw video data on reception. It will establish the compression status with the remote host through the transmission module when requested by the control interface. It will also report back its state to the control interface when queried.

Approach:
Chunked compression was chosen since it can produce a higher quality image over a fixed-bandwidth transmission medium. Since video quality is a goal of the project, this option was the best available.
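The chunk-and-stamp step that precedes compression can be sketched as below. This is a sketch only: the fixed chunk size and the one-tick-per-chunk time code are placeholder assumptions (the real design would derive both from the frame rate and chunk-size setting supplied by the user interface module).

```python
def chunk_stream(raw, chunk_size, start_timecode=0, tick=1):
    """Split a raw media stream into chunks, each stamped with a time code.

    'tick' is the time-code increment per chunk, a hypothetical unit.
    """
    chunks = []
    timecode = start_timecode
    for offset in range(0, len(raw), chunk_size):
        chunks.append((timecode, raw[offset:offset + chunk_size]))
        timecode += tick
    return chunks

# 10 bytes of "raw video" in 4-byte chunks -> 3 chunks, time codes 0, 1, 2
chunks = chunk_stream(b"0123456789", 4)
```

Each (time code, chunk) pair would then be compressed and passed downstream together, so the receiver can realign the streams.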
Inputs:
  Both:
    Color bit depth – From the user interface module
    Resolution – From the user interface module
    Frame rate – From the user interface module
    Video chunk size – From the user interface module
    Video compression control – From the user interface module
  Compression:
    Raw video – From the video capture module
    Time code – From the video capture module
  Decompression:
    Compressed video – From the encryption/decryption module
    Time code – From the encryption/decryption module
Outputs:
  Compression:
    Compressed video – To the encryption module
    Time code – To the encryption module
  Decompression:
    Raw video – To the user interface module
    Time code – To the user interface module
  Both:
    Compression status – To the user interface module

Description
The compression will split the raw video into chunks, and each chunk will be assigned a time code. These chunks of data will be compressed and output with their corresponding time codes. The module will be initialized from the user interface module with the following compression settings:
  Color bit depth
  Resolution
  Frame rate
  Video chunk size
  Video compression control
If the "video compression control" setting is disabled, the module will simply pass the raw video from the video capture module to the encryption/decryption module with no alteration.

Audio compression/decompression
Summary
The audio compression/decompression module will attempt to shrink the overall size of the raw audio data on transmission, and restore the compressed data back to raw audio data on reception. It will establish the compression status with the remote host through the transmission module when requested by the control interface. It will also report back its state to the control interface when queried.

Approach:
The MP3 (MPEG-1 Audio Layer 3) CODEC was chosen because it is easy to implement using a pre-made CODEC. It is also known to be very efficient in both speed and quality.
Inputs:
  Both:
    Audio depth – From the user interface module
    Frequency range – From the user interface module
    Audio compression control – From the user interface module
  Compression:
    Raw audio – From the audio capture module
    Time code – From the audio capture module
  Decompression:
    Compressed audio – From the encryption/decryption module
    Time code – From the encryption/decryption module
Outputs:
  Compression:
    Compressed audio – To the encryption module
    Time code – To the encryption module
  Decompression:
    Raw audio – To the user interface module
    Time code – To the user interface module
  Both:
    Compression status – To the user interface module

Description
The compression will split the raw audio into chunks, and each chunk will be assigned a time code. These chunks of data will be compressed and output with their corresponding time codes. The module will be initialized from the user interface module with the following compression settings:
  Frequency range
  Audio depth
  Audio compression control
If the "audio compression control" setting is disabled, the module will simply pass the raw audio from the audio capture module to the encryption/decryption module with no alteration.

Security
Summary
The security module will secure the data being transmitted over the Internet so that other parties cannot see the conference taking place. It will establish a virtual encryption session with the remote host through the transmission module when requested by the control interface. It will also report back its state to the control interface when queried. The security module is broken into the following sub-modules: encrypt, decrypt, initialization, and notify.

Approach
Symmetric key encryption was chosen because it is easy to implement and performs well. A simple plain-text password will be shared between the participating parties. A hash generated from this password will be used as the key to encrypt the audio and video data.
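The password-to-key step can be sketched as follows. The hash function (MD5) comes from the design; the repeating-key XOR is only a placeholder standing in for a real symmetric cipher such as DES or Blowfish, so the key-handling flow can be exercised without a cryptography library.

```python
import hashlib

def derive_key(password: str) -> bytes:
    """Expand a short alphanumeric password into a fixed-width key.

    MD5's 16-byte digest serves as the symmetric key shared by both
    ends of the conference (both users enter the same password).
    """
    return hashlib.md5(password.encode("utf-8")).digest()

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # Placeholder cipher: repeating-key XOR stands in for DES/Blowfish.
    # XOR is symmetric, so the same call encrypts and decrypts.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = derive_key("conference42")
ciphertext = xor_cipher(key, b"raw audio chunk")
plaintext = xor_cipher(key, ciphertext)   # round trip recovers the data
```

Because both hosts derive the key from the same password, no key material ever travels over the network, which is the property that makes the symmetric option workable here.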
Inputs:
  Encrypt:
    Data chunk to be encrypted
    Data not to be encrypted (such as the time code)
  Decrypt:
    Encrypted data chunk to be decrypted
    Unencrypted data (such as the time code and data type)
  Initialization:
    Encryption type (such as off, DES, ...)
    Encryption strength
    User-entered password
Outputs:
  Encrypt:
    Encrypted data chunk
    Unaltered data chunk
    Data type
  Decrypt:
    Decrypted data
    Unaltered data
  Initialization:
    Result (such as successful, bad password, encryption error, ...)
  Notify:
    Error decrypting

Description
General
Data will be encrypted using the algorithm specified by the initialization request, with a key generated from the password. This key will be a hash of the password, such as MD5, which creates a large-bit-width key from a short alphanumeric password.

Initialize
This sub-module will set up a virtual session with the remote server. The central control module will initiate it immediately after a connection has been established. Using the send sub-module of the transmission module as a transport, the initialization sub-module will send random data that has been encrypted. The remote initialization module will decrypt the data and perform a transform on it. It will then encrypt the transformed data and send it back to the local initialization sub-module. If the returned data matches the transform of the original data, the encryption setup is successful. The result of the initialization is returned to the calling module.

Encrypt/decrypt
The data to be encrypted will be ciphered using this key and sent to the transmission module for transmission. Encrypted data from the transmission module will be decrypted using the same key and sent to the appropriate decompression module.

Transmission
Summary
The transmission module will transmit and receive data over the Internet. It will establish a connection with the remote user, or wait for a connection, when requested by the user interface. It will report any errors to the control interface and report its status when queried.
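The security module's initialization exchange described above (encrypt random data, have the remote side decrypt, transform, re-encrypt, and return it) can be sketched end-to-end. This is a sketch under stated assumptions: the XOR cipher is a stand-in for the real symmetric algorithm, and byte reversal is an arbitrary choice for the agreed-upon transform, which the design does not specify.

```python
import hashlib, os

def derive_key(password):
    # MD5 expands the shared password into a 16-byte key (as in the design)
    return hashlib.md5(password.encode()).digest()

def cipher(key, data):
    # repeating-key XOR: symmetric stand-in, so one call encrypts/decrypts
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def transform(data):
    # the transform both hosts agree to apply to the challenge;
    # reversing the bytes is an illustrative placeholder
    return data[::-1]

def handshake(local_password, remote_password):
    local_key = derive_key(local_password)
    remote_key = derive_key(remote_password)

    challenge = os.urandom(16)             # random data, as specified
    sent = cipher(local_key, challenge)    # local -> remote
    # remote side: decrypt, transform, re-encrypt, send back
    reply = cipher(remote_key, transform(cipher(remote_key, sent)))
    # local side: decrypt the reply, compare with the transformed original
    return cipher(local_key, reply) == transform(challenge)

same = handshake("secret", "secret")   # matching passwords succeed
diff = handshake("secret", "wrong")    # mismatched passwords fail
```

The comparison succeeds only when both hosts derived the same key, i.e. entered the same password, which is exactly the "successful / bad password" result the initialization sub-module reports.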
The transmission module can be broken up into the following sub-modules: connect, serve, send, and receive.

Approach
UDP will be used because it can be optimized for speed and real-time traffic. TCP has too much overhead and creates a buffered stream that can cause massive delays in video and audio playback.

Inputs:
  Connect:
    Host address from the control module
    Host port from the control module
  Serve:
    Local port from the control module
  Send:
    Encrypted data from the encryption module
    Time code from the encryption module
    Data type from the encryption module
  Receive:
    Data packets from the remote transmission module
Outputs:
  Connect:
    Result of the connection process
  Serve:
    Result of the serve process
  Send:
    Result of the send procedure
  Receive:
    Data type
    Time code
    Data for decryption
    Packet errors sent to the control module

Description
Connect/Serve
The transmission module will establish a connection with the remote client. One client will be the initiating client, and the other will be the waiting server. The server will first bind to a port and wait for a connection from the initiating client. The initiating client will then connect to the waiting server using the connect sub-module. A handshake will affirm that a connection can be established. At this point a success message will be returned to the control module, which will tell the encryption module to set up an encrypted session.

Receive
The receive sub-module will act on packets received from the remote client. If a packet arrives late, i.e. after its time code has been played, the packet is immediately dropped. The percentage of dropped or missing packets is calculated, and a value is sent back to the remote client to establish the rate at which packets can be sent over the link.

Send
The send sub-module will send a packet to the remote client, only after a connection has been established, using a standard packet structure over UDP.
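One hypothetical layout for this standard packet is sketched below. The design names the fields it must carry (data type, sequence number, time code, and the reported loss percentage); the field sizes and their order here are illustrative guesses, not the layout of Figure 4.

```python
import struct

# Assumed header layout: type(1) seq(4) timecode(4) loss%(1) payload len(2),
# in network byte order ('!'), followed by the encrypted payload.
HEADER = "!BIIBH"
HEADER_SIZE = struct.calcsize(HEADER)   # 12 bytes

AUDIO, VIDEO = 0, 1

def pack_packet(data_type, seq, timecode, loss_percent, payload):
    header = struct.pack(HEADER, data_type, seq, timecode,
                         loss_percent, len(payload))
    return header + payload

def unpack_packet(packet):
    data_type, seq, timecode, loss, length = struct.unpack(
        HEADER, packet[:HEADER_SIZE])
    return data_type, seq, timecode, loss, packet[HEADER_SIZE:HEADER_SIZE + length]

pkt = pack_packet(VIDEO, 42, 1000, 3, b"encrypted-chunk")
fields = unpack_packet(pkt)
```

A single self-describing structure like this is what lets one UDP channel carry both audio and video: the receiver reads the type field and routes the payload to the matching decompression module.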
Along with the data, time code, and sequence number, a value representing the percentage of dropped or missing packets is sent to the receiver.

Flow Control
Since this is real-time data, no resending of data will occur. Instead, when a high percentage of late or missing packets is encountered, the sending rate will be decreased. This task is performed in the send sub-module, which will drop packets at a percentage calculated from the rate of loss that the remote client reports.

User interface/control
Summary
The main purpose of the user interface is to gather user input and invoke the services of the other modules to operate the system in a user-friendly manner. It will consist of a control component, a video rendering component, and an audio player component. The control component will send signals to and receive signals from the other modules to operate the system as a whole. It will allow user input to set up and control a conference session, display live video, and play live audio.

Approach
A graphical user interface will be constructed to gather user input for the session state. It will also display video and play audio. This will give the system an easy-to-use, unified appearance.
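Before detailing the user interface, the transmission module's flow-control rule above (never resend; back off when the remote client reports loss) can be sketched as a rate-adjustment function. All thresholds, step factors, and rate bounds here are illustrative assumptions, not values from the design.

```python
def adjust_send_rate(current_rate, reported_loss,
                     floor=16_000, ceiling=512_000):
    """Throttle the send rate (bytes/s) from the loss fraction (0.0-1.0)
    the remote client reports. Real-time data is never resent; the sender
    backs off under loss and probes upward on a clean link."""
    if reported_loss > 0.10:
        new_rate = current_rate * 0.5    # heavy loss: back off sharply
    elif reported_loss > 0.02:
        new_rate = current_rate * 0.9    # mild loss: ease off
    else:
        new_rate = current_rate * 1.05   # clean link: probe for bandwidth
    return int(min(max(new_rate, floor), ceiling))

rate = adjust_send_rate(256_000, 0.25)   # heavy loss halves the rate
rate = adjust_send_rate(rate, 0.0)       # clean link creeps back up
```

The send sub-module would then drop outgoing packets as needed to stay under the computed rate, trading resolution for timeliness instead of buffering stale data the way TCP would.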
Inputs:
  System control:
    Signals from other modules
    User initiates connection:
      Remote host address
      Remote port number
      Password
      Quality level
    User waits for connection:
      Local port
      Password
      Encryption method (on or off)
      Compression method (on or off)
      Quality level
    Operation:
      Terminate connection
  Video rendering:
    Raw video data from the decompression module
    Settings from the control module
  Audio player:
    Audio data from the decompression module
    Settings from the control module
  Camera control:
    Manual step movement
    Set preset
    Move to preset
Outputs:
  System control:
    Control signals to other modules
  Video rendering:
    Video data to the window manager of the OS for display on screen
  Audio player:
    Audio data to the sound manager of the OS for output through the speakers
  Camera control:
    Pan angle
    Tilt angle

Description
A GUI with several windows and menus will gather all the specified inputs. These will be translated into control signals that operate the system. The quality level will let the user choose a preset of settings to tailor the connection to their available bandwidth.

Connection
When the user wants to wait for a connection from a remote host, they will enter the required inputs into a form. These will be used to perform the following actions:
1. The wait interface of the transmission module will be invoked to wait for a connection.
2. Wait for a response.
When the user wants to connect to a waiting client, the control module will perform the following actions:
1. The connect interface of the transmission module will be invoked to connect to the remote computer.
2. Wait for a response.
Then the following steps will be performed:
1. Invoke the initialization interface of the security module with the supplied password to start an encrypted session according to the user's inputs.
2. Invoke the initialization interface of the compression module to set up compression according to the user's inputs.
3. Tell the video capture and audio capture modules to begin capturing data.
4.
Display video on screen as received from the remote computer using the video rendering component.
5. Play audio as received from the remote computer using the audio player component.

Camera Control
By selecting a user interface element, the user will be able to move the local camera. The event will trigger the camera control module to send gradually changing angular directions to the MCP module. Through another interface element, the user will be able to save the current camera direction to a preset. When this preset is accessed, the stored directions will be sent to the MCP module, moving the camera back to that location.

Hardware

Control interface
Summary
The control interface will operate the movements of the MCP. It will take in electrical signals from the device and send them to the MCP motors, causing them to run.

Approach
A computer interface was chosen because of its ease of use and its ability to expand the MCP's features.

Inputs:
  User commands
Outputs:
  Computer-generated signals sent via RS-232 cabling to the FerretTronics FT639 controller chip

Description
The computer will be the control interface for the MCP. As shown in Figure 5, the computer will contain software that sends signals to a controller chip via RS-232. The controller chip will then send signals to the motors, causing them to run. The FerretTronics FT639 controller chip will be used to interface the computer and the motors.

Figure 5 : FT639 Chip

Mechanics
Summary
The mechanics of the MCP will be controlled by software. The MCP will have the capability to pan a maximum of 120 degrees and tilt a maximum of 60 degrees.

Approach
The MCP will be programmed to use both automatic positioning and manual positioning. This will give the user greater flexibility when using the MCP.

Description
The MCP will be programmed via software to allow the selection of automatic positioning, manual positioning, or both.
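The camera-control behavior described above, gradually changing angular directions, saved presets, and software-enforced pan/tilt limits, might be sketched as follows. Assumptions are marked in the code: the 5-degree step size is arbitrary, the 120/60-degree ranges are assumed to be centered on zero, and `send_angles` merely records what would travel over RS-232 to the FT639.

```python
PAN_LIMIT, TILT_LIMIT = 60, 30   # assumes +/-60 pan and +/-30 tilt,
                                 # i.e. the 120/60 degree ranges centered

def clamp(value, limit):
    return max(-limit, min(limit, value))

def step_toward(current, target, step=5):
    # Advance at most 'step' degrees per update so the camera glides.
    if abs(target - current) <= step:
        return target
    return current + step if target > current else current - step

class CameraControl:
    """Sketch of gradual movement, presets, and pan/tilt limits."""

    def __init__(self):
        self.pan, self.tilt = 0, 0
        self.presets = {}
        self.sent = []   # angle pairs that would be sent to the MCP

    def send_angles(self, pan, tilt):
        self.sent.append((pan, tilt))
        self.pan, self.tilt = pan, tilt

    def move_to(self, pan, tilt):
        # Clamp first so no user input can drive past the mechanical range.
        pan, tilt = clamp(pan, PAN_LIMIT), clamp(tilt, TILT_LIMIT)
        while (self.pan, self.tilt) != (pan, tilt):
            self.send_angles(step_toward(self.pan, pan),
                             step_toward(self.tilt, tilt))

    def save_preset(self, name):
        self.presets[name] = (self.pan, self.tilt)

    def recall_preset(self, name):
        self.move_to(*self.presets[name])

cam = CameraControl()
cam.move_to(200, 10)          # pan request beyond the limit clamps to 60
cam.save_preset("speaker")
cam.move_to(-15, 0)
cam.recall_preset("speaker")  # glides back to the stored direction
```

Clamping in software is also what the "pan and tilt extremes" test in the test plan checks: the platform must never exceed its range regardless of user input.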
For automatic positioning, the MCP can be programmed via software to have preset positions to which the camera will move. These positions will be defined prior to a teleconference call, and the camera will be cued to move via a signal sent by the computer user interface. For manual positioning, the MCP will move via the signals sent by the computer user interface, and the user will move the camera to the desired position during a teleconference call.

Circuitry – Motors
Summary
Two motors are required for the MCP to have pan (left and right movement) and tilt (up and down movement) capabilities: one for pan movement and one for tilt movement.

Approach
A servomotor was chosen to control the movements of the MCP. A servomotor has the minimum accuracy required for the project while staying within budget constraints.

Inputs:
  DC power signals
  Control signals sent from the FT639
Outputs:
  Motion of the MCP

Description
The servomotors will be connected to the FT639 controller chip, as shown in Figure 5, and will allow the MCP to move.

Circuitry – Power
Summary
The MCP needs external power for the controller chip and motors in order to operate.

Approach
An AC adapter was chosen as the power source for the MCP because it is portable and a more reliable power source than a battery.

Inputs:
  AC power signals
Outputs:
  DC power signals

Description
DC power is required for the MCP circuitry. The AC adapter will allow the MCP to be plugged into a standard wall outlet while obtaining DC power.

2.4.3 Testing Description
The system will be tested in phases. Each module will be tested using a white-box testing model by its designer to eliminate any obvious errors and to ensure that all desired functionality is present. Any errors found will be corrected, and the module will then be black-box tested by a third party to ensure that it will work under all circumstances. Any remaining errors will be corrected, and the module will then be available for incorporation into the complete system.
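A module-level test of this kind can be automated. The sketch below exercises a compression round trip, with zlib standing in for the real audio/video CODECs (an assumption for illustration): the sizes are recorded at all three phases, the original must be recoverable, and the compressed form must be smaller.

```python
import zlib

def compression_round_trip(stream: bytes):
    """Run one compress/decompress cycle and record sizes at each phase."""
    compressed = zlib.compress(stream)
    recovered = zlib.decompress(compressed)
    sizes = {"original": len(stream),
             "compressed": len(compressed),
             "recovered": len(recovered)}
    passed = recovered == stream and sizes["compressed"] < sizes["original"]
    return passed, sizes

# Repetitive data, like raw video frames, compresses well.
passed, sizes = compression_round_trip(b"frame-data " * 500)
```

Each such check maps directly onto the pass/fail acceptance criteria in the black-box test list that follows, and its result can be recorded on the testing form of Table 1.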
After the modules have been shown to be free from errors, the complete system will be assembled and white-box tested, and all features will be verified for proper functionality and usability. Any visible errors will be corrected, and the entire system will be black-box tested by a third party. Any remaining errors will be fixed to produce an error-free final product. The following black-box tests will be performed:

Connect to remote host without encryption
Description
Attempts will be made to connect to several remote hosts. Of these hosts, one will not exist and one will not be accepting connections.
Acceptance
The connection will be valid when both programs report a valid connection. Any situation that fails should report the appropriate error.

Connect to remote host with encryption
Description
An encrypted session will be attempted with multiple hosts. Of these hosts, one will not support encryption, one will not exist, and one will support encryption.
Acceptance
This test will pass when both hosts report an encrypted connection. Any situation that fails shall report the appropriate error.

Compress and decompress audio and video
Description
Test audio and video streams will be compressed and decompressed. The data sizes will be recorded in all three phases.
Acceptance
This test will pass if the original audio and video can be recovered from the compressed data, and the compressed data is smaller than either the original stream or the recovered data. It will fail with any other outcome.

Transmit compressed audio and video
Description
The compressed data will be sent across both an encrypted and an unencrypted connection. The recovered audio and video shall be played by both hosts.
Acceptance
This test will pass when the audio and video data are recovered by both hosts. It will fail if the data is not received or any other anomaly appears.

Display video and play audio
Description
Video and audio as captured shall be played on the local computer.
Acceptance
The test will pass if the audio and video are played in a recognizable format. It will fail if any errors appear or if the video or audio cannot be understood.

Manually move local camera
Description
Test whether the camera can be moved manually.
Acceptance
Try moving the camera through the user interface. If the camera responds and moves in the desired direction, the test passes. Otherwise, the test fails.

Set/recall preset (camera)
Description
Test whether a preset camera position can correctly be set and recalled.
Acceptance
The system will first be tested to see whether it can remember a preset camera position. Then, to verify, the camera will be moved manually to another position and the preset recalled. The test passes if the camera moves to the position that was initially set. Otherwise, the test fails.

Terminate connection
Description
Test whether the connection is truly closed after the software closes it.
Acceptance
An attempt is made to send data through a closed connection. If data is still received, the test fails. If data is not received, the test passes.

Camera movement
Description
To test the camera's movement, signals will be generated to make sure the camera moves.
Acceptance
The camera will pass the movement test if, given a signal, it moves to the position specified by the operator. It will fail if the camera moves in the opposite direction or does not move at all.

Camera pan and tilt extremes
Description
This test will determine whether the camera will pan a maximum of 120 degrees and tilt a maximum of 60 degrees. The pan and tilt angles will be measured with a protractor.
Acceptance
The MCP will pass the pan and tilt extremes test if the camera does not pan more than 120 degrees or tilt more than 60 degrees, regardless of user input. The test will fail if the camera pans more than 120 degrees or tilts more than 60 degrees.
Camera motion precision
Description
This test will determine the camera motion precision of the automatic positioning option, regardless of the camera's starting position.
Acceptance
The MCP will pass this test if the camera moves to the specified preset position within a five-degree margin of error in the pan or tilt measurements. The test will fail if the camera pans or tilts outside the margin of error.

2.4.4 Testing Form
Table 1 contains the testing form, which will be filled out for each test performed.

Table 1 : Testing Form
  Test performed:
  Date/time performed:
  Expected results:
  Actual results:
  Tested by:
  Test status (pass/fail):
  Comments:

2.4.5 Risks/risk management

Team members might leave the group
For various reasons, team members may be forced to quit the project. To cope with this risk, each task will be performable by at least two team members.

Component availability
There may be a problem obtaining parts in a timely fashion. To minimize this risk, the parts will be ordered as soon as possible from a reputable company.

Component reliability
Components may fail. To minimize this risk, only high-quality parts from reputable companies will be bought.

2.4.6 Recommendation for continued work
It is recommended that this project be completed as designed. Complications in the encryption or compression modules may warrant modifications to the system as designed.

2.5 Financial Budget
Table 2 contains detailed financial estimates for the project.

Table 2 : Financial Budget

  Item                   Original Estimated Cost   Revised Estimated Cost
  Poster                 $50                       $50
  Cameras                $150                      $150
  Microphones            $20                       $20
  Motion Control Parts   $30                       $80
  Total Estimated Cost   $250                      $300

2.6 Personal Effort Budget
Table 3 contains detailed time estimates for the project.
Table 3 : Personal Effort Budget (hours)

  Task                            Noah     Nick        Brian     Melissa   Total Team
                                  Korba    McInerney   Marshall  Weverka   Effort
  Project Definition                6        8           8         7          29
  Technology Selection              7        7           5         7          26
  Product Design                   14        6           8         6          34
  Product Implementation           33       25          15        25          98
  Product Testing                  16       28          22        26          92
  Post-Testing Product Revision    11       10          12         9          42
  Product Documentation             2        4           3         4          13
  Product Demonstration            12       12          12        12          48
  Project Documentation            31       35          33        37         136
  Total Estimated Effort          132      135         118       133         518

2.7 Project Schedule
A general schedule mapping out the timetable for this project can be seen in Figure 6 (fall semester) and Figure 7 (spring semester). Vacation time has been marked off in gray to better reflect the true timetable. It is assumed the group will be on vacation during the time not shown between the two figures.

Figure 6: Estimated Project Schedule – Fall 2002
Figure 7: Estimated Project Schedule – Spring 2003

3 Closure Material
This section contains contact information for the client, the advisors, and the team members. Following the contact information is a summary of the project.

3.1 Project Team Information

Client
Engineering Distance Education
Paul Jewell, Coordinator of Technology
2273 Howe Hall 1364
Ames, IA 50011-2273
(515) 294-1827

Advisors
Dr. S.S. Venkata
2211 Coover
Ames, IA 50011-3060
(515) 294-3459
venkata@iastate.edu
Electrical and Computer Engineering Department Chair

Dr. James McCalley
1113 Coover
Ames, IA 50011-3060
(515) 294-4844
jdm@iastate.edu
Electrical and Computer Engineering Associate Professor

Team Members
Nick McInerney
4709 Steinbeck St. #3
Ames, IA 50014
(515) 268-9335
nmcinern@iastate.edu
Electrical Engineering Undergraduate Student

Melissa Weverka
1233 Frederiksen Ct.
Ames, IA 50010
(515) 572-7693
mweverka@iastate.edu
Electrical Engineering Undergraduate Student

Noah Korba
2724 Stange Ave.
#3
Ames, IA 50010
(515) 451-6125
nkorba@iastate.edu
Computer Engineering Undergraduate Student

Brian Marshall
5521 Friley Nilesfoster
Ames, IA 50012
(515) 572-5090
bmarshal@iastate.edu
Computer Engineering Undergraduate Student

3.2 Summary
Development of the inter-conference-room video communications system will provide its users with the ability to conduct a meeting between two separate rooms. By using the Internet as the communications link, the system can be used by anyone with Internet access, which also keeps it cost-efficient. The team's knowledge, experience, and hard work will ensure the success of this project.