CrystalMobile: multimedia framework for mobile platforms
www.crystalmobile.com
Kim A. Bondarenko
Crystal Reality LLC, CSA
kim@crystalreality.com
Computer Technology Department, Saint-Petersburg State University of Information Technology, Mechanics and Optics, Russia

Introduction
• The range of applications of video technologies on mobile platforms is very wide:
 - entertainment and adult industry
 - information
 - communications
 - collecting of private videos
 - etc.

Problems
• Mobile platforms do not have enough memory for full-quality video, and GPRS channels do not provide the required bandwidth.
• CPUs are too slow for complex algorithms.

Why compress video?
• Uncompressed video occupies huge amounts of data. For example, storing one minute of video at 320x240 resolution and 15 frames per second requires 320 x 240 x 3 x 15 x 60 = 200 MBytes of data.
• Powerful CPU platforms allowed the implementation of complex algorithms that achieve compression ratios of 100-500.

Common ideas of video compression
• Removing superfluous (redundant) information.
• Neglecting details that are insignificant for the human eye.

Superfluous details that are exploited in CM Video
• Low spatial correlation of the picture: in any image, neighboring pixels differ little from each other.
• Low temporal correlation of the picture: in any video clip there are many consecutive frames with little difference between them.

Low spatial correlation
(images: normal picture vs. half-resolution picture)

Low temporal correlation
(images: normal picture vs. difference between two consecutive frames)

Details that are insignificant for the human eye
• The human eye barely analyzes color, but differentiates brightness of the image very well.
• Insignificant noisy movements of image parts are mostly invisible to the human eye.
• Disturbances in noisy parts of the image are weakly distinguished.

Chroma vs color
(images: 1/8 of color vs. 1/8 of chroma)

Insignificant movements
(images: normal picture vs. picture with random distortion)

Disturbances on noise
(images: normal picture vs. disturbances on noise, shown in red)

YUV colorspace
• During the translation into YUV color space, every pixel (an RGB vector) is multiplied by a 3x3 matrix. As a result, the Y (brightness) channel and the two color components U and V are obtained.
• CME performs these transformations with integers, which gives a significant performance improvement on most platforms.
• The resolution of the U and V channels is then halved along each axis. As the human eye does not differentiate colors as well as brightness, it is possible to reduce the detail of the color planes twice.

Low temporal correlation of the video
• B, I and P frames.
• Detection of context changes and division of the video stream into blocks ("sandwiches"). Encoding is done on sandwiches of 1-3 frames.

B, I and P frames
(diagram)

Frame processing. Macroblocks.
• Every frame in a sandwich is divided into 16x16 blocks (macroblocks).
• Every macroblock consists of four 8x8 brightness blocks and two 8x8 color blocks, since the color resolution of the frame is halved.
• Most processing methods operate on macroblocks.

Motion compensation
• Motion compensation predicts the motion in the picture. It exploits the low temporal correlation between consecutive frames.
• CME Video provides motion compensation of 16x16 blocks with bi-linear interpolation (a small motion-search sketch follows the I-frames slide below).

Motion compensation
(images: frame from the stream, motion vectors)

I-frames
• The encoder analyzes the whole frame to find images that cannot be predicted from history. These frames are stored as independent data blocks: I-frames.
• Positioning (seeking) is precise only to I-frames, so there should be at least one I-frame per 30 seconds.
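To make the motion-compensation step more concrete, below is a minimal C sketch of a full-search 16x16 block-matching routine. It is an illustration only, not the CME implementation: the plane layout, the names sad16x16 and motion_search, and the search range are assumptions, and the bi-linear (sub-pixel) interpolation mentioned above is omitted for brevity.

/*
 * Minimal sketch of full-search 16x16 block matching for motion
 * compensation.  Illustration only: the 8-bit row-major plane layout,
 * the function names and the search range are assumptions, and
 * sub-pixel (bi-linear) interpolation is omitted.
 */
#include <stdio.h>
#include <stdlib.h>

#define BLOCK 16

/* Sum of absolute differences between a 16x16 block of the current
 * frame and a candidate 16x16 block of the reference frame. */
static unsigned sad16x16(const unsigned char *cur, const unsigned char *ref,
                         int stride)
{
    unsigned sad = 0;
    int x, y;
    for (y = 0; y < BLOCK; y++)
        for (x = 0; x < BLOCK; x++)
            sad += (unsigned)abs((int)cur[y * stride + x] -
                                 (int)ref[y * stride + x]);
    return sad;
}

/* Full search in a +/-range window around the macroblock at (bx, by);
 * returns the motion vector with the smallest SAD. */
static void motion_search(const unsigned char *cur, const unsigned char *ref,
                          int stride, int width, int height,
                          int bx, int by, int range, int *mvx, int *mvy)
{
    unsigned best = ~0u;
    int dx, dy;
    *mvx = 0;
    *mvy = 0;
    for (dy = -range; dy <= range; dy++) {
        for (dx = -range; dx <= range; dx++) {
            int rx = bx + dx, ry = by + dy;
            unsigned cost;
            /* Skip candidates that fall outside the reference frame. */
            if (rx < 0 || ry < 0 || rx + BLOCK > width || ry + BLOCK > height)
                continue;
            cost = sad16x16(cur + by * stride + bx,
                            ref + ry * stride + rx, stride);
            if (cost < best) {
                best = cost;
                *mvx = dx;
                *mvy = dy;
            }
        }
    }
}

int main(void)
{
    /* Two synthetic 64x64 luma planes: the reference is pseudo-random
     * noise and the current frame is the reference shifted right by 3
     * pixels, so the expected motion vector for an interior macroblock
     * is (-3, 0). */
    enum { W = 64, H = 64 };
    static unsigned char ref[W * H], cur[W * H];
    int x, y, mvx, mvy;

    srand(1);
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++)
            ref[y * W + x] = (unsigned char)(rand() & 0xFF);
    for (y = 0; y < H; y++)
        for (x = 0; x < W; x++)
            cur[y * W + x] = ref[y * W + (x >= 3 ? x - 3 : 0)];

    motion_search(cur, ref, W, W, H, 16, 16, 7, &mvx, &mvy);
    printf("best motion vector: (%d, %d)\n", mvx, mvy);
    return 0;
}

In a real encoder the best SAD found this way would additionally be compared against the cost of coding the block independently, which is essentially the decision described on the I-frames slide above.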
Wavelets (DWT)
• The images are passed through a 2D wavelet transformation: every frame goes through a bicubic/bilinear wavelet transformation. The main advantage of the format is that the wavelet transformation is applied to the separate images of the picture.
• The smoothing is done by bicubic/bilinear interpolation, so there are no DCT blocking effects. The pictures on the next slide show the difference in losses between DCT (left) and bicubic wavelet decomposition (right) at the same bitrate.

DCT vs Wavelets
(images: DCT vs. Wavelets)

Quantization
• The sandwich is passed through quantization: every point is divided by a specific number from the quantization vector. Every element of the quantization vector corresponds to some frequency of the wavelet decomposition. Most of the loss occurs at this step (a small quantization sketch is appended after the last slide).

Storing the coefficients
• Each frame is passed through zero-coefficient extraction using quad-tree processing. After quantization most of the small coefficients are zeroed, which is why quad-tree processing is very efficient at this step; it is the main part of the compression.
• After group encoding, the 16 data blocks of each frame are compressed using the Huffman method.

The current status of the CME Engine
• The video codec of the CME Engine is done. A high-performance audio codec is under development.
• The current implementation uses fixed-point ANSI C without any proprietary libraries.

Symbian 6.1 platform
• The CME Engine runs well on the Symbian 6.1 platform (Nokia Series 60 phones). There is room for tuning the video parameters, but overall playback is good.
• The release version of the player software for the Symbian platform is ready and was offered to the public on 01.10.2003.

Player is working on Nokia 3650
• Encoding on a PC
• Real-time playback on the Nokia Series 60 family
• High quality of the video

Results & comparisons
• CM Video has better quality than H.263 and MPEG-1 at the same bit rate.
• CM Video technical parameters:
 - 176x144 at 10 fps on Nokia 3650/7650
 - 1 Mb per minute bitstream for good quality

Standard video formats & players
• MPEG-1: unusable for mobiles.
• H.263: Video Recorder for Nokia from Hantro Oy and Emuzed.
• Real Video: Real Player for Nokia from Real Networks.

Crystal Reality LLC
www.crystalreality.com
• Founded in March 2003
• 900,000 downloads of Crystal Player Professional
• Very strong user community in Europe, Russia and the USA
• The Crystal Mobile Engine was developed during July 2003 - September 2003
• 3 full-time and 2 part-time developers are now employed
• Develops mobile technologies for the future
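Appendix: quantization sketch (referenced from the Quantization slide). This is a minimal C illustration of per-subband quantization of wavelet coefficients, not the CME code: the functions quantize_subband and dequantize_subband, the one-quantizer-per-subband layout and the example values are hypothetical.

/*
 * Minimal sketch of per-subband quantization of wavelet coefficients.
 * Assumption: one quantizer value from the quantization vector is
 * applied to each subband of the decomposition.
 */
#include <stdio.h>

/* Quantize one subband in place: each coefficient is divided by the
 * quantizer chosen for that subband; small coefficients collapse to
 * zero, which is what makes the later quad-tree / Huffman stage
 * effective. */
static void quantize_subband(int *coef, int count, int q)
{
    int i;
    for (i = 0; i < count; i++)
        coef[i] /= q;
}

/* Inverse step used by the decoder: multiply back by the quantizer.
 * The difference between the original and the reconstructed value is
 * the loss introduced at this stage. */
static void dequantize_subband(int *coef, int count, int q)
{
    int i;
    for (i = 0; i < count; i++)
        coef[i] *= q;
}

int main(void)
{
    /* A toy "high-frequency subband" with mostly small coefficients. */
    int band[8] = { 3, -2, 150, 0, -7, 1, -90, 4 };
    int i, q = 16;

    quantize_subband(band, 8, q);
    printf("quantized:     ");
    for (i = 0; i < 8; i++)
        printf("%d ", band[i]);
    printf("\n");

    dequantize_subband(band, 8, q);
    printf("reconstructed: ");
    for (i = 0; i < 8; i++)
        printf("%d ", band[i]);
    printf("\n");
    return 0;
}

Dividing by the quantizer zeroes most small coefficients and coarsens the rest; the reconstructed values after multiplying back show the loss introduced here, and the long runs of zeros are what the quad-tree and Huffman stages then compress.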