DDI Video Compression GUI Design Considerations

Copyright 2008-2010 F. Scott Deaver - all rights reserved. Proprietary and confidential information, may not be disclosed.

Table of Contents

Introduction
   The approach
   Demo limitations
The Graphical User Interface (GUI)
   Global video settings
      Source images for the data
      Setting overriding preferences for one or more frames
      Setting overriding preferences for one or more areas of images
      Injection of proxy data
      Compilation status, progress, and interruption
   GUI design targets
      Overview of the compilation process
         Base tolerances
         Data change evaluation (DCE)
            DCE criteria
            Weighted tolerances
               Weighted tolerance scope
               Weighted tolerance testing
         Bit-shapes
         Delta bit-block indexing block size
      User-configurable parameter support by image controller type
         Global parameters
            Source file path or image array source in memory
            Source binary type
            Source image type
            Source frame width
            Source frame height
            Cropped source frame horizontal offset
            Cropped source frame vertical offset
            Cropped frame width
            Cropped frame height
            Output frame width
            Output frame height
            Output display rate
            Total frame number
            Maximum number of look-ahead frames
            Maximum number of history frames
            Frame preferences array
         Frame preferences
            Frame starting sequence number
            Delta bit-block indexing block size
            Get next video frame data callback
            Quit if compiled output size exceeded
            Pixel minimum for considering this frame empty
         Zones
            Applicable to all zone types
               Zone horizontal offset
               Zone vertical offset
               Zone width
               Zone height
               Zone type
               Zone priority
            Applicable only to inclusionary types
               Pixel difference comparison callback
               Turn logging on or off
               Allow look-ahead averaging
               Number of look-ahead frames to average
               Base tolerance value
               Bit-shape tolerance value (experimental)
               Weighted tolerance scope value
               Weightings
                  Look-ahead weightings array
                  History weightings array
                  Past frame different but overridden weightings array
                  Bit-shape member weighting value
                  On-bit-shape-edge weighting value
                  Not-on-bit-shape-edge weighting value
                  Streaming video skipped frames weightings
                     Number of past frames to consider for skipping
                     Number of skips required to trip frame skip weighting
                     Skip weighting value
Random thoughts
   Easy versus expert
   Playback
In summary

Introduction

DDI video compression tackles the video compression problem from an entirely different perspective than the common (i.e., MPEG) methods currently in vogue. Other video compression techniques emanate from a desire to capture an analog world in a standard format convenient for playback by a variety of devices onto varying display surfaces, including analog output displays. The challenges for these techniques include modeling three-dimensional objects onto a two-dimensional surface, accommodating legacy standards and technologies, and working within the capacities and performance of any number of transport protocols, media, and forms.
To help reduce those challenges, these techniques generally target a standardized output format and rely on external tools and hardware to render that format properly on a given device, the emphasis being on serving as many devices as possible from a standard format. DDI video compression approaches the problem by first assuming that there is a defined endpoint for the compressed result – for the purposes of the initial demo, there will be just one output assumption per compilation result: a Windows-capable computer displaying to a 24-bit 1920 by 1080 display device. Production versions will allow the user to compile for other devices, and/or for generic devices, ultimately supporting several devices in one output file. By targeting a single device with known capabilities as the compilation objective, no compromises need to be made to accommodate other devices, and decisions about transports, displays, and rendering capabilities are already made for us (or at least narrowed to a reasonable subset of all the possibilities).

The approach

The traditional goals for video compression utilities are to (1) model the analog world as realistically as possible in (2) as small a file footprint as practical (in consideration of 1) while (3) serving as many output devices as possible via standardized file formats. Note the order of the priorities. DDI video compression has remarkably different goals.

First, some assumptions. We're going to take it as a given that, at the time of actual display, there is a fixed number of pixels in play horizontally (for the demo, 1920) and vertically (the demo assumes 1080) on the output display device, and that this will not change during the running of the video. We are also going to embrace the fact that the human eye cannot appreciate pixel color changes above a frequency of 24 to 30 frames per second, or certain subtle changes between colors.
We are therefore going to treat the individual pixels on the screen, and the colors they display, as the valuable resource: we refuse to repaint any pixel more often than once every 1/24th (BluRay) or 1/30th (other high-quality video standards) of a second, and we reject any color change that is imperceptible to the human eye. We are further going to reduce the size of the stored files by allowing both dynamic and subjective manipulation of these same pixel-protection controls in inactive or irrelevant areas of the screen, and for interim color changes between rapidly-changing pixel colors. To further reduce output file sizes, we will deploy proprietary compression technologies, including our patent-pending delta bit-block indexing algorithm, our bit-shape detection and storage algorithms, and our data change evaluation algorithms. These technologies work very well together to produce very high-quality output video in much smaller files than current technologies such as BluRay produce, well-tailored to the capabilities of a modern notebook or desktop computer. Beyond compression, the DDI technology introduces a means of injecting custom content and behaviors into a video that no other method can support, and which extends the viewing experience to other devices and software simultaneously at the time of display.

Demo limitations

Because of resource and time limitations, we do not consider audio in the demo. Integrating audio is trivial as a technical challenge, but requires a lot of coding time to implement. We are going to grab the low-hanging fruit first, then go up the tree as we progress. Since the greatest challenge and most "wow" factor is in the video compression (which is by an order of magnitude the largest portion of a video file), we'll get that out first.
In our second version, I anticipate incorporating an existing audio format into our video format, and as a third stage we will look at developing our own audio compression (we have expertise in LPC-10 and other voice compression technologies, and hopefully some of that will transfer over). In other words, we have a number of satisfactory options available for including audio with our video. Though we are bypassing audio initially, we need to ensure we allow access and allocate space in our GUI for controls and screens related to audio.

Also, as noted above, we are limiting our initial demo to a 1920 by 1080 image size at 23.97 frames-per-second (the current BluRay limits), and the output device will need to be a modern Windows notebook or desktop computer (preferably with video card support for DirectX/DirectDraw). Nothing about the underlying compression technology is restricted to Windows, or even to a PC, but for the demo we need the GUI to target the greatest potential subset of users, which of course mandates Windows.

The Graphical User Interface (GUI)

The main functions of the graphical user interface will include configuring and managing the rendering of the compiled video, both initially and during the compilation process. The GUI will also monitor error and status reports published during compilation, and permit the viewing, playback, re-positioning, pausing, resumption, fast-forwarding, and premature termination of the video as it is being compiled. The GUI configuration facility will define and change operating parameters that are global to the video as a whole. It will also be able to apply customized parameters to single frames or groups of frames, preconfiguring them before compilation or creating and inserting them on the fly during compilation. The GUI application will also be able to configure several types of areas, called "zones", within individual frames, which can be dragged and dropped into multiple frames.

Global video settings

Source images for the data

For the demo, we will control the input images, and in the first version will accept only a fixed set of 24-bit images framed as 1920 by 1080 arrays of RGB pixels. The GUI will need to present these arrays as sequences of thumbnail images, with zoom-in and zoom-out capabilities so that a greater or lesser number of frames can be selected or rubber-banded for the application of frame preference or zone configuration data. A typical 90-minute BluRay movie at 23.97 frames-per-second will include nearly 130,000 individual frames, so frame selection will include multiple targeting controls having different granularities. A mockup of the interface for selecting an individual frame might look like this:

[Mockup image: single-frame selection interface]

Single and multiple frame selection plays a vital role in the workflow as I imagine it (certainly this is a topic for discussion, and I welcome other insights). I envision that the user will first go to a page where he/she will select an array of images or (in later versions) a video in another output format to use as the source media for compression (for our demo version, I will provide you an array of 24-bit RGB images from a promo of "The DaVinci Code"). This part of the interface will be primarily a typical file browse and selection dialog, displaying a thumbnail of the first image of a file selection candidate (I suppose in later versions we could actually play the movie in the thumbnail view, if desired). In conjunction with source video file selection, or perhaps in a separate view, the user will also set a number of global parameters for compressing the video.
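As a sanity check on that "nearly 130,000 frames" figure, the arithmetic can be sketched in a few lines (the helper name is mine, not part of any DLL interface):

```cpp
#include <cmath>

// Estimate the total frame count for a video of a given length.
// 90 minutes at BluRay's 23.97 frames-per-second:
//   90 * 60 * 23.97 = 129,438 frames, i.e. "nearly 130,000".
long totalFrames(int minutes, double framesPerSecond) {
    return std::lround(minutes * 60 * framesPerSecond);
}
```

This is why the frame-selection interface needs coarse as well as fine targeting controls: scrolling thumbnail by thumbnail through six figures' worth of frames is not practical.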
Setting overriding preferences for one or more frames

Though those global settings would be all that is required to compress the video, I suspect the user will want to tailor customized compression settings for various sections of the video, and even for certain areas of images within specific frames of a video section. Doing this will result in very high-quality, very small-footprint output finely tuned to that movie (action scenes can be optimized separately from romantic scenery, with settings appropriate to each). From that point, the user will go to the frame selection view and begin customizing sections of the movie and areas of the screen, defining frame preferences and zones, respectively.

The frame preferences editor will use a source image clipped from the video for reference (using the frame selection interface described above), alongside one or more additional images showing the results of the current frame preferences settings (which also take into consideration the current global settings then in effect). The frame preferences editor will also display a number of controls tied to various frame preference settings. It will show lists of previously-configured frame preference settings, which can be selected as a template for the new frame preferences or edited directly. The interface will also contain a list of the three types of previously-configured zones (inclusionary, exclusionary, and forced exclusionary) which can be applied to an area of the frame preferences, as well as an entry point into creating a new zone. Once the user is satisfied with the settings, he/she will be directed back to the frame selection interface to choose which (one or many) frames will be controlled by the new settings. Frame preferences can be named, described, and saved separately for later re-use. More details about the specific frame preferences parameters to be supported will be provided later in this document.
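To make the data model concrete, here is a speculative sketch of what a frame-preferences record might hold, drawn from the parameters listed in the table of contents. The names, types, and callback signature are placeholders, not the DLL's actual interface:

```cpp
#include <cstdint>

// Hypothetical frame-preferences record; fields mirror the parameters
// listed in the table of contents. Not the real DLL structure.
struct FramePreferences {
    uint32_t startingSequenceNumber;     // frame starting sequence number
    uint32_t deltaBitBlockIndexingSize;  // delta bit-block indexing block size
    bool     quitIfOutputSizeExceeded;   // quit if compiled output size exceeded
    uint32_t emptyFramePixelMinimum;     // pixel minimum for an "empty" frame
    // get-next-video-frame-data callback (signature is a guess)
    const unsigned char* (*getNextFrameData)(void* context);
};
```

Whatever shape the real record takes, the GUI requirement is the same: every field must be editable per frame or per group of frames, with the global value as the default.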
Setting overriding preferences for one or more areas of images

The zone editor will also use a source image clipped from the video for reference (using the frame selection interface described above), or one transferred from the frame preferences editor if the user came from that screen, alongside one or more additional images showing the results of the current zone settings (which also take into consideration the current global settings and frame preference settings). For the demo, zones are rectangular, though in future versions we will probably allow them to be free-form. Since any number of zones can be stacked one on top of the other (a concept of zone priority describes the inferior/superior relationships of zones to each other) and intersections are allowed, it is possible to describe any shape possible on a fixed-pixel screen using rectangles (though it can become a pain, especially when the rectangles shrink to just one pixel in size).

The zone editor will also display a number of controls tied to various zone settings. For exclusionary and forced exclusionary zones (more about these later in the document), the settings will consist only of the zone type, zone priority, the horizontal and vertical size of the zone, and the default zone screen position (the final position of a zone on the screen is set in the frame preferences to which the zone is attached). For inclusionary zones, those controls will be shown along with additional controls that can override frame preference settings for the specific area of the screen that the zone controls. The editor will show lists of previously-configured inclusionary zones, which can be selected as a template for the new zone or edited directly. Zones can be named, described, and saved separately for later re-use.
More details about the specific zone parameters to be supported will be provided later in this document.

Injection of proxy data

The compilation DLL fully supports the concept of proxy data, an invention that allows behaviors not available in any other video playback or compression technique. We will not support it in the first release of the demo because of time and resource constraints (and, to be honest, because the additional data cuts into the impact of our compression ratios), but I definitely want it in our second version as an option. My intention is for the first demo to knock their socks off with our ability to compress video at very high quality and low file sizes, and for the second to wow them with that plus the ability to experience awesome effects and behaviors when played back on a computer.

Proxy data is free-form data that can be tied to an arriving or departing frame sequence number during playback, or to the recognition of a "different" pixel in a given frame, in a zone within a frame, or at a specific pixel location on the screen during playback. That "different" pixel could occur as the natural result of image changes in the video, or could be created artificially during configuration of video compression by tweaking a pixel in the margins or in an otherwise inactive portion of the screen. The proxy data can be anything: a callback to a function embedded in the player itself, a script to be decoded by an interpreter installed on the host computer, a URL to an Internet destination, or instructions to jump to a different area of the video. It can launch completely independent applications installed on the computer. For a computer specifically devoted to and equipped for home theatre, it could send out instructions to devices that control room lighting, stage vibrations in a theater seat, or control audio equipment.
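The two trigger styles described above (frame-sequence arrival and "different"-pixel recognition) could be modeled along these lines; this is a speculative sketch, and every name in it is illustrative rather than part of the actual DLL:

```cpp
#include <cstdint>

// Hypothetical model of a proxy-data trigger: free-form data bound either
// to a frame sequence number or to a "different" pixel at a screen
// location. Names and layout are illustrative only.
struct ProxyTrigger {
    enum Kind { OnFrameArrival, OnPixelDifference } kind;
    uint32_t frameNumber;  // used when kind == OnFrameArrival
    int x, y;              // used when kind == OnPixelDifference
    void (*action)();      // callback, script launch, URL open, etc.
};

// Would this trigger fire for the given frame / pixel event?
bool fires(const ProxyTrigger& t, uint32_t frame,
           int px, int py, bool pixelDiffers) {
    if (t.kind == ProxyTrigger::OnFrameArrival)
        return frame == t.frameNumber;
    return pixelDiffers && px == t.x && py == t.y;
}
```

The point of the sketch is that the trigger condition is cheap to evaluate per frame, while the payload itself stays opaque to the player until it fires.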
Proxy data is an inherent part of the design for DDI video compression – it is the means by which we will implement menus and audio for our videos in the released consumer version. These other capabilities came to us as a gift because of the way I chose to implement those basic features. I look at our proxy data capabilities as a second-stage booster rocket that will continue propelling our design forward as improvements in data transmission infrastructure, and competitors catching up, reduce the impact and relevance of our compression capabilities. In any case, we need to keep support for proxy data in the back of our minds as we implement the DDI video compression configuration GUI.

Compilation status, progress, and interruption

Once the global, frame preferences, and zones parameters are set, the user will begin compilation – at that point, the screen will display status and progress information (it would be really cool if we could display the new output compression images as they are being built, but we do not want to interfere with CPU access by the compilation function itself). The user will be presented with controls to pause, resume, terminate, and even edit frame preferences and zones during the compilation process (the algorithm looks for commands and new configuration settings between the processing of individual frames).

GUI design targets

In order to understand and prioritize the design of the GUI configuration pages, which for the most part will collect settings requested from the user and then selectively apply them to the video, it is necessary to have some understanding of what those settings mean or control.
The following presentation isn't meant to convey a complete understanding of all of the DDI video compression features or components, but is intended to provide enough information to design and build the GUI.

Overview of the compilation process

The primary guiding principle of the technology is that a pixel color already displayed at a given location in an image does not require any file space to keep displaying that color at that location for any number of contiguous following frames. In other words, the longer we can keep a pixel at a location the same color (without affecting the quality of the user experience), the less file space we consume storing the video. To achieve this part of the design, we take two passes at a given video frame and at several frames that immediately precede and follow it. The first pass evaluates the current and following frames to identify those pixels that are different from the pixels in the same location of the frame that immediately preceded them. The compilation process stores only those pixel colors that change within certain guidelines as it marches through the video, since colors that remain the same (or are not significantly different) do not need to be refreshed from frame to frame (the system takes care of maintaining unchanged colors for us).

Base tolerances

The actual definition of "different" is itself subject to interpretation. "Different" in the DDI video compression system is defined as being outside a tolerance value – in other words, a pixel color is considered different only if it is outside a given range (or color distance) of the corresponding pixel color in the previous image. These tolerances are expressed as separate double float values, one for the color distance permitted above the previous image's pixel color, and the other for the color distance permitted below it.
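A minimal sketch of that test, treating color as a single scalar distance (a simplification for illustration; the real comparison may operate per channel or in a perceptual color space):

```cpp
// Base-tolerance test: a pixel counts as "different" only when its color
// falls outside the asymmetric band around the previous frame's color.
// Scalar color values are a simplification for illustration.
bool isDifferent(double current, double previous,
                 double toleranceAbove, double toleranceBelow) {
    double delta = current - previous;
    if (delta >= 0.0)
        return delta > toleranceAbove;   // above the previous color
    return -delta > toleranceBelow;      // below the previous color
}
```

Because the above and below tolerances are independent values, the band around the previous color can be asymmetric, which is why the GUI needs two inputs per tolerance rather than one.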
Pixel color values that fall within the upper or lower color distances given in the tolerance value are considered not different from the previous frame's pixel value. The first pass at identifying "different" pixels uses a "base" tolerance value. The base tolerance value is set as a global variable for the entire video, and can be modified by frame preferences applied to specific frames or groups of frames, or by overriding tolerance values in zones within areas of one or more frames. The GUI will need to provide a means for setting global, frame preferences, and zone base tolerance values.

Data change evaluation (DCE)

Once all the pixels outside the base tolerance distance from their corresponding previous frame pixels have been identified for the current frame being evaluated, as well as for several frames into the future (referred to as "look-ahead" frames), a second pass is made considering just the discovered different pixels within those frames. This pass employs a patent-pending technique I call data change evaluation, or DCE. DCE examines the "different" pixels and considers a number of factors to determine whether they really deserve or need to be painted to the screen in this specific frame (or, in special cases, "averaged" with upcoming look-ahead pixel color changes).

DCE criteria

These are the issues and considerations factored into a DCE evaluation of a "different" pixel at a specific location in a given frame:

1. Does the pixel color change fall within the range of human perception as defined by the CIE 1931/1976 color spaces (these color spaces are recognized as accurately representing human color perception)?

2. Does the same pixel change color in the immediate next frame or near-future frames?
If so, painting this particular pixel at this particular frame may be less important since it is going to turn around and change again soon anyway. a. If the above is true, and there are several contiguous changes of this pixel over the next several frames, would it be more desirable to average the pixel to a common color across the changed frames rather than repeatedly changing the color every 1/24 or 1/30 of a second? 3. Was the color at this pixel just changed in the previous frame, or recently changed in one of the several frames preceding it? If so, painting this particular pixel at this particular frame may be less important since it was just changed (especially if item 1 is also true). 4. Were there basic tolerance differences in the past several frames that did not result in actual pixel changes in the compressed video because they were overridden by DCE? If so, previously skipped differences increase the importance of expressing the current pixel difference into the compressed data. 5. Is this pixel part of an identified bit-shape (a “bit-shape” is a pattern of “different” pixels in the image that the DDI algorithm has recognized as having contact with one another). If so, the “difference” may be significant to maintaining the integrity of the bit-shape, especially if the bitshape is recognized in following images as a moving object or repeating pattern. a. If the above is true, is the pixel on the edge of a bit shape – in other words, do some of the sides or corners of the pixel have contact with other bit-shape members while other sides or corners of the pixel do not? Being on the edge of a bit-shape can mean that the pixel may be quite important in defining the bit-shape. In some cases, the reverse may Copyright 2008-2010 F. Scott Deaver - all rights reserved. Proprietary and confidential information, may not be disclosed. 
be true – pixels on the edge may not be as important as pixels in the interior of a bit-shape – but in either case, edge position is a consideration.

6. For the demo, we are supporting file-based playback of the video. In the future we will also support streaming – transmission of compiled output to a remote player as a source file or image array is being converted or compressed from another format. When used in this manner, the compilation process will periodically request from the player a report of any frames that were skipped for display in order for the player to keep up with the specified frame rate. DCE will consider skipped frames as it produces its output, reducing the stress on the player (if necessary) by reducing the number of "different" pixels it actually stores to the compressed video – in this case, the criteria for storing an otherwise "different" pixel may be altered depending on the answer to the question "How many of the last xxx frames were dropped to keep up with the frame display rate, as shown in the frame-skip report?"

Weighted tolerances

With the exception of the first item in the previous list, the criteria are obviously highly subjective. How, then, are these decisions made, persisted, and then passed to the DCE algorithm? The application solves this problem through the use of weighted tolerances.

Weighted tolerance scope

Weighted tolerances start out with a pair of double-float "scope" values (one representing a color distance above the previous frame's corresponding pixel color value, and the other representing a color distance below it). The scope values define the ranges to which weightings will be applied.
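As a sketch only (the names and types below are my own illustration, not taken from the DDI code), the scope pair and its range check might look like this:

```cpp
#include <cassert>

// Hypothetical representation of a weighted tolerance "scope" pair:
// a color distance above and below the previous frame's pixel value.
struct ToleranceScope {
    double above;  // distance above the previous pixel's color value
    double below;  // distance below the previous pixel's color value
};

// Returns true when a signed color distance (current minus previous)
// falls inside the scope range and is therefore subject to weighting.
bool withinScope(const ToleranceScope& s, double signedDistance) {
    return signedDistance <= s.above && signedDistance >= -s.below;
}
```

A signed color distance inside this range falls under weighted tolerance testing rather than automatic inclusion or exclusion.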
Pixel colors being considered for DCE have already been identified as exceeding the base tolerance value, so the scope values must always exceed the base tolerance ranges applicable to the candidate pixel's frame and location (otherwise the pixel would always pass the weighted tolerance test and be included in the compiled output file). Any pixel whose distance from the previous frame's corresponding pixel color value exceeds the weighted tolerance scope will automatically be included in the compiled output file. If, however, the candidate pixel's color distance from the previous frame's corresponding pixel color value falls between the base tolerance and the weighted tolerance scope, that pixel will be subjected to a weighted tolerance test.

Weighted tolerance testing

Currently, we support the following weighted tolerance components:

1. An independent integer weighting for each of the future 'nnn' frames (where 'nnn' is provided by the user via the GUI) in which the pixel at the same location was "different" (considering only the base tolerance) – typically, these weightings will decrease exponentially as the future frame's distance from the current frame increases;

a. A Boolean value indicating whether look-ahead averaging is permitted, and an integer value limiting the number of future frames (with pixel differences) that can be included in the averaging (both values will be supplied by the user via the GUI);

2. An independent integer weighting for each of the past 'nnn' frames (where 'nnn' is provided by the user via the GUI) in which the pixel at the same location was "different" (considering only the base tolerance) – typically, these weightings will decrease exponentially as the past frame's distance from the current frame increases;

3.
An independent integer weighting for each of the past 'nnn' frames (where 'nnn' is provided by the user via the GUI) in which the pixel at the same location was "different" (considering only the base tolerance) but was overridden by DCE and not published to the compiled output file – typically, these weightings will decrease exponentially as the past frame's distance from the current frame increases;

4. An integer weighting value to be applied if the pixel is a member of a bit-shape;

a. An integer weighting value to be applied if the pixel is on the edge of a bit-shape (see the note for the next item);

b. An integer weighting value to be applied if the pixel is not on the edge of a bit-shape – this weighting works in conjunction with the item above. In some cases preserving the bit-shape edge is more important, and in some cases it is less so. Increasing the weighting for the item above makes the bit-shape edge more likely than the bit-shape center to be rendered to the compiled output file, and increasing the weighting for pixels not on the edge makes the bit-shape center more likely to be rendered than the bit-shape edge;

5. (For streaming only) An integer multiplier for each skipped playback frame in the last 'nnn' frames (where 'nnn' is provided by the user via the GUI).

The GUI will need to support setting the weightings for each component, in addition to the other parameters described in the list above. Since these weightings can be set globally, in frame preferences, or in individual zones, the dialogs or views used to configure weightings should be mounted in a reusable GUI component. The higher the final tolerance value applied to a pixel color distance, the less likely that pixel will be preserved to the final compiled output file. The weightings above, when applied, are intended to increase the pixel's likelihood of retention in the output file.
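To make the arithmetic concrete, here is a minimal sketch of how these components might combine into a final tolerance. All names are hypothetical, and the individual components are reduced to generic applicable/met entries:

```cpp
#include <cassert>
#include <vector>

// Hypothetical weighting entry: the weight's value and whether the
// condition it represents is actually met for this pixel and frame.
struct Weighting {
    int value;
    bool conditionMet;
};

// Computes the final tolerance a candidate pixel must exceed, applying
// weightings inversely across the range between base tolerance and scope.
double finalTolerance(double baseTolerance, double scope,
                      const std::vector<Weighting>& weightings) {
    int applicableTotal = 0;  // every weighting that could apply
    int metTotal = 0;         // weightings whose condition is actually met
    for (const Weighting& w : weightings) {
        applicableTotal += w.value;
        if (w.conditionMet) metTotal += w.value;
    }
    // percentage of the applicable weight actually met
    double pct = applicableTotal ? 100.0 * metTotal / applicableTotal : 0.0;
    // shrink the scope range by the met percentage and add the base back
    return baseTolerance + (scope - baseTolerance) * (100.0 - pct) / 100.0;
}

// The pixel is stored only if its color distance exceeds the final tolerance.
bool storePixel(double colorDistance, double baseTolerance, double scope,
                const std::vector<Weighting>& weightings) {
    return colorDistance > finalTolerance(baseTolerance, scope, weightings);
}
```

With no conditions met, the final tolerance equals the scope value (hardest to pass); with every condition met, it collapses to the base tolerance (easiest to pass) – consistent with the statement that weightings increase a pixel's likelihood of retention.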
Furthermore, we do not want the weightings constrained such that when a user wants to increase the impact of one weighting, he/she has to reduce or otherwise manipulate the others – to make a weighting more prominent, all he/she should have to do is set the new weighting higher than the current highest value of any other weighting. Therefore, the weightings are applied inversely to the scope range in the following steps:

1. All of the applicable weightings that could be applied to a given pixel location and frame (regardless of whether the conditions for applying each weighting are met) are totaled;

2. All of the weightings for the conditions actually met by the pixel location and frame are totaled;

3. The result of step two is compared to the result of step one and a double-float percentage is derived;

4. The scope range minus the base tolerance is multiplied by (100.0 minus the result of step three) divided by 100.0, and the product is added to the base tolerance;

5. If the pixel color distance from the previous frame's corresponding pixel color is greater than the result of step four, the pixel is stored in the compiled output file; otherwise it is not stored, and the pixel is marked to indicate that a base tolerance difference was detected but overridden by DCE.

Bit-shapes

Bit-shapes are automatically extracted and inventoried as a component of the underlying proprietary DDI compression technology. However, the GUI must request from the user two (possibly three) values to configure bit-shape extraction. The first value needed is the smallest number of contiguous pixels that can be construed to represent a bit-shape primitive – this is typically an integer set to either three or four, but may be set higher in some conditions. It is inefficient for the system to track and inventory very small bit-shapes.
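As an illustration of that minimum-size threshold (a sketch only – the actual DDI extraction and inventory logic is proprietary and considerably more involved), grouping contiguous "different" pixels and discarding undersized groups might look like:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Counts groups of edge/corner-contiguous "different" pixels that contain
// at least minSize pixels; smaller groups are discarded as too small to
// track. (Illustrative only; not the real DDI bit-shape inventory.)
int countBitShapePrimitives(std::vector<std::vector<bool>> diff, int minSize) {
    const int h = static_cast<int>(diff.size());
    const int w = h ? static_cast<int>(diff[0].size()) : 0;
    int shapes = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (!diff[y][x]) continue;
            // flood-fill one 8-connected group, clearing pixels as we go
            int size = 0;
            std::vector<std::pair<int, int>> stack{{y, x}};
            diff[y][x] = false;
            while (!stack.empty()) {
                auto [cy, cx] = stack.back();
                stack.pop_back();
                ++size;
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx) {
                        int ny = cy + dy, nx = cx + dx;
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w && diff[ny][nx]) {
                            diff[ny][nx] = false;
                            stack.push_back({ny, nx});
                        }
                    }
            }
            if (size >= minSize) ++shapes;  // undersized groups are ignored
        }
    return shapes;
}
```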
The second value needed from the user is the smallest number of add-on pixels acceptable for creating a compound bit-shape. This would be an integer with a startup value of two or three. Again, the intent is to avoid tracking and inventorying compound objects that add very little value to their base primitive shapes. Because the implementation of bit-shapes is highly efficient in terms of storage space in the final compiled output file, it may be desirable to take advantage of that and use a lower tolerance (higher-quality result) than the base or weighted tolerances for any pixel that qualifies as a member of a bit-shape. I am experimenting with bit-shape configuration currently, and at this point it is highly likely bit-shape tolerance will be a part of the demo.

Delta bit-block indexing block size

Delta bit-block indexing is a fundamental component of the underlying proprietary DDI compression technology, and generally runs without the need for human intervention. However, for reasons that are too complex for presentation here (see the patent and design documentation), the size of the elemental bit-blocks can dramatically affect the efficiency of delta bit-block indexing. The possibilities are best presented in a list or combo box, and currently consist of the following (the first of the two numbers shown for each has a relationship to the width of the image, the second to the height):

1. 2 x 2 (nibble);
2. 2 x 2 (nibble, identify leftover pixels individually);
3. 4 x 2 (byte);
4. 4 x 2 (byte, identify leftover pixels individually);
5. 2 x 4 (byte, alternate);
6. 2 x 4 (byte, alternate, identify leftover pixels individually);
7. 4 x 4 (unsigned short);
8. 4 x 4 (unsigned short, identify leftover pixels individually – this is the default);
9. 8 x 4 (unsigned __int32);
10. 8 x 4 (unsigned __int32, identify leftover pixels individually);
11. 4 x 8 (unsigned __int32, alternate);
12. 4 x 8 (unsigned __int32, alternate, identify leftover pixels individually);
13. 8 x 8 (unsigned __int64); and
14. 8 x 8 (unsigned __int64, identify leftover pixels individually).

User-configurable parameter support by image controller type

Global parameters

Source file path or image array source in memory
For the demo, this will be fixed to the file path for the "The DaVinci Code" trailer. In the future, this will be set by the user to point to his/her chosen image source.

Source binary type
For the demo, this will be set to "file path". In the future, this will be set indirectly by the user's choice of the source video or image array in memory.

Source image type
For the demo, this will be set to indicate a 24-bit RGB pixel array. In the future, this will be set by the user to indicate the image format of the source video or image array in memory.

Source frame width
For the demo, this will be fixed at 1920 pixels – in the future, this will be set to the actual width in pixels of the source video or image array.

Source frame height
For the demo, this will be fixed at 1080 pixels – in the future, this will be set to the actual height in pixels of the source video or image array.

Cropped source frame horizontal offset
For the demo, this will be set to 0. In the future, the user can crop the source display size, and would store the horizontal offset to the cropped area here.

Cropped source frame vertical offset
For the demo, this will be set to 0. In the future, the user can crop the source display size, and would store the vertical offset to the cropped area here.

Cropped frame width
For the demo, this will be set to match the source frame width. In the future, the user can crop the source display size, and would store the horizontal size (in pixels) of the cropped area here.

Cropped frame height
For the demo, this will be set to match the source frame height.
In the future, the user can crop the source display size, and would store the vertical size (in pixels) of the cropped area here.

Output frame width
For the demo, this will be set to match the cropped frame width (which in the demo is the same as the source frame width). In the future, if the output frame width is different from the cropped frame width, bi-cubic interpolation will be used to render an interim image used for compression. This will result in much higher image quality than you would see in more typical runtime interpolations from other video compression techniques, but will substantially increase the time required for compression.

Output frame height
For the demo, this will be set to match the cropped frame height (which in the demo is the same as the source frame height). In the future, if the output frame height is different from the cropped frame height, bi-cubic interpolation will be used to render an interim image used for compression. This will result in much higher image quality than you would see in more typical runtime interpolations from other video compression techniques, but will substantially increase the time required for compression.

Output display rate
This will be set to 23.97 for the demo, but in the future will support frame-per-second rates from 16 fps to 30 fps as a double float.

Total frame number
This is the number of frames that make up the video.

Maximum number of look-ahead frames
This integer value will be set by the user via the GUI, and is limited to 255. A good working value would be between 2 and 8, with the default set to 6.

Maximum number of history frames
This integer value will be set by the user via the GUI, and is limited to 255. A good working value would be between 2 and 8, with the default set to 6.
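The global parameters above might be gathered into a settings structure along these lines (a sketch with hypothetical names; the demo's fixed values appear as defaults):

```cpp
#include <cassert>
#include <string>

// Hypothetical grouping of the global parameters described above,
// seeded with the fixed values planned for the demo.
struct GlobalVideoSettings {
    std::string sourcePath;                // demo: path to the trailer file
    int sourceFrameWidth = 1920;           // demo: fixed at 1920 pixels
    int sourceFrameHeight = 1080;          // demo: fixed at 1080 pixels
    int cropOffsetX = 0;                   // demo: no cropping
    int cropOffsetY = 0;
    int croppedWidth = 1920;               // demo: matches source width
    int croppedHeight = 1080;              // demo: matches source height
    double outputDisplayRate = 23.97;      // future: user-set, 16 to 30 fps
    unsigned char maxLookAheadFrames = 6;  // limited to 255, default 6
    unsigned char maxHistoryFrames = 6;    // limited to 255, default 6
};

// A plausible validity check for the future user-settable display rate.
bool displayRateValid(double fps) { return fps >= 16.0 && fps <= 30.0; }
```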
Frame preferences array
The global settings carry an array of frame preferences to be applied at specific frame positions as frame processing advances. By default, the array contains just one frame preference with a frame sequence starting number of 0. The default frame preference contains just one inclusionary zone covering the entire display area, having a zone priority value of 65535 (the lowest priority). This inclusionary zone supplies the default values for the entire video, to be overridden wherever necessary by zone values contained in other frame preferences. Additional frame preferences are added by the user to manage video processing, and as frame preferences are added they are sorted by their frame starting sequence numbers.

Frame preferences
During processing, there is just one frame preferences object in effect for any given frame being processed – the applicable frame preferences object is the one whose starting frame sequence number is equal to or less than the frame's sequence number, with no other frame preferences object's starting sequence number falling between the two.

Frame starting sequence number
The frame preferences starting sequence number is an integer value that describes at what position in the sequence of video frames this frame preferences object takes effect. One and only one frame preferences object can be in play for any given frame, and any attempt to insert a frame preferences object with the same frame sequence number as another will replace the previous object – if you want to preserve the previous frame preferences object, its starting frame sequence number must be changed to one that doesn't conflict with any other frame preferences object.
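A sketch of that lookup, keyed by starting sequence number (types and names here are hypothetical):

```cpp
#include <cassert>
#include <map>

// Hypothetical frame preferences payload; the real contents include zones,
// tolerances, weightings, and the delta bit-block size.
struct FramePreferences { int id = 0; };

// Keyed by starting frame sequence number; std::map keeps keys sorted, and
// assigning at an existing key replaces the previous object, matching the
// "one and only one per starting frame" rule described above.
using FramePreferencesArray = std::map<int, FramePreferences>;

// Returns the preferences in effect for a frame: the entry whose starting
// sequence number is the greatest value not exceeding frameNumber.
const FramePreferences& preferencesFor(const FramePreferencesArray& prefs,
                                       int frameNumber) {
    auto it = prefs.upper_bound(frameNumber);  // first entry past the frame
    --it;                                      // step back to the one in effect
    return it->second;
}
```

This assumes, as described above, that a default entry always exists at frame 0, so the backward step is always valid.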
While no two frame preferences objects can share the same frame sequence number, a frame preferences object can be repeated, albeit with different frame numbers in each instance, many times throughout the video. If, for example, a video has action scenes interspersed with dialogue scenes, two frame preferences objects, each appropriate to one of the scene types, could be alternated as needed for the course of the movie.

Delta bit-block indexing block size
As described previously, this defines the block size used by the DDI delta bit-block compression. The following values are currently valid:

1. 2 x 2 (nibble);
2. 2 x 2 (nibble, identify leftover pixels individually);
3. 4 x 2 (byte);
4. 4 x 2 (byte, identify leftover pixels individually);
5. 2 x 4 (byte, alternate);
6. 2 x 4 (byte, alternate, identify leftover pixels individually);
7. 4 x 4 (unsigned short);
8. 4 x 4 (unsigned short, identify leftover pixels individually – this is the default);
9. 8 x 4 (unsigned __int32);
10. 8 x 4 (unsigned __int32, identify leftover pixels individually);
11. 4 x 8 (unsigned __int32, alternate);
12. 4 x 8 (unsigned __int32, alternate, identify leftover pixels individually);
13. 8 x 8 (unsigned __int64); and
14. 8 x 8 (unsigned __int64, identify leftover pixels individually).

Get next video frame data callback
This callback is provided for extension of the DDI compression DLL by third-party vendors, and may be used by proxy data and the configuration GUI (especially during the production of streaming content). It allows injecting custom code just before individual frames get their pixel data from the compression engine, and there are currently no limits on the behaviors that could be injected.

Quit if compiled output size exceeded
This integer value provides a safety mechanism: compilation quits if, during a long compilation, the addition of a frame's compiled data would cause the compiled output file's size to grow beyond the specified value.
This value is usually seeded with the size in bytes of the original source video file or image array in memory.

Pixel minimum for considering this frame empty
If the number of "different" pixels in the frame after DCE processing is less than this integer value, and none of those pixels has been identified as being a member of a bit-shape, this frame will be considered empty. (For empty frames, a 1-bit placeholder for the frame is stored to the compiled output file, and during playback whatever image was displayed in the previous frame is simply carried over to the next frame.)

Zones
The idea behind zones is that you can control the compression of various areas of the screen, either by excluding them from being output to the compiled file at all or, in the case of inclusionary zones, by customizing base tolerances, weighted tolerances, weightings, and other parameters for compression management. These custom zones can then be applied across several frames through their frame preferences container. Looking at a very basic example, here we have a romantic scene where we as viewers really don't care (and our eyes won't perceive) how accurately some parts of the image are displayed in terms of their change from the previous frame. This is the scene as displayed: And these are the zones that are less important to our eyes: We can increase the tolerances in these areas so minor color changes are not saved to the output compilation, and since the scene consumes a number of frames, the savings are significant. It is not necessary to target specific frames or groups of frames to achieve remarkable compression ratios.
You can use the fact that zones can overlap, and that their relationships to one another can be controlled through zone priorities, to keep focus areas of the movie crisp while allowing out-of-focus areas to be more forgiving. Consider the following image, where (as is typical for most video) the important parts of the scene are near the center: We can create a zone policy for the entire movie which will focus its energies (and file bytes) on the screen areas most important to the viewer: Here we've created two high-tolerance, low-priority zones at the two sides (1). We've then created a slightly lower-tolerance, slightly higher-priority zone that covers the entire area between the two side zones (2). We've then nested zones of decreasing tolerance but increasing priority layered on top of one another until we get to zone 5, which is the lowest-tolerance and highest-priority zone – this zone will very accurately display color changes within it, and will override all the zones it overlays. We can then apply this zone scheme to the entire video via a single frame preferences object, causing fewer pixels to be saved at the edges and the pixels in the center to be displayed more accurately. Currently we support only rectangular zones – in future versions we expect to allow free-form zones. However, because zones can be overlapped, and because a pixel is for all practical purposes itself rectangular, we can describe any shape displayable on a screen using overlapping rectangles (with some of them admittedly very small). Consider this source image: And this zoning scheme to preserve detail in certain areas of the vehicle:
While not necessarily a practical example, it does demonstrate that zones of arbitrary complexity (within the limits of the display area) can be created.

Applicable to all zone types

Zone horizontal offset
This is the zone's horizontal offset within the frame, in pixels.

Zone vertical offset
This is the zone's vertical offset within the frame, in pixels.

Zone width
This is the zone's width, in pixels.

Zone height
This is the zone's height, in pixels.

Zone type
This is the zone's type. There are three types: inclusionary, exclusionary, and force-exclusionary. Exclusionary and force-exclusionary zones force the exclusion of a pixel within their boundaries from the compiled output file (unless overridden by an overlapping inclusionary zone having a higher priority – see the details under "Zone priority" below). An exclusionary or force-exclusionary zone does not provide any additional information for processing a pixel. An inclusionary zone signifies that pixels falling within its borders should be included in the compiled output file if the pixel meets the tolerance and other criteria that the inclusionary zone supplies.

Zone priority
Zone priority is used, along with zone type, to manage overlaps between zones. The zone priority is an unsigned short integer having a value between 0 and 65535, with 65535 being the lowest priority and 0 the highest. An inclusionary zone trumps an exclusionary zone having the same priority, but a force-exclusionary zone trumps an inclusionary zone at the same priority.
Therefore, if a pixel was different from the corresponding previous frame's pixel, that difference was outside the base tolerance level, and the pixel's location fell within an inclusionary zone having a priority of 0, and also within an exclusionary zone having a priority of 0, and also within a force-exclusionary zone having a priority of 0, the force-exclusionary zone would dominate and the pixel would not be published to the compiled output file. In all other cases, the zone priority would govern (an inclusionary zone of priority 7 would trump a force-exclusionary zone having a priority of 8, but would itself be trumped by an exclusionary zone with a priority of 5).

Applicable only to inclusionary types
The following data items are provided only by inclusionary zones, and they specify the criteria the pixel color must meet to be included in the compiled output file.

Pixel difference comparison callback
This callback is provided for extension of the DDI compression DLL by third-party vendors, and may be used by proxy data and the configuration GUI (especially during the production of streaming content). It allows injecting custom code just after a pixel is processed for differences, zone membership, and/or bit-shape membership, but before the pixel is included in or discarded from the compiled output file (the third-party vendor can influence that decision by manipulating the callback return code). There are currently no limits on the behaviors that could be injected.

Turn logging on or off
The DDI compression DLL has a very efficient proprietary non-blocking threaded logging engine built into it, and the inclusionary zone can turn logging on and off using this Boolean value.
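The type-plus-priority resolution described under "Zone priority" might be sketched as follows (hypothetical names; only the winning zone matters for the include/exclude decision):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class ZoneType { Inclusionary, Exclusionary, ForceExclusionary };

// Hypothetical zone record: 0 is the highest priority, 65535 the lowest.
struct Zone {
    ZoneType type;
    uint16_t priority;
};

// Decides whether a pixel covered by all of the given zones may be
// published. At equal priority, inclusionary trumps exclusionary but
// force-exclusionary trumps inclusionary; otherwise priority governs.
bool pixelIncluded(const std::vector<Zone>& covering) {
    const Zone* winner = nullptr;
    for (const Zone& z : covering) {
        if (!winner || z.priority < winner->priority) {
            winner = &z;
        } else if (z.priority == winner->priority) {
            // tie-break: force-exclusionary beats inclusionary beats exclusionary
            auto rank = [](ZoneType t) {
                return t == ZoneType::ForceExclusionary ? 2
                     : t == ZoneType::Inclusionary      ? 1 : 0;
            };
            if (rank(z.type) > rank(winner->type)) winner = &z;
        }
    }
    return winner && winner->type == ZoneType::Inclusionary;
}
```

The assertions mirror the worked example in the text: an inclusionary zone of priority 7 beats a force-exclusionary zone of priority 8, but loses to an exclusionary zone of priority 5.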
Allow look-ahead averaging
If a pixel in the current frame passes the base and weighted-tolerance tests, and the corresponding pixels in frames ahead of it have passed the base tolerance test, the pixel may be a candidate for averaging with one or more of the pixels ahead of it. This Boolean value indicates whether that averaging is permitted for this inclusionary zone.

Number of look-ahead frames to average
If permitted by the Boolean value described above, this signed integer value sets the maximum number of forward pixels that can be averaged. The number of pixels that will actually be averaged is the lesser of the number of contiguous "different" forward pixels and this value. If set to negative one, the value means do not change whatever number of look-ahead frames is currently in effect. This is useful because a zone's "allow look-ahead averaging" may be set to temporarily turn off look-ahead averaging without altering the number of look-ahead frames that takes effect once look-ahead averaging is turned back on by a later frame preferences object.

Base tolerance value
This is the base tolerance value in effect for the pixels included in this zone, and defines whether a pixel is considered "different" from the corresponding pixel in the previously processed frame during the first pass at processing the current frame.

Bit-shape tolerance value (experimental)
Because the implementation of bit-shapes is highly efficient in terms of storage space in the final compiled output file, it may be desirable to take advantage of that and use a lower tolerance (higher-quality result) than the base or weighted tolerances for any pixel that qualifies as a member of a bit-shape.
I am experimenting with bit-shape configuration currently, and at this point it is highly likely bit-shape tolerance will be a part of the demo.

Weighted tolerance scope value
This value sets the maximum tolerance range – the distance between the base tolerance and this value is where weightings are applied to determine the final tolerance the candidate pixel must pass to be included in the compiled output file. Please see the extensive explanation regarding weightings in the section "Data change evaluation (DCE)".

Weightings

Look-ahead weightings array
This is where the user-provided weightings are stored for this zone as applied to look-ahead frames that have been detected as "different" (using the base tolerance value) from their predecessors.

History weightings array
This is where the user-provided weightings are stored for this zone as applied to past frames that have been detected as "different" (using the base tolerance value) from their predecessors.

Past-frame-different-but-overridden weightings array
This is where the user-provided weightings are stored for this zone as applied to past frames that were detected as "different" (using the base tolerance value) from their predecessors but were not written to the compiled output file because they were overridden by DCE.

Bit-shape member weighting value
This is the weighting applied to any pixel that is different and has been discovered by DCE to belong to a group of different pixels that form a bit-shape.

On-bit-shape-edge weighting value
This is the weighting applied to any pixel that is different and has been discovered by DCE to be on the edge of a group of different pixels that form a bit-shape.
Not-on-bit-shape-edge weighting value
This is the weighting applied to any pixel that is different and has been discovered by DCE to be a member of a group of different pixels that form a bit-shape, but is not on any edge of the bit-shape.

Streaming video skipped-frames weightings
When used for runtime streaming of compiled output from a different file format to a remote player, the compiler can adjust its output in response to the number of frames the player is skipping in order to keep up with the designated frame display rate (if any).

Number of past frames to consider for skipping
This is how far back in history the compiler should look in the skipped-frames report to assess whether a skipped-frame weighting should or should not be applied.

Number of skips required to trip frame-skip weighting
This is the number of playback frame skips allowed before frame-skip weighting is deployed. If a greater number of playback frame skips occurs within the number of frames specified in "Number of past frames to consider for skipping", frame-skip weighting is invoked.

Skip weighting value
The skipped-frames weighting differs from the other weightings in that it is NOT applied if frame-skip compensation is required, but IS applied if no frame-skip compensation is required. This value should be fairly large when compared to the other weightings, and should comprise fifty percent or more of the total weightings in effect for the candidate pixel.

Random thoughts

Easy versus expert
Some thought needs to be given to breaking out the esoteric settings into some kind of advanced or expert mode, page, or view.

Playback
It occurs to me that in creating the configuration GUI, we will also have created most of the underpinnings for a standalone player.
We'll have to be sure to set those playback features aside as we build them into some kind of container – a class, assembly, or DLL (as to the latter, I'm concerned that C# lacks the performance we need, and I'm thinking the playback file extraction and image-recreation code will run in a C++ process, interacting with a GUI front-end toolbar via shared memory).

In summary

This is intended to be a working document, changing over time as the application develops and through version iterations. At the time of this writing, the document is crude, and intended primarily as an introduction for any programmer willing and able to assist with the project. It is my expectation that this document will eventually evolve into a competent high-level design (HLD) for the GUI component of the project. That having been said, the methods, techniques, and ideas described are proprietary and confidential and include intellectual property, trade secrets, and design considerations of great value to me, my partners, clients, and future customers. As a condition of having received this document, you are bound to protect this information from accidental or intentional discovery by other parties. Please securely and effectively destroy any hard copies or digital images of this document after review, and do not convey this information to any other party by any means without my express written permission.

Thank you,

Scott Deaver
Two's Complement, LLC