Image Security and Encryption

By Dr. Bjarne Berg-Saether

Submitted in Partial Fulfillment of the Requirements of INFO-8200 Principles of Security
University of North Carolina – Charlotte
Spring 2005

Table of Contents

PAPER ABSTRACT
MOTIVATION AND BACKGROUND
JPEG STATIC IMAGES BACKGROUND
    OTHER COMPRESSION METHODS
    THE JPEG COMPRESSION METHOD
JPEG STREAMING IMAGES BACKGROUND
ENCRYPTION ALGORITHM
CYCLE CONTROL
LIMITATIONS
RISKS
APPENDIX A – THE ENCRYPTION CODE
WORKS CITED

Paper Abstract

As computers matured from simple number machines into more text-based systems, the need for better ways to secure information has increased. While securing textual information is rather simple, the new usage of computers to stream images, music and movies has raised new concerns around image and sound security. This research paper reviews the current status of encryption and security of images as presented in the research literature. The paper focuses specifically on the JPEG and MJ2 standard formats. In addition, the paper proposes a conceptual framework for a proof-of-concept application that can encrypt images. The paper also discusses how a simple application can be incorporated into a larger application and the requirements that must be met for such an application to work.

Motivation and Background

As computers matured from simple number machines into more text-based systems, the need for better ways to secure information increased. While securing textual information is rather simple, the new usage of computers to stream images, music and movies has raised new concerns around image and sound security. While static files are relatively easy to secure using existing technology, continuous data streams are much harder to secure. In addition, the increased reliance on IP over the Internet as the core protocol to transmit this type of data has exposed many to significant risks.

As an example, many public security cameras in cities, parking lots and airports are now transmitting their images in the clear across the Internet. Unfortunately, the vast majority of these images are unsecured and unencrypted. This exposes these types of images to several risks. First, the images can be monitored by others; they can be copied and viewed later.
A more serious risk is the potential for criminal activity. If an outdoor camera is recording a certain business area, it may provide a false sense of security as the viewer starts to rely more on the camera than on physical patrols. For example, the images from that camera can be recorded by a criminal and later played back to the viewer, who believes he is seeing real-time images. This would allow the criminal to engage in break-ins, assault or other activity while the viewer is unaware.

The problem above illustrates the need for strong image security for these cameras. In addition, a cycle control is needed. Even with encrypted images, the stream can be slowed so that the viewer sees images that lag substantially in time while other activities are taking place. Such a slow-down in frames should also be monitored by a real security system.

JPEG Static Images Background

In 1982, the ISO formed the Photographic Experts Group (PEG) to research methods of transmitting video, still images, and text over ISDN lines. The goal was to produce industry standards for the transmission of graphics and image data over digital communications networks (Murray & vanRyper, 1994).

Other Compression Methods

Prior to JPEG, most compression technology could not compress images containing a large number of colors, and most could not support 24-bit raster images. For example, GIF can only cover 256 colors (8 bits deep). Also, the GIF compression method (LZW) has a hard time recognizing repeated patterns when images have “noise” (e.g. faxed images). Another common image format, BMP, relies on the RLE method and therefore has the same problems, even though it supports the same number of bits as JPEG (24 bits and 16 million colors).

The JPEG Compression Method

JPEG is not a single algorithm but a lossy method for compressing images. This means that a compressed image has some values removed and cannot be restored 100% to its original form.
There are many factors in this lossy compression algorithm. However, a key factor is that high-frequency color information is removed from the image. This information is carried by the chrominance components Cb and Cr. The high-frequency gray scale (luminance) information is kept mostly intact, since it is critical to the contrast of the image (see Illustration 1).

(Illustration 1: Effects of chrominance components compression. Panels: JPEG, GIF)

This lossy processing is very unlike the RLE and LZW methods mentioned above. JPEG also has limited support for an alternative 2D Differential Pulse Code Modulation (DPCM) scheme, which predicts each pixel from its neighbors and stores only the prediction error, so that the decoder can restore the image exactly when it is viewed.

Another way to encode high-volume images is to tile them into many smaller images. JPEG supports pyramidal, composite and simple tiling and can therefore also encode and decode very large images.

The complete process of encoding the image is a complex one. First, a transform is performed to change the image into a basic color scheme (luminance and chrominance components); then samples of pixels are grouped together and the chrominance parts are analyzed for reduction processing. JPEG’s core compression (50%) is achieved by subsampling the chrominance components so that each chrominance pixel spans a 2-by-2 area of luminance pixels. This means that 6 values (4 luminance, 1 Cb and 1 Cr) are stored for each 2-by-2 block instead of the 12 values normally needed.

The key to the next step in the process is JPEG’s use of the DCT (Discrete Cosine Transform) algorithm. The DCT provides the baseline compression that all JPEG compression systems are required to support. In the DCT system, the color detail removed by the JPEG compression is largely undetectable by the human eye, and the number of colors used can be reduced to a smaller set of re-usable values. Put simply, the DCT processes blocks of pixels and removes redundant data in the image.
However, a major step in the DCT is the conversion of 8-by-8 blocks into a frequency map that measures the average block value as well as the strength of change across the block and the direction of the change (i.e. along the height or the width of the pixel block). After this, each block is re-processed using a set of weighted coefficients. This is done by dividing each of the 64 DCT block values (8-by-8) by a quantization coefficient and rounding the result to an integer. If the quantization coefficient is high, the accuracy of the DCT block value is reduced and hence the image has higher compression, but worse quality.

The benefit of this re-processing is the optimization for viewing by the human eye according to a set of semi-fixed rules. The core idea is to make sure that similar groupings of pixels have the same encoding so that they appear uniform. The semi-fixed rules are actually quantization tables that are optimized for chrominance and luminance. The key is that these tables have higher quantization values for chrominance, thereby increasing the compression of colors, while providing less compression of the luminance of the image. The values of the quantization coefficients can be manipulated by the end user by changing the desired quality.

It is important to note that the black-white-gray information is maintained mostly intact for contrast purposes. As a result, images with many colors (e.g. real-world shades) compress very well in JPEG, typically at 90% compression, while black-and-white documents see only marginal compression benefits. To determine the compression ratio, the creator can manipulate the Q factor for quality settings. However, each image has its own best Q factor, so a compromise has to be made in streaming video, where multiple images are transmitted.

The most costly process of all these steps is the DCT quantization of each block. Therefore this is best done by hardware (i.e. pre-made chips) or pre-compiled software.
A small but important note is that “Images containing large areas of a single color do not compress very well. In fact, JPEG will introduce "artifacts" into such images that are visible against a flat background, making them considerably worse in appearance than if you used a conventional lossless compression method.” (Murray & vanRyper, 1994).

After all this processing, the weighted coefficients are re-processed to remove duplicates. This is normally done through the Huffman variable word-length algorithm, but a binary arithmetic entropy encoder can also be employed to increase the compression by approximately 15% more without losing any data (it is, however, slower to encode and decode). The result of all this processing is a compressed image that can be interpreted by a decoder typically found in a browser, TV or PC hardware such as screens and projectors.

JPEG Streaming Images Background

A streaming image process requires a version marker, which indicates the capability needed to decode the data stream. Multiple version markers can indicate the functionality needed, the image processing chain, the encoding methods and the preferred display execution (e.g. pyramid tiling).

While JPEG is the base for a streaming image, the static representation could only represent the image as interlaced lines. Naturally, you could decode the lines of the image non-sequentially (e.g. every second line, and then “fill in the blanks”). However, this would be a poor graphics solution, with image quality changing dramatically while the image is being viewed. A better solution is progressive image building: the transfer of images with very high compression first, supplemented with better images on the fly.

In 2000, a format called MJ2 (or MJP2) for streaming JPEGs was launched. This is a sequence of JPEG 2000 images that also provides audio. MJ2 encodes each frame separately using JPEG 2000 and does not require inter-frame coding as with earlier standards.
However, the standard requires substantial computing power to encode and decode images, although it is well suited to rapid transfer over low-bandwidth networks such as the Internet.

Encryption Algorithm

The core issue around an encryption algorithm is that it has to be both efficient and secure. Since JPEG images are subject to an 8-level compression process when created, as well as an 8-level decompression process when viewed, the computing power needed for the encryption has to be minimized. This is particularly true for streaming images, where high volumes are likely to occur.

(Illustration 2: The compression and encryption process)

The solution proposed consists of a simple executable file generated from C++ code that takes the compressed JPEG image, which is in binary form, and combines it with an encryption key to create an encrypted binary file that is protected from viewing by unauthorized people. Since file management is hard when reading blocks back and forth from a binary file, it is more efficient to read the binary file into memory and address it as an object with an associated pointer.

The solution uses a secret key file of 128 characters (bytes); in binary, this key file consists of 1024 bits that are used for encryption. The creation of the secret key is not part of this program, but it can be created through a random generator or by a third party and distributed through normal secure channels (e.g. RSA, DES or SSL). The secret key is then stored on both the sender’s and the receiver’s side and has to be protected from unauthorized access. If this key is accessed, the security of the encrypted images is compromised.

The proposed solution first reads the key file into an object based on an array with an associated pointer. This object is called the “key”. It is stored in binary form and accessed as such in memory.
Secondly, the program reads the binary JPEG file into an object that is defined as an array. For simplicity, this is referred to as the image buffer. A challenge is that while we know the fixed size of the key file (128 bytes), we do not know the size of the JPEG image. Therefore we have to determine the file size by examining the beginning and the end of the file and define the array dynamically based on that size.

Another challenge is the need for simple processing in the encryption. For simplicity, the solution loops through the buffer array and also sub-loops through the key array. For each 8 bits read, the buffer bits are flipped according to the values in the current 8-bit block of the key. This creates a stream of bits that is processed sequentially and transformed based on the values of the key file.

The result is that the buffer array now contains bits that are “flipped”: if a key value is ‘1’ for a given position and the buffer value is ‘0’, the value in the new (encrypted) buffer becomes ‘1’. This paper refers to this as the ‘not’ process; strictly, it is a bitwise exclusive-or (XOR). The overall rules of this process are: ‘0 and 0 becomes 0; 1 and 1 becomes 0; 1 and 0 becomes 1; 0 and 1 becomes 1’. The implication is that if a single bit is changed in the key file, the image can no longer be decrypted correctly. This is because the corresponding bit is then wrong in every stretch of the file processed with that part of the key, and a compressed JPEG file with altered bits becomes meaningless to the decoder.

The proposed solution also supports processing 32 bits at a time by moving the binary values into a text array and processing blocks of the key and the buffer in that manner. However, that requires more computing power and may not work on smaller devices such as cameras that have little memory and less processing power. Therefore, the 32-bit code is included in the program but not called in the proof-of-concept demo.

The next step is to write the newly populated buffer to a binary file known as the encrypted file.jpg.
The output file name can be replaced by a variable entered by the user in a production system, but for the proof of concept it is hardcoded in the source code. It is this encrypted file that can be stored insecurely or transmitted over unsecured channels.

The recipient of this file simply reprocesses the file with the opposite processing, in which the bits are “flipped” back by the ‘not’ process into their original state. On Microsoft operating systems this is a very memory-intensive process, and better results could be achieved if the solution were running on a Unix system (e.g. AIX or Solaris). The whole process occurs in memory, and files are only accessed when read once at the start and written once at the end of the process.

The decrypted image can now be decompressed by any JPEG-compliant decompressor and viewed in a supported viewer (e.g. a browser). Again, this is a memory-intensive process whose demands will grow when streaming images are processed in encrypted form.

Cycle Control and Image Intrusion Detection Systems (IIDS)

While not provided in the proof of concept, a cycle control that flags each image with a separate number value is needed. The reason for this is that even with encrypted images, the stream can be slowed so that the viewer sees images that lag substantially in time while other activities are taking place. Such a slow-down in frames should also be monitored by a real security system.

The cycle control would also allow tracking of recorded images. Recorded images are images that are no longer ‘real-time’. For example, a set of encrypted images can be recorded from a stream originating from a camera and resubmitted by an intruder after the camera is no longer functioning. To capture this, a cycle control can read each image’s value, keep it in an array and validate whether the image has previously been processed by the decryption program.
If the check shows that an image has already been processed, the system can notify the viewer that an intrusion has occurred. By doing this, the cycle control enables the decryption software to also act as an image intrusion detection system (IIDS).

Limitations of the Proof of Concept

The process is very memory intensive and, if the image is large, usually causes a memory buffer overflow at the operating system level on smaller machines with less than 512 MB of RAM. This is true for Microsoft operating systems, which tend to have very inefficient memory addressing. The core reason for this is that the OS places the image buffer, the key buffer and the resulting encrypted buffer in non-contiguous memory addresses. Since these addresses are not protected by the OS, there is an increased likelihood that other OS processes conflict with the memory usage and simply ‘crash’ the process being executed.

The solution for Microsoft operating systems is simply to increase the memory to allow larger buffers to be processed, or to reduce the key size to less than 1024 bits. A far better solution would be to block off a set of memory addresses reserved for the process (remember that we do not know the size of the images) and thereby reserve this memory for the processing. An alternative solution would be to move the encryption software to Linux or Unix based systems that handle memory contention better. However, for smaller images (around 30 KB) the solution works fine on any Microsoft operating system.

Also, the program demonstrated does not process streamed images, but this can be added as a loop over the file names being processed. We need a more powerful PC to make that work with the algorithm tested, or we need to reduce the size of the key to fewer bits. However, reducing the key size is risky business, since the flip processes the key file multiple times (e.g. for a 30 Kbit file, the key is processed 29 times).
The security of the system is based on the size of that file, and less than 1024 bits is a very risky proposition. The number of possible values for the current 1024-bit key file is 2^1024, which is approximately 1.80E+308. A longer key would actually be preferred, since each additional bit doubles the number of combinations; a single additional byte makes the number of possible keys 256 times larger. The proof of concept allows the key length to be increased by simply declaring a larger array when the object is created and the file is read into the key buffer.

Risks

The new usage of computers to stream images, music and movies has raised new concerns around image and sound security. While static files are relatively easy to secure using existing technology, continuous data streams are much harder to secure. The use of IP over the Internet as the core protocol to transmit this type of data has exposed many to significant risks.

As an example, many public security cameras in cities, parking lots and airports are now transmitting their images in the clear across the Internet. Unfortunately, many of these images are unsecured and unencrypted. This exposes these types of images to several risks. First, the images can be monitored by others; secondly, the images can be copied and viewed later.

A more serious risk is the potential for criminal activity. If an outdoor camera is recording a certain business area, it may provide a false sense of security as the viewer starts to rely more on the camera than on physical patrols. Images from that camera can be recorded by a criminal and later played back to the viewer, who believes they are real-time images. This allows criminals to engage in break-ins, assault or other activity while the viewer is unaware.

APPENDIX A – The Encryption Code

The trick is to recognize that it is a binary file and not a text file...
Loading to a memory buffer (for speed purposes)

Simple NOT processing of the bits in the key file (flipping)

Works Cited:

Eskicioglu E.M., and Delp E.J., "An Overview of Multimedia Content Protection in Consumer Electronics Devices," Signal Processing: Image Communication, Vol. 16, 2000, pp. 681-699.

Eskicioglu E.M., Dexter S., and Delp E.J., "Protection of Multicast Scalable Video by Secret Sharing: Simulation Results," Proceedings of the SPIE/IS&T Conference on Security and Watermarking of Multimedia Contents, Vol. 5020, January 2003, Santa Clara, California.

Eskicioglu E.M., and Delp E.J., "An Integrated Approach to Encrypting Scalable Video," Proceedings of the IEEE International Conference on Multimedia and Expo, August 26-29, 2002, Lausanne, Switzerland.

Eskicioglu E.M., Town J., and Delp E.J., "Security of Digital Entertainment Content from Creation to Consumption," Proceedings of the SPIE Conference on Applications of Digital Image Processing XXIV, Vol. 4472, San Diego, July 2001, pp. 187-211.

Lin E.T., Podilchuk C.I., Kalker T., and Delp E.J., "Streaming Video and Rate Scalable Compression: What Are the Challenges for Watermarking?," Proceedings of the SPIE International Conference on Security and Watermarking of Multimedia Contents III, Vol. 4314, January 22-25, 2001, San Jose, CA.

Naren K., Csilla F., and Wijesekera D., "Enforcing Semantics Aware Security in Multimedia Surveillance," Journal on Data Semantics (Springer LNCS).

Murray J.D., and vanRyper W., "Encyclopedia of Graphics File Formats," O'Reilly, 1st ed., 1994.

Wolfgang R.B., and Delp E.J., "Overview of Image Security Techniques with Applications in Multimedia Systems," Proceedings of the SPIE Conference on Multimedia Networks: Security, Displays, Terminals, and Gateways, Vol. 3228, November 2-5, 1997, Dallas, Texas, pp. 297-308.