Multimedia
Beta-version
http://www.youtube.com/watch_popup?v=jEjUAnPc2VA#t=20

Required background
  C, C++ or Java
  CS240 (Data Structures) or CS212 (for Engineers)
  This is NOT a graphics course (see CS460/560)
Labs
  Data compression
  Elements of graphics in OpenGL
  Photo manipulation (GIMP)
  Steganography
  Creating & rendering models
© D.J. Foreman 2009

Compression
  Encoding/Decoding
  Codecs
  Huffman encoding example – lab!
Working with Images
  Image creation
    Photography
    Drawing
    RIB (RenderMan Interface Bytestream)
    Modeling
      • OpenGL – lab!
      • Makehuman, 3DCanvas, etc. – lab!
  Image manipulation
    Gimp, Photoshop, etc. – lab!
  Stereoscopic & other 3D
  Rendering – lab!
    Bitmaps
    Ray tracing
Animation
  Hand drawn
  Program generated
    • Stop-motion
    • 3D CGI
    • Alice – lab!
    • DirectX & OpenGL
  Hardware acceleration
  RIB revisited
  Shaders
Steganography – lab!
Additional topics (time permitting)
  Augmented reality
  Geospatial data systems

Modelers and environments
  Ayam – ortho & 3D
  3DCanvas – 3D
  Blender – ortho & 3D
  Makehuman – 3D
Renderers
  Aqsis
  Pixie – no GUI, file input

Codec: a device or program for encoding and/or decoding a digital data stream or signal
  Encoding: A->D, for storage or transmission
  Decoding: D->A, for playback
  Lossy vs. lossless
Raw uncompressed Pulse Code Modulation (PCM) audio (e.g., 44.1 kHz, 16-bit stereo, as represented on an audio CD or in a .wav or .aiff file) is a standard across multiple platforms.
  PCM is a digital representation of an analog signal in which the magnitude of the signal is sampled regularly at uniform intervals.

A codec may emphasize particular aspects of the data
  Video: motion vs. color
  Audio: latency (cell phones) vs. high fidelity (music)
  Data size: transmission speed or storage size vs. data loss

File signatures
  Forensic Computing: A Practitioner's Guide, by T. Sammes & B. Jenkinson (Springer, 2000), ISBN: 9781852332990.
  http://www.garykessler.net/library/file_sigs.html
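As a worked example of the CD-audio PCM figures quoted earlier (44.1 kHz, 16-bit, stereo), the following C sketch computes the raw, uncompressed data rate. The function name pcm_bytes_per_second is made up for illustration and is not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): raw PCM data rate in bytes
   per second = sample rate x bytes per sample x number of channels. */
uint64_t pcm_bytes_per_second(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels)
{
    assert(bits_per_sample % 8 == 0);   /* whole bytes per sample only */
    return (uint64_t)sample_rate_hz * (bits_per_sample / 8) * channels;
}
```

For CD audio, pcm_bytes_per_second(44100, 16, 2) is 44100 x 2 x 2 = 176,400 bytes per second, so one minute of uncompressed audio takes 10,584,000 bytes. That size is what an audio codec trades against quality.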
Tradeoffs
  compression speed
  compressed data size
  quality (data loss)
Areas of study
  information theory
  rate-distortion theory
Popular algorithms (all lossless)
  Lempel-Ziv (LZ)
  LZW (fast decompression)
  LZR (LZ-Renau) (ZIP files)
  LZW (Lempel-Ziv-Welch) (for GIF files)

Huffman compression
  The bit string representing any symbol is never a prefix of the bit string representing any other symbol
    i.e., if 010 is the code for a symbol, then no other symbol's code starts with 010
  Frequency table must be known, computable or included
  Lossless – every character gets encoded
Arithmetic encoding
  Probabilistic algorithm
  Slightly superior to Huffman, but often patented!

Compute a model of the data
  e.g., a Huffman tree based on probability
  Probability determined from the input file
Map the data according to the model
  Read 1 symbol
  Search the model (tree) for that symbol
  If it’s a Huffman tree, going left=0, right=1
  Append each 0 or 1 to the code string
  When the symbol is found, you are done
Note: frequent symbols get codes shorter than 8 bits (thus compression); rare symbols may get codes of 8 bits or more

A simple Huffman-like code (w/o probability basis)
  A 110
  B 010
  C 0001
  E 10
  R 011
  S 111
  T 001
“Bearcats” (8 bytes of text) is encoded as 3 bytes:
  010101100110001110001111
  123456781234567812345678   (bit position within each byte)

Static
  Read all the data
  Compute frequencies
  Re-read the data & encode
Dynamic
  Build a simple basic model
  Modify the model as more data is processed

Three parts:
  24-byte header with magic # (defines codec)
  variable-length annotation block
  contiguous segment of audio data
Storage methodology
  network (big-endian) byte order
  multi-byte audio data may require byte reversal before the arithmetic unit of certain (little-endian) processors can operate on it
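The byte-order point above can be sketched in C. Assembling each 32-bit field from its individual bytes avoids any explicit byte reversal and gives the same result on big- and little-endian processors. The function names read_be32 and looks_like_snd are hypothetical; the magic value 0x2e736e64 is the ASCII string “.snd”.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit value from 4 bytes stored in network (big-endian)
   order. It never reinterprets memory, it only shifts and ORs bytes,
   so the host processor's own byte order does not matter. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical check: is the buffer long enough to hold the 24-byte
   header, and does it start with the ".snd" magic number? */
int looks_like_snd(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 24 && read_be32(buf) == 0x2e736e64u;
}
```

The remaining header fields can be read the same way, one read_be32 call per 4-byte field.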
Following is from: #include <multimedia/libaudio.h>

typedef unsigned long u_32;   // unsigned 32-bit integer
typedef struct {
    u_32 magic;         // the "magic number"
    u_32 hdr_size;      // byte offset to start of data
    u_32 data_size;     // length (optional)
    u_32 encoding;      // data encoding enumeration
    u_32 sample_rate;   // samples per second
    u_32 channels;      // # of interleaved channels
} Audio_filehdr;

#define AUDIO_FILE_MAGIC ((u_32)0x2e736e64)    /* ".snd" */

#define AUDIO_FILE_ENCODING_MULAW_8      (1)   /* 8-bit ISDN u-law */
#define AUDIO_FILE_ENCODING_LINEAR_8     (2)   /* 8-bit linear PCM */
#define AUDIO_FILE_ENCODING_LINEAR_16    (3)   /* 16-bit linear PCM */
#define AUDIO_FILE_ENCODING_LINEAR_32    (5)   /* 32-bit linear PCM */
#define AUDIO_FILE_ENCODING_FLOAT        (6)   /* 32-bit IEEE floating point */
#define AUDIO_FILE_ENCODING_DOUBLE       (7)   /* 64-bit IEEE floating point */
#define AUDIO_FILE_ENCODING_ADPCM_G721   (23)  /* 4-bit CCITT g.721 ADPCM */
#define AUDIO_FILE_ENCODING_ADPCM_G723_3 (25)  /* CCITT g.723 3-bit ADPCM */
#define AUDIO_FILE_ENCODING_ALAW_8       (27)  /* 8-bit ISDN A-law */

“Linear” values are SIGNED ints.
Floats are signed, zero-centered, normalized to ( -1.0 <= x <= 1.0 ).

ZIP local file header
  Purpose              Size in bytes
  Signature header     4
  Required version     2
  GP flags             2
  Method               2
  Last mod time        2
  Last mod date        2
  CRC-32               4
  Compressed size      4
  Uncompressed size    4
  Filename length      2
  Extra field length   2
  File name            variable
  Extra                variable
Ref: http://livedocs.adobe.com/flex/3/html/help.html?content=ByteArrays_3.html

http://www.pkware.com/documents/casestudies/APPNOTE.TXT
http://www.fileinfo.com/filetypes/compressed

Constant bit rate (CBR)
  The rate at which a codec's output data should be consumed is constant
  Max bit-rate matters, not the average
  Uses all available bandwidth
  Not good for storage (lossy)
Variable bit rate (VBR)
  Quantity of data per time unit (for output) varies
  Better quality for audio and video
  Slower to encode
  Supported by most portable devices post 2006
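The Huffman-like code table from the earlier “Bearcats” slide can be exercised with a short C sketch. It only maps characters to their bit strings, written as '0'/'1' characters for readability. Packing the bits into bytes, building the table from frequencies, and decoding are left out. The names code_for and encode_bits are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Code table copied from the slide. It is prefix-free but, as the slide
   says, it is not derived from real symbol frequencies. */
static const char *code_for(char ch)
{
    switch (toupper((unsigned char)ch)) {
    case 'A': return "110";
    case 'B': return "010";
    case 'C': return "0001";
    case 'E': return "10";
    case 'R': return "011";
    case 'S': return "111";
    case 'T': return "001";
    default:  return NULL;          /* symbol not in the table */
    }
}

/* Hypothetical helper: append the code of every character of `text` to
   `out` as '0'/'1' characters. Returns the number of bits written, or
   -1 if a symbol has no code or `out` is too small. */
int encode_bits(const char *text, char *out, size_t out_size)
{
    size_t used = 0;
    assert(text != NULL && out != NULL && out_size > 0);
    for (; *text != '\0'; ++text) {
        const char *code = code_for(*text);
        if (code == NULL)
            return -1;
        size_t n = strlen(code);
        if (used + n + 1 > out_size)
            return -1;
        memcpy(out + used, code, n);
        used += n;
    }
    out[used] = '\0';
    return (int)used;
}
```

Encoding "Bearcats" this way yields the 24-bit string 010101100110001110001111, which fits in 3 bytes instead of the original 8.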
Write a single program with 2 parameters:
  When parameter 1 is a 1:
    Open a test file (e.g., “xyz.txt”)
    Compress it using simple Huffman compression, using the frequency table from my FTP site
    Output the compressed file (e.g., “xyz.enc”) to the SAME folder as the input
  When parameter 1 is a 2:
    Open the compressed file “xyz.enc”
    Decompress the input file
    Output file: “xyz.dec”, so it can be compared to “xyz.txt”
  Parameter 2 is the full input file path & name (i.e., file names MUST NOT be hard-coded)
  Remember: a code can be a single bit!

Modeling

A container is a FILE format, NOT a code scheme
  E.g., AVI is a container format
  Format for storage
  Independent of encoding or content
  Others: Ogg, ASF, QuickTime, RealMedia, Matroska, DivX, and MP4.

A pixel is 3 color-dots (squares) in a collection of dots (squares) making up the picture.
  Each color-dot is for one of the three primary colors, RGB.
  Minimum of 1 byte per color (depending on bit-depth, which varies with camera), thus at least 3 bytes per pixel.
  A 10MP camera (30 million color-dots) needs 30M bytes
  This gets compressed by the camera (JPEG format)
  Final file size = 30 million bytes divided by the compression ratio.
  An 8:1 compression ratio would reduce that 30 million bytes to 3.75 megabytes (as done on a 4 MP camera)
  A RAW file has NO compression

Basic mechanisms
  Drawing
    Line
    Brush
  Modeling
    Modeling/rendering programs
    OpenGL
  Photography
    Film
    Digital

Graphic Model
  data structure
  nodes represent random variables
  Pairs of nodes connected by arcs correspond to variables that are not independent
Graphic modeling
  using a Graphic Model to simulate a real-world object (this is not a formal definition)
Rendering
  generating a 2D image from a 3D graphic model
Geometric primitives – points, line segments and polygons
  Described by vertices
Control points – special points attached to a node on a Bézier curve
  Alter the shape and angle of the adjacent curve segment
Evaluators – functions that interpolate a set of control points
  Produce a new set of control points

RIB
  A standardized interface
  Created by modeling programs
  Input to rendering programs
  A plaintext file

OpenGL 3.0 (2.0 + deprecated functions)
OpenGL 3.1
  only newer graphics cards
  Many <= 3.0 functions deprecated
  Many actions now done via shaders
  1st create a context (like the chicken & the egg)
    create an old (e.g., 3.0) context
    activate it
    create the new context
    deactivate the old context
  Old context needed to create new one

[Pipeline diagram. Stages: Vertex Data, Vertex Processor, Fragment Processor, Frame Buffer, Pixel Data]
Per-vertex operations:
  • Eye-space coordinates
  • Colors
  • Texture coordinates
  • Fog coordinates
  • Point size

[Diagram: fixed-function vertex processing. Labels: Vertex Coordinates, Normal Vector, Color Values, Texture Coordinates, Fog Coordinates, Model View Matrix, Projection Matrix, Texture Matrix, Primitive Setup, Clipping. Vertex Shader replaces these.]

[Diagram: fixed-function fragment processing, from primitive setup. Labels: Bitmaps/Pixel Rectangles, Texture Mapping, Color Summation, Per-pixel Fogging, Fragment Tests, Frame Buffer. Fragment shader replaces these.]

Using shaders & Vertex Array Objects
  glBindVertexArray(my_vao_ID[0]);     // select 1st VAO
  glDrawArrays(GL_TRIANGLES, 0, 3);    // draw 1st object
  note the difference from glBegin(GL_POLYGON)…
Note: the name of the out(put) variable in the vertex shader must be the same as the in(put) variable in the fragment shader, i.e., vertex -> fragment (as in the original pipeline)
Write a program using the OpenGL 2.0 interface:
  Create 3 figures:
    Triangle (2D)
    Rectangle (2D)
    Irregular pentahedron (5-sided, 3D figure) (a pyramid)
      • See: http://en.wikipedia.org/wiki/Pentahedron for examples
      • ANY one vertex pinned at 0,0,0
      • Allowed to be non-equilateral
  Draw all 3 axes (optional, but HIGHLY recommended)
  The triangle is within the bounds of the rectangle (or vice-versa), both objects in the same plane, different colors
  The pyramid must have different colors on all 5 sides
  Create movement controls for the pyramid:
    Cursor keys ←→↑↓ change camera location for world view
    L & R keys rotate the pyramid (CW/CCW) about the Z-axis
    Pinned pyramid vertex STAYS at 0,0,0
    The pyramid rotates, the “camera” position remains fixed

[Figure: two pentahedra, each with vertices labeled 0,0,0 and 0,0,1. The 6-vertex pentahedron is valid for the lab. Still has 5 sides!]

PC Requirements
  OpenGL library – for manipulating the model
    opengl32.lib
    opengl32.dll, glu32.dll (come with Windows XP & 7)
    glu.h and gl.h
  Glut – note the “t” after the glu
    http://www.opengl.org/resources/libraries/glut/glut_downloads.php#windows
    glut32.dll
    glut32.lib
    glut.h (note the “t” here too)
  FreeGlut (newer – Open Source)
    http://freeglut.sourceforge.net/
    freeglut.dll
    freeglut.lib
    glut.h
  glut.h does a #include of gl.h and glu.h

Define window
Define objects
Initialize “callback” functions
Initialize window
Run the gl main loop

Two parts
  1. Modeling, coloring, etc.
     Use standard C code, with calls to OpenGL functions
     Your program runs as subroutines WITHIN the OpenGL MainLoop
  2. Callback functions
     Called by OpenGL, from OUTSIDE your program
       Display
       Reshape
     Purpose of “registering” callbacks
       passes a pointer to your functions
       allows OpenGL to call them

Orthographic
  All views are 2-dimensional
  Not good for games
  ignores the z-axis
Perspective
  See the model from the “camera position” via the “viewport”
  [Diagram: camera, viewing angle Φ, viewport]
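The difference between the two projections above can be sketched in a few lines of C. This is a simplified pinhole model chosen for illustration (camera at the origin looking down the -z axis, image plane at distance d). It is not the matrix OpenGL actually builds, and the type and function names are made up.

```c
#include <assert.h>

typedef struct { double x, y, z; } Point3;
typedef struct { double x, y; } Point2;

/* Orthographic: drop z. Distance from the viewer has no effect. */
Point2 project_ortho(Point3 p)
{
    Point2 q = { p.x, p.y };
    return q;
}

/* Perspective (assumed pinhole model): scale x and y by d / depth, so
   points farther from the camera land closer to the image center. */
Point2 project_perspective(Point3 p, double d)
{
    assert(d > 0.0 && p.z < 0.0);   /* point must be in front of the camera */
    Point2 q = { d * p.x / -p.z, d * p.y / -p.z };
    return q;
}
```

With d = 1, the points (2, 2, -2) and (2, 2, -4) both project orthographically to (2, 2), while the perspective projection gives (1, 1) and (0.5, 0.5). The farther point looks smaller only in the perspective view.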
Movement functions
Shape defining functions (lines, polygons, etc.)
Reshape function (for window re-sizing)
Init function
Display function
Main (includes call to glutMainLoop)
NOTE: gl, glu & glut prefixes on functions. Be careful!

#include <GL/glut.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

float ex, ey, ez, theta;
void leftMotion(int x, int y);
void rightMotion(int x, int y);

GLfloat pvertices[][3] = { };    // the pyramid's vertices go here
// and then their colors
GLfloat pcolors[][3] = { };
// now define the polygon that USES those vertices
// define any ONE face of the pyramid at a time
void pface(int a, int b, int c)  // pface is a name I made up for "pyramid face"
{   // 3 vertices define a face
    glBegin(GL_POLYGON);
    glEnd();
}

void draw_pyramid()   // implement the faces of the figure
{
    glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
}

void draw_square()
{
}

void draw_triangle()
{
}

void draw_axes()
{
    glPushMatrix();
    glBegin(GL_LINES);   // as pairs of points (default action of GL_LINES)
    glEnd();
    glPopMatrix();
}

void dj_display()
{   /* this is the function that draws the graphic object in the pre-created window */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);   /* clear the window */
    glLoadIdentity();
    //        eye,        at,    up
    gluLookAt(ex, ey, ez, 0,0,0, 0,1,0);
    draw_square();
    draw_triangle();
    draw_axes();
    // last in code, means display on top.
    draw_pyramid();   // now draw the pyramid
    glFlush();
    glutSwapBuffers();
}
int main(int argc, char *argv[])
{
    glutInit(&argc, argv);
    // wd, ht, wx, wy (window size and position) must be defined elsewhere
    glutInitWindowSize(wd, ht);
    glutInitWindowPosition(wx, wy);
    glutCreateWindow("DJ's 1st");     // define name on window
    glutSpecialFunc(myspecialkeys);   /* register the call-back functions with Glut */
    glutKeyboardFunc(mykey);
    glutDisplayFunc(dj_display);      // more setup
    glutReshapeFunc(dj_reshaper);
    dj_init();
    // Some texts show init being called AFTER the glutDisplayFunc call,
    // but it doesn't actually CAUSE display action, it just sets up the environment.
    glutMainLoop();   // last (& the only real) executable stmt in program
    return 0;
}

Register your “callback” functions
  glutSpecialFunc (your function name);
  glutKeyboardFunc (your function name);
  display, reshape
Running the program
  dj_init();   // your init routine (if any)
  glutMainLoop();

glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glLoadIdentity();
//         eye_point,  at(x,y,z), up(x,y,z)
gluLookAt (ex, ey, ez, 0,0,0,     0,1,0);
draw_triangle/rectangle/pyramid, etc
draw_axes;
glutSwapBuffers ();
glFlush();

void reshape (int w, int h)
{
    glViewport (0, 0, w, h);
    glMatrixMode (GL_PROJECTION);
    glLoadIdentity ( );
    glOrtho(….);
    glMatrixMode (GL_MODELVIEW);
}

Last drawn object is in “FRONT”
  To get correct depth ordering you MUST request a depth buffer AND enable the depth test
Rotation
  glRotate(θ, x,y,z) rotates about the vector from 0,0,0 to (x,y,z)
  Other rotations
    Require pre-multiplication of matrices with the identity matrix
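To see why a pyramid vertex pinned at 0,0,0 stays put when the figure is rotated about the Z-axis, here is the rotation written out by hand in C. This is only a sketch of the arithmetic, not OpenGL's implementation. To stay free of library dependencies, the hypothetical function rotate_z takes the cosine and sine of the angle as arguments (a caller would normally get them from cos() and sin() in <math.h>).

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;

/* Rotate v about the Z axis, given c = cos(angle) and s = sin(angle).
   A positive angle turns counter-clockwise when the +Z axis points at
   the viewer. z is unchanged, and the origin maps to itself. */
Vec3 rotate_z(Vec3 v, double c, double s)
{
    /* c and s must describe a real angle: c*c + s*s == 1 (within rounding) */
    assert(c * c + s * s > 0.999 && c * c + s * s < 1.001);
    Vec3 r;
    r.x = v.x * c - v.y * s;
    r.y = v.x * s + v.y * c;
    r.z = v.z;
    return r;
}
```

A 90-degree turn (c = 0, s = 1) sends (1, 0, 5) to (0, 1, 5) and leaves (0, 0, 0) where it is. Any axis of rotation that passes through the origin keeps the pinned vertex fixed in the same way.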
Silver halide crystals (AgCl, AgBr, AgI)
  Exposure to light turns them black
  Developer removes the loose (exposed) halide, leaving pure silver
  Fixing bath flushes out unexposed Ag halide
Greatest desire of photographers: a medium with
  High speed (gathers light quickly)
  Low noise (no undesired artifacts)
  High resolution (very detailed images)
    Visible lines per mm (lpm) at a specified contrast level
Notes:
  Larger “clumps” are faster, but limit resolution
  Smaller “clumps” are hard to coat evenly
  Randomness of coating prevents Moiré patterns
  http://en.wikipedia.org/wiki/Moir%C3%A9_pattern

[Diagram: radiation (light) striking a crystal]

35 mm film for comparison
  A 36 x 24 mm frame of ISO 100-speed film contains (≈) the equivalent of 20 million pixels. [1]
  There are NO pixels in film!
Full-frame digital (36 x 24 mm)
  Canon EOS 5D, Nikon D3 (21MP) - $5K [2]
Medium format digital (6 x 4.5 cm)
  Phase One P40+ (39MP) ($22K)
APS-C sized (≈24 x 16 mm)
  Nikon D90 (12.3MP) - $900, Canon Rebel (15MP) $700
  Used in most DSLRs
Other
  SONY Alpha (note: image ratio 4/3)
[1] Langford, Michael. Basic Photography (7th Ed. 2000). Oxford: Focal Press. ISBN 0 240 51592 7.
[2] Prices as of 2009, given for comparison purposes only

Image capture formats
  RAW – 3 separate images (RGB)
    [12 bits/pixel * (4288 * 2848)] / 8 = 18,318,336 bytes per picture
  JPEG – compressed (16x, 8x, 4x)
  Some cameras do BOTH (RAW + JPEG) for each image
Pixel notes:
  Sensor size: 35mm (or larger) vs. APS-C
  Pixel count – any increase → decrease in pixel size (for a fixed sensor size)
  Pixel size – smaller pixels can provide
    Greater resolution (finer lines)
    Lower light sensitivity
    More “noise” (incorrect values)
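The RAW size arithmetic above generalizes to any sensor. The C sketch below reproduces the 12 bits/pixel, 4288 x 2848 figure and then applies one of the JPEG ratios mentioned (8x). The function names are hypothetical, and real JPEG output sizes depend on image content, so the second function is only an estimate.

```c
#include <assert.h>
#include <stdint.h>

/* Uncompressed size in bytes: width x height pixels at bits_per_pixel
   bits each, rounded up to a whole number of bytes. */
uint64_t raw_image_bytes(uint32_t width, uint32_t height,
                         uint32_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    uint64_t bits = (uint64_t)width * height * bits_per_pixel;
    return (bits + 7) / 8;
}

/* Rough size after compressing by an N:1 ratio (e.g. ratio = 8 for 8x). */
uint64_t compressed_bytes(uint64_t raw_bytes, uint32_t ratio)
{
    assert(ratio > 0);
    return raw_bytes / ratio;
}
```

raw_image_bytes(4288, 2848, 12) gives 18,318,336 bytes, matching the slide. At 8x compression that becomes roughly 2,289,792 bytes per picture.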
Photographic images
  Gimp, Photoshop, etc.
Modeling
  OpenGL – for creating and lighting a model
  modeling programs
Rendering
  converting a 3D (possibly wireframe) model to a 2D image, then lighting and (possibly) shading it
Ray tracing
  another way of lighting & shading an image
  very CPU intensive

3 windows:
  Image – contains the actual results
  Tools – operators, such as clone, paint, etc.
  Layers – portions of the final image
    Avoids need for changes to the original
    Stackable
    Allows separation of collections of changes
    Allows complex changes to be applied independently of other changes

Get a digital landscape or cityscape image
Get a picture of yourself
Insert the image of yourself into the ‘scape as a layer
Save the new image as a new JPG file

Elements to manipulate
  Content
  Perspective
  Lighting/shading
  Sequence
  Coloring
Video game images
  Dynamic
Movie images
  Pre-determined

Alice
  Developed at CMU
  Language-free programming environment
  Point & click usage
  Rapid prototyping

List of “world” objects
  Insert into methods
  Modify characteristics (length, width, etc.)
List of world details
  Properties
  Methods (user created)
  Functions (built-in methods)
World window (results)
Method builder

Programmer
  Selects “verbs” from a list
    For
    While
    Etc.
  Applies values for properties
Statement structure pre-defined
  Blocks pre-defined
  Function calls for if… then… else
No need to learn a “language”

Open any of the Alice worlds
Apply the following rules
  1. Display 3 articulated characters (A, B and C)
  2. A flips onto its head and rotates slowly (5 times)
  3. B waves both of its arms slowly (5 times)
  4. C walks around A and B (while 2 and 3 happen)
  5. C changes direction and walks around A and B (while 2 and 3 repeat)
  6. Stop
Timing, size, shape, position are your choice
Hiding a message in plain sight
  Inside another message
  Original is called the “cover”
  Presence of the message is not obvious
  Can be done with text or graphics

Many algorithms for hiding a message
Embedding is easier than detection
Extraction is straightforward if:
  you know there is a hidden message
  you know the algorithm used
Can be very complex to detect & extract

Concept:
  Change the least significant bit of a set of bytes
    1→0 and 0→1
    For all bytes
    Repeat for multiple sets of bytes
Example with grayscale images
  One pixel is 8 bits
  Change the last bit
    Changes that pixel’s shade VERY slightly.
  Repeat for every pixel

Embedding function Emb (using Matlab or Freelab syntax)

c = imread('my_decoy_image.bmp');    % Grayscale cover image
% 'b' is a vector of m bits (secret message)
[height, width] = size(c);           % Image dimensions
k = 1;                               % Counter
for i = 1 : height
    for j = 1 : width
        LSB = mod(c(i, j), 2);
        if k > m || LSB == b(k)      % Message done, or LSB already matches
            s(i, j) = c(i, j);
        else
            s(i, j) = c(i, j) + b(k) - LSB;
        end
        k = k + 1;
    end
end
imwrite(s, 'stego_image.bmp', 'bmp');   % Stego image "s" saved to disk

The return value “c” is an array containing the image data.
  If the file contains a grayscale image, “c” is an M-by-N array.
  If the file contains a truecolor image, “c” is an M-by-N-by-3 array.
  The class of “c” depends on the bits-per-sample of the image data, rounded to the next byte boundary.
  E.g., imread returns 24-bit color data as an array of uint8 data because the sample size for each color component is 8 bits.
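For readers working in C rather than Matlab, the same LSB idea can be sketched on a flat array of 8-bit grayscale pixels taken in scan order. Reading and writing the image file is omitted, and the function names lsb_embed and lsb_extract are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store n_bits message bits (each 0 or 1) in the least significant bits
   of the first n_bits pixels. All other pixels are left untouched. */
void lsb_embed(uint8_t *pixels, size_t n_pixels,
               const uint8_t *bits, size_t n_bits)
{
    assert(n_bits <= n_pixels);     /* the cover must be large enough */
    for (size_t k = 0; k < n_bits; ++k)
        pixels[k] = (uint8_t)((pixels[k] & 0xFEu) | (bits[k] & 1u));
}

/* Recover n_bits message bits from the pixel LSBs. */
void lsb_extract(const uint8_t *pixels, size_t n_pixels,
                 uint8_t *bits, size_t n_bits)
{
    assert(n_bits <= n_pixels);
    for (size_t k = 0; k < n_bits; ++k)
        bits[k] = pixels[k] & 1u;
}
```

Clearing the low bit and OR-ing in the message bit has the same effect as the slide's c + b - LSB update: a pixel whose LSB already matches is unchanged, and any other pixel changes by exactly 1.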
Extraction function Ext (Matlab syntax)

s = imread('stego_image.bmp');    % Grayscale stego image
[height, width] = size(s);        % Image dimensions
k = 1;
for i = 1 : height
    for j = 1 : width
        if k <= m                 % Only the first m pixels carry message bits
            b(k) = mod(s(i, j), 2);
            k = k + 1;
        end
    end
end
% b is the extracted secret message as a bit string

LSBflip(x) = x + 1 - 2(x mod 2)
  LSBflip is an involution (its own inverse), i.e., LSBflip(LSBflip(x)) = x for all x
  LSB flipping induces a permutation on {0, …, 255}:
    0 ↔ 1, 2 ↔ 3, 4 ↔ 5, …, 254 ↔ 255
  LSB flipping is “asymmetrical” (e.g., 3 may change to 2 but never to 4)
  | LSBflip(x) - x | = 1 for all x (embedding distortion is 1 per pixel)
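The properties listed for LSB flipping are easy to check exhaustively, since there are only 256 pixel values. The following C sketch implements the formula as given and asserts each property. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* LSBflip(x) = x + 1 - 2(x mod 2): even values go up by 1, odd values
   go down by 1, so the result always stays within 0..255. */
uint8_t lsb_flip(uint8_t x)
{
    return (uint8_t)(x + 1 - 2 * (x % 2));
}

/* Check the listed properties for every 8-bit value. */
void check_lsb_flip_properties(void)
{
    for (int x = 0; x <= 255; ++x) {
        int y = lsb_flip((uint8_t)x);
        assert(lsb_flip((uint8_t)y) == x);   /* flipping twice restores x */
        assert(y - x == 1 || x - y == 1);    /* distortion is exactly 1 */
        assert((x ^ y) == 1);                /* only the lowest bit differs */
    }
}
```

Because only the lowest bit changes, 3 can become 2 but never 4, which is the asymmetry noted above.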