CS338-lectures

Multimedia
Beta-version
http://www.youtube.com/watch_popup?v=jEjUAnPc2VA#t=20
 Required background
 C , C++ or Java
 CS240 (Data Structures) or CS212 (for Engineers)
 This is NOT a graphics course (see CS460/560)
 Labs
 Data compression
 Elements of graphics in OpenGL
 Photo manipulation
 GIMP
 Steganography
 Creating & Rendering models
© D.J. Foreman 2009
• Compression
  • Encoding/Decoding
  • Codecs
  • Huffman encoding example (lab!)
• Working with Images
  • Image creation
    • Photography
    • Drawing
    • RIB (Renderman Interface Bytestream)
    • Modeling
      • OpenGL (lab!)
      • Makehuman, 3DCanvas, etc. (lab!)
  • Image manipulation
    • Gimp, Photoshop, etc. (lab!)
  • Stereoscopic & other 3D
  • Rendering (lab!)
    • Bitmaps
    • Ray tracing
• Animation
  • Hand drawn
  • Program generated
    • Stop-motion
    • 3D CGI
    • Alice (lab!)
    • DirectX & OpenGL
  • Hardware acceleration
  • RIB revisited
  • Shaders
• Steganography (lab!)
• Additional topics (time permitting)
  • Augmented reality
  • Geospatial data systems
 Modelers and environments
 Ayam – ortho & 3D
 3DCanvas – 3D
 Blender – ortho & 3D
 Makehuman - 3D
 Renderers
 Aqsis
 Pixie – no GUI, file input
• Codec: a device or program for encoding and/or decoding a digital data stream or signal
• Encoding
  • A->D for storage or transmission
• Decoding
  • D->A for playback
• Lossy vs. lossless
• Raw uncompressed Pulse Code Modulation (PCM) audio: a digital representation of an analog signal in which the magnitude of the signal is sampled at uniform intervals
  • E.g., 44.1 kHz, 16-bit stereo (as represented on an audio CD or in a .wav or .aiff file) is a standard across multiple platforms.
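The PCM figures above imply a fixed data rate. The following is a minimal sketch of that arithmetic; the function names are made up for illustration and it assumes whole-byte samples.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second of uncompressed PCM audio.
   Assumption: samples occupy a whole number of bytes (8, 16, 24, 32 bits). */
uint32_t pcm_bytes_per_second(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels)
{
    assert(bits_per_sample % 8 == 0);
    return sample_rate_hz * (bits_per_sample / 8) * channels;
}

/* Total bytes needed for a recording of the given length. */
uint64_t pcm_total_bytes(uint32_t bytes_per_second, uint32_t seconds)
{
    return (uint64_t)bytes_per_second * seconds;
}
```

For CD audio, pcm_bytes_per_second(44100, 16, 2) gives 176400 bytes per second, so one minute of sound needs 10,584,000 bytes before any compression.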
• A codec may emphasize different aspects of the data
  • Video: motion vs. color
  • Audio: latency (cell phones) vs. high fidelity (music)
  • Data size: transmission speed or storage size vs. data loss
• Forensic Computing: A Practitioner's Guide by T. Sammes & B. Jenkinson (Springer, 2000), ISBN: 9781852332990.
• http://www.garykessler.net/library/file_sigs.html
 Tradeoffs
 compression speed
 compressed data size
 quality (data loss)
 Areas of study
 information theory
 rate-distortion theory
 Popular algorithms (all lossless)
 Lempel-Ziv (LZ)
 DEFLATE (LZ77 + Huffman coding; fast decompression; the usual method in ZIP files)
 LZR (LZ-Renau) (ZIP files)
 LZW (Lempel-Ziv-Welch) (for GIF files)
• Huffman compression
  • The bit string representing any symbol is never a prefix of the bit string representing any other symbol; i.e., if 010 is the code for a symbol, then no other symbol's code starts with 010.
  • Frequency table must be known, computable, or included
  • Lossless – every character gets encoded
• Arithmetic encoding
  • Probabilistic algorithm
  • Slightly superior to Huffman, but often patented!
• Compute a model of the data
  • e.g., a Huffman tree based on probability
  • Probability determined from input file
• Map the data according to the model
  • Read 1 symbol
  • Search the model (tree) for that symbol
  • If it's a Huffman tree, going left=0, right=1
  • Append each 0 or 1 to the code string
  • When symbol is found, you are done
• Note: frequent symbols get codes shorter than 8 bits (thus compression); rare symbols may get longer codes
A simple Huffman-like code (w/o probability basis)

Symbol  Code
A       110
B       010
C       0001
E       10
R       011
S       111
T       001

[Figure: code tree with branches labeled 0 and 1]

B e a r c a t s (8 bytes of text) is encoded as 3 bytes:
010101100110001110001111
123456781234567812345678
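The table above can be exercised directly. The following is a rough sketch, not the lab solution: it stores the bits as '0'/'1' characters instead of packing them into bytes, treats the text as upper case, and uses made-up names (code_entry, encode_text, decode_bits).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { char symbol; const char *code; } code_entry;

/* The code table from the slide. */
static const code_entry CODE_TABLE[] = {
    {'A', "110"}, {'B', "010"}, {'C', "0001"}, {'E', "10"},
    {'R', "011"}, {'S', "111"}, {'T', "001"},
};
static const size_t CODE_COUNT = sizeof CODE_TABLE / sizeof CODE_TABLE[0];

/* Encode text as a string of '0'/'1' characters.
   Returns 0 on success, -1 on an unknown symbol or a too-small buffer. */
int encode_text(const char *text, char *out, size_t out_size)
{
    size_t used = 0;
    assert(text != NULL && out != NULL);
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (; *text != '\0'; text++) {
        const char *code = NULL;
        for (size_t i = 0; i < CODE_COUNT; i++)
            if (CODE_TABLE[i].symbol == *text) code = CODE_TABLE[i].code;
        if (code == NULL) return -1;
        size_t n = strlen(code);
        if (used + n >= out_size) return -1;
        memcpy(out + used, code, n + 1);   /* copies the terminator too */
        used += n;
    }
    return 0;
}

/* Decode: because no code is a prefix of another, at most one
   table entry can match the front of the remaining bits. */
int decode_bits(const char *bits, char *out, size_t out_size)
{
    size_t used = 0;
    assert(bits != NULL && out != NULL);
    while (*bits != '\0') {
        const code_entry *hit = NULL;
        for (size_t i = 0; i < CODE_COUNT; i++) {
            size_t n = strlen(CODE_TABLE[i].code);
            if (strncmp(bits, CODE_TABLE[i].code, n) == 0) hit = &CODE_TABLE[i];
        }
        if (hit == NULL || used + 1 >= out_size) return -1;
        out[used++] = hit->symbol;
        bits += strlen(hit->code);
    }
    if (used >= out_size) return -1;
    out[used] = '\0';
    return 0;
}
```

Encoding "BEARCATS" yields the 24-bit string shown on the slide, and decoding that string returns the original text. A real encoder would pack those 24 bits into 3 bytes.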
 Static
 Read all the data
 Compute frequencies
 Re-read the data & encode
 Dynamic
 Build simple basic model
 Modify model as more data is processed
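The first two steps of the static approach (read all the data, compute frequencies) can be sketched as below. This is only an illustration with a made-up function name; it counts byte values and leaves building the tree and the second, encoding pass to the reader.

```c
#include <assert.h>
#include <stddef.h>

/* First pass of a static model: count how often each byte value occurs.
   freq must have room for 256 counters. */
void count_frequencies(const unsigned char *data, size_t len,
                       unsigned long freq[256])
{
    assert(freq != NULL);
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < 256; i++)
        freq[i] = 0;
    for (size_t i = 0; i < len; i++)
        freq[data[i]]++;
}
```

A dynamic model would instead start from some simple initial counts and update them after each symbol is processed, so no first pass over the data is needed.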
• Three parts (Sun/NeXT ".snd" audio file):
  • 24-byte header with magic # (defines codec)
  • variable-length annotation block
  • contiguous segment of audio data
• Storage methodology
  • network (big-endian) byte order
  • multi-byte audio data may require byte reversal before the arithmetic unit of certain (little-endian) processors can operate on it
Following is from: #include <multimedia/libaudio.h>
typedef unsigned long u_32;    // unsigned 32-bit integer
typedef struct {
    u_32 magic;        // the "magic number"
    u_32 hdr_size;     // byte offset to start of data
    u_32 data_size;    // length (optional)
    u_32 encoding;     // data encoding enumeration
    u_32 sample_rate;  // samples per second
    u_32 channels;     // # of interleaved channels
} Audio_filehdr;
#define AUDIO_FILE_MAGIC ((u_32)0x2e736e64) /* ".snd" */
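Because the header is stored in big-endian order, a portable reader assembles each field byte by byte rather than casting the buffer to a struct. The sketch below assumes the six 32-bit fields listed above; au_header, read_be32 and au_parse_header are hypothetical names, not part of libaudio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AU_MAGIC 0x2e736e64u   /* ".snd" */

/* Hypothetical mirror of the six header fields, using fixed-width types. */
typedef struct {
    uint32_t magic, hdr_size, data_size, encoding, sample_rate, channels;
} au_header;

/* Assemble a 32-bit value from 4 bytes stored in network (big-endian) order.
   Works the same on big- and little-endian processors. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the 24-byte header from a raw buffer.
   Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int au_parse_header(const unsigned char *buf, size_t len, au_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 24) return -1;
    out->magic       = read_be32(buf + 0);
    out->hdr_size    = read_be32(buf + 4);
    out->data_size   = read_be32(buf + 8);
    out->encoding    = read_be32(buf + 12);
    out->sample_rate = read_be32(buf + 16);
    out->channels    = read_be32(buf + 20);
    return out->magic == AU_MAGIC ? 0 : -1;
}
```

The shift-and-or form avoids any explicit byte-swapping step, which is one way to handle the "byte reversal" issue without knowing the host's byte order.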
#define AUDIO_FILE_ENCODING_MULAW_8      (1)   /* 8-bit ISDN u-law */
#define AUDIO_FILE_ENCODING_LINEAR_8     (2)   /* 8-bit linear PCM */
#define AUDIO_FILE_ENCODING_LINEAR_16    (3)   /* 16-bit linear PCM */
#define AUDIO_FILE_ENCODING_LINEAR_32    (5)   /* 32-bit linear PCM */
#define AUDIO_FILE_ENCODING_FLOAT        (6)   /* 32-bit IEEE floating point */
#define AUDIO_FILE_ENCODING_DOUBLE       (7)   /* 64-bit IEEE floating point */
#define AUDIO_FILE_ENCODING_ADPCM_G721   (23)  /* 4-bit CCITT g.721 ADPCM */
#define AUDIO_FILE_ENCODING_ADPCM_G723_3 (25)  /* CCITT g.723 3-bit ADPCM */
#define AUDIO_FILE_ENCODING_ALAW_8       (27)  /* 8-bit ISDN A-law */
"Linear" values are SIGNED ints.
Floats are signed, zero-centered, normalized to ( -1.0 <= x <= 1.0 ).
Purpose              Size in bytes
Signature header     4
Required version     2
GP flags             2
Method               2
Last mod time        2
Last mod date        2
CRC-32               4
Compressed size      4
Uncompressed size    4
Filename length      2
Extra field length   2
File name            variable
Extra                variable
Ref: http://livedocs.adobe.com/flex/3/html/help.html?content=ByteArrays_3.html
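The field sizes in this table add up to 30 fixed bytes, which matches the local file header of a ZIP archive. The sketch below decodes those fields under two assumptions not stated on the slide: that the table does describe the ZIP local header, and that multi-byte fields are little-endian with signature 0x04034b50 ("PK\3\4"), as in the ZIP format. The type and function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZIP_LOCAL_SIGNATURE  0x04034b50u  /* bytes 'P' 'K' 3 4, little-endian */
#define ZIP_LOCAL_FIXED_SIZE 30u          /* sum of the fixed-size fields */

typedef struct {
    uint32_t signature;
    uint16_t version, flags, method, mod_time, mod_date;
    uint32_t crc32, compressed_size, uncompressed_size;
    uint16_t name_len, extra_len;
} zip_local_header;

static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the fixed part of the header; the file name (name_len bytes) and
   the extra field (extra_len bytes) follow it in the file.
   Returns 0 on success, -1 on a short buffer or wrong signature. */
int zip_parse_local_header(const unsigned char *buf, size_t len,
                           zip_local_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < ZIP_LOCAL_FIXED_SIZE) return -1;
    h->signature         = read_le32(buf + 0);
    h->version           = read_le16(buf + 4);
    h->flags             = read_le16(buf + 6);
    h->method            = read_le16(buf + 8);
    h->mod_time          = read_le16(buf + 10);
    h->mod_date          = read_le16(buf + 12);
    h->crc32             = read_le32(buf + 14);
    h->compressed_size   = read_le32(buf + 18);
    h->uncompressed_size = read_le32(buf + 22);
    h->name_len          = read_le16(buf + 26);
    h->extra_len         = read_le16(buf + 28);
    return h->signature == ZIP_LOCAL_SIGNATURE ? 0 : -1;
}
```

Note the contrast with the audio header earlier in the deck: that format is big-endian, this one is little-endian, so the byte-assembly order is reversed.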
• http://www.pkware.com/documents/casestudies/APPNOTE.TXT
• http://www.fileinfo.com/filetypes/compressed
• Constant (CBR)
  • Rate at which a codec's output data should be consumed is constant
  • Max bit-rate matters, not the average
  • Uses all available bandwidth
  • Not good for storage (lossy)
• Variable (VBR)
  • Quantity of data per time unit (for output) varies
  • Better quality for audio and video
  • Slower to encode
  • Supported by most portable devices post 2006
Write a single program with 2 parameters:
• When parameter 1 is a 1:
  • Open a test file (e.g., "xyz.txt")
  • Compress it using simple Huffman compression, using the frequency table from my FTP site
  • Output the compressed file (e.g., "xyz.enc") to the SAME folder as the input
• When parameter 1 is a 2:
  • Open the compressed file "xyz.enc"
  • Decompress the input file
  • Output file: "xyz.dec", so it can be compared to "xyz.txt"
• Parameter 2 is the full input file path & name (i.e., file names MUST NOT be hard-coded)
Remember: a code can be a single bit!
Modeling
• A container is a FILE format, NOT a coding scheme
  • E.g., AVI is a container format
• Format for storage
• Independent of encoding or content
• Others:
  • Ogg, ASF, QuickTime, RealMedia, Matroska, DivX, and MP4.
• A pixel is 3 color-dots (squares) in a collection of dots (squares) making up the picture.
• Each color-dot is for one of the three primary colors, RGB.
• Minimum of 1 byte per color (depending on bit depth, which varies with camera), thus at least 3 bytes per pixel.
• A 10 MP camera (30 million color-dots) needs 30 million bytes.
• This gets compressed by the camera (JPEG format).
• Final file size = 30 million bytes divided by the compression ratio.
• An 8:1 compression ratio would reduce that 30 million bytes to 3.75 million bytes (as done on a 4 MP camera).
• A RAW file has NO compression.
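The size arithmetic on this slide can be checked with a few lines of code. This is a minimal sketch with made-up function names; it assumes 3 bytes per pixel and a single overall compression ratio, as the slide does.

```c
#include <assert.h>
#include <stdint.h>

/* Uncompressed image size: one or more bytes for each of the pixel's colors. */
uint64_t raw_image_bytes(uint64_t pixels, unsigned bytes_per_pixel)
{
    return pixels * bytes_per_pixel;
}

/* File size after compression, for a ratio such as 8.0 (meaning 8:1). */
double compressed_bytes(uint64_t raw_bytes, double ratio)
{
    assert(ratio > 0.0);
    return (double)raw_bytes / ratio;
}
```

For a 10 MP image, raw_image_bytes(10000000, 3) is 30,000,000 bytes, and compressed_bytes(30000000, 8.0) is 3,750,000 bytes. Real JPEG ratios vary with image content and the quality setting.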
• Basic mechanisms
  • Drawing
    • Line
    • Brush
  • Modeling
    • Modeling/rendering programs
    • OpenGL
  • Photography
    • Film
    • Digital
• Graphic Model
  • data structure
  • nodes represent random variables
  • pairs of nodes connected by arcs correspond to variables that are not independent
• Graphic modeling
  • using a Graphic Model to simulate a real-world object (this is not a formal definition)
• Rendering
  • generating a 2D image from a 3D graphic model
 Geometric primitives –
 Points, line segments and polygons
 Described by vertices
 Control points –
 Special points attached to a node on a Bézier curve
 Alter the shape and angle of adjacent curve segment
 Evaluators –
 Functions that interpolate a set of control points
 Produce a new set of control points
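As one illustration of interpolating control points, the sketch below evaluates a cubic Bézier curve by repeated linear interpolation (de Casteljau's construction). It is not OpenGL's evaluator API; point2, lerp2 and bezier3 are made-up names, and the curve is 2D for brevity.

```c
#include <assert.h>

typedef struct { double x, y; } point2;

/* Linear interpolation between two points, t in [0, 1]. */
static point2 lerp2(point2 a, point2 b, double t)
{
    point2 r = { a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t };
    return r;
}

/* Point on the cubic Bezier curve defined by 4 control points.
   Each round of interpolation produces a new, smaller set of points. */
point2 bezier3(const point2 cp[4], double t)
{
    assert(t >= 0.0 && t <= 1.0);
    point2 a = lerp2(cp[0], cp[1], t);   /* 4 points -> 3 */
    point2 b = lerp2(cp[1], cp[2], t);
    point2 c = lerp2(cp[2], cp[3], t);
    point2 d = lerp2(a, b, t);           /* 3 points -> 2 */
    point2 e = lerp2(b, c, t);
    return lerp2(d, e, t);               /* 2 points -> 1 */
}
```

The curve starts at cp[0] and ends at cp[3]; the two middle control points pull the curve toward themselves without lying on it.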
 A standardized interface
 Created by modeling programs
 Input to rendering programs
 A plaintext file
• OpenGL 3.0 (2.0 + deprecated functions)
• OpenGL 3.1: only on newer graphics cards
  • Many <= 3.0 functions deprecated
  • Many actions now done via shaders
• First create a context (like the chicken & the egg)
  • create an old (e.g., 3.0) context
  • activate it
  • create the new context
  • deactivate the old context
• The old context is needed to create the new one
[Figure: pipeline overview. Vertex Data → Vertex Processor → Fragment Processor → Frame Buffer; a Pixel Data box also appears.]
Per-vertex operations:
•Eye-space coordinates
•Colors
•Texture coordinates
•Fog coordinates
•Point size
[Figure: fixed-function vertex stages that the vertex shader replaces. Inputs: Vertex Coordinates, Normal Vector, Color Values, Texture Coordinates, Fog Coordinates. Stages: Model View Matrix (applied to vertices and to normals), Texture Matrix, Projection Matrix, Primitive Setup, Clipping.]
[Figure: fragment stages. Inputs: output from primitive setup, Bitmaps/Pixel Rectangles. Stages: Texture Mapping, Color Summation, Per-pixel Fogging, Fragment Tests, Frame Buffer. A callout reads "fragment shader replaces these".]
• Using shaders & Vertex Array Objects
glBindVertexArray(my_vao_ID[0]);    // select 1st VAO
glDrawArrays(GL_TRIANGLES, 0, 3);   // draw 1st object
• Note the difference from glBegin(GL_POLYGON)….
• Note: the name of the out(put) variable in the vertex shader must be the same as the in(put) variable in the fragment shader.
  • i.e., vertex -> fragment (as in original pipeline)
 Write a program using the OpenGL 2.0 interface:
 Create 3 figures:
 Triangle (2D)
 Rectangle (2D)
 Irregular pentahedron (5-sided, 3D figure) (a pyramid)
• See: http://en.wikipedia.org/wiki/Pentahedron for examples
• ANY one Vertex pinned at 0,0,0
• Allowed to be non-equilateral
 Draw all 3 axes (optional, but HIGHLY recommended)
 The triangle is within the bounds of the rectangle (or vice-versa), both objects in same plane, different colors
 The pyramid must have different colors on all 5 sides
 Create movement controls for the pyramid:
 Cursor keys ←→↑↓ change camera location for world view
 L & R keys rotate pyramid (CW/CCW) about Z-axis
 Pinned pyramid Vertex STAYS at 0,0,0
 The pyramid rotates, the “camera” position remains fixed
[Figure: two example pentahedra with vertices labeled 0,0,0 and 0,0,1. A 6-vertex pentahedron is valid for the lab: it still has 5 sides!]
• PC Requirements
  • OpenGL library - for manipulating the model
    • opengl32.lib (comes with Windows XP & 7)
    • opengl32.dll (comes with Windows XP & 7)
    • glu32.dll (comes with Windows XP & 7)
    • glu.h and gl.h
 Glut - note the “t” after the glu
http://www.opengl.org/resources/libraries/glut/glut_downloads.php#windows
 glut32.dll
 glut32.lib
 glut.h
”
and here too
 FreeGlut (newer – Open Source)
http://freeglut.sourceforge.net/
 freeglut.dll
 freeglut.lib
 glut.h
 glut.h does a #include of gl.h and glu.h
 Define window
 Define objects
 Initialize “callback” functions
 Initialize window
 Run the gl main loop
• Two parts
  1. Modeling, coloring, etc.
     • Use standard C code, with calls to OpenGL functions
     • Your program runs as subroutines WITHIN the OpenGL MainLoop
  2. Callback functions
     • Called by OpenGL, from OUTSIDE your program
     • Display
     • Reshape
• Purpose of "registering" callbacks
  • passes a pointer to your functions
  • allows OpenGL to call them.
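Registration is ordinary C function-pointer passing. The sketch below imitates the idea without any graphics library: every toy_ name is a hypothetical stand-in for the library side, and my_display / my_reshape play the role of your code.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*display_fn)(void);
typedef void (*reshape_fn)(int w, int h);

/* "Library" side: remembers the pointers it was given. */
static display_fn registered_display = NULL;
static reshape_fn registered_reshape = NULL;

void toy_display_func(display_fn f) { registered_display = f; }
void toy_reshape_func(reshape_fn f) { registered_reshape = f; }

/* One pass of a toy main loop: the library calls back into user code. */
void toy_dispatch_once(int w, int h)
{
    assert(w > 0 && h > 0);
    if (registered_reshape != NULL) registered_reshape(w, h);
    if (registered_display != NULL) registered_display();
}

/* "User" side: the callbacks themselves. */
int frames_drawn = 0;
int last_width = 0, last_height = 0;

void my_display(void) { frames_drawn++; }
void my_reshape(int w, int h) { last_width = w; last_height = h; }
```

After toy_reshape_func(my_reshape) and toy_display_func(my_display), a call to toy_dispatch_once(640, 480) runs both user functions even though the user code never calls them directly. A real main loop repeats this dispatch in response to window events.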
 Orthographic
 All views are 2-dimensional
 Not good for games: it ignores the z-axis (depth)
 Perspective
 See model from “camera position” via “viewport”
[Figure: a camera looking through a viewport, with viewing angle Φ]
 Movement functions
 Shape defining functions (lines, polygons, etc.)
 Reshape function (for window re-sizing)
 Init function
 Display function
 Main (includes call to glutMainLoop)
NOTE: gl, glu & glut prefixes on functions.
Be careful!
#include <GL/glut.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
float ex, ey, ez, theta;
void leftMotion(int x, int y);
void rightMotion(int x, int y);
GLfloat pvertices[][3] = { /* fill in the pyramid's vertices */ };
// and then their colors
GLfloat pcolors[][3] = { /* fill in the colors */ };
// now define the polygon that USES those vertices
// define any ONE face of the pyramid at a time
void pface(int a, int b, int c) // pface is a name I made up for "pyramid face"
{   // 3 vertices define a face
    glBegin(GL_POLYGON);
    /* fill in: the calls that emit vertices a, b and c */
    glEnd();
}
void draw_pyramid() // implement the faces of the figure
{
    glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
}
void draw_square()
{
}
void draw_triangle()
{
}
void draw_axes()
{
    glPushMatrix();
    glBegin(GL_LINES); // as pairs of points (default action of GL_LINES)
    glEnd();
    glPopMatrix();
}
void dj_display()
{   /* this is the function that draws the graphic object in the pre-created window */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);   /* clear the window */
    glLoadIdentity();
    //        eye,        at,      up
    gluLookAt(ex, ey, ez, 0, 0, 0, 0, 1, 0);
    draw_square();
    draw_triangle();
    draw_axes();      // last in code, means display on top.
    draw_pyramid();   // now draw the pyramid
    glFlush();
    glutSwapBuffers();
}
int main(int argc, char *argv[])
{
    glutInit(&argc, argv);
    // request double buffering and a depth buffer (used by glutSwapBuffers and the depth test)
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
    glutInitWindowSize(wd, ht);
    glutInitWindowPosition(wx, wy);
    glutCreateWindow("DJ's 1st");    // define name on window
    /* register the call-back functions with Glut */
    glutSpecialFunc(myspecialkeys);
    glutKeyboardFunc(mykey);
    glutDisplayFunc(dj_display);     // more setup
    glutReshapeFunc(dj_reshaper);
    dj_init();
    // Some texts show init being called AFTER the glutDisplayFunc call,
    // but it doesn't actually CAUSE display action, it just sets up the environment.
    glutMainLoop();                  // last (& the only real) executable stmt in program
    return 0;
}
 Register your “callback” functions
glutSpecialFunc (your function name);
glutKeyboardFunc (your function name);
display, reshape
 Running the program
dj_init(); // your init routine (if any)
glutMainLoop();
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glLoadIdentity();
//        eye_point,  at (x,y,z),  up (x,y,z)
gluLookAt(ex, ey, ez, 0, 0, 0,     0, 1, 0);
// draw_triangle(), draw_square(), draw_pyramid(), etc.
draw_axes();
glutSwapBuffers();
glFlush();
void reshape (int w, int h)
{
glViewport (0, 0, w, h);
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
glOrtho(….);
glMatrixMode (GL_MODELVIEW);
}
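• Without depth testing, the last drawn object is in "FRONT"
• For correct depth ordering you MUST request a depth buffer AND enable the depth test
• Rotation
  • glRotatef(θ, x, y, z), with θ in degrees
  • rotates about the axis vector that starts at 0,0,0 and passes through x,y,z
• Other rotations
  • Requires pre-multiplication of matrices with the identity matrix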
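For the special case of the Z axis (x,y,z = 0,0,1), the rotation reduces to a 2D rotation of the x and y coordinates. The sketch below applies that matrix by hand; rotate_about_z is a made-up name, and the caller is assumed to supply cos θ and sin θ (for example from cos() and sin() in <math.h>).

```c
#include <assert.h>
#include <stddef.h>

/* Rotate point `in` counterclockwise about the Z axis.
   cos_t and sin_t are the cosine and sine of the rotation angle. */
void rotate_about_z(double cos_t, double sin_t,
                    const double in[3], double out[3])
{
    assert(in != NULL && out != NULL);
    double x = in[0], y = in[1];
    out[0] = cos_t * x - sin_t * y;
    out[1] = sin_t * x + cos_t * y;
    out[2] = in[2];               /* z is unchanged by a Z-axis rotation */
}
```

With cos θ = 0 and sin θ = 1 (a 90° turn), the point (1, 0, 5) moves to (0, 1, 5). A point at the origin stays at the origin for every angle, which is why a vertex pinned at 0,0,0 does not move when the object rotates about an axis through the origin.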
 Silver halide crystals (AgCl, AgBr, AgI)
 Exposure to light turns them black
 Developer removes the loose (exposed) halide, leaving pure silver
 Fixing bath flushes out unexposed AgHalide
 Greatest desire of photographers: a medium with
 High speed (gathers light quickly)
 Low noise (no undesired artifacts)
 High resolution (very detailed images)
 Visible lines per mm (lpm) at a specified contrast level
 Notes:
 Larger “clumps” are faster, but limit resolution
 Smaller “clumps” are hard to coat evenly
 Randomness of coating prevents Moiré patterns
http://en.wikipedia.org/wiki/Moir%C3%A9_pattern
[Figure: radiation (light) striking a crystal]
• 35 mm film for comparison
  • A 36 x 24 mm frame of ISO 100-speed film contains (≈) the equivalent of 20 million pixels [1]. There are NO pixels in film!
• Full-frame digital (36 x 24 mm)
  • Canon EOS 5D, Nikon D3 (21MP) - $5K [2]
• Medium format digital (6 x 4.5 cm)
  • Phase One P40+ (39MP) - $22K
• APS-C sized (≈24 x 16 mm)
  • Nikon D90 (12.3MP) - $900, Canon Rebel (15MP) - $700
  • Used in most DSLRs
• Other
  • SONY Alpha (note: image ratio 4/3)
[1] Langford, Michael. Basic Photography (7th ed., 2000). Oxford: Focal Press. ISBN 0 240 51592 7.
[2] Prices as of 2009, given for comparison purposes only.
 Image capture formats
 RAW – 3 separate images (RGB)
 [12 bits/pixel * (4288 * 2848)]/8=18,318,336 bytes per picture
 JPEG – compressed (16x, 8x, 4x)
 Some cameras do BOTH (RAW + JPEG) for each image
 Pixel notes:
 Sensor size 35mm (or larger) vs. APS-C
 Pixel count – for a fixed sensor size, any increase → decrease in pixel size
 Pixel size – smaller pixels can provide
 Greater resolution (finer lines)
 Lower light sensitivity
 More “noise” (incorrect values)
 Photographic images
 Gimp, Photoshop, etc
 Modeling
 OpenGL – for creating and lighting a model
 modeling programs
 Rendering
 converting a 3D (possibly wireframe) model to a 2D image, then lighting and (possibly) shading it
 Ray tracing
 another way of lighting & shading an image
 very CPU intensive
 3 windows:
 Image – contains the actual results
 Tools – operators, such as clone, paint, etc
 Layers – portions of the final image
 Avoids need for changes to original
 Stackable
 Allows separation of collections of changes
 Allows complex changes to be applied independently of other changes
 Get a digital landscape or cityscape image
 Get a picture of yourself
 Insert the image of yourself into the 'scape as a layer
 Save the new image as a new JPG file
 Elements to manipulate
 Content
 Perspective
 Lighting/shading
 Sequence
 Coloring
 Video game images
 Dynamic
 Movie images
 Pre-determined
 Developed at CMU
 Language-free programming environment
 Point & click usage
 Rapid prototyping
 List of “world” objects
 Insert into methods
 Modify characteristics (length, width, etc.)
 List of world details
 Properties
 Methods (user created)
 Functions (built-in methods)
 World window (results)
 Method builder
 Programmer
 Selects “verbs” from list
 For
 While
 Etc.
 Applies values for properties
 Statement structure pre-defined
 Blocks pre-defined
 Function calls
 for
 if… then… else
 No need to learn a “language”
• Open any of the Alice worlds
• Apply the following rules
  1. Display 3 articulated characters (A, B and C)
  2. A flips onto its head and rotates slowly (5 times)
  3. B waves both of its arms slowly (5 times)
  4. C walks around A and B (while 2 and 3 happen)
  5. C changes direction and walks around A and B (while 2 and 3 repeat)
  6. Stop
Timing, size, shape, position are your choice
 Hiding a message in plain sight
 Inside another message
 Original is called the “cover”
 Presence of message is not obvious
 Can be done with text or graphics
 Many algorithms for hiding a message
 Embedding is easier than detection
 Extraction is straight-forward if:
 you know there is a hidden message
 you know the algorithm used
 Can be very complex to detect & extract
 Concept:
 Change the least significant bit of a set of bytes
 1→0 and 0→1
 For all bytes
 Repeat for multiple sets of bytes
 Example with grayscale images
 One pixel is 8 bits
 Change the last bit
 Changes that pixel’s shade VERY slightly.
 Repeat for every pixel
Embedding function Emb (using Matlab or Freelab syntax)
c = imread('my_decoy_image.bmp');   % Grayscale cover image
% 'b' is a vector of m bits (secret message)
[height, width] = size(c);          % image dimensions
s = c;                              % stego image, same size and class as the cover
k = 1;                              % Counter
for i = 1 : height
    for j = 1 : width
        LSB = mod(c(i, j), 2);
        if k > m || LSB == b(k)     % message used up, or bit already matches
            s(i, j) = c(i, j);
        else
            s(i, j) = c(i, j) + b(k) - LSB;
        end
        k = k + 1;
    end
end
imwrite(s, 'stego_image.bmp', 'bmp');   % Stego image "s" saved to disk
• The return value "c" is an array containing the image data.
  • If the file contains a grayscale image, "c" is an M-by-N array.
  • If the file contains a truecolor image, "c" is an M-by-N-by-3 array.
• The class of "c" depends on the bits-per-sample of the image data, rounded to the next byte boundary.
  • E.g., imread returns 24-bit color data as an array of uint8 data because the sample size for each color component is 8 bits.
Extraction function Ext (Matlab syntax)
s = imread('stego_image.bmp');   % Grayscale stego image
[height, width] = size(s);       % image dimensions
k = 1;
for i = 1 : height
    for j = 1 : width
        if k <= m                % m = number of message bits
            b(k) = mod(s(i, j), 2);
            k = k + 1;
        end
    end
end
% b is the extracted secret message as a bit string
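The same embed and extract steps can be sketched in C on a flat array of grayscale pixels. This is an illustration only, with made-up function names; it assumes one message bit per pixel, starting at the first pixel, and leaves image file reading and writing out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write message bits (each 0 or 1) into the least significant bits
   of the first nbits pixels; remaining pixels are left unchanged. */
void lsb_embed(uint8_t *pixels, size_t npixels,
               const uint8_t *bits, size_t nbits)
{
    assert(pixels != NULL && bits != NULL);
    for (size_t k = 0; k < nbits && k < npixels; k++)
        pixels[k] = (uint8_t)((pixels[k] & 0xFEu) | (bits[k] & 1u));
}

/* Read the least significant bit of each of the first nbits pixels. */
void lsb_extract(const uint8_t *pixels, size_t npixels,
                 uint8_t *bits, size_t nbits)
{
    assert(pixels != NULL && bits != NULL);
    for (size_t k = 0; k < nbits && k < npixels; k++)
        bits[k] = (uint8_t)(pixels[k] & 1u);
}
```

Embedding the bits 1,1,0,0 into pixels 10, 11, 200, 255 gives 11, 11, 200, 254: two pixels change by one gray level and two are untouched because their low bit already matched.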

• LSBflip(x) = x + 1 - 2 (x mod 2)
• LSBflip is an involution (its own inverse), i.e., LSBflip(LSBflip(x)) = x for all x
• LSB flipping induces a permutation on {0, …, 255}: 0 ↔ 1, 2 ↔ 3, 4 ↔ 5, …, 254 ↔ 255
• LSB flipping is "asymmetrical" (e.g., 3 may change to 2 but never to 4)
• |LSBflip(x) - x| = 1 for all x (embedding distortion is at most 1 per pixel)
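These properties are easy to confirm by brute force over all 256 byte values. The sketch below uses made-up names (lsb_flip, lsb_flip_check_all) and checks that flipping twice restores the value, that the change is exactly one gray level, and that values pair up as 2i ↔ 2i+1.

```c
#include <assert.h>
#include <stdint.h>

/* Flip the least significant bit: x + 1 - 2 * (x mod 2). */
uint8_t lsb_flip(uint8_t x)
{
    return (uint8_t)(x + 1 - 2 * (x % 2));
}

/* Check the stated properties for every byte value.
   Returns the number of values checked; aborts if any check fails. */
int lsb_flip_check_all(void)
{
    int checked = 0;
    for (int x = 0; x < 256; x++) {
        int f = lsb_flip((uint8_t)x);
        assert(lsb_flip((uint8_t)f) == x);   /* flipping twice restores x */
        assert(f - x == 1 || f - x == -1);   /* distortion is exactly 1 */
        assert(f / 2 == x / 2);              /* stays inside its pair 2i, 2i+1 */
        checked++;
    }
    return checked;
}
```

The pairing check is what makes flipping "asymmetrical": 3 sits in the pair (2, 3), so it can only become 2, never 4.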