Fundamentals of Digital Media

Digital media consists of four main types: text, graphics, video, and audio. The first three relate to what we see and the last to what we hear. Both our eyes and our ears are analog sensors, meaning that they detect continuous light and sound signals, respectively. To see digital media, we show it on a digital display, such as a computer monitor or a digital TV; both are pixel-based and can directly display digital data (images or video). There is no corresponding audio “display” for our ears, so we still use speakers, which are analog devices, to hear audio signals, whether music or voice.

All digital media can be generated by two methods.

1. Traditional (analog) media is acquired and imported into a computer for software manipulation. We sample the analog media and approximate it with a digital replica.
2. Digital media is created or authored digitally with software. We use sound synthesizers and electronic instruments based on the Musical Instrument Digital Interface (MIDI) standard to create digital sound. We use digital cameras and camcorders to produce digital images and video footage. We can also produce 3-D animation entirely in software.

Let’s start with sound. We can either capture or create digital sound. To capture digital sound, we use a microphone, which converts sound waves into electrical signals. These signals are then fed to a device called an analog-to-digital converter (ADC), which can be found on a sound card or in a chip integrated on the motherboard. An ADC digitizes the analog signal by sampling it at a fixed frequency and representing each sampled value with a fixed number of bits, typically 8 or 16, known as the bit depth. The more bits used to represent each sample, the more distinct levels the result can distinguish.
In other words, more bits allow finer detail in the digitized signal, so it resembles the original analog signal more closely.

Once digitization is complete, a digital sound has been created, and we can process it on the computer in all kinds of ways. To replay the digital sound, we have to convert it back to analog form with a device called a digital-to-analog converter (DAC), which is typically also built into the computer. The resulting analog signal is then amplified and used to drive the speaker, reproducing the original analog sound. The whole process is summarized in the following diagram.

The main device used in digital recording is the analog-to-digital converter (ADC). The ADC captures a snapshot of the electric voltage on an audio line and represents it as a digital number that can be sent to a computer. By capturing the voltage thousands of times per second, it produces a very good approximation of the original audio signal. Two factors determine the quality of a digital recording:

Sample rate: The rate at which samples are captured or played back, measured in hertz (Hz), or samples per second. An audio CD has a sample rate of 44,100 Hz, usually written as 44.1 kHz. This is also the default sample rate that Audacity uses, because audio CDs are so prevalent.

Sample format or sample size: The number of bits in the digital representation of each sample. Think of the sample rate as the horizontal precision of the digital waveform, and the sample format as the vertical precision. An audio CD has a precision of 16 bits (2 bytes), which corresponds to 65,536 levels.

In the right figure, both the sampling rate (horizontal axis) and the sampling precision (vertical axis) are twice those of the left figure: 40 gradations at 4,000 samples per second, compared with 20 gradations at 2,000 samples per second.
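The sampling and quantization steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular ADC or sound card works: the function names, the 440 Hz test tone, and the mapping of the range -1.0 to 1.0 onto integer codes are all hypothetical choices made for the example.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample `signal` at a fixed rate and round each sample to an integer code.

    `signal` is any function mapping time in seconds to a value in [-1.0, 1.0]
    (an assumed stand-in for the microphone voltage).
    Returns integer codes in the range 0 .. 2**bit_depth - 1.
    """
    levels = 2 ** bit_depth
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # time of the n-th snapshot
        value = max(-1.0, min(1.0, signal(t)))      # clamp to the valid range
        codes.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return codes


def reconstruct(codes, bit_depth):
    """Map integer codes back to values in [-1.0, 1.0], as a DAC would."""
    levels = 2 ** bit_depth
    return [code / (levels - 1) * 2.0 - 1.0 for code in codes]


def max_error(signal, duration_s, sample_rate_hz, bit_depth):
    """Largest difference between the signal and its replica at the sample instants."""
    codes = sample_and_quantize(signal, duration_s, sample_rate_hz, bit_depth)
    values = reconstruct(codes, bit_depth)
    return max(abs(signal(n / sample_rate_hz) - v) for n, v in enumerate(values))


def tone(t):
    """A 440 Hz sine wave used as a hypothetical test signal."""
    return math.sin(2 * math.pi * 440.0 * t)


# CD-style settings: 44,100 samples per second, 16 bits per sample.
cd_codes = sample_and_quantize(tone, 0.01, 44_100, 16)
print(len(cd_codes), "samples in 10 ms;", 2 ** 16, "possible levels")
print("max error at  8 bits:", max_error(tone, 0.01, 44_100, 8))
print("max error at 16 bits:", max_error(tone, 0.01, 44_100, 16))
```

Doubling `sample_rate_hz` doubles the number of samples per second (horizontal precision), and each extra bit of `bit_depth` doubles the number of levels (vertical precision), which is the kind of improvement the right-hand figure shows over the left-hand one.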
Hence the digital representation on the right is a better-quality replica of the original sound (the red curve).

Digital sound can also be created rather than captured. Speech synthesis is the process by which a machine produces sound resembling spoken words, and a machine that mimics the sound of a traditional instrument is called a synthesizer.

We can capture digital images using devices connected to the computer, such as a digital camera or a scanner, or by taking a screenshot of the display. By scanning a printed image, we digitize it and convert it into bits. Digital cameras and scanners share a key piece of technology: the image sensor, commonly a charge-coupled device (CCD). The CCD captures the light falling on it and converts it into electrical signals. The CCD surface is divided, like a grid, into small pixels. A pixel (short for picture element) is the smallest unit of a picture that can be represented or controlled. Each sensor pixel corresponds to one pixel in the captured image. As with the sampling of analog sound, each pixel in an image file is a sample of the original image; more samples typically give a more accurate representation of the original analog image.

The number of colors that a pixel can display is determined by the number of bits used to represent the pixel, known as the color depth. This concept is similar to the bit depth of digital sound: the larger the color depth, the more colors a pixel can display. In True Color bitmap graphics, each pixel is represented by 3 bytes (one byte for each primary color: red, green, and blue), which allows 2^24 (about 16.7 million) different colors.

We can also create digital images on the computer with painting and drawing programs, using a mouse or a digital pen (stylus).

Types of Digital Graphics

There are two types of digital graphics: bitmap (raster) graphics and vector graphics. A bitmap is a grid of individual pixels that collectively compose an image.
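To make the color-depth arithmetic and the idea of a bitmap as a grid of pixels concrete, here is a small Python sketch. The function names and the red-green-blue packing order are assumptions made for illustration; real file formats differ in how they order and store the channels.

```python
def color_count(bits_per_pixel):
    """Number of distinct colors a pixel can take at a given color depth."""
    return 2 ** bits_per_pixel


def pack_rgb(red, green, blue):
    """Pack three 8-bit channel values into one 24-bit True Color pixel.

    The red-green-blue bit order is an assumption made for this example.
    """
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in one byte (0..255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel):
    """Split a packed 24-bit pixel back into its (red, green, blue) bytes."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


def uncompressed_size_bytes(width, height, bits_per_pixel):
    """Storage needed for the raw pixel grid, ignoring headers and compression."""
    return (width * height * bits_per_pixel + 7) // 8


# A tiny 2 x 2 bitmap: a grid (list of rows) of packed pixels.
bitmap = [
    [pack_rgb(255, 0, 0), pack_rgb(0, 255, 0)],      # red,  green
    [pack_rgb(0, 0, 255), pack_rgb(255, 255, 255)],  # blue, white
]

print(color_count(24))                          # 16777216
print(unpack_rgb(bitmap[1][1]))                 # (255, 255, 255)
print(uncompressed_size_bytes(1920, 1080, 24))  # 6220800
```

The last line shows why raw bitmaps are large: a single 1920 x 1080 True Color image needs about 6.2 million bytes before any compression is applied.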
Raster graphics render images as a collection of countless tiny squares. Each square, or pixel, is coded in a specific hue or shade. Individually, these pixels are worthless. Together, they’re worth a thousand words. Raster graphics are best used for non-line-art images, specifically digitized photographs, scanned artwork, and detailed graphics. Compared with vector graphics (defined below), raster graphics are generally less economical, slower to display and print because of their large file size, less versatile, and more unwieldy to work with. However, some images, such as photographs, are still best displayed in raster format.

Common raster formats include TIFF, JPEG, GIF, PCX, and BMP. Of these, BMP is the native Windows bitmap format; however, it is rarely used on the Web, because BMP files are usually uncompressed and therefore large. Scanners and digital cameras commonly store bitmap graphics in TIFF format because it supports True Color and can be easily converted to other graphics file formats. The most popular web graphics format is JPEG, because it compresses large graphics files into smaller ones using lossy compression. Avoid lossy compression, if possible, when you want to keep a file for further editing, because each lossy save discards image data that cannot be recovered. Another popular web format is PNG (Portable Network Graphics), which supports up to 48-bit True Color. Unlike JPEG, PNG compresses bitmap files without losing any data. Despite its shortcomings, raster format is still the Web standard, although vector graphics may surpass raster graphics in both prevalence and popularity within a few years.

Unlike pixel-based raster images, vector graphics are based on mathematical formulas that define geometric primitives such as polygons, lines, curves, circles, and rectangles. Because vector graphics are composed of true geometric primitives, they are best used to represent more structured images, such as line art with flat, uniform colors.
Most created images (as opposed to natural images) meet these specifications, including logos, letterheads, and fonts. Further, unlike raster graphics, vector images are not resolution-dependent. They have no fixed intrinsic resolution; rather, they display at the resolution of whatever output device (monitor or printer) is rendering them. Apart from photo-realism, at which bitmap graphics excel, vector graphics are better in many respects: they scale without loss of quality, have smaller file sizes, represent shapes more flexibly, support 3-D, and are easier to edit.

One of the best applications of vector graphics is 3-D graphics, which are essential for realistic computer games, movie special effects, and models of objects such as buildings and cars. A picture that has, or appears to have, height, width, and depth is three-dimensional (3-D). A number of image components go into making an object seem real. Among the most important are shapes, surface textures, lighting, perspective, depth of field, and anti-aliasing. 3-D graphics are stored as a set of instructions that contain the locations and lengths of the lines forming a wireframe. A process called rendering then covers the wireframe with surface color and texture, making the 3-D vectors look more realistic. Furthermore, ray tracing adds light and shadows to a 3-D image to simulate the way the eye perceives those objects.

Below is a sequence of steps showing how 3-D graphics are applied to make a computer creation look like a real thing.

1. This illustration shows the wireframe of a hand made from relatively few polygons (862 in total).
2. The outline of the wireframe can be made to look more natural and rounded, but many more polygons (3,444) are required.
3. Adding a surface to the wireframe begins to change the image from something obviously mathematical into a picture we might recognize as a hand.
4. Lighting in an image not only adds depth to the object through shading, it also “anchors” objects to the ground with shadows.
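The idea that a wireframe is stored as locations and lines, and that vector data can be drawn at any output resolution, can be sketched in Python as follows. This is a minimal illustration using a cube rather than a hand; the names, the simple perspective formula, and the camera distance are assumptions chosen for the example, and surface rendering, texture, and ray tracing are not covered.

```python
from itertools import product

# The 8 corners of a unit cube centred on the origin, as (x, y, z) points.
VERTICES = list(product((-0.5, 0.5), repeat=3))

# An edge joins two corners that differ in exactly one coordinate.
EDGES = [
    (i, j)
    for i in range(len(VERTICES))
    for j in range(i + 1, len(VERTICES))
    if sum(a != b for a, b in zip(VERTICES[i], VERTICES[j])) == 1
]


def project(vertex, camera_distance=3.0):
    """Simple perspective projection: points farther away (larger z) shrink."""
    x, y, z = vertex
    scale = camera_distance / (camera_distance + z)
    return x * scale, y * scale


def to_device(point, width, height):
    """Map projected coordinates in [-1, 1] to pixel coordinates of an output device."""
    x, y = point
    column = round((x + 1.0) / 2.0 * (width - 1))
    row = round((1.0 - y) / 2.0 * (height - 1))  # screen rows grow downward
    return column, row


def wireframe_lines(width, height):
    """Return the cube's edges as pairs of pixel positions for a given output size."""
    points = [to_device(project(v), width, height) for v in VERTICES]
    return [(points[i], points[j]) for i, j in EDGES]


print(len(VERTICES), "vertices,", len(EDGES), "edges")  # 8 vertices, 12 edges

# The same stored geometry is drawn at whatever resolution the device offers.
for size in [(64, 48), (1920, 1080)]:
    first_line = wireframe_lines(*size)[0]
    print(size, "first edge runs from", first_line[0], "to", first_line[1])
```

Only the 8 vertices and 12 edges are stored; pixel positions are computed when the output size is known, which is why the same description displays equally well on a small or a large device.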