19. Introduction to Multimedia

19.1 Introduction

Hypermedia is a term which combines the words ‘hypertext’ and ‘multimedia’.

The human mind does not operate in a strictly linear manner. Our trains of thought tend to form associations - when we think of something, we also think of something else that is related to it. We thus jump quickly from one topic to another related piece of information. This is what the hypertext paradigm offers. It attempts to model this non-linear association with information repositories. Self-contained pieces of information are linked together by natural or topical association rather than being organised in the familiar sequential structure of a paper-based book.

Figure 19-1. Hypertext

A book or encyclopaedia nevertheless still allows the reader to ‘jump read’ to the references or related topics section to find out more about a particular topic. Likewise, hypertext technology allows the reader to move from one location to another by following the links that connect the topics of interest, only it is much easier and faster to browse through the electronic excerpts of hypertext documents than through a paper-based book. A word in an electronic document may be highlighted, and when the user selects that word, other documents containing related text are made immediately available. A more in-depth explanation of that particular topic or an associated topic will be displayed.

The term hypertext, however, suggests that all information is in the form of basic text. Multimedia, on the other hand, allows the use of information in other forms, such as graphics, pictures, sound, animation, video, etc. It unites the different media to create a single multisensory experience.

Figure 19-2.
Multimedia document

A multimedia computer system that can handle a multimedia document would normally be a PC upgraded with kits such as a CD-ROM drive, a sound card, a video card, microphones and speakers, and other specialised devices which the computer needs to read CDs containing large files, produce high quality sounds, capture and replay full motion pictures, etc. The system may be attached to scanners, music keyboards, VCRs and other peripheral equipment. Multimedia technology spurs many exciting applications in the home, education, entertainment, business, government and industry.

Hypermedia has its roots in hypertext. It is an augmented (or generalised) hypertext because it incorporates multimedia. Hypermedia thus enables the user to navigate selectively through not only text, but virtually any kind of information that can be electronically stored, such as digital pictures, graphics, sound, animation and video.

A hypermedia document could, for example, be an animated tour starting from an image of a map, say of the island of Borneo, through to pictures of the rainforests there. There could be a recorded voice-over narration, video clips on places, people or animals of interest, simulations of the weather, etc. The user may click on different parts of the display to get in-depth explanations from annotated text or track a subject through a variety of topics via the links. All in all, it provides a very rich and visually compelling presentation.

19.2 Media Object

Electronic documents are increasingly being written, read and disseminated. Multimedia documents are composed of components such as music, photographs, clip art, video clips, fractals or holograms. These components are the visible manifestations of some type of data. Generally speaking, multimedia systems operate with ‘media objects’ which have some basic data type, i.e.
a media object is a homogeneous chunk of information (which we can see as a file of some file type residing in the computer’s memory). The basic types are text, image, audio and full motion video, and each type has its own way of handling, processing, storing and retrieving data.

Standardisation for multimedia is very important, and there exist standards (as well as proprietary formats) for data/file formats, file interchange and video processing. Standards such as RTF, TIFF, MIDI, PAL, AVI, JPEG, MPEG, etc. are part of the vocabulary of multimedia developers.

Media objects can be visualised (i.e. displayed) on the user’s screen using a particular procedure of visualisation. Very often, software packages implementing such procedures are called ‘viewers’. Thus one may have a viewer to read the text of a document created by a certain word processor, or a viewer to display a facsimile transmission, or a viewer to watch the playback of a video. An object’s visualisation or representation may also have controls to change the object’s rendering dynamically. These may include VCR-like buttons to rewind, play, fast-forward, pause or stop, or sliders for the volume control of sounds.

Figure 19-3. Different viewers for different media object types

Media objects are created using editing systems. The media editor is application software used to create and edit the object. Preparing a multimedia document is a nontrivial task. It is very difficult, if possible at all, to find a single system capable of creating all the media objects needed for a more or less complex multimedia application. Normally, different media objects are created by means of different editing systems. Each editing system essentially allows the user to perform operations to create, cut, copy, paste, delete, format, merge, move and save.

Figure 19-4.
Different editors for different media object types

We are familiar, for example, with the use of a text editor such as a word processor, which has features to format or style paragraphs (e.g. left justify, centre, single space), style characters (e.g. font, bold, italics, underline), or even check the spelling and grammar of our text document.

There are also document image-scanning systems which allow image objects (i.e. not coded text and not with temporal properties) to be captured and then allow operations such as image scaling, zooming, rubber banding, panning, enhancement, etc. Images may also be created with drawing or bitmap paint editors that allow line or circle creation, rectangle filling with colours or texture patterns, pixel processing, histogram sliding, spatial filtering, etc. to produce anything from simple clip art right up to impressive works of (electronic) art. Still pictures may be captured from digital cameras or by grabbing still video frames.

Voice, music and sound may be captured from microphones, musical keyboards, cassette tapes or CDs, or WAVE file inputs. Analogue signals are converted to digital formats, where they can be sampled, edited, given special effects, or changed to a different instrument.

Animation editors can create an illusion of movement by creating a sequence of still image frames. Objects can be toggled, rotated and twisted, and colour palettes can be manipulated to create the perception of movement.

The media editor for full motion video usually has a TV/VCR-metaphor user interface with functions such as video capture, channel play and sound volume, plus editing functions mimicking the cutting floor of a movie (e.g. multiple film strips viewed at user-selected frame rates, audio/video indexing and marking, frame level splicing, soundtrack splicing, automatic scene change detection, etc.) to produce the desired or special effects.
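The idea described earlier in this section, that each basic media type has its own viewer, can be sketched as a simple lookup from type to viewing procedure. This is only an illustration: the function names, the `VIEWERS` table and the returned strings are all hypothetical stand-ins, and a real viewer would render the data rather than describe it.

```python
# Hypothetical viewers, one per basic media type named in the text.
# Each returns a description instead of actually rendering anything.
def view_text(data: bytes) -> str:
    return "text viewer: " + data.decode("utf-8")


def view_image(data: bytes) -> str:
    return f"image viewer: {len(data)} bytes of pixel data"


def view_audio(data: bytes) -> str:
    return f"audio player: {len(data)} bytes of samples"


def view_video(data: bytes) -> str:
    return f"video player: {len(data)} bytes of frames"


# Assumed registry mapping a media type to its viewer.
VIEWERS = {
    "text": view_text,
    "image": view_image,
    "audio": view_audio,
    "video": view_video,
}


def visualise(media_type: str, data: bytes) -> str:
    """Pick the viewer that matches the object's basic type and apply it."""
    try:
        viewer = VIEWERS[media_type]
    except KeyError:
        raise ValueError(f"no viewer registered for media type {media_type!r}")
    return viewer(data)


print(visualise("text", b"hello"))
print(visualise("image", b"\x00" * 4))
```

A table of editors keyed by media type could be organised in the same way, which reflects the point that no single system handles every type.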
Having created the media objects using the various specialised media editors, the author can then put these components together to compose a multimedia document.

19.3 Multimedia Documents

A multimedia document is a compound semantic unit consisting of a number of different media objects. Each multimedia document has an internal structure which defines the combination of media objects in it. These media objects have presumably been precaptured (and edited) independently.

The multimedia objects may be embedded within the container document itself, i.e. a copy of the object is physically stored in the document. As the original copy of the object may be somewhere else, editing the object within the container document does not affect the original. Embedding also makes copying or transferring the document to another computer easier. But of course, embedded objects do make the document larger, and this not only uses up a lot of storage space but also slows down retrieval.

Figure 19-5. Internal representation of a multimedia document

Alternatively, a multimedia object can be associated with a document via linking. The multimedia object itself can reside in another database, presumably a database optimised for the object’s particular data type (e.g. an image database, an optical jukebox or a video server), and a link is established between the object and the document. The link reference would be a pointer to the file containing the media data object plus other information needed for object editing, display, playback, etc. This way, a multimedia object can also be shared by a number of different multimedia documents and storage use is minimised.

We shall now look at the different ways in which we can construct the structure of documents that contain multimedia data objects.
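Before turning to those structures, the difference between embedding and linking can be illustrated with a small sketch. The classes `MediaObject` and `Document`, the `store` dictionary standing in for an image database, and all the names used are assumptions made for this example only, not part of any particular multimedia system.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """A homogeneous chunk of data of one basic type (hypothetical model)."""
    name: str
    media_type: str
    data: bytes


@dataclass
class Document:
    """A container document with embedded copies and links (hypothetical model)."""
    title: str
    embedded: dict = field(default_factory=dict)  # name -> private copy
    links: list = field(default_factory=list)     # keys into a shared store

    def embed(self, obj: MediaObject) -> None:
        # Embedding physically stores a copy inside the document.
        self.embedded[obj.name] = MediaObject(obj.name, obj.media_type, obj.data)

    def link(self, key: str) -> None:
        # Linking stores only a reference to an object held elsewhere.
        self.links.append(key)

    def stored_size(self) -> int:
        # Only embedded copies add to the document's own size.
        return sum(len(o.data) for o in self.embedded.values())


# A shared store standing in for an image database or video server.
store = {"map": MediaObject("map", "image", b"\x00" * 1000)}

doc_a = Document("Borneo tour (embedded)")
doc_a.embed(store["map"])

doc_b = Document("Borneo tour (linked)")
doc_b.link("map")

# Editing the embedded copy does not affect the original in the store.
doc_a.embedded["map"].data = b"edited"

print(doc_a.stored_size())     # size of the embedded (edited) copy
print(doc_b.stored_size())     # the linked document holds no media data
print(len(store["map"].data))  # the original is unchanged
```

The sketch shows the trade-off stated above: the embedding document carries its own data and can be edited independently, while the linking document stays small and shares a single stored object.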
19.3.1 The Layout Metaphor

In the simplest case, the internal structure of a multimedia document can be defined using a ‘layout metaphor’ similar to the pages of an ordinary book. A background text can be extended with ‘tags’ which mark the particular places where the media objects should be displayed within the text.

Figure 19-6. Tag placements in a metaphor layout

Upon retrieval, the multimedia document is converted into a resultant image that combines all the source media objects, with the specific objects displayed at the tag locations. Since the resultant document may be too big to display on the user’s screen, a scrollable window is normally used to visualise such multimedia documents.

The layout metaphor has a number of obvious disadvantages. Truly dynamic media objects such as movies, sound and animation cannot be easily incorporated into a layout. It should also be noted that the layout metaphor does not provide a satisfactory interface for user interaction.

19.3.2 The Scripting Metaphor

Another very popular way of defining the internal structure of multimedia documents is called a ‘scripting metaphor’. A script consists of a sequence of operations and is interpreted by a multimedia system in a way that is similar to the interpretation of an ordinary computer program. The operations in the script are executed accordingly; for example, clicking on a poster frame would start a video clip.

Figure 19-7: Scripting multimedia objects in a document

The scripting metaphor does not handle the time factor which is often involved in the presentation of multimedia materials. It does not provide a convenient way to handle two or more media objects that are operating simultaneously on the screen.

19.3.3 The Cast/Score Metaphor

Consider, for instance, a simultaneous animation of a number of media objects provided with a background sound. The ‘cast/score paradigm’ considers all media objects to be ‘actors’ playing in a ‘scene’ or on a stage.
The scene is the user’s screen with some background picture. The cast/score paradigm uses a music score as its primary authoring metaphor - the actions to be performed by the actors are shown in various horizontal ‘tracks’, with simultaneity shown via the vertical columns. For example, a music jingle may be timed to synchronise with an animation.

Figure 19-8: Positioning objects according to a ‘score’

The metaphor is timeline-based: each media object is positioned on the timeline, and the timeline of each object shows its start point and its duration. When played back, the objects or actors begin to ‘act’ according to the score. The true power of this metaphor lies in the ability to script the behaviour of each of the actors. This paradigm is best suited to animation-intensive or synchronised media applications.

19.4 Multimedia Authoring

Multimedia applications, whether an information kiosk or an interactive game, are put together by combining and controlling the flow of the multimedia components. This is the process of authoring. Authoring multimedia systems can be quite complex given the variety of data objects and the degree of integration. The author, in putting together the application, must determine its scope, functionality and user interface. The author (or the group of people authoring) must plan the overall structure of the application, create its content, design its interactive behaviour and implement the user interface or look-and-feel of the application. Any user interface must of course be perceived by the end user to be efficient, intuitive, easy to use and responsive to the user’s needs.

An authoring system is a development tool used to organise multimedia objects for end-user applications. It is a program which has pre-programmed elements for the development of interactive multimedia documents.
Many authoring systems are available in the market and these vary widely in orientation, capabilities and learning curve. How complex the system is depends on the functionality it must support and, as previously discussed, the metaphor used to represent the internal structure of multimedia documents. The structuring metaphor can be seen as the methodology by which an authoring system accomplishes its task.

Figure 19-9: Deciding on multimedia authoring systems

Recall that the following structuring metaphors exist:

1. Layout
2. Scripting
3. Cast/score

Dedicated authoring systems are the simplest, designed usually for the single author working on documents structured along the layout metaphor. Familiar real-world interfaces, like a VCR interface, are used and the authoring is performed on precaptured multimedia objects. However, combining different media objects can prove difficult to implement. Writing scripts provides greater power and flexibility in the authoring process. Cast/score metaphors further allow structured timeline-based authoring for more complex presentations with detailed timing constraints.

Thus a multimedia authoring system should be considered for a particular application if it supports, at the very least, a suitable structuring metaphor. Of course, there exist a number of implementations of each metaphor which vary in syntax and user interface; nevertheless, the general facilities available in a particular authoring system are defined by the document structuring metaphor.

Remember, however, that the actual content of the multimedia objects themselves (graphics, text, video, audio, animation, etc.) is not generally created by an authoring system, i.e. the authoring system does not manipulate the media objects directly. For more professional output, software packages (media editors) dedicated to the creation and editing of that medium should be used.
The authoring system then coordinates the sequence (navigation) in which the application progresses, and determines which objects should be used, and when, to meet the user requirements of the system.

Figure 19-10: Development tools for multimedia applications

19.5 Multimedia Databases and Hypermedia

Multimedia objects are characterised by, amongst other things, their large storage volume, complexity in object relationships and temporal retrieval requirements. Large multimedia objects require mass storage devices that are online (high-speed magnetic disk systems), near online as well as offline (e.g. optical disk platters/jukeboxes or tapes) to serve as repositories. Storage is often best organised to consist of servers designed for specific data types, as certain storage media technologies are more suited to certain data types. For example, video objects require constant playback speed and fast caching, so video servers using magneto-optical technology may be more suitable. Other servers include image servers, audio servers, voice-mail servers, database servers, etc. Objects of similar characteristics and usage pattern may of course reside on the same physical server.

Additionally, flexible access requires a high degree of data independence (i.e. insulation between the data object and the application using it). A multimedia object may contain other linked objects (e.g. a video presentation may be a component of another multimedia document), adding to the complexity of retrieval. Transaction management is very complex given the different media types to be handled, compounded by their distribution over multiple data servers and simultaneous access by many users. Clearly, standards, data compression/decompression, document indexing, retrieval and management are areas of continuous challenge and progress.

One significant challenge is the need to organise and manage the large, complex and often distributed repository of multimedia documents.
Flexibility and performance are prime concerns. A number of different technologies are available, the two common ones being:

1. Multimedia databases
2. Hypermedia databases

Figure 19-11: Database management systems for multimedia systems

A number of existing relational database management systems (RDBMSs) now provide extensions to support multimedia data types. In addition to the standard alphanumeric data types to support textual fields (plus some limited binary types to handle date fields, etc.), RDBMSs now have data items called Long Binary Streams (LBSs) or Binary Large Objects (BLOBs) to handle binary data and free-form text. The media objects can simply be embedded into the relations as data items which store the location information for the LBS or BLOB. The LBS itself would be stored on a separate image server or video server. Generally, such multimedia databases are used if the structure of the multimedia documents can be separated from the actual content (i.e. from the media objects).

Figure 19-12: Schema of a multimedia database extended to support LBS

In other words, multimedia documents are considered to be instances of predefined document types (i.e. templates).

Figure 19-13: Schema, document type and instances of a multimedia database

Extended RDBMSs have the advantage of the strengths of database management systems, such as rigorous security and integrity maintenance as well as powerful concurrency and transaction control. However, there are shortcomings in the inability of standard SQL to manipulate the multimedia objects. Multimedia systems utilising relational systems cannot satisfactorily handle the complexity and richness of multimedia data. These objects are not only large, but they are also created and presented in different ways and cannot be interpreted or handled as alphanumeric data, for which relational systems were initially designed.
For example, while simple attributes like the seating capacity of a car may be easily stored as a database attribute, other attributes, such as those found in an image of the car, cannot be easily represented as database attributes.

Clearly, hypermedia documents require a more complex information model to define the components, meanings and relationships together with their representation in the various data types. The systems must operate with multimedia documents that have their own, unique internal structures.

Figure 19-14: The unique structure of hypermedia systems

Hypermedia systems are discussed further in the following chapter.
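As a closing illustration of the extended relational approach from Section 19.5, the following sketch uses Python's built-in `sqlite3` module with an in-memory database. The `car` table, its column names and the `imageserver://` location string are invented for this example. The row holds an ordinary alphanumeric attribute (seating capacity), a location reference to an image assumed to live on a separate image server, and a small BLOB stored directly in the relation.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE car (
        id INTEGER PRIMARY KEY,
        model TEXT NOT NULL,
        seating_capacity INTEGER NOT NULL,
        photo_location TEXT,
        thumbnail BLOB
    )
    """
)

# photo_location is a made-up reference to an object held on another server;
# thumbnail is a small binary object stored in the relation itself.
conn.execute(
    "INSERT INTO car (model, seating_capacity, photo_location, thumbnail) "
    "VALUES (?, ?, ?, ?)",
    ("Example Saloon", 5, "imageserver://cars/0001.tif", b"\x89fake-image-bytes"),
)

# Alphanumeric attributes can be queried directly with standard SQL.
rows = conn.execute(
    "SELECT model FROM car WHERE seating_capacity >= ?", (5,)
).fetchall()
print(rows)

# The image comes back only as an opaque location or byte string;
# SQL cannot ask questions about what the picture shows.
location, thumb = conn.execute(
    "SELECT photo_location, thumbnail FROM car WHERE id = 1"
).fetchone()
print(location, type(thumb).__name__)
```

The second query reflects the shortcoming noted above: the database can store and return the media object, but its content remains outside the reach of relational queries.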