CHI 2008 Proceedings · Stories and Memories April 5-10, 2008 · Florence, Italy

Mobile Multimedia Presentation Editor: Enabling Creation of Audio-Visual Stories on Mobile Devices

Tero Jokela, Jaakko T. Lehikoinen, Hannu Korhonen
Nokia Research Center
P.O. Box 1000, FI-33721 Tampere, Finland
{tero.jokela, jaakko.t.lehikoinen, hannu.j.korhonen}@nokia.com

ABSTRACT
A mobile device provides an attractive tool for creating and sharing audio-visual stories. Earlier research has shown that users enjoy creating digital stories with their mobile devices. However, designing editor interfaces that support creation of rich audio-visual presentations has been a major challenge due to the constrained input and output capabilities of mobile devices. In this paper, we present the design and evaluation of the Mobile Multimedia Presentation Editor, an application that makes it possible to author sophisticated multimedia presentations that integrate several different media types on mobile devices. Based on a user study, we present design principles for multimedia presentation editors on mobile devices. We describe an application design that supports these principles and so demonstrate that editing of sophisticated multimedia presentations is feasible on mobile devices. We report evaluations which indicate that the editor application was easy to use and supported the creativity of the mobile users well.

Author Keywords
Mobile devices, multimedia presentations, SMIL, multimedia messages, MMS, content creation, authoring, editor, storytelling, user interfaces, interaction design.

ACM Classification Keywords
H.5.1 Information Interfaces and Presentation (e.g., HCI): Multimedia Information Systems.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CHI 2008, April 5–10, 2008, Florence, Italy. Copyright 2008 ACM 978-1-60558-011-1/08/04…$5.00.

INTRODUCTION
We all have stories to tell. The practice of telling stories “goes back as far as time allows us to remember” [8]. Storytelling has a strong history as an oral tradition, but it has evolved and extended to utilize new technological advances and media types as they have emerged, including written and printed media and more recently electronic and digital media. Digital storytelling [7] involves creating a personal narrative and then utilizing digital media like photographs, video, and audio to illustrate the narrative in the form of an integrated multimedia presentation.

Over the last few years, mobile devices have evolved from voice-centric communication devices to powerful personal multimedia devices, which enable users to consume multimedia content, for example to listen to music or view videos. In addition, mobile devices provide a wide range of features for creating and sharing content, including an integrated camera for capturing images and video clips, microphones for recording audio, a powerful general-purpose computing platform for manipulation of digital content, and broadband wireless network connections for effective sharing of content. Mobile devices are almost always with the user, enabling the capture of interesting events anywhere as they occur, and they also have a broad user base that makes it possible to provide capabilities to produce and share content for large numbers of users. Together, these features make the mobile device an attractive tool for creating and sharing digital stories.

Earlier research [9, 6] has indicated that users enjoy editing and creating digital stories on their mobile devices. However, designing editor interfaces that support the creation of rich audio-visual stories has presented a major challenge due to the constrained input and output capabilities of mobile devices. Commercially available mobile devices provide multimedia presentation editors that are primarily intended for authoring simple e-mail-like multimedia messages consisting of text with image and audio attachments. Some of these editors also provide some support for creating more sophisticated multimedia presentations, although this support has been based on templates that a user is expected to fill in with appropriate images, sounds, and texts. Such a template-based approach severely restricts the creativity of the users.

In this paper, we present the design and evaluation of the Mobile Multimedia Presentation Editor, an application for creating rich and expressive audio-visual stories on mobile devices. Our key contribution is an editor interface that enables authoring of sophisticated multimedia presentations that integrate several different media types on mobile devices. The created presentations can be stored locally on the device for later use or shared using any suitable distribution mechanism, including sending as multimedia (MMS) messages, which was the primary focus of our work, but also sending over e-mail, showing to others on the device display or on TV, sharing in proximity using Bluetooth, or publishing to the Internet on mobile blogs. We also believe that the application serves as an example of how complex tasks like editing of multimedia presentations can be made possible on mobile devices by systematically following a user-centered design approach from initial user studies through prototyping and usability evaluations to field trials in realistic user environments.

The rest of this paper is structured as follows. We first review the relevant related work. Next, we describe a user study we arranged to gain understanding of how users would like to edit multimedia presentations on their mobile devices and present design principles for mobile multimedia presentation editors based on the results of the user study. Then, we present our approach, giving a detailed description of the design of the Mobile Multimedia Presentation Editor. Finally, we report the results of laboratory and field evaluations, followed by discussion and conclusions.

RELATED WORK
Mäkelä et al. [9] report the results of an early field trial on the use of camera phones and the sharing of images in the mobile environment. The trial participants were provided with PC-based prototype devices which enabled them to take digital images, to do simple image editing, and to share images wirelessly. Additionally, it was possible to make short series of at most five pictures, which were enhanced with transition and sound effects. The results indicate that the participants, especially the children, loved to create stories with series of images. The image editing possibilities were also commonly used and the participants wished that they could have more editing possibilities. Additionally, it was found that the images alone were not enough for functional communication and therefore it should be possible to annotate the images with text or audio. While these results give important support for our work, we aim to provide the users with a broader set of editing functions than the prototype system used by Mäkelä et al., enabling the creation of richer content.

Other studies have also identified the importance of editing. Koskinen et al. [6] present the results of another early trial with mobile devices and accessory digital cameras. The results indicate that the users should be able to do lightweight image editing on mobile devices and to send series of images as integrated stories. Attaching textual and also audio comments to the images should be possible. Special attention should be paid to making the features easy to learn and use. Kirk et al. [5] studied people’s practices between the capture and eventual sharing of photographs. The study found that all study participants did at least some editing before sharing the photographs with others. The most common forms of editing were selecting the photographs to be shared from a larger set and organizing the selected photographs in the desired order. Overall, the editing process was something that the participants said they enjoyed.

Balabanović et al. [1] present StoryTrack, a prototype device for storytelling with digital photographs. While StoryTrack allows creation of image sequences with audio clips attached to each image, our work aims to enable composition of more integrated and versatile multimedia presentations. The StoryTrack device is also considerably larger than current mobile devices and it has a set of hardware controls specifically designed for constructing and telling stories. Our work targets general-purpose mobile devices with much more restricted input and output capabilities.

Commercially available mobile devices include editors for creating multimedia messages. These editors are usually based on the earlier SMS and e-mail message editors and are optimized for creating simple e-mail-like messages consisting of text with image and audio attachments. Support for editing more complicated multimedia presentations, if any, is typically based on templates that a user is expected to fill in with appropriate images, sounds, and texts. Such a template-based approach allows only coarse control over the presentation structure and sets strict restrictions on the creativity of the user.

Probably closest to our work, Salovaara [10] describes a case study on the use of Comeks, a mobile tool for creating and sharing comic strips as MMS messages. Salovaara focuses on the appropriation of the application and only a high-level description of the design and features of Comeks is given. While Comeks has many similarities to our work, we aim to enable creation of more continuous media, giving the user more fine-grained control over the temporal structure of the presentation, and also enabling the use of audio in the created presentations.

Fono et al. [2] present Sandboxes, a mobile tool for collaborative composition and sharing of a 2D multimedia collage. Compared to Sandboxes, our work focuses on personal creation of multimedia content. We also aim to enable creation of more continuous and integrated media content than the 2D canvas used in Sandboxes allows.

Finally, some solutions have also been presented for video editing on mobile devices. Wu et al. present mProducer [11], a system for point-of-capture archiving and editing of video with mobile devices. mProducer includes a key-frame based user interface for simple cutting of video but it does not support merging of video clips or advanced features like special effects or audio. Jokela et al. [3] present a more advanced video editing solution, which also enables merging several video clips into longer videos and enhancing the videos with text, images, music, and special effects. The user interface is based on the timeline metaphor familiar from desktop video editors, but redesigned for small displays and keyboard navigation. Field trial results [4] indicate that some, especially younger, users eagerly adopted the video editor, but for others full video editing proved to be too complicated and time consuming.
In our work, we aim to create an editing solution that would enable creation of rich and expressive content but that would be simpler to use and require less effort than full-scale video editing.

USER STUDY
As a starting point for the design of the Mobile Multimedia Presentation Editor, we conducted a user study on user habits in composing, sending, and editing multimedia messages. The study was conducted in Finland in 2003. The objectives of the study were two-fold. First, the study aimed to gain understanding of current behaviors and practices related to the usage of multimedia messages. Special attention was paid to composing and editing of multimedia messages. Second, the study aimed to provide first insights into users' motivations for employing, and requirements for composing, more complex multi-page multimedia messages, i.e., multimedia presentations. Overall, the study produced design principles for the Mobile Multimedia Presentation Editor.

Participants and Procedure
Ten participants were recruited for the study. The participants consisted of three groups: a friends group and two family groups. The friends group comprised three psychology students, all 24 years old. The family groups included adults between 24 and 60 years old, consisting primarily of young couples with small children and their grandparents. The participants in the family groups had various educational backgrounds, including technology, economics, politics, and health care.

The participants were provided with Nokia 7650 mobile devices to be used as their primary mobile devices for a period of four weeks. The devices were equipped with two commercially available image editing applications, Image Plus and Camera FX. With these editing applications, the participants could add speech bubbles, frames, and properties to the images and apply some image effects like adjusting the colors or mirroring the image. Additionally, a set of ready-made animations (animated GIF graphics) was pre-installed on the devices (see Fig. 1). These animations were designed to serve some particular purposes, like birthdays, or expressing moods, such as being happy or sad. The idea was that the user could customize the animation with a personal text or audio comment and then send it as a multimedia message.

Figure 1. Examples of ready-made animations pre-installed on the devices provided to the study participants.

In the beginning of the study, the aim and the tasks were explained to the participants. They were instructed to use the devices as they usually did in their everyday lives and encouraged to use multimedia messages in those situations where it felt natural for them. Also, editing the images before sending them was encouraged. In addition, the participants were instructed on how to use the editing applications, in order to avoid potential frustration due to the known usability problems with the applications.

After a four-week period of using the devices with the installed editing applications and animations, individual interviews were conducted. The interviews included two parts. In the first part, the experiences around the usage of multimedia messages and image editing applications were discussed in detail. The interviews consisted of the following themes: general experiences of multimedia messages, context of use, message dialogs, content types (like text, images, and audio), and editing of messages. The participants were asked to explain some of the messages they had sent and received during the usage period in order to get a realistic idea of the usage. In this paper, we mainly report the results related to the composing and editing behavior as they affected the design of the Mobile Multimedia Presentation Editor the most.

Figure 2. Example of a comic message created by a study participant.

The second part of the interview was designed around a task of composing a “comic message” (see Fig. 2).
We wanted to investigate the first intuitions of usage needs of more complex multi-page multimedia messages, and thus gave the participants a free hand to come up with whatever kind of message they liked. Hence, we provided them with pens and an empty comic template on paper. Additionally, some ready-made image materials were given as a backup for those who felt uncomfortable with drawing. The participants were asked to choose two or three situations where they had sent a multimedia message in real life during the study period and to compose a comic message to the original recipient(s) in that situation. This was done in order to make the situation in which the message was sent and the contents of the message as realistic as possible. However, the participants were allowed to choose any other situation if they so wished. While drawing the message, the participants were encouraged to think aloud, in order to shed light on the process of composing the message.

All interviews were video recorded and transcribed for data analysis. In addition, the comic messages that the participants produced were analyzed in detail. Finally, 135 multimedia messages that were sent during the study period were analyzed.

Results
This section presents the main results of the user study, emphasizing the composing and editing of multimedia messages as well as the intuitions of composing multi-page multimedia messages. The average frequency of sending multimedia messages varied from one per week to two per day during the study period. Multimedia messages contained mostly images. Other content types, like animations or voice or music, were rarely attached to the messages.

Composing and Editing Messages
Of the analyzed 135 messages, 90% contained text in addition to the image. In 30% of the messages, the image was edited. The most common ways to edit the image were to attach a speech bubble to the image (44% of the messages) and to attach a photo frame (37% of the messages). Only two of the analyzed messages included ready-made animations. One of the participants reported that:

“It was boring that everyone had the same animations. I think that they are not personal.”

This quote brings out one of the most important motivations to edit messages, namely making the message more personal and thus more valuable.

Generally, participants’ ways to compose and edit multimedia messages can be described as unorganized or even chaotic. Often, the participants did not have a clear image of the end result in mind when they started to compose a message. Especially when editing messages with Image Plus, the participants tried various options, adjusted the position of a speech bubble, or tried out different photo frame styles. Also, the added items affected the content of the text. Thus, the user’s idea of the end result developed during the process of editing the message.

Images that were sent needed to be of high quality and it was common that several images were taken in order to get a good and representative one for the message. A criterion was that the image needed to convey the point clearly.

One of the interesting factors we wanted to examine was the time used for composing a message. On average, the minimum time for composing a message was around 15 seconds and the maximum time around 15 minutes, according to the participants’ own estimations. The participants reported that they used two to five minutes for composing a message on average. However, a more important factor affecting the decision on whether to start editing a multimedia message or not was the, often unconscious, idea of the ability to complete the message within the available time. One participant put it as follows:

“I think about the time I use for a message unconsciously, I mean, can I make it within the time I have.”

Intuitions of Using Multi-Page Messages
All of the study participants composed one to three comic messages during the interview sessions. The contents of the messages were mainly different kinds of events, e.g., parties, birthdays, exhibitions, and trips. In addition, greeting cards, driving instructions, and instructions for hobbyists (e.g., how to assemble a part of a miniature model) were chosen as topics for the comic messages. Further, the participants seemed to have a tendency to compose “before and after” type messages, e.g., describing the situation before and after attending a party.

The main content of the comic messages was images. The role of text was smaller than in the actual multimedia messages created during the study period. Attaching speech bubbles seemed to be an intuitive way to add text to the messages as they enabled fitting all material into one screen. Some of the participants brought up that they would also like to use background music to enliven these kinds of messages.

When discussing the participants’ thoughts of using multi-page messages in their everyday lives, a typical message was seen to contain two to three images with little or no text. According to the participants, more complex multi-page messages would be created only on special occasions, like during a trip. Such messages would consist of several images taken over a longer period of time. Some of the participants commented that they would like to use video clips in the messages as well as animations that were downloaded from the Internet.

Design Principles
The study provided useful insights for designing and developing the Mobile Multimedia Presentation Editor. Four main design principles were derived as a result of analyzing the interviews, the comic messages, and the actual messages that the participants had sent during the study period. The design principles were as follows: flexibility, awareness of the task context, expressiveness, and personalization.

The principle of flexibility was mainly derived from participants’ unorganized manner of composing both the messages they actually sent during the usage period and the comic messages they created during the interviews. Also, the fact that the idea of the end result typically changed during the composition process led us to adopt this as one of the main design principles. By flexibility we mean that the presentation editor needs to support composition of multi-page messages in an unorganized way. Users need to be able to move back and forth in the presentation easily and to have means to determine the contents and to be able to edit the messages according to their current idea. On the other hand, this means that template-based editors that force users to proceed with the presentation in a certain order from the beginning to the end do not meet the flexibility principle.

The second principle was awareness of the task context and it is tightly related to the flexibility principle. When composing a considerably more complex entity than a single-page multimedia message, the users need to be well aware of the task they are doing and the part or the page of the presentation they are in. This becomes essential when the users are allowed to compose the presentation freely, and is further emphasized by the limited input and output capabilities of mobile devices and the complex nature of the editing task.

The third design principle was expressiveness. The participants paid particular attention to the expressiveness and elegance of the messages they were about to send. The images needed to be of high quality and the message needed to clearly convey the idea or “the point” that the participant had in mind. Each page had to fit on the screen so that the receiver would not need to scroll to view it. In single-page messages sent during the study period, this was clearly a problem; sometimes the receiver did not notice either the text explaining the image or even the image itself.

The fourth principle was personalization. A personal look of the messages was highly appreciated among the participants and ready-made graphics and animations were rarely used because they were seen as too generic and common. Therefore, the presentation editors should provide a rich set of editing features and these features should support editing of all the elements in the presentation, also including ready-made content items like frames or music.

DESIGN
Based on the design principles derived from the user study, we began the design of the Mobile Multimedia Presentation Editor application. The design proceeded incrementally from rough sketches to detailed designs through numerous iterations. In early phases of the design, informal usability evaluations with hand-drawn prototypes were made between iterations and the results were used to improve the design. As the design matured, more detailed prototypes were prepared and systematically evaluated in the laboratory environment (see section Usability Evaluations). While several alternative designs were considered during the design process, we present here only the resulting final design.

Constraints
The design of the Mobile Multimedia Presentation Editor was primarily constrained by the restricted input and output capabilities of mobile devices. Our target device was Nokia 6600 (see Fig. 3), which provided a representative example of the typical characteristics of current mobile devices. Nokia 6600 has a 2.2-inch 65,000-color graphical display with the resolution of 176x208 pixels. The primary input mechanism is the keyboard, consisting of a four-way navigation key set with a selection key in the middle, two softkeys, whose functions are context dependent and indicated with textual labels on the display, and an extended alphanumeric ITU-T keypad. The user interface is based on S60 (http://www.s60.com/). The device has a built-in microphone for recording audio and a VGA resolution camera for capturing still images and low resolution video clips.

Figure 3. Nokia 6600.

Application Overview
Figure 4 shows the top-level structure of the user interface of the Mobile Multimedia Presentation Editor application. The user interface consists of several full-screen views and a number of dialogs that together form a complete application. The arrows illustrate the possible transitions between the views and the dialogs.

The Presentations View allows the user to browse and manage the presentations stored in the device and to open presentations for viewing and editing. It also allows the user to create new presentations. The presentations are listed in reverse chronological order, and for each presentation, the list shows a thumbnail icon, a textual name, and the time and date when the presentation was last modified. The thumbnail icon is an image of the first page of the presentation (scaled down and cropped). Additionally, the list contains a special item “Create new”, which creates a new presentation and moves to the Edit Presentation View.

The Play Presentation View enables the user to view the selected presentation. The presentation player displays over the full screen when playing a presentation. The user can pause the playback or re-start the playback from the beginning. The user can also open the presentation being currently viewed for editing in the Edit Presentation View.

(Diagram nodes: Presentations View, Play Presentation View, Edit Presentation View, Insert Object Dialog, Preview Presentation View, Edit Object Dialog.)
Figure 4.
The top-level structure and navigation of the Mobile Multimedia Presentation Editor user interface.

The Edit Presentation View is the primary view of the Mobile Multimedia Presentation Editor. It provides the user with the tools for creating and editing presentations. The Edit Presentation View and the related dialogs are presented in more detail in the next section.

The Preview Presentation View allows the user to preview the presentation being currently edited in the Edit Presentation View. The user can preview the presentation at any point during the editing process and then return to the Edit Presentation View. The Preview Presentation View provides functionality similar to the Play Presentation View.

Edit Presentation View
The primary challenge in the design of the Mobile Multimedia Presentation Editor was how to enable the creation of sophisticated multimedia presentations on mobile devices with constrained input and output capabilities. In the Mobile Multimedia Presentation Editor application, this functionality is provided by the Edit Presentation View.

Technically, the presentations created with the Mobile Multimedia Presentation Editor are represented in the Synchronized Multimedia Integration Language (SMIL, http://www.w3.org/AudioVideo/), which is a W3C standard for authoring audio-visual presentations. SMIL has been selected by 3GPP as the standard format for the multimedia messaging service (MMS) and it is widely supported by currently available mobile devices. SMIL is a very sophisticated language and even the restricted mobile profile provides a huge number of different features and options that can be used when creating presentations. The first important design decision we made was that we did not try to design a user interface that would support all the features provided by the SMIL language. Instead, we aimed to understand the kinds of features the users wanted to express in the presentations they created based on the findings of the user study. Then we tried to find user interface solutions for these presentation features and finally to describe the features in the SMIL language in a standards-compliant and interoperable way.

A second important design decision was related to the editing features to be included in the editor. There is a very large number of potential editing features and we wanted to provide the users with tools for creating rich and expressive presentations. However, it was clear that a mobile user interface could not incorporate a very large number of features. Therefore, we adopted the following strategy: instead of providing a number of advanced and very specialized features, we aimed to identify the fundamental basic editing operations, to provide as much freedom in using each of the individual operations as possible, and to enable freeform combinations of the different basic operations. As an example, we decided to support free and exact positioning of visual objects on the display (instead of defining the position, e.g., with higher-level concepts like top, bottom, left, right, and center). Further, we allowed the move position operation to be freely combined with any other operation, e.g., rotation or scaling of the object.

Several alternative approaches were considered for modeling the temporal dimension of the multimedia presentations. Finally, we chose to model the presentation as a sequence of pages. Each page consists of a number of visual objects (images, speech bubbles, etc.), which remain visible for the entire duration of the page. While this approach sets some restrictions on the structure of the presentation, it was considered to be easy to understand and adequate for the vast majority of the presentation types identified during the user study. On the audio side, we decided to allow free temporal positioning of the sound clips: the sound clips may begin and end at any time independent of the page durations and the sound clips may also freely extend over several pages. This enables, e.g., background music that plays through the entire presentation.

By pressing the selection key, the user can bring up a context menu, which shows the available editing commands for the currently selected object.

The editor user interface provides a generic structure that can support a wide range of different visual object types. In the application design phase, the following object types were designed in detail: images (e.g., photographs), stickers (small icons, e.g., smileys), texts, and text bubbles (e.g., speech or thought bubbles). The design could easily be extended to support other visual object types, e.g., video or image frame objects. The editor defines a number of basic editing operations that are supported by all visual objects, including move, scale/resize, rotate, and remove. Additionally, each object type can define object-type-specific editing commands, e.g., selecting the font and color of a text object.

For visualizing the temporal structure, we decided to use a modified and enhanced version of the timeline described by Jokela et al. [3] (see Fig. 5 a). As we wanted to enable the creation of full screen presentations, the timeline floats on top of the presentation on a translucent background. When other editing operations are activated, the timeline is automatically hidden if it is not needed. The timeline consists of two tracks: the upper track shows the pages and the lower track shows the audio objects. The width of the page or audio object rectangle represents the duration of the page or object, and the duration of the currently selected page or object is also indicated with a small number on top of the rectangle.
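The page-and-audio model and the two-track timeline described above can be illustrated with a small data-structure sketch. This is not the authors' implementation: the class names, field names, the overlap rule in `pages_covered_by`, and the 160-pixel track width are all assumptions made for illustration. Only the concepts come from the paper: pages shown in sequence, visual objects that last for their whole page, audio clips positioned freely in time, and rectangle widths proportional to duration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VisualObject:
    kind: str  # "image", "sticker", "text" or "text_bubble" (types named in the paper)
    x: int     # free, exact position on the page, in pixels
    y: int


@dataclass
class Page:
    duration: float  # seconds; all objects stay visible for the whole page
    objects: List[VisualObject] = field(default_factory=list)


@dataclass
class AudioClip:
    name: str
    start: float     # seconds from the start of the presentation
    duration: float  # may run across page boundaries


@dataclass
class Presentation:
    pages: List[Page] = field(default_factory=list)
    audio: List[AudioClip] = field(default_factory=list)

    def page_spans(self) -> List[Tuple[float, float]]:
        """Return (start, end) times of each page, laid end to end."""
        spans, t = [], 0.0
        for page in self.pages:
            spans.append((t, t + page.duration))
            t += page.duration
        return spans

    def pages_covered_by(self, clip: AudioClip) -> List[int]:
        """Indices of the pages during which the clip is audible."""
        clip_end = clip.start + clip.duration
        return [i for i, (start, end) in enumerate(self.page_spans())
                if clip.start < end and clip_end > start]


def track_rectangles(spans, visible_seconds, track_width_px):
    """Map (start, end) spans to (left, width) pixel rectangles on a track."""
    scale = track_width_px / visible_seconds
    return [(round(s * scale), round((e - s) * scale)) for s, e in spans]


story = Presentation(
    pages=[Page(5.0), Page(5.0), Page(10.0)],
    audio=[AudioClip("music", start=0.0, duration=20.0),
           AudioClip("comment", start=6.0, duration=3.0)],
)
print(story.pages_covered_by(story.audio[0]))  # [0, 1, 2]
print(story.pages_covered_by(story.audio[1]))  # [1]
# 20 s is the initial timeline span mentioned in the paper; 160 px is made up.
print(track_rectangles(story.page_spans(), 20.0, 160))
```

In this sketch the background-music clip overlaps every page while the short comment clip overlaps only the second one, which mirrors the statement that sound clips are independent of page durations.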
The timeline initially shows a duration of 20 seconds. If the presentation becomes longer, the timeline duration is increased step by step (30 seconds, 45 seconds, 60 seconds, etc.). The timeline always shows the shortest of these durations on which the presentation still fits.

The user may use the left and right navigation keys to move to the previous or the next page or audio object on the timeline and the up and down navigation keys to move between the page track and the audio track. If an audio object is selected, the Edit Presentation View shows an audio icon and the name of the audio object (see Fig. 5 b). If a page is selected, the Edit Presentation View shows a snapshot image of the page (see Fig. 5 a). The snapshot image covers the whole screen, and all visual objects on the page are visible.

If the user navigates up from the page track, the focus is transferred to the visual object closest to the bottom of the page (see Fig. 5 c). The user can then continue navigating up (and down) to traverse the other visual objects on the page. Navigating left or right moves to the previous or the next page, respectively. This simple navigation method was selected because, based on the user study, the typical number of visual objects on a page was expected to be small and the users commonly moved back and forth in the presentation, so fast navigation between the pages was seen as essential.

Figure 5. Edit Presentation View with the focus on a) a page, b) an audio object, and c) a sticker object, and d) with the Insert Page dialog active.

During the initial evaluations of the application design, we found that in some cases the users repeated long series of commands again and again. This was especially common when inserting new pages into the presentation. For example, the user would first insert an empty page, then select and insert an image object, position it, then insert a text object, and finally edit the text. Fono et al.
report similar observations with Sandboxes [2]. To reduce the needed user effort, we introduced a set of standard page layouts and wizard-like functionality for the insert page operation. When the user selects the Insert Page command, a dialog showing a set of possible page layouts is displayed (see Fig. 5 d). If the user selects, e.g., the “Image and Text” page layout, the editor adds a new page to the presentation and then opens another dialog for selecting the image. When the user selects the image, the editor automatically positions the image on the page and activates the move image command for fine-tuning the position if needed. Finally, when the user has positioned the image, the editor activates the edit text command for editing the text. Note that the page layouts are only a tool for efficient creation of the initial version of the page. The user can later freely modify any part of the page.

USABILITY EVALUATIONS

Participants and Procedure

In order to assess the usability of the application and to gather initial responses from the users, we arranged a series of laboratory usability evaluations during the application development. In particular, we were interested in how well the application design met the design principles derived from the user study. A total of four usability evaluations were conducted, spread over a period of 18 months in 2003-2004 as the development of the application proceeded. Each usability evaluation was carried out with six participants, so overall a total of 24 users participated in the evaluations. We recruited both persons working in our laboratory and external participants for the tests. None of the participants had used the application before the evaluations. Several different devices were used in the evaluations. The first evaluation was made with Nokia 3650, while Nokia 6600 and Nokia 6630 devices were used in the later evaluations. Each device was the latest model available at the time of the evaluation. We decided to utilize the latest models in order to benefit from the increasing hardware capabilities of the devices.

The evaluations were carried out following a standard procedure in our usability laboratory. The participants completed a set of pre-defined tasks. We used two types of tasks in the tests. Descriptive tasks were presented in the form of use scenarios defining high-level objectives (e.g., to create a Valentine’s Day greeting) that the participants were free to complete using the application in any way they liked. These tasks were used to find out how the participants understood the structure of the application, and they also helped us to understand in which order the participants carried out different tasks. Other tasks were more detailed and focused on finding usability problems related to particular features. The participants also answered questionnaires and were briefly interviewed by the moderator about their experiences with the application. Each session lasted for approximately one and a half hours. All necessary content (images, audio clips, etc.) for completing the tasks was pre-installed on the device.

Results

The participants were able to complete the given tasks without major difficulties. Even though the idea of creating presentations was new to all participants, they rapidly learned how to use the editor, and at the end of the evaluation session they were smoothly creating new presentations. Overall, the participants considered the editor to be easy to use and commented that they would like to use it in their everyday lives.

Considering the first design principle, flexibility, the participants enjoyed the freedom that the application provided them on how to create presentations. The participants were able to do editing operations in any order they liked and modify any part of the presentation. Navigation between the pages was considered straightforward and fast. While it was possible to preview the presentation at any point by using the Preview View, the participants often used the fast page navigation feature in the Edit View to preview the presentation. Still, some needs that we were not able to predict during the design emerged in the evaluations. For example, our design supported only importing of photographs captured earlier, while some participants would have wanted to take pictures during editing directly from the Edit Presentation View. Some participants also requested keyboard shortcuts for the most common operations.

Regarding the awareness of the task context principle, the page-based structure that was used to model the temporal dimension of the presentation was well understood. The primary tool for indicating the current task context in the editor user interface was the timeline. For approximately half of the participants, it was immediately obvious how the timeline works; for the rest, it took a moment to figure it out. After that the participants considered the timeline to provide an illustrative visualization of the presentation structure and to be intuitive to use. The main challenges were related to navigation between the page and audio tracks, which we addressed by improving the timeline visualizations. Scaling of the timeline depending on the presentation duration did not confuse the participants.

Related to the expressiveness and personalization principles, the participants commented in the post-test interviews that they were able to create the kinds of presentations they wanted to and that they were satisfied with the presentations they created. Some of the participants requested additional editing features, including more advanced image and audio editing functions and more detailed control over the appearance of text.
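The timeline scaling that did not confuse the participants can be sketched as a simple lookup over the step values. The paper names the first steps (20, 30, 45, and 60 seconds) and then says "etc."; the steps above 60 seconds, the function name `timeline_window`, and the fallback of rounding up to whole minutes are assumptions made for this illustration, not the editor's documented behavior.

```python
import math

# Steps named in the paper, followed by assumed continuations.
TIMELINE_STEPS = [20, 30, 45, 60, 90, 120]  # seconds; values above 60 are assumptions


def timeline_window(presentation_seconds: float) -> int:
    """Return the shortest timeline duration on which the presentation fits."""
    for step in TIMELINE_STEPS:
        if presentation_seconds <= step:
            return step
    # Assumed fallback for very long presentations: round up to a whole minute.
    return math.ceil(presentation_seconds / 60) * 60


for seconds in (0, 12, 20, 21, 44, 59):
    print(seconds, "->", timeline_window(seconds))
```

With this rule an empty or short presentation is shown on the initial 20-second timeline, and the visible window only grows when the presentation no longer fits, which matches the behavior described for the timeline view.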
In general, the results of the usability evaluations were encouraging. Numerous minor usability issues were identified and fixed, related to, e.g., menu structures, terminology, navigation, and visualizations.

FIELD TRIAL

Participants and Procedure

While the laboratory tests allowed us to evaluate the usability of the Mobile Multimedia Presentation Editor, i.e., how well the users could achieve the given tasks with the application, they provided less insight on the utility of the application, i.e., how well it met the user needs in a more realistic environment of use. Of the design principles, the laboratory evaluations provided a good tool for evaluating the usability-related design principles of flexibility and awareness of the task context, but they provided only limited insight on the more utility-related principles of expressiveness and personalization. To shed more light on these aspects of the application, we arranged in 2004 a field trial during which a group of users used the Mobile Multimedia Presentation Editor in their everyday lives for a period of several weeks.

We recruited a total of 15 participants for the field trial. The participants were divided into two groups: a family group and a friends group. The family group consisted of relatives and friends of a couple and it had a total of nine participants. There were eight female participants and one male participant. During the trial, the actual number of participants was larger since some presentations were created, e.g., by the participants’ husbands, who were not originally recruited in the trial. The friends group was formed around two brothers and their friends. In this group, there were two female and four male participants. In both groups, the participants formed an existing social network, which enabled natural communication and provided reasons to communicate. This was considered to be important due to the nature of the application, which relied on personal communication. The ages of the participants varied between 18 and 56 years and, on average, they had five or six years of experience of using mobile devices. However, none of the participants had used multimedia messages before the trial. The participants had both technical and non-technical backgrounds.

All participants were provided with Nokia 6600 mobile devices with the Mobile Multimedia Presentation Editor prototype to be used as their primary mobile devices. The mobile call and data charges were covered during the trial.

The field trial period lasted for 42 days including introduction sessions for both groups. Every participant was interviewed twice during the trial. The first interview was made approximately three weeks after the beginning of the trial and the participants were interviewed again at the end of the trial. In addition, the participants were encouraged to keep a diary of their experiences and the Mobile Multimedia Presentation Editor was modified to record a log of all application usage. We did not give any specific tasks to the participants during the trial, but advised them to create and send presentations as they liked.

Results

The field trial resulted in a large amount of detailed data about the usage of the Mobile Multimedia Presentation Editor. In this paper, it is only possible to highlight some of the key findings.

During the trial period, the participants created a total of 372 presentations. After the initial burst of activity (126 presentations were created during the first week), the use of the application stabilized around 30-60 presentations per week. There was quite a lot of variation in the activity of use between the participants. The two most active users created more than 50 presentations, while six users created ten or fewer. The remaining seven users created between 20 and 40 presentations each.

The presentations were primarily created for sharing with others and the presentations were typically shared immediately after they were completed. Of the created presentations, 88% were sent to others as multimedia messages. The presentations were primarily used for personal communication: 67% of the shared presentations were sent to a single recipient only. Of the remaining presentations, only a few were shared with more than four persons. In addition to the study participants, the presentations were shared also with other persons, mostly with friends and work colleagues.

Presentations were created for a wide variety of purposes. The topics included sharing of everyday events, invitations to get together, and entertaining and cheering up the recipient (e.g., inside jokes). The presentations were also used for utility purposes, e.g., shopping or homework. Of the created presentations, 64% contained more than one page. Most of these longer presentations consisted of 2-4 pages. The longest presentation had 18 pages. The average time to compose a presentation was approximately five minutes, which the participants did not consider too long.

Considering the design principles of flexibility and awareness of the task context, the results supported the earlier findings in the laboratory evaluations. In the interviews, the participants indicated that the application was easy to use and that the threshold of using it was low, resulting in a large number of created presentations.

Regarding the principles of expressiveness and personalization, the participants considered the presentations created with the application to be more expressive and personal than normal multimedia or text messages. The participants continued to use the application also over the last weeks of the trial, shared a high percentage of the created presentations with others, including persons outside the trial groups, and created presentations about a broad range of topics. All these findings provide further indicators that the application provided the participants a versatile tool for more expressive and personal communication.

Creating longer presentations was considered to be challenging but rewarding. The average amount of text in the created presentations was approximately 30 characters, suggesting that other media forms were used to substitute text in the created presentations. Audio was found to be much more important during the field trial than what was indicated in the laboratory evaluations. Audio was used as a part of storytelling, as a structural element, and for creating the right atmosphere. The participants considered presentations that contained audio to be better than others.

DISCUSSION AND FUTURE WORK

Overall, the usability evaluations and the field trial provided encouraging results and the feedback we received from the participants was very positive. The evaluation results show that the application design met the initial design principles well: the participants found the application to be easy and intuitive to use and felt that it enabled richer and more expressive and personal communication than what was possible with multimedia or text messages. The participants actively utilized many of the advanced editing features that the application provided: the majority of the created presentations had more than one page, audio was found to be an important element, and the role of text was smaller than in traditional messaging. The results also demonstrate that with a careful user-centered design approach, it is possible to overcome the input and output constraints and enable the creation of sophisticated multimedia presentations on mobile devices.

The work presented in this paper has focused on how to utilize audio-visual stories created with mobile devices in personal communication in the leisure-time context. Beyond this, we see several other potential applications for multimedia presentations created with mobile devices and plan to extend our work to explore them. Instead of sharing the created presentations as personal messages, the presentations could be published to the Internet on mobile blogs, distributed as podcasts, or alternatively shared in proximity using short-range communication technologies. The Mobile Multimedia Presentation Editor might also be a useful tool for professional purposes, e.g., it might provide a tool for journalists to create, edit, and publish their stories directly from the mobile device without the need to transfer the material to other devices or to other persons for editing.

CONCLUSION

We have described the design of the Mobile Multimedia Presentation Editor, an application that makes it possible to author sophisticated multimedia presentations on mobile devices. The application enables the creation of rich and expressive audio-visual stories, supporting the creativity of the mobile users. As a starting point for the application design, we have presented a study on user habits in composing and sending multimedia (MMS) messages and, based on the results of the study, derived four design principles for multimedia editors on mobile devices: flexibility, awareness of the task context, expressiveness, and personalization. We have presented a concrete application design that supports these principles and implemented a functional prototype of the application on a Nokia 6600 mobile device. To validate and evaluate the application design, we have conducted a series of laboratory evaluations and a field trial. In these evaluations, the participants found the editor easy to use and indicated that it enabled richer and more expressive communication than the traditional mobile messaging techniques.

ACKNOWLEDGMENTS

We would like to thank Timo Koskinen, Andrei Popescu, Guido Grassel, Ari Koivisto, Mika Röykkee, Terhi Lukkari, Kaj Mäkelä, Erika Reponen, Tero Hakala, Ciaran Harris, and all the other persons who contributed to the development of the Mobile Multimedia Presentation Editor.

REFERENCES

1. Balabanović, M., Chu, L., and Wolff, G. Storytelling with Digital Photographs. In Proc. of the SIGCHI Conference on Human Factors in Computing Systems, 2000. Pp. 564-571.

2. Fono, D. and Counts, S. Sandboxes: Supporting Social Play through Collaborative Multimedia Composition on Mobile Phones. In Proc. of the Conference on Computer Supported Cooperative Work, 2006. Pp. 163-166.

3. Jokela, T., Karukka, M., and Mäkelä, K. Mobile Video Editor: Design and Evaluation. In Proc. of the Int. Conference on Human-Computer Interaction, 2007. Vol. 2: pp. 344-353.

4. Jokela, T., Mäkelä, K., and Karukka, M. Empirical Observations on Video Editing in the Mobile Context. In Proc. of the Int. Conference on Mobile Technology, Applications, and Systems, 2007. Pp. 490-497.

5. Kirk, D., Sellen, A., Rother, C., and Wood, K. Understanding Photowork. In Proc. of the SIGCHI Conference on Human Factors in Computing Systems, 2006. Pp. 761-770.

6. Koskinen, I., Kurvinen, E., and Lehtonen, T.-K. Mobile Image. IT Press, 2002.

7. Landry, B. and Guzdial, M. Learning from Human Support: Informing the Design of Personal Story-Authoring Tools. In Proc. of CODE 2006 [online]. Available: http://www.units.muohio.edu/codeconference/papers/papers/Landry_Guzdial.pdf

8. Madej, K. Towards Digital Narrative for Children: From Education to Entertainment: A Historical Perspective. ACM Computers in Entertainment, 2003. Vol. 1: 1. Pp. 1-17.

9. Mäkelä, A., Giller, V., Tscheligi, M., and Sefelin, R. Joking, Storytelling, Artsharing, Expressing Affection: A Field Trial of How Children and Their Social Network Communicate with Digital Images in Leisure Time. In Proc. of the SIGCHI Conference on Human Factors in Computing Systems, 2000. Pp. 548-555.

10. Salovaara, A. Appropriation of a MMS-Based Comic Creator: From System Functionalities to Resources for Action. In Proc. of the SIGCHI Conference on Human Factors in Computing Systems, 2007. Pp. 1117-1126.

11. Wu, C., Teng, C., Chen, Y., Lin, T., Chu, H., and Hsu, J. Point-of-Capture Archiving and Editing of Personal Experiences from a Mobile Device. Personal and Ubiquitous Computing [online]. September 2006.