Creating and Editing "Digital Blackboard" Videos using Pentimento: With a Focus on Syncing Audio and Visual Components

by Alexandra Hsu

S.B., C.S. M.I.T., 2013; M.Eng., C.S. M.I.T., 2015

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

June 2015

Copyright 2015 Alexandra Hsu. All rights reserved. The author hereby grants to M.I.T. permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole and in part in any medium now known or hereafter created.

Author: Department of Electrical Engineering and Computer Science, May 22, 2015
Certified by: Fredo Durand, Thesis Supervisor, May 22, 2015
Accepted by: Prof. Albert R. Meyer, Chairman, Masters of Engineering Thesis Committee

Abstract

Online education is a rapidly growing field, and with the desire to create more online educational content comes the necessity to easily generate and maintain that content. This project aims to allow for recording, editing, and maintaining "digital blackboard" style lectures, in which a handwritten lecture with voiceover narration is created. Current recording software requires that a lecture's audio and visuals be recorded correctly in one take or they must be re-recorded.
By utilizing vector graphics, a separation of audio and visual components, and a mechanism that puts the user in control of the synchronization of audio and visuals, Pentimento is a unique piece of software specifically designed for recording and editing online handwritten lectures.

Acknowledgements

I would like to thank the many people who helped me reach the completion of this thesis. First and foremost, I would like to thank my family for many years of support, especially my mom, Stephanie Maze-Hsu, for reading countless papers and constantly helping me get through everything. I would also like to thank my brother Robert Hsu for always being there for a laugh. Additionally, I would like to thank my friends, without whom I couldn't have gotten to this point. Most importantly, I would like to acknowledge Zach Hynes for his unending encouragement and support. Kevin Hsiue and Eric Kline both helped me immensely by giving feedback and motivation as I wrote my thesis. Professor Fredo Durand was a wonderful guiding force and mentor, and was extremely understanding about non-academic issues I faced during my undergraduate and graduate careers. Without his vision and his guidance this project would not have been realized. Jonathan Wang worked on this project with me and co-authored portions of this thesis; without him I don't think the Pentimento prototype would work. Finally, I would like to thank all of the medical professionals who have helped patch me together enough to hand in this thesis. There were times I thought I would never complete this document, and the fact that I have is a testament to my amazing support system. Thanks to all of those mentioned here and countless others, without whom I wouldn't have been able to complete my thesis.

Table of Contents

1. Introduction
2. Background Information
   2.1 Visual Edits
   2.2 Audio Edits
3. Related Work and Current Solutions
4. Project Goals
   4.1 Features
      4.1.1 Implemented Recording Features
      4.1.2 Implemented Editing Features
5. User Guide and Tutorial
   5.1 The Main Recording UI
      5.1.1 Recording Visuals
      5.1.2 Editing Visuals
      5.1.3 Recording Audio
      5.1.4 Editing Audio
   5.2 The Retimer
6. Code Organization Overview
   6.1 Lecture
      6.1.1 Lecture Model
      6.1.2 Lecture Controller
   6.2 Time Controller
   6.3 Visuals
      6.3.1 Visuals Model
      6.3.2 Visuals Controller
      6.3.3 Tools Controller
         6.3.3.1 Visuals Selection
   6.4 Audio
      6.4.1 Audio Model
      6.4.2 Audio Controller
      6.4.3 Audio Playback
      6.4.4 Audio Timeline
      6.4.5 Audio Track Controller
      6.4.6 Audio Segment Controller
      6.4.7 Audio Plug-in
   6.5 Retimer
      6.5.1 Retimer Model
      6.5.2 Retimer Controller
   6.6 Thumbnails Controller
   6.7 Undo Manager
   6.8 Renderer
   6.9 Save and Load Files
7. Future Work
   7.1 Future Features
      7.1.1 Recording
      7.1.2 Editing
   7.2 User Interface Additions
      7.2.1 Main Visuals Recording and Editing UI
      7.2.2 Audio and Retimer Timeline UI
   7.3 Student Player
8. Conclusions
Appendix A: Documentation
   A.1 Lecture
      A.1.1 Lecture Model
      A.1.2 Lecture Controller
   A.2 Time Controller
   A.3 Visuals
      A.3.1 Visuals Model
      A.3.2 Visuals Controller
   A.4 Tools Controller
   A.5 Audio
      A.5.1 Audio Model
      A.5.2 Audio Controller
      A.5.3 Track Controller
      A.5.4 Segment Controller
   A.6 Retimer
      A.6.1 Retimer Model
      A.6.2 Retimer Controller
   A.7 Thumbnails Controller
Appendix B: Example of Saved Lecture JSON Structure
References

1. Introduction

With the growing popularity of online education through programs such as Khan Academy and MIT's EdX, there is an increased need for an easier way to create educational video lectures. This project strives to simplify the process of creating these online videos and to make editing them much easier. In current solutions, an entire incorrect segment must be re-recorded to edit the lecture. There are many advantages to separately editing the audio and written visual components of an online lecture, such as content being able to be updated as years pass rather than becoming obsolete. It will also be less frustrating and time consuming for educators to record presentations that do not have to be done correctly in one take.
The project focuses on the popular style exemplified by Khan Academy, where an educator writes notes on a virtual blackboard as he or she gives a lecture or explains a topic. With these "blackboard lectures," students see the handwritten notes and drawings produced by the lecturer while hearing a voiceover narration explaining the written content. Currently, these videos are created by recording the written notes using extremely basic tools, which essentially use a tablet and pen as input and screen capture to record the strokes in a simple paint program. Although this method is an improvement over merely taking a live video recording of the lecturer and the blackboard they are writing on, it still requires the educator to get everything correct in a single pass because there is no editing capability beyond cuts. The technology created by the Pentimento project makes it easy to edit and update the content of each lecture. Pentimento was started by MIT Professor Fredo Durand for the purpose of addressing the specific needs that arise when editing handwritten lecture content. The software is currently under development by Professor Durand, and my thesis work involved taking the Pentimento prototype (which can only be run on Mac OS) and converting it into a web-based tool that can be used to record, edit, and view online lectures.

2. Background Information

There are some key differences between editing a handwritten lecture, such as the ones that can be seen on Khan Academy or EdX, and editing a standard video or movie. Typical movies can be edited with cuts, and if there are errors it often makes sense to re-record the entire scene or video. However, handwritten lectures introduce different types of errors and corrections that could ideally be fixed without having to remake a segment or, worse, the entire lecture. In addition, handwritten lecture style videos often benefit from recording and editing the audio portions separately from the visuals.
This minimizes cognitive load on the speaker while recording [4], as well as providing the ability to independently correct mistakes that occur only in the audio recording or only in the visuals. In addition to allowing for temporal editing, such as moving visuals to a different time in the lecture, the new editing capabilities introduced by Pentimento facilitate correction of the following types of errors:

2.1 Visual Edits

• Correct existing lines: While writing, a lecturer will often make a mistake that they would like to correct later, such as accidentally writing "x=…" instead of "y=…". In these cases, the software allows for modification of the strokes to change the "x" into a "y" without altering the timing of the writing or the synchronization with the audio.

• Insert missing lines: As the lecture is recorded, the lecturer may omit something that they would like to add later, such as leaving the prime off of a variable (writing "x" instead of "x′"). While editing, this stroke can be added at the appropriate time in the lecture without altering the timing of the following strokes or the synchronization with the audio. This can also be extended to include entirely new content, such as adding a clarifying step in a derivation.

• Move/resize drawings and text: Sometimes when viewing the video, the lecturer discovers that it would make more sense to arrange the text/visuals differently, or that a certain visual should be made bigger or smaller. While editing, they have the ability to rearrange and resize these visual components without altering the timing of the strokes or the synchronization with the audio.

2.2 Audio Edits

• Re-record and sync: This feature gives the lecturer the ability to record only the audio portion for a section of the lecture and have it remain synced with the visuals for that portion. This could be the first-pass recording (e.g.
recording audio after visuals have already been drawn) or an edited audio recording to correct a spoken error made during the first pass of the lecture.

• Eliminate silence: Often it takes longer to write something than to say it, which can lead to long silences in the lecture. Pentimento allows the visuals to be sped up to fill only the time that corresponds to speech, eliminating these silences.

These differences require alternate tools and editing capabilities that currently available standard video editing software cannot provide. Pentimento strives to allow for these different types of video alterations and aims to make it extremely easy for lecturers or others to edit their videos quickly and efficiently.

For visual edits, this is accomplished by tracking the user's drawing and writing inputs and representing the strokes as vectors, which can be modified later. By using a vector representation, the position, size, and speed at which the text and drawings are presented can be altered and updated after the recording has taken place. The ability to move and edit drawings and text allows the presenter to fix or change the focus of a section of the video without re-recording the whole thing. Pentimento uses the vectors to represent the lecture in a vector graphics format, which can be edited much more simply than the current representations in the form of raster graphics. Raster graphics are images that are displayed as an array of pixels (i.e. on a computer screen or television) [6]. Vector representations offer various advantages over raster graphics, such as compatibility with differing resolutions and ease of modification based on the construction of the vectors [10]. The vectors can be formed either by storing two points and the connecting line segment, or by storing a single point with an associated magnitude and direction [2].
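As an illustration of the first construction, a stroke can be modeled as an ordered list of timestamped points, with edits applied by changing parameters in place. This is only a sketch under that assumption; the names (makeStroke, translateStroke, scaleStroke) are hypothetical and do not reflect the actual Pentimento code.

```javascript
// Hypothetical sketch: a vector stroke as an ordered list of timestamped
// points. Spatial edits modify x/y parameters in place, while the timing
// field t is left untouched, so audio synchronization is preserved.
function makeStroke(points) {
  // points: [{x, y, t}, ...] where t is the recording time in milliseconds
  return { points: points.map(p => ({ ...p })) };
}

// Move a whole stroke without altering when each point appears in playback.
function translateStroke(stroke, dx, dy) {
  for (const p of stroke.points) {
    p.x += dx;
    p.y += dy;
  }
}

// Resize a stroke about an origin; again, only spatial parameters change.
function scaleStroke(stroke, factor, originX, originY) {
  for (const p of stroke.points) {
    p.x = originX + (p.x - originX) * factor;
    p.y = originY + (p.y - originY) * factor;
  }
}
```

Because only the stored parameters change, a move or resize never requires re-recording or redrawing the stroke.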
Both of these constructions allow changes to be made fairly efficiently, since the image does not have to be redrawn; instead, a parameter in the construction simply has to be modified to fix the problem.

3. Related Work and Current Solutions

The current process for editing videos of this type is not so simple. Even the preeminent video editing tools (such as Apple's Final Cut Pro, Adobe's Premiere, or Avid [4]) only allow traditional raster graphics to be edited and moved, and do not facilitate the modification of handwritten videos stored as vectors. Some promising video editing work has been done to allow removal of segments of a video (selected from the video transcript) and then to seamlessly transition between the portions of the video before and after the deletion [3]. However, these sorts of editing capabilities only address some of the difficulties with editing handwritten lectures. While these tools would be useful for changes such as removing long silences that occur while writing or helping to correct the speaker's mistakes, they are unable to address and correct errors in the writing or help make significant changes to the content of the lecture. Editing these videos isn't the only challenge with current technologies; even creating handwritten content for teaching purposes is somewhat difficult with currently available tools. There are very few ways to record, edit, and view freehand drawings [4]. Although there is some ability to animate vector graphics with formats like SVG [9], the ability to easily maintain audio synchronization is not supported. More commonly, Flash animation can be used to give the appearance of real-time handwriting in web browsers [8]. However, this can be complex, often requiring animated masks or other techniques [1], which can be difficult and requires far more effort and processing than simply recording a video of the handwritten content as it occurs.
There is immense desire to create the appearance of handwritten lectures, and software has even been created to automatically generate animations that give the appearance of handwriting [5]. However, none of these solutions allow for real-time recording and post-process editing of the actual writing in the ways that are necessary to record an effective video lecture. The Khan Academy style videos are recorded using technology available for capturing live handwritten lectures. These are usually recorded using a screencast method, which involves digitally capturing a certain portion of a virtual whiteboard screen during a live lecture (i.e. capturing the screen as the lecture is written). These methods can produce videos similar to the style we hope to achieve, but they often require extensive preparation to plan out exactly what will be said and when to say it, to avoid having to re-record large portions of the lecture [7]. As mentioned above, a key feature of our software is the capability to easily edit or update lectures after they have been recorded, which is not possible with the screencast methods (especially separately editing the visual and audio components while keeping them synced in time).

4. Project Goals

There is a clear need and audience for software that allows for straightforward recording and editing of digitally handwritten video lectures. Professor Durand has been developing this software, but his current prototype version is only compatible with Mac OS. In order to make the software easily accessible and widespread, we are aiming to create a web-based version to allow for easy recording, editing, and viewing of these online video lectures. There are many advantages to a web-based application that we considered when deciding to develop a second prototype. Most importantly, it is accessible to anyone regardless of the operating system they are using.
Secondly, many existing premier video editing tools are extremely expensive and cannot be used across platforms. Another advantage is that web-based tools allow for easier collaboration because they do not require all parties to have the same software. Ultimately, we hope to have an available web version that lecturers can easily use to record one of the handwritten whiteboard style lecture videos. The web version of Pentimento was written using JavaScript, HTML5, and CSS, as well as additional libraries including jQuery, jCanvas, wavesurfer.js, and spectrum.js. Using these technologies we have created a website prototype where lecturers can produce videos that can then be easily edited and updated after recording.

4.1 Features

To create the web version of Pentimento there is a minimum feature set that must be implemented to make the program useful and effective. As I worked towards my thesis, I assisted in designing and implementing the main interface and the key features needed to make the web version of Pentimento successful. The key features will be updated as users actually test the website and, in theory, iteratively refined to match user needs. New features will also be implemented by future students. There are two components of the program: recording and editing. Each has specific features that make the website functional, in addition to other features that would be nice for a user to have available but were not essential for the first prototype (discussed in the future work section).

4.1.1 Implemented Recording Features

Lecturers require the capability to record both the audio and visual components of their lectures. During the recording phase the lecturer writes and draws the visual components of the lecture as well as recording the corresponding audio explanation. To support these tasks the software must provide the following features:

Record Button: This allows the lecturer to start the recording.
Stop/Pause Button: The lecturer is able to stop the recording at any point (and then resume recording from the same point later).

Pen Tool: The main use of the software is to create handwritten lectures, and the pen is the basic input tool used to write and create the visual content.

Recording Canvas: The writing and drawing must be done on something resembling a virtual whiteboard. The canvas is the area of the screen devoted to creating the actual lecture content (i.e. the area that would be recorded using the typical screen capture techniques).

Insert New Slide: The recording area consumes a finite space on the screen, but most lectures need more space. Inserting a slide allows the lecturer to reveal new blank space to fill with content.

Selection of Visuals: While recording, the lecturer often needs to select content to move or delete. Note: Selection during recording is different than selection during editing. If something is selected while being recorded, that selection will be part of the final video.

Deletion of Visuals: During recording the lecturer may want to remove content. Note: Deletion during recording is different than deletion during editing. In the final video, something that was written and recorded and then deleted while recording will appear in the video, and then the viewer will see it removed (in contrast to deleting something while editing, where the strokes will never appear in the video).

Time Display and Slider: While recording, the lecturer should be aware of how much time has elapsed in the recording. The time display shows the current recording time. The time slider, as part of the audio timeline, also allows the lecturer to choose when to insert a recording.

Separate Recording of Audio/Visuals: Lecturers have the ability to record solely audio, solely visuals, or both components simultaneously, allowing for fine tuning of the recording of the lectures.
Pen Color: Many online handwritten lectures utilize changes in pen color to emphasize certain topics, or at least to vary the visuals so that the key points are easy to notice quickly.

Line Weight: Similar to variations in pen color, supporting different line weights gives lecturers more flexibility in the visual quality of their videos and allows different areas to be emphasized and stand out.

Recording Indicator: Sometimes it is difficult to tell whether certain software is in recording mode, so we want to make it very obvious to the lecturer that the current actions are being recorded. Currently this is done by changing the "Recording Tools" label background to red, which probably is not obvious enough.

4.1.2 Implemented Editing Features

Once the lecture has been recorded, the lecturer (or another person) may proceed to edit the recorded content. The editing phase requires a different tool set than recording, because the post-recording editing capabilities are crucial to saving time and maintaining the quality of the content. The essential editing capabilities are:

Play/Pause Buttons: Playback is an essential part of editing because it allows reviewing the portions of the video that have been recorded.

Time Slider: Navigating to a certain point in the lecture is necessary to be able to edit parts of the recording at certain times. The time slider also allows the lecturer to choose when to start or stop playback.

Selection of Visuals: While editing, many tools require use of a selection tool, which allows for selection of certain components to edit (e.g. to delete them). Note: this selection tool is different than the selection tool used during the recording phase. Selections made while editing will not be seen as part of the final video.

Deletion of Visuals: Removing errors or unwanted content is an essential part of editing video lectures. Note: deletion during editing is different from deletion during recording.
In the editing phase, strokes that are deleted are removed from the final video (as if they were never written).

Retiming and Resynchronization: There is a tool that allows realigning the visual and audio components of the video in case they are recorded separately or adjustments need to be made. This also allows for removing long silences introduced by writing taking longer than speaking, and for adjusting other places where the visual and audio parts of the lecture may need to be edited separately and realigned. Additionally, temporal edits are necessary so that visuals can be sped up or slowed down to match the speed of the audio. Note: the retimer is a separate user interface.

Stroke Color: This allows the editor to change the color of a stroke for the duration of the video (whereas if the pen color is changed while recording, the original color is maintained where it was already recorded).

Stroke Weight: Similar to editing stroke color, this feature allows the editor to change the weight of a stroke after recording, and the new weight is evident for the full duration of the recording of that stroke.

5. User Guide and Tutorial

Pentimento was created to allow for easy creation and revision of handwritten "digital blackboard" style lecture videos. Pentimento transcends current solutions by adding a simple editing component, which facilitates increased flexibility in updating lecture content once recording is completed. Other solutions barely allow editing beyond cutting content, but Pentimento has much stronger editing capabilities, including separate editing and synchronization of the audio and visual components. This section walks a user through the Pentimento web software, detailing the user interface and explaining how to do simple recordings of lectures. Since Pentimento allows for non-linear recording and editing of lectures, there are many options for how to begin recording a lecture.
The basic Pentimento lecture consists of handwritten strokes on slides with a voiceover lecture, but there are many choices for how to create this lecture. As a lecture is recorded, the user has the option to insert slides for organizational purposes or simply to create a blank slate on which to record visuals. The visuals are currently in the form of strokes, which appear as the handwritten part of the lecture. An audio track is created while recording audio, and new audio segments are created by breaks in audio recording. The audio segments can be rearranged by the user after recording. Finally, the user has the chance to create synchronization points to connect specific audio and visual moments in the lecture, allowing playback to show user-selected visuals at a user-specified audio time.

The first unique aspect of the Pentimento software is the ability to record audio and visuals separately or together. Once the audio and visual components of a lecture have been recorded, they can be synchronized through the retimer. The second, and probably most important, innovation of Pentimento is the ability to edit the lecture after recording. By allowing the user to change the content of the lecture (visual or audio) after recording while keeping the timing the same, it is much simpler to create an accurate, effective, and up-to-date lecture video. Pentimento allows users to edit the lecture in many ways, such as: updating layout and display (e.g. changing the color of visuals), inserting content at any time, and synchronizing the audio and visual components to make the timing exactly what is desired.

5.1 The Main Recording UI

The main recording portion of the web interface is where a user can begin recording and editing visuals.

Figure 1: Main Pentimento Recording Interface (in editing mode)

As a lecturer begins recording, he or she is given the option to record just the visuals, just the audio, or both.
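The lecture structure described at the start of this section (slides of timestamped strokes, audio tracks made of segments, and synchronization points) might be represented roughly as follows. This is a hypothetical sketch with illustrative field names, not the actual saved format; an example of the real saved JSON structure appears in Appendix B.

```javascript
// Illustrative sketch of the pieces that make up a Pentimento lecture.
// Field names here are hypothetical; see Appendix B for the real saved format.
const lecture = {
  slides: [
    {
      // Each stroke is a list of timestamped points (visual time, in ms).
      strokes: [
        { points: [{ x: 12, y: 40, t: 0 }, { x: 30, y: 42, t: 80 }],
          color: "#000000", width: 2 }
      ]
    }
  ],
  audioTracks: [
    {
      // Segments are created by breaks in audio recording and can be
      // rearranged independently of the visuals.
      segments: [{ source: "take1.wav", startTime: 0, duration: 5000 }]
    }
  ],
  // Synchronization points pair an audio time with a visual time so the
  // retimer can warp the visuals to match the narration.
  syncPoints: [{ audioTime: 2500, visualTime: 1800 }]
};
```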
Figure 2: Record Button with Recording Options (current state would record both visuals and audio)

User tests indicated that most people choose to record the visuals first, then the audio, and then add synchronization between the two [4]. For simplicity, here we will discuss how to record and edit each modality separately, but they can also be recorded at the same time.

5.1.1 Recording Visuals

The basis for recording visuals is the pen tool. After the record button is pressed, any time the pen or mouse is placed on the main drawing canvas the resulting strokes will be recorded as part of the lecture. Hitting stop in the top left corner then stops the recording.

Figure 3: Main Recording User Interface (in recording mode). The pen tool is highlighted as the main input for lecturers. The recording canvas is shaded to indicate the space where the pen tool can be used. Finally, to stop recording, the stop button in the top left corner would be clicked.

While recording visuals, it is possible to select visuals and then resize, delete, or move those visuals. If a selection is made while in recording mode, that selection will become part of the recorded lecture, so when it is played back the person watching the lecture will be able to see the selection and any actions that have been taken (e.g. moving the selected visuals).

Figure 4: Using the Selection Tool. The selection tool is highlighted. In this example, the letter "C" is selected and could be deleted, moved, or resized by the user.

Additionally, the color and/or width of the pen strokes can be adjusted by selecting these options from the recording menu.

Figure 5: Pen Stroke Changes. On the left is the color palette to change the color of the pen strokes. The right image shows the available widths of the pen tool.

A lecturer also has the ability to insert a new slide by pressing the add slide button. This clears the canvas and allows for a blank slate while recording.
Figure 6: Add Slide Button

Slides can be used as organizational tools, or simply to wipe the screen clean for more space. Once some visuals have been recorded, they can be played back by hitting the play button.

Figure 7: In editing mode visuals can be played back by clicking the play button (emphasized here)

5.1.2 Editing Visuals

When recording has stopped, Pentimento enters editing mode. This allows a user to make changes that are not recorded as part of the lecture, but instead take effect from the moment the visual appears. For example, changing the color of a stroke while editing will change the color of that stroke from the moment it was written, instead of changing it mid-playback (which is what would happen if the color was changed during recording). Some other examples of visual edits are changing the width, resizing, moving, and deleting visuals. This allows errors to be corrected (e.g. if something is misspelled, the visuals could be deleted in editing mode and the specific word could be re-recorded) and content to be updated. Layout changes are also common, since sometimes it is difficult to allocate space properly the first time a lecture is recorded.

Figure 8: Editing Toolbar

5.1.3 Recording Audio

While audio can be recorded at the same time as the visuals, many users choose to record it separately. Recording audio is as simple as hitting record and then speaking into the microphone. It is also possible to insert audio files, such as background music or audio examples, to enhance a lecture.

Figure 9: Recording Only Audio

5.1.4 Editing Audio

The main type of audio edit that is necessary in handwritten lectures of this kind is removing long silences. Often, when recording audio and visuals at the same time, writing takes longer than speaking, filling the lecture with long silences that can be deleted in the audio editing phase.
Audio segments can also be rearranged or dragged to a different time.

Figure 10: Audio Waveform displayed on the audio timeline

5.2 The Retimer

Retiming is a key innovation of Pentimento, allowing the user to resynchronize the visual and audio components of a lecture. This is a form of editing that affects the playback of the lecture, playing visuals at a user-specified time during the audio. To achieve this synchronization the user uses the retimer display as shown. The display is comprised of a thumbnail timeline, displaying snapshots of visuals at time intervals throughout the lecture. These correspond to the audio timeline below. In between the thumbnails and the audio is the main feature of the retimer, where correspondences between audio and visuals are drawn.

Figure 11: The Audio Timeline and Retimer. This displays the user interface that can be used to add synchronization points between visual and audio time in a lecture. The top displays thumbnails of the lecture visuals. The bottom is the audio waveform representing the lecture audio. In between is the retiming canvas, which allows the user to add synchronization points between the visuals (represented by thumbnails) and the audio (represented by an audio waveform).

To insert a new constraint, the “add constraint” button must be clicked and then the user must click on the place on the retimer timeline where he or she wants to draw the correspondence.

Figure 12: Add Constraint Button

These synchronization points are represented by arrows pointing to the point in the audio time and the corresponding point in the visual time. Note: some constraints are added automatically at the beginning and end of recordings to preserve other constraint points. Automatic constraints are gray, while manually added constraints are black.

Figure 13: New constraint added to the constraints canvas by the user.
To fine-tune the audio and visual correspondence, the user can drag the ends of the arrow to line up with the exact audio time and the exact visual time they would like to be played together.

Figure 14: User dragging a constraint to synchronize a certain point in the audio (bottom of the arrow) with a new point in the visuals (the point the top of the arrow is dragged to)

Then the visuals on either side of the constraint will be sped up or slowed down appropriately to ensure that during playback the desired audio and visual points are played at the same time. Note: it is always the visual time being adjusted to correspond to the audio time (this decision was made because writing faster or slower flows much better than the lecturer suddenly talking faster or slower).

To delete a constraint, a user simply clicks within the constraints timeline and drags a selection box over the constraint(s) he or she wishes to remove.

Figure 15: User selecting a constraint to delete

This turns the selected constraints red (to visually confirm that the desired constraint has been chosen). Then the user can click on the delete constraint(s) button to remove the correspondence.

Figure 16(a): Selected constraint (indicated by turning red)
Figure 16(b): Delete Constraint(s) Button
Figure 16(c): Selected Constraint Removed

6. Code Organization Overview

The base functionality of Pentimento is the ability to record a lecture. This process is initialized when a user clicks the record button and starts to record visuals and/or audio. This begins the recording process in the LectureController, which propagates down to recording visuals and audio. As the user adds strokes on the main canvas, these events are captured by the VisualsController and added to the visuals array in the current slide of the VisualsModel. Similarly, the AudioController processes audio input and creates an audio segment which is stored in the current audio track.
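The stroke-recording path just described might be sketched as follows. This is an illustrative reconstruction, not the actual Pentimento source: the method names (`beginStroke`, `addVertex`) and internal shapes are assumptions based on the description above.

```javascript
// Sketch of how pen input flows into the model during recording.
// Names and bodies are assumptions for illustration only.
function VisualsModel() {
    this.slides = [{ visuals: [] }]; // lecture starts with one blank slide
    this.currentSlide = 0;
}

function VisualsController(visualsModel, getTime) {
    this.visualsModel = visualsModel;
    this.getTime = getTime; // supplied by the TimeController
    this.currentStroke = null;
}

// Pen touches the canvas: begin a new stroke on the current slide.
VisualsController.prototype.beginStroke = function() {
    this.currentStroke = { type: 'stroke', vertices: [] };
    var slide = this.visualsModel.slides[this.visualsModel.currentSlide];
    slide.visuals.push(this.currentStroke);
};

// Pen moves: record a vertex stamped with the current lecture time.
VisualsController.prototype.addVertex = function(x, y, pressure) {
    this.currentStroke.vertices.push({
        x: x, y: y, t: this.getTime(), p: pressure
    });
};
```

Because each vertex is stamped with the lecture time as it is drawn, playback can later re-draw the stroke at the original pace (or at a retimed pace, as discussed in the retimer sections below).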
Recording input is continually added to these data structures, and changes are also processed and added. For example, if a user decides to change the color of a stroke, that property transformation is added to the data structure for that visual. Ultimately, when a recording is completed, users can then go back and edit the recorded content. This process also stores property transforms and spatial transforms as part of the visuals data structure. Retiming is another key part of editing. When a user adds a constraint to the retiming canvas, that constraint is processed and added to the constraints array with the associated visual and audio times to be synchronized.

All of these components are combined to create a Pentimento lecture. A lecture is the basic data structure, and it is comprised of separate visual and audio pieces, each of which is organized into a hierarchy. The visuals are comprised of slides, each of which contains visual strokes written by the lecturer. These strokes are made up of vertices (points that are connected to display the stroke). The audio contains various tracks, each of which includes audio segments. The final component of a lecture is the retiming constraints, which are the synchronization information that unites the audio and visual components at a certain time.

The Pentimento code base is organized into a Model-View-Controller (MVC) architectural pattern. The basis for any recording is the Lecture, which contains visuals, audio, and retiming information. Each of these main components has a model and a controller, the details and specifications of which are outlined below. The models contain the specific data structures for each component, allowing lecture data to be manipulated. The controllers connect the lecture data to the view (the user interface), handling user inputs and making the necessary changes to the models, updating the lecture appropriately.

Figure 17: All of the modules in the Pentimento code base.
Arrows indicate that there is a reference in the file at the origin of the arrow to the module where the arrow is pointing. This allows the original file to access the functionality of the sub-file.

The web version of Pentimento was written using JavaScript, jQuery, HTML5, and CSS. Additional packages were used for displaying certain aspects of the user interface. jCanvas was used for displaying the retimer constraints, providing a simple API for drawing and dragging the constraints, as well as selection and other canvas interactions. Wavesurfer.js is used for displaying audio waveforms. Spectrum.js is used as a color selection tool.

6.1 Lecture

A Pentimento lecture is made up of visual and audio components. To allow the lecture to be played back correctly, a Pentimento lecture also contains a “retimer,” which stores the synchronization information between the visuals and the audio.

Figure 18: Illustration of the data types that comprise a Pentimento Lecture. At the highest level there is the lecture, which is comprised of visuals, audio, and retiming data.

6.1.1 Lecture Model

The LectureModel contains access to the VisualsModel, AudioModel, and RetimerModel. Each of these models has a getter and a setter in the lecture model, establishing the places to store and update the data associated with each component of the lecture. The LectureModel also contains functionality for initializing and getting the total duration of the lecture, and for saving and loading a lecture to JSON.

6.1.2 Lecture Controller

The LectureController handles the UI functionality for recording and playback, undo and redo, and loading and saving lectures. It also serves as the entry point for the application through the $(document).ready() function. For recording and playback, it uses the TimeController to start timing and then calls the appropriate methods in the audio and visuals controllers.
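The LectureModel structure described in 6.1.1 might be sketched as below. This is a reconstruction under stated assumptions: the getter/setter names, the duration rule (the later of the audio and visual endpoints), and the JSON field names are illustrative, not taken from the actual source.

```javascript
// Sketch of the LectureModel's role as the container for the three
// sub-models. Field and method names are assumptions for illustration.
function LectureModel() {
    var visualsModel = null;
    var audioModel = null;
    var retimerModel = null;

    this.getVisualsModel = function() { return visualsModel; };
    this.setVisualsModel = function(model) { visualsModel = model; };
    this.getAudioModel = function() { return audioModel; };
    this.setAudioModel = function(model) { audioModel = model; };
    this.getRetimerModel = function() { return retimerModel; };
    this.setRetimerModel = function(model) { retimerModel = model; };

    // Assumed rule: total lecture duration is the later of the audio
    // and visual endpoints.
    this.getLectureDuration = function() {
        return Math.max(audioModel.getDuration(), visualsModel.getDuration());
    };

    // Delegate serialization to each sub-model.
    this.saveToJSON = function() {
        return {
            visuals: visualsModel.saveToJSON(),
            audio: audioModel.saveToJSON(),
            retimer: retimerModel.saveToJSON()
        };
    };
}
```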
The LectureController determines if the recording mode is visuals only, audio only, or both visuals and audio. This information is used in functions to start and stop recording a lecture. During a recording, the LectureController creates a grouping for the UndoManager so that all undoable actions fall within that undo group. When the undo button is pressed, it calls a method in the LectureController that calls the undo method of the UndoManager and redraws all of the other controllers. The LectureController also registers a function as a callback to the UndoManager; the role of this function is to update the state of the undo and redo buttons so that each one is inactive if there are no undo or redo actions that can be performed. The LectureController is also responsible for initializing the process of creating and loading saved Pentimento files. This is discussed in the Save File section.

6.2 Time Controller

In a Pentimento lecture, time must be tracked because the visuals and audio of the lecture may operate on different timelines. The two timelines can occur if audio and visuals are recorded separately, or if the retimer is utilized to adjust visual time to coincide with certain audio times. The TimeController manages the “global time” of the lecture, or the time seen when the lecture is being played back (regardless of the corresponding time that the visuals being played were recorded at). The TimeController contains access to the current lecture time, as well as providing the necessary calls to start or stop keeping track of global lecture time. When referring to time, there are four different time measurements.

1. “Real Time” refers to the time of the system clock. Real time is the time returned by the system clock when using the JavaScript Date object: new Date().getTime().

2. “Global Time” or “Lecture Time” refers to the global time for the lecture that is kept by the TimeController. The global time starts at 0 and the units are milliseconds.

3.
“Audio Time” refers to the time used for keeping track of the audio elements. There is a 1:1 correspondence between global time and audio time, so audio time directly matches the global time. Because of this, there is no real difference between the global time and the audio time. The only difference is that global time is used when referring to the time kept by the TimeController, and audio time is used when keeping track of time in the context of the audio.

4. “Visual Time” is used when keeping track of the time for the visual elements, and it is aligned with the global time through the retimer and its constraints. All times from the TimeController must be passed through the retimer in order to convert them into visual time.

The audio, visuals, and retimer need the TimeController in order to get the time, but the TimeController operates independently from the audio, visuals, and retimer. The TimeController has functionality to get the current time, start timing (automatic time updating), allow a manual update of the time, and notify listeners of changes in the time. When the TimeController starts timing, the global time will begin to count up from its current time. Timing can be stopped with a method call to the TimeController. When the LectureController begins recording, it uses this timing functionality to advance the time of the lecture. Methods can also be registered as callbacks to the TimeController so that they are called when the time is updated automatically through timing or manually through the updateTime method. Internally, timing works by keeping track of the previous real time and then using a JavaScript interval to trigger a time update after a predetermined real-time interval. When the interval triggers, the difference between the current real time and the previous real time is calculated and used to increment the global time. The current real time is saved as the previous time.
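The interval-based timing mechanism described above can be sketched as follows. This is a reconstruction for illustration, not the actual Pentimento source: the 50 ms update interval and the internal variable names are assumptions, while the method names (`getTime`, `updateTime`, `startTiming`) follow the text.

```javascript
// Sketch of the TimeController's timing loop. Internals are assumptions.
function TimeController() {
    var self = this;
    var globalTime = 0;          // lecture time in milliseconds, starts at 0
    var previousRealTime = null; // last system-clock reading while timing
    var intervalID = null;
    var updateTimeCallbacks = [];

    // Register a listener to be notified whenever the time advances.
    this.addUpdateTimeCallback = function(callback) {
        updateTimeCallbacks.push(callback);
    };

    // Manually set the global time and notify listeners.
    this.updateTime = function(time) {
        globalTime = time;
        updateTimeCallbacks.forEach(function(cb) { cb(globalTime); });
    };

    // While timing, "pull" the elapsed real time into the global time so
    // readings between interval ticks stay fine-grained. Listeners are
    // deliberately NOT notified here.
    this.getTime = function() {
        if (intervalID !== null) {
            var now = new Date().getTime();
            globalTime += now - previousRealTime;
            previousRealTime = now;
        }
        return globalTime;
    };

    this.startTiming = function() {
        previousRealTime = new Date().getTime();
        intervalID = setInterval(function() {
            var now = new Date().getTime();
            globalTime += now - previousRealTime; // increment by elapsed real time
            previousRealTime = now;
            updateTimeCallbacks.forEach(function(cb) { cb(globalTime); });
        }, 50); // assumed update interval
    };

    this.stopTiming = function() {
        if (intervalID !== null) {
            self.getTime(); // fold in any remaining elapsed time
            clearInterval(intervalID);
            intervalID = null;
        }
    };

    this.isTiming = function() { return intervalID !== null; };
}
```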
The updateTimeCallbacks are called with the new global time as an argument. When timing is not in progress, the getTime method just returns the current time. However, when timing is in progress, the getTime method will get the current real time and calculate the difference between that and the previous time, just as happens during an interval update. Effectively, this pulls the current global time instead of just observing an outdated global time. This allows a finer granularity of time readings during timing. This mechanism is important because if the time were only updated every interval, without pulling the most recent global time, then there would be visuals occurring at different times but still reading the same global time. The updateTimeCallbacks are not called when the time is pulled during a getTime call. This is to prevent an overwhelming number of functions being called when there are a large number of getTime calls, such as those that occur during a recording, when many visuals are being drawn and each requires getTime to get the time of the visual. The TimeController also has methods to check if it is currently timing and to check the beginning and ending times of the previous timing. The TimeController does not have any notion of recording or playback. It is the LectureController that uses the TimeController timing to start a recording or playback.

6.3 Visuals

The visuals component of a Pentimento lecture is organized in a hierarchy. Slides are the base level, and they contain visuals. Each type of visual then has a certain data structure associated with it. Currently, strokes are the only type of visual that has been implemented. Strokes are comprised of vertices, which are points containing x, y, t, and p coordinates (x and y coordinate position, time, and pressure respectively).

6.3.1 Visuals Model

The VisualsModel contains the constructors for all components of visuals.
The VisualsModel contains an array of slides, allowing slides to be created and manipulated. A slide provides a blank canvas for recording new visuals and allows the lecturer to have a level of control over the organization of information. A slide contains visuals, a slide duration, and camera transforms. The visuals themselves have many components, including type (e.g. stroke, dot, or image), properties (e.g. color, width, and emphasis), tMin (the time when the visual first appears), tDeletion (the time when the visual is removed), property transforms (e.g. changing color or width), and spatial transforms (e.g. moving or resizing). Property transforms have a value, time, and duration. Spatial transforms also have a time and duration, as well as containing a matrix associated with the transform to be performed. Finally, to actually display the visuals, the type of visual is used to determine the drawing method. Currently, strokes are the only supported type of visual, and strokes are comprised of vertices. A vertex is represented by (x, y, t, p) coordinates, where x is the x position, y is the y position, t is the time, and p is the pen pressure associated with that vertex.

6.3.2 Visuals Controller

The VisualsController has access to the VisualsModel and the RetimerModel. The VisualsController also utilizes the ToolsController and the Renderer. The VisualsController is responsible for drawing the visuals onto the canvas as the lecture is being recorded. As visuals and slides are added to the view by the user, the VisualsController accesses the VisualsModel and adds the appropriate data structure. The VisualsController also allows the user to adjust properties of the visuals, such as the width and color.

6.3.3 Tools Controller

The ToolsController allows the user to manipulate which tool they are using while recording or editing the visuals of the lecture.
The ToolsController allows switching of tools as well as indicating what to do with each tool while the lecture is recording or in playback mode. The ToolsController also creates the distinction of which tools are available in editing mode vs. recording mode.

6.3.3.1 Visuals Selection

Visual elements can be selected by using the selection box tool. This tool works in both recording and editing modes. In the VisualsController, the selection is an array of the visuals that are under the selection box drawn by the user. For StrokeVisuals, the renderer uses different properties to display these visuals so that the user has feedback that the visuals have been selected. The selection box itself is implemented on a separate HTML div on top of the rendering canvas. Inside this div, there is another div that is set up using jQuery UI Draggable and Resizable. This allows the box to be dragged and resized by the user. Callback methods are registered so that when the box is resized or dragged, a transform matrix will be created based on the changed dimensions and position of the selection box. This transformation matrix is passed on to the VisualsModel.

6.4 Audio

Similar to the visuals, the audio components of Pentimento lectures are organized into a hierarchy, with audio tracks being organized into segments. A lecture could contain multiple tracks (the simplest example being one track containing the narration of the lecture while a second track contains background music). Each track is organized into separate segments (e.g. the audio associated with a slide).

6.4.1 Audio Model

The AudioModel consists of an array of audio tracks, where each audio track consists of an array of audio segments. An audio segment contains the URL for an audio clip, the total length of the clip, the start and end times within the clip, and the start and end locations within the track (audio time). The top-level AudioModel has functions to insert and delete tracks.
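This hierarchy (tracks of non-overlapping segments, with a mapping between track time and clip time) might be sketched as below. The shapes and method names here are assumptions for illustration, not the actual Pentimento source.

```javascript
// Sketch of the audio hierarchy. Names and shapes are assumptions.
function AudioSegment(audioURL, clipLength, trackStart, trackEnd) {
    this.audioURL = audioURL;     // URL of the underlying audio clip
    this.clipLength = clipLength; // total length of the clip (ms)
    this.clipStart = 0;           // start time within the clip
    this.clipEnd = clipLength;    // end time within the clip
    this.trackStart = trackStart; // start location within the track (audio time)
    this.trackEnd = trackEnd;     // end location within the track
}

// Convert a track (audio) time into the corresponding time inside the clip.
AudioSegment.prototype.trackToClipTime = function(trackTime) {
    return this.clipStart + (trackTime - this.trackStart);
};

function AudioTrack() {
    this.segments = [];
}

// Segments may not overlap within a track, so an insertion must be
// validated against every other segment in the track.
AudioTrack.prototype.canInsertSegment = function(segment) {
    return this.segments.every(function(other) {
        return segment.trackEnd <= other.trackStart ||
               segment.trackStart >= other.trackEnd;
    });
};

AudioTrack.prototype.insertSegment = function(segment) {
    if (!this.canInsertSegment(segment)) return false;
    this.segments.push(segment);
    return true;
};
```

The same validity check would apply to shifting and cropping: the operation is only committed if the resulting segment placement does not overlap a neighbor.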
The audio track class has functions to insert and delete segments. All functionality for modifying segments within a track is handled by the audio track. This includes shifting segments, cropping segments, and scaling segments. This is because no segments can overlap within a track, so modifying a segment requires knowledge of the other segments within that track to ensure that the operation is valid. The audio segment class has methods for converting the track audio time into the time within the clip and vice versa. The AudioModel can be converted to and from JSON for the purpose of saving to and loading from a file. During the saving process, the audio clip URLs are converted into indices, and the resources they point to are saved with filenames corresponding to those indices.

6.4.2 Audio Controller

The AudioController accesses the AudioModel so user changes can be applied. Within the AudioController there is also access to the track and segment controllers. Each track and segment is initialized when the user begins recording, which is processed through the AudioController. The AudioController also handles the end of recording. In addition to handling recording, the AudioController is responsible for playback of the audio. The audio TrackController contains access to all of the segments contained within that track. Each TrackController can also retrieve the track ID and the duration of the track. The TrackController also allows for manipulation of the segments within the track (dragging, cropping, inserting, and removing segments). The SegmentController handles access to specific segments and contains the means to display the audio segments.

6.4.3 Audio Playback

When the LectureController begins playback, it calls the startPlayback method in the AudioController, which starts the playback in the tracks. The TrackController uses a timer to start playback for the segments after a delay.
The delay is equal to the difference between the segment start time and the current audio time. If the current audio time intersects a segment, then playback for that segment begins immediately. Playback uses the wavesurfer.js library to play the audio resource in the audio segments. When a segment playback starts, the SegmentController uses wavesurfer.js to start playing audio. The start point of the audio can be specified so that it can start playing in the middle of the audio clip if specified by the segment parameters. Automatically stopping playback for the segment when the current audio time moves past the end of the segment is handled by wavesurfer.js by specifying the stop time for the audio clip. When playback is stopped in the LectureController, the stopPlayback method of the AudioController is called, and it stops playback in all of the TrackControllers, which then stops playback in all of the SegmentControllers. These SegmentControllers manually stop any “wavesurfers” that are in the process of playing an audio clip.

6.4.4 Audio Timeline

The audio timeline is used for displaying the audio tracks to illustrate where the segments are in relation to the audio/global time. It also has plug-in functionality so that other items can be displayed on the timeline. The timeline has a pixel-to-second scale which is used for drawing items on the timeline, and it has the functionality to change this scale by zooming in and out. This scale is illustrated through the use of labels and gradations.

6.4.5 Audio Track Controller

The TrackController draws the audio track from the model and handles playback for the track. It delegates playback for the individual segments to the SegmentController. It has the functionality for translating the UI events for editing the track into parameters that can be used to call the methods to change the audio model.

6.4.6 Audio Segment Controller

The SegmentController draws the audio segment from the model.
It uses the Wavesurfer JavaScript library to display and play the audio files that are associated with the segments. It creates the view for the segments and registers the callbacks associated with the various UI actions such as dragging and cropping. For the UI functionality of segment editing, the jQuery UI library is used. The Draggable and Resizable functionality is used to implement segment shifting and cropping, respectively. In order to enforce the constraint that audio segments cannot overlap one another in the same track, the drag and resize functions test whether the new position of the segment view resulting from the drag or resize action leads to a valid shift or crop action. If the action is invalid, the position of the segment view is restored to the last valid position. The functionality for checking the validity of these operations resides in the AudioModel. The result is that the UI uses the AudioModel to check if the user actions are valid, and the UI provides the relevant visual feedback to the user.

6.4.7 Audio Plug-in

Audio timeline plug-ins are a way to display additional views in the audio timeline. This is a useful feature because it allows those views to line up with the audio segments. For example, visual thumbnails that will be played at the corresponding audio time can be displayed. Since the audio time is used as the global time, it makes sense to visualize the information in this way. When the timeline is panned from side to side, the plug-in views also pan with the rest of the timeline. The plug-in is able to register a callback function that gets called when the timeline is zoomed in or out. This allows the plug-in to define its own implementation of what is supposed to happen when the pixel-to-second scale of the timeline changes. The other components that currently use the plug-in functionality of the audio timeline are the retimer constraints view and the thumbnails view.
For these views, it makes sense to display them as audio timeline plug-ins because it gives the user a sense of how the visual time relates to the audio time and how that relationship changes when retimer constraints are added and modified.

6.5 Retimer

The retimer is one of the main innovations of the Pentimento lecture software. By allowing users to easily manipulate the synchronization between the visual and audio components of a lecture, the retimer provides much-needed flexibility in recording and editing lectures. The retimer contains constraints, which are the synchronization connections between a point in the audio time of the lecture and a point in the visual time of the lecture. Thus the retimer allows the playback of the lecture to have proper synchronization between the visuals timeline and the audio timeline.

6.5.1 Retimer Model

The RetimerModel provides the ability to manipulate constraints, including addition, deletion, shifting, and access to the constraints. The RetimerModel contains an array of constraints, which are used to synchronize the audio and visual time. A constraint is comprised of a type (automatic or manual), an audio time, and a visual time. Automatic constraints are inserted mechanically as the lecture is recorded (e.g. at insertion points or at the beginning/end of a recording). Manual constraints are added by the user to synchronize a certain point in the audio with a certain point in the visuals. Adding a constraint to the model requires finding the previous and next constraints (in audio time). Once these constraints have been determined, the visual time can be interpolated between the added constraint and the visual time of the two surrounding constraints, to allow for smooth playback. This is done because adding a constraint only affects the time of visuals between the two surrounding constraints. When visuals or audio are inserted, automatic constraints are added to preserve the synchronization provided by existing constraints.
This requires shifting the existing constraints by the amount of time that is added by the insertion. This process is completed by marking the constraints after the insertion point as “dirty” until the insertion is completed. This means moving the constraints to an audio time of “infinity,” indicating they will be shifted. The original time is stored so that when the recording is completed the constraint times can be shifted by the appropriate amount. To perform the shift, the “dirty” constraints are “cleaned” by shifting the original time that has been stored by the duration of the inserted recording (and removing the value of infinity from the constraint time).

6.5.2 Retimer Controller

The RetimerController has access to the RetimerModel, so that when a user manipulates constraints the necessary updates can be made to the constraints data. The RetimerController also has access to the visuals and audio controllers so that synchronizations can be inserted properly. Additionally, the RetimerController manages redrawing of constraints and thumbnails so that the view is properly updated when a user adds, drags, or deletes a constraint. The RetimerController interacts with the UI, so all user input events are handled properly. When a constraint is added by a user, the RetimerController handles converting the location of the click on the retiming canvas (in x and y positions) to the audio and visual time represented by that location. Similarly, when a user selects a constraint to delete, the RetimerController processes the selection area and locates the constraints within the selection by converting constraint times to positions on the retiming canvas. Dragging constraints is also handled by the RetimerController, and when a user stops dragging, the RetimerController updates the RetimerModel to reflect the newly selected synchronization timing.
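The constraint interpolation described in the RetimerModel section above can be sketched as follows. This is an illustrative reconstruction under stated assumptions: linear interpolation between the surrounding constraints is implied by the text, but the actual method names and internals may differ.

```javascript
// Sketch of retimer constraints and audio-to-visual time conversion.
// Names and internals are assumptions for illustration.
function Constraint(audioTime, visualTime, type) {
    this.audioTime = audioTime;
    this.visualTime = visualTime;
    this.type = type; // 'automatic' or 'manual'
}

function RetimerModel() {
    // Automatic constraints at the beginning and end of a recording
    // bound the interpolation for the manual constraints in between.
    this.constraints = [];
}

RetimerModel.prototype.addConstraint = function(constraint) {
    this.constraints.push(constraint);
    this.constraints.sort(function(a, b) { return a.audioTime - b.audioTime; });
};

// Convert an audio (global) time into visual time by interpolating
// linearly between the previous and next constraints in audio time.
RetimerModel.prototype.getVisualTime = function(audioTime) {
    var prev = null, next = null;
    this.constraints.forEach(function(c) {
        if (c.audioTime <= audioTime && (!prev || c.audioTime > prev.audioTime)) prev = c;
        if (c.audioTime >= audioTime && (!next || c.audioTime < next.audioTime)) next = c;
    });
    if (!prev) return next.visualTime;
    if (!next) return prev.visualTime;
    if (next.audioTime === prev.audioTime) return prev.visualTime;
    var fraction = (audioTime - prev.audioTime) / (next.audioTime - prev.audioTime);
    return prev.visualTime + fraction * (next.visualTime - prev.visualTime);
};
```

Because only the two surrounding constraints enter the interpolation, adding a constraint changes the pacing of visuals only between its neighbors, which matches the behavior described above.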
6.6 Thumbnails Controller

The ThumbnailsController displays the visual thumbnails to the user (as part of the retimer and audio timeline display). The ThumbnailsController requires access to the Renderer to display the lecture visuals in a thumbnail timeline. The thumbnails are generated by calculating how many thumbnails will fit in the timeline (based on the length of the lecture). Then each thumbnail is drawn on its own canvas at the time in the middle of the time span represented by the thumbnail. The thumbnails are redrawn any time a user updates a recording or drags a constraint to update the visual or audio time synchronization.

6.7 Undo Manager

The UndoManager allows for undoing/redoing any action while recording or editing any component of a lecture. The UndoManager is organized into an undo stack and a redo stack. The stacks contain actions that can be undone or redone. An action can be a single event or a collection of actions, all of which would be undone or redone together. The UndoManager is integrated with the rest of the application by registering undo actions in the visuals, audio, and retimer models. For example, in the AudioModel, a segment can be shifted by incrementing the start and end times by a certain amount. The shift is then registered with the UndoManager, but the argument will be the inverse amount of the shift performed (e.g. if the segment was shifted +5 seconds, the UndoManager would store an undo action shifting the segment -5 seconds). If the action is undone, the UndoManager will then push the action onto the redo stack instead of the undo stack. Differentiating between undo and redo actions is handled by the UndoManager. In the implementation of the models, for each action, the code only needs to register the inverse action with the UndoManager. In the LectureController, when recording begins, an undo group is started. When the recording ends, the group is ended.
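The inverse-action pattern just described might be sketched as below. This is a reconstruction for illustration only: the `UndoManager` interface and the `shiftSegment` helper are assumptions, not the actual Pentimento code.

```javascript
// Sketch of the inverse-action undo pattern. Names are assumptions.
function UndoManager() {
    this.undoStack = [];
    this.redoStack = [];
    this.undoing = false; // true while an undo is being executed
}

// Register the inverse of an action that was just performed. During an
// undo, the re-registered inverse lands on the redo stack instead.
UndoManager.prototype.registerUndoAction = function(action) {
    if (this.undoing) this.redoStack.push(action);
    else this.undoStack.push(action);
};

UndoManager.prototype.undo = function() {
    var action = this.undoStack.pop();
    this.undoing = true;
    action(); // the action re-registers its own inverse
    this.undoing = false;
};

// Example from the text: shifting a segment registers the opposite
// shift as its undo action.
function shiftSegment(segment, amount, undoManager) {
    segment.start += amount;
    segment.end += amount;
    undoManager.registerUndoAction(function() {
        shiftSegment(segment, -amount, undoManager);
    });
}
```

In this scheme, an undo group would simply collect all of the inverses registered during a recording into one entry on the stack, so that the group is undone as a unit.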
This allows the user to undo a recording as though it were one action. The beginning of a recording also registers changeTime as an undo action. The changeTime argument is the begin time of the recording. This way, if the recording is undone, the time in the TimeController can be set to the time before the recording started. When changeTime is called, it pushes another changeTime call to the UndoManager; the argument for this call is the current time. The result is that, when undoing and then redoing, the time will switch to where it was before the recording started and switch back to where it was after the recording ended.

6.8 Renderer

The Renderer is used for displaying visuals at a certain time, either as a still frame or during playback. The Renderer takes in the canvas where the visuals should be displayed (and handles scaling the visuals to the appropriate size for the given canvas). Then, to display the visuals, the Renderer uses a minimum and maximum time and draws all visuals that are active in that time range. The Renderer is used for the main canvas while visuals are being recorded or during playback. The Renderer is also used to display thumbnails for the retimer.

6.9 Save and Load Files

Pentimento saved files are regular .ZIP files with the content located in different files and folders. At the top level of the .ZIP there is a JSON file containing the model, as well as a folder containing the audio files. In the future, there could be other top-level folders for items such as images and other external resources. Inside the audio folder, the audio clips are stored with filenames numbered sequentially starting from 0. The JSZip library is used for loading and saving the .ZIP files.

Saving is handled through the LectureController. The JSON representation of the model is obtained by calling the saveToJSON method of the different models.
In the AudioModel JSON, the URL references to the audio clips are replaced by index numbers, which are used as the file names of the audio files. The JSON is saved as a text file in the .ZIP, and the audio files are converted to HTML5 blobs asynchronously. When all of the conversions have completed, the .ZIP file is downloaded to the computer's local storage.

Loading of Pentimento .ZIP files is also handled by the LectureController. The .ZIP is loaded asynchronously and, when the load is complete, the contents are read. The different models are loaded from their JSON representations using their respective loadFromJSON methods. The audio files are loaded into the browser and the URL for each resource is obtained. This URL is substituted back into the audio segments where an index number was previously used as a placeholder for the audio clip URL.

7. Future Work

While we have successfully created a working prototype, there are many additional features that would benefit Pentimento users. In addition to the features and additions outlined below, user testing and input would be essential for determining exactly how the Pentimento website would be used by real lecturers. While the feature sets below are divided roughly into recording vs. editing tools, many of the recording tools would require additional tools for editing, and many editing tools could be used while recording.

7.1 Future Features

7.1.1 Recording

After the baseline feature set was created and established, we determined that there are many additional recording tools that lecturers find extremely helpful and somewhat necessary (though not necessarily part of the minimum successful product). Some of these possible features are outlined below:

Auto Record: There could be an option for "auto" recording, which would begin recording when the pen touches the screen (as opposed to manually clicking the button before recording begins).
Auto record would stop after a few seconds of the pen being inactive. This would allow for more flow while recording lectures.

Text Boxes: Some lecturers may want to include typed text instead of handwriting. The textbox feature would also include the ability to resize text, adjust fonts, and apply formatting and other general text-editing capabilities. Note: this feature is currently under development.

Set Background: The background of the lecture or slide could be set to a certain color, or even to an inserted image or slide deck.

Emphasis: This tool would allow the lecturer to circle or otherwise select a certain area of text and change the line weight or color of the selection for emphasis. Since these videos omit the actual lecturer and are based solely on visuals and audio, it is hard to direct student attention to a certain point on the screen without a tool that facilitates emphasizing exactly what should be focused on at that moment.

Set/Go to "Home View": Some lectures are based on a central idea that the lecturer would like to emphasize by revisiting the same set of visuals repeatedly throughout the lecture. This view should be easily accessible and could be set as the home view. Then, to return to that view during the lecture, the lecturer can simply click a button that takes them home, as opposed to moving the canvas until the desired view is found.

Inclusion of Other Content: Many times lecturers will want to include images, videos, hyperlinks, or other media in their recordings.

Zoom In/Out: While recording, to emphasize certain aspects of diagrams or ideas, it might be useful for the lecturer to be able to zoom in/out on certain parts of the canvas.

Traceable Images: Many times lecturers want to accurately draw an image (e.g., a map). With this feature, the image could be inserted onto the canvas and traced by the lecturer, and then the image would be removed.
Thus, during playback only the lecturer's tracing will be shown and not the base image.

Shapes: Oftentimes when creating slides it is nice to be able to draw shapes onto the slides using a shape tool as opposed to hand-drawing them (e.g., inserting rectangles and circles). A shape tool is provided by many slide creation tools (e.g., PowerPoint) and would be a nice addition to Pentimento.

Animations: Animations are extremely useful educational tools in many scenarios. For example, when learning about mass-spring systems, it would be ideal if the lecturer could draw the mass and the spring and then have them animate over a specified curve. Adding the ability to insert these types of animations would make more lecture styles possible.

Stroke Type: Currently, whenever the lecturer uses the pen tool, the handwritten strokes are recorded as calligraphy. Having a non-calligraphic mode would allow for more variation in lectures.

Redraw Visuals: This feature would allow a user to select certain visuals and redraw them while preserving the timing of the previous visuals. This would allow for faster editing because the visuals would not necessarily need to be resynchronized with the audio.

Record Audio for Selected Visuals: Similar to redrawing visuals to maintain synchronization with audio, this feature would allow recording audio for a set of selected visuals. It would then retime those visuals to play back in correspondence with the newly recorded audio (automatically adding the constraints instead of requiring manual retiming later).

Recording Buffer: When a lecturer is writing, it is sometimes difficult to plan how much space they will need. Adding a recording buffer area off the canvas, as a warning that they are running out of space but with the ability to still finish a thought, could allow for more fluidity in recording lectures.
After recording in the buffer space, the user could resize the visuals while editing to make them fit onto the playback screen.

7.1.2 Editing

In addition to the basic editing techniques enabled by the tools described above, there are some additional capabilities that lecturers may want while editing these lectures.

Background and Insertion of Media: While editing, the lecturer may realize that they want to change the background to something different, or to insert a video or other media at a certain point; this would be allowed during the editing phase.

Grid View: While editing, the lecturer may want to view gridlines to make sure that everything is lined up properly. Note: this same tool could also be used during the recording phase (grid lines would not be recorded).

Redraw Tool: This tool is complex because it combines recording and editing modes. With the redraw tool, editors will be able to select content that they want to re-record, remove that content, and then re-record it while maintaining the same timing within the lecture (i.e., the redrawn strokes will play back in the video at the same time as the original strokes).

Slide Manipulation: Currently, a user can only add slides to a lecture. Adding a way to select and delete a slide, or to rearrange slides, would be highly beneficial for lecturers. This may require additional UI work (e.g., a slide view).

Auto-Delete Silence: One of the many issues Pentimento set out to address is the fact that writing often takes longer than speaking. When recording audio and visuals concurrently, long silences may be introduced by the writing taking longer. A feature to auto-delete these silences could make editing more efficient.

Override Automatic Constraints: Currently, automatic constraints are handled the same as manually added constraints.
We do not allow visuals to be drawn "backwards," so a constraint may not be dragged past the previous or next constraint, to preserve timing. However, user-added constraints could override automatically added constraints. One way to do this would be to allow dragging a manual constraint past an automatic constraint, deleting the automatic constraint.

Handwriting Beautification: There are algorithms available for handwriting beautification [11] that could be used for these handwritten lectures. A user could select the handwritten text and make it appear neater, while still looking like it is being written by hand (as opposed to simply typed).

Keyboard Shortcuts: Most computer programs allow for keyboard shortcuts, and adding them to Pentimento would allow experienced users to be more efficient. Some examples might be initiating playback by pressing the space bar or deleting visuals by pressing the delete key.

7.2 User Interface Additions

There are many additional improvements to the user interface that could benefit users of Pentimento. One such improvement could be the ability to toggle between the main recording view and the audio and retimer timeline view, instead of having them stacked on top of each other and scrolling between them. This could be implemented with a tab or accordion system within the webpage, which would easily allow the user to update what he or she is seeing. This would make the UI cleaner and easier to scale to different size displays, since each component of the view could have its own screen. While this would improve the interaction between the two main components of the user interface, each portion could also be improved on its own.

7.2.1 Main Visuals Recording and Editing UI

Slides Interface: It could be good for the lecturer to be able to see all of their slides (e.g., in a grid) so that slides could be manipulated more easily.
For example, this would allow a user to rearrange slides, insert a slide at a certain point, or select and delete a slide.

Save/Open Buttons: These buttons could be resized, or a different icon chosen, to fit more cohesively into the overall UI.

Time Indicator: The time indicator could be made more obvious. Additionally, if the main recording UI and the audio/retimer timeline UI are separated, a new time slider should be available for navigating through the visuals.

Color Select Palette: The spectrum.js color picker design doesn't necessarily fit with the rest of the Pentimento UI, so it could be updated or restyled to be displayed more attractively.

Jump to Beginning or End: Buttons could be added to allow the user to quickly jump to the beginning or end of the lecture without dragging a time slider. Jumping to the beginning is often useful for playback; jumping to the end would be useful for adding more content.

Adding Tools: The layout may need to be updated to add more tools. Currently the recording and editing tools fit nicely alongside the canvas, but a column approach or different layout may become necessary as more tools become available.

7.2.2 Audio and Retimer Timeline UI

Selection of Thumbnails: By clicking on the thumbnail timeline and dragging, a user could potentially select visuals. This would add another way to select visuals that may appear across different slides or a long period of time, and to edit all of them at once.

Fluid Arrow Dragging: Currently, when a user drags an arrow, it creates a straight line to connect the two ends of the arrow. However, it might be clearer if the center of the arrow were anchored and the arrow dragging looked more fluid (e.g., the arrow became curved so that the tip is still pointing straight up or straight down). jCanvas supports the drawing of many types of curves, which could facilitate this improvement.
Tick Marks for Thumbnail Speed: Displaying tick marks below the visuals thumbnails, or on the visuals side of the retimer canvas, could indicate the speed at which the thumbnails are being played back. If the tick marks are close together, the visuals have been sped up by constraints; if they are far apart, the visuals have been slowed down to maintain synchronization with the audio.

Audio "Transcript": It is somewhat complicated for lecturers to synchronize audio and visuals because the audio is presented as a waveform display. It would be extremely helpful if there were a way to process the audio and display a transcript of the speech below the audio timeline, so that it is faster and easier to find the correct point for synchronization.

Audio Manipulation Buttons: Currently the buttons are styled very simply to function, but they could be replaced by icons (to fit in with the rest of the UI) or styled in a way that is more consistent with the rest of the UI.

7.3 Student Player

Another vital addition to the Pentimento lecture software will be a web-based student player, where students can see these videos and interact with them in ways specified by the lecturer. For example, the lecturer could insert a pause point that requires the student to submit an answer to a question. The student view would still allow for interactivity and would have some features available to teachers as well (e.g., students might have to record and show their work for a problem), but it would mainly be used for recording new student content and wouldn't require as much editing capability as the actual lecture recording tool. The student input may also take forms that do not require writing, since writing is facilitated by a pen and tablet input, which is a resource that isn't readily available to students. For example, a multiple-choice question could appear at a certain point in the lecture requiring a student answer.
However, many students would have access to a touch screen in the form of a tablet (e.g., iPad) or smartphone. This suggests the desirability of a mobile version of the student player. The need for secure accounts or online "classroom" structures adds additional complexity to student interfaces.

8. Conclusions

The main goal of this project is to make creating and recording online educational video content easy and maintainable. To begin achieving this goal, we turned the earlier iOS prototype into a web-based tool that can be accessed by anyone to record and edit presentations. The presentations can be edited in both space and time, and these changes can be applied to both the visual and audio elements. For visual changes, this means having the ability to change strokes and placement after the recording. This can include correcting existing strokes, adding missing ones (such as a missing prime on a variable), moving drawings and text, or resizing drawings and text. In the future, we hope that it will also be possible to re-record, insert, or delete sections of audio and keep them synchronized with the visual recording.

By using my thesis work to implement a base set of features, it will be possible for educators to use Pentimento to easily record high-quality educational videos that can be maintained for many years. Since online learning is becoming extremely widespread, it is important to have accessible software that can keep up with the demand and be easy for lecturers to use.

Appendix A: Documentation

Here we outline the functions available in the Pentimento code base, with their input parameters and outputs, if any.
A.1 Lecture

A.1.1 Lecture Model

getVisualsModel
Return: the current visuals model

setVisualsModel
Parameters: newVisualsModel (the new visuals model that the visuals model should be set to)
Sets the visuals model to the newly chosen visuals model.

getAudioModel
Return: the current audio model

setAudioModel
Parameters: newAudioModel (the new audio model that the audio model should be set to)
Sets the audio model to the newly chosen audio model.

getRetimerModel
Return: the current retimer model

setRetimerModel
Parameters: newRetimerModel (the new retimer model that the retimer model should be set to)
Sets the retimer model to the newly chosen retimer model.

getLectureDuration
Return: the duration of the lecture in milliseconds. The full lecture duration is the maximum of the audio recording and visuals recording durations.

loadFromJSON
Parameters: json_object (the lecture in JSON form to be loaded from a file)
Loads a lecture file. Calls the necessary load methods from the visuals, audio, and retimer models to construct each model and allow for playback. (Note: see the JSON file structure in the Saving and Loading appendix.)

saveToJSON
Saves the whole lecture to a JSON file. Gets the data from the saving methods of the visuals model, audio model, and retimer model to create the JSON object. (Note: see the JSON file structure in the Saving and Loading appendix.)

A.1.2 Lecture Controller

save
Saves the current lecture to a JSON file. The JSON is put into a zip file with the audio blob. (Note: see the JSON file structure in the Saving and Loading appendix.)

load
Reads the selected lecture file so that it can be opened and displayed to the user.

openFile
Parameters: jszip (the zip file including the JSON object that represents the lecture and the audio blob)
Opens the specified file into the UI and resets all of the controllers and models to be consistent with the loaded lecture. (Note: see the JSON file structure in the Saving and Loading appendix.)
getLectureModel
Return: the lecture model

getTimeController
Return: the time controller

recordingTypeIsAudio
Return: true if the recording will include audio (i.e., if the audio checkbox is checked); otherwise returns false

recordingTypeIsVisual
Return: true if the recording will include visuals (i.e., if the visuals checkbox is checked); otherwise returns false

isRecording
Return: true if a recording is in progress; otherwise returns false

isPlaying
Return: true if playback is in progress; otherwise returns false

startRecording
Starts the recording and notifies the other controllers (time, visuals, audio, and retimer) to begin recording. Updates the UI to toggle the recording button to the stop button. Note: only notifies the audio and visuals controllers if their respective checkboxes are checked for recording.
Return: true if successful

stopRecording
Stops the recording and notifies the other controllers (time, visuals, audio, and retimer) to end recording. Updates the UI to toggle the stop button to the recording button. Note: only notifies the audio and visuals controllers if their respective checkboxes are checked for recording.
Return: true if successful

startPlayback
Starts playback and notifies the other controllers (time, visuals, audio) that playback has begun. Toggles the play button to the pause button.
Return: true if successful

stopPlayback
Stops playback and notifies the other controllers (time, visuals, audio) that playback has ended. Toggles the pause button to the play button.
Return: true if successful

getPlaybackEndTime
Return: the lecture time when playback is supposed to end (returns -1 if not currently in playback mode)

draw
Redraws the views of all of the controllers (visuals, audio, and retimer).

undo
Undoes the last action and redraws the view to reflect the change.

redo
Redoes the last undone action and redraws the view to reflect the change.

changeTime
Creates a wrapper around a call to the time controller and the undo manager.
This is necessary because the time needs to revert back to the correct time if an action is undone or redone.

loadInputHandlers
Initiates the input handlers (i.e., mousedown, mouseup, keydown, and keyup). Also registers whether it is a touch screen, to determine if pen pressure will be applied, and connects the click events to the lecture buttons.

updateButtons
Toggles the UI display between the recording/stop button and play/pause button to reflect the current recording or playback state.

A.2 Time Controller

addUpdateTimeCallback
Adds a callback that notifies listeners when the current time changes (note: callback functions should take one argument, currentTime, in milliseconds).

getTime
Return: the current time, in milliseconds

updateTime
Manually updates the current time and notifies the callbacks.

globalTime
Return: UTC time (to keep track of timing while it is in progress)

isTiming
Return: true if timing is in progress; otherwise returns false

startTiming
Starts progressing the lecture time.
Return: true if successful

stopTiming
Stops progressing the lecture time.
Return: true if successful

getBeginTime
Return: the time (in milliseconds) when the previous or current timing started (returns -1 if there was no previous or current timing)

getEndTime
Return: the time (in milliseconds) when the previous timing ended (returns -1 if there was no previous timing event)
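The callback pattern documented in A.2 can be sketched as follows. This is a simplified illustration (the real TimeController also tracks wall-clock time while timing is in progress), but the registration and notification flow matches the interface above.

```javascript
// Minimal sketch of the TimeController callback mechanism.
class TimeController {
  constructor() {
    this.currentTime = 0; // lecture time in milliseconds
    this.callbacks = [];
  }
  // Register a listener; each callback receives currentTime in ms.
  addUpdateTimeCallback(callback) {
    this.callbacks.push(callback);
  }
  getTime() {
    return this.currentTime;
  }
  // Manually update the current time and notify every listener.
  updateTime(timeMs) {
    this.currentTime = timeMs;
    for (const callback of this.callbacks) {
      callback(timeMs);
    }
  }
}
```

The visuals, audio, and retimer controllers would each register a callback, so a single time update keeps all views consistent without the TimeController knowing about any of them.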
A.3 Visuals

A.3.1 Visuals Model

getCanvasSize
Return: an object with the size of the canvas where the visuals are being recorded, formatted as: {'width': <canvas width>, 'height': <canvas height>}

getDuration
Return: the total visuals duration (calculated by adding the durations of each of the slides)

getSlides
Return: the array of all of the lecture slides

getSlidesIterator
Return: an iterator over the slides

getSlideAtTime
Parameters: time (in milliseconds)
Return: the slide that is displayed at the specified time

insertSlide
Parameters: prevSlide (the slide before the point where the new slide will be inserted), newSlide (the slide to be inserted)
Return: true if successful (false if the previous slide does not exist)

removeSlide
Parameters: slide (the slide to be removed, type slide)
Return: true if successful (false if there are no slides or if there is only one slide remaining)

addVisuals
Parameters: visual (the visual to be added, type visual)
Gets the slide at the minimum time of the visual and then adds the indicated visual to the visuals belonging to that slide.

deleteVisuals
Parameters: visual (the visual to be deleted, type visual)
Gets the slide at the minimum time of the visual and then removes the indicated visual from the visuals belonging to that slide.

visualsSetTDeletion
Parameters: visual (the visual to be deleted), visuals_time (the time to delete the visual, in milliseconds)
Sets the deletion time property of the given visual to the specified deletion time.

setDirtyVisuals
Parameters: currentVisualTime (the current visual time; after this time all visuals will be set to dirty)
Creates wrappers around the visuals that keep track of their previous time and the times of their vertices, then moves the visuals to positive infinity. Used at the end of a recording so that the visuals will not overlap with the ones being recorded. Only processes visuals in the current slide after the current time.
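One way the getSlideAtTime lookup above could work is by accumulating slide durations until the query time is covered. This is an illustrative sketch under assumed data shapes (slides as objects with a duration field), not the actual Pentimento implementation.

```javascript
// Find the slide displayed at a given lecture time by walking the
// ordered slide list and accumulating durations (all times in ms).
function getSlideAtTime(slides, timeMs) {
  let elapsed = 0;
  for (const slide of slides) {
    elapsed += slide.duration;
    if (timeMs < elapsed) {
      return slide;
    }
  }
  return null; // the time is past the end of the visuals
}
```

This matches the documented behavior of getDuration as well: the total visuals duration is simply the sum of the per-slide durations.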
cleanVisuals
Parameters: amount (the amount of time, in milliseconds, that the visuals previously set as dirty will need to be shifted by to accommodate the new recording)
Restores visuals to their previous time plus the indicated amount. Used at the end of a recording during insertion to shift visuals forward.

doShiftVisual and shiftVisual
Non-functional; written by a previous M.Eng. student as part of the "shift as you go" approach to shifting visuals during insertion. Left in the code base in case that method is revisited.

prevNeighbor
Parameters: visual
Return: the previous visual (i.e., the visual that occurs right before the specified visual in time)

nextNeighbor
Parameters: visual
Return: the next visual (i.e., the visual that occurs right after the specified visual in time)

segmentVisuals
Parameters: visuals (an array of all visuals)
Return: an array of segments, where each segment consists of a set of contiguous visuals

getSegmentShifts
Parameters: segments (an array of visual segments, where a segment is a set of contiguous visuals)
Return: an array of the amounts by which to shift each segment

saveToJSON
Saves the visuals as a JSON object.

loadFromJSON
Parameters: json_object
Return: an instance of the visuals model with the data specified in the JSON object (loaded from a file)

A.3.2 Visuals Controller

getVisualsModel
Return: the visuals model

getRetimerModel
Return: the retimer model

drawVisuals
Parameters: audio_time
Draws visuals on the canvas using the renderer. The time argument is optional; if specified, it is the audio time at which to draw the associated visuals (the visual time is calculated from the retimer). If the time is not specified, visuals are drawn at the current time of the time controller.

startRecording
Parameters: currentTime (the time at which to start recording)
Begins recording visuals on the slide at the current time.
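The segmentVisuals grouping documented above can be sketched roughly as follows. The grouping criterion here (a time gap larger than a threshold starts a new segment) is an assumption for illustration; the actual rule Pentimento uses to decide contiguity may differ.

```javascript
// Group visuals into segments of contiguous visuals: visuals sorted by
// time belong to the same segment while consecutive gaps stay within
// gapMs (times in milliseconds; threshold is an illustrative choice).
function segmentVisuals(visuals, gapMs = 1000) {
  const sorted = [...visuals].sort((a, b) => a.time - b.time);
  const segments = [];
  let current = [];
  for (const v of sorted) {
    const last = current[current.length - 1];
    if (current.length > 0 && v.time - last.time > gapMs) {
      segments.push(current); // gap too large: close the segment
      current = [];
    }
    current.push(v);
  }
  if (current.length > 0) segments.push(current);
  return segments;
}
```

Grouping into contiguous segments lets getSegmentShifts compute one shift per segment rather than per visual, which keeps strokes that were drawn together moving together.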
stopRecording
Parameters: currentTime (the time at which to stop recording)
Stops the recording. If it is an insertion, visuals after the recording time are "cleaned" to move to the end of the insertion, and durations are updated.

startPlayback
Parameters: currentTime (the time at which to start playback)
Starts playback.

stopPlayback
Parameters: currentTime (the time at which to stop playback)
Stops playback.

currentVisualTime
Return: the visual time (converted from the time controller time through the retimer)

currentSlide
Return: the slide at the current time (obtained from the visuals model)

addSlide
Adds a slide to the visuals model.

addVisual
Adds a visual to the visuals model (once it is done being drawn).

recordingDeleteSelection
Deletes the selected visuals during recording and sets the tDeletion property for all of the selected visuals.

editingDeleteSelection
Deletes the selected visuals while in editing mode, which removes the selected visuals entirely from all points in time.

recordingSpatialTransformSelection
Parameters: transform_matrix (the matrix that will transform the selected visuals to the correct place)
Transforms the visuals spatially during recording. Gets the selected visuals and compares the new position with the original position to calculate the final transform matrix, which is added to the spatial transforms of those visuals.

editingSpatialTransformSelection
Parameters: transform_matrix (the matrix that will transform the selected visuals to the correct place)
Transforms the visuals spatially during editing. Gets the selected visuals and adds the transform matrix to the spatial transforms of those visuals.

recordingPropertyTransformSelection
Parameters: visual_property_transform (the visual property that will be changed by the selection, i.e., color or width)
Changes the properties of the selected visuals during recording.
Adds a property transform to the selected visuals' property transforms.

editingPropertyTransformSelection
Parameters: property_name (the property that will be changed), new_value (the value to change the property to)
Changes the properties of the selected visuals during editing. Updates the specified property to the new property value (e.g., changes from one color to another).

A.4 Tools Controller

startRecording
Activates the recording tools and hides the editing tools.

stopRecording
Activates the editing tools and hides the recording tools.

toolEventHandler
Parameters: event
Handles a click event on one of the tool buttons (handles both recording and editing tools).

acitvateCanvasTool
Activates a tool on the canvas. This is used for tools such as pen, highlight, and select. The tool that is registered is the active tool for the current mode (recording/editing). Initializes mouse and touch events for the active tool.

drawMouseDown
Parameters: event
Used when the pen tool is active. Called when the mouse is pressed down or a touch event is started. Activates the mouse move and mouse up handlers and starts a new current visual (i.e., the visual that is being drawn by the pen).

drawMouseMove
Parameters: event
Used when the pen tool is active. When the mouse is down and moved, or a touch is moving, appends a new vertex to the current visual.

drawMouseUp
Parameters: event
Used when the pen tool is active. When the mouse is released or a touch ends, clears the handlers and adds the completed visual.

resetSelectionBox
Parameters: event
Resets the selection box so that it is not visible.

selectMouseDown
Used when the selection tool is active. When the mouse is pressed down or a touch event is started, activates the selection box and the mouse move and mouse up handlers.

selectMouseMove
Parameters: event
Used when the selection tool is active. When the mouse is down and moved, or a touch event is moving, updates the dimensions of the selection box and the selection vertices.
selectMouseUp
Parameters: event
Used when the selection tool is active. When the mouse is released or a touch ends, clears the handlers and turns on dragging and resizing of the selection box.

selectBoxStartTranslate
Parameters: event, ui
While dragging a selection box, stores the original UI element dimensions.

selectBoxEndTranslate
Parameters: event, ui
While editing, handles the end of dragging a selection box.

selectBoxEndScale
Parameters: event, ui
While editing, handles the end of resizing a selection box.

widthChanged
Parameters: new_width (the newly selected width for the pen tool)
Handles changing the width of the pen tool.

colorChanged
Parameters: new_spectrum_color (the newly chosen color for the pen tool; the color is passed in as a spectrum.js color and then converted to hex)
Handles changing the color of the pen tool.

isInside
Parameters: rectPoint1 (top left corner of the selection rectangle), rectPoint2 (bottom right corner of the selection rectangle), testPoint (vertex point)
Return: true if the test vertex is inside the selection; otherwise returns false
Tests whether a vertex is inside the rectangle formed by the two rectangle points that form the selection box.

getCanvasPoint
Parameters: event
Return: Vertex(x, y, t, p) with x, y on the canvas and t a global time
Gives the location of the mouse event on the canvas, as opposed to on the page.

getTouchPoint
Parameters: eventX, eventY (the coordinates of the touch event)
Return: Vertex(x, y, t, p) with x, y on the canvas and t a global time
Gives the location of the touch event on the canvas, as opposed to on the page.

calculateTranslateMatrix
Parameters: original_position, new_position (a position is represented as { left, top })
Return: the translation matrix
Given the original and new position of a box in the canvas, calculates and returns the math.js matrix necessary to translate the box from the original to the new coordinates.
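The isInside test documented above amounts to a point-in-rectangle check. A sketch of that logic, using plain {x, y} objects in place of the Vertex type:

```javascript
// True when testPoint lies within the axis-aligned rectangle spanned
// by rectPoint1 (top left) and rectPoint2 (bottom right). Canvas
// coordinates grow rightward in x and downward in y.
function isInside(rectPoint1, rectPoint2, testPoint) {
  return (
    testPoint.x >= rectPoint1.x && testPoint.x <= rectPoint2.x &&
    testPoint.y >= rectPoint1.y && testPoint.y <= rectPoint2.y
  );
}
```

Running this check against every vertex of every visual is what lets the selection box determine which visuals fall inside it.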
calculateScaleMatrix
Parameters: original_position, original_size, new_position, new_size (a position is represented as { left, top }; a size is represented as { width, height })
Return: scaling matrix
Given the original and new dimensions of a selection box in the canvas, calculates and returns the math.js matrix necessary to scale the box from the original to the new coordinates. Scaling normally introduces a translation as a side effect, so the matrix returned by this function also negates that translation.

A.5 Audio

A.5.1 Audio Model

getAudioTracks
Return: an array containing all audio tracks

setAudioTracks
Parameters: tracks (array containing audio tracks)
Sets the audio tracks to the specified tracks.

addTrack
Parameters: track, insert_index (optional)
Adds the track to the end of the audio tracks, unless the insertion index is specified, in which case the track is inserted at the chosen index.

removeTrack
Parameters: track
Return: true if the removal completes, false otherwise
Removes the specified audio track.

getDuration
Return: the total duration of the audio (in milliseconds), which is the maximum of all audio track lengths. Returns 0 if there are no audio tracks.

getBlobURLs
Return: an array of all the unique audio blob URLs

saveToJSON
Return: a JSON object containing the audio JSON
Saves the model to JSON.

loadFromJSON
Parameters: json_object (JSON object containing the audio information)
Return: an audio model populated with the information from the JSON object

getAudioSegments
Return: an array of all audio segments

setAudioSegments
Parameters: segments
Sets the segments in the track to the specified segments.

insertSegment
Parameters: new_segment, do_shift_split
Return: true if the insert succeeds without a split. If a split occurs, returns an object {left, right, remove} with the left and right sides of the split segment and the segment that was removed to become the left and right parts.
Inserts the provided segment.
Note: another segment in the track may need to be split in order to insert the specified new segment.

addSegment
Parameters: segment
Adds the segment to the audio segments array.

removeSegment
Parameters: segment
Return: true if the segment is removed
Removes the specified audio segment.

canShiftSegment
Parameters: segment, shift_millisec
Return: true if the shift is valid; otherwise returns the shift value of the greatest magnitude that would have produced a valid shift
Determines whether the specified segment can be shifted to the left or right. If a negative number is given for shift_millisec, the shift is to the left. The final value of the segment starting time cannot be negative, and the segment cannot overlap existing segments in the track. If the shift would violate either of these conditions, the shift cannot occur.

shiftSegment
Parameters: segment, shift_millisec, check (optional, defaults to true; if false, the shift is performed without checking for validity)
Return: true if the shift succeeds; otherwise returns the shift value of the greatest magnitude that would have produced a valid shift
Shifts the specified segment left or right by the given number of milliseconds. If a negative number is given for shift_millisec, the shift is to the left.

canCropSegment
Parameters: segment, crop_millisec, left_side (boolean indicating whether the left side is being cropped)
Return: true if the crop is valid; otherwise returns the crop value of the greatest magnitude that would have produced a valid crop
Determines whether the specified segment can be cropped on the left or right. If a negative number is given for crop_millisec, the crop shrinks the segment; if a positive number is given, the crop extends the segment. The segment cannot overlap existing segments in the track, cannot extend past the audio length, and cannot shrink below a length of 0.
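The validity rules for shifting stated above (no negative start time, no overlap with other segments) can be sketched as a pure predicate. This is an illustration of the rules only; the real canShiftSegment additionally returns the largest valid shift instead of a plain false:

```javascript
// Sketch of the canShiftSegment rules. Segments are { start_time, end_time }
// in milliseconds; `others` holds the remaining segments in the same track.
function canShiftSegment(segment, shiftMs, others) {
    const start = segment.start_time + shiftMs;
    const end = segment.end_time + shiftMs;
    if (start < 0) return false;  // segment may not start before time 0
    // the shifted segment may not overlap any other segment in the track
    return others.every(o => end <= o.start_time || start >= o.end_time);
}
```

Cropping uses the same overlap test, with the extra constraints that the segment cannot extend past its audio resource or shrink below zero length.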
cropSegment
Parameters: segment, crop_millisec, left_side (boolean indicating whether the left side is being cropped), check (optional, defaults to true; if false, the crop is performed without checking for validity)
Return: true if the crop is valid; otherwise returns the crop value of the greatest magnitude that would have produced a valid crop
Crops the specified segment by the specified number of milliseconds. If a negative number is given for crop_millisec, the crop shrinks the segment side.

endTime
Return: the end time of the track in milliseconds, which is the greatest segment end time. Returns 0 if the track is empty.

saveToJSON
Return: a JSON object containing the audio track JSON
Saves the audio tracks to JSON.

loadFromJSON
Parameters: json_object (JSON object containing the audio information)
Return: an audio track with the information from the JSON object

audioResource
Return: the URL of the audio resource blob needed for playback

totalAudioLength
Return: the total length of the audio resource blob

lengthInTrack
Return: the length of the segment in the track

audioLength
Return: the length of the audio that should be played back

splitSegment
Parameters: splitTime
Return: an object {left, right} with the two segments that result from splitting the segment at the specified track time. Returns null if the track time does not intersect the segment within (start_time, end_time).
Splits an audio segment at the specified time.

trackToAudioTime
Parameters: trackTime
Return: the audio time. Returns false if the given track time is invalid.
Converts a track time to the corresponding time in the audio resource at the current scale.

audioToTrackTime
Parameters: audioTime
Return: the track time. Returns false if the given audio time is invalid.
Converts a time in the audio resource to the corresponding time in the track at the current scale.

saveToJSON
Return: a JSON object containing the audio segment JSON
Saves the audio segment to JSON.

loadFromJSON
Parameters: json_object (JSON object containing the audio segment information)
Return: an audio segment with the information from the JSON object

A.5.2 Audio Controller

getAudioModel
Return: the audio model

addTrack
Creates a new track and adds it to the model. Redraws the audio timeline.

removeTrack
Removes a track from the audio model. Redraws the audio timeline.

changeActiveTrack
Parameters: index (index of the track to make active)
Changes the active track index to refer to another track.

startRecording
Parameters: currentTime
Starts recording the audio at the given track time (in milliseconds).

stopRecording
Parameters: currentTime
Ends the recording (only applies if there is an ongoing recording).

startPlayback
Parameters: currentTime
Begins audio playback at the given track time (in milliseconds).

stopPlayback
Parameters: currentTime
Stops all playback activity.

millisecondsToPixels
Parameters: millSec
Return: the pixel value
Converts milliseconds to pixels according to the current audio timeline scale.

pixelsToMilliseconds
Parameters: pixels
Return: the millisecond value
Converts pixels to milliseconds according to the current audio timeline scale.

tickFormatter
Parameters: tickpoint
Return: a time display string (e.g. 00:30:00)
Converts tickpoints into a time display (e.g. 00:30:00). Each tickpoint unit is one second, which is then scaled by the audio timeline scale.
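The trackToAudioTime / audioToTrackTime pair in the audio segment section above maps between a segment's position in the track and the interval of the underlying audio resource it plays. A plausible sketch of that mapping, assuming a linear scale between the two intervals (the field names follow the saved-lecture JSON in Appendix B; the exact formula is an assumption, not quoted from the Pentimento source):

```javascript
// Sketch of trackToAudioTime for one segment. The segment occupies
// [start_time, end_time] in the track and plays back the audio resource
// interval [audio_start_time, audio_end_time]; all times in milliseconds.
function trackToAudioTime(seg, trackTime) {
    if (trackTime < seg.start_time || trackTime > seg.end_time) return false;
    // scale between track duration and audio duration (1 when unscaled)
    const scale = (seg.audio_end_time - seg.audio_start_time) /
                  (seg.end_time - seg.start_time);
    return seg.audio_start_time + (trackTime - seg.start_time) * scale;
}

// Example segment taken from the Appendix B JSON (second audio segment).
const seg = { audio_start_time: 0, audio_end_time: 4399,
              start_time: 12528, end_time: 16927 };
```

audioToTrackTime is the inverse mapping, dividing by the same scale instead of multiplying.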
disableEditUI
Disables all UI functionality for editing audio (used during recording and playback).

enableEditUI
Enables all UI functionality for editing audio (used when recording or playback stops).

drawTracksContainer
Return: the jQuery tracks container object
Draws the container that will be used to hold audio tracks.

pluginTopOffset
Parameters: pluginIndex
Return: the offset from the top of the tracks container (in pixels)
Gets the offset (in pixels) from the top of the tracks container for the nth plugin. Using a pluginIndex equal to the number of plugins returns the offset needed by the tracks that are drawn under the plugins.

refreshGradations
Redraws the gradations container to fit the current audio tracks.

drawGradations
Draws the gradation marks on the audio timeline.

refreshPlayhead
Refreshes the playhead position.

drawPlayehead
Draws the playhead for showing the playback location.

zoom
Parameters: zoomOut (defaults to true to indicate zooming out; false means zoom in)
Zooms the audio timeline in or out.

draw
Draws all parts of the audio timeline onto the page.

updatePlayheadTime
Parameters: currentTime
Updates the current time (ms) of the audio timeline (the time indicated by the playhead).

updateTicker
Parameters: time
Updates the ticker display indicating the current time as a string.

timelineClicked
Parameters: event
When the timeline is clicked, updates the playhead to be drawn at the time of the clicked position.
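The unit conversions above are what let timelineClicked turn a click's pixel position into a playhead time. A sketch of the conversions and of an hh:mm:ss tick formatter follows; the pixelsPerSecond parameter stands in for the controller's internal timeline scale, and the exact display layout of the real tickFormatter is an assumption:

```javascript
// Timeline unit conversions: the scale is expressed as pixels per second.
function millisecondsToPixels(ms, pixelsPerSecond) {
    return (ms / 1000) * pixelsPerSecond;
}
function pixelsToMilliseconds(px, pixelsPerSecond) {
    return (px / pixelsPerSecond) * 1000;
}

// Format a tickpoint (in seconds) as an hh:mm:ss display string.
function tickFormatter(tickSeconds) {
    const pad = n => String(n).padStart(2, "0");
    const h = Math.floor(tickSeconds / 3600);
    const m = Math.floor((tickSeconds % 3600) / 60);
    const s = Math.floor(tickSeconds % 60);
    return pad(h) + ":" + pad(m) + ":" + pad(s);
}
```

Because both conversions share one scale value, zooming only has to change that value and redraw; all pixel/time math stays consistent.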
addTimelinePlugin
Parameters: plugin
Adds the plugin to the list of plugins.

getTimelinePluginID
Parameters: plugin
Return: the ID of the plugin, which is calculated as the base plus the index of the plugin in the array

A.5.3 Track Controller

getID
Return: the ID of the track

getLength
Return: the length of the track (in milliseconds)

getAudioTrack
Return: the audio track

insertSegment
Parameters: newSegment (segment to be inserted)
Inserts a new segment into the audio track.

removeSegment
Parameters: segment
Removes a segment from the track.

segmentDragStart
Parameters: event, ui, segmentController
Callback for when a segment UI div starts to be dragged. Sets the initial internal variables.

segmentDragging
Parameters: event, ui, segmentController
Callback for when a segment UI div is being dragged. Tests whether or not the drag is valid. If the drag is valid, it does nothing, allowing the segment UI div to be dragged to the new position. If the drag is invalid, it sets the segment UI div back to the last valid position.

segmentDragFinish
Parameters: event, ui, segmentController
Callback for when a segment UI div has finished being dragged. Performs the drag in the audio model.

segmentCropStart
Parameters: event, ui, segmentController
Callback for when a segment UI div starts to be cropped. Sets the initial internal variables.

segmentCropping
Parameters: event, ui, segmentController
Callback for when a segment UI div is being cropped. If the cropping is valid, it does nothing. If the cropping is invalid, it sets the UI div back to the original size and position.

segmentCropFinish
Parameters: event, ui, segmentController
Callback for when a segment UI div has finished being cropped. The cropping should always be valid because the 'segmentCropping' callback only allows cropping to happen in valid ranges. Performs the crop in the audio track.

removeFocusedSegments
Removes all segments that have focus.
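The drag callbacks above follow a "validate, else snap back" pattern: the in-progress position is converted from pixels to a candidate shift, and only valid shifts are allowed to stand. A DOM-free sketch of that decision (the helper name and the pxPerMs parameter are illustrative; the real code works with jQuery UI's `ui.position`):

```javascript
// Sketch of the segmentDragging validity check: convert the dragged div's
// proposed pixel position into a shift in milliseconds, then either keep
// the new position or revert to the last valid one.
function dragPosition(lastValidLeftPx, proposedLeftPx, pxPerMs, isValidShift) {
    const shiftMs = (proposedLeftPx - lastValidLeftPx) / pxPerMs;
    // invalid drags snap back to the last valid position
    return isValidShift(shiftMs) ? proposedLeftPx : lastValidLeftPx;
}
```

Deferring the model update to segmentDragFinish keeps the audio model consistent: the UI may wander during a drag, but the model only ever records the final, validated shift.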
startPlayback
Parameters: startTime, endTime
Starts playback of the track over the specified time interval. Stops the previous playback if one is currently in progress. The time is specified in milliseconds. If the end time is not specified, playback continues until the end of the track.

stopPlayback
Stops playback of the track. Does nothing if the track is not playing.

refreshView
Refreshes the view to reflect the state of the model for an audio track.

draw
Parameters: jqParent (jQuery container where the track should be drawn)
Return: a new jQuery track
Draws a track into the parent jQuery container.

A.5.4 Segment Controller

getID
Return: the segment ID

getWavesurferContainerID
Return: the ID of the wavesurfer container

getClassName
Return: the name of the class used to represent audio segments

getAudioSegment
Return: the audio segment

getParentTrackController
Return: the parent track controller

startPlayback
Parameters: delay, trackStartTime, trackEndTime
Plays the audio segment back after a delay over the specified time interval (in milliseconds). If the end time is undefined, plays until the end. If playback is currently in progress or scheduled, cancels it and starts a new one.

stopPlayback
Stops any ongoing or scheduled playback.

refreshView
Refreshes the view to reflect the state of the model for the audio segment.

draw
Parameters: jqParent (jQuery container where the segment should be drawn)
Return: a new jQuery segment
Draws a segment into the parent jQuery container.

shiftWavesurferContainer
Parameters: pixelShift
Shifts the internal wavesurfer container left (negative) or right (positive) in pixels. This is used when cropping to move the container so the cropping motion looks natural.
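When a track plays back an interval, each segment controller only needs to play the part of the requested window that overlaps its own extent. A small sketch of that clamping step (the function name and return shape are illustrative assumptions; the real controllers schedule the actual audio through wavesurfer):

```javascript
// Sketch: clamp a requested playback window [startTime, endTime] to one
// segment's extent in the track. Times are in milliseconds; an undefined
// endTime means "play to the end of the track", as in the track controller.
function playbackWindow(seg, startTime, endTime) {
    const end = (endTime === undefined) ? seg.end_time
                                        : Math.min(endTime, seg.end_time);
    const start = Math.max(startTime, seg.start_time);
    if (start >= end) return null;  // this segment is outside the window
    return { start: start, duration: end - start };
}
```

Segments whose window comes back null are simply skipped, which is how a mid-track playhead position naturally omits earlier segments.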
A.6 Retimer

A.6.1 Retimer Model

getConstraints
Return: an array containing all of the constraints

makeConstraintDirty
Parameters: constraint
Return: the constraint, having been disabled

cleanConstraints
Parameters: constraint, amount (amount to shift the original time of the constraint)
Shifts the dirty constraints by the specified amount (from their original time) and re-enables them.

checkConstraint
Parameters: constraint
Return: true if this is a valid constraint, false otherwise
Checks whether the constraint is in a valid position.

updateConstraintVisualsTime
Parameters: constraint, audioTimeCorrespondingToNewVisualsTime, test (optional Boolean, defaults to false; indicates whether to test the update without actually updating)
Return: a Boolean indicating whether the update was successful
Updates the visuals part of the constraint located at the specified audio time (tAud).

updateConstraintAudioTime
Parameters: constraint, newTAudio, test (optional Boolean, defaults to false; indicates whether to test the update without actually updating)
Return: a Boolean indicating whether the update was successful
Updates the audio part of the constraint located at the specified visuals time (tVis).

addConstraint
Parameters: constraint
Return: true if the constraint is successfully added
Adds a constraint to the lecture.

deleteConstraint
Parameters: constraint
Deletes the specified constraint.

shiftConstraints
Parameters: constraints, amount
Shifts the specified constraints by the specified amount of time.

getConstraintsIterator
Return: an iterator over all constraints

getPreviousConstraint
Parameters: time, type (visual or audio)
Return: the constraint that appears in time before the given time

getNextConstraint
Parameters: time, type (visual or audio)
Return: the constraint that appears in time after the given time

getVisualTime
Parameters: audioTime
Return: the visual time associated with the given audio time
Converts an audio time to a visual time.

getAudioTime
Parameters: visualTime
Return: the audio time associated with the given visual time
Converts a visual time to an audio time.

saveToJSON
Return: a JSON object containing the constraints JSON information
Saves the constraints to JSON.

loadFromJSON
Parameters: json_object
Return: an instance of the retimer model with the data specified in the JSON object (loaded from a file)

A.6.2 Retimer Controller

addArrowHandler
Parameters: event
The event handler for when a user clicks on the constraints canvas after clicking the "add constraint" button. Adds the constraint to the model and then draws the arrow on the canvas.

drawTickMarks
Draws tick marks on the retimer canvas to indicate how quickly or slowly the visuals are being played back. (Note: not currently active; interpolation is not working properly.)

drawConstraint
Parameters: constraint_num (unique id for each constraint added, incremented by the retimer)
Draws the constraint on the constraints canvas (for manual/user-added constraints).

redrawConstraints
Refreshes the canvas and redraws the constraints.

redrawConstraint
Parameters: constraint, constraint_num
Redraws an individual constraint on the retimer canvas.

addConstraint
When a user adds a constraint, adds the constraint to the retimer model.

selectArea
Parameters: event
Handles the event when a user clicks on the retimer canvas to select a constraint.

selectionDrag
Parameters: event
As the user drags along the retimer canvas, the selection box is updated and drawn.

endSelect
Parameters: event
Handles the end of a selection drag along the retimer canvas.

selectConstraints
Parameters: event
Finds the constraints that are within the selection area.

displaySelectedConstraints
Parameters: event
Redraws the constraints that have been selected so that they are displayed in red.

deleteConstraints
Parameters: event
Deletes the selected constraint(s) from the retimer model.

constraintDragStart
Parameters: layer (jCanvas layer containing the constraint to be dragged)
When dragging starts, records whether the drag is for the top or bottom of the arrow (the visuals end or audio end, respectively) and records the original x position of that end of the arrow.

constraintDrag
Parameters: layer (jCanvas layer containing the constraint being dragged)
Dragging moves one end of the arrow while the other end remains in place.

constraintDragStop
Parameters: layer (jCanvas layer containing the constraint that has stopped being dragged)
When dragging stops, updates the visuals or audio time of the constraint depending on whether the drag was on the top or bottom. Updates the thumbnails accordingly.

constraintDragCancel
Parameters: layer (jCanvas layer containing the constraint being dragged)
When dragging is cancelled (i.e. the user drags the constraint off the canvas), the constraint resets to its original value.

beginRecording
Parameters: currentTime
Adds automatic constraints at the beginning of a recording.

endRecording
Parameters: currentTime
Adds an automatic constraint at the end of a recording.

A.7 Thumbnails Controller

drawThumbnails
Draws the thumbnails whenever the visuals in the main window are updated or changed. Calculates the number of thumbnails to draw, sets up the thumbnail canvases (each thumbnail is drawn on a separate canvas), then iterates over the number of thumbnails and calls generateThumbnail for each.

generateThumbnail
Parameters: thumbOffset (the index of the thumbnail in the sequence of all thumbnails), visuals_min (the minimum time to be displayed by the current thumbnail), visuals_max (the maximum time to be displayed by the current thumbnail), thumbnail_width (the width of the thumbnail canvas, specified to ensure that it lines up with the audio timeline)
Generates a thumbnail by getting the visuals from the slides.
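The getVisualTime / getAudioTime conversions in the retimer model map between the two timelines using the constraints on either side of the query time. One plausible realization is piecewise-linear interpolation between the previous and next constraints; the appendix does not spell the formula out, so the following is a sketch under that assumption, using the { tVis, tAud } constraint shape from the saved-lecture JSON:

```javascript
// Sketch of audio->visual time conversion by piecewise-linear interpolation
// between retimer constraints. Constraints are { tVis, tAud } in
// milliseconds and are assumed sorted by tAud.
function getVisualTime(constraints, audioTime) {
    for (let i = 0; i < constraints.length - 1; i++) {
        const a = constraints[i], b = constraints[i + 1];
        if (audioTime >= a.tAud && audioTime <= b.tAud) {
            const f = (audioTime - a.tAud) / (b.tAud - a.tAud);
            return a.tVis + f * (b.tVis - a.tVis);
        }
    }
    return null;  // outside the constrained range
}
```

Automatic constraints at the start and end of each recording guarantee that playback queries always fall between two constraints, so the null case only arises for malformed input.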
Appendix B: Example of Saved Lecture JSON Structure

{
  "visuals_model": {
    "slides": [
      {
        "visuals": [
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 947,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 92.0625, "y": 31, "t": 949 },
              { "x": 92.0625, "y": 32, "t": 1034 },
              { "x": 93.0625, "y": 33, "t": 1046 },
              { "x": 93.0625, "y": 34, "t": 1059 },
              { "x": 93.0625, "y": 36, "t": 1073 }
            ]
          },
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 2531,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 163.0625, "y": 56, "t": 2531 },
              { "x": 166.0625, "y": 53, "t": 2594 },
              { "x": 168.0625, "y": 51, "t": 2603 },
              { "x": 171.0625, "y": 50, "t": 2617 },
              { "x": 174.0625, "y": 48, "t": 2629 }
            ]
          },
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 9468,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 125.0625, "y": 258, "t": 9470 },
              { "x": 125.0625, "y": 257, "t": 9491 },
              { "x": 127.0625, "y": 254, "t": 9522 },
              { "x": 131.0625, "y": 251, "t": 9528 }
            ]
          }
        ],
        "duration": 23116
      },
      {
        "visuals": [
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 947,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 92.0625, "y": 31, "t": 949 },
              { "x": 92.0625, "y": 32, "t": 1034 },
              { "x": 93.0625, "y": 33, "t": 1046 },
              { "x": 93.0625, "y": 34, "t": 1059 }
            ]
          },
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 2531,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 163.0625, "y": 56, "t": 2531 },
              { "x": 166.0625, "y": 53, "t": 2594 },
              { "x": 168.0625, "y": 51, "t": 2603 },
              { "x": 171.0625, "y": 50, "t": 2617 },
              { "x": 174.0625, "y": 48, "t": 2629 }
            ]
          },
          {
            "type": "Stroke",
            "hyperlink": null,
            "tDeletion": null,
            "propertyTransforms": [],
            "spatialTransforms": [],
            "tMin": 9468,
            "properties": { "c": "#777", "w": 2 },
            "vertices": [
              { "x": 125.0625, "y": 258, "t": 9470 },
              { "x": 125.0625, "y": 257, "t": 9491 },
              { "x": 127.0625, "y": 254, "t": 9522 },
              { "x": 131.0625, "y": 251, "t": 9528 }
            ]
          }
        ],
        "duration": 23116
      }
    ],
    "canvas_width": 800,
    "canvas_height": 500
  },
  "audio_model": {
    "audio_tracks": [
      {
        "audio_segments": [
          {
            "audio_clip": 0,
            "total_audio_length": 12528,
            "audio_start_time": 0,
            "audio_end_time": 12528,
            "start_time": 0,
            "end_time": 12528
          },
          {
            "audio_clip": 1,
            "total_audio_length": 4399,
            "audio_start_time": 0,
            "audio_end_time": 4399,
            "start_time": 12528,
            "end_time": 16927
          }
        ]
      }
    ]
  },
  "retimer_model": {
    "constraints": [
      { "tVis": 0, "tAud": 0, "constraintType": "Automatic" },
      { "tVis": 6650, "tAud": 6650, "constraintType": "Manual" },
      { "tVis": 9525, "tAud": 9525, "constraintType": "Manual" },
      { "tVis": 12528, "tAud": 12528, "constraintType": "Automatic" },
      { "tVis": 14500, "tAud": 14500, "constraintType": "Manual" },
      { "tVis": 16927, "tAud": 16927, "constraintType": "Automatic" }
    ]
  }
}
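A saved lecture of the shape shown in Appendix B can be consumed directly as a JavaScript object. The sketch below derives the overall audio duration from a reduced lecture object (only the audio and retimer models are reproduced inline; the helper name audioDuration is illustrative, mirroring the audio model's getDuration, which is the maximum segment end time over all tracks):

```javascript
// A reduced saved-lecture object shaped like the Appendix B JSON.
const lecture = {
    audio_model: {
        audio_tracks: [{
            audio_segments: [
                { audio_clip: 0, start_time: 0, end_time: 12528 },
                { audio_clip: 1, start_time: 12528, end_time: 16927 }
            ]
        }]
    },
    retimer_model: {
        constraints: [
            { tVis: 0, tAud: 0, constraintType: "Automatic" },
            { tVis: 16927, tAud: 16927, constraintType: "Automatic" }
        ]
    }
};

// Audio duration = max over tracks of the greatest segment end time,
// or 0 when there are no tracks/segments (as documented for getDuration).
function audioDuration(model) {
    return Math.max(0, ...model.audio_tracks.map(track =>
        Math.max(0, ...track.audio_segments.map(s => s.end_time))));
}
```

Note how the automatic constraints in the retimer model bracket the recording: the last one sits exactly at the audio end time (16927 ms), which keeps the visual and audio timelines aligned at the lecture boundary.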