by Jonathan Wang
S.B., C.S. M.I.T., 2013
Submitted to the
Department of Electrical Engineering and Computer Science in Partial
Fulfillment of the Requirements for the Degree of
Master of Engineering in Electrical Engineering and Computer Science at the
Massachusetts Institute of Technology
June 2015
Copyright 2015 Jonathan Wang. All rights reserved.
The author hereby grants to M.I.T. permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole and in part in any medium now known or hereafter created.
Author: ________________________________________________________________
Department of Electrical Engineering and Computer Science
May 21, 2015
Certified by: ________________________________________________________________
Prof. Fredo Durand, Thesis Supervisor
May 21, 2015
Accepted by: ________________________________________________________________
Prof. Albert R. Meyer, Chairman, Masters of Engineering Thesis Committee
Pentimento: Non-sequential Authoring of Handwritten Lectures by Jonathan Wang
Submitted to the Department of Electrical Engineering and Computer Science
May 21, 2015
In Partial Fulfillment of the Requirements for the Degree of Master of Engineering in Electrical
Engineering and Computer Science
Pentimento is software developed under the supervision of Fredo Durand in the
Computer Graphics Group at CSAIL that focuses on dramatically simplifying the creation of online educational video lectures such as those of Khan Academy. In these videos, the lecture style is that the educator draws on a virtual whiteboard as he/she speaks. Currently, the software that the educator uses is very rudimentary and only allows for basic functionality such as screen and voice recording. A downside of this approach is that the educator must get it right on the first attempt, as there is no way to simply edit the content captured during a screen recording without using unnecessarily complex video editing software. Even with video editing software, the user is not able to access the original drawing content used to create the video.
The overall goal of this project is to develop lecture recording software that uses a vector-based representation to keep track of the user's sketching, which will allow the user to easily edit the original drawing content retroactively.
The goal for my contribution to this project is to implement components for a web-based version of Pentimento. This will allow the application to reach a broader range of users. The goal is to have an HTML5 and Javascript based application that can run on many of the popular web browsers in use today. One of my main focuses in this project is to work on the audio recording and editing component. This includes working on the user interface component and integrating it with the rest of the software.
Pentimento is software developed under the supervision of Fredo Durand in the Computer
Graphics Group at CSAIL that focuses on dramatically simplifying the creation of online educational video lectures such as those of Khan Academy. In these videos, the lecture style is that the educator draws on a virtual whiteboard as he/she speaks. Currently, the software that the educator uses is very rudimentary and only allows for basic functionality such as screen and voice recording. A downside of this approach is that the educator must get it right on the first attempt, as there is no way to simply edit the content captured during a screen recording without using unnecessarily complex video editing software. Even with video editing software, the user is not able to access the original drawing content used to create the video. The overall goal of this project is to develop lecture recording software that uses a vector-based representation to keep track of the user's sketching, which will allow the user to easily edit the original drawing content retroactively.
There is a Mac-based application called Pentimento that allows the recording of the educator's voice and the drawing on the screen. These two components are separate. The drawing is done using a vector-based representation, where individual strokes can be edited independently of the audio recording. The drawing and voice recording are stored in a data format that allows them to be accessed by the current software for playback. The data format also allows the capability of playback on other platforms so that playback capability will not be strictly limited to using the
Pentimento software. The drawing component of the software is similar to other simple drawing programs such as Microsoft Paint. It allows the user to create drawings comprised of individual points by using a pen tool. This allows for lines of different colors and widths. The editing
capabilities allow the user to jump to any point during the recording and edit the content at that time. This includes adding, deleting, and moving drawings. Movements of the content such as transformations and translations can also be recorded. There is also a basic set of tools for editing capabilities such as the undo tool and the lasso tool, which allows the selection of multiple drawing points at one time.
The next step in this project is to implement a web-based version of Pentimento. This will allow the application to reach a broader range of users. The goal is to have an HTML5 and Javascript based application that can run on many of the popular web browsers in use today. One of my main focuses in this project is to work on the audio recording and editing component. This includes working on the user interface component and integrating it with the rest of the software.
Pentimento was created to allow for easy creation and revision of handwritten “digital blackboard” style lecture videos. However, Pentimento transcends current solutions by adding a simple editing component, which facilitates increased flexibility in updating lecture content once recording is completed. Other solutions barely allow editing beyond cutting content, but
Pentimento has much stronger editing capabilities, including separate editing and synchronization of audio and visual components.
This section walks a user through the Pentimento web software, detailing the user interface and explaining how to do simple recordings of lectures. Since Pentimento allows for non-linear recording and editing of lectures, there are many options for how to begin recording a lecture.
The basic Pentimento lecture consists of handwritten strokes on slides with a voiceover lecture, but there are many choices for how to create this lecture. As a lecture is recorded, the user has the option to insert slides for organizational purposes or simply to create a blank slate to record visuals on. The visuals are currently in the form of strokes, which appear as the handwritten part of the lecture. An audio track is created while recording audio, and new audio segments are created by breaks in audio recording. The audio segments can be rearranged by the user after recording. Finally, the user has a chance to create synchronization points to connect specific audio and visual moments in the lecture, allowing for playback to show user-selected visuals at a user-specified audio time.
The first unique aspect of the Pentimento software is the ability to record audio and visuals separately or together. Once audio and visual components of a lecture have been recorded, the audio and visuals can be synchronized through the retimer. The second, and probably most important, innovation of Pentimento is the ability to edit the lecture after recording. By allowing the user to change the content of the lecture (visual or audio) after recording while keeping the timing the same, it is much simpler to create an accurate, effective, and up-to-date lecture video.
Pentimento allows users to edit the lecture in many ways, such as updating layout and display (e.g. changing the color of visuals), inserting content at any time, and synchronizing the audio and visual components to make the timing exactly what is desired.
The main recording portion of the web interface is where a user can begin recording and editing visuals.
Figure 1: Main Pentimento Recording Interface (in editing mode)
As a lecturer begins recording, he or she is given the option to record just the visuals, just the audio, or both. User tests indicated that most people choose to record the visuals first, then the audio, and then add synchronization between the two [4]. For simplicity, here we will discuss how to record and edit each modality separately, but they can also be recorded at the same time.
Figure 2: Record Button with Recording Options (current state would record both visuals and audio)
Recording Visuals
The basis for recording visuals is the pen tool. After the record button is pressed, any time the pen or mouse is placed on the main drawing canvas, the resulting strokes will be recorded as part of the lecture. Clicking the stop button in the top left corner then ends the recording.
Figure 3: Main Recording User Interface (in Recording Mode). The pen tool is highlighted as the main input for lecturers. The recording canvas is shaded to indicate the space where the pen tool can be used. Finally, to stop recording, the stop button in the top left corner is clicked.
While recording visuals, it is possible to select visuals and then resize, delete or move those visuals. If a selection is made while in recording mode, that selection will become part of the recorded lecture, so when it is played back the person watching the lecture will be able to see the selection and any actions that have been taken (e.g. moving the selected visuals).
Figure 4: Using the Selection Tool. The selection tool is highlighted. In this example, the letter "C" is selected and could be deleted, moved or resized by the user.
Additionally, the color and/or width of the pen strokes can be adjusted by selecting these options from the recording menu.
Figure 5: Pen Stroke Changes. On the left is the color palette to change the color of the pen strokes. The right image shows the available widths of the pen tool.
A lecturer also has the ability to insert a new slide by pressing the add slide button. This clears the canvas and allows for a blank slate while recording. Slides can be used as organizational tools, or simply to wipe the screen clean for more space.
Figure 6: Add Slide Button
Once some visuals have been recorded they can be played back by hitting the play button.
Figure 7: In editing mode, visuals can be played back by clicking the play button (emphasized here)
Editing Visuals
Figure 8: Editing Toolbar
When recording has stopped, Pentimento enters editing mode. This allows a user to make changes that are not recorded as part of the lecture, but instead take effect from the moment the visual appears. For example, changing the color of a stroke while editing will change the color of that stroke from the moment it was written, instead of changing it mid-playback (which is what would happen if the color was changed during recording). Some other examples of visual edits are changing the width, resizing, moving, and deleting visuals.
This allows for errors to be corrected (e.g. if something is misspelled the visuals could be deleted in editing mode and the specific word could be re-recorded) and content to be updated. Layout changes are also common, since sometimes it is difficult to allocate space properly the first time a lecture is recorded.
Recording Audio
While audio can be recorded at the same time as the visuals, many users choose to record it separately. Recording audio is as simple as hitting record and then speaking into the microphone. It is also possible to insert audio files, such as background music or audio examples to enhance a lecture.
Figure 9: Recording Only Audio
Editing Audio
The main type of audio edit that is necessary in handwritten lectures of this kind is removing long silences. Often, if recording audio and visuals at the same time, writing takes longer than speaking, filling the lecture with long silences that can be deleted in the audio editing phase.
Audio segments can also be rearranged or dragged to a different time.
Figure 10: Audio Waveform displayed on the audio timeline
The Retimer
Retiming is a key innovation of Pentimento, allowing the user to resynchronize the visual and audio components of a lecture. This is a form of editing that affects the playback of the lecture, playing visuals at a user-specified time during the audio. To achieve this synchronization, the user uses the retimer display as shown. The display consists of a thumbnail timeline, displaying snapshots of visuals at time intervals throughout the lecture. These correspond to the audio timeline below. In between the thumbnails and the audio is the main feature of the retimer, where correspondences between audio and visuals are drawn.
Figure 11: The Audio Timeline and Retimer. This displays the user interface that can be used to add synchronization points between visual and audio time in a lecture. The top displays thumbnails of the lecture visuals. The bottom is the audio waveform representing the lecture audio. In between is the retiming canvas, which allows the user to add synchronization points between the visuals (represented by thumbnails) and the audio (represented by an audio waveform).
To insert a new constraint, the “add constraint” button must be clicked, and then the user must click on the place on the retimer timeline where he or she wants to draw the correspondence. These synchronization points are represented by arrows pointing to the point in the audio time and the corresponding point in the visual time. Note: Some constraints are added automatically at the beginning and end of recordings to preserve other constraint points. Automatic constraints are gray, while manually added constraints are black.
Figure 12: Add Constraint Button
Figure 13: New constraint added to the constraints canvas by the user.
To fine tune the audio and visual correspondence, the user can drag the ends of the arrow to line up with the exact audio time and the exact visual time they would like to be played together. Then the visuals on either side of the constraint will be sped up or slowed down appropriately to ensure that during playback the desired audio and visual points are played at the same time. Note: it is always the visual time being adjusted to correspond to the audio time (this decision was made because writing faster or slower flows much better than the lecturer suddenly talking faster or slower).
Figure 14: User dragging a constraint to synchronize a certain point in the audio (bottom of the arrow) with a new point in the visuals (the point the top of the arrow is dragged to)
To delete a constraint a user simply clicks within the constraints timeline and drags a selection box over the constraint(s) he or she wishes to remove.
Figure 15: User selecting a constraint to delete
This turns the selected constraints red (to visually confirm that the desired constraint has been chosen). Then the user can click on the delete constraint(s) button to remove the correspondence.
Figure 16(a): Selected constraint (indicated by turning red)
Figure 16(b): Delete Constraint(s) Button
Figure 16(c): Selected Constraint Removed
The base functionality of Pentimento is the ability to record a lecture. This process is initialized when a user clicks the record button and starts to record visuals and/or audio. This then begins the recording process in the LectureController, which propagates down to recording visuals and audio. As the user adds strokes on the main canvas, these events are captured by the VisualsController and added to the visuals array in the current slide of the VisualsModel. Similarly, the AudioController processes audio input and creates an audio segment which is stored in the current audio track. Recording input is continually added to these data structures, and changes are also processed and added. For example, if a user decides to change the color of a stroke, that property transformation is added to the data structure for that visual. Ultimately, when a recording is completed, users can then go back and edit the recorded content. This process also stores property transforms and spatial transforms as part of the visuals data structure. Retiming is another key part of editing. When a user adds a constraint to the retiming canvas, that constraint is processed and added to the constraints array with the associated visual and audio times to be synchronized.
All of these components are combined to create a Pentimento lecture. A lecture is the basic data structure, and it is composed of separate visual and audio pieces, each of which is organized into a hierarchy. The visuals are composed of slides, each of which contains visual strokes written by the lecturer. These strokes are made up of vertices (points that are connected to display the stroke). The audio contains various tracks, each of which includes audio segments. The final component of a lecture is the retiming constraints, which are the synchronization information that unites the audio and visual components at a certain time.
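To make this hierarchy concrete, the following is a minimal sketch of the lecture data structure in JavaScript; the field names are illustrative rather than the exact ones used in the code base:

    // Illustrative sketch of the lecture hierarchy (field names hypothetical).
    var lecture = {
        visuals: {
            slides: [
                { duration: 15000, visuals: [ /* strokes, each made of vertices */ ] }
            ]
        },
        audio: {
            tracks: [
                { segments: [ /* audio clips placed at locations on the track */ ] }
            ]
        },
        constraints: [
            { audioTime: 5000, visualTime: 4200, type: 'manual' }  // sync point
        ]
    };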
The Pentimento code base is organized into a Model-View-Controller (MVC) architectural pattern. The basis for any recording is the Lecture, which contains visuals, audio and retiming information. Each of these main components has a model and a controller, the details and specifications of which are outlined below. The models contain the specific data structures for each component, allowing lecture data to be manipulated. The controllers connect the lecture data to the view (the user interface), handling user inputs and making the necessary changes to the models, updating the lecture appropriately.
Figure 18: All of the modules in the Pentimento code base. Arrows indicate that there is a reference in the file at the origin of the arrow to the module where the arrow is pointing. This allows the original file to access the functionality of the sub-file.
The web version of Pentimento was written using JavaScript, jQuery, HTML5 and CSS.
Additional packages were used for displaying certain aspects of the user interface. jCanvas was used for displaying the retimer constraints, allowing a simple API for drawing and dragging the constraints, as well as selection and other canvas interactions. Wavesurfer.js is used for displaying audio waveforms. Spectrum.js is used as a color selection tool.
A Pentimento lecture is made up of visual and audio components. To allow the lecture to be played back correctly, a Pentimento lecture also contains a “retimer,” which stores the synchronization information between the visuals and the audio.
Figure 19: Illustration of the data types that comprise a Pentimento Lecture. At the highest level there is the lecture, which is comprised of visuals, audio, and retiming data.
Model
The LectureModel is the model that represents the data of the lecture, and it is a collection of the models for visuals, audio, and retimer. It has functionality for initializing, getting the total duration of the lecture, and for saving and loading to JSON.
Controller
The LectureController handles the UI functionality for recording and playback, undo and redo, and loading and saving lectures. It also serves as the entry point for the application through the $(document).ready() function. For recording and playback, it uses the TimeController to start a timing and then calls the appropriate methods in the audio and visuals controllers. During a recording, it creates a grouping for the UndoManager so that all undoable actions fall within that undo group. When the undo button is pressed, it calls a method in the LectureController that calls the undo method of the UndoManager and redraws all of the other controllers. The LectureController also registers a function as a callback to the UndoManager, and the role of this function is to update the state of the undo and redo buttons so that each one is inactive if there are no undo or redo actions that can be performed.
The LectureController is also responsible for initializing the process of creating and loading Pentimento save files. This is discussed in the Save File section.
When referring to time, there are four different time measurements.
1. Real time refers to the time of the system clock. Real time is the time returned by the system clock when using the Javascript Date object: (new Date()).getTime(). Real time is only used to keep track of the time elapsed and the relative difference between two actions in time.
2. Global time, or lecture time, refers to the global time for the lecture that is kept by the TimeController. The global time starts at 0 and the unit is milliseconds.
3. Audio time refers to the time used for keeping track of the audio elements. There is a 1-to-1 correspondence between global time and audio time, so audio time directly matches the global time. Because of this, there is no real difference between the global time and the audio time. The only difference is notional in that global time is used when referring to the time kept by the TimeController, and audio time is used when keeping track of the time in the context of the audio.
4. Visual time is used when keeping track of the time for the visual elements, and it is aligned with the global time through the retimer and its constraints. All times from the TimeController must be passed through the retimer in order to convert them into visual time.
The audio, visuals, and retimer need the TimeController in order to get the time, but the TimeController operates independently from the audio, visuals, and retimer. The TimeController has functionality to get the current time, start a timing (automatic time updating), allow a manual update of the time, and notify listeners of changes in the time. When the TimeController starts a timing, the global time will begin to count up from its current time. This timing can be stopped with a method call to the TimeController. When the LectureController begins recording, it uses this timing functionality to advance the time of the lecture. Methods can also be registered as callbacks to the TimeController so that they are called when the time is updated automatically through a timing or manually through the updateTime method.
Internally, a timing works by keeping track of the previous real time and then using a Javascript interval to trigger a time update after a predetermined real time interval. When the interval triggers, the difference between the current real time and previous real time is calculated and used to increment the global time. The current real time is saved as the previous time. The updateTimeCallbacks are called with the new global time as an argument. When a timing is not in progress, the getTime method just returns the current time. However, when a timing is in progress, the getTime method will get the current real time and calculate the difference between that and the previous time, just as happens during an interval update. Effectively, this pulls the current global time instead of just observing an outdated global time. This allows a finer granularity of time readings during a timing. This mechanism is important because if the time were only updated every interval without pulling the most recent global time, then there would be visuals occurring at different times but still reading the same global time. The updateTimeCallbacks are not called when the time is pulled during a getTime call. This is to prevent an overwhelming number of functions getting called when there are a large number of getTime calls, such as those that occur during a recording when there are many visuals being drawn that require getTime to get the time of the visual.
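The mechanism described above can be summarized in a simplified sketch (not the exact implementation; the 50 ms interval is illustrative):

    // Simplified sketch of the timing mechanism.
    var globalTime = 0;            // lecture time in milliseconds
    var previousRealTime = 0;      // last observed system clock reading
    var interval = null;
    var updateTimeCallbacks = [];  // notified on automatic updates only

    function startTiming() {
        previousRealTime = (new Date()).getTime();
        interval = setInterval(function () {
            var now = (new Date()).getTime();
            globalTime += now - previousRealTime;  // increment by elapsed real time
            previousRealTime = now;
            updateTimeCallbacks.forEach(function (callback) { callback(globalTime); });
        }, 50);
    }

    function getTime() {
        if (interval === null) { return globalTime; }
        // While a timing is in progress, "pull" the latest global time for
        // finer granularity; callbacks are intentionally not invoked here.
        var now = (new Date()).getTime();
        globalTime += now - previousRealTime;
        previousRealTime = now;
        return globalTime;
    }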
The TimeController also has methods to check if it is currently in a timing and to check the begin and end times of the previous timing. The TimeController does not have any notion of recording or playback. It is the LectureController that uses the TimeController's timing to start a recording or playback.
Model
The audio model consists of an array of audio tracks, where each audio track consists of an array of audio segments. An audio segment contains the URL for an audio clip, the total length of the clip, the start and end times within the clip, and the start and end locations within the track (audio time).
The top level audio model has functions to insert and delete tracks. The audio track class has functions to insert and delete segments. All functionality for modifying segments within a track is handled by the audio track. This includes shifting segments, cropping segments, and scaling segments. This is because no segments can overlap within a track, so modifying a segment requires knowledge of the other segments within that track to ensure that the operation is valid.
The audio segment class has methods for converting the track audio time into the time within the clip and vice versa.
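As an illustration, converting a track (audio) time into the corresponding time within the clip is a linear mapping between the two intervals. The field names in this sketch are hypothetical:

    // Hypothetical sketch: a segment stores [clipStart, clipEnd] within the clip
    // and [trackStart, trackEnd] within the track (all in milliseconds).
    function trackTimeToClipTime(segment, trackTime) {
        var fraction = (trackTime - segment.trackStart) /
                       (segment.trackEnd - segment.trackStart);
        return segment.clipStart + fraction * (segment.clipEnd - segment.clipStart);
    }

The inverse conversion follows by swapping the roles of the two intervals; the same mapping also covers scaled segments, where the track interval is longer or shorter than the clip interval.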
The audio model can be converted to and from JSON for the purpose of saving to and loading from a file. During the saving process, the audio clip URLs are converted into indices, and the resources they point to are saved with filenames corresponding to those indices.
Recording
When the LectureController begins a recording, it calls the startRecording method in the AudioController. Recording uses the RecordRTC library, which works on web browsers supporting WebRTC. When a recording starts, the AudioController uses RecordRTC to start recording audio. When the recording stops, the stopRecording method of the AudioController is called, and there is a call to make RecordRTC stop recording. The URL of the newly recorded resource is passed as an argument in the callback function for stopping recording in RecordRTC, and this URL is used to create a new audio segment that is inserted into the model. The start and end times for this segment are the begin and end times provided by the TimeController.
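A condensed sketch of this flow is shown below. The RecordRTC calls match the library's documented API; insertSegmentIntoModel is a hypothetical stand-in for the model insertion described above, and the actual code may use the older callback-style getUserMedia.

    // Condensed sketch of audio recording with RecordRTC.
    var recorder = null;

    function startRecording() {
        navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
            recorder = RecordRTC(stream, { type: 'audio' });
            recorder.startRecording();
        });
    }

    function stopRecording() {
        recorder.stopRecording(function (audioURL) {
            // audioURL points at the newly recorded resource; it is used to
            // create a new audio segment and insert it into the model.
            insertSegmentIntoModel(audioURL);  // hypothetical model call
        });
    }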
Playback
When the LectureController begins a playback, it calls the startPlayback method in the AudioController, which starts the playback in the tracks. The TrackController uses a timer to start playback for the segments after a delay. The delay is equal to the difference between the segment start time and the current audio time. If the current audio time intersects a segment, then playback for that segment begins immediately. Playback uses the Wavesurfer library to play the audio resource in the audio segments. When a segment playback starts, the SegmentController uses Wavesurfer to start playing audio. The start point of the audio can be specified so that it can start playing in the middle of the audio clip if specified by the segment parameters.
Automatically stopping playback for the segment when the current audio time moves past the end of the segment is handled by Wavesurfer by specifying the stop time for the audio clip.
When playback is stopped in the LectureController, the stopPlayback method of the AudioController is called, and it stops playback in all of the TrackControllers, which then stop playback in all of the SegmentControllers. These SegmentControllers manually stop any Wavesurfers that are in the process of playing an audio clip.
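The delay computation described above amounts to something like the following sketch (playSegment and the segment field names are hypothetical):

    // Sketch: schedule playback of each segment relative to the current audio time.
    function startTrackPlayback(track, currentAudioTime) {
        track.segments.forEach(function (segment) {
            var delay = segment.trackStart - currentAudioTime;
            if (delay <= 0 && currentAudioTime < segment.trackEnd) {
                // The current audio time intersects the segment: play
                // immediately, starting from the middle of the clip.
                playSegment(segment, currentAudioTime);
            } else if (delay > 0) {
                setTimeout(function () {
                    playSegment(segment, segment.trackStart);
                }, delay);
            }
        });
    }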
Timeline
The audio timeline is used for displaying the audio tracks to illustrate where the segments are in relation to the audio/global time. It also has a plugin functionality so that other items can be displayed on the timeline. The timeline has a pixel-to-second scale which is used for drawing items on the timeline, and it has the functionality to change this scale by zooming in and out.
This scale is illustrated through the use of labels and gradations.
Track Controller
The TrackController draws the audio track from the model and handles playback for the track. It delegates playback for the individual segments to the SegmentController. It has the functionality for translating the UI events for editing the track into parameters that can be used to call the methods to change the audio model.
Segment Controller
The SegmentController draws the audio segment from the model. It uses the Wavesurfer Javascript library to display and play the audio files that are associated with the segments. It creates the view for the segments and registers the callbacks associated with the various UI actions such as dragging and cropping.
For the UI functionality of segment editing, the JQuery UI library is used. The Draggable and Resizable functionality is used to implement segment shifting and cropping, respectively. In order to enforce the constraint that the audio segments cannot overlap one another in the same track, the drag and resize functions test to see if the new position of the segment view resulting from the drag or resize action leads to a valid shift or crop action. If the action is invalid, the position of the segment view is restored to the last valid position. The functionality for checking the validity of these operations resides in the model. The result is that the UI uses the model to check if the user actions are valid, and the UI provides the relevant visual feedback to the user.
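The validity check can be sketched with jQuery UI's Draggable as follows; canShiftSegment, shiftSegment, and pixelsToMilliseconds are hypothetical names for the model and conversion calls described above:

    // Sketch: validate a segment drag against the model, reverting if invalid.
    var lastValidLeft = null;
    $segmentView.draggable({
        axis: 'x',
        start: function (event, ui) { lastValidLeft = ui.position.left; },
        stop: function (event, ui) {
            var shift = pixelsToMilliseconds(ui.position.left - lastValidLeft);
            if (audioTrack.canShiftSegment(segment, shift)) {
                audioTrack.shiftSegment(segment, shift);  // model performs the edit
                lastValidLeft = ui.position.left;
            } else {
                // Invalid shift: restore the view to the last valid position.
                $segmentView.css('left', lastValidLeft);
            }
        }
    });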
Plugins
Audio timeline plugins are a way to display views in the audio timeline. This is a useful feature because it allows those views to line up with the audio segments. Because the audio time is used as the global time, it makes sense to visualize the information in this way. When the timeline is panned from side to side, the plugin views also pan with the rest of the timeline. The plugin is able to register a callback function that gets called when the timeline is zoomed in or out. This allows the plugin to define its own implementation of what is supposed to happen when the pixel-to-second scale of the timeline changes.
The other components that use the plugin functionality of the audio timeline are the retimer constraints view and the thumbnails view. For these views, it makes sense to display them as timeline plugins because it gives the user a sense of how the visual time relates to the audio time and how that relationship changes when retimer constraints are added and modified.
The visuals component of a Pentimento lecture is organized in a hierarchy. Slides are the base level, and they contain visuals. Each type of visual then has a certain data structure associated with it. Currently, strokes are the only type of visual that has been implemented. Strokes are composed of vertices, which are points containing x, y, t, and p coordinates (x, y coordinate position, time, and pressure, respectively).
Visuals Model
The VisualsModel contains the constructors for all components of visuals. The visuals model contains an array of slides, allowing slides to be created and manipulated. A slide provides a blank canvas for recording new visuals and allows the lecturer to have a level of control over the organization of information. A slide contains visuals, slide duration and camera transforms.
The visuals themselves have many components including type (e.g. stroke, dot, or image), properties (e.g. color, width and emphasis), tMin (the time when the visual first appears), tDeletion (time when the visual is removed), property transforms (e.g. changing color or width) and spatial transforms (e.g. moving or resizing). Property transforms have a value, time and duration. Spatial transforms also have a time and duration, as well as containing a matrix associated with the transform to be performed.
Finally, to actually display the visuals the type of visual is used to determine the drawing method. Currently, strokes are the only supported type of visuals and strokes are comprised of vertices. A vertex is represented by (x,y,t,p) coordinates, where x is the x position, y is the y position, t is the time and p is the pen pressure associated with that vertex.
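A single stroke visual, together with its transforms, might therefore be represented along these lines (a sketch using the component names above; the exact field names in the code may differ):

    // Illustrative sketch of a stroke visual and its transforms.
    var strokeVisual = {
        type: 'stroke',
        properties: { color: '#000000', width: 2 },
        tMin: 3000,        // visual time when the stroke first appears (ms)
        tDeletion: null,   // visual time when the stroke is removed, if ever
        propertyTransforms: [
            { property: 'color', value: '#ff0000', time: 8000, duration: 0 }
        ],
        spatialTransforms: [
            { matrix: [[1, 0, 25], [0, 1, 0], [0, 0, 1]], time: 9000, duration: 500 }
        ],
        vertices: [
            { x: 100, y: 120, t: 3000, p: 0.7 }  // position, time, pen pressure
        ]
    };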
Visuals Controller
The VisualsController has access to the VisualsModel and the RetimerModel. The visuals controller also utilizes the tools controller and the renderer. The visuals controller is responsible for drawing the visuals onto the canvas as the lecture is being recorded. As visuals and slides are added to the view by the user, the visuals controller accesses the visuals model and adds the appropriate data structure. The visuals controller also allows the user to adjust properties of the visuals, such as the width and color.
Tools Controller
The tools controller allows the user to manipulate which tool they are using while recording or editing the visuals of the lecture. The tools controller allows switching of tools as well as indicating what to do with each tool while the lecture is recording or in playback mode. The tools controller also creates the distinction of which tools are available in editing mode vs. in recording mode.
Selection
Visual elements can be selected by using the selection box tool. This tool works in both recording and editing modes. In the VisualsController, the selection is an array of Visuals that is under the selection box drawn by the user. For StrokeVisuals, the renderer uses different properties to display these visuals so that the user has feedback that the visuals have been selected.
The selection box itself is implemented as a separate HTML div on top of the rendering canvas. Inside this div, there is another div that is set up using JQuery UI Draggable and Resizable. This allows the box to be dragged and resized by the user. Callback methods are registered so that when the box is resized or dragged, a transform matrix will be created based on the changed dimensions and position of the selection box. This transformation matrix is passed on to the visuals model.
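For example, dragging and resizing the selection box corresponds to a translate-and-scale matrix roughly like the following sketch (the code base may represent matrices differently):

    // Sketch: build a homogeneous 2D transform matrix from the old and new
    // selection box geometry. Boxes are {x, y, width, height}.
    function boxChangeToMatrix(oldBox, newBox) {
        var sx = newBox.width / oldBox.width;
        var sy = newBox.height / oldBox.height;
        // Scale about the old box origin, then translate to the new position.
        return [ [sx, 0,  newBox.x - oldBox.x * sx],
                 [0,  sy, newBox.y - oldBox.y * sy],
                 [0,  0,  1] ];
    }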
The retimer is one of the main innovations of the Pentimento lecture software. By allowing users to easily manipulate the synchronization between the visual and audio components of a lecture, the retimer provides much needed flexibility in recording and editing lectures. The retimer contains constraints, which are the synchronization connections between a point in the audio time of the lecture and the visuals time of the lecture. Thus the retimer allows the playback of the lecture to have proper synchronization between the visuals timeline and the audio timeline.
Retimer Model
The RetimerModel provides the ability to manipulate constraints, including addition, deletion, shifting and access to the constraints. The RetimerModel contains an array of constraints, which are used to synchronize the audio and visual time. A constraint is comprised of a type (automatic or manual), an audio time and a visual time. Automatic constraints are inserted mechanically as the lecture is recorded (e.g. at insertion points or at the beginning/end of a recording). Manual constraints are added by the user to synchronize a certain point in the audio with a certain point in the visuals.
Adding constraints to the model requires finding the previous and next constraints (in audio time). Once these constraints have been determined, the visual time can be interpolated between the added constraint and the visual time of the two surrounding constraints, to allow for smooth playback. This is done because adding a constraint only affects the time of visuals between the two surrounding constraints.
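The interpolation can be expressed as a simple linear blend between the surrounding constraints, as in this sketch:

    // Sketch: interpolate the visual time for a constraint added at audioTime,
    // given the previous and next constraints (ordered by audio time).
    function interpolateVisualTime(prev, next, audioTime) {
        var fraction = (audioTime - prev.audioTime) /
                       (next.audioTime - prev.audioTime);
        return prev.visualTime + fraction * (next.visualTime - prev.visualTime);
    }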
When visuals or audio is inserted, automatic constraints are added to preserve the synchronization provided by existing constraints. This requires shifting the existing constraints by the amount of time that is added by the insertion. This process is completed by making the constraints after the insertion point “dirty” until the insertion is completed. This means moving the constraints to an audio time at “infinity” indicating they will be shifted. The original time is stored so that when the recording is completed the constraint times can be shifted by the appropriate amount. To perform the shift, the “dirty” constraints are “cleaned” by shifting the original time that has been stored by the duration of the inserted recording (and removing the value of infinity from the constraint time).
Retimer Controller
The RetimerController has access to the RetimerModel, so that when a user manipulates constraints the necessary updates can be made to the constraints data. The retimer controller also has access to the visuals and audio controllers so that synchronizations can be inserted properly.
Additionally, the RetimerController manages redrawing of constraints and thumbnails so that the view is properly updated when a user adds, drags or deletes a constraint. The RetimerController interacts with the UI, so all user input events are handled properly.
When a constraint is added by a user, the RetimerController handles converting the location of the click on the retiming canvas (in x and y positions) to the audio and visual time represented by that location. Similarly, when a user is selecting to delete a constraint, the RetimerController processes the selection area and locates the constraints within the selection by converting constraint times to positions on the retiming canvas. Dragging constraints is also handled by the RetimerController, and when a user stops dragging, the RetimerController updates the RetimerModel to reflect the newly selected synchronization timing.
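These conversions between canvas positions and times rely on the timeline's pixel-to-second scale; ignoring any panning offset of the timeline, the idea is simply:

    // Sketch: convert between an x position on the retiming canvas and a time.
    function xPositionToTime(x, pixelsPerSecond) {
        return (x / pixelsPerSecond) * 1000;   // time in milliseconds
    }
    function timeToXPosition(time, pixelsPerSecond) {
        return (time / 1000) * pixelsPerSecond;
    }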
Thumbnails Controller
The ThumbnailsController displays the visual thumbnails to the user (as part of the retimer and audio timeline display). The ThumbnailsController requires access to the Renderer to display the lecture visuals in a thumbnail timeline. The thumbnails are generated by calculating how many thumbnails will fit in the timeline (based on the length of the lecture). Then each thumbnail is drawn on its own canvas at the time in the middle of the time span represented by the thumbnail. The thumbnails are redrawn any time a user updates a recording or drags a constraint to update the visual or audio time synchronization.
The undo manager is integrated with the rest of the application by registering undo actions in the visuals, audio, and retimer models. For example, in the audio model, shifting a segment works by modifying the audio segment by incrementing the start and end times with the shift amount, and then registering the same shift function as an undo action. This shift will have as an argument the inverse amount of shift that the first shift was performed with. When undoing, this shift will be called and it will register another undo action. Through the UndoManager, this time the action will be pushed onto the redo stack instead of the undo stack. Differentiating between undo and redo actions is handled by the UndoManager. In the implementation of the models, for each action, the code only needs to register the inverse action with the UndoManager.
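This pattern of registering the inverse action can be sketched as follows (the UndoManager call shown is schematic):

    // Sketch: shifting a segment registers its own inverse as the undo action.
    function shiftSegment(segment, amount) {
        segment.startTime += amount;
        segment.endTime += amount;
        // Undoing calls shiftSegment again with the inverse amount, which in
        // turn registers the original shift; the UndoManager decides whether
        // that registration lands on the undo stack or the redo stack.
        undoManager.registerUndoAction(function () {
            shiftSegment(segment, -amount);
        });
    }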
In the LectureController , when a recording begins, an undo group is started. When the recording ends, the group is ended. This allows the user to undo a recording as though it were one action.
When a recording is started, it also registers changeTime as an undo action. The changeTime argument is the begin time of the recording. This way, when the recording is undone, the time in the TimeController can be set to the begin time before the recording started. When changeTime is called, it registers another changeTime call to the UndoManager. The argument for this call is the current time. The result is that when undoing and then redoing, the time will switch to the place it was before the recording started and switch back to the place it was after the recording ended.
The Renderer is used for displaying visuals at a certain time, either as a still frame or during playback. The Renderer takes in the canvas where the visuals should be displayed (and handles scaling the visuals to the appropriate size for the given canvas). Then, to display the visuals, the Renderer uses a minimum and maximum time and draws all visuals that are active in that time range. The Renderer is used for the main canvas while visuals are being recorded or during playback. The Renderer is also used to display thumbnails for retimer purposes.
Pentimento save files are regular Zip files with the content located in different files and folders. At the top level of the Zip, there is a JSON file containing the model. There is also a folder containing the audio files. In the future, there could be other top-level folders for items such as images and other external resources. Inside the audio folder, the audio clips are stored with filenames starting with the number 0 and counting up for the rest of the files. The JSZip library is used for loading and saving Zip files.
The save is handled through the LectureController. The JSON representation of the model is obtained by using the saveToJSON method in the different models. In the audio model JSON, the URL references to the audio clips are replaced by index numbers which will be used as the file names of the audio files. The JSON is saved as a text file in the zip, and then the audio files are converted to HTML5 blobs asynchronously. When all the downloads have completed, the Zip file is downloaded to the computer's local storage.
Loading of the Pentimento Zip files is also handled by the LectureController . The Zip is loaded asynchronously, and when the load is complete, the contents are read. The different models are loaded from their JSON representations using their respective loadFromJSON methods. The audio files are loaded into the browser and the URL for the resource is obtained.
This URL is substituted back into the audio segments where an index number was previously used as a substitute for the audio clip URL.
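The save-file layout can be illustrated with JSZip as in the sketch below. It assumes a recent JSZip version with the promise-based API (the actual code may use an older API), and audioBlobs is a hypothetical array of the converted audio blobs:

    // Sketch: build and download a Pentimento save file with JSZip.
    var zip = new JSZip();
    zip.file('lecture.json', JSON.stringify(lectureModel.saveToJSON()));
    var audioFolder = zip.folder('audio');
    audioBlobs.forEach(function (blob, index) {
        audioFolder.file(String(index), blob);  // filenames are the clip indices
    });
    zip.generateAsync({ type: 'blob' }).then(function (content) {
        // Hand the finished zip blob to the browser as a download.
        var link = document.createElement('a');
        link.href = URL.createObjectURL(content);
        link.download = 'lecture.zip';
        link.click();
    });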
In order to understand the code design decisions for the audio, it is important to understand some of the requirements for the audio model and user interface. The required functionality is the ability to record and edit audio captured from the user.
There are several options available for recording audio. One possibility is to use Flash. A benefit of using Flash is that it is cross-compatible with all commonly-used browsers. However, some downsides are that it requires separate software to be installed and is not as easily integrated with Javascript and HTML. Another option is Java. The advantages and disadvantages of using Java are the same as those of Flash. Finally, there is also the WebRTC component of HTML5. The advantage is that this is natively supported and easily integrates with Javascript through the RecordRTC library. WebRTC is supported in the Chrome, Firefox, and Opera browsers. The downside is that it is not supported in Internet Explorer and Safari. The WebRTC approach is used in this project because of the advantages of ease of integration and native browser compatibility.
For using the Wavesurfer library to display the audio waveform and to play the audio clip, the choice is simple because there are not many strong alternatives. The other option is to draw the waveform manually using a canvas element, but in this case, it is unnecessary due to the completeness and ease of use of the Wavesurfer library.
The audio model is designed to allow the non-concurrent recording of multiple segments of audio. Because full audio editing functionality is not supported, the best way to handle multiple segments is to represent them as such, instead of combining these segments into a single audio clip. So for the audio model, when an audio recording is completed, a new audio segment is inserted into the model. Previously recorded audio clips are not modified. The insertion of this segment is followed by the shifting of segments in the same track that come after its insertion time.
The audio model also has support for multiple audio tracks. Each track contains its own audio segments. Using multiple tracks allows the user to have more flexibility in managing and editing audio. The user can record an audio segment into a new track without having to deal with the shifting that would occur if the segment had been inserted into an existing track with other segments. Multiple tracks also allow audio segments to overlap in time. A potential
The flot.js library is used to draw the axes and grid lines for the audio timeline. This is a complete plotting library that offers many capabilities. The main functionality needed from this library for the audio timeline is just drawing the axes and grid lines; no data is actually plotted. Using this library makes it easy to change the scale of the graph without having to write custom logic for handling the spacing between tick marks and labels.
A zip folder is used in order to save the lecture as a file. An alternative to using a zip folder is to save each file individually. However, this makes keeping track of the files messy because the user needs to ensure that none of the files are deleted or renamed. In order to save and open these zip files, the JSZip Javascript library is used. It supports zipping and unzipping with nested folders. This supports all of the zip functionality needed to create the Pentimento save files.
There are many areas where the audio editing capabilities can be improved. Currently, audio segments can be split. However, there is no way to join two audio segments with different clips together into one segment. This feature would likely require the use of an audio processing library to combine the two segments’ audio clips into one.
Another improvement would be the ability to change the speed at which the audio plays back. Right now, the audio plays at 1x speed. In order to change the playback speed in the model, all that needs to be done is to change the start and end time for the track time, while keeping the audio clip start and end times the same. The result is that the same length of audio clip is going to be played in a shorter amount of track time, effectively speeding up the audio segment. This speedup can be calculated and used as the audio rate in the Wavesurfer library for playback. A UI for this functionality would also have to be added. A possible way of implementing it would be to have a drop-down menu with options for different playback speeds as well as an option where the user could enter a custom speed.
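The speedup factor follows directly from the two time ranges stored in a segment, as in this sketch (field names hypothetical):

    // Sketch: compute the playback rate implied by a segment's time ranges.
    function segmentPlaybackRate(segment) {
        var clipLength = segment.clipEnd - segment.clipStart;     // audio clip time
        var trackLength = segment.trackEnd - segment.trackStart;  // track time
        return clipLength / trackLength;  // e.g. 2.0 means double speed
    }
    // The result could then be passed to Wavesurfer for playback, e.g.
    // wavesurfer.setPlaybackRate(segmentPlaybackRate(segment));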
Currently, segments cannot be moved among different tracks. A segment remains in the track that it started out in after it was recorded. A feature that could be added is allowing a segment to be moved from one track to another. On the model side, this can easily be done by removing the segment and inserting it into the other track. On the UI side, this could be implemented with drag and drop functionality or with a button to move the selected segment to the selected track.
When recording, the timeline cursor progresses to indicate the current time of the recording.
However, this is not a clear indicator of how much has been recorded up to that point in time. An improvement would be to have a ghost segment in the position of the currently recorded segment. This segment should start at the time when the recording began and end at the current time of the recording.
Adding these features will improve the audio recording and editing experience of Pentimento and allow users to create lectures more easily.
Here we will outline all of the functions available in the Pentimento code base, with their input parameters and outputs if any.
Lecture Model

getVisualsModel
Return: the current visuals model

setVisualsModel
Parameters: newVisualsModel (the new visuals model that the visuals model should be set to)
Sets the visuals model to the newly chosen visuals model

getAudioModel
Return: the current audio model

setAudioModel
Parameters: newAudioModel (the new audio model that the audio model should be set to)
Sets the audio model to the newly chosen audio model

getRetimerModel
Return: the current retimer model

setRetimerModel
Parameters: newRetimerModel (the new retimer model that the retimer model should be set to)
Sets the retimer model to the newly chosen retimer model

getLectureDuration
Return: the duration of the lecture in milliseconds. The full lecture duration is the maximum of the audio recording and visuals recording durations.

loadFromJSON
Parameters: json_object (the lecture in JSON form to be loaded from a file)
Loads a lecture file. Calls the necessary load methods from the visuals, audio and retimer to construct the models for each and allow for playback. (Note: see JSON file structure in Saving and Loading appendix.)

saveToJSON
Saves the whole lecture to a JSON file. Gets the data from the saving methods of the visuals model, audio model and retimer model to create the JSON object. (Note: see JSON file structure in Saving and Loading appendix.)
Lecture Controller

save
Saves the current lecture to a JSON file. The JSON is put into a zip file with the audio blob. (Note: see JSON file structure in Saving and Loading appendix.)

load
Reads the selected lecture file so that it can be opened and displayed to the user.

openFile
Parameters: jszip (the zip file including the JSON object that represents the lecture and the audio blob)
Opens the specified file into the UI and resets all of the controllers and models to be consistent with the loaded lecture. (Note: see JSON file structure in Saving and Loading appendix.)

getLectureModel
Return: the lecture model

getTimeController
Return: the time controller

recordingTypeIsAudio
Return: true if the recording will include audio (i.e. if the audio checkbox is checked), otherwise returns false
recordingTypeIsVisual
Return: true if the recording will include visuals (i.e. if the visuals checkbox is checked), otherwise returns false

isRecording
Return: true if a recording is in progress, otherwise returns false

isPlaying
Return: true if playback is in progress, otherwise returns false

startRecording
Starts the recording and notifies other controllers (time, visuals, audio and retimer) to begin recording. Updates the UI to toggle the recording button to the stop button. Note: only notifies the audio and visuals controllers if their respective checkboxes are checked for recording.
Return: true if successful

stopRecording
Stops the recording and notifies other controllers (time, visuals, audio and retimer) to end recording. Updates the UI to toggle the stop button to the recording button. Note: only notifies the audio and visuals controllers if their respective checkboxes are checked for recording.
Return: true if successful

startPlayback
Starts playback and notifies the other controllers (time, visuals, audio) that playback has begun. Toggles the play button to the pause button.
Return: true if successful

stopPlayback
Stops playback and notifies the other controllers (time, visuals, audio) that playback has ended. Toggles the pause button to the play button.
Return: true if successful

getPlaybackEndTime
Return: the lecture time when playback is supposed to end (returns -1 if not currently in playback mode)

draw
Redraws the views of all of the controllers (visuals, audio and retimer).

undo
Undoes the last action and redraws the view to reflect the change.

redo
Redoes the last undone action and redraws the view to reflect the change.

changeTime
This function creates a wrapper around a call to the time controller and the undo manager. This is necessary because the time needs to revert back to the correct time if an action is undone or redone.

loadInputHandlers
Initializes the input handlers (i.e. mousedown, mouseup, keydown, and keyup). Also registers whether it is a touch screen to determine if pen pressure will be applied. Also connects the click events to the lecture buttons.

updateButtons
Toggles the UI display between the recording/stop button and the play/pause button to reflect the current recording or playback state.

Time Controller

addUpdateTimeCallback
Adds a callback that notifies listeners when the current time changes (note: functions should have one argument, currentTime, in milliseconds)

getTime
Return: current time, in milliseconds

updateTime
Manually updates the current time and notifies callbacks

globalTime
Return: UTC time (to keep track of timing while it is in progress)

isTiming
Return: true if timing is in progress, otherwise returns false

startTiming
Starts progressing the lecture time
Return: true if successful

stopTiming
Stops progressing the lecture time
Return: true if successful

getBeginTime
Return: the time (in milliseconds) when the previous or current timing started (returns -1 if there was no previous or current timing)

getEndTime
Return: the time (in milliseconds) when the previous timing ended (returns -1 if there was no previous timing event)
Visuals Model

getCanvasSize
Return: an object with the size of the canvas where the visuals are being recorded, formatted as: {'width': <canvas width>, 'height': <canvas height>}

getDuration
Return: the total visuals duration (calculated by adding the durations of each of the slides)

getSlides
Return: the array of all of the lecture slides

getSlidesIterator
Return: an iterator over the slides

getSlideAtTime
Parameters: time (in milliseconds)
Return: the slide that is displayed at the specified time

insertSlide
Parameters: prevSlide (the previous slide before the point where the new slide will be inserted), newSlide (the slide to be inserted)
Return: true if successful (false if the previous slide does not exist)
removeSlide
Parameters: slide (the slide to be removed, type slide)
Return: true if successful (false if there are no slides or if there is only one slide remaining)

addVisuals
Parameters: visual (the visual to be added, type visual)
Gets the slide at the minimum time of the visual and then adds the indicated visual to the visuals belonging to that slide.

deleteVisuals
Parameters: visual (the visual to be deleted, type visual)
Gets the slide at the minimum time of the visual and then removes the indicated visual from the visuals belonging to that slide.

visualsSetTDeletion
Parameters: visual (the visual to be deleted), visuals_time (the time to delete the visual, in milliseconds)
Sets the deletion time property of the given visual to the specified deletion time.

setDirtyVisuals
Parameters: currentVisualTime (the current visual time, after this time all visuals will be set to dirty)
Creates wrappers around the visuals that keep track of their previous time and the times of their vertices. Then moves the visuals to positive infinity. Used at the end of a recording so that the visuals will not overlap with the ones being recorded. Only processes visuals in the current slide after the current time.

cleanVisuals
Parameters: amount (the amount of time, in milliseconds, that the visuals that were previously set as dirty will need to be shifted by to accommodate the new recording)
Restores visuals to their previous time plus the amount indicated. Used at the end of a recording during insertion to shift visuals forward.

doShiftVisual and shiftVisual
Don’t function, but were written by previous M.Eng student as part of the “shift as you go” approach to shifting visuals during insertion. Left in the code base for the possibility of going back to that method. prevNeighbor
Parameters: visual
Return: the previous visual (i.e. the visual that occurs right before the specified visual in time). nextNeighbor
Parameters: visual
Return: the next visual (i.e. the visual that occurs right after the specified visual in time). segmentVisuals
Parameters: visuals (an array of all visuals)
44
Return: returns an array of segments, where each segment consists of a set of contiguous visuals. getSegmentShifts
Parameters: segments (an array of visual segments, where a segment is a set of contiguous visuals)
Return: returns an array of the amount by which to shift each segment saveToJSON
Saves the visuals as a JSON object loadFromJSON
Parameters: json_object
Return: an instance of the visuals model with the data specified in the JSON object (loaded from a file)
Visuals Controller

getVisualsModel
Return: visuals model

getRetimerModel
Return: retimer model

drawVisuals
Parameters: audio_time
Draws visuals on the canvas using the renderer. The time argument is optional, but if specified it is the audio time at which to draw the associated visuals (visual time calculated from the retimer). If the time is not specified, visuals are drawn at the current time of the time controller.

startRecording
Parameters: currentTime (time at which to start recording)
Begins recording visuals on the slide at the current time.

stopRecording
Parameters: currentTime (time at which to stop recording)
Stops the recording. If it is an insertion, visuals after the recording time are "cleaned" to move to the end of the insertion. Durations are updated.

startPlayback
Parameters: currentTime (time at which to start playback)
Starts playback

stopPlayback
Parameters: currentTime (time at which to stop playback)
Stops playback

currentVisualTime
Return: the visual time (converted from the time controller time through the retimer)

currentSlide
Return: the slide at the current time (obtained from the visuals model)

addSlide
Adds a slide to the visuals model

addVisual
Adds a visual to the visuals model (once it is done being drawn)

recordingDeleteSelection
Deletes the selected visuals during recording and sets the tDeletion property for all of the selected visuals.

editingDeleteSelection
Deletes the selected visuals while in editing mode, which removes the selected visuals entirely from all points in time.
recordingSpatialTransformSelection
Parameters: transform_matrix (the matrix that will transform the selected visuals to the correct place)
Transforms the visuals spatially during recording. Gets the selected visuals, compares the new position against the original position to calculate the final transform matrix, and adds that matrix to the spatial transforms of those visuals.

editingSpatialTransformSelection
Parameters: transform_matrix (the matrix that will transform the selected visuals to the correct place)
Transforms the visuals spatially during editing. Gets the selected visuals and adds the transform matrix to the spatial transforms of those visuals.

recordingPropertyTransformSelection
Parameters: visual_property_transform (the visual property that will be changed by the selection, i.e. color or width)
Changes the properties of the selected visuals during recording. Adds a property transform to the selected visuals' property transforms.

editingPropertyTransformSelection
Parameters: property_name (property that will be changed), new_value (value to change the property to)
Changes the properties of the selected visuals during editing. Updates the specified property to the new property value (e.g. changes from one color to another).
Tools Controller

startRecording
Activates the recording tools and hides the editing tools

stopRecording
Activates the editing tools and hides the recording tools

toolEventHandler
Parameters: event
Handles a click event on one of the tool buttons (handles both recording and editing tools)

acitvateCanvasTool
Activates a tool on the canvas. This is used for tools such as pen, highlight, and select. The registered tool becomes the active tool for the current mode (recording/editing). Initializes mouse and touch events for the active tool.

drawMouseDown
Parameters: event
Used when the pen tool is active. Called when the mouse is pressed down or a touch event is started. Activates the mouse move and mouse up handlers and starts a new current visual (i.e. the visual that is being drawn by the pen).

drawMouseMove
Parameters: event
Used when the pen tool is active. When the mouse is down and moving or a touch is moving, appends a new vertex to the current visual.

drawMouseUp
Parameters: event
Used when the pen tool is active. When the mouse is released or a touch ends, clears the handlers and adds the completed visual.
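A minimal sketch of this down/move/up flow, assuming a canvas element with the hypothetical id 'sketch-canvas' and a visualsController global exposing the addVisual method documented earlier; the real handlers also cover touch events, pressure, and use getCanvasPoint for coordinates:

var canvas = document.getElementById('sketch-canvas'); // assumed element id
var currentVisual = null;

// drawMouseDown: start a new stroke at the press location
canvas.addEventListener('mousedown', function (event) {
    currentVisual = { type: 'Stroke', vertices: [] };
    currentVisual.vertices.push({ x: event.offsetX, y: event.offsetY, t: Date.now() });
});

// drawMouseMove: append a vertex while a stroke is in progress
canvas.addEventListener('mousemove', function (event) {
    if (currentVisual === null) { return; }
    currentVisual.vertices.push({ x: event.offsetX, y: event.offsetY, t: Date.now() });
});

// drawMouseUp: hand the completed visual to the controller
canvas.addEventListener('mouseup', function () {
    if (currentVisual === null) { return; }
    visualsController.addVisual(currentVisual);
    currentVisual = null;
});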
resetSelectionBox
Parameters: event
Resets the selection box so that it is not visible.

selectMouseDown
Used when the selection tool is active. When the mouse is pressed down or a touch event is started, activates the selection box and the mouse move and mouse up handlers.

selectMouseMove
Parameters: event
Used when the selection tool is active. When the mouse is down and moving or a touch event is moving, updates the dimensions of the selection box and the selection vertices.

selectMouseUp
Parameters: event
Used when the selection tool is active. When the mouse is released or a touch ends, clears the handlers and turns on dragging and resizing of the selection box.

selectBoxStartTranslate
Parameters: event, ui
While dragging a selection box, stores the original UI element dimensions.

selectBoxEndTranslate
Parameters: event, ui
While editing, handles the end of dragging a selection box.

selectBoxEndScale
Parameters: event, ui
While editing, handles the end of resizing a selection box.

widthChanged
Parameters: new_width (the newly selected width for the pen tool)
Handles changing the width of the pen tool.

colorChanged
Parameters: new_spectrum_color (the newly chosen color for the pen tool; the color is passed in as a spectrum.js color and then converted to hex)
Handles changing the color of the pen tool.
isInside
Parameters: rectPoint1 (top left corner of the selection rectangle), rectPoint2 (bottom right corner of the selection rectangle), testPoint (vertex point)
Return: true if the test vertex is inside the selection, otherwise false
Tests whether a vertex is inside the rectangle formed by the two rectangle points that form the selection box
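The containment test itself is simple; a sketch under the assumption that rectPoint1 is the top left corner and rectPoint2 the bottom right (canvas y grows downward):

// Sketch of isInside: a vertex is inside the selection box when its
// coordinates fall between the two corner points on both axes.
function isInside(rectPoint1, rectPoint2, testPoint) {
    return testPoint.x >= rectPoint1.x && testPoint.x <= rectPoint2.x &&
           testPoint.y >= rectPoint1.y && testPoint.y <= rectPoint2.y;
}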
getCanvasPoint
Parameters: event
Return: Vertex(x,y,t,p) with x,y on the canvas and t a global time
Gives the location of the mouse event on the canvas, as opposed to on the page

getTouchPoint
Parameters: eventX, eventY (the coordinates of the touch event)
Return: Vertex(x,y,t,p) with x,y on the canvas and t a global time
Gives the location of the touch event on the canvas, as opposed to on the page

calculateTranslateMatrix
Parameters: original_position, new_position (a position is represented as { left, top })
Return: translation matrix
Given the original and new position of a box in the canvas, calculates and returns the math.js matrix necessary to translate the box from the original to the new coordinates.
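A sketch of this translation using math.js, as the description indicates; the 3x3 homogeneous-coordinate layout is an assumption about the representation:

// Sketch of calculateTranslateMatrix: a homogeneous 2D translation by the
// difference between the new and original positions. Assumes the math.js
// library is loaded as the global `math`.
function calculateTranslateMatrix(original_position, new_position) {
    var dx = new_position.left - original_position.left;
    var dy = new_position.top - original_position.top;
    return math.matrix([
        [1, 0, dx],
        [0, 1, dy],
        [0, 0, 1]
    ]);
}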
calculateScaleMatrix
Parameters: original_position, original_size, new_position, new_size (a position is represented as { left, top }, a size as { width, height })
Return: scaling matrix
Given the original and new dimensions of a selection box in the canvas, calculates and returns the math.js matrix necessary to scale the box from the original to the new coordinates. Scaling normally ends up translating as well, so the matrix returned by this function negates that translation.
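A companion sketch under the same homogeneous-coordinate assumption: scale to the new size, then fold in the translation that puts the scaled box back at the new position (this is the translation the description says must be negated):

// Sketch of calculateScaleMatrix: plain scaling would move the box's corner
// to (sx * left, sy * top), so the matrix also translates that corner onto
// the new position.
function calculateScaleMatrix(original_position, original_size, new_position, new_size) {
    var sx = new_size.width / original_size.width;
    var sy = new_size.height / original_size.height;
    var dx = new_position.left - sx * original_position.left;
    var dy = new_position.top - sy * original_position.top;
    return math.matrix([
        [sx, 0, dx],
        [0, sy, dy],
        [0, 0, 1]
    ]);
}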
Audio Model

getAudioTracks
Return: array containing all audio tracks

setAudioTracks
Parameters: tracks (array containing audio tracks)
Sets the audio tracks to the specified tracks

addTrack
Parameters: track, insert_index (optional argument)
Adds the track to the end of the audio tracks, unless the insertion index is specified, in which case the track is inserted at the chosen index.
removeTrack
Parameters: track
Return: true if the removal completes, false otherwise
Removes the specified audio track

getDuration
Return: the total duration of the audio (in milliseconds), which is the maximum of all of the audio track lengths. Returns 0 if there are no audio tracks.
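A sketch of getDuration, assuming each track exposes the endTime method documented later in this section (which returns 0 for an empty track):

// Sketch: the lecture's audio duration is the latest end time over all
// tracks, or 0 when there are no tracks.
function getDuration(audio_tracks) {
    var duration = 0;
    for (var i = 0; i < audio_tracks.length; i++) {
        duration = Math.max(duration, audio_tracks[i].endTime());
    }
    return duration;
}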
getBlobURLs
Return: an array of all the unique audio blob URLs

saveToJSON
Return: a JSON object containing the audio JSON
Saves the model to JSON

loadFromJSON
Parameters: json_object (JSON object containing the audio information)
Return: an audio model populated with the information from the JSON object

getAudioSegments
Return: an array of all audio segments

setAudioSegments
Parameters: segments
Sets the segments in the track to the specified segments
insertSegment
Parameters: new_segment, do_shift_split
Return: true if the insert succeeds without a split. If a split occurs, returns an object {left, right, remove} containing the left and right sides of the split segment and the segment that was removed to become the left and right parts.
Inserts the provided segment. Note: another segment in the track may need to be split in order to insert the specified new segment.

addSegment
Parameters: segment
Adds the segment to the audio segments array.

removeSegment
Parameters: segment
Return: true if the segment is removed
Removes the specified audio segment.

canShiftSegment
Parameters: segment, shift_millisec
Return: true if the shift is valid; otherwise returns the shift value of the greatest magnitude that would have produced a valid shift
Determines whether the specified segment can be shifted to the left or right. If a negative number is given for shift_millisec, the shift is to the left. The final value of the segment starting time cannot be negative, and the segment cannot overlap existing segments in the track. If the shift would violate either of these conditions, the shift cannot occur.
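A simplified sketch of these validity rules, using the segment field names from the saved-lecture JSON at the end of this document; unlike the real method, this sketch returns a plain false rather than the largest valid shift:

// Sketch: a shift is valid when the segment stays at or after time 0 and
// does not overlap any other segment in the track.
function canShiftSegment(track, segment, shift_millisec) {
    var newStart = segment.start_time + shift_millisec;
    var newEnd = segment.end_time + shift_millisec;
    if (newStart < 0) { return false; }
    for (var i = 0; i < track.audio_segments.length; i++) {
        var other = track.audio_segments[i];
        if (other === segment) { continue; }
        if (newStart < other.end_time && newEnd > other.start_time) {
            return false; // would overlap an existing segment
        }
    }
    return true;
}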
shiftSegment
Parameters: segment, shift_millisec, check (optional, defaults to true; if false, the shift is performed without checking for validity)
Return: true if the shift succeeds; otherwise returns the shift value of the greatest magnitude that would have produced a valid shift
Shifts the specified segment left or right by the given number of milliseconds. If a negative number is given for shift_millisec, the shift is to the left.

canCropSegment
Parameters: segment, crop_millisec, left_side (boolean indicating whether the left side is being cropped)
Return: true if the crop is valid; otherwise returns the crop value of the greatest magnitude that would have produced a valid crop
Determines whether the specified segment can be cropped on the left or right. If a negative number is given for crop_millisec, the crop shrinks the segment; if a positive number is given, the crop extends the segment. The segment cannot overlap existing segments in the track, cannot extend past the audio length, and cannot shrink below a length of 0.
cropSegment
Parameters: segment, crop_millisec, left_side (boolean indicating whether the left side is being cropped), check (optional, defaults to true; if false, the crop is performed without checking for validity)
Return: true if the crop is valid; otherwise returns the crop value of the greatest magnitude that would have produced a valid crop
Crops the specified segment by the specified number of milliseconds. If a negative number is given for crop_millisec, the crop shrinks that side of the segment.

endTime
Return: the end time of the track in milliseconds, which is the greatest segment end time. Returns 0 if the track is empty.
saveToJSON
Return: a JSON object containing the audio track JSON
Saves the audio tracks to JSON

loadFromJSON
Parameters: json_object (JSON object containing the audio information)
Return: an audio track with the information from the JSON object

audioResource
Return: the URL of the audio resource blob needed for playback

totalAudioLength
Return: the total length of the audio resource blob

lengthInTrack
Return: the length of the segment in the track

audioLength
Return: the length of the audio that should be played back

splitSegment
Parameters: splitTime
Return: an object {left, right} with the two segments that result from splitting the segment at the specified track time. Returns null if the track time does not intersect the segment within (start_time, end_time).
Splits an audio segment at the specified time
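A sketch of the split, using the segment field names from the saved-lecture JSON; the audio-resource split point is found by linear interpolation, which is the same mapping trackToAudioTime (below) performs:

// Sketch: produce left and right halves of a segment at a track time that
// falls strictly inside (start_time, end_time).
function splitSegment(segment, splitTime) {
    if (splitTime <= segment.start_time || splitTime >= segment.end_time) {
        return null;
    }
    var frac = (splitTime - segment.start_time) /
               (segment.end_time - segment.start_time);
    var audioSplit = segment.audio_start_time +
                     frac * (segment.audio_end_time - segment.audio_start_time);
    var left = Object.assign({}, segment,
        { end_time: splitTime, audio_end_time: audioSplit });
    var right = Object.assign({}, segment,
        { start_time: splitTime, audio_start_time: audioSplit });
    return { left: left, right: right };
}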
trackToAudioTime
Parameters: trackTime
Return: the audio time. Returns false if the given track time is invalid.
Converts a track time to the corresponding time in the audio resource at the current scale

audioToTrackTime
Parameters: audioTime
Return: the track time. Returns false if the given audio time is invalid.
Converts a time in the audio resource to the corresponding time in the track at the current scale
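A sketch of the two conversions, assuming the segment stores start_time/end_time in the track and audio_start_time/audio_end_time in the audio resource (the field names used in the saved-lecture JSON):

// Sketch of trackToAudioTime: map a track time into the audio resource by
// linear interpolation across the segment.
function trackToAudioTime(segment, trackTime) {
    if (trackTime < segment.start_time || trackTime > segment.end_time) {
        return false; // invalid track time for this segment
    }
    var scale = (segment.audio_end_time - segment.audio_start_time) /
                (segment.end_time - segment.start_time);
    return segment.audio_start_time + (trackTime - segment.start_time) * scale;
}

// Sketch of audioToTrackTime: the inverse mapping.
function audioToTrackTime(segment, audioTime) {
    if (audioTime < segment.audio_start_time || audioTime > segment.audio_end_time) {
        return false; // invalid audio time for this segment
    }
    var scale = (segment.end_time - segment.start_time) /
                (segment.audio_end_time - segment.audio_start_time);
    return segment.start_time + (audioTime - segment.audio_start_time) * scale;
}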
saveToJSON
Return: a JSON object containing the audio segment JSON
Saves the audio segment to JSON

loadFromJSON
Parameters: json_object (JSON object containing the audio segment information)
Return: an audio segment with the information from the JSON object
Audio Controller

getAudioModel
Return: the audio model

addTrack
Creates a new track and adds it to the model. Redraws the audio timeline.

removeTrack
Removes a track from the audio model. Redraws the audio timeline.

changeActiveTrack
Parameters: index (index of the track to make active)
Changes the active track index to refer to another track
startRecording
Parameters: currentTime
Starts recording the audio at the given track time (in milliseconds)

stopRecording
Parameters: currentTime
Ends the recording (only applies if there is an ongoing recording)

startPlayback
Parameters: currentTime
Begins audio playback at the given track time (in milliseconds)

stopPlayback
Parameters: currentTime
Stops all playback activity.

millisecondsToPixels
Parameters: millSec
Return: pixel value
Converts milliseconds to pixels according to the current audio timeline scale

pixelsToMilliseconds
Parameters: pixels
Return: millisecond value
Converts pixels to milliseconds according to the current audio timeline scale
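A sketch of the two timeline conversions, assuming the scale is stored as pixels per second (the actual scale variable is internal to the audio controller, and the value below is only an example):

var pixelsPerSecond = 10; // assumed example scale

// Sketch of millisecondsToPixels / pixelsToMilliseconds: simple linear
// conversions through the timeline scale.
function millisecondsToPixels(millSec) {
    return (millSec / 1000) * pixelsPerSecond;
}

function pixelsToMilliseconds(pixels) {
    return (pixels / pixelsPerSecond) * 1000;
}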
tickFormatter
Parameters: tickpoint
Return: a time display string (e.g. 00:30:00)
Changes tickpoints into a time display (e.g. 00:30:00). Each tickpoint unit is one second, which is then scaled by the audio timeline scale.
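A sketch of the formatting under the assumption that one tickpoint unit equals one second, rendered as HH:MM:SS (so 1800 seconds becomes 00:30:00):

// Sketch: convert a tickpoint (in seconds) into an HH:MM:SS display string.
function tickFormatter(tickpoint) {
    function pad(n) { return (n < 10 ? '0' : '') + n; }
    var total = Math.floor(tickpoint);
    var hours = Math.floor(total / 3600);
    var minutes = Math.floor((total % 3600) / 60);
    var seconds = total % 60;
    return pad(hours) + ':' + pad(minutes) + ':' + pad(seconds);
}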
disableEditUI
Disables all UI functionality for editing audio (used during recording and playback)

enableEditUI
Enables all UI functionality for editing audio (used when recording or playback stops)

drawTracksContainer
Return: the jQuery tracks container object
Draws the container that will be used to hold audio tracks

pluginTopOffset
Parameters: pluginIndex
Return: the offset from the top of the tracks container (in pixels)
Gets the offset (in pixels) from the top of the tracks container for the nth plugin. Using a pluginIndex equal to the number of plugins returns the offset needed by the tracks that are drawn under the plugins.

refreshGradations
Redraws the gradations container to fit the current audio tracks

drawGradations
Draws the gradation marks on the audio timeline

refreshPlayhead
Refreshes the playhead position

drawPlayehead
Draws the playhead for showing the playback location

zoom
Parameters: zoomOut (defaults to true to indicate zoom out; false means zoom in)
Zooms the audio timeline in or out

draw
Draws all parts of the audio timeline onto the page

updatePlayheadTime
Parameters: currentTime
Updates the current time (ms) of the audio timeline (the time indicated by the playhead)
updateTicker
Parameters: time
Updates the ticker display indicating the current time as a string

timelineClicked
Parameters: event
When the timeline is clicked, updates the playhead to be drawn at the time of the clicked position.

addTimelinePlugin
Parameters: plugin
Adds the plugin to the list of plugins

getTimelinePluginID
Parameters: plugin
Return: the ID of the plugin, which is calculated as the base plus the index of the plugin in the array
Track Controller

getID
Return: the ID of the track

getLength
Return: the length of the track (in milliseconds)

getAudioTrack
Return: the audio track

insertSegment
Parameters: newSegment (segment to be inserted)
Inserts a new segment into the audio track

removeSegment
Parameters: segment
Removes a segment from the track

segmentDragStart
Parameters: event, ui, segmentController
Callback for when a segment UI div starts to be dragged. Sets initial internal variables.

segmentDragging
Parameters: event, ui, segmentController
Callback for when a segment UI div is being dragged. Tests whether or not the drag is valid. If the dragging is valid, it does nothing, allowing the segment UI div to be dragged to the new position. If the dragging is invalid, it sets the segment UI div back to the last valid position.

segmentDragFinish
Parameters: event, ui, segmentController
Callback for when a segment UI div is finished being dragged. Performs the drag in the audio model.

segmentCropStart
Parameters: event, ui, segmentController
Callback for when a segment UI div starts to be cropped. Sets the initial internal variables.

segmentCropping
Parameters: event, ui, segmentController
Callback for when a segment UI div is being cropped. If the cropping is valid, it does nothing. If the cropping is invalid, it sets the UI div back to the original size and position.

segmentCropFinish
Parameters: event, ui, segmentController
Callback for when a segment UI div has finished being cropped. The cropping should always be valid because the 'segmentCropping' callback only allows cropping to happen in valid ranges. Performs the crop in the audio track.
removeFocusedSegments
Removes all segments that have focus.

startPlayback
Parameters: startTime, endTime
Starts the playback of the track over the specified time interval (in milliseconds). Stops the previous playback if one is currently going. If the end time is not specified, playback goes until the end of the track.

stopPlayback
Stops the playback of the track. Does nothing if the track is not playing.

refreshView
Refreshes the view to reflect the state of the model for an audio track

draw
Parameters: jqParent (jQuery container where the track should be drawn)
Return: a new jQuery track
Draws a track into the parent jQuery container
Segment Controller

getID
Return: the segment ID

getWavesurferContainerID
Return: the ID of the wavesurfer container

getClassName
Return: the name of the class used to represent audio segments

getAudioSegment
Return: the audio segment

getParentTrackController
Return: the parent track controller

startPlayback
Parameters: delay, trackStartTime, trackEndTime
Plays the audio segment back after a delay over the specified time interval (in milliseconds). If the end time is undefined, plays until the end. If playback is currently going or scheduled, cancels the current playback and starts a new one.
stopPlayback
Stops any ongoing or scheduled playback

refreshView
Refreshes the view to reflect the state of the model for the audio segment

draw
Parameters: jqParent (jQuery container where the segment should be drawn)
Return: a new jQuery segment
Draws a segment into the parent jQuery container

shiftWavesurferContainer
Parameters: pixelShift
Shifts the internal wavesurfer container left (negative) or right (positive) in pixels. This is used when cropping to move the container so that the cropping motion looks natural.
Retimer Model

getConstraints
Return: an array containing all of the constraints

makeConstraintDirty
Parameters: constraint
Return: the constraint, after it has been disabled
cleanConstraints
Parameters: constraint, amount (the amount by which to shift the original time of the constraint)
Shifts the dirty constraints by the specified amount (from their original time) and re-enables the constraints.

checkConstraint
Parameters: constraint
Return: true if this is a valid constraint, false otherwise
Checks whether the constraint is in a valid position

updateConstraintVisualsTime
Parameters: constraint, audioTimeCorrespondingToNewVisualsTime, test (optional Boolean, defaults to false, indicating whether to test the update without actually updating)
Return: a Boolean indicating whether the update was successful
Updates the visuals part of the constraint located at the specified audio time (tAud)

updateConstraintAudioTime
Parameters: constraint, newTAudio, test (optional Boolean, defaults to false, indicating whether to test the update without actually updating)
Return: a Boolean indicating whether the update was successful
Updates the audio part of the constraint located at the specified visuals time (tVis)
addConstraint
Parameters: constraint
Return: true if the constraint is successfully added
Adds a constraint to the lecture

deleteConstraint
Parameters: constraint
Deletes the specified constraint

shiftConstraints
Parameters: constraints, amount
Shifts the specified constraints by the specified amount of time

getConstraintsIterator
Return: an iterator over all constraints

getPreviousConstraint
Parameters: time, type (visual or audio)
Return: the constraint that appears in time before the given time

getNextConstraint
Parameters: time, type (visual or audio)
Return: the constraint that appears in time after the given time

getVisualTime
Parameters: audioTime
Return: the visual time associated with the given audio time
Converts audio time to visual time

getAudioTime
Parameters: visualTime
Return: the audio time associated with the given visual time
Converts visual time to audio time
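A sketch of getVisualTime under the natural assumption that the retimer linearly interpolates between the two constraints that bracket the given audio time; each constraint pairs a visual time tVis with an audio time tAud, as in the saved-lecture JSON at the end of this document:

// Sketch: find the bracketing constraints (assumed sorted by tAud) and
// interpolate between their visual times.
function getVisualTime(constraints, audioTime) {
    for (var i = 0; i < constraints.length - 1; i++) {
        var a = constraints[i];
        var b = constraints[i + 1];
        if (audioTime >= a.tAud && audioTime <= b.tAud) {
            var frac = (audioTime - a.tAud) / (b.tAud - a.tAud);
            return a.tVis + frac * (b.tVis - a.tVis);
        }
    }
    return audioTime; // outside all constraints; fall back to identity
}

getAudioTime would be the symmetric mapping with the roles of tVis and tAud swapped.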
saveToJSON
Return: a JSON object containing the constraints JSON information
Saves the constraints to JSON

loadFromJSON
Parameters: json_object
Return: an instance of the retimer model with the data specified in the JSON object (loaded from a file)
Retimer Controller

addArrowHandler
Parameters: event
The event handler for when a user clicks on the constraints canvas after clicking on the "add constraint" button. It adds the constraint to the model and then draws the arrow on the canvas.

drawTickMarks
Draws tick marks on the retimer canvas to indicate how quickly or slowly the visuals are being played back. (Note: not currently active; interpolation is not working properly.)

drawConstraint
Parameters: constraint_num (a unique id for each constraint added, incremented by the retimer)
Draws the constraint on the constraints canvas (for manual/user-added constraints)

redrawConstraints
Refreshes the canvas and redraws the constraints

redrawConstraint
Parameters: constraint, constraint_num
Redraws an individual constraint on the retimer canvas

addConstraint
When a user adds a constraint, adds the constraint to the retimer model
selectArea
Parameters: event
Handles the event when a user clicks on the retimer canvas to select a constraint

selectionDrag
Parameters: event
As a user drags along the retimer canvas, the selection box is updated and drawn

endSelect
Parameters: event
Handles the end of a selection drag along the retimer canvas

selectConstraints
Parameters: event
Finds the constraints that are within the selection area

displaySelectedConstraints
Parameters: event
Redraws the constraints that have been selected so that they are displayed in red

deleteConstraints
Parameters: event
Deletes the selected constraint(s) from the retimer model
constraintDragStart
Parameters: layer (jCanvas layer containing the constraint to be dragged)
When dragging starts, records whether the drag is for the top or bottom of the arrow (the visuals end or the audio end, respectively) and records the original x position of that end of the arrow.

constraintDrag
Parameters: layer (jCanvas layer containing the constraint being dragged)
Dragging moves one end of the arrow while the other end remains in place

constraintDragStop
Parameters: layer (jCanvas layer containing the constraint that has stopped being dragged)
When dragging stops, updates the visuals or audio time of the constraint depending on whether the drag was at the top or bottom. Updates the thumbnails accordingly.

constraintDragCancel
Parameters: layer (jCanvas layer containing the constraint being dragged)
When dragging is cancelled (i.e. if a user drags the constraint off the canvas), the constraint resets to its original value.

beginRecording
Parameters: currentTime
Adds automatic constraints at the beginning of a recording
endRecording
Parameters: currentTime
Adds an automatic constraint at the end of a recording

Thumbnails Controller

drawThumbnails
Draws the thumbnails whenever the visuals in the main window are updated or changed. Calculates the number of thumbnails to draw, sets up all of the thumbnail canvases (each thumbnail is drawn on a separate canvas), then iterates over the number of thumbnails and calls generateThumbnail for each.

generateThumbnail
Parameters: thumbOffset (the number of the thumbnail in the sequence of all of the thumbnails), visuals_min (the minimum time to be displayed by the current thumbnail), visuals_max (the maximum time to be displayed by the current thumbnail), thumbnail_width (the width of the thumbnail canvas, specified to ensure that it will line up with the audio timeline)
Generates a thumbnail by getting the visuals from the slides.
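The following is an example of a saved lecture file in JSON format. It contains the serialized visuals model (slides with their stroke visuals and durations), the audio model (tracks with their segments), and the retimer model (the constraints pairing visual times with audio times):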
{
"visuals_model": {
"slides": [
{
"visuals": [
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 947,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 92.0625,
"y": 31,
"t": 949
},
{
"x": 92.0625,
"y": 32,
"t": 1034
},
{
"x": 93.0625,
"y": 33,
"t": 1046
},
{
"x": 93.0625,
"y": 34,
"t": 1059
},
{
"x": 93.0625,
"y": 36,
"t": 1073
}
]
},
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 2531,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 163.0625,
"y": 56,
"t": 2531
},
{
"x": 166.0625,
"y": 53,
"t": 2594
},
{
"x": 168.0625,
"y": 51,
"t": 2603
},
{
"x": 171.0625,
"y": 50,
"t": 2617
},
{
"x": 174.0625,
"y": 48,
"t": 2629
}
]
},
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 9468,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 125.0625,
"y": 258,
"t": 9470
},
{
"x": 125.0625,
"y": 257,
"t": 9491
},
{
"x": 127.0625,
"y": 254,
"t": 9522
},
{
"x": 131.0625,
"y": 251,
"t": 9528
}
]
}
],
"duration": 23116
},
{
"visuals": [
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 947,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 92.0625,
"y": 31,
"t": 949
},
{
"x": 92.0625,
"y": 32,
"t": 1034
},
{
"x": 93.0625,
"y": 33,
"t": 1046
},
{
"x": 93.0625,
"y": 34,
"t": 1059
}
]
},
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 2531,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 163.0625,
"y": 56,
"t": 2531
},
{
"x": 166.0625,
"y": 53,
"t": 2594
},
{
"x": 168.0625,
"y": 51,
"t": 2603
},
{
"x": 171.0625,
"y": 50,
"t": 2617
},
{
"x": 174.0625,
"y": 48,
"t": 2629
}
]
},
{
"type": "Stroke",
"hyperlink": null,
"tDeletion": null,
"propertyTransforms": [],
"spatialTransforms": [],
"tMin": 9468,
"properties": {
"c": "#777",
"w": 2
},
"vertices": [
{
"x": 125.0625,
"y": 258,
"t": 9470
},
{
"x": 125.0625,
"y": 257,
"t": 9491
},
{
"x": 127.0625,
"y": 254,
"t": 9522
},
{
"x": 131.0625,
"y": 251,
"t": 9528
}
]
}
],
"duration": 23116
}
],
"canvas_width": 800,
"canvas_height": 500
},
"audio_model": {
"audio_tracks": [
{
"audio_segments": [
{
"audio_clip": 0,
"total_audio_length": 12528,
"audio_start_time": 0,
"audio_end_time": 12528,
"start_time": 0,
"end_time": 12528
},
{
"audio_clip": 1,
"total_audio_length": 4399,
"audio_start_time": 0,
"audio_end_time": 4399,
"start_time": 12528,
"end_time": 16927
}
]
}
]
},
"retimer_model": {
"constraints": [
{
"tVis": 0,
"tAud": 0,
"constraintType": "Automatic"
},
{
"tVis": 6650,
"tAud": 6650,
"constraintType": "Manual"
},
{
"tVis": 9525,
"tAud": 9525,
"constraintType": "Manual"
},
{
"tVis": 12528,
"tAud": 12528,
"constraintType": "Automatic"
},
{
"tVis": 14500,
"tAud": 14500,
"constraintType": "Manual"
},
{
"tVis": 16927,
"tAud": 16927,
"constraintType": "Automatic"
}
]
}
}